The word comics as a genre designation is being applied to an ever-increasing variety of forms, modes, and subgenres. It is clear, however, that comics must have their own, large genre classification because, as Aaron Meskin points out, they do not fit into any other literary designation. They cannot be seen as only literature or poetry or drama because of their heavy use of pictures. Nor can they be seen solely as a member of the visual arts, because of their use of text. Comics truly are a ‘hybrid’ art form (Meskin 2009: 234). The genre is growing and splitting so fast that those who are immersed within the community of comics can’t quite keep up with the pace. Part of this rapid development is the branching off of various subgenres of comics such as immersive comics, metacomics, and foto-novelas, to name just a few. One subgenre that is starting to get more attention is poetry comics. This subgenre, because of its association with poetry, raises interesting questions about the nature of comics and poetry that must be answered. Although there have been attempts to make poetry comics, I argue that most attempts fail to create a hybrid art form and instead favor one mode (the visual or textual) over the other; this is not to say, however, that poetry comics are impossible to create. By drawing on insights from narratology and multimodal studies, I will show that poetry comics are possible as long as the work maintains a harmonious multimodal presence based upon segmentivity.

Segmentivity as the essence of poetry and comics

The most basic problem to be resolved before establishing a theory of poetry comics is settling on a working definition of poetry. Unfortunately, a complete theoretical definition and justification of poetry is beyond the scope of this paper. As such, I believe that the best course of action in creating a working definition of poetry comics is to follow the work of Brian McHale and John Miles Foley, who both argue that segmentivity is the essence of poetry.

Brian McHale, drawing on the work of narratology, offers a theory of poetry as the genre which holds segmentivity as its underlying feature (McHale 2010: 28). Segments are indivisible, bounded units of meaning, ‘the smallest unit of resistance to meaning’ (McHale 2010: 29). The bounds that set apart units of meaning can comprise space, punctuation, or pauses. These units of meaning can be as small as a single letter or can run in length to an entire stanza. In the middle of the spectrum, segments could be words, phrases, lines, or sentences depending on the situation and the poem. It is also telling that others working in different fields have come to the same conclusion that segmentivity is the essential aspect of poetry.

Coming from a background in oral poetry, John Miles Foley also claims that segmentivity, or the ‘word,’ is the essence of a poem. Foley’s theory began with his study of the claims of South Slavic oral poets who said that they could reproduce any epic poem ‘word for word’ (2002: 15). The conclusion for Foley is that a word is ‘a unit of utterance, an irreducible atom of performance, a speech-act’ (2002: 13). ‘Words’ become irreducible formations of meaning – atoms of the narrative. It is at this point that Foley and McHale are in agreement. Both see poetry as being built from ‘words’ or segments. There is no evidence that McHale is aware of the work of Foley and the other oral theorists, yet I find it telling that both would come to similar conclusions about poetry while approaching it from different fields.

As we think about segmentivity, it becomes clear that poetry is the traditional, written medium that deals with it the most. Within novels and drama, although a well-placed word is important, there is more room for flexibility in meaning making. The structure is more sweeping. This will usually place the atom of meaning at the level of phrase or idea rather than the word or line. In poetry, especially modern, written poetry, the reverse is true. Most of the atomic meaning is found in a short phrase, the word, or even the letter. Comics, though, as a new art form have also managed to embrace segmentivity as an essential part of their nature.

A working definition for comics comes from Scott McCloud, who defines comics as ‘juxtaposed pictorial and other images in deliberate sequence, intended to convey information and/or to produce an aesthetic response in the viewer’ (1996: 9/5). However, to nuance McCloud’s definition and to add a significant element that he overlooked, we must highlight the segmented nature of comics. Segmentivity in comics naturally derives from the use of frames to create the feel of both time and space.

If the images in comics had no frame, there would be no point of reference to construct meaning at its most basic level: time and space. Immanuel Kant argued that before a human being could even begin to think about the concepts and content of experience, those experiences had to be organized according to time and space (2007: 61). If an artist simply scattered images of people – possibly even the same person – all across the page with no attempt to segment or frame the action, the meaning of the page would revert to chaos. However, by separating the various pictures into frames, the illusion of time and space immediately arises and meaning begins to emerge. Framing thus becomes an incredibly important mode in defining comics. As an addendum to McCloud, we can say that a comic is juxtaposed, segmented images in deliberate sequence. However, without the modes of image and/or text to fill the framed panels, the comic would remain empty and meaningless. Both frames and content are needed to create meaning within comics.

At this point, one might think that all that is needed to create poetry comics is to take the segmentivity inherent in comics and merge it with the segmentivity of poetry. Once the mode of poetry is merged with the mode of comics, poetry comics should be born. This line of reasoning has proven problematic. Too many attempts to combine comics and poetry end up being mere illustrated poems. Whenever modes are combined to make a new and unique hybrid, those modes must work together. In order to understand how modes work together, we need some background in multimodal theory.

Multimodal theory and comics

To help us see the relationship between the key elements of a comic, the study of comics can be greatly facilitated by approaching comics through multimodal theory. Looking specifically at the content of comics, Aaron Meskin argues that the primary mode is neither the visual nor the textual. The text and the image, in many cases, were conceived together and as such cannot be realistically taken apart or studied separately (Meskin 2009: 235). When we need to consider all of these elements in order to make meaning, we are taking a multimodal approach. Gunther Kress explores the theoretical work of multimodality studies in his book Multimodality: A Social Semiotic Approach to Contemporary Communication. One of the concepts that shows how all of the modes should be working together is the concept of coherence. Coherence is achieved when all of the modes within a multimodal artifact work with each other within the text to create a sense of completeness (2010: 148). However, completeness cannot be judged by looking at the text alone. If a text is going to be complete, along with being internally coherent, it must also be cohesive with the social context in which it was made. While these two concepts, working in tandem, ensure that meaning can be made, this paper will focus on the more formal aspect of coherence and save discussions about cohesion’s role in poetry comics for a future date.

As we have previously established, the comic genre is multimodal in that it uses modes of text and image as well as frames to give meaning to both. The comic must be understood using all of the modes (in this case text, image, and frame). As the modes work together, coherence of the multimodal text is achieved. To illustrate coherence, Kress uses the metaphor of music to explain how the modes of a multimodal text work together. When all the modes work together, they are considered a modal ensemble (Kress 2010: 28). Just as in a musical ensemble, no voice can be left out without dramatically changing the texture, feel, and meaning of the musical composition. Kress further elaborates on this analogy by calling the process of assembling or creating a multimodal text orchestration (2010: 161–62).

Within an orchestra, the different voices of the group take turns passing melody around from voice to voice. However, a clever orchestration will find ways of having the different voices mingle with each other. While one voice has a melody, another will play a counter-melody and yet another will busy itself with harmonies or rhythms. In this way, each voice has a role to support and intertwine with the others. Within any one orchestration, even within solos, each voice must blend with and rely on the others if the full meaning of the piece is to be realized. As Kress puts it, ‘different modes are foregrounded at different moments in the sequence…to provide complementary information…in a staged sequence’ (2010: 162). The same holds for a comic. There may be a panel, page, or line where one voice or mode steps forward enough for its own solo. But if it does so at the expense of other voices, the ensemble will break apart and the orchestration will weaken. The concept of orchestration ensures that each mode works with the others in a complete way in order for meaning to be found. The astute reader will be able to navigate the orchestration and realize that even if one mode is foregrounded over another at any given time, it will slip into the background later. Each mode is contingent on the others for overall meaning. As long as the comic’s modes maintain their contingency overall in relating to each other, the coherence of the multimodal text will be maintained.

With this discussion of the multimodal nature of comics concluded, we can turn our attention more fully to the proposition about the efficacy of poetry comics. As I will show, due to the segmented nature of both poetry and comics, a hybrid art form is possible; however, most attempts at creating poetry comics fall short.

Misses and hits in poetry comics

Neil Cohn argues that poetry comics can be made without the use of text. If we look at the established definitions we have for poetry as the genre that deals with segmentivity and comics as multimodal sequential art, then theoretically it does become clear that poetry comics do not need wording. A panel will naturally segment a narrative or idea and offer a mode to create a multimodal work. Cohn supports his argument with a discussion of visual rhyme, in which ‘rhyming in the visual domain…establishes a correspondence between two different visual parts’ (Cohn 2006). He claims that ‘The visual form also allows for a type of rhyming that is unavailable to auditory poetry’ since there is virtually no limit to the images that could be made to ‘rhyme’ with each other (Cohn 2006).

I remain skeptical that the purely visual approach could work for all types of poetry. If the only thing holding a purely visual poem together is a visual rhyme, it seems that this would severely limit the poetic forms that visual poetry could take. Epic poetry, for example, although containing meter and rhyme, is also tied heavily to a narrative. It remains unclear how someone could create a purely visual epic poem, maintaining visual rhymes without the narrative completely eclipsing the poetic elements. The poetic segments of an epic are not reducible to a panel structure that could keep the poetic aspect moving – the images would revert to a purely narrative structure. This criticism becomes more problematic when we apply it to free verse poems with no rhyme or recognizable meter or structure. It remains unclear how Cohn could create a visual poem that uses no visual rhyme to clue the reader in to the fact that it is meant as a poetic artifact rather than a narrative or, perhaps, a collage. For both of these examples, text could step up and maintain the poetic quality of the work, and by maintaining the poetic quality of the work, text can enable the possibility of epic or free verse poetry. Just as the move from oral poetry to visual poetry was able to expand the use of segments from phrases to individual letters and punctuation, written language will be able to expand the types of poetry that can be made in comic form. The language should be able to carry the poem beyond some of the limitations of the frame and image. It seems that in order to create longer or more complex poetry comics, text will be important, although I am willing to concede that text is not needed for the definition.

The balance between text and the other modes has been the main problem with creating a true depiction of poetry comics. Some point to the work of William Blake as the start of the genre, but his pictures act more like illustrations to his poetry than part of the poems themselves. This tradition of illustrating poetry is still going strong. Unlike Blake, who would often include only one illustration for one complete poem, modern illustrators are beginning to break poems up to make them more like comics. However, simply breaking a poem up into different panels cannot make it a true work of poetry comics. Most of these “poetry comics” are an adaptation of an original, usually classic, poem. Most, at best, offer only an illustration of what is happening in the poem. The text remains the dominant, almost isolated mode of the piece. An example of this can be seen in Fig. 1, which comprises two panels from a graphic representation of Samuel Taylor Coleridge’s ‘Kubla Khan.’ Here the artist has only created a setting or backdrop in which to place the text of the poem. It is true that the poem is broken up and each section has been given its own frame, but each frame is only there to illustrate the action or the image depicted within the selected section. The panel for section one shows a lush garden, with the palace and walls. The illustrator has chosen to set those few words (palace, garden) of the poem in bold. This is an attempt on the artist’s part to point out which parts of the poem she decided to focus on. At best it is a superficial reminder to the reader that there is an illustration present. However, the illustration is there only to support the text, not to add anything to it. Granted, the picture is lush and detailed, but the fact that the text has been placed as a block solidly in the foreground while everything in the picture remains in the background shows that the text is indeed what is most important here. The bolded word does not invite the reader to study the image, just as the image does not ask to be studied. Despite the fact that the artist has tried to establish bridges between the text and the image, the two remain isolated from one another, much like a soloist and the stage scenery placed behind her. Even though there are multiple modes present, they do not create a solid ensemble. They are playing two different arrangements of the same piece.

Figure 1

Duke, A. (a), Coleridge, S. T. (w). Illustration, “Kubla Khan.” In R. Kick (ed.), The Graphic Canon, v. 2 (New York: Seven Stories Press, 2012: 2). Copyright © Alice Duke, 2012.

With this discussion behind us, we can turn our attention to a successful example of poetry comics. Nick Hayes’ The Rime of the Modern Mariner, despite paying obvious homage to Coleridge’s Rime of the Ancient Mariner, is an original poem with original artwork. The text and images were conceived at the same time rather than an image being fitted to an existing text. This simple fact goes a long way in explaining why this specific poem works as a strong example of poetry comics.

This close interplay between the image and the text creates a coherent whole throughout the poem. During several instances, the text is left intentionally vague or misleading. It is during these points that the meaning for the poem is turned over to the image. However, there are several instances where image without text would be idiosyncratic and virtually meaningless. The different modes are each foregrounded at different times showing the coherence of the multimodal ensemble within its own orchestration.

Hayes follows the basic narrative structure of Coleridge’s poem: a man meets a mariner who recounts his adventures at sea along with the lessons he learned. However, as the title points out, Hayes’ poem takes place in the modern world, where the business, industrial, and consumer worlds are held to be in stark opposition to the natural, rural world. As Hayes works through the poem with text and image, he is ultimately arguing that spirituality can be found in the natural world while the prevailing commercial industrialization is causing people to lose their souls.

This is first seen when the man who encounters the mariner gets ready to step out into the world. He has finished signing papers, and we are told that he is “cocooned inside and cosy/in the artificial glow” (Hayes 2011: 8). Even this simple page becomes emblematic of both segmentivity and the coherent multimodal approach that poetry comics must have.

Looking first at segmentivity, we see that rather than taking the entire clause as one part of meaning, Hayes breaks up the metered line into two units of meaning. He uses these units of meaning to highlight various aspects of the man’s life. The image (Fig. 2) breaks up the line to show how being cocooned in a modern age would look. No longer is being at home near a hearth considered cozy; a clean and spacious office building takes that place. The cocoon only becomes more complete as the individual in question is seen putting pen in pocket, hands in gloves, and (presumably) papers in briefcase. Everything is put where it needs to be, and the orderliness of the professional world becomes the cocoon of the businessman. However, at the bottom of the page Hayes undermines the homely feel by highlighting that everything is in an “artificial glow.” To discourage anyone from thinking that he is referring to the fluorescent lights of the office building, he exaggerates the image of the glowing handshake, making it the central focus of the page. By placing it directly under the phrase “artificial glow,” he makes it the next item to be “read.” The combination of this particular phrase and image creates a meaning that spreads to inform the entire page. The office space tells us about the space that the handshake takes place in. The images of the pen being put away, gloves being put on, and a briefcase being carried all work to let us know that business has come to a close. The handshake, then, should be read in relation to the other images, letting us know that this is the handshake to “seal the deal.” At the close of a business deal, one would think there would be good will all around as each side gets something that will benefit him or her. However, the text tells us that this glow is artificial. There is no genuine good will in the handshake, nor is there any warmth to the business environment. The handshake is a mere formality to be used as the different players of the business game continue to use others in their quest for capital. The gesture is a hollow one that, we imagine, is accompanied by a practiced smile. As meaning is balanced between text and image, we can begin to see that the coziness and glow referred to in the text is one of sterility and isolation. The building and the handshake are devoid of natural light. The businessman cocooned inside his gloves, even in the midst of a handshake, remains separated from human contact. It is what happens when a businessman loses his soul.

Figure 2

Hayes “Artificial Glow”. The Rime of the Modern Mariner. (New York: Viking, 2011: 8). Copyright © Nick Hayes, 2011.

The idea of losing spirituality is further elaborated on after the mariner, in his quest to poach a whale, shoots the albatross. Just as in Coleridge’s poem, the ship is becalmed after the shooting of the albatross. However, things begin to take a more literal turn with Hayes at this point. Instead of supernatural encounters, Hayes describes great towers of garbage and plastics floating in the sea, surrounding the ship. Billions of micro-organisms that have ingested plastics inch their way toward the ship. As expected, the mariner finds this scene horrifying and appalling (Hayes 2011: 126–29). However, he takes no personal responsibility for the state of the sea. It is in the midst of this eco-disaster that a typhoon sweeps the mariner into the sea where he comes face to face with a whale. The encounter with the whale proves to be the impetus for the mariner’s spiritual awakening.

The whale itself is alluded to in the text as “The queen/of all creation” (Hayes 2011: 224). The textual reference to Mary, the mother of Jesus, is further heightened by a picture that shows a woman in the pose of benediction. Her religious significance is drawn out as she is surrounded by the symbols of communion (cup, crucifix, and book) as well as two monks in the obvious act of prayer. With all the religious imagery in place, the text can move to the next line as the mariner relates that he was the “mote/within its eye” (Hayes 2011: 225). This is an obvious allusion to Matthew 7:3, where Jesus admonishes his followers to see to the beam in their own eyes before judging the mote in the eye of another. Recognizing his relationship to creation, the mariner realizes that he was the cause of creation’s problems.

The mariner’s full realization about the nature of his spirituality and soul is given final weight with the next segment of the poem and its coherence with the image (Fig. 3). He states that he had been “too long above my station” (Hayes 2011: 125). The text in and of itself remains vague and could simply refer to a general attitude of self-importance or pride; however, the image brings thematic weight to the phrase, tying it back to the emerging religious themes of the book. The image of the mariner climbing and falling from a tree with a serpent, coupled with the previously established religious meaning, is meant to mark the mariner as fallen. As Genesis says, the Fall happened because Eve and then Adam ate of the tree of knowledge, which “opened their eyes.” The image of understanding runs throughout. It is telling that knowledge also precipitates the fall. The realization that the mariner was the cause of the disaster (by being the mote in the eye of creation) causes his fall. The end result of the fall, as shown by the panel in the lower right-hand corner of Figure 3, is a mariner who is ashamed of who he has been and what he has done. It is with this shame and contrition that he can begin to find his salvation.

Figure 3

Hayes, The mariner’s fall. The Rime of the Modern Mariner. (New York: Viking, 2011: 125). Copyright © Nick Hayes, 2011.

Hayes is able to take his poem and layer meaning and symbolism into it by using poetry’s segmentivity. He isolates phrases and lines to focus on them, but unlike a traditional poem where the segmentivity needs to maintain some quality of coherence within itself, the mode of pictures can step in and provide coherence for the text. As seen in Figure 3, Hayes is able to tie the meaning of the second textual segment into the religious framework started by the first segment at the top of the page. In this manner the responsibility of meaning making is passed back and forth by the modes and held together in a coherent multimodal ensemble.

Competing Interests

The author declares that they have no competing interests.


N Cohn, (2006). Comic theory 101: Seeing rhyme. June 8 2006. Available at [Last accessed 30 Oct. 2014].

A Duke, (2012). Kubla Khan. In: R Kick (ed.), The Graphic Canon, 2. New York: Seven Stories Press, pp. 1.

J M Foley, (2002). How to Read an Oral Poem. Urbana: University of Illinois Press.

N Hayes, (2011). The Rime of the Modern Mariner. New York: Viking.

I Kant, (2007). Critique of Pure Reason. Trans. Marcus Weigelt. London: Penguin Books.

G Kress, (2010). Multimodality: A Social Semiotic Approach to Contemporary Communication. London: Routledge.

S McCloud, (1996). Understanding Comics: The Invisible Art. New York: HarperPerennial.

B McHale, (2010). Narrativity and segmentivity, or, poetry in the gutter. In: F Jannidis, M Martinez, J Pier, W Schmid (eds.), Narratologia: Contributions to Narrative Theory. Berlin: Walter de Gruyter, pp. 27.

A Meskin, (2009). Comics as literature? British Journal of Aesthetics 49 (3): 219.