Heart is for Love: Cognitive Salience and Visual Metonymies in Comics

This article explores the role of conceptual proximity as a parameter of salience in visual metonymies. The study discusses visual metonymy as a type of conceptual metonymy, understood as a way of referring to one concept (the target) via another concept (the vehicle; cf. Lakoff and Johnson 1980, Radden and Kövecses 1999). The vehicle and the target are connected through a contiguity relation salient in a given context. A formal framework is developed for describing such salient contiguities, and the key hypothesis is that the salience effect is determined largely by the conceptual distance between the target and the vehicle within a network of available contiguity relations. For example, when a musical note (the vehicle) is used in graphic narrative to refer metonymically to a melody (the target), the note is selected as the vehicle because there is low conceptual distance between the two concepts.


Setting the Scene
Originally, the research on conceptual metonymy, understood as a way of referring to one concept by means of another concept, focused on the domain of language.
In their seminal Metaphors We Live By (1980), George Lakoff and Mark Johnson proposed that expressions like She's just a pretty face are linguistic manifestations of the underlying association between the concepts person and face, which allows the former to be referred to by means of the latter (cf. Lakoff and Johnson 1980: 37). The authors emphasized that in their theory conceptual metonymies and conceptual metaphors are not, strictly speaking, linguistic phenomena, but mental phenomena expressed linguistically. In other words, metaphors and metonymies are not ways of speaking about the world, but ways of thinking about it. This opens up the possibility of tracing conceptual metonymies in domains other than language, and Lakoff and Johnson granted that this is a viable option. Therefore, it should not come as a surprise that metonymies surface in other semiotic modalities, like the domain of visual signs, and their presence in comics in particular has already been noticed by scholars (cf. e.g. McCloud 2006, Duncan and Smith 2009, Szawerna 2017).
In recent years, the interface between linguistics and comics scholarship has proven to be a fertile field of research. One of the most ambitious projects is Neil Cohn's Visual Language: an attempt to apply the theoretical apparatus of modern linguistics to the analysis of the semiotic repertoire of comics (cf. Cohn 2014). Cohn's proposition is a radical one: not only does he claim that theoretical terms used by linguists, like 'morphology,' 'syntax,' or 'dialect,' can capture some aspects of comics, but more importantly that the Visual Language of comics is the same kind of entity as spoken and signed languages, like English, Spanish, Swahili, or American Sign Language. This radical claim has been justly criticized by Szawerna (2017: 24-26), who points out that Visual Language as defined by Cohn lacks essential properties of natural language, especially discreteness, arbitrariness, and duality of patterning. Moreover, there is no evidence (apart from Cohn's speculations) that Visual Language could be spontaneously acquired in childhood like natural languages. Instead, it appears that creating comics is similar to other artistic skills, like playing the piano or sculpting, which have to be deliberately learned if one is to become a proficient artist. Nonetheless, Cohn rightly draws our attention to the fact that comics use signs in a principled and orderly manner amenable to systematic research and that patterns of visual narration hinge on cognitive faculties of the human mind. In this article, one such faculty, the ability to build and manipulate mental representations of physical and functional connections between objects, is taken to be crucial for explaining the mechanism underlying visual metonymy. This cognitive faculty does play a role in linguistic metonymy, and Cohn suggests that since the faculty underlies both linguistic and visual expressions, visual expressions are linguistic expressions. My approach is less radical: I merely claim that the cognitive faculty is general (i.e. not dedicated to exclusively linguistic purposes) and that it plays a role in structuring both linguistic and visual expressions. This does not entail that natural languages and patterns of visual narration are to be conflated. My approach is more similar to the one adopted by Potsch and Williams (2012) (as well as a few other authors from Linguistics and the Study of Comics, edited by Frank Bramlett (2012)), where metonymy is accounted for in terms of conceptual schemas and mappings operating in the semantic representations underlying visual signs. Also, I am generally sympathetic to 'multimodal' accounts of visual metaphors in comics (cf. Forceville 2005, Eerden 2009).
For our purposes, I will define a visual metonymy as a conceptual metonymy expressed by means of visual signs. One example analyzed in the following part of the article is a musical note used to refer to a melody (cf. Section 3.1). Plainly speaking, a musical note stands for a melody or a song, and this 'standing for' is the essential function of metonymy. I will also assume that such visual metonymies are the same kind of phenomena as linguistic metonymies: both of them are ways of referring to one concept by means of another concept. The adjectives 'visual' and 'linguistic' signal the fact that these mental processes manifest themselves in different media (visual versus acoustic or written) and not that they are different kinds of processes. As far as the distinction between conceptual metaphor and conceptual metonymy is concerned, I will subscribe to Lakoff and Johnson's (1980) view that the former typically involves similarity between two concepts coming from different domains, while the latter involves some sort of contiguity relation (causal, spatial, partitive, etc.) between two concepts in the same domain. Consequently, the conceptual metaphor time is a valuable resource in You're wasting my time! highlights selected salient similarities between the two distinct concepts of time and money (more technically, it involves a cross-domain mapping between the two), and the conceptual metonymy face for person in She's just a pretty face highlights a selected salient contiguity relation between face and person (more technically, a mapping between the two concepts in the same domain; cf. Lakoff and Johnson 1980). The complete picture is more complicated than that, since metaphor and metonymy often partake in complex interaction (cf. e.g. Goossens 1990) and some sorts of metaphors rely on co-occurrence of elements from different domains rather than similarity between them (see e.g. Lakoff and Johnson's orientational metaphors (cf. 1980) and Grady's correlation metaphors (cf. 2007)). However, since our discussion focuses on clear and relatively unproblematic instances of metonymies, these complexities can be put aside.
Before we move on, another point is worth clarifying. Under a sufficiently broad interpretation, every visual sign in comics can be interpreted as standing for something. A picture of Superman stands for Superman. A speech balloon whose tag points to Superman stands for Superman's utterance. Speedlines trailing Superman stand for Superman's motion. Et cetera. In this sense, every sign, both visual and non-visual, is a sort of metonymy, since every sign involves some sort of form standing for some sort of meaning. This observation, as noted by Radden and Kövecses (1999), expresses an important truth, but for our purposes I will adopt a narrower definition of visual metonymy. In this narrow sense, a visual metonymy is a process through which a visual vehicle concept (e.g. the concept musical note) refers to a target concept (not necessarily visual in nature, e.g. melody) and there is a relation of contiguity between these concepts. This narrow definition also excludes situations in which a visual sign stands for a visible element of the world, but also suggests something non-visual. For example, in Figure 6 I do not analyze the puffs of smoke coming from the protagonist's gun as a visual metonymy of a gunshot sound. The puffs of smoke are 'literal': they stand for the smoke produced during shooting and I do not take them to be metonymies of anything else, despite the fact that they may be suggestive of a gunshot sound. A rule of thumb for detecting typical visual metonymies is that their vehicles are visual elements which would not appear in comics unless the metonymies were at play.
In Figure 6, a typical visual metonymy is the hearts hovering around the head of one of the characters: the hearts do not represent any visible element of the world and their only function is the metonymic 'standing for' some other concept. The contiguity relation can be of various kinds, including part-whole and causal relationships, spatial and temporal closeness, or frequent co-occurrence. The last type of contiguity can be found in the already mentioned musical note for melody metonymy: often when music is played there is a sheet with notes somewhere around. This frequent co-occurrence of music and musical notes in a particular situation motivates the association between the two concepts. The same point is spelled out more generally and technically by Taylor: 'the essence of metonymy resides in the possibility of establishing connections between entities which co-occur within a given conceptual frame' (Taylor 2009: 125).
Metonymy researchers widely agree that contiguity relations employed in metonymies are somehow salient for particular users in particular situations. While Radden and Kövecses merely note that the vehicle and the target 'are somehow associated' (1999: 17), Langacker states expressly that metonymies show 'our natural inclination to think and talk explicitly about those entities that have the greatest cognitive salience for us' (Langacker 1993: 30). Therefore, salience seems to be an important part of the answer to the question of why certain vehicle concepts are more likely to be selected for particular target concepts than other (potentially available) vehicle concepts. For instance, in language, the sentence She's just a pretty face (Lakoff and Johnson 1980: 37) means roughly that the person in question is pretty.
More technically, the metonymy face for person selects the vehicle concept face rather than, for instance, knuckles, because the face is more salient than knuckles as far as a person's physical attractiveness is concerned. Turning to comics, one may plausibly speculate that when there is a need for depicting a melody in a comics panel, a musical note is more likely to be used than an image of a spreadsheet with a musical score. The choice of the note (rather than the spreadsheet) cannot be explained merely by the fact that there is a co-occurrence relation between notes and a melody, because a similar relation holds between a spreadsheet with score and the melody.
If salience is added to the equation, one may conclude that a note is used, because even though both notes and scores co-occur with the melody, the note is more salient to the melody than the spreadsheet.
Yet even though salience is an important constraint on vehicle selection in metonymies, the notion is far from being obvious and self-explanatory. It may be intuitively clear that in a given context concept a is more salient to concept c than concept b and for this reason a more likely metonymy is a for c rather than b for c, but what specifically makes a more salient than b (in relation to c)? For example, it is intuitively clear why the face is more salient to physical beauty than knuckles and why a note is more salient to a melody than a spreadsheet, but why exactly is this the case? The notion of salience may be intuitively obvious, but on closer inspection the underlying mechanisms appear surprisingly mysterious. Nonetheless, salience does not seem to be a basic unanalyzable property; on the contrary, it appears that in many cases we may at least try to determine what makes some concepts more salient than others in particular situations.

The Proximity Hypothesis
In '"Is this road lazy or just incompetent?"…' (Kowalewski 2017) I offered a formal model for describing salience in linguistic metonymies in terms of conceptual distance between the target and the vehicle concepts. Since both linguistic and visual metonymies are subtypes of conceptual metonymies, my working hypothesis is that the overall mechanism responsible for selecting the vehicle concept is the same.
The descriptive framework that I propose involves modeling the relations between the target and its potential vehicles as a network of contiguity relations. A generic network of this sort is sketched in Figure 1, where the box with 'T' stands for the target concept and the boxes with subscripted 'Vs' stand for potential vehicles. The target and other concepts are vertices of the network and the particular contiguity relations (partitive, causal, spatial, temporal, etc.) are the edges connecting the vertices. The network is built in the so-called cognitive domain, which is a structure of knowledge organizing all information about a particular concept or subject matter. Specific concepts are defined against their respective cognitive domains; for instance, the concepts melody, face, and month are defined relative to the cognitive domains [music], [human body], and [time] respectively (cf. Langacker 1987).
The cognitive domain where the contiguity network arises is the so-called search domain. Thus, the concept melody is defined against the cognitive domain [music], which gathers and organizes all knowledge that a person possesses about music. The fact that the target and the vehicle belong to the same cognitive domain provides some restrictions on the selection of the vehicle. Since both elements of the metonymy come from the same cognitive domain, the concepts which are not in any way related to the target are never selected for the vehicle simply because they are absent from the relevant search domain. For example, when a comics artist needs to metonymically depict a melody, she is extremely unlikely to choose a dandelion for the vehicle, since the search domain for the concept melody is [music] and the concept dandelion is absent from this domain (that is: it is not in any way associated with music or perhaps only distantly related through some idiosyncratic associations in the minds of some users).
The fact that the vehicles are always selected from the domain of the target precludes the selection of a huge number of concepts, but nonetheless leaves many candidates potentially eligible for the vehicle. However, not all of the available candidates are equally likely to be selected. Consider the already mentioned example of the target melody for which the vehicle note rather than spreadsheet with score is selected. This selection cannot be accounted for solely in terms of the shared search domain, since both note and spreadsheet with score belong to the search domain [music] and, at first glance, they both appear to be eligible candidates. In such cases, additional constraints come from the proximity hypothesis (cf. Kowalewski 2017): Ceteris paribus, within a network of contiguity relations inside a search domain, the preferred vehicle is the closest concept which ensures effective reference to the target. The preferred search domain is the domain of direct sensory or physical experience.
The hypothesis involves the already mentioned network of contiguity relations and specifies that the preferred cognitive domain in which the network arises is the domain of concrete entities available for direct observation or physical interaction.
The hypothesis stipulates that in canonical cases the concepts most likely selected as vehicles are the ones that are within the immediate vicinity of the target concept, i.e. they are one edge away from the target. More specifically, the most likely vehicle concepts are the concepts of objects which are associated with the target through direct observation or physical interaction. Since comics is an essentially visual medium, this part of the hypothesis can be slightly elaborated: the preferred search domain is the domain of direct visual or interactive experience. Potential vehicles must be visible, not merely detectable through any of our senses. Another important requirement is that a metonymy should allow for unambiguous reference to the target; if such reference is not achieved by the concept in the immediate vicinity of the target, an element 'further down' the network is likely to be selected instead. The ceteris paribus clause ('all other things being equal') indicates that the hypothesis applies to the canonical situation, but in some special contexts some parts of the hypothesis may not apply. Thus, the proximity hypothesis should not be viewed as a 'covering law' applicable to all cases of metonymies, but as an idealization capturing an important facet of salience.
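The selection procedure stipulated by the hypothesis can be illustrated with a minimal sketch in Python. It is only an illustration under stated assumptions, not part of the formal model itself: the network is treated as an undirected graph, conceptual distance is counted in edges, and the requirement of effective reference is reduced to a yes/no test supplied by the caller. The names `select_vehicle` and `refers_unambiguously`, as well as the toy [music] network, are hypothetical and introduced here purely for illustration.

```python
from collections import deque

def select_vehicle(network, target, refers_unambiguously):
    """Return (vehicle, distance) for the closest concept that secures
    reference to the target, or None if no concept does.

    network: dict mapping each concept to the set of concepts it is
             connected to by a contiguity relation (assumed undirected).
    refers_unambiguously: hypothetical predicate standing in for the
             'effective reference' requirement.
    """
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        concept, distance = queue.popleft()
        # Breadth-first order guarantees that closer concepts are tried first.
        if concept != target and refers_unambiguously(concept):
            return concept, distance
        for neighbour in sorted(network.get(concept, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None

# Toy search domain [music]; the edges are an assumption made for illustration.
music = {
    "melody": {"musical note"},
    "musical note": {"melody", "spreadsheet with score"},
    "spreadsheet with score": {"musical note"},
}

print(select_vehicle(music, "melody", lambda concept: True))
# If the nearest concept failed to secure reference, the search moves further down.
print(select_vehicle(music, "melody", lambda concept: concept != "musical note"))
```

The breadth-first traversal mirrors the idea that the process 'moves down' the network from the target and stops at the first concept that satisfies the reference requirement; the ceteris paribus clause is not modeled.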

Musical note for Melody
After this lengthy theoretical introduction, we are now in the position to analyze actual visual metonymies used in comics. The case studies investigated in the article come from Dylan Horrocks's Sam Zabel and the Magic Pen (Horrocks 2015). Figure 2 presents a frame with several instances of the already mentioned metonymy musical note for melody.
Since comics is a visual medium, depicting 'non-visual sensory experiences' (as Duncan and Smith put it; cf. 2009: 155) is, of course, a notorious problem for artists. One cannot draw music or the smell of a tasty meal, and yet these elements may have a clearly diegetic ('story-telling') function, and for this reason they should be depicted somehow. A number of conventionalized strategies are available for artists: speech is presented as written text, typically in speech balloons, and non-linguistic sounds are often expressed through onomatopoeias (also used in Figure 2). Yet neither of the strategies is particularly effective for music. In such situations, Horrocks's Sam Zabel and the Magic Pen frequently resorts to the visual metonymy musical note for melody.
In the light of the proximity hypothesis, it is easy to account for the selection of the concept musical note for the vehicle. Figure 3 presents a simple network of contiguity relations with two potential candidates for the vehicle, as proposed in Section 1. The actually selected vehicle is marked with a heavy-line box.
The concept musical note is preferred over spreadsheet with score, because the former is closer to the target melody than the latter within the network of contiguity relations. To put things less technically, a melody is more immediately associated with musical notes, because notes are used to represent the melody graphically, while a spreadsheet with score is (perhaps) immediately associated with musical notes, but not with the melody itself. Metaphorically speaking, the process of vehicle selection tends to economize the cognitive effort: while moving down the network of contiguity relations starting from the target, the process singles out the concept which is first to satisfy the requirement of effective reference. Since in musical note for melody, musical note can be used to unambiguously refer to music, there is no need to move further down the network in search of a different vehicle. If one agrees that a note is more salient for music than a spreadsheet with score, it is this 'immediacy of association,' expressed formally in Figure 3 as the distance from the target concept, that accounts for the salience effect.
Horrocks operates with instances of the metonymy musical note for melody with notable subtlety.

Action button for action
The metonymy musical note for melody employs a concept of a visible object which stands for something inherently invisible. According to the proximity hypothesis, attractive candidates for vehicles are also objects which are available for direct physical interaction. Such an 'interactive' metonymy can be found in Figure 4. In the frame, one of the protagonists, Alice, is holding a smartphone in order to make a recording. Alice's pose, with the smartphone at eye level and 'aimed at' the sexual orgy, suggests that Alice is capturing the scene with the device or that she is about to do this. Nonetheless, the author decided to clarify the panel by drawing the distinctive 'play button' in the balloon extending from the smartphone. This, of course, suggests that Alice has hit (or is about to hit) the button to make the recording.
Here, the function of the metonymy action button for action is not to signal an invisible element of the world, but to signal a complex action, which may have been difficult to show in a static picture. The author could have depicted the very action of engaging the camera, for instance as a close-up of the screen of the smartphone, but this would require adding one extra panel, which would probably feature the play button anyway. Alternatively, Alice could have said that she was recording the orgy, but this would require at least one extra sentence and result in an unnecessarily bloated speech balloon. In effect, the metonymy action button for action in Figure 4 is an elegant narrative solution: it successfully specifies the action performed by Alice and steers clear of the redundancy of the other options.
The metonymy action button for action in Figure 4 is fully compatible with the proximity hypothesis: the button depicted in the frame is the one which immediately triggers the action of recording. From our everyday experience with smartphones, we can safely assume that Alice had to press other buttons as well in order to record the scene: there was at least a button engaging the built-in camera and perhaps she needed to press a combination of buttons to unlock the device. Thus, before pressing the play button, Alice must have interacted with other elements of her device's interface, and since all of these elements form a cause-and-effect chain leading up to the action of recording, all of the buttons in the sequence are potential candidates for vehicles of the metonymy. However, the proximity hypothesis constrains the selection of the vehicle to the element of the interface most immediately linked to the target action of recording. The network of observable elements of the smartphone's interface with which Alice had probably interacted in order to start recording is sketched in Figure 5; the proximity hypothesis correctly singles out the play button actually used by the author. Thus, salience is successfully captured in terms of conceptual distance.
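The ranking of candidate vehicles along such a cause-and-effect chain can be sketched as follows. This is a minimal illustration, assuming that the chain is linear and that each causal link counts as one edge; the node labels ("unlock controls", "camera button") are hypothetical stand-ins for the interface elements described above and need not match the labels used in Figure 5.

```python
def distances_from(network, target):
    """Return a dict mapping every reachable concept to its distance
    (number of contiguity edges) from the target concept."""
    distance = {target: 0}
    frontier = [target]
    while frontier:
        next_frontier = []
        for concept in frontier:
            for neighbour in network.get(concept, ()):
                if neighbour not in distance:
                    distance[neighbour] = distance[concept] + 1
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return distance

# Hypothetical causal chain leading up to the target action of recording.
smartphone = {
    "recording": ["play button"],
    "play button": ["recording", "camera button"],
    "camera button": ["play button", "unlock controls"],
    "unlock controls": ["camera button"],
}

dist = distances_from(smartphone, "recording")
# Candidate vehicles ordered from the most to the least proximate.
ranking = sorted((c for c in dist if c != "recording"), key=dist.get)
print(ranking)  # ['play button', 'camera button', 'unlock controls']
```

On these assumptions the play button is the only candidate one edge away from the target, which is the sense in which the hypothesis singles it out.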
It is worth noting that in many modern devices the button initiating the camera features a red dot instead of a triangle. In a private exchange of emails, Dylan Horrocks admitted, however, that at the time of drawing the frame it was the triangle that he associated with the 'record' button. This does not falsify the proximity hypothesis, but merely highlights the importance of personal world knowledge and idiosyncrasies of associations forged by particular artists. Apparently, in Dylan Horrocks's search domain the triangle was more readily included than a red dot and it is this semantic material on which the proximity hypothesis operated, at least at the moment when the graphic novel was created.

Heart for affection
The ceteris paribus clause prefacing the proximity hypothesis states that the hypothesis applies to a canonical, and somewhat idealized, situation when all additional factors can be filtered out from the picture. Not surprisingly, oftentimes the situation is more complicated and some extra factors have to be taken into account.
Their effect varies: they may provide additional constraints on the selection of the vehicle in the cases when several candidates are compatible with the proximity hypothesis, they may supersede the hypothesis altogether, or they may activate a different search domain than the domain of direct sensory or physical experience.
One such factor is cultural conventions. A cultural convention associating love and affection with the heart plays a decisive role in the visual metonymy heart for affection in Figure 6.
Cognitive linguists have convincingly argued that in the domain of language concepts of emotions are very likely to be expressed through metaphors and metonymies motivated by bodily symptoms (cf. e.g. Lakoff 1987, Kövecses 1990, Kövecses 2000).
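This observation can be easily extended to visual media. Visual metonymies of this sort have been noticed by comics scholars as well, but the scope of their discussions is often quite limited. For example, McCloud in Making Comics noticed that facial expressions of characters can stand for their emotional states (McCloud 2006: 85), which instantiates behavior (caused by emotion) for emotion. Duncan and Smith make a similar point while discussing what they call 'psychological images' (Duncan and Smith 2009: 160). Interestingly, McCloud does not use the term 'metonymy' in this context, and Duncan and Smith recognize a metonymy in McCloud's analysis of facial expressions (cf. Duncan and Smith 2009: 134-135), but not in their own examples of psychological images. Szawerna (2017) applies the term 'metonymy' more explicitly in relevant cases. Linguistic and visual metonymies of emotions share the same motivation: emotional states of other people cannot be perceived directly. In order to directly experience someone's anger, one would need to be that person. For this reason, other people's emotions are typically inferred from behavioral and physiological symptoms: facial expression, perspiration, shaking of hands, redness or paleness of face, etc. Also, one's own emotions are typically accompanied by physiological symptoms like the feeling of heat or cold, increased blood pressure and heart rate, dizziness, etc. Due to these close causal associations, physiological and behavioral symptoms are likely to serve as vehicles in metonymies of emotions and all of them appear to be compatible with the proximity hypothesis. Thus, the repository of potential candidates for the vehicle is quite large and, since all bodily symptoms are directly caused by emotions, the proximity hypothesis does little to constrain the selection. Extensive studies carried out by cognitive linguists (cf. e.g. Lakoff 1987, Kövecses 1990, Kövecses 2000) show that many bodily symptoms are, in fact, used as vehicles in metonymies in languages around the world, either as standalone metonymies or metonymy-based metaphors. What helps to narrow down the selection is cultural conventions. In Western culture, warm feelings of love and affection are often expressed through the conventionalized metonymy heart for affection.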
To see how cultural conventions help to constrain the selection of the vehicle in this case, it should be noticed that in some non-Western cultures other organs are more readily associated with love and affection. A good illustration of this is the Indonesian culture, where love is more readily associated with the liver (cf. Siahaan 2008). From the historical point of view, the Western association between the heart and affection is motivated by ancient and medieval philosophy and medicine predominant in Europe, which emphasized the connection between the heart and positive emotions (cf. Baig et al. 2007, Schmitter 2010). Yet it seems unlikely that this expert knowledge plays any role in modern folk understanding of affection; it is more probable that the association is simply 'learned by heart' by modern Westerners. The network of contiguity relations, with the concepts heart and liver both immediately available as metonymic vehicles, is sketched in Figure 7.
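The way in which a cultural convention may break a tie between equally proximate candidates can be sketched as follows. This is a deliberately simplified illustration: the distances, the culture labels, and the function name `choose_vehicle` are assumptions made here for the sake of the example, and a real convention is certainly richer than a single lookup table.

```python
def choose_vehicle(distances, conventions, culture):
    """Pick a vehicle among the closest candidates.

    distances:   dict mapping candidate vehicles to their distance from the target.
    conventions: dict mapping a culture label to its conventional vehicle
                 (a hypothetical simplification).
    Returns (choice, tied), where choice is None if proximity leaves several
    candidates and no convention settles the matter.
    """
    closest = min(distances.values())
    tied = sorted(c for c, d in distances.items() if d == closest)
    if len(tied) == 1:
        return tied[0], tied
    preferred = conventions.get(culture)
    choice = preferred if preferred in tied else None
    return choice, tied

# Both organs are assumed to be one edge away from the target affection.
affection = {"heart": 1, "liver": 1}
conventions = {"Western": "heart", "Indonesian": "liver"}

print(choose_vehicle(affection, conventions, "Western"))     # ('heart', ['heart', 'liver'])
print(choose_vehicle(affection, conventions, "Indonesian"))  # ('liver', ['heart', 'liver'])
```

The sketch treats the convention as an additional constraint applied only after proximity has done its work, which corresponds to the first of the three effects of extra factors distinguished above.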
It is worth noting that in heart for affection the search domain is not the domain of direct sensory experience; instead the network of contiguity relations arises in the domain [internal organs]. Nonetheless, the two domains are intimately interrelated, at least in the context of emotions. Firstly, there is a sense in which emotions are directly experienced 'inside' internal organs: love and affection may be accompanied by increased heart rate or blood pressure, which motivates the association between the emotional states and the heart. Secondly, the domain of observable and interactive objects mentioned in the proximity hypothesis clearly favors concrete material vehicles over abstract and immaterial ones. Even though internal organs are not normally observed, they are nonetheless observable physical objects: they can be observed in certain circumstances. It is this concreteness and observability, the propensity of an object to be observed in some situations, that is more crucial for vehicle selection than the mere state of being exposed to observation in typical situations.
This physical concreteness and availability for the sense of sight is particularly important in the context of the comics medium: after all, it is trivially true that the vehicle of a visual metaphor has to be shown in visual form in order to be shown at all.

Conclusion
Needless to say, this brief analysis of three visual metonymies from Sam Zabel and the Magic Pen does not cover all instances used in the graphic novel, let alone all kinds of visual metonymies used in other comics. In Sam Zabel, honorable mentions go to perspiration for emotion (more specifically, perspiration for fear; p. 123), question mark for confusion (p. 123), exclamation mark for surprise (p. 144), and stars for dizziness (with a metaphorical component patches of light in distorted vision are stars; p. 137). As already mentioned in the introduction, the case studies in this article cover visual metonymies in the narrow sense. Such metonymies rely on some sort of salient contiguity relation between the vehicle and the target. This salience can be explained in terms of conceptual proximity between concepts in a network of contiguity relations formed in a cognitive domain. In this model, out of all potential vehicles related to the target concept through contiguity relations, the most salient one tends to be the one immediately connected to the target (one edge away from the target), provided that this concept secures unambiguous reference to the target. In some cases, the proximity effect may not suffice to select one vehicle and cultural conventions may narrow down the options available. In the comics medium, an important additional constraint is concreteness and visibility of the vehicle, since all elements of a comics narrative have to be presentable in graphical form.
The proximity hypothesis describes the canonical and somewhat idealized cases. Even though the hypothesis correctly characterizes many actual visual metonymies, it sometimes fails to sufficiently constrain the vehicle selection. In more extreme cases, the hypothesis does not apply at all due to some unexpected factors.
Nevertheless, all visual metonymies which I found in Sam Zabel and the Magic Pen are at least partly compatible with the proximity hypothesis. Visual metonymies violating the proximity hypothesis are logically possible and most probably they can be found in other comics narratives. In spite of this, the hypothesis appears to capture an important aspect of conceptual metonymies in both visual and nonvisual modalities.

Figure 2 depicts a lot of music: there are two onomatopoeias, tom tum tom and tappity tippity tap, associated with the percussion instruments, and three musical note for melody metonymies, two of them enclosed in speech balloons, one of which stands for a song by 'popular tenor Pal Pory.' Since speech balloons are closely associated with vocal utterances, the two metonymies in the balloons suggest music produced by or reminiscent of the human voice (arguably, this is how the 'mouth organ' sounds). The melody played with the instrument 'five string pipe flute' produces sounds of a different quality, which is indicated by the fact that the notes are not enclosed in a speech balloon and hover freely around the player.

Figure 3: Contiguity relations in musical note for melody (author's own diagram).

Figure 5: Contiguity relations in action button for action (author's own diagram).