<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">What is Lost in Translation from Visual Graphics to Text for Accessibility</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Peter</forename><surname>Coppin</surname></persName>
							<email>pcoppin@faculty.ocadu.ca</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Dept. of Industrial Design</orgName>
								<orgName type="department" key="dep2">Faculty of Design</orgName>
								<orgName type="institution">OCAD University</orgName>
								<address>
									<postCode>M5T 1W1</postCode>
									<settlement>Toronto</settlement>
									<region>ON</region>
									<country key="CA">CANADA</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Dept. of Mechanical and Industrial Engineering</orgName>
								<orgName type="institution">University of Toronto</orgName>
								<address>
									<postCode>M5S 3G8</postCode>
									<settlement>Toronto</settlement>
									<region>ON</region>
									<country key="CA">CANADA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">What is Lost in Translation from Visual Graphics to Text for Accessibility</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">11EAA7A945835888AD3EC9474177A2CE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T19:20+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Many blind and low-vision individuals are unable to access digital graphics visually. Currently, the solution to this accessibility problem is to produce text descriptions of visual graphics, which are then translated via text-to-speech screen reader technology. However, if a text description can accurately convey the meaning intended by an author of a visualization, then why did the author create the visualization in the first place? This essay critically examines this problem by comparing the so-called graphic-linguistic distinction to similar distinctions between the properties of sound and speech. It also presents a provisional model for identifying visual properties of graphics that are not conveyed via text-to-speech translations, with the goal of informing the design of more effective sonic translations of visual graphics.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Graphics Without Visual Perception</head><p>Consider the experience of a blind or low-vision individual who uses a screen reader to access pictures, diagrams, charts, and graphs. Unlike a user who accesses graphical media through visual perception, the screen reader user usually accesses these graphics via text-to-speech "descriptions," essentially interpretations of the author's intended meaning that reflect what the person who produced the text description deemed most relevant. For example, Figure <ref type="figure" target="#fig_0">1a</ref> presents a financial chart with rising and falling stock prices over time, where time is shown on the horizontal axis and monetary value is shown on the vertical axis. Figure <ref type="figure" target="#fig_0">1d</ref> presents a text description of the chart compliant with the Web Content Accessibility Guidelines (WCAG), using text to describe the rising and falling monetary values over time. The next sections compare and contrast how these presentations are experienced.</p><p>In a text description of a visual graphic (Figure <ref type="figure" target="#fig_0">1d</ref>), all of the information is conveyed via text (or text-to-speech, when conveyed via screen reader technology). But in the original chart (Figure <ref type="figure" target="#fig_0">1a</ref>), only some of the information is conveyed via text, predominantly numerical values and labels (Figure <ref type="figure" target="#fig_0">1c</ref>); the shape of the shaded contour (Figure <ref type="figure" target="#fig_0">1b</ref>) is not conveyed via text: visually perceived shapes are picked up "more directly." When the chart is translated for accessibility, the features of those shapes are instead conveyed via text (Figure <ref type="figure" target="#fig_0">1e</ref>), and important properties of the visually perceived shape information (Figure <ref type="figure" target="#fig_0">1b</ref>) are lost in that translation. 
This shape information is needed to provide the unique affordances that are often associated with "visual" representations relative to text. Many scholars have explored the differences between graphics and text, often referred to as the so-called "graphic-linguistic distinction" <ref type="bibr" target="#b18">(Shimojima, 1999)</ref>. In addition, researchers have investigated how so-called "nonlinguistic sonification" can be employed to make charts and graphs more accessible (e.g., <ref type="bibr" target="#b10">Edwards, 2010)</ref>. This essay examines the graphic-linguistic distinction in order to better understand how it could correspond to a similar distinction between the properties of non-linguistic sonification and speech, thereby providing a means to identify what is lost when graphics are translated to text-to-speech. An increased understanding could inform the design of new approaches for conveying properties of graphically represented shapes via sound.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>The Graphic-Linguistic Distinction: Implications for Sonic Interface Design</head><p>The graphic-linguistic distinction has been described in various ways: analogical versus Fregean; analog versus propositional; graphical versus sentential; and diagrammatical versus linguistic <ref type="bibr" target="#b18">(Shimojima, 1999)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>2D Versus Sequential</head><p>According to <ref type="bibr" target="#b12">Larkin and Simon (1987)</ref>, a diagrammatic representation can be defined as a "data structure in which information is indexed by two-dimensional location" whereas a sentential representation can be defined as "a data structure in which elements appear in a single sequence". An advantage of diagrams is that they "preserve explicitly the information about the topographical and geometric relations among the components of the problem." For the purposes of this essay, the text description in Figure <ref type="figure" target="#fig_0">1e</ref> is classified as sentential because the text is composed of marks arranged in a linear sequence and the marks are taken to refer to words with linguistic meanings (linguistically conveyed elements). In contrast, Figure <ref type="figure" target="#fig_0">1a</ref> is classified as a diagram because the financial values are indicated via (textually) labeled points or lines (elements) that are indexed to a graphical grid. The visually processed spatial relations among these labeled marks yield powerful affordances, because by processing the contours of lines or the relative positions of marks scattered across the two-dimensional graphical surface, the viewer can infer values and trends that are not explicitly conveyed via labels (cf. <ref type="bibr" target="#b4">Barwise &amp; Etchemendy, 1990)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Implications for sonic charts and graphs</head><p>Sonic sentential properties. Text-to-speech (the current standard for WCAG accessibility) would seem to be the obvious candidate for the sonic version of what Larkin and Simon referred to as a sentential structure, where elements are arranged in a linear sequence. In the case of visually processed written sentences composed of word forms printed on a page, the sequential properties result from the linear arrangement of characters and word forms on the printed surface. In the case of sonic sentential structures, the sequential properties are temporal, presented as a sequence of sounds that are perceptually processed as words that refer to intended meanings. Larkin and Simon did not define what the elements (that are arranged in sequence) are composed of. For the purpose of this subsection, let us assume that the elements are some combination of properties that, when sequentially processed as words, refer to intended items.</p><p>Sonic diagrammatic properties. To present diagrammatic properties in a way that can be perceived aurally, designers would need to exploit properties of sound that can convey topological and geometric relations. People use stereo, echo, and the Doppler effect to determine the spatial locations of sound-producing objects in physical environments (cf. <ref type="bibr" target="#b14">Nasir &amp; Roberts, 2007)</ref>. Designers could exploit these cues to convey geometric and topological relations among elements that are indexed to a 2D plane (cf. <ref type="bibr" target="#b5">Brown, Ramloll, Burton, &amp; Riedel, 2003;</ref><ref type="bibr">Hermann, Hunt, &amp; Neuhoff, 2011)</ref>. Figure <ref type="figure">2</ref> shows how left and right arrow keys could move an "audio cursor" to different positions on an x-axis of a computationally generated 2D space. The position of the sonically conveyed cursor on the x-axis could be indicated via stereo (cf. 
<ref type="bibr" target="#b21">Zhao, Plaisant, Shneiderman, &amp; Lazar, 2008)</ref>. For a simple sparkline graph, the sonic cursor can alter the pitch of the sound if "scrubbed" to different points on the x-axis, so that higher pitches correspond to points that intersect with the cursor at higher elevations (Figure <ref type="figure">2</ref>, right) and lower pitches correspond to points that intersect with the cursor at lower elevations, thereby allowing blind or low-vision users to perceive the contours of the graph (cf. <ref type="bibr" target="#b5">Brown, Ramloll, Burton, &amp; Riedel, 2003)</ref>.</p></div>
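The scrubbing interaction described above can be sketched in code. The following Python fragment is a minimal illustration only, not from the essay: the function name, the frequency range, and the pan convention are assumptions, and a real interface would feed these values to an audio synthesis engine.

```python
def audio_cursor_event(x, y, x_range, y_range, f_min=220.0, f_max=880.0):
    """Map a data point under the sonic cursor to (stereo pan, pitch).

    Pan is linear in x (-1.0 = hard left, +1.0 = hard right), conveying the
    cursor's horizontal position via stereo; pitch rises exponentially with
    y, so equal vertical steps are heard as equal musical intervals.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    pan = 2.0 * (x - x_min) / (x_max - x_min) - 1.0
    t = (y - y_min) / (y_max - y_min)
    frequency = f_min * (f_max / f_min) ** t
    return pan, frequency

# Scrubbing left to right over a rising series: the pan sweeps from the
# left channel to the right while the pitch climbs.
series = [(0, 10.0), (1, 20.0), (2, 40.0)]
events = [audio_cursor_event(x, y, (0, 2), (10.0, 40.0)) for x, y in series]
```

The exponential pitch mapping is one design choice among several; a linear mapping would also work, but equal data increments would then be heard as progressively smaller musical intervals.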
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Relation Symbols and Object Symbols</head><p>According to <ref type="bibr" target="#b17">Russell (1923)</ref>, in sentences "words which mean relations are not themselves relations," whereas in graphical representations like maps, "a relation is represented by a relation." An example of the latter is the financial chart (e.g., Figure <ref type="figure" target="#fig_0">1a</ref>), where higher monetary values are conveyed via marks at higher elevations of the graphic, whereas lower monetary values are conveyed via marks at lower elevations. This convention allows the visually perceived spatial relationships among the marks to represent relationships among monetary values over time.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Implications for sonic charts and graphs</head><p>Graphical relations could be conveyed sonically. Consider two tones with different pitches: Tone A and Tone B (Figure <ref type="figure">2</ref>, right). If Tone A is at a lower frequency than Tone B, then the sonic relation between the two tones is the perceptible difference in pitch between the tones. For example, if Tone A refers to a stock price at an earlier point in time, and Tone B refers to a stock price at a later point in time, then the perceptible difference between the pitches of the tones can convey the difference in price over time. Moving the sonic cursor from left to right would correspond to a change (increase) in pitch, conveying the change in stock price over time via a sonic relation.</p><p>Figure <ref type="figure">2</ref>. By scrubbing a "sonic cursor" along an axis, audiences could access sonically conveyed relations through changes in pitch and via stereo.</p></div>
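The Tone A/Tone B example can be made concrete with a small computation. This sketch is not from the essay; the helper names and the mapping of the price range onto a two-octave pitch range are assumptions. It maps two stock prices to frequencies and reports the perceived musical interval between the resulting tones, so the sonic relation itself carries the price relation:

```python
import math

def price_to_frequency(price, price_range, f_min=220.0, f_max=880.0):
    """Map a price onto an exponential pitch scale, so equal price steps
    are heard as equal musical intervals."""
    lo, hi = price_range
    t = (price - lo) / (hi - lo)
    return f_min * (f_max / f_min) ** t

def interval_semitones(f_a, f_b):
    """Perceived interval between Tone A and Tone B, in semitones;
    a positive value means Tone B is higher (the price rose)."""
    return 12.0 * math.log2(f_b / f_a)

# Tone A refers to the earlier price, Tone B to the later price.
tone_a = price_to_frequency(100.0, (100.0, 200.0))
tone_b = price_to_frequency(150.0, (100.0, 200.0))
rise = interval_semitones(tone_a, tone_b)  # 12.0 semitones: one octave up
```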
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Analog Versus Digital</head><p>The classic distinction between analog versus digital, where analog refers to visual properties of a graphic and digital refers to linguistic properties, is most commonly associated with <ref type="bibr" target="#b9">Goodman (1968)</ref>. <ref type="bibr" target="#b18">Shimojima (1999)</ref> illustrated this distinction using the example of a speedometer dial. The analog aspect of the dial is the perceived orientation of the speedometer needle relative to the numerically labeled marks on the dial. The digital aspect is the numerical magnitude (speed) that the user extrapolates by perceptually processing the orientation of the needle relative to the marks representing numerical values. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Implications for sonic charts and graphs</head><p>The analog versus digital distinction appears to involve two interrelated capabilities: lower-level perceptual capabilities to process geometric and topological properties (e.g., those shown on the speedometer dial); and higher-level capabilities to process, filter, and interpret how those perceptually processed features fall into conceptual categories (e.g., the numerically represented velocity) <ref type="bibr" target="#b13">(Mandler, 2006</ref>; Figure <ref type="figure">3</ref>). For instance, to discern the values shown on a visual financial chart, a user must perceptually process the light reflected from the surface of the chart, observing lines in relation to dots that are labeled using textually conveyed numerical values and/or company names. To discern topological and geometric features using sound perception, a user would need the same set of interrelated capabilities: lower-level capabilities to process varying sound frequencies, timbre, etc., as well as higher-level capabilities to identify the linguistic meanings of the sounds. The current text-to-speech approach only exploits the digital properties of language, but designers could produce more effective translations by recruiting "precategorized" analog properties of sound such as pitch, echo, stereo, and timbre to convey geometric and topological properties.</p><p>Figure <ref type="figure">3</ref>. A perception-reaction system is hierarchically organized to process lower-level perceptual structures and categorize them into higher-level conceptual categories.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Intrinsic Versus Extrinsic Constraints</head><p>For brevity, the following discussion will use the classic characterization provided by <ref type="bibr" target="#b4">Barwise and Etchemendy (1990)</ref> because it is compact and intuitive:</p><p>Diagrams are physical situations. They must be, since we can see them. As such, they obey their own set of constraints . . . By choosing a representational scheme appropriately, so that the constraints on the diagrams have a good match with the constraints on the described situation, the diagram can generate a lot of information that the user never need infer. Rather, the user can simply read off facts from the diagram as needed. This situation is in stark contrast to sentential inference, where even the most trivial consequence needs to be inferred explicitly.</p><p>To illustrate how "diagrams are physical situations," consider the illustration shown in Figure <ref type="figure">2</ref> (left). A text (or text-to-speech) description might go as follows: "A is below B and both A and B are to the left of C." Another textual description might read: "B is between A and C and is above both A and C." Each text description conveys a different interpretation of what is shown visually and therefore affords different inferences. In contrast, a diagram can convey many other relationships because of how it conveys topological and geometric information through visual perception: Barwise and Etchemendy referred to this as a diagram's ability to present "countless facts."</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Implications for sonic charts and graphs</head><p>When <ref type="bibr" target="#b4">Barwise and Etchemendy (1990)</ref> referred to diagrams as "physical situations," they were referring to the properties (and affordances) of diagrams that emerge through interaction via a human visual perception system. The challenge for designers who seek to extend the affordances of visual diagrams to the sonic domain is to identify properties or dimensions of sound that similarly (i.e., using human perceptual processing of sound) make use of "physical situations" to present "countless facts."</p><p>Thus, a hybrid stereo-varying frequency interface (see Figure <ref type="figure">3</ref>) should enable a user to "hear the shape" of a contour. Indexing text-to-speech labels to contours should allow users to form multiple sentences (countless facts) about the geometric and/or topological relations among the labeled elements.</p></div>
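Barwise and Etchemendy's "countless facts" point can be illustrated computationally. Given one diagrammatic configuration of labeled points (the coordinates below are a hypothetical stand-in for the kind of arrangement shown in Figure 2, left; none of this code is from the essay), many true sentential statements can be read off mechanically, whereas any single text description commits to just one of them:

```python
from itertools import permutations

# A hypothetical configuration of labeled points on a 2D plane:
# x grows rightward, y grows upward.
points = {"A": (0, 0), "B": (1, 2), "C": (3, 1)}

def read_off_facts(points):
    """Enumerate true pairwise spatial relations implicit in the diagram."""
    out = []
    for p, q in permutations(points, 2):
        (px, py), (qx, qy) = points[p], points[q]
        if px < qx:
            out.append(f"{p} is to the left of {q}")
        if py < qy:
            out.append(f"{p} is below {q}")
    return out

facts = read_off_facts(points)
# Any single sentential description selects just one of these facts.
```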
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Extending the Graphic-Linguistic Distinction into the Sonic Domain</head><p>Let us now extend the various graphic-linguistic distinctions to consider sonic versions of visual charts and graphs.</p><p>1. Extending the diagrammatic versus sentential distinction, text-to-speech can be considered a sonic version of what Larkin and Simon referred to as a sentential structure and is the current WCAG approach to web accessibility. In contrast, spatial sound can be exploited to convey 2D sonic diagrammatic external representations.</p><p>2. Extending the analog versus digital distinction, text-to-speech uses language to convey digital properties sonically. The analog properties of sound, such as tone, timbre, stereo, and echo, could afford the communication of spatial, geometric, or topological information.</p><p>3. Extending the distinction between relation symbols and object symbols, the current text-to-speech approach uses words to convey relations. Because relations among elements represented by analog and spatial properties of sound are themselves relations, analog and spatial properties of sound could be recruited to map numerical values to perceptual dimensions.</p><p>4. Extending the distinction between intrinsic and extrinsic constraints, producing sonic versions of visual graphics would require identifying "physical situations" that naturally emerge during human perceptual processing of sound to present "countless facts."</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Perceptual and Conceptual Graphic Relations</head><p>This section integrates these extensions and proposes how the graphic-linguistic distinction could be extended to sonic external representations. First, let us recruit and expand on the distinction between lower-level perceptually processed topological and geometric features of an environment versus the recognition, categorization, and linguistic communication of those features.</p><p>Visual and aural sentential structures and relations are detected and perceptually processed via lower-level sensory receptors and perceptual categories (Figure <ref type="figure">3</ref>, left). In written text or text-to-speech, what is most relevant is the higher-level conceptual category (Figure <ref type="figure">3</ref>, right) that a given feature (such as perceptually processed printed text on a page or text-to-speech) is taken to fall under. What is needed is a way to convey topological and geometric relations among elements by exploiting lower-level perceptually processed features of a visual graphic or sonic structure (Figure <ref type="figure">3</ref>, left). Let us refer to these perceptually processed features as perceptual properties. Let us refer to these perceptually processed relations among elements as perceptual relations. Let us refer to relations that are communicated via text as text-described relations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Perceptual Relations vs. Text-Described Relations</head><p>We are now ready to build on previous work by <ref type="bibr" target="#b6">Coppin (2014)</ref> to provide a theoretical foundation for distinguishing perceptual relations versus text-described relations.</p><p>The model is based on the idea that an individual's perception-reaction loop (cf. <ref type="bibr" target="#b8">Gibson, 1986)</ref> enables survival and prosperity within a dynamic environment composed of change and variation. This requires capabilities to predict, anticipate, and simulate <ref type="bibr" target="#b0">(Barsalou, 1999)</ref> dynamic change and variation. For example, reaching for and grasping an item such as a cup requires capabilities to perceptually process features from the proximal surface of the item and also to predict, anticipate, and simulate features of the distal surface of the item.</p><p>These simulations are constructed from the memory traces of past perception-reactions (conjunctive neurons), so simulation involves many of the same neural systems used during perception <ref type="bibr" target="#b11">(Kosslyn, Ganis, &amp; Thompson, 2001)</ref>. For example, as I perceive the cup, I am also informing potential action (reaching for and grasping the proximal and distal sides of the cup). Thus, perception and simulation are integrated aspects of perception-reaction within a physical environment, and each act of perception-reaction leaves memory traces in the form of conjunctive neurons across lower-level association areas (Figure <ref type="figure">3</ref>).</p><p>At lower-level association areas, which are more tightly coupled with sensory receptors, simulated prototypes fall under perceptual categories. At higher-level association areas (see Figure <ref type="figure">3</ref>, right), conjunctive neurons converge in zones across multiple sensory modes. 
These "convergence zones" <ref type="bibr" target="#b7">(Damasio, 1989;</ref><ref type="bibr" target="#b19">Simmons &amp; Barsalou, 2003)</ref> enable simulated prototypes of possible perception-reactions that are not as easily described in terms of a specific perceptual mode or a reenactment of a specific prior perception-reaction. Instead, these simulated prototypes fall under more general categories of possible perception-reactions <ref type="bibr" target="#b1">(Barsalou, 2003)</ref>. These are not only more amodal, but have been described as more filtered, interpreted <ref type="bibr" target="#b16">(Pylyshyn, 1973)</ref>, conceptual <ref type="bibr" target="#b1">(Barsalou, 2003</ref><ref type="bibr" target="#b2">, 2005)</ref>, or abstract <ref type="bibr" target="#b1">(Barsalou, 2003)</ref>. For example, a child who takes a bite out of what turns out to be a rotten apple might later reenact this experience when she perceives another rotten apple with common properties. Over time, she will develop an understanding of 'rotten' as a category that can include apples, as well as many other objects and experiences.</p><p>Similarly, a child can learn to associate sounds with certain intended meanings (learning a language), or to associate marks with intended meaning (learning to read). The abstract concept of 'square' can apply to a shape on a raised surface that is touched but not seen, as well as to a drawing on a piece of paper that is seen and not touched. These "less modally specific" simulations have been described as more "interpreted" or "conceptual," while more perceptually based simulations are considered to be more "concrete." The next section applies this interpretation to external graphic representation.</p><p>Back to charts and graphs. 
In a financial chart (and many other kinds of diagrams), relations are conveyed via lower-level perceptual processing of the geometrical and topological properties of the marked physical surface (Table <ref type="table" target="#tab_0">1</ref>). In contrast, in text descriptions (sentential structures), relations are conceptual (and conveyed linguistically; see Table <ref type="table">2</ref>); although visual properties of printed text or aural properties of text-to-speech are also picked up by sensory receptors, what is meaningful about them is the conceptual relation that is conveyed linguistically. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Perceptual Specificity is Lost in Translation</head><p>The idea of "specificity" is central to understanding what is lost in translation, so let us begin by clarifying what is meant by "more or less specific" in this context. Consider the line shown in Figure <ref type="figure" target="#fig_3">4b</ref>. Relative to the line of Figure <ref type="figure" target="#fig_3">4c</ref>, we have more knowledge about the location of a point in a one-dimensional space, due to the shaded red marker. This means we have more certainty (or more information) about the specified location of the point in Figure <ref type="figure" target="#fig_3">4b</ref> than we do about the location of the point in Figure <ref type="figure" target="#fig_3">4c</ref>. Extending the line example to discuss perceptual relations, the line in Figure <ref type="figure" target="#fig_3">4b</ref> refers to marks or sounds that an author intentionally configures to cause intended audience percepts (the diagram in Figure <ref type="figure" target="#fig_3">4a</ref>). However, the perceptual relations of Figure <ref type="figure" target="#fig_3">4a</ref> can be processed, filtered, and interpreted to fall under a range of possible relational categories (that can be text-described), indicated by the highlighted segment of the right line in Figure <ref type="figure" target="#fig_3">4c</ref> (as shown in Figure <ref type="figure" target="#fig_3">4d</ref>: "A is below B and both A and B are to the left of C" or "B is between A and C and is above both A and C"). In other words, although perceptual specificity is high, conceptual specificity of the intended relation is low because the perceptual relations can fall under numerous conceptual categories. However, the reverse is also true, and this reversal exposes the heart of what is lost during the translation process.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conceptual Specificity is Perceptually Ambiguous</head><p>Extending the line example to discuss the perceptual ambiguity of text-described (conceptual) relations, the right highlighted line in Figure <ref type="figure">5c</ref> refers to a specific (sentential) text description authored to convey intended conceptual relations (Figure <ref type="figure">5d</ref>). However, numerous perceptual relations (Figure <ref type="figure">5a</ref>) can fall under the text-described conceptual relations, indicated by the highlighted segment of the left line in Figure <ref type="figure">5b</ref>. In other words, although conceptual specificity is high, perceptual specificity of the intended relations is low, because numerous perceptual relations can fall under the text-described conceptual relations.</p><p>Figure <ref type="figure">5</ref>. The model predicts that when conceptual specificity is high (c), perceptual specificity is low (b).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Application to an Example Design Problem</head><p>Let us now return to the WCAG text description example from Figure <ref type="figure" target="#fig_0">1</ref> in order to demonstrate what is lost in translation and how what is lost could be conveyed via nonlinguistic sound. In the text description (Figure <ref type="figure" target="#fig_0">1d</ref>), the problem is that all content is conveyed conceptually (via text-to-speech) whereas the original visual graphic that the text description is based on conveys much of the content (the contour of the shape) perceptually: Perceptual relations are lost and replaced by conceptual relations, generating perceptual ambiguity. If the objective is to present Figure <ref type="figure" target="#fig_0">1a</ref> sonically, how can a designer decide which aspects should be conveyed via conceptual properties (text-to-speech) and which aspects should be conveyed via perceptual sonic properties (such as spatial sound)?</p><p>Recall the perceptual distinction, where perceptual properties are predicted to afford the communication of concrete structures more effectively compared with conceptual properties, and an aspect of a graphic can be identified as "more concrete" if it produces a perceptual structure that corresponds to what could be picked up and perceptually processed from a physical environment. 
In this account, the graphically represented shape contour (Figure <ref type="figure" target="#fig_0">1b</ref>) is primarily perceptual, and is therefore more appropriate for translation to sonic properties that can use spatial sound to convey geometric and topological relations among conceptually conveyed objects.</p><p>To determine which aspects of a graphic should be conveyed via text-to-speech, recall the conceptual distinction: text is predicted to afford the communication of abstract conceptual categories more effectively compared with perceptual properties, and a concept can be identified as more abstract if it is more amodal. In other words, it is less easily mapped back to a structure that could be picked up and perceptually processed from a physical environment. Under this account, the numbers that label increments on the x and y axes (Figure <ref type="figure" target="#fig_0">1a</ref>) are more conceptual because they cannot be mapped back to a perceptual structure that could be picked up from a physical environment.</p></div>
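The routing decision described in this section can be sketched as a toy rule, purely to illustrate the model: the component schema, names, and function below are hypothetical, not part of the essay or of WCAG. Aspects tagged as shape-like go to a perceptual (non-linguistic sonification) channel, while labels and numerals go to a conceptual (text-to-speech) channel.

```python
# Hypothetical chart schema: each component is tagged as shape-like
# (geometric/topological) or label-like (linguistic). Names are illustrative.
CHART = [
    {"name": "price contour", "kind": "shape"},
    {"name": "axis numerals", "kind": "label"},
    {"name": "company name", "kind": "label"},
]

def translation_plan(components):
    """Route perceptual aspects to non-linguistic sonification and
    conceptual aspects to text-to-speech, per the model's distinction."""
    plan = {"sonification": [], "text_to_speech": []}
    for c in components:
        channel = "sonification" if c["kind"] == "shape" else "text_to_speech"
        plan[channel].append(c["name"])
    return plan

plan = translation_plan(CHART)
```

In practice the hard design work lies in assigning the tags, which is exactly what the perceptual/conceptual distinction is meant to guide.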
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conclusion</head><p>This essay proposes a provisional model to underpin the various accounts of the graphic-linguistic distinction described in the literature as a means to extend the graphic-linguistic distinction into aural domains. The model makes the distinction in terms of lower-level perceptual capabilities that enable perceivers to perceptually process concrete structures (e.g., geometric and topological features) on the one hand, and higher-level capabilities that enable perceivers to process and interpret how those perceptually processed structures fall under more abstract conceptual categories on the other.</p><p>Due to these distinctions, the model predicts that perceptual relations (conveyed via graphics or non-linguistic sonification) afford the communication of concrete relations more effectively than conceptual relations (conveyed via text or text-to-speech). In addition, the model predicts that conceptual relations (conveyed via text or text-to-speech) afford the communication of abstract relations more effectively than perceptual relations (conveyed via graphics or non-linguistic sonification). This could be tested, for example, by observing whether perceivers can identify visual data sets more accurately using sonification or text descriptions.</p><p>Finally, the model streamlines accounts that distinguish diagrammatic from sentential structures to (1) characterize sentential structures as composed of conceptual relations among conceptual objects on the one hand, and (2) diagrammatic structures as perceptually represented relations among conceptual objects on the other. 
Under this account, (3) a sonic diagram is conceptualized as sonically conveyed relations among linguistically conveyed (via text-to-speech) objects.</p><p>This model is useful within a design context because designers lack clear models or guidelines for converting visual graphics into non-visual perceptual modes. This can be seen in the WCAG text description example, which ignores the pictorial properties of graphics.</p><p>By reverse engineering the classic graphic-linguistic distinction to more fundamental perceptual principles, this model provides a way to understand how the distinction applies to sonic representations. This approach can also be applied to haptic representations, but the focus of this paper is on sound, given its ubiquity in the consumer market.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 .</head><label>1</label><figDesc>Figure 1. The chart (a) is composed of visually perceived shape contours (b) and text labels (c). Accessibility practices translate b-c to text (d), with shapes described via text (e). 1</figDesc><graphic coords="1,318.05,158.24,242.94,214.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc>(left). A text (or text-to-speech) description might go as follows: "A is below B and both A and B are to the left of C." Another textual description might read: "B is between A and C and is above both A and C." Each text description conveys a different interpretation of what is shown visually and therefore affords different inferences. In contrast, a diagram can convey many other relationships because of how it conveys topological and geometric information through visual perception: Barwise and Etchemendy referred to this as a diagram's ability to present "countless facts."</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4.</head><label>4</label><figDesc>Figure 4. The left vertical line (b) refers to the limited range of perceptual structures conveyed via a given graphic. The right line (c) refers to the wider range of possible conceptual categories that the perceptual structures could fall under. The model predicts that when perceptual specificity is high (b), conceptual specificity is low (c).</figDesc><graphic coords="5,54.00,95.00,243.00,82.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1.</head><label>1</label><figDesc>Diagrams are composed of perceptually processed relations among linguistically conveyed conceptual objects; sentences are composed of linguistically conveyed conceptual relations among linguistically conveyed conceptual objects (adapted from<ref type="bibr" target="#b6">Coppin, 2014)</ref>.</figDesc><table><row><cell></cell><cell>Diagrammatic</cell><cell>Sentential</cell></row><row><cell>Relations</cell><cell>Perceptual</cell><cell>Conceptual</cell></row><row><cell>Objects or Items</cell><cell>Conceptual</cell><cell>Conceptual</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Adapted from "Web Accessibility Best Practices: Graphs" by Campus Information Technologies and Educational Services (CITES) and Disability Resources and Educational Services (DRES), University of Illinois at Urbana/Champaign. Copyright 2005 by University of Illinois at Urbana/Champaign.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>This research was supported in part by grants from the Centre for Innovation in Data-Driven Design and the Graphics Animation and New Media Centre for Excellence. I would like to thank Research Assistant Ambrose Li for his assistance in the preparation of this essay and Dr. David Steinman for the many fruitful conversations that helped inform the ideas explored in the work described here.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Perceptual symbol systems</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">W</forename><surname>Barsalou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Behavioral &amp; Brain Sciences</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="577" to="660" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Abstraction in perceptual symbol systems</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">W</forename><surname>Barsalou</surname></persName>
		</author>
		<idno type="DOI">10.1098/rstb.2003.1319</idno>
	</analytic>
	<monogr>
		<title level="j">Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences</title>
		<imprint>
			<biblScope unit="volume">358</biblScope>
			<biblScope unit="page" from="1177" to="1187" />
			<biblScope unit="issue">1435</biblScope>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Abstraction as dynamic interpretation in perceptual symbol systems</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">W</forename><surname>Barsalou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Carnegie Symposium Series: Building object categories</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Gershkoff-Stowe</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Rakison</surname></persName>
		</editor>
		<meeting><address><addrLine>Mahwah, NJ</addrLine></address></meeting>
		<imprint>
			<publisher>Erlbaum</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="389" to="431" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Simulation, situated conceptualization, and prediction</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">W</forename><surname>Barsalou</surname></persName>
		</author>
		<idno type="DOI">10.1098/rstb.2008.0319</idno>
	</analytic>
	<monogr>
		<title level="j">Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences</title>
		<imprint>
			<biblScope unit="volume">364</biblScope>
			<biblScope unit="page" from="1281" to="1289" />
			<biblScope unit="issue">1521</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Visual information and valid reasoning</title>
		<author>
			<persName><forename type="first">J</forename><surname>Barwise</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Etchemendy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Visualization in mathematics</title>
				<editor>
			<persName><forename type="first">W</forename><surname>Zimmerman</surname></persName>
		</editor>
		<meeting><address><addrLine>Washington, DC</addrLine></address></meeting>
		<imprint>
			<publisher>Mathematical Association of America</publisher>
			<date type="published" when="1990">1990</date>
			<biblScope unit="page" from="8" to="23" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Design guidelines for audio presentation of graphs and tables</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">M</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Brewster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Ramloll</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Burton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Riedel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Auditory Display</title>
				<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">W</forename><surname>Coppin</surname></persName>
		</author>
		<ptr target="https://tspace.library.utoronto.ca/handle/1807/44108" />
		<title level="m">Perceptual-cognitive properties of pictures, diagrams, and sentences: Toward a science of visual information design</title>
				<meeting><address><addrLine>Toronto, Canada</addrLine></address></meeting>
		<imprint>
			<publisher>University of Toronto</publisher>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">Doctoral dissertation</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The brain binds entities and events by multiregional activation from convergence zones</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Damasio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural Computation</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="123" to="132" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">The ecological approach to visual perception</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Gibson</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1986">1986</date>
			<publisher>Lawrence Erlbaum</publisher>
			<pubPlace>Hillsdale, NJ</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Goodman</surname></persName>
		</author>
		<title level="m">Languages of art: An approach to a theory of symbols</title>
				<meeting><address><addrLine>Indianapolis, IN</addrLine></address></meeting>
		<imprint>
			<publisher>Bobbs-Merrill Company</publisher>
			<date type="published" when="1968">1968</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Auditory display in assistive technology</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D N</forename><surname>Edwards</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Sonification Handbook</title>
				<editor>
			<persName><forename type="first">T</forename><surname>Hermann</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Hunt</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin</addrLine></address></meeting>
		<imprint>
			<publisher>Logos Verlag</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="431" to="453" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Neural foundations of imagery</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Kosslyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ganis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">L</forename><surname>Thompson</surname></persName>
		</author>
		<idno type="DOI">10.1038/35090055</idno>
	</analytic>
	<monogr>
		<title level="j">Nature Reviews Neuroscience</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">9</biblScope>
			<biblScope unit="page" from="635" to="642" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Why a diagram is (sometimes) worth ten thousand words</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Larkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">A</forename><surname>Simon</surname></persName>
		</author>
		<idno type="DOI">10.1111/j.1551-6708.1987.tb00863.x</idno>
	</analytic>
	<monogr>
		<title level="j">Cognitive Science</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="65" to="99" />
			<date type="published" when="1987">1987</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Categorization, development of</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Mandler</surname></persName>
		</author>
		<idno type="DOI">10.1002/0470018860.s00516</idno>
	</analytic>
	<monogr>
		<title level="j">Encyclopedia of Cognitive Science</title>
		<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Sonification of spatial data</title>
		<author>
			<persName><forename type="first">T</forename><surname>Nasir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Roberts</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">13th International Conference on Auditory Display (ICAD 2007)</title>
				<imprint>
			<publisher>ICAD</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="112" to="119" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Fundamental aspects of cognitive representation</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Palmer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Cognition and Categorization</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Rosch</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Lloyd</surname></persName>
		</editor>
		<meeting><address><addrLine>Hillsdale, NJ</addrLine></address></meeting>
		<imprint>
			<publisher>Lawrence Erlbaum Associates, Publishers</publisher>
			<date type="published" when="1978">1978</date>
			<biblScope unit="page" from="259" to="303" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">What the mind&apos;s eye tells the mind&apos;s brain: A critique of mental imagery</title>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">W</forename><surname>Pylyshyn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Psychological Bulletin</title>
		<imprint>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">1</biblScope>
			<date type="published" when="1973">1973</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Vagueness</title>
		<author>
			<persName><forename type="first">B</forename><surname>Russell</surname></persName>
		</author>
		<idno type="DOI">10.1080/00048402308540623</idno>
	</analytic>
	<monogr>
		<title level="j">Australasian Journal of Psychology and Philosophy</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="84" to="92" />
			<date type="published" when="1923">1923</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">The graphic-linguistic distinction: Exploring alternatives</title>
		<author>
			<persName><forename type="first">A</forename><surname>Shimojima</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence Review</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="313" to="335" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">The similarity-in-topography principle: Reconciling theories of conceptual deficits</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">K</forename><surname>Simmons</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">W</forename><surname>Barsalou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cognitive Neuropsychology</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="451" to="486" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Crossmodal correspondences: A tutorial review</title>
		<author>
			<persName><forename type="first">C</forename><surname>Spence</surname></persName>
		</author>
		<idno type="DOI">10.3758/s13414-010-0073-7</idno>
	</analytic>
	<monogr>
		<title level="j">Attention, Perception, &amp; Psychophysics</title>
		<imprint>
			<biblScope unit="volume">73</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="971" to="995" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Data sonification for users with visual impairment: a case study with georeferenced data</title>
		<author>
			<persName><forename type="first">H</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Plaisant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Shneiderman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lazar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Computer-Human Interaction (TOCHI)</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">4</biblScope>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
