<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Spatial Relations in Text-to-Scene Conversion</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Bob</forename><surname>Coyne</surname></persName>
							<email>coyne@cs.columbia.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Columbia University</orgName>
								<address>
									<settlement>New York NY</settlement>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Richard</forename><surname>Sproat</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Oregon Health &amp; Science University</orgName>
								<address>
									<settlement>Beaverton</settlement>
									<region>Oregon</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Julia</forename><surname>Hirschberg</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Columbia University</orgName>
								<address>
									<settlement>New York NY</settlement>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Spatial Relations in Text-to-Scene Conversion</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">FE68AA6BC2226DD3591D69E484DD9FCD</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T07:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Spatial relations play an important role in our understanding of language. In particular, they are a crucial component in descriptions of scenes in the world. WordsEye (www.wordseye.com) is a system for automatically converting natural language text into 3D scenes representing the meaning of that text. Natural language offers an interface to scene generation that is intuitive and immediately approachable by anyone, without any special skill or training. WordsEye has been used by several thousand users on the web to create approximately 15,000 fully rendered scenes. We describe how the system incorporates geometric and semantic knowledge about objects and their parts and the spatial relations that hold among these in order to depict spatial relations in 3D scenes.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Spatial relations are expressed either directly or implicitly in a wide range of natural language descriptions. To represent these descriptions in a 3D scene, one needs both linguistic and real-world knowledge, in particular knowledge about: the spatial and functional properties of objects; prepositions and the spatial relations they convey, which is often ambiguous; verbs and how they resolve to poses and other spatial relations. For example, to interpret apple in the bowl we use our knowledge of bowls -that they have interiors that can contain objects. With different objects (e.g., boat in water), a different spatial relation is conveyed.</p><p>WordsEye <ref type="bibr" target="#b5">[6]</ref> is a system for automatically converting natural language text into 3D scenes representing the meaning of that text. A version of WordsEye has been tested online (www.wordseye.com) with several thousand real-world users. We have also performed preliminary testing of the system in schools, as a way to help students exercise their language skills. Students found the software fun to use, an important element in motivating learning. As one teacher reported, "One kid who never likes anything we do had a great time yesterday...was laughing out loud."</p><p>WordsEye currently focuses on directly expressed spatial relations and other graphically realizable properties. As a result, users must describe scenes in somewhat stilted language. See Figure <ref type="figure">1</ref>. Our current research focuses on improving the system's ability to infer these relations automatically. However, in this paper, we describe the basic techniques used by WordsEye to interpret and depict directly expressed spatial relations.</p><p>In Section 2 we describe previous systems that convert natural language text to 3D scenes and prior linguistic work on spatial relations. In Section 3 we provide an overview of WordsEye. In Section 4 we discuss the spatial, semantic and functional knowledge about objects used to depict spatial relations in our system. We conclude and describe other ongoing and future work in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Prior Work</head><p>Natural language input has been investigated in some early 3D graphics systems <ref type="bibr" target="#b0">[1]</ref>[13] including the Put system <ref type="bibr" target="#b3">[4]</ref>, which was limited to spatial arrangements of existing objects in a pre-constructed environment. In this system, input was restricted to an artificial subset of English consisting of expressions of the form P ut(X, P, Y ), where X and Y are objects and P is a rigidly defined spatial preposition. Work at the University of Pennsylvania's Center of Human Modeling and Simulation <ref type="bibr" target="#b1">[2]</ref>, used language to control animated characters in a closed virtual environment. CarSim <ref type="bibr" target="#b6">[7]</ref> is a domain-specific system that creates animations from natural language descriptions of accident reports. CONFUCIUS <ref type="bibr" target="#b11">[12]</ref> is a multi-modal text-to-animation system that generates animations of virtual humans from single sentences containing an action verb. In these systems the referenced objects, attributes, and actions are typically relatively small in number or targeted to specific pre-existing domains.</p><p>Spatial relations have been studied in linguistics for many years. One reasonably thorough study for English is Herskovits <ref type="bibr" target="#b8">[9]</ref>, who catalogs fine-grained distinctions in the interpretations of various prepositions. 3 For example, she distinguishes among the various uses of on to mean "on the top of a horizontal surface" (the cup is on the table), or "affixed to a vertical surface" (the picture is on the wall). Herskovits notes that the interpretation of spatial expressions may involve considerable inference. For example, the sentence the gas station is at the freeway clearly implies more than just that the gas station is located next to the freeway; the gas station must be located on a road that passes over or under the freeway, the implication being that, if one proceeds from a given point along that road, one will reach the freeway, and also find the gas station. 3 It is important to realize that how spatial relations are expressed, and what kinds of relations may be expressed varies substantially across languages. Levinson and colleagues <ref type="bibr" target="#b10">[11]</ref> have catalogued profound differences in the ways different languages encode relations between objects in the world. In particular, the Australian language Guugu Yimithirr and the Mayan language Tzeltal use absolute frames of reference to refer to the relative positions of objects. In Guugu Yimithirr, one can locate a chair relative to a table only in terms of cardinal points saying, for example, that the chair is north of the table. In English such expressions are reserved for geographical contexts -Seattle is north of Portland -and are never used for relations at what Levinson terms the "domestic scale". In Guugu Yimithirr one has no choice, and there are no direct translations for English expressions such as the chair is in front of the table.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Eye of the Beholder by Bob Coyne</head><p>No Dying Allowed by Richard Sproat</p><p>Input text: The silver penny is on the moss ground. The penny is 7 feet tall. A clown is 2 feet in front of the penny. The clown is facing the penny.</p><p>Input text: Eight big white washing machines are in front of the big cream wall. The wall is 100 feet long. The "No Dying Allowed" whiteboard is on the wall. The whiteboard is one foot high and five feet long. The ground is tile. Death is in front of the washing machines. It is facing southeast. Death is eight feet tall.</p><p>Fig. <ref type="figure">1</ref>: Some Examples from WordsEye's Online Gallery</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">System Overview</head><p>Our current system is an updated version of the original WordsEye system <ref type="bibr" target="#b5">[6]</ref>, which was the first system to use a large library of 3D objects to depict scenes in a free-form manner using natural language. The current system contains 2,200 3D objects and 10,000 images and a lexicon of approximately 15,000 nouns. It supports language-based control of objects, spatial relations, and surface properties (e.g., textures and colors); and it handles simple coreference resolution, allowing for a variety of ways of referring to objects. The original WordsEye system handled 200 verbs in an ad hoc manner with no systematic semantic modeling of verb alternations and argument combinations. In the current system, we are instead adding frame semantics to support verbs more robustly. To do this, we are utilizing our own lexical knowledge-base, called the SBLR (Scenario-Based Lexical Resource) <ref type="bibr" target="#b4">[5]</ref>. The SBLR consists of an ontology and lexical semantic information extracted from WordNet <ref type="bibr" target="#b7">[8]</ref> and FrameNet <ref type="bibr" target="#b2">[3]</ref> which we are augmenting to include the finer-grained relations and properties on entities needed for depicting scenes as well as capturing the different senses of prepositions related to those properties and relations.</p><p>The system works by first parsing each input sentence into a dependency structure. These dependency structures are then processed to resolve anaphora and other coreferences. The lexical items and dependency links are then converted to semantic nodes and roles drawing on lexical valence patterns and other information in the SBLR. The resulting semantic relations are then converted to a final set of graphical constraints representing the position, orientation, size, color, texture, and poses of objects in the scene. Finally, the scene is composed from these constraints and rendered in OpenGL (http://www.opengl.org) and optionally ray-traced in Radiance <ref type="bibr" target="#b9">[10]</ref>. The user can then provide a title and caption and save the scene in our online gallery where others can comment and create their own pictures in response. See Figure <ref type="figure">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Spatial Relations</head><p>WordsEye uses spatial tags and other spatial and functional properties on objects to resolve the meaning of spatial relations. We focus here on the interpretation of NPs containing spatial prepositions of the form "X-preposition-Y", where we will refer to X as the figure and Y as the ground. For example, in snow is on the roof, snow is the figure and roof is ground. The interpretation of the spatial relation often depends upon the types of the arguments to the preposition. There can be more than one interpretation of a spatial relation for a given preposition The geometric and semantic information associated with those objects will, however, help narrow down the possibilities. The 3D objects in our system are augmented with the following features:</p><p>-Is-a: The lexical category to which the given object belongs.</p><p>-Spatial tags identifying the following regions: (See Figure <ref type="figure" target="#fig_0">2</ref>)</p><p>• Canopy: A canopy-like area "under" an object (e.g., under a tree).</p><p>• Cup: A hollow area, open above, that forms the interior of an object.</p><p>• Enclosure: An interior region, bounded on all sides (holes allowed).</p><p>• Top/side/bottom/front/back: For both inner and outer surfaces.</p><p>• Named-part: For example, the hood on car.</p><p>• Stem: A long thin, vertical base.</p><p>• Opening: An opening to an object's interior (e.g., doorway to a room).</p><p>• Hole-through: A hole through an object. For example, a ring or donut.</p><p>• Touch-point: Handles and other functional parts on the object. For example, in John opened the door, the doorknob would be marked as a handle, allowing the hand to grasp at that location. • Base: The region of an object where it supports itself.</p><p>-Overall shape: A dominant overall shape used in resolving various spatial relations. For example, sheet, block, ribbon, cup, tube, disk, rod. -Forward/Upright direction: The object's default orientation.</p><p>-Size: The default real-world size of the object. This is also used in spatial relations where the figure and ground size must be compatible. For example, ring on a stick versus *life-preserver on a pencil. -Length axis: The axis for lengthening an object.</p><p>-Segmented/stretchable: Some objects don't change size in all dimensions proportionally. For example, a fence can be extended indefinitely in length without a corresponding change in height. -Embeddable: Some objects, in their normal function, are embedded in others. For example, fireplaces are embedded in walls, and boats in water. -Wall-item and Ceiling-item: Some objects are commonly attached to walls or ceilings or other non-upward surfaces. Some (e.g., pictures) do this by virtue of their overall shape, while for others (e.g., sconces) the orientation of the object's base is used to properly position the object. -Flexible: Flexible objects such as cloth and paper allow an object to hang or wrap. For example, towel over a chair. -Surface element: Any object that can be part of a flat surface or layer.</p><p>For example, a crack, smudge, decal, or texture. -Semantic properties such as Path, Seat, Airborne for object function. Some of these features were used in earlier versions of our system <ref type="bibr" target="#b5">[6]</ref>. Features we have added to the current version include: surface element, embeddable, overall shape, length axis, segmented/stretchable. 
Other features (flexible, opening, hole-through, and various semantic features) are still in development. The implemented tagset supports the generation of scenes such as Figure <ref type="figure" target="#fig_1">3</ref>.</p><p>In order to resolve a spatial relation, we find the spatial tags and other features of the figure and ground objects that are applicable for the given preposition. For example, if the preposition is under, a canopy region for the ground object is relevant, but not a top surface. Various other factors, such as size, must also be considered. With enclosed-in, the figure must fully fit in the ground. For embedded-in, only part of it need fit. For other relations (e.g., next-to), the objects can be any size, but the figure location might vary. For example, The mosquito is next to the horse and The dog is next to the horse position the figure in different places, either in the air or on the ground, depending on whether the given object is commonly airborne or not. We also note that the figure is normally the smaller object, while the ground functions as a landmark. So it's normal to say The flower bed is next to the house, but unnatural to say *The house is next to the flower bed. This is discussed in <ref type="bibr" target="#b8">[9]</ref>. See Table <ref type="table" target="#tab_0">1</ref> for some mappings we make from prepositions to spatial relations. In order to use the object features described above to resolve the spatial meaning of prepositions, linguistically referenced subregions must also be considered. Spatial relations often express regions relative to an object (e.g., left side of in The chair is on the left side of the room). The same subregion designation can yield different interpretations, depending on the features of the objects:</p><p>external-vertical-surface: shutters on the left side of the house</p><p>interior-vertical-surface: picture on the left side of the room</p><p>region-of-horiz-surface: vase on the left side of the room</p><p>neighboring-area: car on the left side of the house</p><p>These regions (when present) are combined with the other constraints on spatial relations to form the final interpretation.</p><p>Input text: A large magenta flower is in a small vase. The vase is under an umbrella. The umbrella is on the right side of a table. A picture of a woman is on the left side of a 16 foot long wall. A brick texture is on the wall. The wall is 2 feet behind the table. A small brown horse is in the ground. It is a foot to the left of the table. A red chicken is in a birdcage. The cage is to the right of the table. A huge apple is on the wall. It is to the left of the picture. A large rug is under the table. A small blue chicken is in a large flower cereal bowl. A pink mouse is on a small chair. The chair is 5 inches to the left of the bowl. The bowl is in front of the table. The red chicken is facing the blue chicken.</p></div>
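<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate how per-object features of this kind might be represented, here is a minimal sketch (our own rendering; the record type, field names, and feature values are invented, not WordsEye's internal schema):</p><eg><![CDATA[
from dataclasses import dataclass, field

# Hypothetical record bundling the per-object features listed above.
@dataclass
class SceneObject:
    name: str
    is_a: str                     # lexical category
    spatial_tags: set = field(default_factory=set)  # e.g. {"cup", "top-surface"}
    overall_shape: str = "block"  # sheet, block, ribbon, cup, tube, disk, rod
    size_ft: float = 1.0          # default real-world size
    embeddable: bool = False      # boats, fireplaces
    flexible: bool = False        # cloth, paper
    airborne: bool = False        # mosquito vs. dog in next-to placement

bowl = SceneObject("bowl", "container", spatial_tags={"cup"},
                   overall_shape="cup", size_ft=0.5)
boat = SceneObject("boat", "vehicle", size_ft=20.0, embeddable=True)
mosquito = SceneObject("mosquito", "insect", size_ft=0.02, airborne=True)
]]></eg><p>Under this sketch, a next-to rule could consult the figure's airborne flag to choose between placement in the air and on the ground, and a size-compatibility check could compare size_ft values before accepting a reading like ring-on-pole.</p></div>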
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusions and Ongoing and Future Work</head><p>In order to represent spatial relations more robustly, much remains to be done at the language, graphical, and application levels.</p><p>We are augmenting the system to resolve verbs to semantic frames using information in our SBLR, and mapping those in turn to corresponding poses and spatial relations <ref type="bibr" target="#b4">[5]</ref>. Figure <ref type="figure" target="#fig_3">4</ref> illustrates this process, which currently is supported for a limited set of verbs and their arguments. This enhanced capability also requires contextual information about actions and locations that we are acquiring using human annotations obtained via Amazon's Mechanical Turk and by extracting information from corpora using automatic methods <ref type="bibr" target="#b13">[14]</ref>. We will be evaluating our software in partnership with a non-profit after-school program in New York City.  </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 2 :</head><label>2</label><figDesc>Fig.2: Spatial Tags, represented here by the boxes associated with each object, designate regions of those objects used in resolving spatial relations. For example, the top surface region marked on the seat of the chair is used in sentences like The pink mouse is on the small chair to position the figure (mouse) on the ground (chair). See Figure3for the depiction of this sentence and several others that illustrate the effect of spatial tags and other object features.</figDesc><graphic coords="4,137.72,407.49,74.10,59.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 3 :</head><label>3</label><figDesc>Fig.3: Spatial relations and features: enclosed-in (chicken in cage); embedded-in (horse in ground); in-cup (chicken in bowl); on-top-surface (apple on wall); on-vertical-surface (picture on wall); pattern-on (brick texture on wall); under-canopy (vase under umbrella); under-base (rug under table); stem-in-cup (flower in vase); laterally-related (wall behind table); length-axis (wall); default size/orientation (all objects); region (right side of); distance (2 feet behind); size (small and 16 foot long); orientation (facing).</figDesc><graphic coords="7,284.39,160.57,196.40,142.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc>The truck chased the man down the road... The man ran across the sidewalk...</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 4 :</head><label>4</label><figDesc>Fig.4: Spatial relations derived from verbs. The verbs are mapped to semantic frames which in turn are mapped to vignettes (representing basic contextual situations) given a set of semantic role and values. These, in turn, are mapped to spatial relations. In the first example, the pursued (soldier) is in a running pose, located on the path (road), and in front of the pursuer (truck).</figDesc><graphic coords="8,136.16,115.84,137.48,99.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Spatial relations for in and on (approximately half are currently implemented). Similar mappings exist for other prepositions such as under, along. Handcrafted rules resolve the spatial relation given the object features.</figDesc><table><row><cell>Spatial Relation</cell><cell>Example</cell><cell>Partial Conditions</cell></row><row><cell>on-top-surface</cell><cell>vase on table</cell><cell>Ground is upward-surface</cell></row><row><cell>on-vertical-surface</cell><cell>postcard on fridge</cell><cell>Ground is vertical-surface</cell></row><row><cell>on-downward-surface</cell><cell>fan on ceiling</cell><cell>Ground is downward-surface</cell></row><row><cell>on-outward-surface</cell><cell>pimple on nose</cell><cell>Ground is surface</cell></row><row><cell>pattern/coating-on</cell><cell cols="2">plaid pattern on shirt Figure is texture or layer</cell></row><row><cell>fit-on-custom</cell><cell>train on track</cell><cell>special base pairing</cell></row><row><cell>ring-on-pole</cell><cell>bracelet on wrist</cell><cell>Figure=ring-shape,</cell></row><row><cell></cell><cell></cell><cell>Ground=pole-shape</cell></row><row><cell>on-vehicle</cell><cell>man on bus</cell><cell>Ground=public-transportation</cell></row><row><cell>on-region</cell><cell cols="2">on the left side of... ground=region-designator</cell></row><row><cell>hang-on</cell><cell>towel on rod</cell><cell>figure is hangable</cell></row><row><cell>embedded-in</cell><cell>pole in ground</cell><cell>Ground is mass</cell></row><row><cell>embedded-in</cell><cell>boat in water</cell><cell>Figure is embeddable</cell></row><row><cell>buried-in</cell><cell>treasure in ground</cell><cell>Ground is terrain</cell></row><row><cell>enclosed-in-volume</cell><cell>bird in cage</cell><cell>Ground has enclosure</cell></row><row><cell>enclosed-in-area</cell><cell>tree in yard</cell><cell>Ground is area</cell></row><row><cell>in-2D-representation</cell><cell>man in the photo</cell><cell>Ground is 2D representation</cell></row><row><cell>in-cup</cell><cell>cherries in bowl</cell><cell>ground has cup</cell></row><row><cell>in-horiz-opening</cell><cell>in doorway</cell><cell>ground has opening</cell></row><row><cell>stem-in-cup</cell><cell>flower in vase</cell><cell>figure has stem, ground has cup</cell></row><row><cell>wrapped-in</cell><cell>chicken in the foil</cell><cell>ground is flexible/sheet</cell></row><row><cell cols="2">member-of-arrangement plate in stack</cell><cell>ground is arrangement</cell></row><row><cell>in-mixture</cell><cell>dust in air</cell><cell>Figure/Ground=substance</cell></row><row><cell>in-entanglement</cell><cell>bird in tree</cell><cell>Ground has entanglement</cell></row><row><cell>fitted-in</cell><cell>hand in glove</cell><cell>Figure/Ground=fit</cell></row><row><cell>in-grip</cell><cell>pencil in hand</cell><cell>Ground=gripper</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc>table. A picture of a woman is on the left side of a 16 foot long wall. A brick texture is on the wall. The wall is 2 feet behind the table. A small brown horse is in the ground. It is a foot to the left of the table. A red chicken is in a birdcage. The cage is to the right of the table. A huge apple is on the wall. It is to the left of the picture. A large rug is under the table. A small blue chicken is in a large flower cereal bowl. A pink mouse is on a small chair. The chair is 5 inches to the left of the bowl. The bowl is in front of the table. The red chicken is facing the blue chicken. . .</figDesc><table /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work was supported in part by the NSF IIS-0904361. Any opinions, findings and conclusions or recommendations expressed in this material are the authors' and do not necessarily reflect those of the sponsors.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Natural language driven image generation</title>
		<author>
			<persName><forename type="first">G</forename><surname>Adorni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Di Manzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Giunchiglia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">COLING</title>
		<imprint>
			<biblScope unit="page" from="495" to="500" />
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A parameterized action representation for virtual human agents</title>
		<author>
			<persName><forename type="first">N</forename><surname>Badler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bindiganavale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bourne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Palmer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Schule</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on Embodied Conversational Characters</title>
				<meeting><address><addrLine>Lake Tahoe</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Baker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Fillmore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lowe</surname></persName>
		</author>
		<title level="m">The Berkeley FrameNet Project</title>
				<imprint>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
	<note>COLING-ACL</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Put: Language-based interactive manipulation of objects</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Clay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wilhelms</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Computer Graphics and Applications</title>
		<imprint>
			<biblScope unit="page" from="31" to="39" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Frame semantics in text-toscene generation</title>
		<author>
			<persName><forename type="first">B</forename><surname>Coyne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Rambow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hirschberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sproat</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">14th International Conference on Knowledge-Based and Intelligent Information &amp; Engineering Systems</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">WordsEye: An automatic text-to-scene conversion system</title>
		<author>
			<persName><forename type="first">B</forename><surname>Coyne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sproat</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIGGRAPH, Computer Graphics Proceedings</title>
		<imprint>
			<biblScope unit="page" from="487" to="496" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Generating a 3d simulation of a car accident from a written description in natural language: The carsim system</title>
		<author>
			<persName><forename type="first">S</forename><surname>Dupuy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Egges</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Legendre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nugues</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ACL Workshop on Temporal and Spatial Information Processing</title>
				<meeting>ACL Workshop on Temporal and Spatial Information Processing</meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">WordNet: an electronic lexical database</title>
		<author>
			<persName><forename type="first">C</forename><surname>Fellbaum</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1998">1998</date>
			<publisher>MIT Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Language and Spatial Cognition: an Interdisciplinary Study of the Prepositions in English</title>
		<author>
			<persName><forename type="first">A</forename><surname>Herskovits</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1986">1986</date>
			<publisher>Cambridge University Press</publisher>
			<pubPlace>Cambridge, England</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Rendering with Radiance</title>
		<author>
			<persName><forename type="first">G</forename><surname>Larson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Shakespeare</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Morgan Kaufmann Series in Computer Graphics</title>
				<imprint>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Levinson</surname></persName>
		</author>
		<title level="m">Space in Language and Cognition: Explorations in Cognitive Diversity</title>
				<meeting><address><addrLine>Cambridge</addrLine></address></meeting>
		<imprint>
			<publisher>Cambridge University Press</publisher>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Automatic Conversion of Natural Language to 3D Animation</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ma</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
		<respStmt>
			<orgName>University of Ulster</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">The clowns microworld</title>
		<author>
			<persName><forename type="first">R</forename><surname>Simmons</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of TINLAP</title>
				<meeting>TINLAP</meeting>
		<imprint>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="17" to="19" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Inferring the environment in a text-to-scene conversion system</title>
		<author>
			<persName><forename type="first">R</forename><surname>Sproat</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">First International Conference on Knowledge Capture</title>
				<meeting><address><addrLine>Victoria, BC</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
