<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">English-Russian WordNet for Multilingual Mappings</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Sergey</forename><surname>Yablonsky</surname></persName>
							<email>serge_yablonsky@hotmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">St. Petersburg State University</orgName>
								<address>
									<addrLine>Volkhovsky Per. 3, St. Petersburg</addrLine>
									<postCode>199004</postCode>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">English-Russian WordNet for Multilingual Mappings</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">7C7A6F58811591074F36406AA053B65B</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T07:32+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>WordNet</term>
					<term>English-Russian WordNet</term>
					<term>Grid</term>
					<term>Semantic Web</term>
					<term>RDF</term>
					<term>OWL</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper reports on the current results of the development of the English-Russian WordNet. It describes the use of English-Russian lexical language resources and software to process the English-Russian WordNet, and the design of an XML/RDF/OWL markup of the English-Russian WordNet resources. Relevant aspects of the DTD/XML/RDF/OWL formats and related technologies are surveyed.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The Semantic Web, a Web with meaning, is often associated with specific XML-based standards for semantics, such as RDF and OWL [http://www.w3.org/RDF/, http://www.w3.org/TR/owl-features/]. If HTML and the Web made all online documents look like one huge book, then RDF, schema, and inference languages will make all the data in the world look like one huge database. One of the key promises of the Semantic Web is that it will provide the infrastructure that lets services and applications on the Web automatically aggregate and integrate information into a whole that is greater than the sum of its parts. The Semantic Web should therefore enable users to locate, select, employ, compose, and monitor Web-based services automatically. To make use of a Web service, a software agent needs a computer-interpretable description of the service and of the means by which it is accessed. An important goal for Semantic Web markup languages is to establish a framework within which these descriptions are made and shared. Web sites should be able to employ a standard ontology, consisting of a set of basic classes and properties, for declaring and describing services, while the cross-lingual ontology structuring mechanisms of OWL provide an appropriate, Web-compatible representation language framework within which to do this.</p><p>Today such a Web-compatible representation language framework is usually based on lexical ontologies. Wordnets are cross-lingual lexical ontologies that include information on hypernyms, synonyms, polysemous terms, relations between terms, and sometimes multilingual equivalents. They are valuable sources of ontological distinctions and provide a conceptual framework for multilingual mappings in ontologies. The linking of concepts across the many cross-lingual lexicons of the WordNet family started with the Interlingual Index (ILI) <ref type="bibr" target="#b2">[2]</ref>. Unfortunately, no version of the ILI can be considered a standard, and the various lexicons often use different versions of WordNet as the ILI.</p><p>At the 3rd GWA Conference in Korea, the idea was launched to start building a WordNet Grid around a set of Common Base Concepts expressed in terms of WordNet synsets and SUMO definitions (http://www.globalwordnet.org/gwa/gwa_grid.htm). The first version of the Grid was planned to be built around a set of 4689 Common Base Concepts. Since then, only three languages, with widely differing numbers of synsets and different WordNet versions, have been placed in the Grid mappings (English: 4689 synsets with WN 2.0 mapping; Spanish: 15556 synsets with WN 1.6 mapping; Catalan: 12942 synsets with WN 1.6 mapping). There is as yet no official format for the Global WordNet Grid, and so far only 3 files exist in the specified format.</p><p>This paper reports on the current results of the English-Russian WordNet development <ref type="bibr" target="#b2">[2,</ref><ref type="bibr" target="#b3">3,</ref><ref type="bibr" target="#b4">4]</ref>. It describes the use of Russian and English-Russian lexical language resources and software to process the English-Russian WordNet and the English-Russian WordNet Grid (4600 synsets with WN 3.0 mapping), and the design of an XML/RDF/OWL markup of the English-Russian WordNet resources. Relevant aspects of the DTD/XML/RDF/OWL formats and related technologies are surveyed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Lexical Resources</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Lexical Resources for English-Russian WordNet</head><p>In December 2003 our research group obtained a license from OUP to explore and use the following language resources for research purposes: the Oxford Russian Dictionary; the New Oxford Dictionary of English, 2nd Edition; and the New Oxford Thesaurus of English.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Language Software</head><p>For many linguistic tasks of WordNet development we use the Russicon language processor, which includes the following main blocks:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>System for construction and support of machine dictionaries</head><p>The system returns morphological information for a word, builds its normal form, shows its paradigm, and constructs a lexicon of new words and a frequency lexicon (Fig. <ref type="figure" target="#fig_0">1</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Morphological analyzer and normalizer</head><p>The theoretical foundation of the morphological analyzer and normalizer is a language-independent model of morphological analysis <ref type="bibr" target="#b6">[6]</ref><ref type="bibr" target="#b7">[7]</ref><ref type="bibr" target="#b8">[8]</ref>. The program allows a) determining the following grammatical characteristics of a word: part of speech, case, gender, number, tense, person, degree of comparison, voice, aspect, mood, form, type, transitivity, reflexivity, and animacy; and b) reducing a given word to its normal grammatical form(s), i.e. its lemma(s).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>WordNet Editor</head><p>The WordNet editor TenDrow was developed to help with the joint production of the Russian WordNet from the linguistic resources mentioned above. It makes it possible to join synsets from the thesaurus, explanatory, and other dictionaries, and to process relations between synsets and between the words of synsets. The editor is not only a viewer but also a working tool for constructing and editing multilingual and monolingual WordNets. It is a database management system in which users (linguists or knowledge engineers) can create, edit, and view the English and Russian WordNets (Fig. <ref type="figure" target="#fig_2">2</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">English-Russian WordNet translation</head><p>There are usually several standard variants (Fig. <ref type="figure" target="#fig_3">3, a,b,c,d</ref>) of translation equivalents in the English-Russian WordNet and the English-Russian WordNet Grid.</p><p>The simplest is variant a; approximately 24000 English-Russian synsets could be translated in this way. The hardest is variant d, because this kind of translation destroys the normal mapping and forms additional sub-mappings. More than 15000 English synsets have no direct word-to-word translation into Russian.</p><p>Synset: {sell} - be sold at a certain price or in a certain way</p><p>Example: These books sell like hot cakes.</p><p>Russian translation: {prodavat'sya; rasprodavat'sya; sbyvat'sya}</p><p>Hyponymy problems: sometimes no translation of an English synset member exists in Russian, or there are loops in the relations:</p><p>Synset: {cutter, cutlery, cutting tool} - a cutting implement; a tool for cutting.</p><p>Hypernym: ENG20-03040079-n</p><p>Synset: {cutting implement} - a tool used for cutting or slicing.</p><p>Russian translation: {kolyusche-rejuschie orudiya}</p><p>Hypernym: ENG20-04279652-n</p><p>Synset: {edge tool} - any cutting tool with a sharp cutting edge (as a chisel or knife or plane or gouge).</p><p>Russian translation: {rejuschij instrument}</p><p>Hypernym: ENG20-03039706-n</p></div>
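<div xmlns="http://www.tei-c.org/ns/1.0"><p>Loops in hypernym relations of the kind mentioned above can be found mechanically by following the links from a synset and checking whether an identifier repeats. The following is a minimal sketch under stated assumptions, not the procedure used in the project: it assumes one hypernym per synset, and the function name and all links in the sample data are hypothetical. Only the identifier pattern (ENG20-offset-pos) follows the examples in the text.</p>
<p><![CDATA[
```python
def find_hypernym_loop(start, hypernym_of):
    """Follow hypernym links from `start`.

    Returns the list of synset IDs that form a loop, or None if the
    chain ends at a synset without a hypernym.
    Assumption: each synset has at most one hypernym.
    """
    path = []       # IDs in the order they were visited
    position = {}   # ID -> index of that ID in `path`
    current = start
    while current is not None:
        if current in position:
            # Already visited: the loop is the path from that point on.
            return path[position[current]:]
        position[current] = len(path)
        path.append(current)
        current = hypernym_of.get(current)  # None when a root is reached
    return None


# Hypothetical links for illustration only (not taken from WordNet or the Grid).
hypernym_of = {
    "ENG20-00000001-n": "ENG20-00000002-n",
    "ENG20-00000002-n": "ENG20-00000003-n",
    "ENG20-00000003-n": "ENG20-00000002-n",  # loop between 2 and 3
    "ENG20-00000004-n": "ENG20-00000005-n",  # chain that ends at a root
}

loop = find_hypernym_loop("ENG20-00000001-n", hypernym_of)
no_loop = find_hypernym_loop("ENG20-00000004-n", hypernym_of)
print(loop)
print(no_loop)
```
]]></p>
<p>A check of this kind could be run over every synset after each translation pass; synsets with several hypernyms would need a graph traversal instead of a single chain.</p></div>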
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">English-Russian WordNet [Grid] construction</head><p>The English-Russian WordNet was ported into XML using the DTD for the XML structure from http://www.globalwordnet.org/gwa/gwa_grid.htm. The English-Russian DTD and XML format for the English-Russian WordNet and the English-Russian WordNet Grid are shown in Fig. <ref type="figure" target="#fig_4">4,5</ref>. The WordNet Task Force <ref type="bibr" target="#b9">[9]</ref> developed a new approach to WordNet RDF conversion. The W3C WordNet project is still being completed at the level of schema and data (http://www.w3.org/2001/sw/BestPractices/WNET/wn-conversion.html). This approach was used for porting the English-Russian WordNet and the English-Russian WordNet Grid into RDF and OWL.</p><p>There are still open issues: how to support different versions of WordNet in XML/RDF/OWL, how to define the relationships between them, and how to integrate WordNet with sources in other languages.</p><p>?aWordSense wn20schema:word ?aWord . ?aWord wn20schema:lexicalForm "bank"@en-US }</p><p>The proposed semantic framework <ref type="bibr" target="#b8">[8]</ref> for Grid improvement is based on the following main components (Fig. <ref type="figure">6</ref>): an RDF/OWL store; tools for information extraction; tools for the Ontology Engineering Modeling Process; and knowledge mining, SPARQL/SQL search, and analysis tools.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. System for construction and support of machine dictionaries: I - word input; II - paradigm; III - input of basic grammatical features; IV - input of additional grammatical features; V - input of part of speech.</figDesc><graphic coords="3,124.80,147.66,345.72,344.28" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. TenDrow</figDesc><graphic coords="4,313.14,364.56,129.52,91.68" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Standard variants of the English-Russian WordNet [Grid] translation equivalents</figDesc><graphic coords="4,160.86,591.60,284.88,77.52" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Standard DTD for the Russian grid XML structure</figDesc><graphic coords="6,163.80,181.86,267.72,341.28" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Fig. 5 .Fig. 6 .</head><label>56</label><figDesc>Fig.5.</figDesc><graphic coords="7,138.48,459.90,329.70,186.24" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="8,135.96,147.24,323.62,218.28" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Explanations were added to the Russian translation in cases where no translation existed or where only a general translation existed that did not fit the given sense:</figDesc><table><row><cell cols="2">Additional issues in translation can be mentioned: Some English Grid synsets do not contain the words from synsets in</cell></row><row><cell></cell><cell>Example.</cell></row><row><cell></cell><cell>Synset: {talk of; talk about} - discuss or mention</cell></row><row><cell></cell><cell>Example: «They spoke of many things»</cell></row><row><cell></cell><cell>Russian translation: obsudit'; obgovorit'; upomyanut'; kstati skazat';</cell></row><row><cell></cell><cell>kasat'sya kakoi-libo temy; kosnut'sya kakoi-libo temy.</cell></row><row><cell></cell><cell>Synset: {restrict} - place under restrictions; limit access to</cell></row><row><cell></cell><cell>Example: This substance is controlled</cell></row><row><cell></cell><cell>Russian translation: ogranichivat'</cell></row><row><cell></cell><cell>Synset: {bring about} - make possible</cell></row><row><cell></cell><cell>Example: The grant made our research possible</cell></row><row><cell></cell><cell>Russian translation: dat' vozmojnost'; obuslovit' vozmojnost'; sdelat'</cell></row><row><cell></cell><cell>vozmojnym</cell></row></table><note>Synset: {foot; invertebrate foot}. Russian translation: noga. Russian explanation: organ peredvijeniya ili prikrepleniya u nekotoryh bespozvonochnyh. Synset: {soldier}. Russian translation: soldat. Russian explanation: rabochaya osob' kolonii nasekomyh, prisposoblennaya k zaschite soobschestva. New Russian synsets were created from English synset translations:</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>5 Framework architecture for English-Russian WordNet Grid Improvement</head><label></label><figDesc>XMLSpy 2007 and Oracle 11g were used to manage the WordNet Semantic Web models; they provided important XML/RDF/OWL support for data modeling and for editing the XML/RDF/OWL WordNet models. The SPARQL specification defines the syntax and semantics of the SPARQL query language for RDF. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. SPARQL also supports extensible value testing and constraining queries by source RDF graph. The results of SPARQL queries can be result sets or RDF graphs (http://www.w3.org/TR/rdf-sparql-query/). Example. The following query retrieves all Synsets that contain a Word with the lexical form "bank" (http://www.w3.org/TR/wordnet-rdf/):</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>PREFIX wn20schema: &lt;http://www.w3.org/2006/03/wn/wn20/schema/&gt; SELECT ?aSynset WHERE { ?aSynset wn20schema:containsWordSense ?aWordSense .</head><label></label><figDesc></figDesc><table /></figure>
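<div xmlns="http://www.tei-c.org/ns/1.0"><p>The SPARQL example quoted in this paper joins three triple patterns: a synset contains a word sense, the word sense points to a word, and the word has the lexical form "bank". The following plain-Python sketch evaluates the same join over a small in-memory list of triples, only to make the logic of the query explicit. It is an illustration under stated assumptions and not how an RDF store evaluates SPARQL: the resource names in the sample data are invented, the language tag is ignored, and the function name is hypothetical.</p>
<p><![CDATA[
```python
# The query being imitated, assembled from the fragments quoted in the paper:
#   PREFIX wn20schema: <http://www.w3.org/2006/03/wn/wn20/schema/>
#   SELECT ?aSynset
#   WHERE { ?aSynset wn20schema:containsWordSense ?aWordSense .
#           ?aWordSense wn20schema:word ?aWord .
#           ?aWord wn20schema:lexicalForm "bank"@en-US }

CONTAINS = "wn20schema:containsWordSense"
WORD = "wn20schema:word"
LEXICAL_FORM = "wn20schema:lexicalForm"

# Invented toy data: (subject, predicate, object) triples.
triples = [
    ("synset-bank-1", CONTAINS, "sense-bank-1"),
    ("sense-bank-1", WORD, "word-bank"),
    ("synset-bank-2", CONTAINS, "sense-bank-2"),
    ("sense-bank-2", WORD, "word-bank"),
    ("synset-river-1", CONTAINS, "sense-river-1"),
    ("sense-river-1", WORD, "word-river"),
    ("word-bank", LEXICAL_FORM, "bank"),
    ("word-river", LEXICAL_FORM, "river"),
]


def synsets_with_lexical_form(triples, form):
    """Return the sorted synsets that contain a word with the given form."""
    # Pattern 3: ?aWord lexicalForm form
    words = {s for (s, p, o) in triples if p == LEXICAL_FORM and o == form}
    # Pattern 2: ?aWordSense word ?aWord
    senses = {s for (s, p, o) in triples if p == WORD and o in words}
    # Pattern 1: ?aSynset containsWordSense ?aWordSense
    synsets = {s for (s, p, o) in triples if p == CONTAINS and o in senses}
    return sorted(synsets)


result = synsets_with_lexical_form(triples, "bank")
print(result)
```
]]></p>
<p>The sketch returns each synset once, whereas a SPARQL SELECT without DISTINCT may return one row per matching combination of word sense and word.</p></div>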
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">WordNet: An Electronic Lexical Database</title>
		<editor>Fellbaum, C.</editor>
		<imprint>
			<date type="published" when="1998">1998</date>
			<publisher>Bradford Books</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">EuroWordNet: A Multilingual Database with Lexical Semantic Network</title>
		<author>
			<persName><forename type="first">P</forename><surname>Vossen</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1998">1998</date>
			<pubPlace>Dordrecht</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Russian WordNet. From UML-notation to Internet/Intranet Database Implementation</title>
		<author>
			<persName><forename type="first">V</forename><surname>Balkova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Suhonogov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Yablonsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second International WordNet Conference (GWC 2004)</title>
				<meeting>the Second International WordNet Conference (GWC 2004)<address><addrLine>Brno</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="31" to="38" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Some Issues in the Construction of a Russian WordNet Grid</title>
		<author>
			<persName><forename type="first">V</forename><surname>Balkova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Suhonogov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Yablonsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth International WordNet Conference (GWC 2008)</title>
				<meeting>the Fourth International WordNet Conference (GWC 2008)<address><addrLine>Szeged, Hungary</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2008">January 22-25, 2008</date>
			<biblScope unit="page" from="44" to="55" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Semi-Automated English-Russian WordNet Construction: Initial Resources, Software and Methods of Translation</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Yablonsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Suhonogov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third International WordNet Conference (GWC 2006)</title>
				<meeting>the Third International WordNet Conference (GWC 2006)<address><addrLine>South Jeju Island, Korea</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2006">January 22-26, 2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Russicon Slavonic Language Resources and Software</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Yablonsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings First International Conference on Language Resources &amp; Evaluation</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Rubio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Gallardo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Castro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Tejada</surname></persName>
		</editor>
		<meeting>First International Conference on Language Resources &amp; Evaluation<address><addrLine>Granada, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Russian Morphological Analyses</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Yablonsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference VEXTAL</title>
				<meeting>the International Conference VEXTAL<address><addrLine>Venezia, Italia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1999">November 22-24, 1999</date>
			<biblScope unit="page" from="83" to="90" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Russian Morphology: Resources and Java Software Applications</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Yablonsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings EACL03 Workshop Morphological Processing of Slavic Languages</title>
				<meeting>EACL03 Workshop Morphological Processing of Slavic Languages<address><addrLine>Budapest, Hungary</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Semantic Web Framework for Development of Very Large Ontologies</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Yablonsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">POLIBITS</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="19" to="26" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<ptr target="http://www2.unine.ch/imi/page11291_en.html" />
		<title level="m">WordNet OWL Ontology</title>
				<imprint/>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
