<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">NLP &amp; DBpedia: An Upward Knowledge Acquisition Spiral</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Sebastian</forename><surname>Hellmann</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Computer Science</orgName>
								<orgName type="institution" key="instit1">University of Leipzig</orgName>
								<orgName type="institution" key="instit2">AKSW Group</orgName>
								<address>
									<addrLine>Augustusplatz 10</addrLine>
									<postCode>D-04009</postCode>
									<settlement>Leipzig</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Agata</forename><surname>Filipowska</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Faculty of Informatics and Electronic Economy</orgName>
								<orgName type="institution">Poznan University of Economics</orgName>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="department">Department of Information Systems</orgName>
								<address>
									<addrLine>Al. Niepodleglosci 10</addrLine>
									<postCode>61-875</postCode>
									<settlement>Poznan</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="institution">Instytut Informatyki Gospodarczej Sp. z o.o</orgName>
								<address>
									<addrLine>ul. Rubiez 12G/6</addrLine>
									<postCode>61-612</postCode>
									<settlement>Poznan</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Caroline</forename><surname>Barrière</surname></persName>
							<affiliation key="aff4">
								<orgName type="department">Centre de Recherche Informatique de Montréal</orgName>
								<address>
									<settlement>Montréal</settlement>
									<country key="CA">Canada</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Pablo</forename><forename type="middle">N</forename><surname>Mendes</surname></persName>
							<affiliation key="aff5">
								<orgName type="department">Kno.e.sis Center</orgName>
								<orgName type="institution">Wright State University</orgName>
								<address>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dimitris</forename><surname>Kontokostas</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Computer Science</orgName>
								<orgName type="institution" key="instit1">University of Leipzig</orgName>
								<orgName type="institution" key="instit2">AKSW Group</orgName>
								<address>
									<addrLine>Augustusplatz 10</addrLine>
									<postCode>D-04009</postCode>
									<settlement>Leipzig</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">NLP &amp; DBpedia: An Upward Knowledge Acquisition Spiral</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">46B949301D70470F11ECA2EE9E3D498F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T04:02+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>DBpedia</term>
					<term>Natural Language Processing</term>
					<term>RDF</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Recently, the DBpedia community has experienced an immense increase in activity and we believe that the time has come to explore the connection between DBpedia &amp; Natural Language Processing (NLP) in a yet unprecedented depth. DBpedia has a long-standing tradition of providing useful data as well as a commitment to reliable Semantic Web technologies and living best practices. As DBpedia's extraction of Wikipedia's infoboxes matures, we can shift our focus to new challenges such as extracting information from unstructured article text, as well as becoming a testing ground for multilingual NLP methods. DBpedia has the potential to create an upward knowledge acquisition spiral, as it provides a small amount of general knowledge that allows us to process text, derive more knowledge, validate this knowledge and improve text processing methods. The goal of this workshop was to present existing research, systems and resources, but also to allow discussion about different points of convergence and divergence of the NLP and DBpedia communities, with a special focus on the challenges that lie ahead. We would like to take part in the debate on how to use DBpedia for NLP and NLP for DBpedia.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Communities interested in Natural Language Processing (NLP) and in the Semantic Web, in particular DBpedia, come together to explore different ways of collaborating and helping each other towards a common goal of understanding and representing information.</p><p>Resources such as DBpedia are a step towards a solution to the knowledge acquisition bottleneck so often mentioned in the earlier days of NLP <ref type="bibr" target="#b9">[10]</ref>. A prerequisite of text processing and understanding is the availability of knowledge about words, concepts and ways of expressing information. But to acquire such knowledge, we are required to automatically process text or immerse ourselves in costly and error-prone manual knowledge engineering.</p><p>Where formerly there was a chicken-and-egg problem with a serious bootstrapping issue, we now have structured data in DBpedia, which is readily available to turn the bottleneck into an upward knowledge acquisition spiral: a small amount of general knowledge allows us to process text, create more knowledge, validate this knowledge and improve text processing for further acquisition (and so on).</p><p>Recent years have seen a major change, mostly through crowd-sourcing for the construction of the largest encyclopaedic resource, Wikipedia. Although at first it consisted mainly of unstructured data (paragraphs), the addition of infoboxes and the expansion of interest towards the Semantic Web have led to DBpedia, one of the largest openly shared structured resources available today. However, any resource neither curated nor scrutinized by experts will be prone to noise, and that becomes a new and different challenge for NLP. Moreover, any resource, even one as large as DBpedia, is incomplete. So far, mainly the infoboxes, which are already semi-structured, are used to build the RDF repository. But even then, Aprosio et al. 
<ref type="bibr" target="#b0">[1]</ref> (this volume) mention that more than 50% of Wikipedia articles do not include an infobox. So if the article text is analysed, the spiral can turn further, using DBpedia as input for the NLP process and then creating more RDF triples to add and integrate into DBpedia <ref type="bibr" target="#b11">[12]</ref>. This workshop's aim lies right in this knowledge acquisition spiral: bringing together researchers in both areas to see how NLP can benefit DBpedia and how DBpedia can benefit NLP. The contributions in the workshop highlight multiple facets of this duality. In the remainder of this article, we discuss the contributions to the NLP&amp;DBpedia workshop. Our main interest, however, is the challenges that readers can expect to remain unresolved, that is, the many interesting underlying issues brought forward by these articles. Another goal of this workshop was to present existing research, systems and resources to allow discussion about different points of convergence and divergence of the NLP and DBpedia communities. It is also interesting to illustrate when both communities actually tackle very similar problems with different approaches.</p></div>
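The spiral step described above, from NLP output to new triples, can be sketched minimally. The following is an illustrative sketch only: the Leipzig facts, the population value and the helper functions are made up for the example, and a real pipeline would use an RDF library and proper serialization rather than plain Python tuples.

```python
# Illustrative sketch (all facts and values are hypothetical): represent
# NLP-extracted statements as (subject, predicate, object) triples using
# DBpedia-style IRIs, the form in which they could later be integrated
# into an RDF store such as DBpedia.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"

def triple(subject, predicate, obj):
    """Build one (s, p, o) triple with full DBpedia-style IRIs."""
    return (DBR + subject, DBO + predicate, obj)

# Facts an NLP pipeline might extract from an article's running text:
graph = {
    triple("Leipzig", "country", DBR + "Germany"),
    triple("Leipzig", "populationTotal", "510043"),  # hypothetical value
}

def objects(graph, subject, predicate):
    """Query the toy graph for all objects of a (subject, predicate) pair."""
    return {o for (s, p, o) in graph
            if s == DBR + subject and p == DBO + predicate}

print(sorted(objects(graph, "Leipzig", "country")))
# prints ['http://dbpedia.org/resource/Germany']
```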
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Knowledge acquisition and structuring</head><p>To some extent, <ref type="bibr" target="#b6">[7]</ref> (this volume) explores the problem of the above-mentioned knowledge acquisition bottleneck by comparing information extraction systems, in particular NELL <ref type="bibr" target="#b3">[4]</ref>, which spirals over the large corpus ClueWeb09<ref type="foot" target="#foot_0">6</ref> to acquire more and more knowledge, with database extraction approaches based on crowdsourced resources such as DBpedia.</p><p>While the main focus of <ref type="bibr" target="#b6">[7]</ref> is more on how to structure the acquired knowledge than on the acquisition method itself, their work raises an important question: to what extent can we (or should we) use Wikipedia and DBpedia to structure and organize data extracted from text? This relates to an issue known in NLP, computational terminology and, even more, in library science: the debate between classifying (finding which terms in a thesaurus to associate with a document) and free characterisation (extracting any terms from the text for its representation). The former requires a thesaurus-like structure to be built before the text is analysed, but then many questions arise about how such a structure was made. The latter allows the structure (or none) to emerge from the analysed text, but makes it difficult to compare information extracted from different texts, as there is no agreed-upon schema and synonyms stay unresolved.</p><p>The proposal of <ref type="bibr" target="#b18">[19]</ref> is clearly about the acquisition of knowledge to be "fitted" into a known schema, that of the DBpedia ontology. They suggest extending DBpedia through Wikipedia list pages. The main problem is the actual matching between the extracted knowledge and the ontology. 
Knowledge sharing and matching is always problematic because of two main issues in semantics: polysemy (multiple concepts for a word) and synonymy (multiple words for a concept). Furthermore, there are two main issues in ontology design and knowledge structuring: purpose-based versus non-purpose-based ontologies, and the granularity of the information represented. All these issues combined make it quite difficult to attempt any kind of ontology expansion.</p></div>
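The matching problem described above can be made concrete with a deliberately small sketch: matching a type label extracted from text against DBpedia ontology class names by string similarity. The class sample and label are hypothetical, and simple string matching of this kind cannot, of course, handle the polysemy, synonymy and granularity issues just mentioned.

```python
import difflib

# Illustrative sketch: match a type label extracted from a Wikipedia list
# page against DBpedia ontology class names. The class sample is tiny and
# hypothetical; the real DBpedia ontology has over 500 classes.
ONTOLOGY_CLASSES = ["SoccerPlayer", "City", "Band", "University"]

def normalize(name):
    """Split a CamelCase class name into lowercase words."""
    words, current = [], ""
    for ch in name:
        if ch.isupper() and current:
            words.append(current)
            current = ch
        else:
            current += ch
    if current:
        words.append(current)
    return " ".join(w.lower() for w in words)

def best_class(label, cutoff=0.5):
    """Return the ontology class whose normalized name is closest to the label."""
    candidates = {normalize(c): c for c in ONTOLOGY_CLASSES}
    match = difflib.get_close_matches(label.lower(), candidates, n=1, cutoff=cutoff)
    return candidates[match[0]] if match else None

print(best_class("soccer player"))  # prints SoccerPlayer
```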
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Representation of knowledge</head><p>As we look at NLP and DBpedia, we see that NLP requires knowledge about words, not only about concepts. Obviously the notion of labels exists in DBpedia, but there is more to language than labels. Should this lexical information be represented in the same way as conceptual information?</p><p>The separation between lexical, conceptual, terminological, encyclopaedic and other kinds of knowledge has been debated for years. Can a single schema accommodate all types of knowledge? Lexical approaches usually start from words, going from a word to all its senses, while terminological approaches sometimes start from concepts, defining all the words that express each concept. If DBpedia is more concept-based, we can then wonder how lexical information would be attached to it, or, more generally, how lexical knowledge finds its place within the Semantic Web. <ref type="bibr" target="#b25">[26]</ref> (this volume) present a lemon lexicon for DBpedia and discuss different issues in the lexicalization of conceptual structures.</p><p>The BabelNet <ref type="bibr" target="#b14">[15]</ref> resource, resulting from a merge of WordNet <ref type="bibr" target="#b8">[9]</ref> (a widely-used lexical resource in NLP) and Wikipedia, is an example of a mixed-level representation in which lexical, conceptual and encyclopaedic knowledge are combined. BabelNet is used in the work of <ref type="bibr" target="#b7">[8]</ref> (this volume) for the task of QALD (Question Answering over Linked Data), as we will see in the next section. Also, <ref type="bibr" target="#b26">[27]</ref> (this volume) describe the development of their own representation, SAR-Graphs (Semantically Associated Relations Graphs), to express not only lexical knowledge but sentence-based knowledge, which is useful for verbalizing simple predicates as well as combined predicates (child of child, for example). 
These three contributions stimulate a debate on the granularity of representation in any language resource. Such a debate is also present in corpus studies, where experts study the value not only of terms but also of phrases (phraseology) in understanding language use <ref type="bibr" target="#b23">[24]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">NLP tasks and applications</head><p>Although different tasks are mentioned in our workshop's contributions, three are most prominent: Named Entity Recognition (NER), Relation Extraction, and Question Answering over Linked Data (QALD).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Named Entity Recognition</head><p>Named Entity Recognition is defined as the task of assigning a class to entities found in a text, such as person, location, organization, date, etc. NER has been a well-recognized task in the NLP community since the beginning of the Message Understanding Conferences (MUC) in 1987 (see <ref type="bibr" target="#b10">[11]</ref> for a good overview of information extraction and the early MUC conferences). Although not called as such at the time, early work on information extraction looked at text to find who did what, when and how, discovering entities such as places, people and dates. Extracted entities were not necessarily typed or classified, but as information extraction templates were used, such types were implicitly given by the roles the entities filled (Agent, Place, Date).</p><p>Later on, researchers such as Sekine <ref type="bibr" target="#b20">[21]</ref> defined a hierarchical schema of classes for the NER task. The more fine-grained the classes are, however, the more difficult it is to obtain (or even measure) classification results. Obviously, integrating and comparing these hierarchies can be highly complex if no reference hierarchy is agreed upon. One such reference hierarchy is the recently created NERD ontology <ref type="bibr" target="#b19">[20]</ref>, which, however, contains only 84 types<ref type="foot" target="#foot_1">7</ref> and is coarse-grained compared to the more than 500 DBpedia Ontology classes<ref type="foot" target="#foot_2">8</ref>, which are used in <ref type="bibr" target="#b5">[6]</ref> (this volume).</p><p>As mentioned in <ref type="bibr" target="#b22">[23]</ref> (this volume), Named Entity Disambiguation (NED) goes a step further: it identifies not only that an entity is a Person, but who this person actually is, by establishing a link to a more specific reference id or URI in a knowledge base. 
New names have been given to the NED or NERD task, such as entity linking, and the corresponding tools are often called "wikifiers" <ref type="bibr" target="#b5">[6]</ref> (this volume); the list of emerging tools in this class is large and growing steadily: Zemanta, OpenCalais, Ontos, Evri, Extractiv, Alchemy API and many more<ref type="foot" target="#foot_3">9</ref>.</p><p>Wikipedia (and therefore DBpedia) is limited to encyclopaedic knowledge, but often terminological knowledge (how different terms describe different domain-specific concepts) as well as lexical knowledge (common words) are available for interlinking with text, making the task resemble Word-Sense Disambiguation (WSD), i.e. taking any word in a text and connecting it to the appropriate URI. In <ref type="bibr" target="#b7">[8]</ref> (this volume), both tasks (NED and WSD) are tackled using BabelNet.</p></div>
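To illustrate the disambiguation step that separates NED from plain NER, here is a deliberately naive "wikifier" sketch: it chooses among candidate DBpedia URIs for a mention by overlap with hand-made context profiles. All candidate data here is hypothetical; real tools such as those listed above rely on much richer statistical models.

```python
# Deliberately naive "wikifier" sketch: pick the candidate DBpedia URI for
# a mention whose (hand-made, hypothetical) context profile overlaps most
# with the words surrounding the mention.
CANDIDATES = {
    "Paris": {
        "http://dbpedia.org/resource/Paris": {"france", "city", "seine"},
        "http://dbpedia.org/resource/Paris,_Texas": {"texas", "usa", "town"},
    },
}

def link_entity(mention, context_words):
    """Return the candidate URI with the largest context overlap, or None."""
    candidates = CANDIDATES.get(mention)
    if not candidates:
        return None
    return max(candidates,
               key=lambda uri: len(candidates[uri].intersection(context_words)))

print(link_entity("Paris", {"the", "city", "on", "the", "seine"}))
# prints http://dbpedia.org/resource/Paris
```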
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Relation extraction</head><p>The task of relation extraction is sometimes seen as a step following NER: after entities are extracted, it is interesting to see how they are related. But sometimes a more "template-like" strategy is used, as suggested in early information extraction. For example, a system would look for "merger" relations between companies to find out which companies merged. In such a case, the relation is known in advance, and we look in text for both the relation and its participants.</p><p>Different types of relations have been investigated over the years, and as NLP and DBpedia come closer, relations found in DBpedia tend to be used. <ref type="bibr" target="#b15">[16]</ref> (this volume) focus on ten different relations found in DBpedia, identifying them in text through lexical extraction rules they developed. The work of <ref type="bibr" target="#b0">[1]</ref> (this volume) focuses on seven different properties found in DBpedia. By properties, they mean relations for which the subject is most likely a named entity, but the object could be a literal, as for the property populationTotal. The line between properties and relations is fuzzy (for example, both contributions mentioned above use birthDate as a relation to extract from text), which could spark an interesting discussion and debate on this topic. The work of <ref type="bibr" target="#b26">[27]</ref> (this volume) does not target any specific relation and is mostly about the development of a representational schema (as mentioned before) for the English expression of relations.</p><p>How relations are explicitly expressed in text has been a topic of interest in the NLP community for a while. Different methods, either statistical <ref type="bibr" target="#b24">[25]</ref> or pattern-based, have been developed and experimented with <ref type="bibr" target="#b1">[2]</ref>. 
This is an interesting place for NLP and the Semantic Web to meet, as both communities are interested in finding links between concepts and extracting facts.</p></div>
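A single pattern-based extraction rule of the kind discussed above can be sketched as follows. The rule, the sentence and the date format are made-up examples, not those of any contribution in this volume; real systems combine many such rules or replace them with statistical models.

```python
import re

# Illustrative sketch of one hand-written lexical rule for the DBpedia
# birthDate relation; the pattern and the sentence are made-up examples.
BIRTH_PATTERN = re.compile(
    r"([A-Z][a-z]+(?: [A-Z][a-z]+)*) was born on (\d{1,2} [A-Z][a-z]+ \d{4})"
)

def extract_birth_dates(text):
    """Return (subject, birthDate) pairs matched by the lexical rule."""
    return [(m.group(1), m.group(2)) for m in BIRTH_PATTERN.finditer(text)]

sentence = "Ada Lovelace was born on 10 December 1815 in London."
print(extract_birth_dates(sentence))  # [('Ada Lovelace', '10 December 1815')]
```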
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Question Answering over Linked Data</head><p>Within the NLP community, the tasks of Information Retrieval and Question Answering provided some of the early attempts towards a more systematized approach to growing the field. These tasks encouraged the development of challenges and competitions with common data (TREC <ref type="bibr" target="#b21">[22]</ref>), which we discuss in the next section. The more recent task of Question Answering over Linked Data<ref type="foot" target="#foot_4">10</ref> is very interesting, certainly promoting communication and shared interest between the NLP and Semantic Web communities, and also providing some early attempts within the Semantic Web community at sharing data and evaluation standards.</p><p>Three contributions look into QALD. The work of <ref type="bibr" target="#b7">[8]</ref> (this volume) addresses the task with a particular strategy involving NED and word sense disambiguation, as mentioned above. In <ref type="bibr" target="#b2">[3]</ref> (this volume), the QALD task is not just tackled; the authors go further into the study of inconsistency detection when gathering knowledge to answer questions. They look into the English, German, French and Italian chapters of DBpedia and try to detect inconsistencies and supporting evidence among the different answers. In <ref type="bibr" target="#b25">[26]</ref> (this volume), the QALD task is not performed in itself, but it is mentioned as an extrinsic evaluation of the coverage of the lemon lexicon, showing that the verbalizations found in the lexicon cover many of the questions.</p></div>
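At its core, QALD maps a natural-language question to a query over triples. The following toy sketch handles one fixed question template over a hand-made knowledge base; the data and the template are hypothetical, and real QALD systems generate SPARQL queries against live endpoints instead.

```python
import re

# Toy QALD sketch: translate one fixed English question template into a
# triple-pattern lookup over a hand-made knowledge base. Data and template
# are hypothetical; real systems generate SPARQL against live endpoints.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
TRIPLES = {
    (DBR + "Germany", DBO + "capital", DBR + "Berlin"),
    (DBR + "France", DBO + "capital", DBR + "Paris"),
}

QUESTION = re.compile(r"What is the capital of (\w+)\?")

def answer(question):
    """Answer 'What is the capital of X?' by matching a triple pattern."""
    m = QUESTION.match(question)
    if not m:
        return None
    subject = DBR + m.group(1)
    for s, p, o in TRIPLES:
        if s == subject and p == DBO + "capital":
            return o
    return None

print(answer("What is the capital of Germany?"))
# prints http://dbpedia.org/resource/Berlin
```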
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Resources</head><p>As most workshop contributions combine techniques from NLP with the Semantic Web, they discuss different resources that would be useful to the community; we don't want to reinvent the wheel. Even though alternative Semantic Web resources such as Yago (http://www.mpi-inf.mpg.de/yago-naga/yago/) and Freebase (http://www.freebase.com) exist, this workshop focuses on DBpedia, which is therefore the Semantic Web resource most referred to in the different contributions.</p><p>On the NLP side, many frameworks and typical resources exist as well. WordNet (http://wordnet.princeton.edu/), for example, has long been a much-used resource in the community for English. More recently, BabelNet (http://babelnet.org), mentioned earlier, has been developed to merge Wikipedia and WordNet. Also GATE, an open source development framework (http://gate.ac.uk), is used in <ref type="bibr" target="#b5">[6]</ref> (this volume).</p><p>One might think that the primary resource for NLP is text, but which text? There has been work in NLP on different types of texts, from news articles to scientific articles, to blogs, to web data. In the present day, textual content is abundant, and which text should be analysed for which purpose is a pertinent question. In fact, if we see NLP at the service of expanding DBpedia, then the chosen text should be informative, factual and accurate. While mining Wikipedia for more information is an interesting direction, as we saw above, it is not the only one. We also saw (with NELL) that a large crawled Web corpus is a possibility: it brings large coverage, but it can also bring noise.</p><p>Different ways of filtering noise exist, either by trying to evaluate the source of information (trust), or by looking at how consistent or inconsistent different pieces of information are, examining redundancy and conflicts. 
In <ref type="bibr" target="#b2">[3]</ref> (this volume), the general problem of inconsistent information is tackled.</p><p>If we reverse our point of view and see DBpedia at the service of NLP, then the text on which NLP techniques are used is quite arbitrary and depends on further purposes and applications. For example, in <ref type="bibr" target="#b5">[6]</ref> (this volume), both news articles and tweets are explored, two very different types of texts.</p><p>The question of language is valid whether we are looking at "NLP for DBpedia" or "DBpedia for NLP". In <ref type="bibr" target="#b15">[16]</ref> (this volume), French text is analysed, and in <ref type="bibr" target="#b2">[3]</ref> (this volume), four different language chapters of DBpedia are used. Only a minority of contributions explore languages other than English. As always, work on English is more prominent than work on other languages, which raises awareness that it would be interesting for both communities to work on more languages.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Gold and silver standards</head><p>The topic of evaluation is both an important and a much-debated one. In NLP, there has been a tendency over the past 15 years to perform experiments for which there are well-defined gold standards and datasets. There has been an increase in the number of competitions and challenges in many sub-fields of NLP, such as automatic summarization <ref type="bibr" target="#b16">[17]</ref>, word-sense disambiguation <ref type="bibr" target="#b13">[14]</ref>, textual entailment <ref type="bibr" target="#b4">[5]</ref>, etc.</p><p>In the Semantic Web community, there is less of such rigid evaluation, as the field is younger than NLP and is still pushing forward with different ideas and concepts without imposing rigid evaluations. Certainly, one of the purposes of this workshop was to start a discussion towards bringing more gold standards and evaluation datasets into the community. There are some competitions in other areas, such as the OAEI (Ontology Alignment Evaluation Initiative<ref type="foot" target="#foot_5">11</ref>), which has been running for a few years now, as well as the QALD challenge (see above) and a plethora of benchmarks for triplestores such as the DBPSB (DBpedia SPARQL Benchmark <ref type="bibr" target="#b12">[13]</ref>). In the field of NER/NED, however, there are not many datasets or gold standards and only a few challenges. The work of <ref type="bibr" target="#b5">[6]</ref> (this volume) paves the way towards the standardization of NER and NED benchmarking in an implemented benchmarking system.</p><p>As a first important step towards developing such a gold standard, it is also good to review and question existing work. The work of <ref type="bibr" target="#b22">[23]</ref> (this volume) is an extensive comparison of NED benchmarks and characterizes them to see whether they could be biased towards particular types of algorithms or types of test data. 
The contribution therefore opens the debate on how we should develop such benchmarks and provides a solid foundation to build upon.</p><p>When gold standards are hard (costly, time-consuming) to develop, it can be interesting to develop silver standards that are the results of well-known methods, or the combined results of different methods. Such standards do not replace gold standards, but they at least give an indication of the direction of progress for particular algorithms. One possibility when two communities come together is to take the results of one to become the "silver standard" of the other. <ref type="bibr" target="#b17">[18]</ref> (this volume) describes such a silver standard and discusses its benefits as well as its limitations.</p><p>In some work, such as <ref type="bibr" target="#b15">[16]</ref> (this volume) and <ref type="bibr" target="#b0">[1]</ref> (this volume), DBpedia's network of relations is used as a gold standard in relation extraction. Wikipedia/DBpedia entities have also become the most predominant link targets in NED: <ref type="bibr" target="#b19">[20]</ref> reports that 7 out of 10 tools attach Wikipedia/DBpedia URLs as annotations (3 out of 10 for the DBpedia Ontology). Although this is an interesting way to proceed, we can debate whether we are using gold or silver standards and how to unify benchmarks for comparison.</p></div>
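Whether a benchmark is gold or silver, systems are typically compared with precision, recall and F1 over their annotation sets. A minimal scoring sketch, with entirely hypothetical annotations, is:

```python
# Minimal scoring sketch for NED benchmarking: precision, recall and F1
# over sets of (mention, uri) annotations. All annotations are hypothetical.
def score(gold, predicted):
    """Return (precision, recall, f1) for predicted against gold."""
    true_positives = len(gold.intersection(predicted))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = {("Paris", "dbr:Paris"), ("Berlin", "dbr:Berlin")}
pred = {("Paris", "dbr:Paris"), ("Berlin", "dbr:Germany")}
print(score(gold, pred))  # (0.5, 0.5, 0.5)
```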
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Summary</head><p>We conclude by highlighting a few issues brought forward by the contributions in this workshop. First, the selected papers discuss many problems that have been recognized within the NLP community for a long time, but have only recently been introduced to Semantic Web researchers. The main challenges here concern:</p><p>-consensus upon annotation guidelines, development of extraction rules and agreed-upon hierarchies that may be used to unify semantic enrichment and benchmarks, -identification of well-defined tasks and problem classes, -transferability of NLP tasks, resources and tools to other research communities (e.g. library and life sciences) as well as to other languages and application areas, -building practical resources and infrastructures which do not target one single research question, but can be exploited in a more universal manner by NLP tools, -unlocking higher layers of semantic annotation to enable state-of-the-art OWL-based reasoning on a combination of noisy NLP data and knowledge structures based on LOD and DBpedia.</p><p>Second, and perhaps more importantly, new possibilities emerge from the combination of the communities, and we hope to further push such possibilities to have more NLP for DBpedia and more DBpedia for NLP, continuing the knowledge spiral and fighting together to open the knowledge acquisition bottleneck. We hope that the readers of this volume will find all papers interesting. We invite you to join our community and attend future workshop editions.</p></div>			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_0">http://lemurproject.org/clueweb09/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_1">http://nerd.eurecom.fr/ontology, accessed Oct. 10th, 2013</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_2">An up-to-date version can be downloaded from http://mappings.dbpedia.org/server/ontology/dbpedia.owl</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_3">http://en.wikipedia.org/wiki/Knowledge_extraction#Tools contains an up-to-date overview</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_4">The first challenge started in 2011, and information can be found at http://greententacle.techfak.uni-bielefeld.de/~cunger/qald/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_5">http://oaei.ontologymatching.org/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments.</head><p>We especially thank all contributors to DBpedia and the DBpedia Internationalisation committee <ref type="bibr" target="#b11">12</ref> . This work was supported by grants from the European Union's 7th Framework Programme provided for the projects LOD2 (GA no. 257943) and GeoKnow (GA no. 318159).</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Programme Committee</head><p>We would like to thank all reviewers who have helped us and especially the authors with their comments and feedback.</p><p>-Guadalupe Aguado, Universidad Politécnica de Madrid, Spain -Chris Bizer, Universität Mannheim, Germany -Volha Bryl, Universität Mannheim, Germany -Paul Buitelaar, DERI, National University of Ireland, Galway -Charalampos Bratsas, OKFN, Aristotle University of Thessaloniki, Greece -Philipp Cimiano, CITEC, Universität Bielefeld, Germany -Samhaa R. El-Beltagy, Nile University, Egypt -Daniel Gerber, AKSW, Universität Leipzig, Germany -Jorge Gracia, Universidad Politécnica de Madrid, Spain -Max Jakob, Neofonie GmbH, Germany -Anja Jentzsch, Hasso-Plattner-Institut, Potsdam, Germany -Ali Khalili, AKSW, Universität Leipzig, Germany -Daniel Kinzler, Wikidata, Germany </p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Extending the Coverage of DBpedia Properties using Distant Supervision over Wikipedia</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Aprosio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Giuliano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">Alberto</forename><surname>Lavelli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia; Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25. October 2013</date>
		</imprint>
	</monogr>
	<note>1064 of NLP &amp; DBpedia 2013</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Probing Semantic Relations: Exploration and identification in specialized texts</title>
		<author>
			<persName><forename type="first">A</forename><surname>Auger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Barrière</surname></persName>
		</author>
		<imprint>
			<publisher>John Benjamins</publisher>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Argumentation-based Inconsistencies Detection for Question-Answering over DBpedia</title>
		<author>
			<persName><forename type="first">E</forename><surname>Cabrio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cojan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Villata</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Gandon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Toward an architecture for never-ending language learning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Carlson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Betteridge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kisiel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Settles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">R</forename><surname>Hruschka</surname><genName>Jr</genName></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Mitchell</surname></persName>
		</author>
		<editor>
			<persName><forename type="first">M</forename><surname>Fox</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Poole</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2010">2010</date>
			<publisher>AAAI Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Textual entailment</title>
		<author>
			<persName><forename type="first">D</forename><surname>Cristea</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computational Linguistics</title>
				<imprint>
			<date type="published" when="2009-06">June. 2009</date>
			<biblScope unit="page" from="1140" to="1143" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Datasets and GATE Evaluation Framework for Benchmarking Wikipedia-Based NER Systems</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dojchinovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kliegr</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Integrating Open and Closed Information Extraction: Challenges and First Steps</title>
		<author>
			<persName><forename type="first">A</forename><surname>Dutta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Niepert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ponzetto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Using BabelNet in Bridging the Gap Between Natural Language Queries and Linked Data Concepts</title>
		<author>
			<persName><forename type="first">K</forename><surname>Elbedweihy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wrigley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ciravegna</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">WordNet: an electronic lexical database</title>
		<author>
			<persName><forename type="first">C</forename><surname>Fellbaum</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1998">1998</date>
			<publisher>MIT Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A method for disambiguating word senses in a large corpus</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">A</forename><surname>Gale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">W</forename><surname>Church</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yarowsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers and the Humanities</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="issue">5-6</biblScope>
			<biblScope unit="page" from="415" to="439" />
			<date type="published" when="1992">1992</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Information Extraction: Techniques and Challenges</title>
		<author>
			<persName><forename type="first">R</forename><surname>Grishman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1997">1997</date>
			<biblScope unit="page" from="10" to="27" />
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Round-trip semantics with sztakipedia and dbpedia spotlight</title>
		<author>
			<persName><forename type="first">M</forename><surname>Héder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">N</forename><surname>Mendes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">WWW (Companion Volume)</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Mille</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><forename type="middle">L</forename><surname>Gandon</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Misselis</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Rabinovich</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Staab</surname></persName>
		</editor>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="357" to="360" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">DBpedia SPARQL Benchmark - Performance Assessment with Real Queries on Real Data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Morsey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lehmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Auer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-C</forename><surname>Ngonga Ngomo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISWC 2011</title>
				<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">SemEval-2013 Task 12 : Multilingual Word Sense Disambiguation</title>
		<author>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jurgens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vannella</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th International Workshop on Semantic Evaluation SemEval 2013 in conjunction with the Second Joint Conference on Lexical and Computational Semantics SEM 2013</title>
				<meeting>the 7th International Workshop on Semantic Evaluation SemEval 2013 in conjunction with the Second Joint Conference on Lexical and Computational Semantics SEM 2013</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Babelnet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network</title>
		<author>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">P</forename><surname>Ponzetto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artif. Intell</title>
		<imprint>
			<biblScope unit="volume">193</biblScope>
			<biblScope unit="page" from="217" to="250" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A Rule-Based Relation Extraction System using DBpedia and Syntactic Parsing</title>
		<author>
			<persName><forename type="first">K</forename><surname>Nebhi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Text Summarization Challenge 2 - Text summarization evaluation at NTCIR Workshop 3</title>
		<author>
			<persName><forename type="first">M</forename><surname>Okumura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Fukusima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Nanba</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the HLT-NAACL 03 Text Summarization Workshop</title>
				<meeting>the HLT-NAACL 03 Text Summarization Workshop</meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="49" to="56" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">DBpediaNYD - A Silver Standard Benchmark Dataset for Semantic Relatedness in DBpedia</title>
		<author>
			<persName><forename type="first">H</forename><surname>Paulheim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Extending DBpedia with Wikipedia List Pages</title>
		<author>
			<persName><forename type="first">H</forename><surname>Paulheim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">P</forename><surname>Ponzetto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">NERD meets NIF: Lifting NLP extraction results to the linked data cloud</title>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Troncy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hellmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bruemmer</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
			<publisher>LDOW</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Definition, dictionaries and tagger for extended named entity hierarchy</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sekine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Nobata</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Language Resources and Evaluation Conference LREC</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Zampolli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Lino</surname></persName>
		</editor>
		<meeting>the Language Resources and Evaluation Conference LREC</meeting>
		<imprint>
			<publisher>European Language Resources Association</publisher>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="1977" to="1980" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Further reflections on TREC</title>
		<author>
			<persName><forename type="first">K</forename><surname>Sparck Jones</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="37" to="85" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Statistical Analyses of Named Entity Disambiguation Benchmarks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Steinmetz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Knuth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">An example of frequent English phraseology: Distribution, structures and functions</title>
		<author>
			<persName><forename type="first">M</forename><surname>Stubbs</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Corpus linguistics 25 years on</title>
				<editor>
			<persName><forename type="first">R</forename><surname>Facchinetti</surname></persName>
		</editor>
		<imprint>
			<publisher>Rodopi</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="89" to="105" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Corpus-based Learning of Analogies and Semantic Relations</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">D</forename><surname>Turney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Littman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Machine Learning</title>
				<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="page" from="1" to="3" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">A lemon lexicon for DBpedia</title>
		<author>
			<persName><forename type="first">C</forename><surname>Unger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mccrae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Walter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Winter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">From Strings to Things SAR-Graphs: A New Type of Resource for Connecting Knowledge and Language</title>
		<author>
			<persName><forename type="first">H</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Xu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 1st International Workshop on NLP and DBpedia</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>1st International Workshop on NLP and DBpedia<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-10-25">October 21-25, 2013</date>
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings, vol. 1064</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
