<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Language-Agnostic Knowledge Graphs for Smarter Multilingual Chatbots</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alena</forename><surname>Vasilevich</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Coreon GmbH</orgName>
								<address>
									<addrLine>Rungestrasse 20</addrLine>
									<postCode>10179</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Michael</forename><surname>Wetzel</surname></persName>
							<email>michael@coreon.com</email>
							<affiliation key="aff0">
								<orgName type="department">Coreon GmbH</orgName>
								<address>
									<addrLine>Rungestrasse 20</addrLine>
									<postCode>10179</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Georg</forename><surname>Sedlbauer</surname></persName>
							<email>sedlbauer@wirtschaftsagentur.at</email>
							<affiliation key="aff1">
								<orgName type="department">Vienna Business Agency</orgName>
								<address>
									<addrLine>Mariahilfer Strasse 20</addrLine>
									<postCode>1070</postCode>
									<settlement>Vienna</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kerstin</forename><surname>Hubmer</surname></persName>
							<email>hubmer@wirtschaftsagentur.at</email>
							<affiliation key="aff1">
								<orgName type="department">Vienna Business Agency</orgName>
								<address>
									<addrLine>Mariahilfer Strasse 20</addrLine>
									<postCode>1070</postCode>
									<settlement>Vienna</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Language-Agnostic Knowledge Graphs for Smarter Multilingual Chatbots</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">BD8033C546970FCD28BBA71AEB329176</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T03:53+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Chatbots</term>
					<term>knowledge management</term>
					<term>terminology management</term>
					<term>knowledge graphs</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>CEFAT4Cities targets the development of multilingual cross-border e-Government services, facilitating the conversion of natural-language administrative procedures into machine-readable data. We showcase the integration of CEFAT4Cities results into SmartBot, a prototype of a multilingual chatbot developed for the Vienna Business Agency (VBA) within the scope of the project. SmartBot makes VBA's services discoverable in a user-friendly way, targeting topics such as starting a new business and finding relevant grants among hundreds of funding opportunities. It is driven by multilingual AI that incorporates the results of CEFAT4Cities workflows, integrated into its domain knowledge along with multilingual domain-specific vocabularies and represented as a language-agnostic knowledge graph in Coreon. Thanks to the integrated multilingual knowledge system (MKS), SmartBot is able to infer connections between language-agnostic concepts and deal with terms previously unseen by the bot's language model.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Tools and data provided by public-sector and private organizations still tend to be institutionally fragmented. The fragmentation of the European e-government fabric has triggered the emergence of interoperability solutions that unify and simplify interaction between cross-border and cross-sector services. EuroVoc and ISA2 are among such interoperable solutions, fostering uniformity within the technical, semantic, organizational, and legal layers across the EU <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. The existing standards for Public Sector Information (PSI) provision supply instruments to describe e-Government services in a uniform way. Yet they remain mostly unexploited and often lack user-centric design, let alone multilingual functionality that would support the official linguistic diversity of the EU <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. The CEFAT4Cities project (2020-2022) targets this challenge of interaction between EU residents, businesses, and public services, aiming to speed up the adoption of multilingual cross-border e-Government services. Its main objective is a software layer that facilitates the conversion of natural-language administrative procedures into machine-readable data (see <ref type="bibr" target="#b3">[4]</ref> for details). 
Integrating its output into existing EU resources, such as ISA2 and CPSV <ref type="foot" target="#foot_0">4</ref>, which describe public services and the associated life and business events <ref type="bibr" target="#b3">[4]</ref>, we created an open linked data repository that unites concepts relevant to businesses and citizens.</p><p>In this paper, we showcase how this resource is leveraged in a prototype of a real-life chatbot application, SmartBot, developed for the Vienna Business Agency (VBA) <ref type="foot" target="#foot_1">5</ref> within the scope of the CEFAT4Cities project. Chatbots have recently started to emerge in various fields, with use cases such as information retrieval, service discoverability, customer service, and administrative workflows <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8]</ref>. Since dialogue is a natural way of interaction between humans, conversational agents designed to mimic this behaviour have the potential to increase the efficiency of public services. In our case, SmartBot's goal is to automate VBA's services and make them discoverable in a user-friendly way, targeting topics such as starting a new business in Vienna and helping users find relevant grants among hundreds of funding opportunities for companies of various sizes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Chatbot's architecture</head><p>Figure <ref type="figure" target="#fig_0">1</ref> displays SmartBot's architecture. The prototype is powered by Rasa Open Source<ref type="foot" target="#foot_2">6</ref>, a framework for building conversational AI assistants. Domain knowledge as well as VBA-specific vocabularies are organised as a language-agnostic knowledge graph and curated in Coreon <ref type="foot" target="#foot_3">7</ref>. Rasa Open Source is modular by design: it consists of two primary components, Natural Language Understanding (NLU) and dialogue management (Rasa Core), and allows easy integration with other systems. The NLU component is responsible for understanding the input received from the user: it handles intent classification and entity identification in users' utterances. The dialogue management component predicts the next action in a conversation based on the context. Rasa SDK handles all of our custom code, which is organised as custom actions that search databases, make API calls, trigger a handover of the conversation to a human, etc. Rasa Open Source is therefore adjustable to developers' needs, featuring straightforward integration and data control <ref type="bibr" target="#b8">[9]</ref>.</p><p>On top of Rasa, our architecture features an integration with the Coreon Multilingual Knowledge System (MKS) <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>. MKS is a semantic knowledge repository composed of concepts linked via relations. Following semantic web standards, it caters for visual discovery, access, drafting, and re-usability of any assets, organised in language-agnostic knowledge graphs. 
Since the linking is performed at the concept level, we can abstract from language-specific terms and model structured knowledge for phenomena that reflect the non-deterministic nature of human language, such as word sense ambiguity, synonymy, homonymy, and multilingualism. Linking per concept ensures smooth maintenance of relations without additional data clutter: relation edges are independent of labels, terms, and other metadata. It thus helps exchange information among acting systems and ensures that its precise meaning is understood and preserved among all parties, in any language.</p></div>
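<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the concept-level linking described above concrete, the following minimal Python sketch models concepts as language-agnostic nodes whose terms are per-language labels, with relation edges defined between concept IDs only. It does not reproduce Coreon's actual data model or API: the concept IDs, the terms, the relation name, and the helper functions concepts_for_term and related are hypothetical and chosen for illustration.</p>
<p>
```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A language-agnostic node: the ID is stable, terms are per-language labels."""
    concept_id: str
    terms: dict[str, list[str]] = field(default_factory=dict)  # language code -> terms


# Hypothetical concept IDs and terms, invented for illustration only.
CONCEPTS = {
    "c-001": Concept("c-001", {"en": ["funding", "grant"], "de": ["Förderung", "Zuschuss"]}),
    "c-002": Concept("c-002", {"en": ["start-up", "new business"], "de": ["Start-up", "Neugründung"]}),
}

# Relation edges connect concept IDs only, so they hold in every language.
RELATIONS = [("c-001", "associated_with", "c-002")]


def concepts_for_term(term: str, lang: str) -> list[str]:
    """Return the IDs of all concepts that carry `term` as a label in `lang`."""
    wanted = term.casefold()
    return [
        cid
        for cid, concept in CONCEPTS.items()
        if wanted in (t.casefold() for t in concept.terms.get(lang, []))
    ]


def related(concept_id: str) -> list[str]:
    """Return the IDs linked to `concept_id`, following edges in either direction."""
    out = []
    for source, _relation, target in RELATIONS:
        if source == concept_id:
            out.append(target)
        elif target == concept_id:
            out.append(source)
    return out
```
</p>
<p>Under these assumptions, concepts_for_term("grant", "en") and concepts_for_term("Zuschuss", "de") resolve to the same concept ID, so anything attached to that ID, including its relation edges, is shared across languages without being duplicated per term.</p></div>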
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Leveraging language-agnostic knowledge graphs</head><p>A large part of any chatbot's implementation is associated with domain data. In our case, smooth cooperation in knowledge transfer is facilitated by MKS: VBA domain experts used it to model their domain knowledge, populating and curating it as a graph (see the Data Curation side in Figure <ref type="figure" target="#fig_0">1</ref>). The repository also incorporates the interoperability layer and public-service-themed multilingual vocabularies. Aside from easy knowledge drafting, incorporating language-agnostic knowledge graphs into virtual assistants tackles four concrete challenges: i) multilingualism; ii) language-independent entity management; iii) enabling semantic search; iv) dealing with homonymy and unseen terms.</p><p>In the European context, multilingualism is a big asset, yet it also brings a conceptual challenge: the kind of multilingualism served tends to heavily influence the architecture and scalability of a solution. We decided to use individual NLU models per language, i.e. keeping them language-specific, while making dialogue management (Stories) universal, adding an extra layer of abstraction to maintain consistency in the bot's behaviour across languages. This implies that the core model should not have a single language-specific string among its training data, but rather an abstraction for the representation of entities, such as language-independent IDs. We abstracted from entity maintenance in distinct languages by replacing language-specific terms in the NLU training data with their unique Coreon concept IDs. Maintaining entities in each language separately would be tedious and inconsistent, particularly since the VBA domain knowledge is not static. Language-agnostic entities are also crucial for keeping the Core module language-agnostic, abstracted from entity names in a specific language. 
Once VBA decides to expand SmartBot's capabilities with a new language, this method of universal entities will ensure smooth model development and minimize the labeling effort.</p><p>The core goal of SmartBot is to serve the user relevant grant recommendations based on previously provided input (see Figure <ref type="figure" target="#fig_1">2</ref> for a conversation snippet). This implies that the bot has to fetch records with relevant VBA grants. To achieve this, we match information drawn from the user's input that influences the funding outcome (e.g., intents, entities, and their types extracted by Rasa NLU). Since grant information was also imported into MKS and each grant entry was linked with relevant entity types, we can leverage these relations between concepts in the repository. With this functionality, SmartBot is able to fetch relevant funding entries even when terms extracted from the user's input do not explicitly appear among VBA funding entries: the bot navigates the parental and associative relations of the extracted entity and infers whether any semantically close or connected concepts are linked with specific funding entries. Ultimately, we cover this scenario: given a VBA grant for SMEs focusing on environmental protection, a user X, searching for grants for small businesses doing roof planting or vertical gardening, and a user Y, looking for funding to support a startup that calculates the CO2 footprint of businesses, would both arrive at this grant.</p><p>Unseen terms and homonymy are tackled by the KG in the same fashion. If users choose to use terminology previously unknown to the model, SmartBot will first try to get its meaning using the connector to Coreon rather than taking a standard fallback. Suppose a German-speaking user enquires about the amount of money they can get from VBA and refers to money as Kohle, a slang term homonymous with Kohle, "coal", a fossil fuel. 
The NLU model does not know this term, so the bot makes an API request, searches for it in MKS, and finds two hits in two distinct concepts. The first one belongs to the CO2 concept in a branch dealing with resource saving and sustainability. The second one is found among the synonyms for Geldmittel, which denotes financial funds and has a more generic parent Geld, "money". Since quite a few terms of the concept Geldmittel are known to the NLU model and the context of the conversation matches, the meaning of Kohle is disambiguated for the chatbot; subsequently, SmartBot informs the user about the amount of money they can qualify for.</p><p>Chatbots are becoming a turning point for the rationalization of business processes. Here we investigated the technical feasibility and described the implementation of a prototype that can support VBA and serve the needs of Vienna residents, catering to interaction in the language of their choice and understanding the intents of their requests.</p><p>Combining Rasa Open Source with reusable multilingual KG data, we delivered an intelligent chatbot solution that is robust, extendable, and modular, a steady reference point for similar activities that facilitate the provision of PSI. By accommodating the chatbot interaction to the user's needs, VBA SmartBot automatically overcomes the language gap, contributing to the elevation of local public services to the European scale and to red tape reduction.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: SmartBot's architecture.</figDesc><graphic coords="2,89.29,84.19,416.70,175.92" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: A demo dialogue snippet.</figDesc><graphic coords="4,89.29,88.60,187.51,304.56" type="bitmap" /></figure>
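<div xmlns="http://www.tei-c.org/ns/1.0"><p>The grant lookup and the handling of Kohle described in Section 3 can be sketched as a bounded graph traversal. The Python example below is a simplified illustration, not SmartBot's implementation: all concept IDs, relations, and the grant name are invented, the hop limit is an assumed parameter, and scoring candidates by their overlap with concepts already present in the dialogue context is an assumed stand-in for the disambiguation step.</p>
<p>
```python
from collections import deque
from typing import Optional

# All IDs, relations, and grant names below are hypothetical illustration data.
BROADER = {  # concept ID -> parent concept IDs
    "roof-planting": ["environmental-protection"],
    "co2-footprint": ["sustainability"],
    "coal": ["co2"],
    "co2": ["sustainability"],
    "funds": ["money"],
}
ASSOCIATED = {  # associative links, stored once but followed in both directions
    "sustainability": ["environmental-protection"],
}
GRANTS = {"environmental-protection": ["Green SME grant"]}  # concept ID -> grants
TERMS = {"kohle": ["coal", "funds"]}  # case-folded term -> candidate concept IDs


def neighbours(concept_id: str) -> list[str]:
    """Return parents plus associated concepts of `concept_id`."""
    linked = list(BROADER.get(concept_id, []))
    linked += ASSOCIATED.get(concept_id, [])
    linked += [src for src, targets in ASSOCIATED.items() if concept_id in targets]
    return linked


def find_grants(concept_id: str, max_hops: int = 2) -> list[str]:
    """Breadth-first search for grants linked to the concept or to nearby concepts."""
    seen = {concept_id}
    queue = deque([(concept_id, 0)])
    found: list[str] = []
    while queue:
        current, hops = queue.popleft()
        found += [g for g in GRANTS.get(current, []) if g not in found]
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return found


def disambiguate(term: str, context: set[str]) -> Optional[str]:
    """Pick the candidate concept that overlaps most with the dialogue context."""
    candidates = TERMS.get(term.casefold(), [])
    scored = [(len(context.intersection([c, *neighbours(c)])), c) for c in candidates]
    best = max(scored, default=(0, None))
    # Without any overlap the term stays unresolved and a fallback can take over.
    return best[1] if best[0] > 0 else None
```
</p>
<p>With this toy data, find_grants("roof-planting") and find_grants("co2-footprint") both reach the same grant through different paths, mirroring users X and Y above, while disambiguate("Kohle", {"money"}) selects the funds concept because its parent already occurs in the conversation context.</p></div>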
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_0">https://ec.europa.eu/isa2/solutions/core-public-service-vocabulary-application-profile-cpsv-ap_en</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1">https://viennabusinessagency.at/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_2">https://github.com/RasaHQ/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_3">https://www.coreon.com/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This project utilises the results of CEFAT4Cities Action, funded by the European Commission's CEF Telecom programme under Grant 2019-EU-IA-0015.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Promoting interoperability in europe&apos;s e-government</title>
		<author>
			<persName><forename type="first">K</forename><surname>Bovalis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Peristeras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Abecasis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R.-M</forename><surname>Abril-Jimenez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gattegno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Karalopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sagias</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Szekacs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wigard</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer</title>
		<imprint>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="page" from="25" to="33" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Using chatbots and semantics to exploit public sector information</title>
		<author>
			<persName><forename type="first">E</forename><surname>Tambouris</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">EGOV-CeDEM-ePart</title>
		<imprint>
			<biblScope unit="page" from="125" to="132" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Playing the telephone game in a multilevel polity: On the implementation of e-government services for business in the eu</title>
		<author>
			<persName><forename type="first">E.-J</forename><surname>Mulder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Snijders</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Government Information Quarterly</title>
		<imprint>
			<biblScope unit="page" from="101526" to="101534" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Cefat4cities, a natural language layer for the isa2 core public service vocabulary</title>
		<author>
			<persName><forename type="first">J</forename><surname>Van Den Bogaert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Defauw</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Szoc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Everaert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Van Winckel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kramchaninova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bardadym</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Vanallemeersch</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22nd Annual Conference of the European Association for Machine Translation</title>
				<meeting>the 22nd Annual Conference of the European Association for Machine Translation</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="483" to="484" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Artificial intelligence for citizen services and government</title>
		<author>
			<persName><forename type="first">H</forename><surname>Mehr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fellow</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Ash Cent. Democr. Gov. Innov</title>
				<imprint>
			<publisher>Harvard Kennedy Sch</publisher>
			<date type="published" when="2017-08">August 2017</date>
			<biblScope unit="page" from="1" to="12" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">On using chatbots and cpsv-ap for public service provision</title>
		<author>
			<persName><forename type="first">A</forename><surname>Stamatis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gerontas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tambouris</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">EGOV-CeDEM-ePart</title>
		<imprint>
			<biblScope unit="page" from="133" to="139" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Artificial intelligence for public sector: chatbots as a customer service representative</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Adnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hamdan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Alareeni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Business and Technology</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="164" to="173" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Koch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Linnik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Pelzel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sultanow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Welter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cox</surname></persName>
		</author>
		<idno type="DOI">10.18420/informatik2021-106</idno>
		<title level="m">A reference architecture for on-premises chatbots in banks and public institutions</title>
				<meeting><address><addrLine>Bonn</addrLine></address></meeting>
		<imprint>
			<publisher>Gesellschaft für Informatik</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1265" to="1281" />
		</imprint>
	</monogr>
	<note>INFORMATIK 2021</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Evaluating natural language understanding services for conversational question answering systems</title>
		<author>
			<persName><forename type="first">D</forename><surname>Braun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Mendez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Matthes</surname></persName>
		</author>
		<author>
			<persName><surname>Langen</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/W17-5522</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Association for Computational Linguistics</title>
				<meeting>the 18th Annual SIGdial Meeting on Discourse and Dialogue, Association for Computational Linguistics<address><addrLine>Saarbrücken, Germany</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="174" to="185" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Multilinguale taxonomien mit coreon. wissens-und sprachmanagement in einer lösung, Rechte, Rendite, Ressourcen</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wetzel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Wirtschaftliche Aspekte des Terminologiemanagements</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="41" to="51" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Metadaten für intelligenten content</title>
		<author>
			<persName><forename type="first">W</forename><surname>Ziegler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Schriften zur Technischen Kommunikation</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="51" to="66" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>Intelligente Information</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
