<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Querying Enterprise Knowledge Graph With Natural Language</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Junyi Chai</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yonggang Deng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maochen Guan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yujie He</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bing Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rui Yan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Microsoft AI &amp; R</institution>
          ,
          <addr-line>Bellevue, WA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Conversational interfaces to enterprise knowledge graphs holding large amounts of data have recently become popular in AI-powered applications. Natural language understanding (NLU) techniques not only provide user-friendly interaction but also boost productivity by eliminating the need to learn the structured query languages (such as SPARQL) required to access knowledge databases. We present Yugen as a first step toward conversational AI that answers user queries using a knowledge graph in the enterprise domain. Yugen is a deep-learning-based NLU &amp; QA engine currently serving the Microsoft Enterprise Graph product on the Azure cloud. The development of Yugen has tackled many enterprise-domain challenges, such as the data cold-start problem; detecting domain-specific query intents, key entities, and their relationships; generating correct graph queries; and restating the query results in natural language back to users.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Model Training. We first implement a training data generation pipeline that
overcomes the cold-start issue under compliance constraints, a common challenge among
enterprise AI-powered applications. This pipeline requires minimal human
annotation and is highly customizable, allowing enterprise users to train NLU models
with domain-specific data. Yugen supports online training, where user-defined
intents and entity types can be used to train new NLU models that suit
changing business scenarios. Our NLU system consists of an attention-based
seq2seq deep learning model with one embedding layer and a 2-layer Bi-LSTM (each
layer has 8 multi-head attention mechanisms). This model architecture can easily be
extended to different training tasks, e.g. by adding a CRF layer for an
entity detection model or a softmax layer for an intent detection model.
Online-trained models can be deployed with one click to serve production user queries.</p>
      <p>Online Serving. The input query is first pre-processed. This includes
casing/punctuation reconstruction, tokenization, POS tagging, dependency parsing,
and date/time recognition. The pre-processed result is then sent to two subsequent
components: intent detection and entity mention detection (EMD).<fn><p>All authors contributed equally to this work.</p></fn></p>
      <p>Intent detection recognizes user intents, which usually cover a wide
spectrum, ranging from finding people to scheduling meetings. Each intent maps to
a specific class in the ontology. E.g. "Find me quantum computing experts who
live in Redmond." has the "expert" intent with a type of "Employee".</p>
      <p>EMD detects key entity mentions and their types in the query. Yugen's EMD
detects more entities than traditional NER: it can recognize any
entity in the knowledge graph by distinguishing concepts from instances. In the
above query example, EMD detects "experts" as a concept entity with the
type "Employee" and "quantum computing" as an instance entity with the
type "Expertise", which traditional NER would not be able to do.</p>
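      <p>As an illustration only, the combined output of intent detection and EMD for the example query could be held in a small data structure like the one sketched below. The class and field names (EntityMention, NluResult, kind) are hypothetical and are not taken from Yugen, and the values are written by hand rather than predicted by a model.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class EntityMention:
    text: str  # surface form found in the query
    type: str  # ontology type, e.g. "Employee"
    kind: str  # "concept" (a class of things) or "instance" (a specific value)


@dataclass
class NluResult:
    query: str
    intent: str
    intent_type: str
    mentions: List[EntityMention] = field(default_factory=list)

    def instances(self) -> List[EntityMention]:
        """Mentions that name specific values rather than classes."""
        return [m for m in self.mentions if m.kind == "instance"]


# Hand-written result for the example query; a real system would predict it.
# "Redmond" is left out because the text does not state its entity type.
result = NluResult(
    query="Find me quantum computing experts who live in Redmond.",
    intent="expert",
    intent_type="Employee",
    mentions=[
        EntityMention("experts", "Employee", "concept"),
        EntityMention("quantum computing", "Expertise", "instance"),
    ],
)
print([m.text for m in result.instances()])
```
]]></preformat>
      <p>Keeping the concept/instance distinction on each mention lets later stages treat "experts" as the thing being searched for and "quantum computing" as a constraint on it.</p>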
      <p>The aggregated outputs from intent detection and EMD are then passed
to relation mention detection (RMD). RMD detects the relations between the
intent and the entities, then constructs a graph structure representing Yugen's
understanding of the original search query. RMD considers both the ontology and
the query's dependency parse tree for relation detection. In the above
example, RMD detects a two-hop relationship (hasExpertise → hasExpertiseName)
between "experts" and "quantum computing", then forms the graph representation
"expert -hasExpertise→ skill -hasExpertiseName→ quantum computing". When multiple
relations are possible, RMD also disambiguates them using a language
model. E.g. in the above query, RMD detects both "expert -livesIn→ Redmond"
and "expert -worksIn→ Redmond", but ultimately chooses the former.</p>
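      <p>The relation search described above can be sketched as a bounded path search over the ontology. Everything in the block is an assumption made for illustration: the toy ontology, the type names Skill and Location, and the function names. In particular, the text says disambiguation uses a language model; the keyword-overlap scorer here is a deliberately simple stand-in for it, and the dependency parse is not modelled at all.</p>
      <preformat><![CDATA[
```python
from collections import deque
from typing import Dict, List, Tuple

# Toy ontology (assumed): (source type, relation, target type).
ONTOLOGY: List[Tuple[str, str, str]] = [
    ("Employee", "hasExpertise", "Skill"),
    ("Skill", "hasExpertiseName", "Expertise"),
    ("Employee", "livesIn", "Location"),
    ("Employee", "worksIn", "Location"),
]

# Cue words per relation (assumed); a stand-in for a language model score.
CUE_WORDS = {
    "livesIn": {"live", "lives", "living"},
    "worksIn": {"work", "works", "working"},
}


def relation_paths(source: str, target: str, max_hops: int = 2) -> List[List[str]]:
    """Return every relation path from source to target with at most max_hops hops."""
    outgoing: Dict[str, List[Tuple[str, str]]] = {}
    for src, rel, dst in ONTOLOGY:
        outgoing.setdefault(src, []).append((rel, dst))

    paths: List[List[str]] = []
    queue = deque([(source, [])])
    while queue:
        node, path = queue.popleft()
        if node == target and path:
            paths.append(path)
            continue
        if len(path) == max_hops:
            continue
        for rel, dst in outgoing.get(node, []):
            queue.append((dst, path + [rel]))
    return paths


def disambiguate(relations: List[str], query: str) -> str:
    """Pick the relation whose cue words overlap the query tokens the most."""
    tokens = set(query.lower().replace(".", " ").split())
    return max(relations, key=lambda rel: len(CUE_WORDS.get(rel, set()) & tokens))


query = "Find me quantum computing experts who live in Redmond."
print(relation_paths("Employee", "Expertise"))
candidates = [path[0] for path in relation_paths("Employee", "Location")]
print(disambiguate(candidates, query))
```
]]></preformat>
      <p>The first call recovers the two-hop path between the "Employee" and "Expertise" types; the second finds two competing one-hop relations to a location and keeps the one best supported by the wording of the query.</p>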
      <p>Structured query generation (SQG) then takes the outputs of the above
components, generates structured queries, runs them against the knowledge bases, and
retrieves the results. Yugen's architecture embraces diverse types of knowledge
databases by design, so it can generate, on the fly and in parallel, various
structured queries applicable to knowledge graphs that are indexed differently
(e.g. graph index vs. inverted index). In the current implementation, SQG supports
both SPARQL and multi-field search queries.</p>
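      <p>To make the two output formats concrete, the sketch below renders one graph representation of the example query both as a SPARQL query and as a flat field-to-value map of the kind an inverted index might accept. The namespace, the treatment of "Redmond" and "quantum computing" as plain string literals, the shape of the field query, and all function names are assumptions for illustration, not Yugen's actual output.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Graph for the example query. Terms starting with "?" are variables;
# every other term is treated as a string literal (an assumption).
TRIPLES: List[Triple] = [
    ("?expert", "hasExpertise", "?skill"),
    ("?skill", "hasExpertiseName", "quantum computing"),
    ("?expert", "livesIn", "Redmond"),
]

PREFIX = "http://example.org/ontology#"  # placeholder namespace


def term(value: str) -> str:
    """Render a variable as-is and anything else as a quoted SPARQL string literal."""
    if value.startswith("?"):
        return value
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_sparql(triples: List[Triple], select: str = "?expert") -> str:
    """Build a SELECT query with one triple pattern per graph edge."""
    lines = [f"PREFIX ex: <{PREFIX}>", f"SELECT DISTINCT {select} WHERE {{"]
    for s, p, o in triples:
        lines.append(f"  {term(s)} ex:{p} {term(o)} .")
    lines.append("}")
    return "\n".join(lines)


def to_field_query(triples: List[Triple]) -> Dict[str, str]:
    """Flatten literal-valued edges into a field -> value map for an inverted index."""
    return {p: o for _, p, o in triples if not o.startswith("?")}


print(to_sparql(TRIPLES))
print(to_field_query(TRIPLES))
```
]]></preformat>
      <p>Because both renderers consume the same list of triples, they can run independently of each other, which is one simple way to realize generating several query types in parallel.</p>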
      <p>The natural language restatement component finally wraps up all of
Yugen's natural language understanding and the query results, and returns a
human-readable answer to the user. This restatement is not just a simple echo of the
user's query but an explanation detailing how Yugen understood the query,
so that users can easily evaluate the correctness of the answer and of each
step in the pipeline.
</p>
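      <p>A restatement of this kind could be produced, in its simplest form, from a template that first spells out the interpreted query and then summarizes the results. The sketch below assumes such a template; the function name, the wording, and the placeholder result names are all hypothetical, and the text does not say how Yugen generates its restatements.</p>
      <preformat><![CDATA[
```python
from typing import List, Tuple


def restate(entity_type: str, constraints: List[Tuple[str, str]], results: List[str]) -> str:
    """Explain how the query was understood, then summarize the results."""
    parts = [f'{phrase} "{value}"' for phrase, value in constraints]
    understood = f"I searched for entities of type {entity_type}"
    if parts:
        understood += " " + " and ".join(parts)
    if not results:
        return understood + ". No matches were found."
    return f"{understood}. Found {len(results)}: {', '.join(results)}."


message = restate(
    "Employee",
    [("with expertise", "quantum computing"), ("living in", "Redmond")],
    ["Person A", "Person B"],  # placeholder results, not real data
)
print(message)
```
]]></preformat>
      <p>Stating the interpreted type and each constraint explicitly is what lets a user spot a misread intent or entity before trusting the result list.</p>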
      <p>Business Value. Yugen provides enterprise users with an end-to-end speech-based
NLU &amp; QA engine for enterprise knowledge graphs. It is customizable, fast, accurate
and robust. It benefits enterprise users by lowering the cost of training employees
to learn specified query languages, understanding enterprise-domain queries, providing
query answers, and explaining the results in natural language, ultimately
adding value to the business. Yugen achieves an intent detection F1 score of
98.42% and an EMD F1 score of 96.96%. Our customer, Publicis Groupe, is now
leveraging Yugen as its main AI-enabled interface to its knowledge graph.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>