Common-Knowledge Concept Recognition for SEVA

Jitin Krishnan1, Patrick Coronado2, Hemant Purohit3, Huzefa Rangwala4
1,4 Department of Computer Science, George Mason University
2 Instrument Development Center, NASA Goddard Space Flight Center
3 Information Sciences & Technology Department, George Mason University
jkrishn2@gmu.edu, patrick.l.coronado@nasa.gov, hpurohit@gmu.edu, rangwala@gmu.edu

Abstract

We build a common-knowledge concept recognition system for a Systems Engineer's Virtual Assistant (SEVA) which can be used for downstream tasks such as relation extraction, knowledge graph construction, and question-answering. The problem is formulated as a token classification task similar to named entity extraction. With the help of a domain expert and text processing methods, we construct a dataset annotated at the word level by carefully defining a labelling scheme to train a sequence model to recognize systems engineering concepts. We use a pre-trained language model and fine-tune it with the labeled dataset of concepts. In addition, we also create some essential datasets for information such as abbreviations and definitions from the systems engineering domain. Finally, we construct a simple knowledge graph using these extracted concepts along with some hyponym relations.

Keywords: Natural Language Processing, Named Entity Recognition, Concept Recognition, Relation Extraction, Systems Engineering.

Figure 1: Common-knowledge concept recognition and simple relation extraction

INTRODUCTION

The Systems Engineer's Virtual Assistant (SEVA) (Krishnan, Coronado, and Reed 2019) was introduced with the goal of assisting systems engineers (SEs) in their problem-solving abilities by keeping track of large amounts of information about a NASA-specific project and using that information to answer queries from the user. In this work, we address one system element by constructing a common-knowledge concept recognition system to improve the performance of SEVA, using the static knowledge collected from the Systems Engineering Handbook (NASA 2017), which is widely used in projects across the organization, as domain-specific commonsense knowledge. At NASA, although there exist knowledge engines and ontologies for the SE domain such as MBSE (Hart 2015), IMCE (JPL 2016), and OpenCaesar (Elaasar 2019), generic commonsense acquisition is rarely discussed; we aim to address this challenge.

SE commonsense comes from years of experience and learning, which involves background knowledge that goes beyond any handbook. Although constructing an assistant like the SEVA system is the overarching objective, a key problem to address first is extracting elementary common-knowledge concepts using the SE handbook and domain experts. We use the term 'common-knowledge' to mean the 'commonsense' knowledge of a specific domain. This knowledge can be seen as a pivot that can later be used to collect 'commonsense' knowledge for the SE domain. We propose a preliminary research study that can pave a path towards comprehensive commonsense knowledge acquisition for an effective Artificial Intelligence (AI) application in the SE domain. The overall structure of this work is summarized in Figure 1. The implementation, with a demo and the datasets, is available at: https://github.com/jitinkrishnan/NASA-SE

Copyright © 2020 held by the author(s). In A. Martin, K. Hinkelmann, H.-G. Fill, A. Gerber, D. Lenat, R. Stolle, F. van Harmelen (Eds.), Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020). Stanford University, Palo Alto, California, USA, March 23-25, 2020. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

BACKGROUND AND MOTIVATION

Creating commonsense AI remains an important and challenging task in AI research today. Some of the inspiring works are the CYC project (Panton et al. 2006), which tries to serve as foundational knowledge for all systems with millions of everyday-life commonsense assertions; Mosaic Commonsense Knowledge Graphs and Reasoning (Zellers et al. 2018), which addresses aspects like social situations, mental states, and causal relationships; and the Aristo system (AI2 Allen Institute for AI), which focuses on basic science knowledge. In NASA's context, systems engineering combines several engineering disciplines, requires extreme coordination, and is prone to human error. This, in combination with the lack of efficient knowledge transfer of generic lessons-learned, makes most technology-based missions risk-averse. Thus, a comprehensive commonsense engine can significantly enhance the productivity of any mission by letting the experts focus on what they do best.

Concept Recognition (CR) is a task identical to the traditional Named Entity Recognition (NER) problem. A typical NER task seeks to identify entities such as the name of a person ('Shakespeare'), a geographical location ('London'), or the name of an organisation ('NASA') in unstructured text. A supervised NER dataset consists of such entities annotated at the word-token level using labelling schemes such as BIO, which provides beginning (B), continuation or inside (I), and outside (O) representations for each word of an entity. (Baevski et al. 2019) is the current top-performing NER model for the CoNLL-2003 shared task (Sang and De Meulder 2003). Off-the-shelf named entity extractors do not suffice in the SE common-knowledge scenario because the entities we want to extract are domain-specific concepts such as 'system architecture' or 'functional requirements' rather than physical entities such as 'Shakespeare' or 'London'. This requires defining new labels and fine-tuning.

Relation extraction tasks extract semantic relationships from text. These extractors aim to connect named entities such as 'Shakespeare' and 'England' using relations such as 'born-in'. Relations can be as simple as hand-built patterns or as challenging as unsupervised methods like Open IE (Etzioni et al. 2011), with bootstrapping, supervised, and semi-supervised methods in between. (Xu and Barbosa 2019) and (Soares et al. 2019) are some of the high-performing models that extract relations from the New York Times Corpus (Riedel, Yao, and McCallum 2010) and the TACRED challenge (Zhang et al. 2017), respectively. Hyponyms represent hierarchical connections between entities of a domain and are important relationships. For instance, the well-known work by (Hearst 1992) uses syntactic patterns such as [Y such as A, B, C], [Y including X], or [Y, including X] to extract hyponyms. Our goal is to extract preliminary hyponym relations from the concepts extracted by the CR and to connect the entities through verb phrases.

CONCEPT RECOGNITION

SE concepts are less ambiguous compared to generic natural language text. A word usually denotes one concept. For example, the word 'system' usually means the same thing when referring to a 'complex system', 'system structure', or 'management system' in the SE domain. In generic text, the meaning of terms like 'evaluation', 'requirement', or 'analysis' may differ with context. We would like domain-specific phrases such as 'system evaluation', 'performance requirement', or 'system analysis' to be single entities. Based on the operational and system concepts described in (Krishnan, Coronado, and Reed 2019), we carefully construct a set of concept labels for the SE handbook, which is shown in the next section.

BIO Labelling Scheme

1. abb: represents abbreviations, such as TRL for Technology Readiness Level.
2. grp: represents a group of people or an individual, such as Electrical Engineers, Systems Engineers, or a Project Manager.
3. syscon: represents system concepts such as engineering unit, product, hardware, software, etc. These mostly represent physical concepts.
4. opcon: represents operational concepts such as decision analysis process, technology maturity assessment, system requirements review, etc.
5. seterm: represents generic terms that are frequently used in SE text and that do not fall under syscon or opcon, such as project, mission, key performance parameter, audit, etc.
6. event: represents event-like information in SE text, such as Pre-Phase A, Phase A, Phase B, etc.
7. org: represents an organization, such as 'NASA', 'aerospace industry', etc.
8. art: represents names of artifacts or instruments, such as 'AS1300'.
9. cardinal: represents numerical values such as '1', '100', 'one', etc.
10. loc: represents location-like entities such as component facilities or centralized facility.
11. mea: represents measures, features, or behaviors such as cost, risk, or feasibility.

Abbreviations

Abbreviations are used frequently in SE text. We automatically extract abbreviations using simple pattern matching around parentheses. Given below is a sample regex that matches most abbreviations in the SE handbook:

    r"\([ ]*[A-Z][A-Za-z]*[ ]*\)"

An iterative regex-matching procedure using this pattern over the preceding words produces the full phrase of the abbreviation. 'A process to determine a system's technological maturity based on Technology Readiness Levels (TRLs)' produces the abbreviation TRL, which stands for Technology Readiness Levels. 'Define one or more initial Concept of Operations (ConOps) scenarios' produces the abbreviation ConOps, which stands for Concept of Operations. We pre-label these abbreviations as concept entities. Many of these abbreviations are also provided in the Appendix section of the handbook, which is likewise extracted and used as concepts.

Common-Knowledge Definitions

Various locations of the handbook and the glossary provide definitions of several SE concepts. We collect these and compile a comprehensive definitions document, which is also used for the concept recognition task. An example definition and its description is shown below:

Definition: Acceptable Risk
Description: The risk that is understood and agreed to by the program/project, governing authority, mission directorate, and other customer(s) such that no further specific mitigating action is required.

CR Dataset Construction and Pre-processing

Using Python tools such as PyPDF2, NLTK, and RegEx, we build a pipeline to convert PDF to raw text along with extensive pre-processing, which includes joining sentences that are split, removing URLs, shortening duplicate non-alpha characters, and replacing full forms of abbreviations with their shortened forms. We assume that the SE text is free of spelling errors. For the CR dataset, we select coherent paragraphs and full sentences, avoiding headers and short blurbs. Using domain keywords and a domain expert, we annotate roughly 3700 sentences at the word-token level. An example is shown in Figure 2 and the unique tag count is shown in Table 1.

Figure 2: A Snippet of the concept-labelled dataset

    O          73944    B-cardinal   414    I-grp      132
    B-opcon     5530    B-abb        354    B-org       87
    B-syscon    1640    B-event      350    I-seterm    26
    B-seterm    1431    I-event      218    B-art       17
    I-opcon     1334    I-syscon     201    I-org       12
    B-mea       1117    I-abb        156    I-loc        3
    B-grp        499    I-mea        145    B-loc        2

Table 1: Unique Tag Count from the CR dataset

Fine tuning with BERT

Any language model can be used for the purpose of customizing an NER problem to CR. We choose BERT (Devlin et al. 2018) because of its general-purpose nature and its use of contextualized word embeddings. In the hand-labelled dataset, each word gets a label. The idea is to perform multi-class classification using BERT's pre-trained cased language model. We use PyTorch transformers and Hugging Face as per the tutorial by (Huang 2019), which uses BertForTokenClassification. The text is embedded as tokens and masks with a maximum token length. These embedded tokens are provided as input to the pre-trained BERT model for full fine-tuning. The model gives an F1-score of 0.89 for the concept recognition task. An 80-20 data split is used for training and evaluation. Detailed performance of the CR is shown in Tables 2 and 3. Additionally, we also implemented CR using spaCy (Honnibal and Johnson 2015), which produced similar results.

                     precision   recall   f1-score   support
    syscon             0.94       0.89      0.91       320
    opcon              0.87       0.91      0.89      1154
    seterm             0.98       0.94      0.96       287
    mea                0.91       0.90      0.90       248
    grp                0.94       0.93      0.94        89
    org                1.00       0.11      0.21        26
    cardinal           0.90       0.92      0.91        71
    event              0.71       0.78      0.76        77
    abb                0.82       0.58      0.68        79
    art                0.00       0.00      0.00         4
    loc                0.00       0.00      0.00         1
    micro/macro-avg    0.90       0.88      0.88      2356

Table 2: Performance of different labels

    F1-Score   Accuracy   Accuracy without 'O'-tag
    0.89       0.97       0.86

Table 3: Overall Performance of CR; for fairness, we also provide the accuracy when the most common 'O'-tag is excluded from the analysis.

RELATION EXTRACTION

In this work, for relation extraction, we focus on hyponyms and verb phrase chunking. Hyponyms are more specific concepts, such as earth to planet or rose to flower. Verb phrase chunking connects the named entities recognized by the CR model through verbs.

Hyponyms from Definitions

The definitions document consists of 241 SE definitions and their descriptions. We iteratively construct entities in increasing order of the number of words in the definitions, with the help of their part-of-speech tags. This helps in creating a subset-of relation between a lower-word entity and a higher-word entity. Each root entity is lemmatized so that entities like 'processes' and 'process' appear only once.

Hyponyms from POS tags

Using the words (especially nouns) that surround an already identified named entity, more specific entities can be identified. This is performed on a few selected entity tags such as opcon and syscon. For example, consider the sentence 'SE functions should be performed'. 'SE' has tag NNP and 'functions' has tag NNS. We create a relation called subset-of between 'SE functions' and 'SE'.

Relations from Abbreviations

Relations from abbreviations are simple direct connections between the abbreviation and its full form described in the abbreviations dataset. Figure 3 shows a snippet of the knowledge graph constructed using stands-for and subset-of relationships. Larger graphs are shown in the demo.

Figure 3: A snippet of the knowledge graph generated

Relation Extraction using Verb Phrase Chunking

Finally, we explore creating contextual triples from sentences using all the entities extracted by the CR model and the entities from definitions. Only those phrases that connect two entities are selected for verb phrase extraction. Using NLTK's regex parser and chunker, a grammar over part-of-speech tags with at least one verb, such as

    VP: {<MD|TO|RB.*>*<VB.*>+<RB.*|IN|TO|JJ.*|DT>*}

can extract relation-like phrases from the phrase that links two concepts (the exact grammar is available in the implementation). An example is shown in Figure 4. Further investigation of relation extraction from the SE handbook is left as future work.

Figure 4: Relation Extraction using Verb Phrase

CONCLUSION AND FUTURE WORK

We presented a common-knowledge concept extractor for the Systems Engineer's Virtual Assistant (SEVA) system and showed how it can benefit downstream tasks such as relation extraction and knowledge graph construction. We constructed a word-level annotated dataset with the help of a domain expert by carefully defining a labelling scheme to train a sequence labelling model to recognize SE concepts. Further, we also constructed some essential datasets from the SE domain which can be used for future research. Future directions include constructing a comprehensive common-knowledge relation extractor from the SE handbook and incorporating such human knowledge into a more comprehensive machine-processable commonsense knowledge base for the SE domain.

References

AI2 Allen Institute for AI. Aristo: An intelligent system that reads, learns, and reasons about science. https://allenai.org/aristo/. Accessed: 2019-08-12.

Baevski, A.; Edunov, S.; Liu, Y.; Zettlemoyer, L.; and Auli, M. 2019. Cloze-driven pretraining of self-attention networks. arXiv preprint arXiv:1903.07785.

Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Elaasar, M. 2019. OpenCaesar: The case for integrated model centric engineering.

Etzioni, O.; Fader, A.; Christensen, J.; Soderland, S.; and Mausam. 2011. Open information extraction: The second generation. IJCAI, 3-10.

Hart, L. E. 2015. Introduction to model-based system engineering (MBSE) and SysML. http://www.incose.org/docs/default-source/delaware-valley/mbse-overview-incose-30-july-2015.pdf. Accessed: 11-09-2017.

Hearst, M. A. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th Conference on Computational Linguistics, Vol. 2, 539-545. ACL.

Honnibal, M., and Johnson, M. 2015. An improved non-monotonic transition system for dependency parsing. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1373-1378. ACL.

Huang, B. 2019. NER with BERT in action.

JPL. 2016. IMCE ontological modeling framework.

Krishnan, J.; Coronado, P.; and Reed, T. 2019. SEVA: A systems engineer's virtual assistant. In AAAI-MAKE Spring Symposium.

NASA. 2017. NASA systems engineering handbook.

Panton, K.; Matuszek, C.; Lenat, D.; Schneider, D.; Witbrock, M.; Siegel, N.; and Shepard, B. 2006. Common Sense Reasoning - From Cyc to Intelligent Assistant. Berlin, Heidelberg: Springer Berlin Heidelberg. 1-31.

Riedel, S.; Yao, L.; and McCallum, A. 2010. Modeling relations and their mentions without labeled text. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 148-163. Springer.

Sang, E. F., and De Meulder, F. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050.

Soares, L. B.; FitzGerald, N.; Ling, J.; and Kwiatkowski, T. 2019. Matching the blanks: Distributional similarity for relation learning. arXiv preprint arXiv:1906.03158.

Xu, P., and Barbosa, D. 2019. Connecting language and knowledge with heterogeneous representations for neural relation extraction. arXiv preprint arXiv:1903.10126.

Zellers, R.; Bisk, Y.; Schwartz, R.; and Choi, Y. 2018. SWAG: A large-scale adversarial dataset for grounded commonsense inference. CoRR abs/1808.05326.

Zhang, Y.; Zhong, V.; Chen, D.; Angeli, G.; and Manning, C. D. 2017. Position-aware attention and supervised data improve slot filling. In EMNLP, 35-45.
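To make the abbreviation-extraction step concrete, the following sketch applies the paper's parenthesis regex and then walks backwards over the preceding words, one word per capital letter. The function name and the backward-matching heuristic are illustrative assumptions, not the paper's actual implementation (which handles harder cases such as mixed-case abbreviations like ConOps):

```python
import re

# Regex from the paper: a parenthesized token starting with a capital letter.
ABBR_PATTERN = re.compile(r"\(\s*([A-Z][A-Za-z]*)\s*\)")

def extract_abbreviations(sentence):
    """Return {abbreviation: full phrase} pairs found in one sentence.

    For each parenthesized candidate, take one preceding word per capital
    letter of the abbreviation (ignoring a trailing plural 's', as in
    'TRLs'), and keep the match only if the word initials spell it out.
    """
    results = {}
    for match in ABBR_PATTERN.finditer(sentence):
        abbr = match.group(1)
        core = abbr[:-1] if abbr.endswith("s") and abbr[:-1].isupper() else abbr
        preceding = sentence[:match.start()].split()
        n = len(core)  # e.g. TRL -> the 3 preceding words
        if len(preceding) >= n:
            phrase_words = preceding[-n:]
            if [w[0].upper() for w in phrase_words] == list(core):
                results[abbr] = " ".join(phrase_words)
    return results

print(extract_abbreviations(
    "A process to determine a system's technological maturity "
    "based on Technology Readiness Levels (TRLs)"))
```

A mixed-case abbreviation such as ConOps would need a looser backward scan (the paper's iterative matching procedure), which this simplified initial-letter check does not cover.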
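The word-level BIO annotation used for the CR dataset can be illustrated with a small encoder that turns concept-span annotations into per-token B-/I-/O labels. This is a generic sketch of the labelling scheme, with hypothetical span inputs, not the paper's annotation tooling:

```python
def bio_encode(tokens, spans):
    """Convert concept annotations into BIO labels.

    tokens: list of words in the sentence.
    spans:  list of (start, end, label) with end exclusive, e.g.
            (0, 3, 'opcon') marks tokens[0:3] as one operational concept.
    """
    labels = ["O"] * len(tokens)
    for start, end, label in spans:
        labels[start] = "B-" + label           # beginning of the concept
        for i in range(start + 1, end):
            labels[i] = "I-" + label           # inside the concept
    return labels

tokens = ["decision", "analysis", "process", "reduces", "risk"]
spans = [(0, 3, "opcon"), (4, 5, "mea")]
print(list(zip(tokens, bio_encode(tokens, spans))))
```

Multi-word phrases such as 'decision analysis process' thus become a single opcon entity, which is exactly why the token classifier can keep domain phrases intact.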
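The iterative construction of subset-of relations from definitions (shorter entities first, longer entities linked to a matching suffix, with lemmatized roots) can be sketched as below. The suffix-matching rule and the crude plural-stripping lemmatizer are simplifying assumptions; the paper uses part-of-speech tags and proper lemmatization:

```python
def lemma(word):
    """Very crude lemmatizer for illustration (the paper uses NLTK)."""
    if word.endswith("sses"):                       # processes -> process
        return word[:-2]
    if word.endswith("s") and not word.endswith("ss"):  # risks -> risk
        return word[:-1]
    return word

def subset_of_relations(entities):
    """Link each multi-word entity to the shorter entity it extends.

    Entities are processed in increasing order of word count; an entity
    whose longest trailing-word suffix matches an already-seen shorter
    entity gets a subset-of edge to it.
    """
    seen = set()
    relations = []
    for ent in sorted(entities, key=lambda e: len(e.split())):
        words = ent.lower().split()
        words[-1] = lemma(words[-1])                # lemmatize the root word
        for k in range(len(words) - 1, 0, -1):      # longest suffix first
            suffix = " ".join(words[-k:])
            if suffix in seen:
                relations.append((" ".join(words), "subset-of", suffix))
                break
        seen.add(" ".join(words))
    return relations

print(subset_of_relations(
    ["process", "assessment processes", "technical assessment process"]))
```

Lemmatizing the root ensures 'assessment processes' and 'assessment process' collapse to one entity before linking.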
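The hyponyms-from-POS-tags step ('SE functions' subset-of 'SE') can be sketched over pre-tagged tokens. The function name and the adjacent-noun rule are illustrative; the input tuples stand in for the output of a POS tagger such as NLTK's:

```python
def hyponyms_from_pos(tagged, entities):
    """Extend a recognized entity with an adjacent noun to form a more
    specific entity, linked back to the original by subset-of.

    tagged:   list of (word, POS) pairs, e.g. from a POS tagger.
    entities: surface forms already recognized by the CR model.
    """
    relations = []
    for i, (word, tag) in enumerate(tagged[:-1]):
        nxt_word, nxt_tag = tagged[i + 1]
        # A noun-tagged entity followed by another noun forms a hyponym.
        if word in entities and tag.startswith("NN") and nxt_tag.startswith("NN"):
            relations.append((f"{word} {nxt_word}", "subset-of", word))
    return relations

# The paper's example: 'SE functions should be performed',
# where 'SE' is tagged NNP and 'functions' NNS.
tagged = [("SE", "NNP"), ("functions", "NNS"),
          ("should", "MD"), ("be", "VB"), ("performed", "VBN")]
print(hyponyms_from_pos(tagged, {"SE"}))
```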
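The verb-phrase-chunking idea (keep the phrase between two recognized concepts only when it contains at least one verb) can be sketched without NLTK by matching verb tags in the span between the two concepts. This pure-Python stand-in for the RegexpParser grammar is an assumption for illustration, not the paper's chunker:

```python
import re

def verb_phrase_between(tagged, left_end, right_start):
    """Extract a verb-centered connector phrase between two concepts.

    tagged:      (word, POS) pairs for the sentence.
    left_end:    index just past the last token of the first concept.
    right_start: index of the first token of the second concept.
    Returns the connecting phrase if it contains at least one verb
    (VB, VBD, VBG, VBN, VBP, VBZ), else None.
    """
    between = tagged[left_end:right_start]
    tag_string = " ".join(tag for _, tag in between)
    if re.search(r"\bVB[DGNPZ]?\b", tag_string):
        return " ".join(word for word, _ in between)
    return None

# 'The project manager performs a technical assessment':
# concept 1 = tokens 1-2, concept 2 = tokens 5-6.
tagged = [("The", "DT"), ("project", "NN"), ("manager", "NN"),
          ("performs", "VBZ"), ("a", "DT"),
          ("technical", "JJ"), ("assessment", "NN")]
print(verb_phrase_between(tagged, 3, 5))
```

The resulting triple here would be ('project manager', 'performs a', 'technical assessment'), mirroring the contextual triples described above.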
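Finally, the knowledge-graph assembly from the two relation sources (stands-for edges from the abbreviations dataset, subset-of edges from the hyponym extraction) reduces to collecting triples; the sketch below uses a plain triple list rather than whatever graph representation the demo uses:

```python
def build_graph(abbreviations, subset_relations):
    """Assemble knowledge-graph triples from both relation sources.

    abbreviations:    {abbreviation: full form} from the abbreviations dataset.
    subset_relations: (child, 'subset-of', parent) triples from hyponyms.
    """
    triples = [(abbr, "stands-for", full) for abbr, full in abbreviations.items()]
    triples.extend(subset_relations)
    return triples

graph = build_graph(
    {"TRL": "Technology Readiness Level"},
    [("SE functions", "subset-of", "SE")])
print(graph)
```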