=Paper= {{Paper |id=Vol-2065/paper10 |storemode=property |title=Machine Reading as Model Construction |pdfUrl=https://ceur-ws.org/Vol-2065/paper10.pdf |volume=Vol-2065 |authors=Peter Clark |dblpUrl=https://dblp.org/rec/conf/kcap/Clark17 }} ==Machine Reading as Model Construction== https://ceur-ws.org/Vol-2065/paper10.pdf
Machine Reading as Model Construction
Peter Clark
Allen Institute for AI (AI2)
Seattle, WA
peterc@allenai.org

K-CAP2017 Workshops and Tutorials Proceedings, 2017
©Copyright held by the owner/author(s).

1    WHAT IS MACHINE READING?
With the advent of large datasets of paragraphs + questions, e.g., SQuAD [4], TriviaQA [3], there has been renewed interest in general-purpose “reading comprehension” (RC) systems, capable of answering questions against those paragraphs, e.g., [5, 6]. These systems have become remarkably effective at factoid QA. However, they require extensive training data, and can still struggle with queries requiring complex inference [1]. The extent to which these systems have truly read and understood the paragraph remains unclear [2].
    At the other end of the spectrum, AI has also developed sophisticated formalisms for modeling the world, e.g., situation calculus, event calculus, qualitative modeling. These frameworks allow systems to represent facts which are known, and infer facts which are unknown. Models built with these frameworks constitute an understanding of the world, in the sense that they are predictive: If the model’s computational clockwork moves in a way similar to the world, then the model can predict how the world will behave, constituting a degree of understanding of the world. In this context, machine reading can be viewed as the task of constructing such models from text, given a particular modeling framework in which to express those models.
    While it is possible that a neural system might eventually be able to infer a predictive, neural model of the world solely from large numbers of examples, we do not believe this is likely in the near future. Rather, we see the way forward as combining the pattern-learning techniques of neural systems with the modeling capabilities of structured representations. AI modeling frameworks provide a set of primitives for constructing predictive models, and neural systems can help construct models within those frameworks that best fit data. The grand challenge for machine reading, going forward, is combining these two technologies to do this.

2    MACHINE READING ABOUT PROCESSES
At AI2 we have been pursuing a specific genre of machine reading along these lines, namely reading paragraphs describing processes (e.g., photosynthesis). Our goal is not simply to answer lookup questions, but also to answer questions that go beyond the text, in particular about the states that exist during a process. Such questions are challenging because those world states are often implicit, making questions hard to answer from surface cues alone.
    For example, consider the following paragraph about photosynthesis:
        Chloroplasts in the leaf of the plant trap light from the
        sun. The roots absorb water and minerals from the soil.
        This combination of water and minerals flows from the
        stem into the leaf. Carbon dioxide enters the leaf. Light,
        water and minerals, and the carbon dioxide all combine
        into a mixture. This mixture forms sugar (glucose)
        which is what the plant eats.
While reading comprehension (RC) systems can reliably answer lookup questions such as:
        (1) What do the roots absorb? (A: water, minerals)
they struggle when answers are not explicit, e.g.,
        (2) Where is sugar produced? (A: in the leaf)
For example, the RC system BiDAF [5] answers “glucose” to this second question. This question requires knowledge and inference: If carbon dioxide enters the leaf (stated), then it will be at the leaf (unstated), and as it is then used to produce sugar, the sugar production will be at the leaf too. This is the kind of inference that our system, ProComp (“process comprehension”), is able to model, using a structured representation of events and states.
    Our approach is illustrated in Figure 1, and we briefly summarize it here. First, ProComp extracts a Process Graph from the paragraph, representing the event sequence in the process. It then performs a STRIPS-like simulation of the process, using a set of precondition/effect rules about events, mined from VerbNet. Finally, a small set of answer procedures operates over that simulation, allowing several classes of questions about change to be answered (e.g., “Where is X at step Y?”, “What entities change size during the process?”). Although our initial work has used largely traditional techniques, it is still able to outperform RC systems on questions about change, and thus illustrates the importance of modeling in machine reading.

3    INTEGRATING NEURAL METHODS
Our initial system uses three basic operations:
     • (Event extraction) Given a sentence describing an event,
       identify the event and the participants within it.
     • (State prediction) Given a sentence describing an event, and
       an entity mentioned in the sentence, predict the state of the
       entity before/after the event (where the state of the entity is a
       set of properties associated with it, selected from a predefined
       set).
     • (State inference) Given a partial description of the entities
       and their states during the process (i.e., a partially filled
       Participant Grid), fill in the remaining states.
To date, we have collected a large number of hand-annotated examples of these predictions to evaluate our system ProComp. However, clearly this data can also be used for learning, to train a system to make these inferences. Note that this does not obviate the need for ontology design - the appropriate dimensions of modeling still need to be selected. However, it does offer an example-based means
Figure 1: An illustration of machine reading as model construction and inference: Here the constructed model is a process
graph (sequence of events), and inference is state-space simulation, presented graphically as a “Participant Grid”. Each row
in the Grid is a state (time vertically downwards), each column is a process participant, and each cell shows facts true of a
participant in a state. For brevity, @ denotes is-at(), yellow lines denote exists(), red denotes a direct consequence of an event,
green an inferred consequence. For example, at line 8 in the Grid (labelled “Assertion”), the “CO2 enters leaf” step asserts that
CO2 is therefore @leaf after the event. By inference, the sugar must therefore be produced at the leaf too (green box), a fact
not explicitly stated in the text.
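
The kind of state-space simulation depicted in Figure 1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not ProComp's implementation: the `Event` record, the two verb classes, and the effect rules are hand-written here for the photosynthesis paragraph, whereas ProComp mines its precondition/effect rules from VerbNet. Each state maps an entity to its location, an absent entity does not yet exist, and the list of states plays the role of the Participant Grid.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

UNKNOWN = "?"  # placeholder for an entity whose location the text never gives

@dataclass
class Event:
    """One step of a (hypothetical) process graph."""
    verb: str
    participants: List[str]
    destination: Optional[str] = None  # set for movement events
    product: Optional[str] = None      # set for creation events

# Hand-written verb classes (an assumption; ProComp derives rules from VerbNet).
MOVE_VERBS = {"trap", "absorb", "flow", "enter"}
CREATE_VERBS = {"combine", "form"}

State = Dict[str, str]  # entity -> location; missing key = entity does not exist

def apply_event(state: State, ev: Event) -> State:
    """Return the state after ev, leaving the input state untouched."""
    new = dict(state)
    if ev.verb in MOVE_VERBS:
        # Direct effect: every participant ends up at the destination.
        for entity in ev.participants:
            new[entity] = ev.destination
    elif ev.verb in CREATE_VERBS:
        # Inferred effect: the product appears where its inputs were,
        # provided they all share one location; the inputs are consumed.
        locations = {state.get(entity, UNKNOWN) for entity in ev.participants}
        for entity in ev.participants:
            new.pop(entity, None)
        new[ev.product] = locations.pop() if len(locations) == 1 else UNKNOWN
    return new

def simulate(initial: State, process: List[Event]) -> List[State]:
    """Build the grid: grid[0] is the initial state, grid[i] follows event i."""
    grid = [dict(initial)]
    for ev in process:
        grid.append(apply_event(grid[-1], ev))
    return grid

def where_is(grid: List[State], entity: str, step: int) -> Optional[str]:
    """Answer 'Where is X at step Y?' (None if X does not exist then)."""
    return grid[step].get(entity)

def where_produced(grid: List[State], entity: str) -> Optional[str]:
    """Answer 'Where is X produced?' from the first state in which X exists."""
    for before, after in zip(grid, grid[1:]):
        if entity not in before and entity in after:
            return after[entity]
    return None

# Hand-encoded events for the photosynthesis paragraph (illustrative only).
INITIAL_STATE = {"light": "sun", "water": "soil", "minerals": "soil", "CO2": UNKNOWN}
PROCESS = [
    Event("trap", ["light"], destination="leaf"),
    Event("absorb", ["water", "minerals"], destination="roots"),
    Event("flow", ["water", "minerals"], destination="leaf"),
    Event("enter", ["CO2"], destination="leaf"),
    Event("combine", ["light", "water", "minerals", "CO2"], product="mixture"),
    Event("form", ["mixture"], product="sugar"),
]

grid = simulate(INITIAL_STATE, PROCESS)
print(where_is(grid, "CO2", 4))       # leaf (asserted by "CO2 enters leaf")
print(where_produced(grid, "sugar"))  # leaf (inferred, never stated in the text)
```

In this sketch the answer to question (2) comes from the creation rule rather than from any span of the paragraph, which is the distinction between asserted and inferred cells that the Grid draws.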
for connecting that ontology and the reasoning to data. This is an exciting direction we are pursuing.

4    SUMMARY
Unlike much recent work, we view machine reading as the task of constructing a model from text using a particular modeling framework. The framework provides the building blocks for modeling a certain class of phenomena, and the task of reading is to construct a model within that framework. We have illustrated this for reading text about processes, using a state-based modeling framework.
   There is a symbiotic relationship between text and modeling frameworks:
     • Text suggests which modeling framework is appropriate
       (e.g., the text appears to be describing a process, so use a
       framework suitable for processes)
     • The modeling framework provides expectations about what
       to look for in the text (e.g., given it’s a process, expect to see
       events and their participants)
This approach does not remove the need for learning; rather, it provides a scaffolding within which learning can take place, and a mechanism for then supporting inference and prediction - activities that truly demonstrate that the machine has understood what it has read.

REFERENCES
[1] Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. 2015. Teaching Machines to Read and Comprehend. In Advances in Neural Information Processing Systems. 1693–1701.
[2] Robin Jia and Percy Liang. 2017. Adversarial Examples for Evaluating Reading Comprehension Systems. In Proc. EMNLP’17.
[3] Mandar Joshi, Eunsol Choi, Daniel S. Weld, and Luke Zettlemoyer. 2017. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. In Proc. ACL’17.
[4] Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In Proc. EMNLP’16.
[5] Min Joon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. 2017. Bidirectional Attention Flow for Machine Comprehension. In Proc. ICLR’17.
[6] Junbei Zhang, Xiaodan Zhu, Qian Chen, Lirong Dai, and Hui Jiang. 2017. Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering. arXiv preprint arXiv:1703.04617 (2017).