<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Risk Analysis and Prevention in Procedures: extraction and preliminary results</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Patrick Saint-Dizier</string-name>
          <aff>IRIT-CNRS, route de Narbonne, Toulouse cedex, France</aff>
          <email>stdizier@irit.fr</email>
        </contrib>
      </contrib-group>
      <abstract>
        <title>1. PROBLEMATICS</title>
        <p>Maintenance operations as well as production launches are essentially based on procedures which describe how to install and use a product and how to maintain it. Due to the complexity of today's equipment, and to the complexity of its interactions, it is difficult to keep documentation up to date. These procedural documents become more and more complex, even if simplified language constraints and revision scenarios are imposed. According to several analyses, out of 377 technicians working in different domains, 45% indicate that they have identified major errors in maintenance documents. About 75% indicate that there are major gaps (missing instructions) or obscure or incomplete instructions, and 78% admit that they often need help because they feel they are not operating the right way. We are all confronted with situations where we wish to follow instructions (DIY, software installation, etc.) with pictures, diagrams, etc., and find that these are not understandable, have obvious gaps or do not correspond to the situation at stake. In some industrial areas (aeronautics, nuclear energy, health, etc.), such difficulties are common and lead to accidents. Risk analysis and prevention are therefore a major concern.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>2. A DOMAIN-INDEPENDENT ANALYSIS OF RISKY SITUATIONS FROM TEXT ANALYSIS</title>
      <p>
        Procedural texts consist of a sequence of instructions, designed with some accuracy in order to reach a goal (e.g. assemble a computer) [
        <xref ref-type="bibr" rid="ref2 ref3 ref7 ref8">2, 3, 7, 8</xref>
        ]. Procedural texts are complex structures: they often exhibit a quite complex rational structure (the goal-instructions) and an ‘irrational’ structure which is mainly composed of advice, conditions, preferences, evaluations, user stimulations, etc. They form what we call the explanation structure [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which motivates and justifies [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] the goal-instructions structure, viewed as the backbone of procedural texts. A number of these elements are forms of argumentation [
        <xref ref-type="bibr" rid="ref1">1, 9</xref>
        ]; they appear to be very useful, some [...]
      </p>
      <p>
        The most frequently encountered parameters are, informally:
- presence of ‘complex’ manners (e.g. very slowly); by
complex we mean either a manner which is inherently difficult
to realize or a manner reinforced by an adverb of intensity,
- technical complexity of the verb or the verb compound
used: if most instructions include a verb which is quite
simple, some exhibit quite technical verbs, metaphorical uses,
or verbs applied to unexpected situations, for which an
elaboration is needed,
- duration of execution as specified in the instruction (the
longer the more difficult),
- synchronization between actions, in particular in
instructional compounds,
- uncommon tools, or uncommon uses of basic tools (open
the box with a sharp knife); however, this is quite difficult to
characterize, besides statistical analysis (e.g. via
bootstrapping on the net),
- presence of evaluation statements or resulting states, for
example to indicate the termination of the action (as soon
as the sauce turns brown add flour).
      </p>
      <p>For some of these criteria, some application-dependent
knowledge and linguistic resources are needed: some lexical data, basic
ontological data, and a few business rules. These
observations allow us to introduce a very preliminary measure of
complexity. To obtain an indicative evaluation,
each of the points above counts for 1, independently of its
importance or strength in the text. Complexity c therefore
ranges from 0 to 6. The complexity rate d<sub>i</sub> of instruction i
is c/6, which keeps it in [0,1].</p>
      <p>
[procedure [purpose Writing a paper: [elaboration Read light sources, then thorough ]]
[assumption/circumstance Assuming you’ve been given a topic,]
[circumstance When you conduct research], move from light to thorough resources [purpose to make sure you’re moving in the
right direction].
Begin by doing searches on the Internet about your topic [purpose to familiarize yourself with the basic issues;]
[temporal-sequence then ] move to more thorough research on the Academic Databases;
[temporal-sequence finally ], probe the depths of the issue by burying yourself in the library.
[warning Make sure that despite beginning on the Internet, you don’t simply end there.
[elaboration A research paper using only Internet sources is a weak paper, [consequence which puts you at a disadvantage... ]]]
While the Internet should never be your only source of information, [contrast it would be ridiculous not to utilize its vast sources
of information. [advice You should use the Internet to acquaint yourself with the topic more before you dig into more academic
texts. ]]]
      </p>
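      <p>As a rough illustration of the complexity rate d<sub>i</sub> = c/6 introduced in this section, the following Python sketch counts the criteria present in one instruction. It assumes that each of the six criteria has already been detected and is available as a boolean (duration, for instance, is reduced to a yes/no flag); the class and field names are illustrative and do not come from an existing implementation.</p>
      <preformat>
```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ComplexityMarks:
    """One boolean per complexity criterion (hypothetical field names)."""
    complex_manner: bool = False        # e.g. "very slowly"
    technical_verb: bool = False        # technical or metaphorical verb use
    long_duration: bool = False         # assumption: duration reduced to a flag
    synchronization: bool = False       # actions to be synchronized
    uncommon_tool: bool = False         # e.g. opening a box with a sharp knife
    evaluation_statement: bool = False  # e.g. "as soon as the sauce turns brown"


def complexity_rate(marks: ComplexityMarks) -> float:
    """Return d_i = c / 6, where c is the number of criteria present."""
    values = [getattr(marks, f.name) for f in fields(marks)]
    c = sum(values)  # each criterion counts for 1, whatever its strength
    return c / len(values)


# "Stir very slowly; as soon as the sauce turns brown add flour."
example = ComplexityMarks(complex_manner=True, evaluation_statement=True)
print(complexity_rate(example))  # two criteria out of six
```
      </preformat>
      <p>Keeping one field per criterion makes it easy to replace the uniform weight of 1 with criterion-specific weights later on.</p>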
    </sec>
    <sec id="sec-2">
      <title>2.2 Measuring the explicitness rate t of an instruction</title>
      <p>Explicitness characterizes the degree of accuracy of an
instruction. Several marks, independently of the domain,
contribute to making an instruction more explicit:
- existence of means or instruments, when appropriate,
- pronominal references as minimal as possible, and
predicate-argument constructions as comprehensive as possible,
- explicit length of the action, when appropriate (stir for 10
minutes),
- list of items to consider as explicit and low-level as possible
(mix the flour with the sugar, eggs and oil),
- presence of an argument, advice or warning,
- presence of some help elements such as images, diagrams, etc.,
- presence of elaborations, illustrations or goal specification,
- presence of a frame or a condition to limit the scope of the
action.</p>
      <p>
        These criteria may depend on the domain: for
example, the length of an action is very relevant in cooking, somewhat
in do-it-yourself, and much less in the society domain.
As for d, each item counts for 1 at the moment;
explicitness e therefore ranges from 0 to 8. The explicitness
rate is t<sub>i</sub> = e/8, which keeps it in [0,1]. Note also that the higher
t<sub>i</sub> is, the better the chances that the instruction succeeds, since
it is very explicit and has a lot of details.
      </p>
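      <p>The explicitness rate can be sketched in the same spirit. The Python fragment below is only an illustration: it assumes that the marks found in an instruction are given as a set of labels, and the eight label names are hypothetical shorthands for the eight criteria listed above.</p>
      <preformat>
```python
# Hypothetical labels, one per explicitness criterion of Section 2.2.
EXPLICITNESS_MARKS = frozenset({
    "means_or_instrument",
    "minimal_pronouns_full_arguments",
    "explicit_duration",
    "explicit_item_list",
    "argument_advice_or_warning",
    "help_elements",
    "elaboration_or_goal",
    "frame_or_condition",
})


def explicitness_rate(observed) -> float:
    """Return t_i = e / 8, where e is the number of distinct marks observed."""
    observed = set(observed)
    unknown = observed - EXPLICITNESS_MARKS
    if unknown:
        raise ValueError(f"unknown explicitness marks: {sorted(unknown)}")
    e = len(observed)  # each mark counts for 1
    return e / len(EXPLICITNESS_MARKS)


# "Stir for 10 minutes": only the length of the action is made explicit.
print(explicitness_rate({"explicit_duration"}))  # 0.125
```
      </preformat>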
      <p>Now, if we consider the product d<sub>i</sub> × (1 − t<sub>i</sub>), the more
it tends towards 1, the higher the risk is for the action to
fail. Therefore, when d<sub>i</sub> is high, t<sub>i</sub> must also
be high to compensate for the difficulty. Given that d<sub>i</sub>
remains unchanged (if the instruction cannot be simplified),
the strategy is then to increase t<sub>i</sub> as much as possible.</p>
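      <p>A small worked example may help to see how the product behaves. The Python sketch below, in which the function name and the sample values are our own illustrative choices, evaluates the product for one instruction of fixed complexity (3 criteria out of 6) at two levels of explicitness (2 and then 6 marks out of 8).</p>
      <preformat>
```python
def failure_risk(d: float, t: float) -> float:
    """Return d * (1 - t); both rates are expected to lie in [0, 1]."""
    for name, value in (("d", d), ("t", t)):
        if not (value >= 0.0 and 1.0 >= value):
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return d * (1.0 - t)


d = 3 / 6                          # complexity cannot be reduced
sparse = failure_risk(d, 2 / 8)    # 0.5 * 0.75
detailed = failure_risk(d, 6 / 8)  # 0.5 * 0.25
print(sparse, detailed)            # 0.375 0.125
```
      </preformat>
      <p>With the complexity held constant, adding four explicitness marks divides the indicative risk by three in this example.</p>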
    </sec>
    <sec id="sec-3">
      <title>3. A DOMAIN-DEPENDENT ANALYSIS OF RISKS</title>
      <p>A number of risk factors are clearly domain-dependent.
The difficulty is to identify and evaluate risks
without access to a deep semantic analysis of the
different actions of the domain at stake, since such an analysis is seldom
available.</p>
      <p>In a first, exploratory stage, our strategy is to extract,
from a large corpus of documents of the domain, the set of
warnings associated with each action. An action is
characterized by a verb and its object argument(s), whatever
their position in the instruction. Following argumentation
theory, instructions with warnings have the following form:
instruction because warning, as in
"Carefully plug-in the mother card vertically, otherwise you
will damage the connectors", where the otherwise section is
the support: it indicates the risks of not doing the action
correctly. In this work, if the action is "plug-in the mother
card", the risks are the list of the warnings associated with
it over the whole corpus.</p>
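      <p>The collection step can be sketched as follows in Python. This is a simplified illustration, not the extraction system itself: it assumes that each instruction has already been paired with its action (a verb and its object), that warnings are introduced by the single marker "otherwise", and the second corpus entry is an invented example.</p>
      <preformat>
```python
import re
from collections import defaultdict

# Assumption: only "otherwise" is treated as a warning marker here.
WARNING_MARKERS = ("otherwise",)
_PATTERN = re.compile(
    r"\b(?:%s)\b\s*(.+)$" % "|".join(map(re.escape, WARNING_MARKERS)),
    re.IGNORECASE,
)


def extract_warning(instruction: str):
    """Return the text following the warning marker, or None if absent."""
    match = _PATTERN.search(instruction)
    if match is None:
        return None
    return match.group(1).strip().rstrip(".")


def collect_risks(corpus):
    """Map each action (verb, object) to its warnings over the whole corpus."""
    risks = defaultdict(list)
    for action, instruction in corpus:
        warning = extract_warning(instruction)
        if warning and warning not in risks[action]:
            risks[action].append(warning)
    return dict(risks)


corpus = [
    (("plug in", "mother card"),
     "Carefully plug-in the mother card vertically, "
     "otherwise you will damage the connectors."),
    (("plug in", "mother card"),  # invented entry, for illustration only
     "Plug in the mother card gently, otherwise the pins may bend."),
    (("open", "box"), "Open the box."),
]
print(collect_risks(corpus))
```
      </preformat>
      <p>Actions for which no warning is found, such as the last entry, simply do not appear in the result.</p>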
    </sec>
    <sec id="sec-5">
      <title>4. PERSPECTIVES</title>
      <p>In this short paper, we presented the main lines of a
preliminary approach to risk identification in procedures. Preventing
accidents (human and ecological) is a huge problem in
industry. We proposed a simple solution to
capture domain-dependent knowledge acquired from
procedure warnings. Obviously, this is just one useful facet of
the problem, since a lot of knowledge is implicit and almost
never expressed. Our users estimate that we cover about
40% of the risks using this approach.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Amgoud</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsons</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maudet</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <article-title>Arguments, Dialogue, and Negotiation</article-title>
          ,
          <source>in: 14th European Conference on Artificial Intelligence</source>
          , Berlin,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Di Eugenio</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Webber</surname>
            ,
            <given-names>B.L.</given-names>
          </string-name>
          ,
          <article-title>Pragmatic Overloading in Natural Language Instructions</article-title>
          ,
          <source>International Journal of Expert Systems</source>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Fontan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saint-Dizier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <article-title>Analyzing the explanation structure of procedural texts: dealing with Advices and Warnings</article-title>
          , STEP conference, Venice,
          <year>August 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Moens</surname>
            ,
            <given-names>M-F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boiy</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mochales Palau</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reed</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <article-title>Automatic Detection of Arguments in Legal Texts</article-title>
          ,
          <source>in Proceedings of the Eleventh International Conference on Artificial Intelligence and Law</source>
          , ACM Press, NY,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Pollock</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <source>Knowledge and Justification</source>
          , Princeton University Press,
          <year>1974</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Reed</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Generating Arguments in Natural Language,
          <source>PhD dissertation</source>
          , University College, London,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Takechi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tokunaga</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsumoto</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tanaka</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <article-title>Feature Selection in Categorizing Procedural Expressions</article-title>
          ,
          <source>IRAL2003</source>
          , pp.
          <fpage>49</fpage>
          -
          <lpage>56</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Walton</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reed</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macagno</surname>
            ,
            <given-names>F</given-names>
          </string-name>
          . (eds),
          <source>Argumentation Schemes</source>
          , Cambridge University Press,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>