<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Human-in-the-loop approach to digitisation of engineering drawings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrew M. Fagan</string-name>
          <email>andrew.fagan@strath.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Graeme M. West</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stephen D. J. McArthur</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Strathclyde</institution>
          ,
          <addr-line>Glasgow</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the nuclear power industry, the high-performance black-box systems which are prevalent in modern AI research are difficult to match to applications which take advantage of their strengths. These systems generally require volumes of labelled or well-formatted data, and will provide a high level of performance which cannot be easily understood, explained or audited. Indeed, most AI systems deployed in this industry operate under constant oversight by a skilled human operator, negating many of their advantages in cost, speed and reliability. In many cases, the time taken to format data for these systems is prohibitive even when automation might be desirable. This paper presents a framework for deploying a variety of AI techniques in industries where human oversight is required. Instead of treating the user as an external element while automating the task, this framework incorporates them as an active participant in the process, augmenting their performance while leveraging their strengths to make the AI systems more reliable and user-friendly. The framework is concerned primarily with the problem of digitising Elementary Wiring Diagrams, an important class of engineering drawing. These are in regular use even as low-quality scans of paper documents, and while digitising them is desirable and has many potential benefits, the level of time investment required by skilled engineers is prohibitive. Mistakes in digitisation are also potentially costly, meaning that the skilled engineer must remain involved in the process.</p>
      </abstract>
      <kwd-group>
        <kwd>Human-in-the-loop</kwd>
        <kwd>Digitisation</kwd>
        <kwd>Engineering Drawings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Engineering drawings serve several key purposes in design engineering,
maintenance and asset management. The ability to quickly determine which
components are connected to and affected by others can assist in tasks such as fault
diagnosis and upgrading equipment, especially when designs are digitised with
intelligent metadata and cross referencing between multiple drawings.</p>
      <p>In the nuclear power industry, and in others such as the oil and gas industry,
many assets have been in use for a significant period. With their associated
drawings having been originally drafted on paper, they are now digitised in
low-quality scanned formats. As it would take considerable effort by skilled engineers
to manually redraft these as modern CAD drawings or to review them to find
useful metadata, a system for intelligently parsing these drawings would be of
significant value.</p>
      <p>Unfortunately, while this problem has been approached for many classes of
drawings, such as piping and instrumentation diagrams (P&amp;ID), success has
resulted from the existence of at least moderately sized labelled datasets, which
are used as a starting point for symbol classification. In attempting to utilise
this library of research on a class of drawing where no such dataset exists, such
as elementary wiring diagrams (EWDs), this is the first and most manually
intensive problem to overcome.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <sec id="sec-2-1">
        <title>Digitisation of Engineering Drawings</title>
        <p>
          Digitisation of engineering drawings has been of interest to many since as early
as the 1980s[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], and continues to be worked on. The most active research on
the subject is being done on P&amp;ID diagrams. These drawings are comparable in
many ways to EWDs, in that they are composed of symbols, text and topological
connections. They differ primarily in the family of symbols which they utilise,
with P&amp;ID diagrams featuring a more complicated variety of symbols, with
embedded text and subtle lines which can change the meaning of a symbol.
        </p>
        <p>
          In the domain of P&amp;ID digitisation, Moreno-Garcia and Elyan identified three key
challenges: quality, skewing and topology [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Briefly, the quality problem refers to
the variance caused by hand drawn symbols and distortion caused by scanning
paper documents. The skewing problem refers to class imbalance caused by some
symbols being more common than others, while the topology problem is both
the in-drawing topology of recognising lines and connecting symbols, as well as
the meta-topology of drawings which connect to others. Again it can be seen
intuitively that all of these problems transfer to the EWD domain, and so must
be considered.
        </p>
        <p>
          A wide variety of techniques have been applied to address these common
problems, for example the utilisation of Generative Adversarial Networks [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] or
class decomposition [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] to address the class imbalance problem. However, this
research relies on a large dataset of labelled P&amp;ID symbols [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], so applying this
rich library of techniques to EWDs requires either a manually intensive amount of
splitting and labelling drawings, or some interim techniques to allow the dataset
to be expanded.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Human-in-the-loop systems</title>
        <p>In nuclear power applications, there is a consistent problem with utilising data
intensive AI solutions. In many nuclear applications, including the problem of
digitising engineering drawings, there is a lack of quality data. Labelled data
is usually minimal, hampering supervised techniques, and the data which is
available is often low quality and poorly formatted, which creates difficulties
applying unsupervised or semi-supervised methods. In addition to this, most
modern AI solutions are "black boxes", which may provide a high degree of
performance but are not easily understood. This is also a poor fit for the nuclear
domain, as most decisions must be justified and audited.</p>
        <p>This leads to the main difficulty with nuclear applications of AI: a skilled
human engineer must remain involved in any uncertain process, usually either
checking the AI decision at the end or making their own decision based on AI
suggestions. Therefore an AI must provide enough of a benefit to justify the
time of a skilled engineer to format or label data, as well as the development
and verification time required to design and implement such a system, and must
perform better with supervision than an engineer working alone.</p>
        <p>However, these restrictions also present an opportunity to explore ways in
which the human expertise can be leveraged to improve performance, by taking
advantage of the human in the loop rather than working around them.</p>
        <p>
          Human-in-the-loop (HITL) systems refer generally to systems where AI and
humans interact, though the term is more commonly used for systems in
which the human is not simply a supervisor of the AI, but in which the human and
AI cooperate to accomplish a task [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. This human-AI cooperation is sometimes
referred to as shared autonomy [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>
          HITL has been applied to the process of designing AI systems in HELIX [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ],
a human-centric tool for rapid design of and iteration on data science workflows.
The design of HELIX involves providing the user with a high-fidelity interface
which suits their specialisation (in this case a programming language) and high
quality visualisation tools to allow them to do the same work more effectively.
The idea of the human user being a skilled expert in the target domain and
the utilisation of their knowledge to improve the system transfers well to the
nuclear domain, where the users will be experienced engineers, though the critical
difference is that they will have engineering and design experience rather than
necessarily programming experience.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Framework for human-in-the-loop digitisation</title>
      <p>In order to address the problem of digitising EWDs using a human in the loop,
there are several key considerations to address. To be of use to the end user, the
system must be more efficient than an engineer simply digitising the drawing
manually, even when the AI modules are at their weakest or have not yet been
implemented. If the user were required to spend a significant amount of time
labelling data, they would prefer to redraw instead, in line with their expertise.</p>
      <p>This also means that even once AI modules are implemented and
display a reasonable level of performance, the user should not have to spend time
tweaking the system until the AI digitises the drawing by itself. If the user makes
corrections, those should be logged for future use, but the completely labelled and
connected drawing they have produced should not be discarded. It is the desired
output of the system, and spending more time on a drawing once it has been
digitised is wasteful.</p>
      <p>The output of the system is the original drawing marked up with a label
and bounding box on each symbol, blocks of text transcribed and topological
connections represented by connecting lines. Additional information such as that
originally contained in embedded tables would instead be attached to the relevant
component in a text field. This intermediate state could in principle be used
to automatically generate a new digital drawing, but could also serve
other purposes, such as a subsequent database application spanning
components across an entire plant, allowing for many potential workflow
improvements.</p>
      <p>The proposed framework is shown in figure 1. It consists firstly of an attempt
by any number of AI modules to parse the drawing. These modules could be as
simple as a classifier which identifies individual symbols or a rule-based system
governing how components can connect together, but could equally be complex
holistic solutions of the type which exist for P&amp;ID diagrams, as discussed in
section 2.1. They return their findings using a common interface, represented on
the drawing in the same format available to the user.</p>
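      <p>The common interface is not specified further in this paper; as a minimal illustrative sketch (all class and field names here are assumptions, not part of the published framework), each module might report its findings as follows:</p>
      <preformat><![CDATA[
```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Finding:
    """One annotation proposed by an AI module (fields are illustrative)."""
    drawing: str       # drawing identifier, e.g. "EWD-001"
    bbox: tuple        # bounding box as (x, y, width, height) in pixels
    label: str         # symbol class, e.g. "relay"
    confidence: float  # module's confidence, 0.0 to 1.0

class DrawingModule(ABC):
    """Common interface: every module reports findings in the same format."""
    @abstractmethod
    def parse(self, image):
        """Return a list of Finding objects for the given drawing image."""

class FixedMatcher(DrawingModule):
    """Toy module reporting one hard-coded finding, purely for illustration."""
    def parse(self, image):
        return [Finding("EWD-001", (10, 20, 64, 64), "relay", 0.9)]
```
]]></preformat>
      <p>Because every module, however simple or complex, reports through the same interface, modules can be swapped in or out without changing the rest of the system.</p>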
      <p>The modularity provided by a swappable bank of AIs moves the focus away
from individual techniques and instead to the performance of the system as
a whole and the interplay between human and AI. It also allows the addition
and removal of modules as appropriate. Early on, for example, a Convolutional
Neural Network is likely to be of extremely limited value due to its reliance on
a large volume of available data, but when a large dataset is available its performance
is likely to be very high in comparison with other methods. We therefore might
remove a CNN based method in the early stages, and only reintroduce it when
it attains a reasonable level of accuracy.</p>
      <p>
        In contrast, initial work utilising a Quality Assured Template Matching [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
based approach successfully identified 20% of symbols with no false positives,
and required only one example per symbol class, but additional data and
experimentation provided greatly diminished returns, suggesting it might be a useful
module early on, but might fall out of favour as more data becomes available,
or else might be relegated to a reliable check on the output of other modules. In
principle, this selection and comparison of modules could be done automatically
based on the accuracy metrics of modules, but in the simplest implementation
of this framework it would instead be a choice on the part of the designer which
they might revisit regularly.
      </p>
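      <p>QATM itself uses a learned, quality-aware matching score; as a much simpler illustrative stand-in (the 0-255 grey scale, the similarity measure and the threshold are all assumptions), exhaustive template matching over a drawing can be sketched as:</p>
      <preformat><![CDATA[
```python
def match_template(image, template, threshold=0.9):
    """Return (x, y) positions where the template matches the image closely.

    image, template: 2-D lists of grey values in 0..255.
    Similarity is 1 minus the normalised sum of absolute differences.
    """
    th, tw = len(template), len(template[0])
    max_diff = 255.0 * th * tw
    hits = []
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            diff = sum(abs(image[y + j][x + i] - template[j][i])
                       for j in range(th) for i in range(tw))
            similarity = 1.0 - diff / max_diff
            if similarity > threshold:
                hits.append((x, y))
    return hits
```
]]></preformat>
      <p>One example image per symbol class is enough to run such a matcher, which mirrors the low data requirement that makes template matching attractive in the early stages.</p>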
      <p>The annotated drawing is then opened up to the human expert, who has
the opportunity to modify the AI attempt, either by adding additional labels
not flagged by an AI, or by modifying or deleting AI outputs they disagree
with. The interface at this stage should be extremely simple and intuitive. The
user's actions at this stage, and therefore areas where AI modules were correct or
incorrect, will be logged into the learning data repository, as well as areas where
modules might have disagreed with one another.</p>
      <p>The learning data will not be a single exhaustive dataset of the type that
might be used to train a machine learning system, such as labelled pictures. It
will instead be a log of user actions. If the user highlights a region and labels
it as a relay, the data logged would be the name of the drawing in question,
the coordinates highlighted, and the label "relay". Likewise, if the user deletes
or modifies an area the AI flagged as a resistor, the coordinates would be logged
either as a negative example or with the label flagged by the user. By saving
this log instead of a single formatted dataset, we allow creativity when deciding
what data would be valuable to a new AI module, through the use of the presenter
modules.</p>
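      <p>As an illustrative sketch, assuming the log is kept as one JSON object per line (the storage format and field names are assumptions, not prescribed by the framework), an action could be recorded as:</p>
      <preformat><![CDATA[
```python
import json

def log_action(log_path, drawing, action, coords, label=None):
    """Append one user action to the learning-data log (format illustrative)."""
    entry = {"drawing": drawing,  # name of the drawing, e.g. "EWD-001"
             "action": action,    # "add", "modify" or "delete"
             "coords": coords,    # highlighted region (x, y, width, height)
             "label": label}      # symbol label, e.g. "relay"; None for deletions
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```
]]></preformat>
      <p>Deletions are simply entries with no label, leaving each presenter free to decide how to interpret them.</p>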
      <p>The data presenters are another modular element to the system which allows
the learning data to be parsed in many ways. A presenter in this case is a simple
module which transforms a subset of the learning data into something usable
by another part of the system. A simple example of this would be a presenter
which collates all of the learning data relating to symbols, in the form of drawing
name, coordinates and label, and extracts the pixels from the drawing to create
a dataset for training a classifier. The presenters might be simple programs, or
might themselves be more complex systems. An example of a different type of
presenter is provided by the implemented subset of the system which, given a
single coordinate point clicked by the user, performs simple image manipulation
to find a potential area of interest which it flags on the drawing for the user's
further input. Another more complex presenter could take that same single
coordinate and run a classifier on a sliding window around it to identify the area
of interest and its type more accurately.</p>
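      <p>The first presenter described above, which collates symbol labels into a training dataset, could be sketched as follows (the entry fields and the load_drawing callable are illustrative assumptions):</p>
      <preformat><![CDATA[
```python
def present_symbol_dataset(entries, load_drawing):
    """Collate symbol-labelling log entries into (pixels, label) training pairs.

    entries: iterable of dicts with "drawing", "coords" and "label" keys
    load_drawing: callable returning a 2-D pixel array for a drawing name
    """
    dataset = []
    for e in entries:
        if e["label"] is None:
            continue  # skip deletions; this presenter wants positive examples
        x, y, w, h = e["coords"]
        image = load_drawing(e["drawing"])
        patch = [row[x:x + w] for row in image[y:y + h]]  # crop the bounding box
        dataset.append((patch, e["label"]))
    return dataset
```
]]></preformat>
      <p>A new AI module with different data needs would simply be paired with a new presenter over the same raw log, rather than requiring the log itself to be reformatted.</p>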
      <p>The current implementation consists of the subset of the framework marked
\Labelling System". The lack of implemented AI modules in this version allows
verification of the speed at which the operator will initially be able to digitise
a drawing. By utilising the aforementioned presenter, the user can click a single
point on a symbol, resulting in a best guess bounding box drawn around it based
on the centre of mass within that region. The user can then pick from a list what
the symbol is as well as adjust the bounding box, and their decisions are logged.
It is then easy to add text to the symbols and draw links between them. This is
a smoother process than manually segmenting each image, and results in more
consistently sized images being identi ed. Compared to manually redesigning a
drawing in a CAD format, which industrial partners suggest takes several hours,
this process takes only minutes.</p>
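      <p>The best-guess bounding box step can be sketched as follows, assuming a greyscale image; the window size, ink threshold and box size here are illustrative, and the real implementation may differ:</p>
      <preformat><![CDATA[
```python
def guess_bbox(image, click, window=40, box=64):
    """Best-guess bounding box around a clicked point: centre a fixed-size box
    on the centre of mass of the dark ("ink") pixels near the click."""
    cx, cy = click
    xs, ys = [], []
    for y in range(max(0, cy - window), min(len(image), cy + window)):
        for x in range(max(0, cx - window), min(len(image[0]), cx + window)):
            if image[y][x] < 128:  # treat dark pixels as ink
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing dark near the click
    mx = sum(xs) // len(xs)  # centre of mass of the ink
    my = sum(ys) // len(ys)
    return (mx - box // 2, my - box // 2, box, box)
```
]]></preformat>
      <p>The user then only needs to adjust this guess rather than draw a box from scratch, which is what makes the clicked-point workflow faster than manual segmentation.</p>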
    </sec>
    <sec id="sec-4">
      <title>Conclusions and future work</title>
      <p>The framework is still at an early stage of implementation, and requires further
testing to verify the extent to which it improves the digitisation workflow. The
framework does take steps towards all three of the challenges to the digitisation
of engineering drawings. The quality problem is addressed by minimising the
impact of quality-based failures: the user can intercept mistakes
made by the AI before they are output, which allows the AI more leniency before
enough data is available to potentially overcome the poor quality.</p>
      <p>The skewing problem is also addressed by this, but is further supported
by the modularity of the framework. The class imbalance caused by the relative
rarity of some symbols over others can eventually be addressed by the generation
of additional artificial data, once enough labelled data exists, but can also be
addressed in the shorter term by building smaller and more specific modules
which identify only one symbol at a time.</p>
      <p>The topology problem is in some ways the hardest problem to overcome, and
features the least available research. It requires taking a more holistic view of
the document rather than dividing it into small units like the identification of
symbols or text, while also requiring a high level of fidelity in other systems before
beginning. The framework addresses this problem both by allowing the user to
make connections in the early stages before a module is written to overcome this
problem, and by the modularity of the system, allowing identification to
be easily abstracted away into another module.</p>
      <p>In the future, the framework will be ported to other domains, initially to
other similar drawings such as P&amp;ID diagrams, but eventually to entirely
different classes of problems. The domain of AI planning [10] in particular features
scheduling problems which a skilled expert must undertake. Many techniques
which produce partial solutions are available, but the human-centred framework
might provide an excellent way to turn these techniques into a more complete
solution.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bunke</surname>
          </string-name>
          , \
          <article-title>Automatic interpretation of lines and text in circuit diagrams.," Pattern recognition theory and applications</article-title>
          .
          <source>Proc. NATO ASI</source>
          , Oxford,
          <year>1981</year>
          , pp.
          <volume>297</volume>
          {
          <issue>310</issue>
          ,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C. F.</given-names>
            <surname>Moreno-Garcia</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Elyan</surname>
          </string-name>
          , \
          <article-title>Digitisation of Assets from the Oil &amp; Gas Industry: Challenges and Opportunities," in 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), Sydney: Institute of Electrical and Electronics Engineers</article-title>
          (IEEE),
          <year>Nov</year>
          .
          <year>2019</year>
          , pp.
          <volume>2</volume>
          {
          <fpage>5</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Elyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jamieson</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Ali-Gombe</surname>
          </string-name>
          , \
          <article-title>Deep learning for symbols detection and classi cation in engineering drawings,"</article-title>
          <source>Neural Networks</source>
          ,
          <year>2020</year>
          , issn:
          <fpage>18792782</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Elyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Garcia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Jayne</surname>
          </string-name>
          , \
          <article-title>Symbols Classi cation in Engineering Drawings,"</article-title>
          <source>in Proceedings of the International Joint Conference on Neural Networks</source>
          , vol. 2018
          <article-title>-July, Institute of Electrical and Electronics Engineers Inc</article-title>
          ., Oct.
          <year>2018</year>
          , isbn:
          <fpage>9781509060146</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Elyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. F.</given-names>
            <surname>Moreno-Garc</surname>
          </string-name>
          <string-name>
            <surname>a</surname>
          </string-name>
          , and P. Johnston, \
          <article-title>Symbols in Engineering Drawings (SiED): An Imbalanced Dataset Benchmarked by Convolutional Neural Networks,"</article-title>
          <source>in Proceedings of the 21st EANN (Engineering Applications of Neural Networks) 2020 Conference</source>
          , Springer, Cham, Jun.
          <year>2020</year>
          , pp.
          <volume>215</volume>
          {
          <fpage>224</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Sousa Nunes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. Sa</given-names>
            <surname>Silva</surname>
          </string-name>
          , \
          <article-title>A Survey on human-in-Theloop applications towards an internet of all,"</article-title>
          <source>IEEE Communications Surveys and Tutorials</source>
          , vol.
          <volume>17</volume>
          , no.
          <issue>2</issue>
          , pp.
          <volume>944</volume>
          {
          <issue>965</issue>
          ,
          <string-name>
            <surname>Apr</surname>
          </string-name>
          .
          <year>2015</year>
          , issn: 1553877X.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Javdani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Admoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pellegrinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Srinivasa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Bagnell</surname>
          </string-name>
          , \
          <article-title>Shared autonomy via hindsight optimization for teleoperation and teaming,"</article-title>
          <source>International Journal of Robotics Research</source>
          , vol.
          <volume>37</volume>
          , no.
          <issue>7</issue>
          , pp.
          <volume>717</volume>
          {
          <issue>742</issue>
          ,
          <string-name>
            <surname>Jun</surname>
          </string-name>
          .
          <year>2018</year>
          , issn:
          <fpage>17413176</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Xin</surname>
          </string-name>
          , L. Ma, J. Liu,
          <string-name>
            <given-names>S.</given-names>
            <surname>Macke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Parameswaran</surname>
          </string-name>
          , \
          <article-title>Accelerating human-in-the-loop machine learning: Challenges and opportunities,"</article-title>
          <source>in DEEM'18: Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning</source>
          , New York, NY, USA: Association for Computing Machinery, Apr.
          <year>2018</year>
          , pp.
          <volume>1</volume>
          {
          <fpage>4</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheng</surname>
          </string-name>
          , Y. Wu,
          <string-name>
            <given-names>W.</given-names>
            <surname>Abdalmageed</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Natarajan</surname>
          </string-name>
          , \QATM:
          <article-title>Qualityaware template matching for deep learning,"</article-title>
          <source>in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition</source>
          , vol. 2019-June,
          <year>2019</year>
          , pp.
          <fpage>11</fpage>
          <lpage>545</lpage>
          {
          <issue>11</issue>
          554, isbn: 9781728132938.
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sreedharan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Kambhampati</surname>
          </string-name>
          , \
          <article-title>The emerging landscape of explainable automated planning &amp; decision making,"</article-title>
          <source>in IJCAI International Joint Conference on Arti cial Intelligence</source>
          , vol. 2021-Janua,
          <source>International Joint Conferences on Arti cial Intelligence</source>
          ,
          <source>Jul</source>
          .
          <year>2020</year>
          , pp.
          <volume>4803</volume>
          {
          <issue>4811</issue>
          , isbn:
          <fpage>9780999241165</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>