<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Cross-Media Knowledge Acquisition: A Case Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marina Giordanino</string-name>
          <email>marina.giordanino@crf.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Giuliano</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Damjan Kužnar</string-name>
          <email>damjan.kuznar@fri.uni-lj.si</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Lavelli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Možina</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorenza Romano</string-name>
          <email>romano@fbk.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CRF</institution>
          ,
          <addr-line>Torino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>FBK-irst</institution>
          ,
          <addr-line>Povo TN</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Ljubljana</institution>
          ,
          <country country="SI">Slovenia</country>
        </aff>
      </contrib-group>
      <fpage>59</fpage>
      <lpage>66</lpage>
      <abstract>
        <p>The paper describes an approach to cross-media knowledge acquisition which combines text and raw data. The approach has been applied in a real-world use case concerning wind tunnel reports within the EU-funded project X-Media. The goal is to identify the source of wind noise in a vehicle and find the most suitable solution to reduce it. Information is extracted from the textual parts of the reports and provided to the raw data tool to improve its performance. The results of the initial experiments are encouraging.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Human perception relies on visual, raw-data, and textual stimuli, but also on
the exploitation of domain-specific background knowledge. The same holds in
industrial domains, where engineers record experimental results in technical
reports that exploit different media in an integrated way. Because technical
documents of this kind are inherently specific and concise, experts make full
use of multimedia content. The content they create requires the synergy of
different human perceptive functions to be fully understood, and therefore
combines strictly interlinked communication sources (i.e., images, texts, raw data).
As a consequence, the intended meaning can be effectively
transferred only by considering these sources in a combined (fused) manner. One
example is the identification of the source of wind noise in a vehicle
and the proposal of the most suitable technical solution to reduce that noise and
improve aerodynamic noise comfort. In this context, the analysis of past test results
may allow experts to design suitable solutions for new products
without new long and costly experiments. In particular, these results tell wind
tunnel experts which vehicle components are crucial for reducing noise
(e.g., mirrors, windshield).</p>
      <p>The data and expert knowledge were provided by the Fiat Aerodynamic
department in the context of the EU X-Media project. Competitors’ vehicles undergo
several tests to measure their performance, while Fiat vehicles are tested during
their development to verify whether they achieve the required target for each
performance measure. The process is long and costly, as each possible
component must be tested to determine its influence on the interior noise. In this paper, we
describe a method that estimates the noise produced by a component prior to testing.
One of the main challenges is the high complexity of the task, which is due to
a number of factors. First, dozens of competitors’ and Fiat vehicles are tested
every year, and wind tunnel experts produce a technical report for each. Technical reports
are semi-structured Word documents containing graphs and tables with expert
comments. Second, different kinds of aerodynamic measurements are taken: for each
vehicle, for each of eight microphones placed inside the vehicle, for each
component contributing to noise, and at different stages of the product development
process. Third, vehicle-related information must be extracted, analysed and classified:
vehicle features and configuration, test results from noise spectra and
tables and their parameters, the components involved, and the quality of their contribution
to noise. Finally, the components’ influence on noise must be estimated for new vehicles.
The last element is particularly challenging. It requires the prediction of a full
noise spectrum, which consists of 110 continuous values. Moreover, as already
mentioned, experts comment on the results of previous tests, stating whether a
component’s influence on noise is high or negligible. We investigate whether,
and to what extent, such information is useful for predicting noise
curves.</p>
      <p>An overall schema of the modules involved is shown in Figure 1. From the
report images (2D graphs), the noise curves and captions are extracted
(image conversion). These are then processed by the image caption analysis, which
analyses the caption of each image to identify the type of experiment reported in
the curve. In the last step of textual analysis, the tool finds sentences
containing experts’ comments about vehicle components and classifies them as critical
(the component increases noise) or non-critical (the component reduces noise). Finally,
the data extracted from the document and saved in the knowledge base are
used to build a model that can predict the influence of a component in a new (not
yet tested) vehicle. The entire process is ontology-based: a Fiat ontology has
been designed that describes the whole set of concepts involved, together with their properties
and relations (e.g., vehicle, component, wind tunnel test). The first four steps,
which concern the extraction of data from documents, are described in Section 2, while
the method for predicting a noise spectrum is presented in Section 3. We finish
the paper with conclusions and directions for further work.</p>
      <p>After performing the wind tunnel tests, experts produce reports consisting of:
(a) textual tables containing basic vehicle characteristics: vehicle model, maker,
segment, size, etc.; (b) experts’ summaries of experiment results: a set of
paragraphs (text) in which experts express their opinions about the components used in
the vehicle, i.e., which of them turned out to be good with respect to noise
reduction and which of them should be improved; (c) experiment results: a set of
noise curves (spectra) and tables containing noise levels in the vehicle under different
conditions.</p>
      <p>Experts’ opinions reported in the summaries contain relevant information and
could, if used together with methods for knowledge acquisition (KA) from raw
data, improve the prediction accuracy of single-media models. Moreover, curve
captions and tables contain textual information that the raw data tools need in order
to associate their output with the concepts present in the ontology.</p>
      <p>
        We identified the following three interactions between the text and raw data
tools. (1) The tools for text and raw data extraction need to map their output
onto the ontology. To this end, the textual KA tools extract the vehicle model
and maker from the report (Figure 2). This task is performed using simple
regular expressions and, since the information is contained in tables, the process is
almost 100% correct. (2) Another key piece of information is the mapping between the
spectra and the ontology. This is achieved by performing entity recognition on the
graph captions (Figure 2) and associating the spectra with the corresponding
experiments. As training data we use a set of entities present in the ontology, and we apply the
k-nearest neighbor algorithm with the Levenshtein distance as the
distance between two entities. The obtained accuracy is around 85%. This
information is also used to enrich the feature space of the raw data tools and, as we
will see in Section 3.3, helps to improve the acquisition process. (3) The
information extracted in this last step is used exclusively to improve the
acquisition process (Figure 4). The information extracted from the report summaries
is used as background knowledge by the raw data tools. The summaries contain
the salient aspects of a vehicle test, in particular the vehicle components that
have been tested and their negative or positive influence on the vehicle noise
(i.e., whether the components are critical or not). The k-nearest neighbor algorithm is
also used to extract the car’s components, while the noise influence is estimated
using a strategy similar to sentiment analysis. We used a small set of
manually labeled summaries to train the classifier. The overall accuracy is around 51%.
      </p>
      <p>Note that this result is strongly influenced by the accuracy of the component
recognition (75%).</p>
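      <p>The caption-to-ontology matching described above can be illustrated with a minimal Python sketch of nearest-neighbour matching under the Levenshtein distance. The training pairs and concept names (e.g., FrontDoorCut) are hypothetical and chosen only for illustration; the text does not state the value of k or any text normalisation, so k = 1 and lowercasing are assumptions of this sketch.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def knn_label(mention: str, training: list[tuple[str, str]], k: int = 1) -> str:
    """Return the majority ontology concept among the k closest training strings."""
    ranked = sorted(training, key=lambda pair: levenshtein(mention.lower(), pair[0].lower()))
    votes = Counter(concept for _, concept in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training entities: (surface form, ontology concept).
TRAINING = [
    ("front door cut", "FrontDoorCut"),
    ("rear door cut", "RearDoorCut"),
    ("mirror", "Mirror"),
    ("windshield wipers", "Wipers"),
]

print(knn_label("Front door cuts", TRAINING))
```
      </preformat>
      <p>With k = 1 the sketch returns the concept of the closest known surface form, so a caption variant such as a plural or a small typo still maps to the same concept.</p>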
    </sec>
    <sec id="sec-2">
      <title>Noise prediction</title>
      <sec id="sec-2-1">
        <title>Problem definition</title>
        <p>A single learning example used in noise prediction consists of: (a) a vehicle shape
wind noise spectrum, vs_spectrum for short (here all component-related noise is
removed by fully taping the critical parts of a car); (b) an original vehicle noise
spectrum, o_spectrum for short (the noise measured in the original car); (c) the name of
a tested component (e.g., front door cut); (d) a vehicle wind noise spectrum with
one component being tested, comp_spectrum for short; and (e) the expert’s comment
on the comp_spectrum (possible values are critical and non-critical).</p>
        <p>The “raw” type of data are the three noise spectra (a, b, d), which are
automatically extracted from graphs by the image conversion tool (e.g., Figure 5). As
these images are in a vector graphics format, there is no need to use “true” image
analysis tools; instead, we parse the images’ source code with a few regular
expressions to extract the noise spectra. Each noise spectrum is thus represented by
110 measurements, where each measurement gives the sound pressure level (in
dB) for one octave band in the 0–10<sup>4</sup> Hz frequency range. The name of the tested component
is extracted from the image caption, as described in the previous section. The
expert comments are extracted for each component from the natural-language
description and classified as critical or non-critical. A component
classified as critical is a component that has a large impact on wind noise. In this
paper, we use manually classified comments, since the results of automatic
extraction are currently not reliable enough (51%) to be used successfully
in learning. The current data set contains 186 learning examples parsed
from 18 documents (18 tested cars; approx. 10 experiments per car).</p>
        <p>[Figure caption, fragment: ... and vehicle with one component being tested (red). The vertical axis shows sound pressure
level (in dB); the horizontal axis shows frequency (Hz).]</p>
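        <p>One way to hold such a learning example in code is sketched below in Python. The class and field names are illustrative assumptions (the vehicle identifier in particular is hypothetical); only the parts (a) to (e) and the length of 110 values per spectrum come from the description above.</p>
        <preformat preformat-type="code">
```python
from dataclasses import dataclass

SPECTRUM_LENGTH = 110  # sound pressure levels per spectrum, as stated in the text


@dataclass(frozen=True)
class LearningExample:
    vehicle: str                      # identifier of the tested car (hypothetical field)
    component: str                    # name of the tested component, e.g. "front door cut"
    vs_spectrum: tuple[float, ...]    # vehicle shape: critical parts fully taped
    o_spectrum: tuple[float, ...]     # original vehicle
    comp_spectrum: tuple[float, ...]  # one component being tested
    critical: bool                    # expert comment: True means "critical"

    def __post_init__(self) -> None:
        # Reject spectra that do not have the expected number of measurements.
        for name in ("vs_spectrum", "o_spectrum", "comp_spectrum"):
            if len(getattr(self, name)) != SPECTRUM_LENGTH:
                raise ValueError(f"{name} must hold {SPECTRUM_LENGTH} values")


# Toy usage with flat spectra; real values would come from the parsed graphs.
flat = tuple([60.0] * SPECTRUM_LENGTH)
example = LearningExample("car-01", "front door cut", flat, flat, flat, True)
print(example.component, len(example.comp_spectrum))
```
        </preformat>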
        <p>The task is to estimate the level of noise change for each component proposed
by an expert when this component is untaped; in other words, we need to predict
the comp_spectrum. We measure two different types of prediction error.
The first is the root mean squared error (RMSE) between the predicted and the true
noise curve, and the second is the relative (per car) number of inconsistencies with
respect to the expert comments. An inconsistency occurs between the predictions for
two different components of the same vehicle when one is critical, the other is
non-critical, and the predicted noise change is higher for the non-critical component. The
change of noise is defined as the Euclidean distance between the comp_spectrum of
the component and the vs_spectrum of the tested car. To estimate these two
errors, we used a leave-one-vehicle-out strategy (i.e., each time the examples
associated with one vehicle were left out). The list of possible components for each
vehicle tested was the same as the list of components actually used in the experiments.</p>
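        <p>The two error measures and the evaluation protocol can be sketched as follows. This is a minimal Python illustration rather than the implementation used in the experiments: the function names are hypothetical, and since the text does not say how the inconsistency count is normalised per car, the sketch divides by the number of critical/non-critical pairs, which is an assumption.</p>
        <preformat preformat-type="code">
```python
import math
from itertools import product


def rmse(predicted, actual):
    """Root mean squared error between two equally long noise curves."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def noise_change(comp_spectrum, vs_spectrum):
    """Euclidean distance between a component spectrum and the vehicle shape spectrum."""
    return math.dist(comp_spectrum, vs_spectrum)


def inconsistency_rate(predictions):
    """Share of (critical, non-critical) pairs of one vehicle that are ordered wrongly.

    `predictions` holds (predicted_noise_change, is_critical) tuples for one vehicle.
    Normalising by the number of pairs is an assumption of this sketch.
    """
    critical = [change for change, is_critical in predictions if is_critical]
    non_critical = [change for change, is_critical in predictions if not is_critical]
    pairs = list(product(critical, non_critical))
    if not pairs:
        return 0.0
    wrong = sum(1 for crit, non_crit in pairs if non_crit > crit)
    return wrong / len(pairs)


def leave_one_vehicle_out(examples):
    """Yield (held_out_vehicle, train, test); `examples` are (vehicle, payload) tuples."""
    for held_out in sorted({vehicle for vehicle, _ in examples}):
        train = [e for e in examples if e[0] != held_out]
        test = [e for e in examples if e[0] == held_out]
        yield held_out, train, test


print(inconsistency_rate([(5.0, True), (6.0, False), (1.0, False)]))
```
        </preformat>
        <p>In the usage line, one of the two critical/non-critical pairs is ordered wrongly (the non-critical component with predicted change 6.0 exceeds the critical one with 5.0), so half of the pairs are inconsistent.</p>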
      </sec>
      <sec id="sec-2-2">
        <title>Methods using only raw data</title>
        <p>In this section, we report the results obtained from raw data only: we predict
the comp_spectrum using only the o_spectrum and the vs_spectrum. Since the raw data do
not contain information about the actual tested component, we cannot expect
to achieve good results; however, it is still possible to make a prediction based
on the average influences in the testing data. The results obtained here will be used
as a baseline for comparison with the cross-media methods.</p>
        <p>The simplest methods used are Shape and Original. Shape
always predicts the spectrum of the vehicle shape (vs_spectrum) and therefore
assumes that components have no influence on noise. Similarly, Original
always predicts the o_spectrum.</p>
        <p>The next method used is AverageInfluence. This method first computes
the average relative influence of all components in the learning set, and then
applies it to each of the test examples. The relative influence of a single component
(example) i is computed as
<disp-formula><tex-math>\mathit{rel\_inf}_i(\mathit{freq}) = \frac{\mathit{comp\_spectrum}_i(\mathit{freq}) - \mathit{vs\_spectrum}_i(\mathit{freq})}{\mathit{o\_spectrum}_i(\mathit{freq}) - \mathit{vs\_spectrum}_i(\mathit{freq})}</tex-math></disp-formula>
where comp_spectrum<sub>i</sub>(freq), o_spectrum<sub>i</sub>(freq), and vs_spectrum<sub>i</sub>(freq) are the sound
pressure levels in dB at a given frequency freq for one tested component, the original
vehicle, and the vehicle shape, respectively. The average relative influence avg_rel_inf is
simply the average of the relative influences of all components in the learning set.
The comp_spectrum<sub>k</sub> of a new component k is then computed with the following
formula:
<disp-formula><tex-math>\mathit{comp\_spectrum}_k(\mathit{freq}) = \mathit{vs\_spectrum}_k(\mathit{freq}) + \mathit{diff}_k(\mathit{freq}) \times \mathit{avg\_rel\_inf}(\mathit{freq})</tex-math></disp-formula>
where diff<sub>k</sub>(freq) stands for o_spectrum<sub>k</sub>(freq) − vs_spectrum<sub>k</sub>(freq).</p>
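        <p>As a hedged illustration of the two formulas above, the following Python sketch computes the relative influence per frequency, averages it over a learning set, and applies it to a new vehicle. The function names are hypothetical, the spectra are shortened to three values for readability, and returning 0 where o_spectrum equals vs_spectrum is an assumption, since the text does not say how a zero denominator is handled.</p>
        <preformat preformat-type="code">
```python
def relative_influence(comp_spectrum, o_spectrum, vs_spectrum):
    """rel_inf(freq) = (comp - vs) / (o - vs), computed per frequency."""
    return [
        (comp - vs) / (o - vs) if o != vs else 0.0  # zero denominator: assumption
        for comp, o, vs in zip(comp_spectrum, o_spectrum, vs_spectrum)
    ]


def average_influence(training):
    """Average the relative influence over all (comp, o, vs) training triples."""
    influences = [relative_influence(*triple) for triple in training]
    return [sum(values) / len(values) for values in zip(*influences)]


def predict(o_spectrum, vs_spectrum, avg_rel_inf):
    """comp_k(freq) = vs_k(freq) + diff_k(freq) * avg_rel_inf(freq)."""
    return [vs + (o - vs) * inf for o, vs, inf in zip(o_spectrum, vs_spectrum, avg_rel_inf)]


# Toy learning set of (comp_spectrum, o_spectrum, vs_spectrum) triples.
TRAINING = [
    ([62.0, 64.0, 66.0], [64.0, 68.0, 68.0], [60.0, 60.0, 60.0]),
    ([61.0, 60.0, 62.0], [62.0, 64.0, 64.0], [60.0, 60.0, 60.0]),
]

avg_rel_inf = average_influence(TRAINING)
prediction = predict([70.0, 70.0, 70.0], [60.0, 60.0, 60.0], avg_rel_inf)
print(avg_rel_inf, prediction)
```
        </preformat>
        <p>In this notation the Shape baseline corresponds to a relative influence of 0 at every frequency, and the Original baseline to a relative influence of 1.</p>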
        <p>The SimpleKNN method adopts the basic idea of classical k-nearest neighbors (KNN). We believe
that KNN is the most reasonable approach in this domain, since we need to
predict 110 highly dependent continuous values. A noise spectrum is a
continuous curve, so the differences between adjacent predicted points must be
small, and KNN (or any other instance-based method) always satisfies this
constraint. The distance between two examples is calculated as
<disp-formula><tex-math>\mathit{dist}(e_i, e_j) = \mathit{dist}(\mathit{o\_spectrum}_i, \mathit{o\_spectrum}_j) + \mathit{dist}(\mathit{diff}_i, \mathit{diff}_j)</tex-math></disp-formula>
where dist on spectra is the Euclidean distance function. We use this distance to calculate
the weight of a learning example e<sub>i</sub> in the KNN prediction for a new test example
e<sub>k</sub>: weight(e<sub>i</sub>) = 1/dist(e<sub>i</sub>, e<sub>k</sub>) [<xref ref-type="bibr" rid="ref2">2</xref>]. The relative weighted influence for a new
example e<sub>k</sub> is then computed as
<disp-formula><tex-math>\mathit{w\_rel\_inf}_k(\mathit{freq}) = \frac{\sum_{i=0}^{n} \mathit{weight}(e_i) \times \mathit{rel\_inf}_i(\mathit{freq})}{\sum_{i=0}^{n} \mathit{weight}(e_i)}</tex-math></disp-formula>
where n equals the size of the training data set. Using the weighted relative influence,
we can then calculate the prediction of the comp_spectrum of the new component:
<disp-formula><tex-math>\mathit{comp\_spectrum}_k(\mathit{freq}) = \mathit{vs\_spectrum}_k(\mathit{freq}) + \mathit{diff}_k(\mathit{freq}) \times \mathit{w\_rel\_inf}_k(\mathit{freq})</tex-math></disp-formula>
The prediction of the SimpleKNN method is thus similar to that of the
AverageInfluence method. The difference is that SimpleKNN applies relative influences
more selectively, using a measure of similarity between vehicles: similar
vehicles (in terms of Euclidean distance) contribute more to the
predicted component influence.</p>
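        <p>A minimal Python sketch of the SimpleKNN prediction is given below, under stated assumptions: examples are plain dictionaries with the hypothetical keys "o", "vs" and "comp", every learning example contributes to the prediction (the text defines n as the size of the training set), and a small constant eps is added to each distance to avoid division by zero, a case the text does not discuss.</p>
        <preformat preformat-type="code">
```python
import math


def diff(example):
    """diff(freq) = o_spectrum(freq) - vs_spectrum(freq)."""
    return [o - vs for o, vs in zip(example["o"], example["vs"])]


def rel_inf(example):
    """Relative influence per frequency; assumes o differs from vs everywhere."""
    return [(c - vs) / (o - vs)
            for c, o, vs in zip(example["comp"], example["o"], example["vs"])]


def distance(a, b):
    """Sum of Euclidean distances between the o spectra and between the diff curves."""
    return math.dist(a["o"], b["o"]) + math.dist(diff(a), diff(b))


def simple_knn_predict(training, new, eps=1e-9):
    """Predict comp_spectrum of `new` using inverse-distance weights."""
    weights = [1.0 / (distance(e, new) + eps) for e in training]  # eps: assumption
    influences = [rel_inf(e) for e in training]
    total = sum(weights)
    w_rel_inf = [
        sum(w * inf[f] for w, inf in zip(weights, influences)) / total
        for f in range(len(new["o"]))
    ]
    return [vs + d * r for vs, d, r in zip(new["vs"], diff(new), w_rel_inf)]


# Toy data with one frequency; both learning examples are equally far from `new`.
training = [
    {"o": [72.0], "vs": [60.0], "comp": [66.0]},  # relative influence 0.5
    {"o": [68.0], "vs": [60.0], "comp": [60.0]},  # relative influence 0.0
]
new = {"o": [70.0], "vs": [60.0]}
print(simple_knn_predict(training, new))
```
        </preformat>
        <p>Because the two toy examples are equidistant from the new vehicle, their relative influences are averaged with equal weight; moving one of them closer would pull the prediction towards its influence.</p>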
        <p>LearnDistanceKNN is an improvement on the previously described SimpleKNN.
It uses a generalised formula for the distance,
<disp-formula><tex-math>\mathit{dist}(e_i, e_j) = w_1 \times \mathit{dist}(\mathit{o\_spectrum}_i, \mathit{o\_spectrum}_j) + w_2 \times \mathit{dist}(\mathit{diff}_i, \mathit{diff}_j)</tex-math></disp-formula>
where w<sub>1</sub> and w<sub>2</sub> are weights that are optimised to achieve the lowest prediction
error (RMSE) on the learning data. We used a simple hill-climbing method for the
optimisation. Table 1 shows the results of the methods used. Shape and Original have the
worst results, as expected. The best result is achieved by the SimpleKNN method.</p>
        <p>In the previous section we described methods using raw data only. Here, we
extend them to accept the complete learning example, which contains, along with the
raw data, the name of the component used and the experts’ comments on the
experiment results. To use component names, we simply added them to the feature
space and extended the SimpleKNN and LearnDistanceKNN methods with a third
term (for the component name) in the distance computation. The distance between
two components is defined as 1 if they are different and 0 if they are the same.
The results are shown in Table 2. As can be seen, extending SimpleKNN does
not result in a better model, since the distances between components do not
contribute enough. On the other hand, the new LearnDistanceKNN predicts
significantly better. Inspecting the weights learned for the distance computation
shows that the weight for the component name is always higher (approx.
10 times) than the weights for the spectra differences.</p>
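        <p>The generalised distance with the additional component-name term, together with one possible hill-climbing search, might look like the following Python sketch. The text does not specify the hill-climbing variant, the step sizes, or the starting weights, so the coordinate-wise search with step halving below is an assumption, as are all names. In the setting described above, the loss passed to the search would be the RMSE of the KNN predictions on the learning data; the usage lines use a toy quadratic loss only to keep the example self-contained.</p>
        <preformat preformat-type="code">
```python
import math


def weighted_distance(a, b, w):
    """w[0]*dist(o spectra) + w[1]*dist(diff curves) + w[2]*(1 if names differ else 0)."""
    diff_a = [o - vs for o, vs in zip(a["o"], a["vs"])]
    diff_b = [o - vs for o, vs in zip(b["o"], b["vs"])]
    name_term = 0.0 if a["component"] == b["component"] else 1.0
    return (w[0] * math.dist(a["o"], b["o"])
            + w[1] * math.dist(diff_a, diff_b)
            + w[2] * name_term)


def hill_climb(loss, start, step=1.0, min_step=1e-3, max_rounds=1000):
    """Greedy coordinate search: change one weight at a time while the loss improves."""
    best, best_loss = list(start), loss(start)
    for _ in range(max_rounds):
        if min_step > step:
            break
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                candidate = list(best)
                candidate[i] = max(0.0, candidate[i] + delta)  # keep weights non-negative
                candidate_loss = loss(candidate)
                if best_loss > candidate_loss:
                    best, best_loss, improved = candidate, candidate_loss, True
        if not improved:
            step /= 2.0  # refine the search once no neighbour is better
    return best, best_loss


def toy_loss(w):
    """Stand-in for the RMSE on the learning data; minimal at w = [2.0, 0.5]."""
    return (w[0] - 2.0) ** 2 + (w[1] - 0.5) ** 2


weights, error = hill_climb(toy_loss, [1.0, 1.0])
print(weights, error)
```
        </preformat>
        <p>A large learned weight on the name term, as reported above, means that examples of the same component dominate the neighbourhood regardless of small differences between spectra.</p>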
        <p>Using expert comments in learning is more difficult. Expert comments
divide components into two classes: critical and non-critical. Critical components
are components that do not reduce wind noise well enough with respect to the
vehicle shape wind noise. While it is important to have models that are as accurate as
possible, it is also important to distinguish between critical and non-critical
components in the same vehicle, since critical components are the weak points of the
vehicle that have to be improved.</p>
        <p>Criticality cannot be used as a feature in learning, because it is not
available for new vehicles (experts cannot state a component’s criticality until they have
conducted the experiments). Therefore, we need to use this feature as a form of background
knowledge in the learning process. We applied the principle used in QFilter [<xref ref-type="bibr" rid="ref3">3</xref>].
QFilter is a method used in qualitative reasoning that makes quantitative
predictions that are guaranteed to be consistent with the induced qualitative model.
These predictions are often considerably more accurate than those obtained by
state-of-the-art numerical learning methods. QFilter changes the class values of
examples to make them consistent with a given qualitative model. In our case,
the class value is the relative influence, which we decrease if it is too high for a
non-critical example and increase if it is too low for a critical example. In this paper,
we arbitrarily used 0.1 as the threshold: whenever a critical component
has a relative influence of less than 0.1 at any frequency, the value is increased to
0.1, and whenever a non-critical component has a relative influence higher than 0.1
at a given frequency, the value is decreased to 0.1. This changes the class values in a
way that increases the difference between critical and non-critical examples.</p>
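        <p>The adjustment of class values with the threshold 0.1 can be sketched in a few lines of Python. This illustrates only the thresholding step described above, not the QFilter algorithm itself; the function name is hypothetical.</p>
        <preformat preformat-type="code">
```python
THRESHOLD = 0.1  # threshold value used in the text


def adjust_relative_influence(rel_inf, critical, threshold=THRESHOLD):
    """Raise low values of critical examples and lower high values of non-critical ones."""
    if critical:
        return [max(value, threshold) for value in rel_inf]
    return [min(value, threshold) for value in rel_inf]


rel_inf = [0.05, 0.4, -0.2]  # toy relative influences at three frequencies
print(adjust_relative_influence(rel_inf, critical=True))
print(adjust_relative_influence(rel_inf, critical=False))
```
        </preformat>
        <p>After the adjustment, every critical example has a relative influence of at least 0.1 and every non-critical example at most 0.1 at each frequency, which separates the two groups in the learning data.</p>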
        <p>We applied this change to the LearnDistanceKNN method (with
component name). The RMSE of the method improved to 9.42, while the
average relative inconsistency decreased to 0.16. As expected, using text
comments decreased the number of inconsistencies. At the same time, the accuracy
of the method did not decrease, which is a very good, although surprising, result:
one would expect that changing the class values in the learning data would
decrease the overall accuracy.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusions and Future Work</title>
      <p>In this paper, we have described initial attempts to improve raw data analysis
by exploiting information extracted from text, in a use case involving reports
that describe wind tunnel experiments. We have shown that using comments extracted
from text improves the average rate of inconsistency, so that fewer
mistakes are made in ordering the components by their influence on noise. Moreover, the
results also show an improvement in RMSE, which is promising. In general, there
is still considerable room for improvement in raw data extraction, in text extraction,
and in their combination. One possibility is to use more training data. In the
current evaluation we used only 18 reports; we expect to have about 100
parsed and analyzed by the end of the project. We also plan to improve our
algorithms by considering the similarity of different components (e.g., two different
components can have a similar effect on the noise spectrum), computed from their
influences on the training data.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>This work has been funded by the X-Media project (www.x-media-project.org)
sponsored by the European Commission as part of the Information Society Technologies
(IST) programme under EC grant number IST-FP6-026978.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M.</given-names>
            <surname>Giordanino</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Scarafiotti</surname>
          </string-name>
          .
          <article-title>Deliverable D13.1. Testbed description and requirement analysis for competitor analysis</article-title>
          .
          <source>Deliverable</source>
          ,
          <institution>X-Media Consortium</institution>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>D.</given-names>
            <surname>Shepard</surname>
          </string-name>
          .
          <article-title>A two-dimensional interpolation function for irregularly-spaced data</article-title>
          .
          <source>In Proceedings of the 1968 23rd ACM national conference</source>
          , pages
          <fpage>517</fpage>
          -
          <lpage>524</lpage>
          ,
          <year>1968</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
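          <string-name>
            <given-names>D.</given-names>
            <surname>Šuc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vladušič</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Bratko</surname>
          </string-name>
          .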
          <article-title>Qualitatively faithful quantitative prediction</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>158</volume>
          (
          <issue>2</issue>
          ):
          <fpage>189</fpage>
          -
          <lpage>214</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>