=Paper= {{Paper |id=Vol-233/paper-31 |storemode=property |title=X-Media: Large Scale Knowledge Acquisition,Sharing and Reuse across-Media |pdfUrl=https://ceur-ws.org/Vol-233/p63.pdf |volume=Vol-233 |dblpUrl=https://dblp.org/rec/conf/samt/CiravegnaS06 }} ==X-Media: Large Scale Knowledge Acquisition,Sharing and Reuse across-Media== https://ceur-ws.org/Vol-233/p63.pdf
X-Media: Large Scale Knowledge Acquisition, Sharing and Reuse across-Media
Fabio Ciravegna and Steffen Staab


F. Ciravegna is with the Department of Computer Science of the University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP, Sheffield, UK. (e-mail: f.ciravegna@dcs.shef.ac.uk)
S. Staab is with the Department of Computer Science, University of Koblenz-Landau, 56016 Koblenz, Germany. (e-mail: staab@uni-koblenz.de)

Abstract: X-Media is an integrated project funded by the European Commission, which addresses the issue of knowledge management in complex distributed environments. It will study, develop and implement large scale methodologies and techniques for knowledge management able to support sharing and reuse of knowledge that is distributed across different media (images, documents and data) and repositories (databases, knowledge bases, document repositories, etc.). The project started in March 2006 and will last for 4 years.

Index Terms: Cross Media Knowledge Acquisition, Cross-media Knowledge Sharing, Architectures for Knowledge Management

I. INTRODUCTION

While in the past medium-sized, mainly textual, centralized archives used to be the only resources for knowledge management, nowadays large companies handle very large quantities of multimedia information in distributed archives. Their intranets connect thousands of computers and reach sizes of tens of millions of documents. In addition, the increased use of the WWW as a source of information has made the boundary between intranet and Internet very thin. This dramatically increases the size of the information space. Moreover, databases and archives are used to store huge amounts of information that is vital to the life of the organization, such as data on products, financial information, etc. Collecting and aggregating multimedia knowledge is of fundamental importance in order to gain competitiveness and to reduce costs. For example, thousands of documents are produced during the design and manufacturing of a class of jet engines. During service, a single engine produces about 1 GByte of vibration data per flight; if irregularities are found, part of the data is stored. Every time an engine is serviced, financial information is produced. If problems are found, pictures are taken and reports are written. Each individual engine has a potential “folder” of information describing the whole lifecycle of the engine that can easily add up to several Gigabytes of information, potentially Terabytes, and contains highly interrelated information stored in different media.

The growing size and the multimedia nature of the archives have serious implications for the way knowledge management can be implemented. There are a number of dimensions along which the complexity arises:
• Cross-Media: evidence is often distributed in different media; it is possible that knowledge expressed in just one medium does not carry enough evidence. Connecting information in more than one medium is often required.
• Knowledge integration: large distributed archives require the ability to map the distribution of information, to weigh every single source and to distribute searches carefully; this is very difficult, and often search is performed in just some of the archives, disregarding others that can bring very useful information.
• Focusing: the large amount of information means that managing knowledge becomes more complex and needs powerful focusing methodologies. The focus of a search changes over time and from user to user, and requires a balanced mixture of exploration and searching.
• Uncertainty and Dynamicity: information is often ambiguous, incomplete, or refers to a specific context; archives can therefore contain noise and imprecision, as well as obsolete information. Each piece of knowledge must therefore be judged based on provenance, evidence, etc.
• Infrastructure: different media cannot easily be shared. A folder of text documents may be sent via email, but a folder of images may not, and may instead require a shared image repository. For 10 GByte of data, remote access to the underlying database has to be considered.

Current knowledge management technologies and practices cannot cope with this new situation, as they mainly provide simple mechanisms (e.g. keyword searching) to support knowledge workers as they manually piece together the information from different sources.

II. X-MEDIA

X-Media addresses the issue of knowledge management in complex distributed environments. It studies, develops and implements large scale methodologies and techniques for knowledge management able to support sharing and reuse of knowledge that is distributed in different media (images, documents and data) and repositories (databases, knowledge bases, document repositories, etc.).

X-Media studies, designs and develops:
1) Robust and scalable knowledge acquisition and data analysis tools operating across media boundaries (text, images and data) to automatically cross-relate and annotate text and images with metadata.
2) Novel and cutting-edge knowledge fusion methods to support knowledge workers in making decisions when confronted with (possibly contradicting) knowledge derived from different resources.
3) Effective and efficient new paradigms for knowledge retrieval, sharing and reuse working across media, which enable users to define and parameterize views on the available knowledge according to their needs.
4) Techniques able to represent and manage (i) uncertainty, (ii) trust and provenance as well as (iii) dynamic aspects of knowledge.
5) A methodology and a technical infrastructure able to deliver knowledge from across media to the knowledge workers, taking into account the complexity of managing different media with different sizes of data.
6) A generic and flexible architecture that end users can easily customize and integrate with their KM practices or needs, together with a mainly open source reference implementation and libraries that technology providers can reuse.

Technologies will be able to support knowledge workers in an effective way, (i) hiding the complexity of the underlying search/retrieval process, (ii) resulting in natural access to knowledge, (iii) allowing interoperability between heterogeneous information resources and (iv) accommodating heterogeneous data types (data, images, texts). The expected impact on organizations is to dramatically improve access to, sharing of and use of information by humans and between machines. Expected benefits are a dramatic reduction of management costs and increased feasibility of complex knowledge management tasks. The project plan is structured along the four areas described below.

Area 1: Knowledge Sharing and Reuse
X-Media studies and implements technologies and methodologies for easy and intelligent access to and reuse of formalized and non-formalized knowledge. The reuse takes into consideration the user context to help focus searches and reuse. Reuse and sharing are enabled via cross-media, ontology-supported automatic indexing. The technology works in a largely automated way, but it is centered on supporting users in their work rather than replacing them. This is because the activity of a knowledge worker is complex and humans are irreplaceable agents in this process.

In this context, we are studying, designing and developing:
(1) Effective and efficient new paradigms for knowledge retrieval, sharing and reuse which enable users to define and parameterize views on the available knowledge according to their needs.
(2) Novel and cutting-edge knowledge fusion methods to support knowledge workers in making decisions when confronted with (possibly contradicting) knowledge derived from different resources.
(3) Techniques able to represent and manage (i) uncertainty, (ii) trust and provenance as well as (iii) dynamic aspects of knowledge.

Area 2: Automated Knowledge Acquisition from Documents, Images and Raw Data
The knowledge sharing methodologies investigated in Area 1 depend on the ability to acquire knowledge across media in a rich, semantically oriented way. X-Media develops a set of tools able to support sharing methodologies in a seamless and automatic way. Media addressed are raw data, texts and images (e.g. results or parameters in experiments, raw images, textual documents, etc.). The outcome of the acquisition technologies will be a semantic representation of the content (conceptualization) to be used for knowledge management purposes. Enrichment of multimedia documents with additional layers of automatically generated annotation is the main means of associating conceptualizations with resources. Current technology relies on single-medium techniques to acquire knowledge in multimedia environments; this means that retrieval methods mainly use one medium (e.g. text) even in multimedia environments. X-Media designs and develops technologies for information extraction that work truly across media and that can be used in cases where information in one medium is necessary to understand the information in the other.

Area 3: Infrastructure
A knowledge acquisition, integration and sharing environment is defined. Since X-Media is an application-oriented integrated project, integration is required on the implementation level as well as on the conceptual level. The main outcome of this area of activity will be a methodology and a technical infrastructure able to deliver knowledge from across media to the knowledge workers, taking into account the complexity of managing media with different sizes of data.

Area 4: Application and Testing
The technology above is used to define showcases and prototype applications. Two main testbeds are defined by the two industrial users (Rolls Royce and Fiat). They concern competitor analysis in the car industry and product lifecycle monitoring in aerospace. System trials with end users will showcase the technology and pave the way to further exploitation.

III. CONSORTIUM

Partners: University of Sheffield (coordinator, UK), University of Koblenz (D), ITC-Irst (I), University of Ljubljana (Slovenia), University of Freiburg (D), CERTH (G), Labri (F), University of Karlsruhe (D) and the Open University (UK); Quinary (I), Ontoprise (D), Solcara (UK), CognIT (N), Rolls Royce (UK) and Centro Ricerche Fiat (I).

ACKNOWLEDGMENT

X-Media is funded by the European Commission as part of Framework 6 of IST, contract no FP6-26978.
Project web page: http://www.x-media-project.org.
For information: Prof. Fabio Ciravegna, email: xmedia-coordinator@dcs.shef.ac.uk