=Paper=
{{Paper
|id=Vol-233/paper-31
|storemode=property
|title=X-Media: Large Scale Knowledge Acquisition,Sharing and Reuse across-Media
|pdfUrl=https://ceur-ws.org/Vol-233/p63.pdf
|volume=Vol-233
|dblpUrl=https://dblp.org/rec/conf/samt/CiravegnaS06
}}
==X-Media: Large Scale Knowledge Acquisition, Sharing and Reuse across-Media==
Fabio Ciravegna and Steffen Staab

F. Ciravegna is with the Department of Computer Science of the University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP, Sheffield, UK (e-mail: f.ciravegna@dcs.shef.ac.uk).
S. Staab is with the Department of Computer Science, University Koblenz-Landau, 56016 Koblenz, Germany (e-mail: staab@uni-koblenz.de).

Abstract: X-Media is an Integrated Project funded by the European Commission, which addresses the issue of knowledge management in complex distributed environments. It will study, develop and implement large scale methodologies and techniques for knowledge management able to support sharing and reuse of knowledge that is distributed across different media (images, documents and data) and repositories (databases, knowledge bases, document repositories, etc.). The project started in March 2006 and will last for four years.

Index Terms: Cross Media Knowledge Acquisition, Cross-media Knowledge Sharing, Architectures for Knowledge Management

I. INTRODUCTION

While in the past medium size, mainly textual, centralized archives used to be the only resources for knowledge management, nowadays large companies handle very large quantities of multimedia information in distributed archives. Their intranets connect thousands of computers and reach sizes of tens of millions of documents. In addition, the increased use of the WWW as a source of information has made the boundary between intra- and inter-net very thin. This dramatically increases the size of the information space. Moreover, databases and archives are used to store huge amounts of information that is vital for the life of the organization, such as data on products, financial information, etc. Collecting and aggregating multimedia knowledge is of fundamental importance in order to gain competitiveness and to reduce costs. For example, thousands of documents are produced during the design and manufacturing of a class of jet engines. During service, a single engine produces about 1 GByte of vibration data per flight; if irregularities are found, part of the data is stored. Every time an engine is serviced, financial information is produced. If problems are found, pictures are taken and reports are written. Each individual engine has a potential "folder" of information describing its whole lifecycle, which can easily sum up to several Gigabytes of information, potentially Terabytes, and contains highly interrelated information stored in different media.

The growing size and the multimedia nature of the archives have serious implications for the way knowledge management can be implemented. There are a number of dimensions along which the complexity arises:

• Cross-Media: evidence is often distributed in different media; it is possible that knowledge expressed in just one medium does not carry enough evidence. Connecting information in more than one medium is often required.
• Knowledge integration: large distributed archives require the ability to map the distribution of information, to weight every single source and to distribute searches carefully. This is very difficult, and often search is performed in just some of the archives, disregarding others that can bring very useful information.
• Focusing: a large amount of information implies that managing knowledge becomes more complex and needs powerful focusing methodologies. The focus of searching changes in time and from user to user, and requires a balanced mixture of exploration and searching.
• Uncertainty and Dynamicity: information is often ambiguous, incomplete, or refers to a specific context. Archives can therefore contain noise and imprecision, as well as obsolete information; each piece of knowledge must be judged based on provenance, evidence, etc.
• Infrastructure: different media cannot easily be shared. A folder of text documents may be sent via email, but a folder of images may not, and may instead require a shared image repository. For 10 GByte of data, remote access to the underlying database has to be considered.

Current knowledge management technologies and practices cannot cope with this new situation, as they mainly provide simple mechanisms (e.g. keyword searching) to support knowledge workers, who must manually piece together the information from different sources.

II. X-MEDIA

X-Media addresses the issue of knowledge management in complex distributed environments. It studies, develops and implements large scale methodologies and techniques for knowledge management able to support sharing and reuse of knowledge that is distributed in different media (images, documents and data) and repositories (databases, knowledge bases, document repositories, etc.).

X-Media studies, designs and develops:
1) Robust and scalable knowledge acquisition and data analysis tools operating across media boundaries (text, images and data) to automatically cross-relate and annotate text and images with metadata.
2) Novel and cutting-edge knowledge fusion methods to support knowledge workers in making decisions when confronted with possibly contradicting knowledge derived from different resources.
3) Effective and efficient new paradigms for knowledge retrieval, sharing and reuse working across media which enable users to define and parameterize views on the available knowledge according to their needs.
4) Techniques able to represent and manage (i) uncertainty, (ii) trust and provenance as well as (iii) dynamic aspects of knowledge.
5) A methodology and a technical infrastructure able to deliver knowledge from across media to the knowledge workers, taking into account the complexity of managing different media with different sizes of data.
6) A generic and flexible architecture allowing end users to easily customize it and integrate it with their KM practices or needs, as well as a mainly open source reference implementation and libraries which technology-providing companies can reuse.

Technologies will be able to support knowledge workers in an effective way, (i) hiding the complexity of the underlying search/retrieval process, (ii) resulting in a natural access to knowledge, (iii) allowing interoperability between heterogeneous information resources and (iv) covering heterogeneous data types (data, images, texts). The expected impact on organizations is to dramatically improve access to, sharing of and use of information by humans and between machines. Expected benefits are a dramatic reduction of management costs and increasing feasibility of complex knowledge management tasks. The project plan is structured along the four areas described below.

Area 1: Knowledge Sharing and Reuse

X-Media studies and implements technologies and methodologies for easy and intelligent access to and reuse of formalized and non-formalized knowledge. The reuse takes into consideration the user context to help focus searches and reuse. Reuse and sharing are enabled via cross-media, ontology-supported automatic indexing. The technology works in a largely automated way, but it is centered on supporting users' work rather than replacing them. This is because the activity of a knowledge worker is complex and humans are irreplaceable agents in this process.

In this context, we are studying, designing and developing:
(1) Effective and efficient new paradigms for knowledge retrieval, sharing and reuse which enable users to define and parameterize views on the available knowledge according to their needs.
(2) Novel and cutting-edge knowledge fusion methods to support knowledge workers in making decisions when confronted with possibly contradicting knowledge derived from different resources.
(3) Techniques able to represent and manage (i) uncertainty, (ii) trust and provenance as well as (iii) dynamic aspects of knowledge.

Area 2: Automated Knowledge Acquisition from Documents, Images and Raw Data

Functional to the knowledge sharing methodologies investigated in Area 1 is the ability to acquire knowledge across media in a rich, semantically oriented way. X-Media develops a set of tools able to support sharing methodologies in a seamless and automatic way. Media addressed are raw data, texts and images (e.g. results or parameters in experiments, raw images, textual documents, etc.). The outcome of the acquisition technologies will be a semantic representation of the content (conceptualization) to be used for knowledge management purposes. Enrichment of multimedia documents with additional layers of automatically generated annotation is the main means of associating conceptualizations with resources. Current technology focuses on single-medium technologies to acquire knowledge in multimedia environments; this means that retrieval methods use mainly one medium (e.g. text) even in multimedia environments. X-Media designs and develops technologies for information extraction that work truly cross-media and that can be used in cases where information in one medium is necessary to understand the information in the other.

Area 3: Infrastructure

A knowledge acquisition, integration and sharing environment is defined. Since X-Media is an application-oriented integrated project, integration is required on the implementation as well as on the conceptual level. The main outcome of this area of activity will be a methodology and a technical infrastructure able to deliver knowledge from across media to the knowledge workers, taking into account the complexity of managing media with different sizes of data.

Area 4: Application and Testing

The technology above is used to define showcases and prototype applications. Two main testbeds are defined by the two industrial users (Rolls Royce and Fiat). They concern competitor analysis in the car industry and product lifecycle monitoring in aerospace. System trials with final users will showcase the technology and pave the way to further exploitation.

III. CONSORTIUM

Partners: University of Sheffield (coordinator, UK), University of Koblenz (D), ITC-Irst (I), University of Ljubljana (Slovenia), University of Freiburg (D), CERTH (G), Labri (F), University of Karlsruhe (D) and the Open University (UK); Quinary (I), Ontoprise (D), Solcara (UK), CognIT (N), Rolls Royce (UK) and Centro Ricerche Fiat (I).

ACKNOWLEDGMENT

X-Media is funded by the European Commission as part of Framework 6 of IST, contract no FP6-26978.
Project web page: http://www.x-media-project.org.

For information: Prof. Fabio Ciravegna, email: xmedia-coordinator@dcs.shef.ac.uk