=Paper= {{Paper |id=None |storemode=property |title=Molecules of Knowledge: a Novel Perspective over Knowledge Management |pdfUrl=https://ceur-ws.org/Vol-926/paper5.pdf |volume=Vol-926 |dblpUrl=https://dblp.org/rec/conf/aiia/0001O12 }} ==Molecules of Knowledge: a Novel Perspective over Knowledge Management== https://ceur-ws.org/Vol-926/paper5.pdf

    Molecules of Knowledge: a Novel Perspective
           over Knowledge Management

                                Stefano Mariani
                           Supervisor: Andrea Omicini

                Alma Mater Studiorum – Università di Bologna
                     via Venezia 52 – 47521 Cesena, Italy
                            s.mariani@unibo.it


      Abstract. To face the challenges of knowledge-intensive environments,
      we investigate a novel self-organising knowledge-oriented (SOKO) model,
      called Molecules of Knowledge (MoK for short). In MoK, knowledge
      atoms are generated by knowledge sources in shared spaces – compart-
      ments –, self-aggregate to shape knowledge molecules, and autonomously
      move toward knowledge consumers.


1   The Molecules of Knowledge Model
The Molecules of Knowledge model is a biochemically-inspired coordination
model exploiting and promoting self-organisation of knowledge. The basic moti-
vation behind MoK is the idea that knowledge should autonomously aggregate
and diffuse to reach knowledge consumers rather than “be searched”. The main
pillars of MoK are represented by the following (bio)chemical abstractions:
atoms — the smallest unit of knowledge in MoK, an atom contains information
   from a source, and belongs to a compartment where it “floats”—and where
   it is subject to its “laws of nature”;
molecules — the MoK units for knowledge aggregation, molecules bond to-
   gether somehow-related atoms;
enzymes — emitted by MoK catalysts, enzymes influence MoK reactions, thus
   affecting the dynamics of knowledge evolution within MoK compartments
   to better match the catalyst’s needs;
reactions — working at a given rate, reactions are the biochemical laws regu-
   lating the evolution of each MoK compartment, by ruling knowledge aggre-
   gation, diffusion, and decay within compartments.
Other relevant MoK abstractions are instead in charge of important aspects
like topology, knowledge production and consumption: compartments repre-
sent the conceptual loci for all MoK entities as well as for biochemical processes
– like knowledge aggregation and diffusion –, providing MoK with the notions
of locality and neighbourhood; sources are the origins of knowledge, which is
continuously injected in the form of MoK atoms at a certain rate within the
compartment the sources belong to; catalysts stand for knowledge prosumers, who
emit enzymes representing their actions. Enzymes affect knowledge dynamics
within the catalyst’s own compartment, mainly by increasing the probability of
providing the catalyst with relevant knowledge items.

Atoms Atoms are the most primitive living pieces of knowledge within the model.
A MoK atom is produced by a knowledge source, and conveys a piece of informa-
tion spawning from the source itself. Hence, along with the content they store,
atoms should also store some contextual information to refer to the content’s
origin, and to preserve its original meaning.
    As a result, a MoK atom is a triple of the form $atom(src, val, attr)_c$ where:
$src$ unambiguously identifies the source of knowledge; $val$ is the actual piece of
knowledge carried by the atom (any kind of content); $attr$ is essentially the
content’s attribute, that is, the additional information that helps in understanding
it, possibly expressed according to some well-defined ontology or controlled
vocabulary; $c$ is the current concentration of the atom, which is essentially the
number of atoms of that kind in the compartment.
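
The triple can be pictured with the following minimal Python sketch. It assumes that a compartment can be modelled as a multiset (a `Counter`), so that the concentration of an atom is simply its multiplicity. The `Atom` class, the `compartment` variable, and the example source URI are illustrative names, not part of the MoK model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    src: str   # unambiguous identifier of the knowledge source
    val: str   # the piece of knowledge itself (a plain string in this sketch)
    attr: str  # attribute that helps in understanding the content


# Assumption: a compartment is a multiset of atoms, so the concentration c
# of an atom is the number of equal atoms currently stored in it.
compartment = Counter()

a = Atom(src="http://example.org/feed", val="Bologna", attr="city")
compartment[a] += 1
# A second, structurally identical atom raises the concentration to 2.
compartment[Atom("http://example.org/feed", "Bologna", "city")] += 1

concentration = compartment[a]
```

Making `Atom` immutable (frozen) keeps it hashable, which is what lets equal atoms collapse into a single entry with a count.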

Molecules Molecules are spontaneous, stochastic, “environment-driven” aggrega-
tions of atoms, which in principle are meant to reify some semantic relationship
between atoms, thus possibly adding new knowledge to the system—for instance
self-aggregated news chunks could shape the conceptual “path” toward a novel
interesting news story. Each molecule is simply interpreted as a set of atoms,
that is, an unordered collection of somehow semantically-related atoms.
    A MoK molecule then has a structure of the form $molecule(Atoms)_c$ where
$c$ is the current concentration of the molecule, and $Atoms$ is the collection of all
the atoms currently bonded together in the molecule.
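
Since a molecule is interpreted as an unordered set of atoms, one possible encoding (an assumption of this sketch, not something the model prescribes) is an immutable `frozenset`, again counted in a `Counter` to obtain its concentration. The names `Atom`, `Molecule`, and the sample values are hypothetical.

```python
from collections import Counter
from typing import FrozenSet, NamedTuple


class Atom(NamedTuple):
    src: str
    val: str
    attr: str


# Assumption: a molecule is an immutable, unordered set of atoms.
Molecule = FrozenSet[Atom]

a1 = Atom("src-a", "earthquake", "subject")
a2 = Atom("src-b", "Emilia", "location")

# The order in which atoms are listed does not matter.
m1: Molecule = frozenset([a1, a2])
m2: Molecule = frozenset([a2, a1])

compartment = Counter()
compartment[m1] += 1
compartment[m2] += 1  # same molecule, so its concentration becomes 2
```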

Enzymes One of the key features of MoK is that it interprets a prosumer’s
knowledge-related actions as signals that increase the concentration of related
atoms and molecules within the prosumer’s compartment, producing the positive
feedback required to enable self-organisation of knowledge.
    A MoK enzyme then has a structure of the form $enzyme(Atoms)_c$ where
every enzyme, with its own concentration $c$, explicitly refers to a collection of
$Atoms$ that the catalyst’s actions have in any way pointed out as of interest for
the catalyst.

Biochemical Reactions The behaviour of a MoK system is actually determined
by the last abstraction of the model: the biochemical reaction [1].
   As a knowledge-oriented model, the main issue of MoK is determining the
semantic correlation between MoK atoms. So, to define a working MoK system,
the basic function $mok(atom_1, atom_2)$ should be defined, which takes
two atoms $atom_1$, $atom_2$ and returns a “matching degree”, which could be a
boolean or a double value in $[0, 1]$. The precise definition of $mok$ depends on the
specific application domain.
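
As a purely illustrative example of such a function, the sketch below computes a matching degree in [0, 1] as the Jaccard overlap between the words of the two atom values. This choice of similarity measure is an assumption made here for the sake of the example; the model leaves the definition of `mok` to the application domain.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    src: str
    val: str
    attr: str


def mok(atom1: Atom, atom2: Atom) -> float:
    """Toy matching degree in [0, 1]: Jaccard overlap of the word sets of val."""
    words1 = set(atom1.val.lower().split())
    words2 = set(atom2.val.lower().split())
    if not words1 or not words2:
        return 0.0
    return len(words1 & words2) / len(words1 | words2)


related = mok(Atom("s1", "flood in Venice", "subject"),
              Atom("s2", "Venice flood warning", "subject"))
unrelated = mok(Atom("s1", "flood in Venice", "subject"),
                Atom("s3", "football results", "subject"))
```

Here `related` is 0.5 (two shared words out of four distinct ones) and `unrelated` is 0.0; a boolean variant could simply compare the degree against a threshold.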
   The aggregation reaction (AggR) bonds together atoms and molecules:
$$molecule(Atoms_1) + molecule(Atoms_2) \longmapsto_{r_{agg}} molecule(Atoms_1 \cup Atoms_2) + Residual(Atoms_1, Atoms_2)$$

where $r_{agg}$ is the AggR reaction rate, $mok(atom_1, atom_2)$ holds for some
$atom_1 \in Atoms_1$, $atom_2 \in Atoms_2$, and $Residual(Atoms_1, Atoms_2)$ is the multiset of
atoms obtained as the multiset difference $(Atoms_1 \uplus Atoms_2) \setminus (Atoms_1 \cup Atoms_2)$,
that is, essentially, all the atoms of $Atoms_1$ and $Atoms_2$ that do not belong to the
resulting molecule. In short, more complex molecules are formed by aggregation
whenever some atoms in the reacting molecules (or in reacting atoms, handled
here as one-atom molecules) are semantically correlated (via the $mok$ function).
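
The aggregation step and its residual can be sketched in Python as follows. The sketch assumes that molecules are sets of atoms, that atoms are simple `(val, attr)` tuples, and that two atoms correlate when they share the same attribute; `aggregate` and `same_attr` are hypothetical names, and the reaction rate is deliberately left out.

```python
from collections import Counter


def same_attr(atom1, atom2):
    # Toy stand-in for the mok function: atoms correlate if attributes match.
    return atom1[1] == atom2[1]


def aggregate(atoms1, atoms2, mok):
    """Apply AggR once; return (merged, residual), or None if nothing correlates."""
    if not any(mok(a1, a2) for a1 in atoms1 for a2 in atoms2):
        return None
    merged = frozenset(atoms1) | frozenset(atoms2)        # set union
    multiset_union = Counter(atoms1) + Counter(atoms2)    # multiset union
    residual = multiset_union - Counter(merged)           # multiset difference
    return merged, residual


m1 = frozenset({("quake", "subject"), ("Emilia", "location")})
m2 = frozenset({("quake", "subject"), ("tremor", "subject")})
merged, residual = aggregate(m1, m2, same_attr)
```

In this example the atom `("quake", "subject")` occurs in both reactants, so one copy ends up in the merged molecule and the other is left over in `residual`.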
   Positive feedback is obtained by the reinforcement reaction (ReinfR), which
consumes a single unit of enzyme to produce a unit of the related atom/molecule:
$$enzyme(Atoms_1) + molecule(Atoms_2)_c \longmapsto_{r_{reinf}} molecule(Atoms_2)_{c+1}$$
where $r_{reinf}$ is the ReinfR reaction rate, $enzyme(Atoms_1)$ and $molecule(Atoms_2)_c$
exist in the compartment, both with $c \neq 0$, and $mok(atom_1, atom_2)$ holds for
some $atom_1 \in Atoms_1$, $atom_2 \in Atoms_2$.
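
A single firing of the reinforcement reaction might look like the sketch below. It assumes that enzymes and molecules are kept in two separate `Counter` multisets, that atoms are plain strings, and that correlation is string equality; all names (`reinforce`, `equal`, the sample atoms) are hypothetical, and the rate at which the reaction fires is not modelled.

```python
from collections import Counter


def equal(atom1, atom2):
    # Toy stand-in for the mok function.
    return atom1 == atom2


def reinforce(enzymes, molecules, enzyme, molecule, mok):
    """Fire ReinfR once: consume one enzyme unit, add one molecule unit."""
    if enzymes[enzyme] == 0 or molecules[molecule] == 0:
        return False  # both reactants must exist in the compartment
    if not any(mok(a1, a2) for a1 in enzyme for a2 in molecule):
        return False  # the enzyme does not refer to this molecule
    enzymes[enzyme] -= 1
    molecules[molecule] += 1
    return True


enzyme = frozenset({"quake"})
molecule = frozenset({"quake", "Emilia"})
enzymes = Counter({enzyme: 2})
molecules = Counter({molecule: 1})

fired = reinforce(enzymes, molecules, enzyme, molecule, equal)
```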
    Following the biochemical metaphor, molecules should fade as time passes,
lowering their own concentration according to some well-defined decay-law. The
temporal decay reaction (DecayR) is hence defined as follows:
$$molecule(Atoms)_c \longmapsto_{r_{decay}} molecule(Atoms)_{c-1}$$
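
To give a feeling for how a rate turns into timing, the sketch below simulates the decay of one molecule species. The model does not fix a rate law, so two assumptions are made here: mass-action kinetics (the propensity is `r_decay * c`) and exponentially distributed waiting times, as in Gillespie's stochastic simulation algorithm [4]. The function name and parameters are hypothetical.

```python
import random


def simulate_decay(c, r_decay, t_end, rng):
    """Return a list of (time, concentration) pairs for repeated DecayR firings."""
    t = 0.0
    trace = [(t, c)]
    while c > 0:
        propensity = r_decay * c          # assumed mass-action rate law
        t += rng.expovariate(propensity)  # waiting time until the next firing
        if t > t_end:
            break
        c -= 1                            # one unit of the molecule fades away
        trace.append((t, c))
    return trace


trace = simulate_decay(c=5, r_decay=1.0, t_end=1e9, rng=random.Random(0))
```

With a very large `t_end` the molecule always fades completely, so the concentrations in `trace` run from 5 down to 0, while the firing times depend on the random seed.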
Similarly, a biochemical-inspired knowledge management model should provide
some spatial interaction pattern. MoK adopts diffusion as its data-migration
mechanism, in which atoms and molecules can move only between neighbour
compartments, resembling membrane crossing among cells. The MoK diffusion reaction
(DiffR) is then modelled as follows, assuming that $\sigma$ identifies a biochemical
compartment, and $\| \cdot \|_{\sigma}$ brackets the molecules in a compartment $\sigma$:
$$\| Molecules_1 \cup molecule_1 \|_{\sigma_i} + \| Molecules_2 \|_{\sigma_{ii}} \longmapsto_{r_{diffusion}} \| Molecules_1 \|_{\sigma_i} + \| Molecules_2 \cup molecule_1 \|_{\sigma_{ii}}$$
where $\sigma_i$ and $\sigma_{ii}$ are neighbour compartments, $r_{diffusion}$ is the diffusion rate,
and $molecule_1$ moves from $\sigma_i$ to $\sigma_{ii}$ as the result of the reaction.
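
A minimal sketch of one diffusion step follows. It assumes that each compartment is a `Counter` of molecules and that the topology is given as an explicit neighbour map; the compartment names, the `diffuse` function, and the sample molecule are hypothetical, and the diffusion rate is not modelled.

```python
from collections import Counter


def diffuse(compartments, neighbours, src, dst, molecule):
    """Fire DiffR once: move one unit of molecule from src to a neighbour dst."""
    if dst not in neighbours[src]:
        return False  # molecules only cross between neighbour compartments
    if compartments[src][molecule] == 0:
        return False  # nothing to move
    compartments[src][molecule] -= 1
    compartments[dst][molecule] += 1
    return True


news = frozenset({"quake", "Emilia"})
compartments = {"s1": Counter({news: 1}), "s2": Counter(), "s3": Counter()}
neighbours = {"s1": {"s2"}, "s2": {"s1", "s3"}, "s3": {"s2"}}

moved = diffuse(compartments, neighbours, "s1", "s2", news)
blocked = diffuse(compartments, neighbours, "s1", "s3", news)
```

The second call is rejected because `s3` is not a neighbour of `s1`; reaching it would take a further step through `s2`.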

2     News Management: A First Case Study
While MoK is a general-purpose model for knowledge self-organisation, it can be
tailored to specialised application scenarios by refining the notion of atom and
suitably defining the $mok$ semantic correlation function. Since news management
provides a prominent example of a knowledge-intensive environment, we chose
it as the first case study for MoK, introducing the MoK-News
model for self-organisation of news.

2.1   Knowledge representation for news management
IPTC (http://www.iptc.org/) develops and maintains technical standards for
improved news management, such as NewsML
(http://www.iptc.org/site/News_Exchange_Formats/NewsML-G2/) and NITF
(http://www.iptc.org/site/News_Exchange_Formats/NITF/).

    The NewsML tagging language is a media-type-orthogonal news sharing format
standard aimed at conveying not only the core news content, but also the
data that describe the content in an abstract way, namely the metadata. In order
to ease syntactic and semantic interoperability, NewsML adopts XML as the
first implementation language both for its standards and for maintaining sets of
Controlled Vocabularies (CVs), collectively branded as NewsCodes, which represent
concepts describing and categorising news objects in a consistent manner, much
as domain-specific ontologies do.
    The News Industry Text Format (NITF), too, adopts XML to enrich the content of
news articles, supporting the identification and description of a number of
typical news features, among which the most notable are: Who owns the copyright to
the item, who may republish it, and who it is about; What subjects, organisations,
and events it covers; When it happened, was reported, issued, and revised;
Where it was written, where the action took place, and where it may be released;
Why it is newsworthy, based on the editor’s analysis of the metadata. NewsML,
by contrast, provides no support for any form of inline tagging to add information
to the plain text, for instance to ease the work of a text mining algorithm
that automatically processes the document. Thus, NITF and
NewsML are complementary standards, and they combine to shape
quite a comprehensive and coherent framework to manage the whole news lifecycle:
comprehensive, given that one takes care of the overall structure of news, including
metadata, whereas the other focusses on their internal meaning, making it
unambiguous; coherent, because they both exploit the same IPTC abstractions (in
fact NITF, too, uses NewsCodes).

2.2   Towards MoK-News
Since sources provide journalists with the required raw information already
formatted according to the aforementioned IPTC standards, a simple yet effective
mapping can be drawn. In fact, MoK atoms have a clear counterpart
in the NewsML and NITF standards: the tag. Tags, along with their “content”, can in
fact be seen as the atoms that altogether compose the “news-substance” in the
news management scenario. As a result, our MoK-coordinated news management
system would contain <newsItem> atoms, <person> atoms, <subject>
atoms, etc., that is, virtually one kind of atom for each NewsML/NITF tag.
    A MoK-News atom looks like $atom(src, val, sem(tag, catalog))_c$, where
      src ::= news source uri
      val ::= news content
      attr ::= sem(tag , catalog )
             tag ::= NewsML tag | NITF tag
             catalog ::= NewsCode uri | ontology uri
Here, the content of an atom is mostly given by the pair $\langle val, tag \rangle$, where
$tag$ could be either a metadata tag drawn from NewsML or an inline description
tag taken from NITF. The precise and unambiguous semantics of the news
content ($val$) can be specified thanks to the $catalog$ information, which could be
grounded in either NewsML or NITF standards in the form of NewsCodes, or
instead refer to a custom ontology defined by the news worker. MoK-News
molecules, enzymes, and biochemical reactions are then both syntactically and
semantically affected by such domain-specific mapping of the MoK model.
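
The MoK-News atom grammar can be mirrored by a small data structure, sketched below in Python. The class names `Sem` and `NewsAtom`, the `render` helper, and every URI and value in the example are hypothetical placeholders; in particular, the catalog URI is not an actual NewsCodes identifier.

```python
from typing import NamedTuple


class Sem(NamedTuple):
    tag: str      # a NewsML or NITF tag name
    catalog: str  # URI of a NewsCode or of a custom ontology


class NewsAtom(NamedTuple):
    src: str      # URI of the news source
    val: str      # the news content
    attr: Sem     # semantic attribute: sem(tag, catalog)


def render(atom: NewsAtom, c: int) -> str:
    """Print an atom with concentration c in the notation used in the text."""
    sem = f"sem({atom.attr.tag}, {atom.attr.catalog})"
    return f"atom({atom.src}, {atom.val}, {sem})_{c}"


person = NewsAtom(
    src="http://example.org/feed.xml",
    val="Mario Rossi",
    attr=Sem(tag="person", catalog="http://example.org/catalog/person"),
)
text = render(person, 3)
```

A domain-specific correlation function for MoK-News could then compare the `tag` and `catalog` fields of two atoms, although the exact criterion is left open here.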

3    Related & Future Works
To the best of our knowledge, although the (bio-)chemical metaphor is widely
used to achieve self-organisation and self-adaptation by emergence, no MoK-like
approaches have been studied that could bring self-* behaviours directly into
data. On the other hand, in the news community much attention is paid to
interoperability and semantic standardisation, but no paradigm shift has been
attempted to “see” data as active entities.
    Nevertheless, some systems similar to MoK do exist, although with different
aims. In [2] a general-purpose, tuple-space-based approach to knowledge
self-organisation was built exploiting a WordNet ontology to identify relationships
between knowledge chunks (tuples) and drive their migration to build
clusters. In [3] a similar clustering behaviour is achieved with collective sort,
assuming that a 1:1 relation exists between admissible tuple templates and tuple
“sorts”, essentially making the number of clusters known a priori. The MoK
approach is different in that it pushes the above-cited “similarity-based clustering”
to the limit: MoK not only aggregates somehow-related knowledge chunks
in the same spot (e.g. diffusing news to interested journalists) but also tries to
physically merge units of information to create new knowledge: the molecules.
    Our future efforts will first be devoted to providing an effective implementation
of the MoK model upon an existing coordination middleware enriched with
the “online biochemical simulator” behaviour [4], then to testing the implementation
on the MoK-News application scenarios, and on other knowledge-intensive
environments as well, e.g., MoK-Research and MoK-HealthCare.

References
1. Viroli, M., Casadei, M., Nardini, E., Omicini, A.: Towards a chemical-inspired
   infrastructure for self-* pervasive applications. In Weyns, D., Malek, S., de Lemos, R.,
   Andersson, J., eds.: Self-Organizing Architectures. Volume 6090 of LNCS. Springer
   (July 2010) 152–176
2. Pianini, D., Virruso, S., Menezes, R., Omicini, A., Viroli, M.: Self organization
   in coordination systems using a WordNet-based ontology. In Gupta, I., Hassas,
   S., Jerome, R., eds.: 4th IEEE International Conference on Self-Adaptive and
   Self-Organizing Systems (SASO 2010), Budapest, Hungary, IEEE CS
   (27 September–1 October 2010) 114–123
3. Gardelli, L., Viroli, M., Casadei, M., Omicini, A.: Designing self-organising MAS
   environments: The collective sort case. In Weyns, D., Parunak, H.V.D., Michel, F.,
   eds.: Environments for MultiAgent Systems III. Volume 4389 of LNAI. Springer
   (May 2007) 254–271
4. Gillespie, D.T.: Exact stochastic simulation of coupled chemical reactions. The
   Journal of Physical Chemistry 81(25) (1977) 2340–2361