Referent Tracking and its Applications
Werner Ceusters
CoE in Bioinformatics & Life Sciences
701 Ellicott Street
Buffalo, NY 14203
(1) 716 881 8971
ceusters@buffalo.edu

Barry Smith
Department of Philosophy,
University at Buffalo
Buffalo, NY 14260
(1) 716 645 2444
phismith@buffalo.edu

ABSTRACT
Referent tracking (RT) is a new paradigm, based on unique identification, for representing and keeping track of particulars. It was first introduced to support the entry and retrieval of data in electronic health records (EHRs). Its purpose is to avoid the ambiguity that arises when statements in an EHR refer to lesions, disorders, and other entities on the side of the patient exclusively by means of compound descriptions utilizing general terms such as ‘pimple on nose’ or ‘small left breast tumor’. In this paper, we describe the theoretical foundations of the RT paradigm and show how it is being applied to the solution of problems of ambiguous identification in the fields of digital rights management, corporate memories and decision algorithms.

Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods – Representations (procedural and rule-based).

General Terms
Management, Documentation, Design, Standardization.

Keywords
Referent Tracking, Basic Formal Ontology, Referential Semantics, Knowledge Management

Copyright is held by the author/owner(s).
WWW 2007, May 8-12, 2007, Banff, Canada.

1. INTRODUCTION
In 1979, the late William Kent, author of Data and Reality, defended a view according to which integers should be employed to uniquely identify entities in the real world and to serve as their surrogates in databases: ‘If everything we dealt with in a database had single, unique, simple names, then we would have no need for domain rules on joins (nor would we have to distinguish two kinds of join)’ [31]. Where the traditional join operation in relational databases connects two tuples if specified fields in the tuples contain the same symbol, Kent’s remark refers to a proposal for a new type of join that would relate two tuples if the specified fields refer to the same entity. In 2003 Kent introduced in this spirit the notion of a Globally Unique and Singular Identifier (GUSI) and defined it as a computational surrogate that can be placed in one-to-one correspondence with the things they denote, thereby satisfying the principles of (1) globality: a GUSI is recognized throughout the universe, (2) uniqueness: different things cannot be denoted by the same GUSI, and (3) singularity: a thing cannot be denoted by more than one GUSI [32].

1.1 Unique Identifiers
While Kent himself saw this paradigm as unfeasible for both theoretical and pragmatic reasons, a number of approaches which come close to his ideal have emerged in recent years.

Microsoft introduced in the early nineties the Globally Unique Identifier paradigm (GUID) which implements UUIDs (Universally Unique IDs) as defined by the Open Software Foundation in its Distributed Computing Environment specification [56]. UUIDs have recently been standardized through ISO/IEC 9834-8:2004, which specifies format and generation rules that enable users to produce every 100 nanoseconds 128-bit identifiers which are either guaranteed to be, or have a high probability of being, globally unique [30].

In 1998, the International DOI Foundation was created to support the development and promotion of the DOI system [40] resting on the notion of a Digital Object Identifier (DOI). A DOI is a single, unambiguous and persistent string that references a single entity and that is generated on the basis of a consistent syntactic frame (a ‘numbering scheme’ as defined in the NISO standard ANSI/NISO Z39.84) in a form suitable for use in an automated system [51]. The DOI system is a specific implementation of the Uniform Resource Identifier paradigm advanced by W3C [6] supplemented by management policies for use in the domain of Digital Rights Management.

An enormous boost to the use of unique identification has been given by the wide adoption of Radio Frequency Identification (RFID) technology, which initiated in its turn what may be the next wave of hype in information technology: The Internet of Things [41].

1.2 Entity Descriptions
Introducing global unique identification is indeed a first and much needed step towards bringing some clarity to our understanding of what the descriptions in knowledge management systems and in what is called the ‘Semantic Web’ are actually about. There are several reasons for the current lack of clarity. One is the overemphasis on syntactic regimentation and the false claims, for instance made in the early days of XML but still prevailing today among non-expert professionals, to the effect that such regimentation provides the needed sort of referential semantics as a byproduct [36, 39]. We can, certainly, make legacy electronic documents more easily accessible by manually or semi-automatically annotating documents with tags that reformulate words or relevant phrases in a document in a more structured and standardized manner (e.g. by tagging all occurrences of the words car, van, bus, etc. with the compound motor vehicle), or by using meta-tags that add additional context
to phrases or paragraphs (e.g. important, motivation, ignore, etc.). Such tags enable retrieval of documents or document sections on the basis of queries issued by users with specific information needs. But they only add more syntax; they do not contribute in any way to providing some formal reference to the entities in reality with which they might be associated.

A second reason is the blind, yet unwarranted, trust in the suitability of Description Logics (DL) as a vehicle for making unambiguous descriptions about entities in some domain of discourse [13]. DLs can do no more than guarantee consistent reasoning according to the descriptions and definitions provided to them. But if the latter fall short of correspondence to the reality that they are designed to represent, then even the most powerful DL will do very little to help resolve such problems.

Finally, there is the dominant view that ontologies designed to allow software agents to understand how the entities in a given domain are structured and in what relationships they stand to each other should be organized around ‘concepts’ rather than around those entities themselves. This view, rather than solving problems of ambiguity, introduces additional ones [42, 46].

1.3 Towards a Solution
These three false beliefs continue to enjoy wide acceptance as foundational requirements for the Semantic Web approach to the creation of the knowledge management system of the future. Yet we believe that they each contribute to a potentially fateful inability of the Semantic Web to do justice to the way in which our data and information refer to entities in reality, and to the associated phenomena of identification [54]. Promoters of the Semantic Web conceive everything through the spectacles of the Uniform Resource Identifier (URI), with all its associated problems. For instance, [9] proposes a solution to these problems that focuses on keeping track of provenance, i.e. of how names and identifiers come to be assigned to entities. In the work described here, we direct our efforts towards the complementary issue of keeping track of the entities themselves on the basis of what we have called Referent Tracking (RT), a paradigm rooted in the solid foundations offered by an approach to ontology based on philosophical realism. We first summarize the theories underlying RT presented in earlier papers [12, 18], and then outline how the approach has allowed us to uncover inadequacies in less rigorous approaches to entity identification in domains such as electronic health record management, digital rights management, corporate memory systems and algorithmic treatment optimization.

2. BASIC FORMAL ONTOLOGY
Basic Formal Ontology (BFO) is a framework that is designed to serve as a basis for the creation of high-quality shared ontologies especially in the domain of natural science [24]. It holds (1) that reality and its constituents exist independently of our (linguistic, conceptual, theoretical, cultural) representations thereof; (2) that our theories and classifications can be subject to revision; (3) that there exists a plurality of alternative but equally legitimate perspectives on reality, and (4) that these alternative views are not reducible to any single basic view.

BFO subdivides reality according to a number of basic dichotomies. First, it distinguishes particulars from universals; the former are entities such as Werner Ceusters, the first author of this paper; the latter are entities such as person, which have the former as their instances. Both universals and instances are restricted to what exists (or has existed) in reality, and are thus different from classes and instances as referred to in ontologies adhering to a concept-based view [42]. From the BFO perspective, the view advocated in [34] that ‘individual instances are the most specific concepts in an ontology’ rests on a confusion. This confusion supports in turn a recommendation according to which ‘deciding whether a particular concept is a class in an ontology or an individual instance depends on what the potential applications of the ontology are’. The implementation of such a recommendation would cripple the ability of ontology to realize its goal of integrating information derived from heterogeneous sources.

Second, BFO distinguishes, within the realm of particulars, between continuants and occurrents. Continuants are those entities that endure continuously through a period of time while undergoing changes of various sorts. Occurrents are such changes: they are entities which unfold in time through their successive temporal parts or phases, otherwise called ‘processes’, ‘actions’, ‘events’, ‘changes’. The difference between occurrents and continuants is crucial, and any ontology neglecting this distinction is not capable of dealing with changes of entities over time in an adequate way. While, for instance, a continuant particular may instantiate different universals at different times (the first author of this paper was once an instance of child, later an instance of adult; his societal role was once an instance of student, now of professor), occurrents cannot undergo such changes because occurrents are changes.

Third, there is the distinction between dependent and independent entities, where each dependent entity is defined as being such that it cannot exist without some independent entity which is its bearer. All occurrents are dependent in this sense on the continuants which participate in them. Thus the process of signing a contract cannot exist without some person who signs. But there are also dependent continuants, for example the contract itself, which cannot exist without contracting organizations or persons. Persons themselves, in contrast, are from the very first moment of their existence independent. Certainly they may require the services of their parents; they will require food, oxygen, and so forth; but they are not dependent on these things in the ontological sense that is relevant to us here.

Fourth, there is the distinction between fiat and bona fide entities, which is based on the opposition between bona fide (or physical) and fiat boundaries, the latter being exemplified especially by those boundaries, such as the boundary of Utah or of the 20th century, which are introduced via human demarcation [48]. Fiat boundaries are overwhelmingly present in the realm of social entities, where they delineate for example markets, parcels of real estate, postal districts, and where they serve in establishing what is an employee, what is a taxpayer, what is an able-bodied person, and so forth.

Relations. BFO also distinguishes three major families of relations between the entities just sketched: (1) relations obtaining between particular and particular (for example: Werner Ceusters being Director of the Ontology Research Group); (2) relations obtaining between particular and universal (for example: Werner Ceusters being an instance of the universal person); and (3) relations obtaining between universal and universal (for example: person being a subkind of cognitive being) [45]. The importance of this distinction is exemplified by the fact that relationships such as parthood have distinct properties at the particular and at the
universal levels, and that ignoring these distinctions has led to a number of erroneous representations of relations in Description Logic-based approaches to ontology development [21].

3. GRANULAR PARTITION THEORY
Granular Partition Theory is a framework for understanding the ways in which, when cataloguing, classifying, mapping or inventorizing a certain portion of reality (POR), human beings and other cognitive agents divide up or partition this reality at one or more levels of granularity [8]. The resultant partitions are composed of units (analogous to the cells in a grid), which may be organized into larger sub-partitions in a modular fashion, and the theory provides a formal account of the different ways in which such modules can correspond, or fail to correspond, to the entities in reality towards which they are directed. The theory takes account for example of the degree to which a partition represents the mereological structure of the domain onto which it is projected, and also of the degree of completeness with which a partition represents this domain. Drawing on this framework, we have proposed a calculus for use in quality assurance of complex representations created for clinical or research purposes in the context of both ontology evolution [14] and ontology mapping [11]. The calculus is based on a distinction between three levels [47]: (1) the level of reality, (2) the cognitive representations of this reality, and (3) the publicly accessible concretizations of these representations in artifacts of various sorts, of which ontologies and documents are specific examples. The representations on levels 2 and 3 are partitions in the sense of Granular Partition Theory. Thus they are composed in hierarchical fashion out of modular sub-representations built ultimately out of smallest modules called representational units, whereby: (1) each module is assumed to be veridical, i.e. to conform to some relevant POR on the basis of our best current understanding (which may, of course, be based on errors); (2) distinct modules may correspond to the same POR by presenting different though still veridical views or perspectives of this reality, for instance one and the same event may be described both as an event of buying and as an event of selling; and (3) the modules included in a given representation are determined by the purpose which the representation is intended to serve. Relevant portions of reality can include not only physical things (buildings, physical goods) but also mental acts and states (feelings of pain, states of desire or fear) and entities of many other types, including social roles and relations.

Table 1: Abstract syntax and semantics of information templates in a referent tracking system

Template name: A
Abstract syntax: Ai = < IUIp, IUIa, tap >
Description: Captures the assignment of an IUI to a particular, where
• IUIp is the IUI of the particular in question,
• IUIa is the IUI of the author of the assignment act, and
• tap is a time-stamp indicating when the assignment was made.

Template name: PtoP
Abstract syntax: Ri = < IUIa, ta, r, o, P, tr >
Description: Description of a relationship between particulars, where
• IUIa is the IUI of the author of the assertion to the effect that the relationship referred to by r holds between the particulars referred to by the IUIs listed in P,
• ta is a time-stamp indicating when the assertion was made,
• r is the designation in o of the relationship obtaining between the particulars referred to in P,
• o is the ID of the ontology from which r is taken,
• P is an ordered list of IUIs referring to the particulars between which r obtains, and
• tr is a time-stamp representing the time at which the relationship was observed to obtain.

Template name: PtoU
Abstract syntax: Ui = < IUIa, ta, inst, o, IUIp, u, tr >
Description: Description of an instantiation, where
• IUIa is the IUI of the author of the assertion to the effect that IUIp inst u,
• ta is a time-stamp indicating when the assertion was made,
• inst is the designation in o of the relationship of instantiation,
• o is the ID of the ontology from which inst and u are taken,
• IUIp is the IUI referring to the particular whose inst relationship with u is asserted,
• u is the designation of the class in o with which IUIp enjoys the inst relationship, and
• tr is a time-stamp representing the time at which the relationship was observed to obtain.

Template name: PtoCo
Abstract syntax: Coi = < IUIa, ta, cbs, IUIp, co, tr >
Description: Annotating a particular with a code from a concept-based system, where
• IUIa is the IUI of the author asserting that terms associated to co may be used to describe the particular referred to by IUIp,
• ta is a time-stamp indicating when the assertion was made,
• cbs is the ID of the concept-based system from which co is taken,
• IUIp is the IUI referring to the particular which the author associates with co,
• co is the concept-code in the concept-system referred to by cbs which the author associates with IUIp, and
• tr is a time-stamp representing a time at which the author considers the association appropriate.

Template name: PtoU-
Abstract syntax: U-i = < IUIa, ta, r, o, IUIp, u, tr >
Description: The particular referred to by IUIa asserts at time ta that the relation r of ontology o does not obtain at time tr between the particular referred to by IUIp and any of the instances of the class u.

Template name: PtoN
Abstract syntax: Ni = < IUIa, ta, ntj, ni, IUIp, tr >
Description: The particular referred to by IUIa asserts at time ta that ni is the name of the nametype ntj assigned to the particular referred to by IUIp at tr.

Template name: Meta-template
Abstract syntax: Di = < IUId, Xi, td >
Description: Publication of a description of a portion of reality in the RTS, where IUId is the IUI of the entity registering Xi in the system, Xi is the information-unit in question (in the form of any other template above), and td is a reference to the time the registration was carried out.

4. REFERENT TRACKING
Referent tracking (RT) is a new approach to the handling of data about real world entities introduced in [18]. It is designed to allow instances in reality to serve as benchmark for the correctness of the ontologies used to describe them. The RT paradigm has been developed thus far to support the entry and retrieval of data in the Electronic Health Record (EHR), where its purpose is to avoid the problems which arise when statements in an EHR refer to disorders, lesions and other entities on the side of the patient by means of logically complex descriptive phrases such as ‘the fracture in the leg of patient X’ or ‘the tumor in the lung of patient Y’. These problems arise because the phrases in question employ generic terms in ways which may fail to identify the relevant instances unambiguously. (John may have multiple fractures in his leg; or he may have fractured his leg twice at different times in his life.) Referent tracking
avoids such ambiguities by introducing unique identifiers, called IUIs (Instance Unique Identifiers, pronounced you-eye), for each numerically distinct entity that exists in reality and that is referred to in statements in a record. In the currently still dominant paradigms, the items uniquely identified for EHR purposes are restricted to entities such as patients, care providers, buildings, machines and so forth. The referent tracking paradigm expands this list to include also fractures, polyps, seizures and a vast variety of other clinically salient real-world instances in all the categories distinguished by BFO.

[18] sets forth the conditions for assigning an IUI to a particular, and describes the templates according to which some portions of reality are to be represented in an RT implementation. An additional template for dealing with what in healthcare is known as “negative clinical findings” is introduced in [12]. Note that RT is free from the erroneous assumption of inherent classification adhered to in many database design circles according to which entities can be referred to only as instances of pre-specified classes [35]. Thus it is possible to relate particulars to other particulars, and thus do useful inferencing, even where we do not specify of what universals these particulars are instances.

Finally, we have proposed an outline template for registering names by which a particular is referred to in reality (e.g. “John” as first name for the particular John). This template will be expanded along the lines described in [9] in such a way as to allow temporal aspects to be taken into account. The current set of templates is shown in Table 1. The templates are to be interpreted as constituting an abstract syntax; it is left to the developers of an RTS to implement the specifications in the way best suited to the constraints of the environment in which the system has to operate.

4.1 RT Implementations
A system that implements the RT paradigm (called an RTS) should offer at least three services: (1) generation of unique identifiers to be used as IUIs, (2) management of the IUIs generated, and (3) provision of access to the IUIs stored.

As to (1), the schemes for generating unique strings described in section 1.1 can be used unproblematically. If RTS services were to be offered by an entity external to a specific organization, then it may be beneficial for this entity not only to register IUIs but also to certify the uniqueness of the strings to be used within a given IUI-repository and to guarantee that the assignments claimed to have been made by given authors were indeed made by those authors. This can be compared to the services offered by trusted third parties in private key management for asymmetrical encryption purposes [4].

Service (2) involves what we refer to as the IUI-repository, whose purpose is to keep track of the identifiers assigned to already existing entities, or reserved for entities that are expected to come into existence in the future. It will do this in such a way that (i) each IUI represents exactly one particular, and (ii) no particular is referred to by more than one IUI. These two requirements are not always easy to fulfill, since both depend on the ability and willingness of users to provide accurate information. This, however, introduces no problems different in principle from those already faced by the users of existing systems when called upon to provide information of a non-trivial and occasionally sensitive sort about individuals.

Service (3), here called the referent tracking database (RTDB), should provide access to the information entered into a given knowledge management system about the particulars referred to in the IUI repository. Where the latter is an inventory of concrete entities that have been acknowledged to exist, and, consequently, of what IDs to use if one wants to refer to them, the RTDB is an inventory of descriptions of features of and interrelations between these entities and of the ways in which they change in the course of time. The RTDB, too, does not need to be set up as a single central database but can rely on any paradigm for distributed storage.

A prototype implementation of an RTS is available through SourceForge under an Open Source license. It is designed in such a way that it can be used as a server application as well as a Java library. As a server, the system runs as a standalone application inside an Apache Tomcat HTTP web server at port 8080 [50] and it can communicate simultaneously with multiple EHR clients running at remote locations. The server is intended to be hosted by a health institute which serves as the hub for other health institutes (clients). The hosting health institute is responsible for taking care of the administration and privacy issues of the shared information stored at the server. The prototype itself is implemented to serve as a centralized registry system, but the addressing scheme of the identifiers can accommodate distributed implementations.

5. CASE STUDIES
5.1 Electronic Health Records
In [18] we sketched how the referent tracking paradigm might be implemented in the healthcare environment, particularly in relation to clinical record-keeping. The key idea is to do full justice to what it is on the side of the patient that is documented in an EHR, an issue that is severely neglected in prevailing approaches to clinical record keeping, where the (billable) actions of health practitioners take center-stage. The need for unique identification of clinically salient entities in a patient’s documentation was however recognized already very early on in the history of medical informatics. The central idea of Weed’s Problem Oriented Medical Record (POMR) is to organize all medical data around a problem list, thereby assigning each individual problem a unique ID [55]. Unfortunately Weed proposes to apply the IUI methodology only to problems, and thus not to the various particulars that cause or are symptomatic of them, or are involved in their diagnosis or therapy. The same holds of the problem-based approach of Barrows and Johnson, which suffers further from an ambiguity in its treatment of unique IDs, which sometimes seem to refer to problems themselves and sometimes to statements about such problems [3]. The argument often used in favor of a POMR is that it makes it possible to track a problem such as chest pain over time as it evolves into a problem of angina, from there into a problem of myocardial infarction, of CABG (Coronary Artery Bypass Graft), and so forth. However, we consider it wrong to use the labels ‘chest pain’, ‘angina’, ‘myocardial infarction’, and so on to denote some one enduring thing defined by POMR as ‘the problem’. Rather, these labels refer to very different kinds of particular entities that appear and disappear in the unfolding of the history of the problem, all of them related in various ways to another particular by which the problem is caused, namely the underlying disorder. Hence we
argue that an adequate POMR should embrace also unique identifiers for particulars of all of these latter types.

Another example of an EHR regime involving the use of unique identifiers is that proposed by Huff et al. [26], who, refreshingly, take “the real world to consist of objects (or entities)”. They continue by asserting: “Objects interact with other objects and can be associated with other objects by relationships … When two or more objects interact in the real world, an ‘event’ is said to have occurred.” Each event, on the Huff approach, receives an explicit identifier called an event instance ID, which is used to link it to other events (reflecting the goal of supporting temporal reasoning with patient data). This ID serves as an anchor for describing the event via a frame-representation, where the slots in the frame are name-value tuples such as event-ID = “#223”, event-family = “diagnostic procedures”, procedure-type = “chest X-ray”, etc. Via other unique IDs the framework incorporates also explicit reference to the patient, the physician and even to the radiographic film used in an X-ray image analysis event. Unfortunately, because they concentrate too narrowly on the events themselves [19], Huff and his associates do not allow explicit reference to those entities in reality which are observed during events. This is in spite of the fact that the very X-ray report that they analyze contains the sentence: “Surgical clips are again seen along the right mediastinum and right hilar region.” [26]

Because they have no means to refer directly to those clips, Huff et al. must resort to a complex representation with nested and linked event frames in order to simulate such reference, in ways which once again create opaque contexts which severely reduce the degree to which the resultant information can be used to support reasoning, e.g. for purposes of clinical decision support, tracking of surgical items, and the like.

5.2 Digital Rights Management
Digital Rights Management covers the description, identification, trading, protection, monitoring and tracking of all forms of rights over both tangible and intangible assets, including management of relationships between rights holders in a digital environment. The Digital Object Identifier (DOI) system provides a framework for the persistent identification of artistic and other types of content in its broadest interpretation. Although the system has been very well designed to manage object identifiers, some important questions related to the assignment of identifiers are left open.

In [16] we demonstrated the usefulness of the RT paradigm by showing how it was able to bring to light inconsistencies in the DOI models and how such inconsistencies would be avoided through use of an RTS. The main problem with the DOI approach turned out to be its dependence on the indecs Framework [40], which is itself based on the first version of ISO 11179 [29]. The latter restricts an identifier to ‘a language independent unique identifier of a data element within a registration authority’. Each data element is itself such that it relates to an ‘object’, which is in turn defined, in the usual ISO parlance, as ‘any part of the conceivable or perceivable world’, including not only existing things but also, for example, unicorns.

For indecs, in consequence, identifiers relate not to entities in reality (such as Werner Ceusters) but rather to pieces of data (such as Werner Ceusters’ name). And because, according to ISO, an object need not exist in order to have data ‘associated’ with it, the result, when indecs is used as the basis for a system of object identifiers, is an abundance of confusions (analyzed in [16]). Some examples:
• ‘The model elaborates a logical and semantic framework for describing entities, their attributes and, where appropriate, values of each. Entities, attributes and values are referred to as types of metadata elements’
• ‘a thing must be both thought about or perceived and identified before it exists in a metadata framework’
• ‘all metadata relationships are either events in themselves, or rely on events to establish them’
• ‘nothing exists in any useful sense until it is identified’.

The orientation of the underlying Framework towards particular, identity-bearing entities in the real world, rather than to generic or conceptual entities, exhibits a clear understanding of what is at stake in facing the challenge of object reference and identification. Unfortunately, however, the framework itself provides no clear ontological underpinning to support this understanding. We therefore argue that, by subjecting indecs to a deep ontological analysis based on philosophical realism, and by adjusting its data dictionary accordingly, we can make the system fit better the requirements of the Semantic Web.

5.3 Corporate Memories in Enterprises
Another area where appropriate identification is of utmost importance is in corporate memory (CM) systems designed to keep track of the history and evolution of an enterprise with the goal of using lessons learned from past experiences to enhance performance in the future. Well designed CMs should contain data about both the enterprise and the environment in which it operates [33, 37, 52]. Enterprise Ontologies can play an important role in this context as a means of organizing and standardizing the meta-tags used for annotating documents in such a way as to create more powerful CM applications that would work over corporate networks linking together multiple heterogeneous systems [20]. But also in this area, our analysis revealed the existence of much unclarity concerning object reference and identification [15].

The ACORD insurance industry Data Dictionary, for example, which is used to assist in automating business interactions between insurers and clients [1], defines a building as ‘a construction that normally has a roof and walls’. ‘Air conditioning’, however, it defines as ‘information necessary to describe a given type of air conditioning in a building.’ Consistency in providing definitions would dictate that either all entries involve information about something in reality, or that they denote that something in reality itself. ACORD, however, provides a problematic mishmash, in which buildings would contain information about air conditioning as parts.

The same confusion is found in [23]. The latter correctly argues that the Enterprise [49] and TOVE [22] ontologies do not emphasize the distinction between things and their changes on the one hand and conceptual entities on the other, drawing hereby on the work of Bunge [10] and specifically on its application in the Bunge-Wand-Weber model in the domain of information systems for accounting [53]. This analysis led them to develop the PSIM (Participative Simulation environment for Integral Manufacturing renewal) Ontology, which was inspired
also by earlier work conducted in the European Research Project CIMOSA [2] and by Peircean Semiotics [25]. The result, however, is not without its own dramatic mysteries and misinterpretations. Thus we read that the PSIM Ontology distinguishes the three main categories of: Activity, Object and Information (element), whereby an ‘Information (element)’ is defined as: ‘a characteristic of either an object or activity or information, which is used to constrain directly or indirectly the involvement of an object in an activity’ [23]. PSIM classifies as information elements not only ‘the time needed to perform an activity’ and ‘how an activity has to be performed’, but also ‘how the enterprise is organised’, ‘the way the responsibilities are distributed among the enterprise’, and even ‘the weight of a piece of material’. Weight, for RT, however, is a dependent continuant that depends on the material object of which it is the weight, and this independently of whether or not a cognitive being has any sort of information about the matter.

5.4 Psychiatric Treatment Optimization
The International Psychopharmacology Algorithm Project (IPAP) is an international initiative set up in 1985 by a team of psychiatrists, psychopharmacologists and algorithm designers in an effort to improve choice of medication in psychiatry [27]. In 1995, the IPAP Schizophrenia Algorithm (IPAP-SA) was published by IPAP as a guideline consisting of four schizophrenia treatment algorithms developed, respectively, for the first schizophrenic episode, long-term medication maintenance, schizophrenia complicated by comorbid psychiatric disorders, and schizophrenia complicated by neuroleptic malignant syndrome [5].

In 2006, we analyzed the January 2005 version of this IPAP guideline which was made available on the web. (This has since been replaced by a newer version (v. 20060327 [28]), which however does not differ in substantial ways for the purposes of this discussion.) The algorithm is presented in the form of a flow chart with an established diagnosis of schizophrenia or schizoaffective disorder as its single entry condition and two exit conditions, one suggesting a modification to the patient’s current treatment program, the other suggesting unaltered continuation of this program. The on-line version provides some obvious advantages over a traditional journal or textbook publication. It can be accessed immediately through any suitable browser, and new versions become accessible as soon as they are released. Given that the algorithm is currently implemented as a simple flow-chart, however, in which the included hyperlinks serve only human browsing, it still fails to exploit the real power of the computer, which is to perform reasoning automatically. We accordingly investigated the possibility of developing an implementation which could draw on information already available in the patient’s electronic health record (EHR) in such a way as to process relevant features of the patient’s current condition in light of those criteria which play a role in the corresponding step of the algorithm.

In [17], we reported on our research to carry out the first step of enhancing the present version of the IPAP algorithm along these lines in such a way that it can be used in automatic decision support. To this end it was necessary to identify the minimal set of universals and particulars which must be represented in a referent tracking system in order to allow software agents to carry out real-time monitoring and control activities to optimize the treatment of schizophrenic patients in accordance with IPAP guidelines. The analysis was performed with the goal of demonstrating how the RT approach could be used for upgrading static and inert flow-chart algorithms like IPAP in such a way that they would constitute dynamic application ontologies. It revealed, again, how important it is not just to uniquely identify patients, but also their individual diseases and associated phenomena.

For the execution of the IPAP schizophrenia algorithm, it is mandatory that the patient’s disease be an instance of one or other of the universals schizophrenia or schizoaffective disorder. This entry condition is phrased in the algorithm itself as: ‘meet DSM-IV and ICD10 criteria for schizophrenia and schizoaffective disorder’, referring respectively to the Diagnostic and Statistical Manual of Mental Disorders published by the American Psychiatric Association and to the International Classification of Diseases published by WHO. This, unfortunately, poses certain problems. The first is logical in nature: does the patient’s established diagnosis need to satisfy the diagnostic criteria of both DSM-IV and ICD-10, or is it sufficient that either one or the other be satisfied? This question is important, since there is only a partial concordance between the two, concrete figures for this concordance ranging from 60% to 83% depending on the subtype of schizophrenia [7]. Thus it is possible that a patient’s disease has to be classified as schizophrenia according to one system, but that it is not allowed to be so classified by the other.

The second question is ontological in nature: to what extent do the terms (“schizophrenia” or “schizoaffective disorder”) used by ICD-10 and DSM-IV represent one, or two, or no universals at all on the side of biomedical reality? Here, too, the referent tracking idea brings certain advantages. We first make what seems to us to be a reasonable assumption to the effect that, if a given body of patient records systematically includes diagnoses of schizophrenia and/or of schizoaffective disorder, then there is something to which these terms refer on the side of the corresponding patients. Each such something can be given an IUI – even should it turn out that the something in question is, for example, some different disease. Let us suppose, for example, that we assign #I-9001 to the putative case of schizophrenia diagnosed in John, and that we include this IUI in a referent tracking database that is used while carrying out a variety of different types of diagnostic tests. By analyzing the results of such tests, we may in the long run be led to the conclusion that #I-9001 is in fact a compound of two or more disease particulars (or, in the worst case, that it is an empty ID designating no disease at all) [43]. In this way experience might indeed prove in the course of time that “schizophrenia” itself is a term that has no referent, for example because what had been thought to be a single disease is in fact a compound of several diseases hitherto not cleanly separated – in ways which might then lead to modifications to the IPAP algorithm itself.

6. CONCLUSION
Our case studies indicate that the currently predominant enabling technologies for building knowledge management systems are still too narrowly oriented around the paradigm of information modeling, which is a matter of the tracking (or modeling, or representation) of information. A referent tracking system, in contrast, tracks entities in reality. The latter can indeed include also pieces of information about entities (for example in the form of images), which are acknowledged as entities in their own right, but it should do this in such a way that first-level entities are never confused with those entities
which carry information about them – a confusion of a type which, as we have seen, is endemic on current paradigms. Consider, to take just one illustrative example, the influential paper [38] of Rector et al., which contains assertions such as: ‘Every occurrence level statement concerning Jane Smith’s Fracture of the Femur is an observation of the corresponding individual’; whereby: ‘The existence [sic] of the individual Jane Smith’s Fracture of Femur does not imply that Jane Smith has, or has ever had, a fracture of the femur [sic], but merely that some observation has been made about Jane Smith regarding a fracture of the femur.’ Such confusions are manifested in a quite peculiarly egregious form in the case described in [44].

This is not to deny that much valuable work has been invested in information model- and concept system-based tools for knowledge management systems. But we believe that the referent tracking paradigm – and the concomitant clear understanding of the distinction between an entity and the data about an entity which it brings in its wake – must be called in aid to support any application of such tools in mission critical domains such as healthcare (or indeed in any domain where quality of work is considered to be of importance). Referent tracking gives us the means to allow reality itself to serve as benchmark for the correctness of such application, where, on current paradigms, we have only ‘concepts’ and ‘models’.

7. ACKNOWLEDGMENTS
This work has been funded in part by grant 1 U 54 HG004028 from the National Institutes of Health through the NIH Roadmap for Medical Research.

REFERENCES
[1] ACORD Data Dictionary for Insurance Industry, 2005. http://www.acord.org/dataDictionary/dataDictionary.htm
[2] AMICE-Consortium. Open System Architecture for CIM, Research Reports of ESPRIT Project 688. Springer Verlag, Berlin, 1989.
[3] Barrows, R.C. and Johnson, S.B. A data model that captures clinical reasoning about patient problems. in Gardner, R.M. ed. 19th Annual Symp Computer Applications in Medical Care, Hanley & Belfus, Inc., New Orleans, 1995, 402-405.
[4] Bellare, M. and Rogaway, P. The exact security of digital signatures – How to sign with RSA and Rabin. in Lecture Notes in Computer Science, Springer, 1996, 399-416.
[5] Bender, K.J. Algorithm Project Provides Guides to Current Knowledge. Psychiatric Times, 1996.
[6] Berners-Lee, T., Fielding, R. and Masinter, L. Uniform Resource Identifier (URI): Generic Syntax, The Internet Society, 2005.
[7] Bertelsen, A. Schizophrenia and related disorders: experience with current diagnostic systems. Psychopathology, 35 (2-3). 89-93.
[8] Bittner, T. and Smith, B. A Theory of Granular Partitions. in Duckham, M., Goodchild, M.F. and Worboys, M.F. eds. Foundations of Geographic Information Science, Taylor & Francis Books, London, 2003, 117-151.
[9] Bouquet, P., Stoermer, H., Mancioppi, M. and Giacomuzzi, D. OKKAM: Towards a Solution to the “Identity Crisis” on the Semantic Web. in Tummarello, G., Bouquet, P. and Signore, O. eds. Semantic Web Applications and Perspectives (SWAP 2006), Pisa, Italy, 2006.
[10] Bunge, M. Treatise on Basic Philosophy, Ontology I: The Furniture of the World. Reidel, Boston, 1977.
[11] Ceusters, W. Towards A Realism-Based Metric for Quality Assurance in Ontology Matching. in Bennett, B. and Fellbaum, C. eds. Formal Ontology in Information Systems, IOS Press, Amsterdam, 2006, 321-332.
[12] Ceusters, W., Elkin, P. and Smith, B. Referent Tracking: The Problem of Negative Findings. in Hasman, A., Haux, R., Lei, J.v.d., Clercq, E.D. and Roger-France, F. eds. Studies in Health Technology and Informatics. Ubiquity: Technologies for Better Health in Aging Societies - Proceedings of MIE2006, IOS Press, Amsterdam, 2006, 741-746.
[13] Ceusters, W. and Smith, B. Ontology and Medical Terminology: why Description Logics are not enough. Towards an Electronic Patient Record (TEPR 2003), San Antonio, 2003.
[14] Ceusters, W. and Smith, B. A Realism-Based Approach to the Evolution of Biomedical Ontologies. in Proceedings of AMIA 2006, 2006, 121-125.
[15] Ceusters, W. and Smith, B. Referent Tracking for Corporate Memories. in Rittgen, P. ed. Handbook of Ontologies for Business Interaction, Idea Group Publishing, 2007 (forthcoming).
[16] Ceusters, W. and Smith, B. Referent Tracking for Digital Rights Management. Forthcoming in International Journal of Metadata, Semantics and Ontologies.
[17] Ceusters, W. and Smith, B. Referent Tracking for Treatment Optimisation in Schizophrenic Patients. Journal of Web Semantics - Special issue on semantic web for the life sciences, 4 (3). 229-236.
[18] Ceusters, W. and Smith, B. Strategies for Referent Tracking in Electronic Health Records. Journal of Biomedical Informatics, 39 (3). 362-378.
[19] Coyle, J.F., Rossi-Mori, A. and Huff, S.M. Standards for detailed clinical models as the basis for medical data exchange and decision support. International Journal of Medical Informatics, 69 (2-3). 157-174.
[20] Davies, J., Fensel, D. and Harmelen, F.v. (eds.). Towards the Semantic Web - Ontology-driven Knowledge Management. John Wiley & Sons, 2002.
[21] Donnelly, M., Bittner, T. and Rosse, C. A formal theory for spatial representation and reasoning in biomedical ontologies. Artificial Intelligence in Medicine, 36 (1). 1-27.
[22] Fox, M.S. The TOVE Project: Towards A Common-sense Model of the Enterprise, Enterprise Integration Lab, 1992.
[23] Goossenaerts, J. and Pelletier, C. Ontology and Enterprise Modeling, 2003. http://is.tm.tue.nl/staff/jgoossenaerts/4PublicPdf/PSIM%20book%20ch%205%20Ontol&EM.pdf
[24] Grenon, P., Smith, B. and Goldberg, L. Biodynamic Ontology: Applying BFO in the Biomedical Domain. in Pisanelli, D.M. ed. Ontologies in Medicine, IOS Press, Amsterdam, 2004, 20-38.
[25] Hoopes, J. Peirce ON SIGNS. Writings on Semiotic by Charles Sanders Peirce. The University of North Carolina Press, Chapel Hill and London, 1991.
[26] Huff, S.M., Rocha, R.A., Bray, B.E., Warner, H.R. and Haug, P.J. An event model of medical information representation. Journal of the American Medical Informatics Association, 2. 116-134.
[27] IPAP. About The International Psychopharmacology Project, 2006. http://www.ipap.org/about.php
[28] International Psychopharmacology Algorithm Project. IPAP-Schizophrenia Algorithm Interactive Flowchart, 2006. http://www.ipap.org/schiz/schizalg.php?screen=flowchart
[29] International Standards Organisation. ISO/IEC 11179-1:1999(E). Information technology -- Specification and standardization of data elements -- Part 1: Framework for the standardization of data elements.
[30] International Standards Organisation. ISO/IEC FDIS 9834-8:2004. Information technology – Open Systems Interconnection – Procedures for the operation of OSI Registration Authorities: Generation and registration of Universally Unique Identifiers (UUIDs) and their use as ASN.1 Object Identifier components.
[31] Kent, W. The Entity Join. in Fifth International Conference on Very Large Data Bases, (Rio de Janeiro, Brazil, 1979), Morgan Kaufmann Publishers, 232-238.
[32] Kent, W. The unsolvable identity problem. Extreme Markup Languages 2003, Montreal, Canada, 2003.
[33] Kühn, O. and Abecker, A. Corporate memories for Knowledge Management in Industrial Practice: Prospects and Challenges. Journal of Universal Computer Science, 3 (8). 929-954.
[34] Noy, N.F. and McGuinness, D.L. Ontology Development 101: A Guide to Creating Your First Ontology, Stanford Knowledge Systems Laboratory, 2001.
[35] Parsons, J. and Wand, Y. Emancipating Instances from the Tyranny of Classes in Information Modeling. ACM Transactions on Database Systems, 25 (2). 228-268.
[36] Paskin, N. Digital Object Identifiers for Scientific Data. Data Science Journal, 4. 12-20.
[37] Prasad, M.V.N. and Plaza, E. Corporate Memories as Distributed Case Libraries. in Tenth Knowledge Acquisition for Knowledge-Based Systems Workshop, Banff, Canada, 1996, 40-41 40-19.
[38] Rector, A.L., Nowlan, W.A., Kay, S., Goble, C.A. and Howkins, T.J. A framework for modelling the electronic medical record. Methods of Information in Medicine, 32 (2). 109-119.
[39] Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. XML semantics and digital libraries. in Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries, IEEE Computer Society, 2003, 303-305.
[40] Rust, G. and Bide, M. The indecs metadata framework: principles, model and data dictionary. WP1a-006-2.0, 2000.
[41] Shannon, V. Wireless: Creating Internet of 'Things': A scary, but exciting idea. International Herald Tribune, Sunday, November 20, 2005.
[42] Smith, B. Beyond concepts: ontology as reality representation. in Proceedings of the third international conference on formal ontology in information systems (FOIS 2004), IOS Press, Amsterdam, 2004, 73-84.
[43] Smith, B. From Concepts to Clinical Reality: An Essay on the Benchmarking of Biomedical Terminologies. Journal of Biomedical Informatics, 39 (3). 288-298.
[44] Smith, B. and Ceusters, W. HL7 RIM: An Incoherent Standard. in Hasman, A., Haux, R., Lei, J.v.d., Clercq, E.D. and Roger-France, F. eds. Studies in Health Technology and Informatics. Ubiquity: Technologies for Better Health in Aging Societies - Proceedings of MIE2006, IOS Press, Amsterdam, 2006, 133-138.
[45] Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A.L. and Rosse, C. Relations in biomedical ontologies. Genome Biology, 6 (5). R46.
[46] Smith, B., Ceusters, W. and Temmerman, R. Wüsteria. in Engelbrecht, R., Geissbuhler, A., Lovis, C. and Mihalas, G. eds. Connecting Medical Informatics and Bio-Informatics. Medical Informatics Europe 2005, IOS Press, Amsterdam, 2005, 647-652.
[47] Smith, B., Kusnierczyk, W., Schober, D. and Ceusters, W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. KR-MED 2006, Biomedical Ontology in Action, Baltimore MD, USA, 2006.
[48] Smith, B. and Varzi, A.C. Fiat and Bona Fide Boundaries: Towards an Ontology of Spatially Extended Objects. in Lecture Notes In Computer Science, Springer Verlag, London, UK, 1997, 103-119.
[49] Stader, J. Results of the Enterprise Project. 16th Annual Conference of the British Computer Society Specialist Group on Expert Systems, Cambridge, UK, 1996.
[50] The Apache Software Foundation. Apache Tomcat Server, 2006.
[51] The International DOI Foundation. The DOI Handbook (Version 4.4.1, released 5 October 2006). 2006.
[52] Van Heijst, G., Van der Spek, R. and Kruizinga, E. Organizing Corporate Memories. Tenth Knowledge Acquisition for Knowledge-Based Systems Workshop, Banff, Canada, 1996, 42-41 42-17.
[53] Wand, Y., Storey, V. and Weber, R. An Ontological Analysis of the relationship Construct in Conceptual Modeling. ACM Transactions on Database Systems, 24 (4). 494-528.
[54] Warren, P. Knowledge management and the semantic web: From scenario to technology. IEEE Intelligent Systems, 21 (1). 53-59.
[55] Weed, L. Medical records that guide and teach. New England Journal of Medicine, 278. 593-600.
[56] Williams, S. and Kindel, C. The component object model: A technical overview, 1994.