=Paper=
{{Paper
|id=Vol-222/paper-7
|storemode=property
|title=Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain
|pdfUrl=https://ceur-ws.org/Vol-222/krmed2006-p07.pdf
|volume=Vol-222
|dblpUrl=https://dblp.org/rec/conf/krmed/SmithKSC06
}}
==Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain==
KR-MED 2006 "Biomedical Ontology in Action"
November 8, 2006, Baltimore, Maryland, USA
Barry Smith, Ph.D. (1), Waclaw Kusnierczyk, M.D. (2), Daniel Schober, Ph.D. (3), Werner Ceusters, M.D. (1)
(1) Center of Excellence in Bioinformatics and Life Sciences, Buffalo NY/USA
(2) Department of Computer and Information Science, NTNU, Trondheim, Norway
(3) European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
phismith@buffalo.edu, waku@idi.ntnu.no, schober@ebi.ac.uk, ceusters@buffalo.edu
Ontology is a burgeoning field, involving researchers from the computer science, philosophy, data and software engineering, logic, linguistics, and terminology domains. Many ontology-related terms with precise meanings in one of these domains have different meanings in others. Our purpose here is to initiate a path towards disambiguation of such terms. We draw primarily on the literature of biomedical informatics, not least because the problems caused by unclear or ambiguous use of terms have been there most thoroughly addressed. We advance a proposal resting on a distinction of three levels too often run together in biomedical ontology research: 1. the level of reality; 2. the level of cognitive representations of this reality; 3. the level of textual and graphical artifacts. We propose a reference terminology for ontology research and development that is designed to serve as common hub into which the several competing disciplinary terminologies can be mapped. We then justify our terminological choices through a critical treatment of the ‘concept orientation’ in biomedical terminology research.

PREAMBLE

Ever since the invention of the computer, scientists and engineers have been exploring ways of ‘modeling’ or ‘representing’ the entities about which machines are expected to reason. But what do ‘modeling’ and ‘representing’ mean? What is a ‘conceptual model’ or an ‘information model’ and how can they and their components be unambiguously described?

Two questions here arise: To what do expressions such as ‘concept’, ‘information’, ‘knowledge’, etc. precisely refer? And what is it to ‘model’ or ‘represent’ such things? If information and knowledge themselves consist in representations, then what could an information representation or a knowledge representation be? There is, to say the least, some suspicion of redundancy here.

As we have argued elsewhere, the term ‘concept’ is marked in a peculiarly conspicuous manner by problems in this regard [1]. But the problem of multiple conflicting meanings arises also in regard to other terms, such as ‘class’, ‘object’, ‘instance’, ‘individual’, ‘property’, ‘relation’, etc., all of which have established, but unfortunately non-uniform, meanings in a range of different disciplines.

Among philosophical ontologists, the term ‘instance’ means an individual (for example this particular dog Fido), which is an instance of a corresponding universal or kind (dog, mammal, etc.). In OWL, ‘instance’ means ‘element’ or ‘member’ of a class (where ‘class’ means ‘general concept, category or classification … that belongs to the class extension of owl:Class’ [2]).

Standardization agencies such as ISO, CEN and W3C have been of little help in engendering cross-disciplinary uniformity in the use of such terms, since their standards are themselves directed towards specific communities. Standardization efforts under the auspices of W3C or UML or Dublin Core, too, have not addressed these problems. For while OWL-DL, for example, has a rigorously defined semantics [3], this does not by any means guarantee that an ontology formulated using OWL-DL is an error-free representation of its intended domain, and nor – until the day when the use of OWL or of some successor becomes uniform common practice – will it do anything to resolve the problems of semantic ambiguity adverted to in the above.

In the domain of biomedical informatics a number of attempts have been made to resolve these problems [4,5,6] in light of an increasing recognition that many ambitious terminological systems developed in this field are marked by unclarity over what, precisely, they have been designed to achieve. Are biomedical controlled vocabularies ‘concept representations’ or ‘knowledge models’? And if they are either of these things, how, if at all, do they relate to the reality – the tumors, diseases, treatments, chemical interactions – on the side of the patient?

OBJECTIVES AND METHODS

The purpose of this communication is to initiate a process for resolving such problems by drawing on the best practices in ontology which are now beginning to take root through the efforts of
organizations such as the National Center for Biomedical Ontology [7], the Open Biomedical Ontologies (OBO) Consortium [8], the OBO Foundry [9], and others [10].

What is needed is a set of terms referring in unambiguous fashion to the different kinds of entities surveyed above, which can serve as common target for mappings from other discipline- and computational idiom-centric terminologies, thereby mediating efficient pairwise translations between these terminologies themselves.

Our strategy is to advance precision via clear informal definitions rooted in what we assume are commonly accepted intuitions, providing references to associated formal treatments where possible. In selecting terms we have sometimes chosen expressions precisely because they have not been used by others and hence do not have established (and potentially conflicting) meanings. In other cases we have adapted existing terms to our purposes by providing them with more precise definitions or (in the case of primitive terms) elucidations.

These proposals are focused primarily on the ontology-related needs of natural science, including the clinical basic sciences, though we believe them to be of quite general applicability.

We start out from a distinction of three levels of entities which have a role to play wherever ontologies are used:

• Level 1: the objects, processes, qualities, states, etc. in reality (for example on the side of the patient);
• Level 2: cognitive representations of this reality on the part of researchers and others;
• Level 3: concretizations of these cognitive representations in (for example textual or graphical) representational artifacts.

This tripartite distinction will awaken echoes of the Semantic Triangle of Ogden and Richards, to which we return in the sequel. For present purposes we note that the indispensability of Level 1 reflects the fact that even those who see themselves as building for example ‘data models’ in the domain of the life sciences are attempting to create thereby artifacts which stand in some representational relation to entities in the real world. Level 2 reflects the fact that a crucial role is played in ontology and terminology development by the cognitive representations of human subjects. Level 3 reflects the fact that cognitive representations can be shared, and serve scientific ends, only when they are made communicable in a form whereby they can also be subjected to criticism and correction, and also to implementation in software.

Note that the three levels overlap; thus the textual and graphical artifacts distinguished in Level 3 are themselves objects on Level 1. Our talk of ‘levels’ should thus be interpreted by analogy with talk of ‘levels of granularity’: if we have apprehended all the liquid in a vessel, then in a sense we have thereby apprehended also all the molecules. Yet for scientific purposes molecules and liquids must be distinguished nonetheless, and the same applies, for the purposes of clarity in our thinking about ontologies, to the three levels delineated in the above.

FOUNDATIONS

Here we give precise definitions to a number of central terms, which will then be used in conformity thereto in the remainder of the paper. Really existing ontologies and related artifacts are typically constructed to realize a mixture of different sorts of ends (terminologies, for example, to support clinical record keeping and large-scale epidemiological studies, and to serve as controlled vocabularies for the expression of research results). Hence they typically combine the features of artifacts of different basic types. Our reference terminology is designed to reflect these basic types. Hence the definitions we propose for terms such as ‘ontology’ or ‘class’ do not imply any claim to the effect that everything called an ‘ontology’ or ‘class’ in the literature exhibits just the characteristics referred to in the definition.

An ENTITY is anything which exists, including objects, processes, qualities and states on all three levels (thus also including representations, models, beliefs, utterances, documents, observations, etc.).

A REPRESENTATION is for example an idea, image, record, or description which refers to (is of or about), or is intended to refer to, some entity or entities external to the representation. Note that a representation (e.g. a description such as ‘the cat over there on the mat’) can be of or about a given entity even though it leaves out many aspects of its target. A COMPOSITE REPRESENTATION is a representation built out of constituent sub-representations as its parts, in the way in which paragraphs are built out of sentences and sentences out of words. The smallest constituent sub-representations are called REPRESENTATIONAL UNITS; examples are: icons, names, simple word forms, or the sorts of alphanumeric identifiers we might find in patient records. Note that many images are not composite representations since they are not built out of smallest representational units in the way in which molecules are built out of atoms. (Pixels are not representational units in the sense defined.)

If we take the graph-theoretic concretization of the Gene Ontology [11] as our example, then the representational units here are the nodes of the graph (taken to comprehend terms and unique IDs), which are intended to refer to corresponding entities in reality. But the composite representation refers,
through its graph structure, also to the relations between these entities, so that there is reference to entities in reality both at the level of single units and at the structural level [12].

A COGNITIVE REPRESENTATION (Level 2) is a representation whose representational units are ideas, thoughts, or beliefs in the mind of some cognitive subject – for example a clinician engaged in applying theoretical (and practical) knowledge to the task of establishing a diagnosis.

A REPRESENTATIONAL ARTIFACT (Level 3) is a representation that is fixed in some medium in such a way that it can serve to make the cognitive representations existing in the minds of separate subjects publicly accessible in some enduring fashion. Examples are: a text, a diagram, a map legend, a list, a clinical record, or a controlled vocabulary. Clearly such artifacts can serve to convey more or less adequately the underlying cognitive representations – and can be correspondingly more or less intuitive or understandable.

Because representational artifacts such as SNOMED CT give textual form to cognitive representations which pre-exist them, some have taken this to mean that these artifacts are in fact made up of representations which refer to (are of or about) these cognitive representations (the ‘concepts’) from out of which the latter are held to be composed.

We shall argue below that this reflects a deep confusion, and that the constituent units of representational artifacts developed for scientific purposes should more properly (and more straightforwardly) be seen as referring to the very same entities in reality – the diseases, patients, body parts, and so forth – to which the underlying cognitive representations of clinicians and others refer. Such artifacts are in this respect no different from scientific textbooks. They are windows on reality, designed to serve as a means by which representations of reality on the part of cognitive agents can be made available to other agents, both human and machine. A simple phrase, such as ‘the cat over there on the mat’, can be used to refer more or less successfully to what is, in reality, a portion of reality of a highly complex sort – and the same applies to all of the types of artifacts referred to above. The window on reality which each provides is, to be sure, in every case from a certain perspective and in such a way as to embody a certain granularity of focus. Yet the entities to which it refers are full-fledged entities in reality nonetheless – the very same, full-fledged entities in reality with which we are familiar also in other ways, for example because they provide us with food or companionship.

REALITY

The clinician is concerned first and foremost with PARTICULARS in reality (Level 1) (in the vernacular also called ‘tokens’ or ‘individuals’), that is to say with individual patients, their lesions, diseases, and bodily reactions, divided into CONTINUANTS and OCCURRENTS [13]. Some particulars, such as human beings, planets, ships, hurricanes, receive PROPER NAMES (they may also receive unique identifiers, such as social security numbers) which are used in representational artifacts of various sorts. But we can refer to particulars also by means of complex expressions – that man on the bench, this oophorectomy, this blood sample – involving GENERAL TERMS of different sorts, including:

i. General terms such as ‘apoptosis’, ‘fracture’, ‘cat’, which represent structures or characteristics in reality which are exemplified – the very same structures or characteristics, over and over again – in an open-ended collection of particulars in arbitrarily disconnected regions of space and time. Consider for example the way in which a certain DNA structure is instantiated as a transcript (RNA-structure) over and over again in cells of our body.

ii. General terms such as ‘danger’, ‘gift’, ‘surprise’, which draw together entities in reality which share common characteristics which are not intrinsic to the entities in question.

iii. General terms such as ‘Berliner’, ‘Paleolithic’, which relate to specific collections of particulars tied to specific regions of space and time.

General terms of the first sort refer to UNIVERSALS (in the vernacular also called ‘types’ or ‘kinds’). A universal is something that is shared in common by all those particulars which are its INSTANCES. The universal itself then exists in Level 1 reality as a result of existing in its particular instances. When a clinician says ‘A and B have the same disease’, she is referring to the universal; when she says ‘A’s diabetes is more advanced than B’s’, then she is referring to the respective instances.

It is overwhelmingly universals which are the entities represented in scientific texts, and a good prima facie indication that a general term ‘A’ refers to a universal is that ‘A’ is used by scientists for purposes of classification and to make different sorts of law-like assertions about the individual instances of A with which they work in the lab or clinic.

{| class="wikitable"
|+ Table 1 – Three Basic Sorts of Binary Relation
|-
| ''nose part_of body''
|-
| Mary’s nose '''part_of''' Mary
|-
| Mary’s nose '''instance_of''' ''nose''
|}

Both particulars and universals stand to each other in various RELATIONS. Thus particulars stand to the corresponding universals in the relation of
INSTANTIATION. This and other binary relations (of parthood, adjacency, derivation) used in biomedical ontologies [13] can be divided into groups as in Table 1, which uses Roman for particulars, bold type for relations involving particulars, and italics for universals and for relations between universals.

A COLLECTION OF PARTICULARS (of molecules in John’s body, of pieces of equipment in a certain operating theater, of operations performed in this theater over a given period of months) is a Level 1 particular comprehending other particulars as its MEMBERS [14]. We note that confusion is spawned by the fact that we can use the very same general terms to refer both to universals and to collections of particulars. Consider:

• HIV is an infectious retrovirus
• HIV is spreading very rapidly through Asia

A CLASS is a collection of all and only the particulars to which a given general term applies. Where the general term in question refers to a universal, then the corresponding class, called the EXTENSION of the universal (at a given time), comprehends all and only those particulars which as a matter of fact instantiate the corresponding universal (at that time).

The totality of classes is wider than the totality of extensions of universals since it includes also DEFINED CLASSES, designated by terms like ‘employee of Swedish bank’, ‘daughter of Finnish spy’. Languages like OWL are ideally suited to the formal treatment of such classes, and the popularity of OWL has encouraged the view that it is classes which are designated by the general terms in terminologies. (OWL classes are not, however, identical with classes in the usual set-theoretic sense on which we draw also here.)

Some OWL classes (above all Thing and Nothing) are ‘primitive’ (which means: not defined), and these classes are sometimes asserted to constitute an OWL counterpart of universals (‘natural kinds’) in the sense here defined [15]. Because OWL identifies the relation of instantiation with that of membership, however, it in effect identifies universals with their extensions.

Through relations of greater and lesser generality both classes and universals are organized into trees, the former on the basis of the subclass relation, the latter on the basis of the is_a relation (whereby, again, in the OWL framework the two relations are identified). Because the instances of more specific universals are ipso facto also instances of the corresponding more general universals, the latter hierarchy is, when viewed extensionally, a proper part of the former. As we shall discuss further in our treatment of the argument from borderline cases below, it is difficult to draw a sharp line between terms designating universals and those designating defined classes. This does not mean, however, that the distinction is of no import. Indeed we believe that taking account of this distinction is indispensable to creating a path to improvement of ontologies [16].

We use the term PORTION OF REALITY to comprehend both single universals and particulars and their more or less complex combinations. Some portions of reality – for example single organisms, planets – reflect autonomous joints of reality (that is, they would exist as separate entities even in a world denuded of cognitive subjects). Other portions of reality are products of fiat demarcations of one or other sort [17], as when we delineate a portion of reality by focusing on some specific granular level (of molecules, or molecular processes), or on some specific family of universals (for example when we view the human beings living in a given county in light of their patterns of alcohol consumption).

A DOMAIN is a portion of reality that forms the subject-matter of a single science or technology or mode of study; for example the domain of proteomics, of radiology, of viral infections in mouse. Representational artifacts will standardly represent entities in domains delineated by level of granularity. Thus entities smaller than a given threshold value may be excluded from a domain because they are not salient to the associated scientific or clinical purposes [18].

REPRESENTATIONAL ARTIFACTS

In developing theories, biomedical researchers seek representations of the universals existing in their respective domain of reality. They first develop cognitive representations, which they then transform incrementally into representational artifacts of various sorts.

In developing diagnoses, and in compiling such diagnoses into clinical records, clinicians seek a representation of salient particulars (diseases, disease processes, drug effects) on the side of their patients. Drawing on their theoretical understanding of the universals which these particulars instantiate (which in turn draws on prior representations formed in relation to earlier particulars [19]), they first develop a cognitive representation of what is taking place within a given collection of particulars in reality, which they then transform into representational artifacts such as clinical documents, entries in databases, and so forth, which may then foster more refined cognitive representations in the future.

The mentioned representations are typically built up out of sub-representations each of which, in the best case, mirrors a corresponding salient portion of reality. The most simple representations (‘blood!’) mirror universals or particulars taken singly; more complex representations – such as therapeutic schemas, diagnostic protocols, scientific texts, pathway diagrams – mirror more complex portions of
reality, their constituent sub-representations being joined together in ways designed to mirror salient relations on the side of reality.

In the ideal case a representation would be such that all portions of reality salient to the purposes for which it was constructed would have exactly one corresponding unit in the representation, and every unit in the representation would correspond to exactly one salient portion of reality [19]. Unfortunately, in a domain like biomedicine, the ideal case will likely remain forever beyond our grasp. Researchers working on the level of universals may fall short by creating representations which either (i) fail to include general terms for universals which are salient to their domain, or (ii) include general terms which do not in fact denote any universals at all. Similarly, clinicians working on the level of particulars may fall short of the best case by creating misdiagnoses, either (i) by failing to acknowledge particulars which do exist and which are salient to the health of a given patient, or (ii) by using representational units assumed to refer to particulars where no such particulars exist.

A TAXONOMY is a tree-form graph-theoretic representational artifact with nodes representing universals or classes and edges representing is_a or subset relations.

An ONTOLOGY is a representational artifact, comprising a taxonomy as proper part, whose representational units are intended to designate some combination of universals, defined classes, and certain relations between them [13].

A REALISM-BASED ONTOLOGY is built out of terms which are intended to refer exclusively to universals, and corresponds to that part of the content of a scientific theory that is captured by its constituent general terms and their interrelations.

A TERMINOLOGY is a representational artifact consisting of representational units which are the general terms of some natural language used to refer to entities in some specific domain.

An INVENTORY is a representational artifact built out of singular referring terms such as proper names or alphanumeric identifiers. Electronic Health Records (EHRs) incorporate inventories in this sense, including both terms denoting particulars (‘patient #347’, ‘lung #420’) and more complex expressions involving terms designating universals and defined classes (‘the history of cancer in patient #347’s family’) [20].

In the best case, again, each of the representational artifacts listed above (ontologies, taxonomies, inventories) will be such that its representational units stand in a one-to-one correspondence with the salient entities in its domain. In practice, however, such artifacts can be classified on the basis of the various ways in which they fall short of this best case, in terms of properties such as correctness, degree of structural fit, degree of completeness and degree of redundancy [16,18]. By exploiting such classifications we can measure the quality improvements made in successive versions, and also use such measures as a basis for further improvement [20].

To make a representation interpretable by a computer, it must be published in a language with a formal semantics and so converted into a FORMALIZED REPRESENTATION. The choice of language will depend on the complexity of what one needs to express and on the sorts of reasoning one needs to perform. While OWL, for example, can cope well with defined classes, it may not have sufficient expressive power to meet the needs of ontologies in the life sciences domain. Thus it seems to be incapable, for example, of capturing the relations involved even in simple interactions among pluralities of continuants, or of capturing the changes which take place in such continuants (for example growth of a tumor) over time [21,22].

Most inventories in the biomedical field (including most EHRs) have still exploited hardly at all the powers of formal reasoning. The paradigm of Referent Tracking represents an exception to this rule [20], since it involves precisely the embedding of a highly structured representation of particulars in a formalized representation of the corresponding universals.

THE CONCEPT ORIENTATION

We believe that ontologies, inventories and similar artifacts should consist exclusively of representational units which are intended to designate entities in Level 1 reality. Defenders of the concept orientation in medical terminology development have offered a series of arguments against this view, to the effect that such terminologies should include also (or exclusively) representational units referring to what are called ‘concepts’ [23].

First is what we can call the argument from intellectual modesty, which asserts that it is up to domain experts, and not to terminology developers, to answer for the truth of whatever theories the terminology is intended to mirror. Since domain experts themselves disagree, a terminology should embrace no claims as to what the world is like, but reflect, rather, the coagulate formed out of the concepts used by different experts.

Against this, it can be pointed out that communities working on common domains in the medical as in other scientific fields in fact accept a massive and ever-growing body of consensus truths about the entities in these domains. Many of these truths are, admittedly, of a trivial sort (that mammals have hearts, that organisms are made of cells), but it is precisely such truths which form the core of science-
based ontologues. Where conflicts do arise in the Some patients do, after all, believe that they are
course of scientific development, these are highly James Bond, or that they see unicorns. The realist
localized, and pertain to specific mechanisms, for approach is however perfectly well able to
example of drug action or disease development, comprehend also phenomena such as these, even
which can serve as the targets of conflicting beliefs though it is restricted to the representation of what is
only because researchers share a huge body of real. For the beliefs and hallucinatory episodes in
presuppositions. question are of course as real as are the persons who
We can think of no scenario under which it would suffer (or enjoy) them. And certainly such beliefs and
make sense to postulate special entities called episodes may involve concepts (in the properly
‘concepts’ as the entities to which terms subject to psychological sense of this term). But they are not
scientific dispute would refer. For either, for any such about concepts, they do not have concepts as their
term, the dispute is resolved in its favor, and then it is targets – for they are intended by their subjects to be
the corresponding level 1 entity that has served as its about entities in flesh-and-blood external reality.
referent all along; or it is established that the term in Fourth, is the argument from medical history. The
question is non-designating, and then this term is no history of medicine is a scientific pursuit; yet it
longer a candidate for inclusion in a terminology. We involves use of terms such as ‘diabolic possession’
cannot solve the problem that we do not know, at which, according to the best current science, do not
some given stage of scientific inquiry, to which of refer to universals in reality. But again: the history of
these groups a given term belongs, by providing such medicine has as its subject-domain precisely the
terms instead with guaranteed referents called beliefs, both true and false, of former generations
‘concepts’. It may, finally, be the case that it is not the (together with the practices, institutions, etc.
disputed term itself which is at issue, but rather some associated therewith). Thus a term like ‘diabolic
more complex expression, as when we talk about ‘G. possession’ should be included in the ontology of this
E. Stahl’s concept of phlogiston’, but that the latter discipline in the first place as component part of
refers to some entity – a concept – in (psychological) terms designating corresponding classes of beliefs. In
reality is precisely not subject to scientific dispute. addition it may appear also as part of a term
Sometimes the argument from intellectual modesty designating some fiat collection of those diseases
takes an extreme form, for example on the part of from which the patients diagnosed as being possessed
those for whom reality itself is seen as being were in fact suffering. The evolution of our thinking
somehow unknowable (‘we can only ever know our about disease can then be understood in the same way
own concepts’). Arguments along these lines are of that we deal with theory change in other parts of
course familiar from the history of philosophy. Stove provides the definitive refutation.24 Here we need note only that they run counter not just to the successes, but to the very existence, of science and technology as collaborative endeavors.
Second is the argument from creativity. Designer drugs are conceived, modeled, and described long before they are successfully synthesized, and the plans of pharmaceutical companies may contain putative references to the corresponding chemical universals long before there are instances in reality. But again: such descriptions and plans can be perfectly well apprehended even within terminologies and ontologies conceived as relating exclusively to what is real. Descriptions and plans do, after all, exist. On the other hand it would be an error to include in a scientific ontology of drugs terms referring to pharmaceutical products which do not yet (and may never) exist, solely on the basis of plans and descriptions. Rather, such terms should be included precisely at the point where the corresponding instances do indeed exist in reality, exactly in accordance with our proposals above.
Third is what we might call the argument from unicorns. Some of the terms needed in medical terminologies refer, it is held, to what does not exist.
science, as a reordering of our beliefs about the ontological validity and salience of specific families of terms – and once again: concepts themselves play no role as referents.20,26
Fifth is the argument from syndromes. The subject-matters of biology and medicine are, it is held, replete with entities which do not exist in reality but are rather convenient abstractions. A syndrome such as congestive heart failure, for example, is nothing more than a convenient abstraction, used for the convenience of physicians to collect together many disparate and unrelated diseases which have common final manifestations. Such abstractions are, it is held, mere concepts.
According to the considerations on fiat demarcations advanced above, however, syndromes, pathways, genetic networks and similar phenomena are indeed fully real – though their reality is that of defined (fiat) classes rather than of universals. A similar response can be given also in regard to the many human-dependent delineations used in expressions like ‘obesity’ or ‘hypertension’ or ‘abnormal curvature of spine’. These terms, too, refer to entities in reality, namely to defined classes which rest on fiat thresholds established by consensus among physicians.
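The idea that a term such as ‘obesity’ designates a defined class resting on a fiat threshold can be illustrated with a minimal sketch. Everything named here is a hypothetical assumption introduced for illustration only: the `Person` and `DefinedClass` types, the `bmi` helper, and the body-mass-index cutoff of 30, which merely stands in for whatever threshold physicians agree on. The sketch is meant to show one thing: membership in the class is determined by applying an agreed criterion to real particulars, with no intermediate ‘concept’ serving as referent.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Person:
    """A particular in reality; the attributes are hypothetical."""
    name: str
    weight_kg: float
    height_m: float


@dataclass(frozen=True)
class DefinedClass:
    """A class delimited by a fiat criterion applied to real instances."""
    label: str
    criterion: Callable[[Person], bool]

    def members(self, population: Iterable[Person]) -> List[Person]:
        # Membership is decided by testing each particular against the
        # agreed criterion; nothing but the particulars is consulted.
        return [p for p in population if self.criterion(p)]


def bmi(p: Person) -> float:
    """Body mass index in kg/m^2."""
    return p.weight_kg / (p.height_m ** 2)


# Assumed consensus threshold, used here for illustration only.
OBESITY_BMI_THRESHOLD = 30.0

obesity = DefinedClass("obesity", lambda p: bmi(p) >= OBESITY_BMI_THRESHOLD)

population = [Person("A", 95.0, 1.70), Person("B", 70.0, 1.80)]
obese_names = [p.name for p in obesity.members(population)]
```

If the consensus threshold is revised, the extension of the defined class changes while the particulars themselves stay the same, which is what distinguishes a fiat class of this kind from a universal.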
Sixth is the argument from error. When erroneous entries are entered into a clinical record and interpreted as being about level 1 entities, then logical conflicts can arise. For Rector et al., this implies that the use of a meta-language should be made compulsory for all statements in the EHR, which should be, not about entities in reality, but rather about what are called ‘findings’.25 Instead of p and not p, the record would contain entries like: McX observed p and O’W observed not p, so that logical contradiction is avoided. The terms in terminologies devised to serve such EHRs would then one and all refer not to diseases themselves, but rather to mere ‘concepts’ of diseases. This, however, blurs the distinction between entities in reality and associated findings, and opens the door to the inclusion in a terminology of problematic findings-related expressions such as SNOMED’s ‘absent nipple’, ‘absent leg’, etc. Certainly clinicians need to record such findings. But then their findings are precisely that a leg is absent; not that a special kind of (‘absent’) leg is present.
In the domain of scientific research we do not embargo entirely the making of object-language assertions simply because there might be, among the totality of such assertions, some which are erroneous. Rather, we rely on the normal workings of science as a collective, empirical endeavor to weed out error over time, providing facilities to quarantine erroneous entries and resolve logical conflicts as they are identified. We have argued elsewhere that these same devices can be applied also in the medical context.26
The argument for the move to the meta-level is sometimes buttressed by appeal to medico-legal considerations seen as requiring that the EHR be a record not of what exists but of clinicians’ beliefs and actions. Yet the forensic purposes of an audit trail can equally well be served by an object-language record if we ensure that meta-data are associated with each entry identifying by whom the pertinent data were entered, at what time, and so forth.
On the other side, moreover, even the move to meta-level assertions would not in fact solve the problems of error, logical contradiction and legal liability. For the very same problems arise not only when human beings are describing, on the object-level, fractures, or pulse rates, or symptoms of coughing or swelling, but also on the meta-level when they are describing what clinicians have heard, seen, thought and done. The latter, too, are subject to error, fraud, and disagreement in interpretation.
Seventh is the argument from borderline cases. As we have already noted above, there is at any given stage no bright line between those general terms properly to be conceived as designating universals and those designating merely ‘concepts’ (or defined classes). Certainly there are, at any given stage in the development of science, clear cases on either side: ‘electron’ or ‘cell’, on the one hand, and ‘fall on stairs or ladders in water transport NOS, occupant of small unpowered boat injured’ (Read Codes) on the other. But there are also borderline cases such as ‘alcoholic non-smoker with diabetes’, or ‘age-dependent yeast cell size increase’, which call into question the very basis of the distinction.
In response, we note first the general point, that arguments from the existence of borderline cases in general have very little force. For otherwise they would allow us to prove from the existence of people with borderline complements of hair that there is no such thing as baldness or hairiness.
As to the specific problem of how to classify borderline expressions, this is a problem not for terminology, but rather for empirical science. For borderline terms of the sorts mentioned will, as an inevitable concomitant of scientific advance, be in any case subjected to a filtering process based on whether they are needed for purposes of (for example therapeutically) fruitful classifications, and thus for the expression of scientific laws.
Science itself is thereby subject to constant update. A term taken to refer to a universal by one generation of scientists may be demoted to the level of non-designating term (‘phlogiston’) by the next. This means also that representational artifacts of the sorts considered in the above, because they form an integral part of the practice of science, should themselves be subject to continual update in light of such advance. But again: we can think of no circumstance in which updating of the sort in question would signify that phlogiston is itself a concept, or that some expression was at one or other stage being used by scientists with the intention of referring to ‘concepts’ rather than to entities in reality.
THE SEMIOTIC TRIANGLE
Finally is what we might call the argument from multiple perspectives. Different patients, clinicians and biologists have their own perspectives on one and the same reality. To do justice to these differences, it is argued, we must hold that their respective representations point, not to this common reality, but rather to their different ‘concepts’ thereof.
This argument has its roots in the work of Ogden and Richards, and specifically in their discussion of the so-called ‘semiotic triangle’, which is of importance not least because it embodies a view of meaning and reference that still plays a fateful role in the terminology standardization work of ISO.26
As Figure 1 makes clear, the triangle in fact refers not to ‘concepts’, but rather to what its authors call ‘thought or reference’,27 reflecting the fact that Ogden and Richards’ account is rooted in a theory of psychological causality. When we experience a
certain object in association with a certain sign, then memory traces are laid down in our brains in virtue of which the mere appearance of the same sign in the future will, they hold, ‘evoke’ a ‘thought or reference’ directed towards this object through the reactivation of impressions stored in memory.
Figure 1 – Ogden and Richards’ Semiotic Triangle
The two solid edges of the triangle are intended to represent what are held to be causal relations of ‘symbolization’ (roughly: evocation), and ‘reference’ (roughly: perception or memory) on the part of a symbol-using subject. The dashed edge, in contrast, signifies that the relation between term and referent – the relation that is most important for the discussion of terminology – is merely ‘imputed’.
The background assumption here is that multiple perspectives are both ubiquitous and (at best) only locally and transiently resolvable. The meanings words have for you or me depend on our past experiences of uses of these words in different kinds of contexts. Ambiguity must be resolved anew (and a new ‘imputed’ relation of reference spawned) on each successive occasion of use. From this, Ogden and Richards infer that a symbolic representation can never refer directly to an object, but rather only indirectly, via a ‘thought or reference’ within the mind.
It is a depsychologized version of this latter thesis which forms the basis of the concept orientation in contemporary terminology research. The terms in terminologies refer not to entities in reality, it is held, but rather to ‘concepts’ in a special ‘realm’. The latter are not transparent mediators of reference; rather they are its targets, and the job of the terminologist is to calibrate his list of terms in relation not to reality but to this special ‘realm of concepts’.26
The relation between terms in a terminology and the reality beyond becomes hereby obscured. Reality exists, if at all, only behind a conceptual veil – and hence familiar confusions according to which for example the concept of bacteria would cause an experimental model of disease, or the concept of vitamin would be ‘essential in the diet of man’.28
‘CONCEPTS’ AND ‘MODELS’
How, then, should ‘concept’ be properly treated in the terminology literature henceforth? There are of course sensible uses of this term, for example in the literature of psychology. In the terminology literature, however, ‘concept’ has been used in such a bewildering variety of confused and confusing ways that we recommend that it be avoided altogether.
It is tempting to suppose that, when considered extensionally, all of the mentioned alternative readings come down to one and the same thing, namely to an identification of ‘concept’ with what we have earlier called ‘defined class’. If ‘concept’ could be used systematically in this way in terminological circles, then this would, indeed, constitute progress of sorts, though the question would then arise why ‘defined class’ itself should not be used instead. Unfortunately, however, the proposal in question stands in conflict with the fact that ‘concept’ is used by its adherents to comprehend also putative referents even for terms – such as ‘surgical procedure not carried out because of patient’s decision’ – which do not designate defined classes because they designate nothing at all. Here again, we believe, a proper treatment would involve appeal to appropriate fiat classes, defined in terms of utterances, interrupted plans, expectations, etc. on the part of the subjects involved.
What, now, is to be said of terms such as ‘concept model’, ‘knowledge representation’, ‘information model’, and so forth referred to in our preamble above? To the extent that concept-based terminological artifacts consist in representations not of the reality on the side of the patient but rather of the entities in some putative ‘realm of concepts’, the term ‘concept model’ may be justified. This term is indeed used by SNOMED CT in its own self-descriptions, though given SNOMED’s scientific goals, we believe that, on the basis of the arguments given above, it should be abandoned. Still more problematic is the term ‘knowledge model’ or ‘knowledge representation’ (GALEN). For in the absence of a reference to reality to serve as benchmark, what could motivate a distinction between knowledge and mere belief?19 And what, in the absence of a reference to reality, could motivate adding or deleting terms in successive versions of a terminology, if every term is in any case guaranteed a reference to its own specially tailored ‘concept’?
As to ‘information model’, here one standard uncertainty concerns the relation between an entity in reality and the body of information used to ‘represent’ this entity in some information system. Is it information which is being ‘modeled’ in an information model, or the reality which this information is about? The documentation of the HL7 Reference Information Model (RIM)29 adds extra layers of uncertainty by conceiving its principal formulas as referring to the acts in which entities are observed for
example in a clinical context. Simultaneously, however, it conceives these formulas as referring also to the documentation of such acts for example in an information system. The apparent contradiction is to some degree resolved by the RIM on the basis of its assertion that there is in any case ‘no distinction between an activity and its documentation’.30
CONCLUSION
Drawing on our distinction of the three levels of reality, cognition and representational artifact we have sought to formulate an unambiguous terminology for describing ontologies and related artifacts. The proposed terminology allows us to characterize more precisely the sorts of things which go wrong when the distinction between these levels is ignored, or when one or other level is denied, so that the approach may also help in improving such artifacts in the future.
Acknowledgements
This work was supported by the Wolfgang Paul Program of the Humboldt Foundation, the Volkswagen Foundation, the European Union Semantic Mining Network, by BBSRC Grant BB/D524283/1, and by the NIH Roadmap Grant U54 HG004028. Thanks are due also to Jim Cimino, Chris Chute, Gunnar Klein, Alan Rector, Stefan Schulz, and Kent Spackman for fruitful discussions.
References
(URLs last accessed July 1, 2006)
1. Smith B. Beyond concepts, or: Ontology as reality representation, Formal Ontology in Information Systems (FOIS 2004), p. 73-84.
2. http://www.w3.org/2003/glossary.
3. Patel-Schneider PF, Hayes P, Horrocks I. OWL Web Ontology Language. 2004. http://www.w3.org/TR/owl-semantics.
4. Spackman KA, Reynoso G. Examining SNOMED from the perspective of formal ontological principles. Workshop on Formal Biomedical Knowledge Representation (KR-MED 2004), p. 72-80.
5. Johansson I. Bioinformatics and biological reality. J Biomed Inform. 2006;39(3):274-87.
6. Klein GO, Smith B. Concept systems and ontologies. http://ontology.buffalo.edu/concepts/ConceptsandOntologies.pdf.
7. http://ncbo.us/.
8. http://obo.sourceforge.net/.
9. http://obofoundry.org/.
10. Rosse C, Mejino JL, Jr. A reference ontology for biomedical informatics: the Foundational Model of Anatomy. J Biomed Inform 2003;36:478-500.
11. http://geneontology.org/.
12. Wittgenstein L. 1921 Tractatus Logico-Philosophicus, London: Routledge, 1961.
13. Smith B, Ceusters W, Klagges B et al. Relations in biomedical ontologies. Genome Biol, 2005;6(5):R46.
14. Bittner T, Donnelly M, Smith B. Individuals, universals, collections. Formal Ontology in Information Systems (FOIS 2004), p. 37-48.
15. Drummond N. Introduction to ontologies. http://www.cs.man.ac.uk/~drummond/presentations/IntroductionToOWL50mins.ppt.
16. Ceusters W, Smith B. A realism-based approach to the versioning and evolution of biomedical ontologies. Proc AMIA Symp 2006, in press.
17. Smith B. Fiat objects. Topoi, 2001;20(2):131-48.
18. Bittner T, Smith B. A theory of granular partitions. Foundations of Geographic Information Science, London, 2003, p. 117-51.
19. Smith B. From concepts to clinical reality, J Biomed Inform. 2006 Jun;39(3):288-98.
20. Ceusters W, Smith B. Strategies for referent tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
21. Bera P, Wand Y. Analyzing OWL using a philosophy-based ontology. Formal Ontology in Information Systems (FOIS 2004), p. 353-62.
22. Kazic T. Putting semantics into the semantic web: How well can it capture biology? Pac Symp Biocomputing 2006;11:140-51.
23. Cimino JJ. In defense of the desiderata. J Biomed Inform. 2006;39:299-306.
24. Franklin J. Stove’s discovery of the worst argument in the world. Philosophy 2002;77:615-24. www.maths.unsw.edu.au/~jim/worst.pdf.
25. Rector A, Nolan W, Kay S. Foundations for an electronic medical record. Methods Inf Med, 1991;30:179-86.
26. Smith B, Ceusters W, Temmerman R. Wüsteria, Stud Health Technol Inform. 2005;116:647-652.
27. Ogden CK, Richards IA. The Meaning of Meaning. 3rd ed. New York, 1930.
28. The UMLS Semantic Network. http://semanticnetwork.nlm.nih.gov/.
29. HL7 V3 Reference Information Model: Version V 01-20. Normative Ballot 11/22/2005.
30. Smith B, Ceusters W. HL7 RIM: An incoherent standard, Proc MIE, 2006, p. 133-138.