=Paper= {{Paper |id=Vol-222/paper-7 |storemode=property |title=Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain |pdfUrl=https://ceur-ws.org/Vol-222/krmed2006-p07.pdf |volume=Vol-222 |dblpUrl=https://dblp.org/rec/conf/krmed/SmithKSC06 }} ==Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain== https://ceur-ws.org/Vol-222/krmed2006-p07.pdf
KR-MED 2006 "Biomedical Ontology in Action"
November 8, 2006, Baltimore, Maryland, USA


Towards a Reference Terminology for Ontology Research and Development
in the Biomedical Domain

Barry Smith, Ph.D. (1), Waclaw Kusnierczyk, M.D. (2), Daniel Schober, Ph.D. (3), Werner Ceusters, M.D. (1)

(1) Center of Excellence in Bioinformatics and Life Sciences, Buffalo NY/USA
(2) Department of Computer and Information Science, NTNU, Trondheim, Norway
(3) European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
phismith@buffalo.edu, waku@idi.ntnu.no, schober@ebi.ac.uk, ceusters@buffalo.edu

Ontology is a burgeoning field, involving researchers from the computer science, philosophy, data and software engineering, logic, linguistics, and terminology domains. Many ontology-related terms with precise meanings in one of these domains have different meanings in others. Our purpose here is to initiate a path towards disambiguation of such terms. We draw primarily on the literature of biomedical informatics, not least because the problems caused by unclear or ambiguous use of terms have been there most thoroughly addressed. We advance a proposal resting on a distinction of three levels too often run together in biomedical ontology research: 1. the level of reality; 2. the level of cognitive representations of this reality; 3. the level of textual and graphical artifacts. We propose a reference terminology for ontology research and development that is designed to serve as common hub into which the several competing disciplinary terminologies can be mapped. We then justify our terminological choices through a critical treatment of the ‘concept orientation’ in biomedical terminology research.

PREAMBLE

Ever since the invention of the computer, scientists and engineers have been exploring ways of ‘modeling’ or ‘representing’ the entities about which machines are expected to reason. But what do ‘modeling’ and ‘representing’ mean? What is a ‘conceptual model’ or an ‘information model’ and how can they and their components be unambiguously described?

Two questions here arise: To what do expressions such as ‘concept’, ‘information’, ‘knowledge’, etc. precisely refer? And what is it to ‘model’ or ‘represent’ such things? If information and knowledge themselves consist in representations, then what could an information representation or a knowledge representation be? There is, to say the least, some suspicion of redundancy here.

As we have argued elsewhere, the term ‘concept’ is marked in a peculiarly conspicuous manner by problems in this regard [1]. But the problem of multiple conflicting meanings arises also in regard to other terms, such as ‘class’, ‘object’, ‘instance’, ‘individual’, ‘property’, ‘relation’, etc., all of which have established, but unfortunately non-uniform, meanings in a range of different disciplines.

Among philosophical ontologists, the term ‘instance’ means an individual (for example this particular dog Fido), which is an instance of a corresponding universal or kind (dog, mammal, etc.). In OWL, ‘instance’ means ‘element’ or ‘member’ of a class (where ‘class’ means ‘general concept, category or classification … that belongs to the class extension of owl:Class’ [2]).

Standardization agencies such as ISO, CEN and W3C have been of little help in engendering cross-disciplinary uniformity in the use of such terms, since their standards are themselves directed towards specific communities. Standardization efforts under the auspices of W3C or UML or Dublin Core, too, have not addressed these problems. For while OWL-DL, for example, has a rigorously defined semantics [3], this does not by any means guarantee that an ontology formulated using OWL-DL is an error-free representation of its intended domain, and nor – until the day when the use of OWL or of some successor becomes uniform common practice – will it do anything to resolve the problems of semantic ambiguity adverted to in the above.

In the domain of biomedical informatics a number of attempts have been made to resolve these problems [4,5,6] in light of an increasing recognition that many ambitious terminological systems developed in this field are marked by unclarity over what, precisely, they have been designed to achieve. Are biomedical controlled vocabularies ‘concept representations’ or ‘knowledge models’? And if they are either of these things, how, if at all, do they relate to the reality – the tumors, diseases, treatments, chemical interactions – on the side of the patient?

OBJECTIVES AND METHODS

The purpose of this communication is to initiate a process for resolving such problems by drawing on the best practices in ontology which are now beginning to take root through the efforts of
organizations such as the National Center for Biomedical Ontology [7], the Open Biomedical Ontologies (OBO) Consortium [8], the OBO Foundry [9], and others [10].

What is needed is a set of terms referring in unambiguous fashion to the different kinds of entities surveyed above, which can serve as common target for mappings from other discipline- and computational idiom-centric terminologies, thereby mediating efficient pairwise translations between these terminologies themselves.

Our strategy is to advance precision via clear informal definitions rooted in what we assume are commonly accepted intuitions, providing references to associated formal treatments where possible. In selecting terms we have sometimes chosen expressions precisely because they have not been used by others and hence do not have established (and potentially conflicting) meanings. In other cases we have adapted existing terms to our purposes by providing them with more precise definitions or (in case of primitive terms) elucidations.

These proposals are focused primarily on the ontology-related needs of natural science, including the clinical basic sciences, though we believe them to be of quite general applicability.

We start out from a distinction of three levels of entities which have a role to play wherever ontologies are used:

• Level 1: the objects, processes, qualities, states, etc. in reality (for example on the side of the patient);
• Level 2: cognitive representations of this reality on the part of researchers and others;
• Level 3: concretizations of these cognitive representations in (for example textual or graphical) representational artifacts.

This tripartite distinction will awaken echoes of the Semantic Triangle of Ogden and Richards, to which we return in the sequel. For present purposes we note that the indispensability of Level 1 reflects the fact that even those who see themselves as building for example ‘data models’ in the domain of the life sciences are attempting to create thereby artifacts which stand in some representational relation to entities in the real world. Level 2 reflects the fact that a crucial role is played in ontology and terminology development by the cognitive representations of human subjects. Level 3 reflects the fact that cognitive representations can be shared, and serve scientific ends, only when they are made communicable in a form whereby they can also be subjected to criticism and correction, and also to implementation in software.

Note that the three levels overlap; thus the textual and graphical artifacts distinguished in Level 3 are themselves objects on Level 1. Our talk of ‘levels’ should thus be interpreted by analogy with talk of ‘levels of granularity’: if we have apprehended all the liquid in a vessel, then in a sense we have thereby apprehended also all the molecules. Yet for scientific purposes molecules and liquids must be distinguished nonetheless, and the same applies, for the purposes of clarity in our thinking about ontologies, to the three levels delineated in the above.

FOUNDATIONS

Here we give precise definitions to a number of central terms, which will then be used in conformity thereto in the remainder of the paper. Really existing ontologies and related artifacts are typically constructed to realize a mixture of different sorts of ends (terminologies, for example, to support clinical record keeping and large-scale epidemiological studies, and to serve as controlled vocabularies for the expression of research results). Hence they typically combine the features of artifacts of different basic types. Our reference terminology is designed to reflect these basic types. Hence the definitions we propose for terms such as ‘ontology’ or ‘class’ do not imply any claim to the effect that everything called an ‘ontology’ or ‘class’ in the literature exhibits just the characteristics referred to in the definition.

An ENTITY is anything which exists, including objects, processes, qualities and states on all three levels (thus also including representations, models, beliefs, utterances, documents, observations, etc.).

A REPRESENTATION is for example an idea, image, record, or description which refers to (is of or about), or is intended to refer to, some entity or entities external to the representation. Note that a representation (e.g. a description such as ‘the cat over there on the mat’) can be of or about a given entity even though it leaves out many aspects of its target. A COMPOSITE REPRESENTATION is a representation built out of constituent sub-representations as its parts, in the way in which paragraphs are built out of sentences and sentences out of words. The smallest constituent sub-representations are called REPRESENTATIONAL UNITS; examples are: icons, names, simple word forms, or the sorts of alphanumeric identifiers we might find in patient records. Note that many images are not composite representations since they are not built out of smallest representational units in the way in which molecules are built out of atoms. (Pixels are not representational units in the sense defined.)

If we take the graph-theoretic concretization of the Gene Ontology [11] as our example, then the representational units here are the nodes of the graph (taken to comprehend terms and unique IDs), which are intended to refer to corresponding entities in reality. But the composite representation refers,
through its graph structure, also to the relations between these entities, so that there is reference to entities in reality both at the level of single units and at the structural level [12].

A COGNITIVE REPRESENTATION (Level 2) is a representation whose representational units are ideas, thoughts, or beliefs in the mind of some cognitive subject – for example a clinician engaged in applying theoretical (and practical) knowledge to the task of establishing a diagnosis.

A REPRESENTATIONAL ARTIFACT (Level 3) is a representation that is fixed in some medium in such a way that it can serve to make the cognitive representations existing in the minds of separate subjects publicly accessible in some enduring fashion. Examples are: a text, a diagram, a map legend, a list, a clinical record, or a controlled vocabulary. Clearly such artifacts can serve to convey more or less adequately the underlying cognitive representations – and can be correspondingly more or less intuitive or understandable.

Because representational artifacts such as SNOMED CT give textual form to cognitive representations which pre-exist them, some have taken this to mean that these artifacts are in fact made up of representations which refer to (are of or about) these cognitive representations (the ‘concepts’) from out of which the latter are held to be composed.

We shall argue below that this reflects a deep confusion, and that the constituent units of representational artifacts developed for scientific purposes should more properly (and more straightforwardly) be seen as referring to the very same entities in reality – the diseases, patients, body parts, and so forth – to which the underlying cognitive representations of clinicians and others refer. Such artifacts are in this respect no different from scientific textbooks. They are windows on reality, designed to serve as a means by which representations of reality on the part of cognitive agents can be made available to other agents, both human and machine. A simple phrase, such as ‘the cat over there on the mat’, can be used to refer more or less successfully to what is, in reality, a portion of reality of a highly complex sort – and the same applies to all of the types of artifacts referred to above. The window on reality which each provides is, to be sure, in every case from a certain perspective and in such a way as to embody a certain granularity of focus. Yet the entities to which it refers are full-fledged entities in reality nonetheless – the very same, full-fledged entities in reality with which we are familiar also in other ways, for example because they provide us with food or companionship.

REALITY

The clinician is concerned first and foremost with PARTICULARS in reality (Level 1) (in the vernacular also called ‘tokens’ or ‘individuals’), that is to say with individual patients, their lesions, diseases, and bodily reactions, divided into CONTINUANTS and OCCURRENTS [13]. Some particulars, such as human beings, planets, ships, hurricanes, receive PROPER NAMES (they may also receive unique identifiers, such as social security numbers) which are used in representational artifacts of various sorts. But we can refer to particulars also by means of complex expressions – that man on the bench, this oophorectomy, this blood sample – involving GENERAL TERMS of different sorts, including:

i. General terms such as ‘apoptosis’, ‘fracture’, ‘cat’, which represent structures or characteristics in reality which are exemplified – the very same structures or characteristics, over and over again – in an open-ended collection of particulars in arbitrarily disconnected regions of space and time. Consider for example the way in which a certain DNA structure is instantiated as a transcript (RNA-structure) over and over again in cells of our body.
ii. General terms such as ‘danger’, ‘gift’, ‘surprise’, which draw together entities in reality which share common characteristics which are not intrinsic to the entities in question.
iii. General terms such as ‘Berliner’, ‘Paleolithic’, which relate to specific collections of particulars tied to specific regions of space and time.

General terms of the first sort refer to UNIVERSALS (in the vernacular also called ‘types’ or ‘kinds’). A universal is something that is shared in common by all those particulars which are its INSTANCES. The universal itself then exists in Level 1 reality as a result of existing in its particular instances. When a clinician says ‘A and B have the same disease’, she is referring to the universal; when she says ‘A’s diabetes is more advanced than B’s,’ then she is referring to the respective instances.

It is overwhelmingly universals which are the entities represented in scientific texts, and a good prima facie indication that a general term ‘A’ refers to a universal is that ‘A’ is used by scientists for purposes of classification and to make different sorts of law-like assertions about the individual instances of A with which they work in the lab or clinic.

  Relata                        Example
  universal and universal       nose part_of body
  particular and particular     Mary’s nose part_of Mary
  particular and universal      Mary’s nose instance_of nose

Table 1 – Three Basic Sorts of Binary Relation

Both particulars and universals stand to each other in various RELATIONS. Thus particulars stand to the corresponding universals in the relation of
INSTANTIATION. This and other binary relations (of parthood, adjacency, derivation) used in biomedical ontologies [13] can be divided into groups as in Table 1, which uses Roman for particulars, bold type for relations involving particulars, and italics for universals and for relations between universals.

A COLLECTION OF PARTICULARS (of molecules in John’s body, of pieces of equipment in a certain operating theater, of operations performed in this theater over a given period of months) is a Level 1 particular comprehending other particulars as its MEMBERS [14]. We note that confusion is spawned by the fact that we can use the very same general terms to refer both to universals and to collections of particulars. Consider:

• HIV is an infectious retrovirus
• HIV is spreading very rapidly through Asia

A CLASS is a collection of all and only the particulars to which a given general term applies. Where the general term in question refers to a universal, then the corresponding class, called the EXTENSION of the universal (at a given time), comprehends all and only those particulars which as a matter of fact instantiate the corresponding universal (at that time).

The totality of classes is wider than the totality of extensions of universals since it includes also DEFINED CLASSES, designated by terms like ‘employee of Swedish bank’, ‘daughter of Finnish spy’. Languages like OWL are ideally suited to the formal treatment of such classes, and the popularity of OWL has encouraged the view that it is classes which are designated by the general terms in terminologies. (OWL classes are not, however, identical with classes in the usual set-theoretic sense on which we draw also here.)

Some OWL classes (above all Thing and Nothing) are ‘primitive’ (which means: not defined), and these classes are sometimes asserted to constitute an OWL counterpart of universals (‘natural kinds’) in the sense here defined [15]. Because OWL identifies the relation of instantiation with that of membership, however, it in effect identifies universals with their extensions.

Through relations of greater and lesser generality both classes and universals are organized into trees, the former on the basis of the subclass relation, the latter on the basis of the is_a relation (whereby, again, in the OWL framework the two relations are identified). Because the instances of more specific universals are ipso facto also instances of the corresponding more general universals, the latter hierarchy is, when viewed extensionally, a proper part of the former. As we shall discuss further in our treatment of the argument from borderline cases below, it is difficult to draw a sharp line between terms designating universals and those designating defined classes. This does not mean, however, that the distinction is of no import. Indeed we believe that taking account of this distinction is indispensable to creating a path to improvement of ontologies [16].

We use the term PORTION OF REALITY to comprehend both single universals and particulars and their more or less complex combinations. Some portions of reality – for example single organisms, planets – reflect autonomous joints of reality (that is, they would exist as separate entities even in a world denuded of cognitive subjects). Other portions of reality are products of fiat demarcations of one or other sort [17], as when we delineate a portion of reality by focusing on some specific granular level (of molecules, or molecular processes), or on some specific family of universals (for example when we view the human beings living in a given county in light of their patterns of alcohol consumption).

A DOMAIN is a portion of reality that forms the subject-matter of a single science or technology or mode of study; for example the domain of proteomics, of radiology, of viral infections in mouse. Representational artifacts will standardly represent entities in domains delineated by level of granularity. Thus entities smaller than a given threshold value may be excluded from a domain because they are not salient to the associated scientific or clinical purposes [18].

REPRESENTATIONAL ARTIFACTS

In developing theories, biomedical researchers seek representations of the universals existing in their respective domain of reality. They first develop cognitive representations, which they then transform incrementally into representational artifacts of various sorts.

In developing diagnoses, and in compiling such diagnoses into clinical records, clinicians seek a representation of salient particulars (diseases, disease processes, drug effects) on the side of their patients. Drawing on their theoretical understanding of the universals which these particulars instantiate (which in turn draws on prior representations formed in relation to earlier particulars [19]), they first develop a cognitive representation of what is taking place within a given collection of particulars in reality, which they then transform into representational artifacts such as clinical documents, entries in databases, and so forth, which may then foster more refined cognitive representations in the future.

The mentioned representations are typically built up out of sub-representations each of which, in the best case, mirrors a corresponding salient portion of reality. The most simple representations (‘blood!’) mirror universals or particulars taken singly; more complex representations – such as therapeutic schemas, diagnostic protocols, scientific texts, pathway diagrams – mirror more complex portions of
reality, their constituent sub-representations being joined together in ways designed to mirror salient relations on the side of reality.

In the ideal case a representation would be such that all portions of reality salient to the purposes for which it was constructed would have exactly one corresponding unit in the representation, and every unit in the representation would correspond to exactly one salient portion of reality [19]. Unfortunately, in a domain like biomedicine, the ideal case will likely remain forever beyond our grasp. Researchers working on the level of universals may fall short by creating representations which either (i) fail to include general terms for universals which are salient to their domain, or (ii) include general terms which do not in fact denote any universals at all. Similarly, clinicians working on the level of particulars may fall short of the best case by creating misdiagnoses, either (i) by failing to acknowledge particulars which do exist and which are salient to the health of a given patient, or (ii) by using representational units assumed to refer to particulars where no such particulars exist.

A TAXONOMY is a tree-form graph-theoretic representational artifact with nodes representing universals or classes and edges representing is_a or subset relations.

An ONTOLOGY is a representational artifact, comprising a taxonomy as proper part, whose representational units are intended to designate some combination of universals, defined classes, and certain relations between them [13].

A REALISM-BASED ONTOLOGY is built out of terms which are intended to refer exclusively to universals, and corresponds to that part of the content of a scientific theory that is captured by its constituent general terms and their interrelations.

A TERMINOLOGY is a representational artifact consisting of representational units which are the general terms of some natural language used to refer to entities in some specific domain.

An INVENTORY is a representational artifact built out of singular referring terms such as proper names or alphanumeric identifiers. Electronic Health Records (EHRs) incorporate inventories in this sense, including both terms denoting particulars (‘patient #347’, ‘lung #420’) and more complex expressions involving terms designating universals and defined classes (‘the history of cancer in patient #347’s family’) [20].

In the best case, again, each of the representational artifacts listed above (ontologies, taxonomies, inventories) will be such that its representational units stand in a one-to-one correspondence with the salient entities in its domain. In practice, however, such artifacts can be classified on the basis of the various ways in which they fall short of this best case, in terms of properties such as correctness, degree of structural fit, degree of completeness and degree of redundancy [16,18]. By exploiting such classifications we can measure the quality improvements made in successive versions, and also use such measures as a basis for further improvement [20].

To make a representation interpretable by a computer, it must be published in a language with a formal semantics and so converted into a FORMALIZED REPRESENTATION. The choice of language will depend on the complexity of what one needs to express and on the sorts of reasoning one needs to perform. While OWL, for example, can cope well with defined classes, it may not have sufficient expressive power to meet the needs of ontologies in the life sciences domain. Thus it seems to be incapable, for example, of capturing the relations involved even in simple interactions among pluralities of continuants, or of capturing the changes which take place in such continuants (for example growth of a tumor) over time [21,22].

Most inventories in the biomedical field (including most EHRs) have as yet hardly exploited the powers of formal reasoning. The paradigm of Referent Tracking represents an exception to this rule [20], since it involves precisely the embedding of a highly structured representation of particulars in a formalized representation of the corresponding universals.

THE CONCEPT ORIENTATION

We believe that ontologies, inventories and similar artifacts should consist exclusively of representational units which are intended to designate entities in Level 1 reality. Defenders of the concept orientation in medical terminology development have offered a series of arguments against this view, to the effect that such terminologies should include also (or exclusively) representational units referring to what are called ‘concepts’ [23].

First is what we can call the argument from intellectual modesty, which asserts that it is up to domain experts, and not to terminology developers, to answer for the truth of whatever theories the terminology is intended to mirror. Since domain experts themselves disagree, a terminology should embrace no claims as to what the world is like, but reflect, rather, the coagulate formed out of the concepts used by different experts.

Against this, it can be pointed out that communities working on common domains in the medical as in other scientific fields in fact accept a massive and ever-growing body of consensus truths about the entities in these domains. Many of these truths are, admittedly, of a trivial sort (that mammals have hearts, that organisms are made of cells), but it is precisely such truths which form the core of science-
based ontologies. Where conflicts do arise in the                  Some patients do, after all, believe that they are
course of scientific development, these are highly                 James Bond, or that they see unicorns. The realist
localized, and pertain to specific mechanisms, for                 approach is however perfectly well able to
example of drug action or disease development,                     comprehend also phenomena such as these, even
which can serve as the targets of conflicting beliefs              though it is restricted to the representation of what is
only because researchers share a huge body of                      real. For the beliefs and hallucinatory episodes in
presuppositions.                                                   question are of course as real as are the persons who
   We can think of no scenario under which it would                suffer (or enjoy) them. And certainly such beliefs and
make sense to postulate special entities called                    episodes may involve concepts (in the properly
‘concepts’ as the entities to which terms subject to               psychological sense of this term). But they are not
scientific dispute would refer. For either, for any such           about concepts, they do not have concepts as their
term, the dispute is resolved in its favor, and then it is         targets – for they are intended by their subjects to be
the corresponding level 1 entity that has served as its            about entities in flesh-and-blood external reality.
referent all along; or it is established that the term in             Fourth, is the argument from medical history. The
question is non-designating, and then this term is no              history of medicine is a scientific pursuit; yet it
longer a candidate for inclusion in a terminology. We              involves use of terms such as ‘diabolic possession’
cannot solve the problem that we do not know, at                   which, according to the best current science, do not
some given stage of scientific inquiry, to which of                refer to universals in reality. But again: the history of
these groups a given term belongs, by providing such               medicine has as its subject-domain precisely the
terms instead with guaranteed referents called                     beliefs, both true and false, of former generations
‘concepts’. It may, finally, be the case that it is not the        (together with the practices, institutions, etc.
disputed term itself which is at issue, but rather some            associated therewith). Thus a term like ‘diabolic
more complex expression, as when we talk about ‘G.                 possession’ should be included in the ontology of this
E. Stahl’s concept of phlogiston’, but that the latter             discipline in the first place as component part of
refers to some entity – a concept – in (psychological)             terms designating corresponding classes of beliefs. In
reality is precisely not subject to scientific dispute.            addition it may appear also as part of a term
   Sometimes the argument from intellectual modesty                designating some fiat collection of those diseases
takes an extreme form, for example on the part of                  from which the patients diagnosed as being possessed
those for whom reality itself is seen as being                     were in fact suffering. The evolution of our thinking
somehow unknowable (‘we can only ever know our                     about disease can then be understood in the same way
own concepts’). Arguments along these lines are of                 that we deal with theory change in other parts of
course familiar from the history of philosophy. Stove              science, as a reordering of our beliefs about the
provides the definitive refutation.24 Here we need note            ontological validity and salience of specific families
only that they run counter not just to the successes,              of terms – and once again: concepts themselves play
but to the very existence, of science and technology               no role as referents.20,26
as collaborative endeavors.                                           Fifth, is the argument from syndromes. The
   Second, is the argument from creativity. Designer               subject-matters of biology and medicine are, it is
drugs are conceived, modeled, and described long                   held, replete with entities which do not exist in reality
before they are successfully synthesized, and the                  but are rather convenient abstractions. A syndrome
plans of pharmaceutical companies may contain                      such as congestive heart failure, for example, is
putative references to the corresponding chemical                  nothing more than a convenient abstraction, used for
universals long before there are instances in reality.             the convenience of physicians to collect together
But again: such descriptions and plans can be                      many disparate and unrelated diseases which have
perfectly well apprehended even within terminologies               common final manifestations. Such abstractions are, it
and ontologies conceived as relating exclusively to                is held, mere concepts.
what is real. Descriptions and plans do, after all,                   According to the considerations on fiat
exist. On the other hand it would be an error to                   demarcations advanced above, however, syndromes,
include in a scientific ontology of drugs terms                    pathways, genetic networks and similar phenomena
referring to pharmaceutical products which do not yet              are indeed fully real – though their reality is that of
(and may never) exist, solely on the basis of plans and            defined (fiat) classes rather than of universals. A
descriptions. Rather, such terms should be included                similar response can be given also in regard to the
precisely at the point where the corresponding                     many human-dependent delineations used in
instances do indeed exist in reality, exactly in                   expressions like ‘obesity’ or ‘hypertension’ or
accordance with our proposals above.                               ‘abnormal curvature of spine’. These terms, too, refer
   Third, is what we might call the argument from                  to entities in reality, namely to defined classes which
unicorns. Some of the terms needed in medical                      rest on fiat thresholds established by consensus
terminologies refer, it is held, to what does not exist.           among physicians.



                                                              62
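The idea of a defined class resting on a fiat threshold can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: the `Patient` record, the `defined_class` helper, the sample measurements, and the two cut-off values (30 kg/m² is a commonly used body-mass-index cut-off for obesity in adults; 28 is an arbitrary alternative). The point of the sketch is that the particulars and their measurements stay fixed, while membership in the class depends on where the threshold is drawn.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    """A particular, identified by a singular referring term (hypothetical record)."""
    identifier: str
    weight_kg: float
    height_m: float

    @property
    def bmi(self) -> float:
        # Body-mass index: weight in kilograms divided by height in metres squared.
        return self.weight_kg / self.height_m ** 2


def defined_class(patients, threshold):
    """Return the identifiers of all patients whose BMI meets the fiat threshold."""
    return {p.identifier for p in patients if p.bmi >= threshold}


# Illustrative data only; none of these values comes from the paper.
patients = [
    Patient("patient #347", 95.0, 1.75),
    Patient("patient #348", 88.0, 1.75),
    Patient("patient #349", 60.0, 1.70),
]

# Same particulars, two different fiat thresholds (both assumed for illustration).
obese_at_30 = defined_class(patients, 30.0)
obese_at_28 = defined_class(patients, 28.0)

print(sorted(obese_at_30))
print(sorted(obese_at_28))
```

On this reading the term ‘obesity’ still designates something in reality, namely a class of actual patients, even though the boundary of that class is set by agreement rather than discovered.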
Sixth is the argument from error. When erroneous entries are entered into a clinical record and interpreted as being about level 1 entities, then logical conflicts can arise. For Rector et al., this implies that the use of a meta-language should be made compulsory for all statements in the EHR, which should be, not about entities in reality, but rather about what are called ‘findings’.25 Instead of p and not p, the record would contain entries like: McX observed p and O’W observed not p, so that logical contradiction is avoided. The terms in terminologies devised to serve such EHRs would then one and all refer not to diseases themselves, but rather to mere ‘concepts’ of diseases. This, however, blurs the distinction between entities in reality and associated findings, and opens the door to the inclusion in a terminology of problematic findings-related expressions such as SNOMED’s ‘absent nipple’, ‘absent leg’, etc. Certainly clinicians need to record such findings. But then their findings are precisely that a leg is absent; not that a special kind of (‘absent’) leg is present.

In the domain of scientific research we do not embargo entirely the making of object-language assertions simply because there might be, among the totality of such assertions, some which are erroneous. Rather, we rely on the normal workings of science as a collective, empirical endeavor to weed out error over time, providing facilities to quarantine erroneous entries and resolve logical conflicts as they are identified. We have argued elsewhere that these same devices can be applied also in the medical context.26

The argument for the move to the meta-level is sometimes buttressed by appeal to medico-legal considerations seen as requiring that the EHR be a record not of what exists but of clinicians’ beliefs and actions. Yet the forensic purposes of an audit trail can equally well be served by an object-language record if we ensure that meta-data are associated with each entry identifying by whom the pertinent data were entered, at what time, and so forth.

On the other side, moreover, even the move to meta-level assertions would not in fact solve the problems of error, logical contradiction and legal liability. For the very same problems arise not only when human beings are describing, on the object-level, fractures, or pulse rates, or symptoms of coughing or swelling, but also on the meta-level when they are describing what clinicians have heard, seen, thought and done. The latter, too, are subject to error, fraud, and disagreement in interpretation.

Seventh is the argument from borderline cases. As we have already noted above, there is at any given stage no bright line between those general terms properly to be conceived as designating universals and those designating merely ‘concepts’ (or defined classes). Certainly there are, at any given stage in the development of science, clear cases on either side: ‘electron’ or ‘cell’, on the one hand, and ‘fall on stairs or ladders in water transport NOS, occupant of small unpowered boat injured’ (Read Codes) on the other. But there are also borderline cases such as ‘alcoholic non-smoker with diabetes’, or ‘age-dependent yeast cell size increase’, which call into question the very basis of the distinction.

In response, we note first the general point that arguments from the existence of borderline cases in general have very little force. For otherwise they would allow us to prove from the existence of people with borderline complements of hair that there is no such thing as baldness or hairiness.

As to the specific problem of how to classify borderline expressions, this is a problem not for terminology, but rather for empirical science. For borderline terms of the sorts mentioned will, as an inevitable concomitant of scientific advance, be in any case subjected to a filtering process based on whether they are needed for purposes of (for example therapeutically) fruitful classifications, and thus for the expression of scientific laws.

Science itself is thereby subject to constant update. A term taken to refer to a universal by one generation of scientists may be demoted to the level of non-designating term (‘phlogiston’) by the next. This means also that representational artifacts of the sorts considered in the above, because they form an integral part of the practice of science, should themselves be subject to continual update in light of such advance. But again: we can think of no circumstance in which updating of the sort in question would signify that phlogiston is itself a concept, or that some expression was at one or other stage being used by scientists with the intention of referring to ‘concepts’ rather than to entities in reality.

THE SEMIOTIC TRIANGLE

Finally, there is what we might call the argument from multiple perspectives. Different patients, clinicians and biologists have their own perspectives on one and the same reality. To do justice to these differences, it is argued, we must hold that their respective representations point, not to this common reality, but rather to their different ‘concepts’ thereof.

This argument has its roots in the work of Ogden and Richards, and specifically in their discussion of the so-called ‘semiotic triangle’, which is of importance not least because it embodies a view of meaning and reference that still plays a fateful role in the terminology standardization work of ISO.26

As Figure 1 makes clear, the triangle in fact refers not to ‘concepts’, but rather to what its authors call ‘thought or reference’,27 reflecting the fact that Ogden and Richards’ account is rooted in a theory of psychological causality. When we experience a
certain object in association with a certain sign, then memory traces are laid down in our brains in virtue of which the mere appearance of the same sign in the future will, they hold, ‘evoke’ a ‘thought or reference’ directed towards this object through the reactivation of impressions stored in memory.

Figure 1 – Ogden and Richards’ Semiotic Triangle

The two solid edges of the triangle are intended to represent what are held to be causal relations of ‘symbolization’ (roughly: evocation), and ‘reference’ (roughly: perception or memory) on the part of a symbol-using subject. The dashed edge, in contrast, signifies that the relation between term and referent – the relation that is most important for the discussion of terminology – is merely ‘imputed’.

The background assumption here is that multiple perspectives are both ubiquitous and (at best) only locally and transiently resolvable. The meanings words have for you or me depend on our past experiences of uses of these words in different kinds of contexts. Ambiguity must be resolved anew (and a new ‘imputed’ relation of reference spawned) on each successive occasion of use. From this, Ogden and Richards infer that a symbolic representation can never refer directly to an object, but rather only indirectly, via a ‘thought or reference’ within the mind.

It is a depsychologized version of this latter thesis which forms the basis of the concept orientation in contemporary terminology research. The terms in terminologies refer not to entities in reality, it is held, but rather to ‘concepts’ in a special ‘realm’. The latter are not transparent mediators of reference; rather they are its targets, and the job of the terminologist is to calibrate his list of terms in relation not to reality but to this special ‘realm of concepts’.26

The relation between terms in a terminology and the reality beyond becomes hereby obscured. Reality exists, if at all, only behind a conceptual veil – and hence familiar confusions according to which for example the concept of bacteria would cause an experimental model of disease, or the concept of vitamin would be ‘essential in the diet of man’.28

‘CONCEPTS’ AND ‘MODELS’

How, then, should ‘concept’ be properly treated in the terminology literature henceforth? There are of course sensible uses of this term, for example in the literature of psychology. In the terminology literature, however, ‘concept’ has been used in such a bewildering variety of confused and confusing ways that we recommend that it be avoided altogether.

It is tempting to suppose that, when considered extensionally, all of the mentioned alternative readings come down to one and the same thing, namely to an identification of ‘concept’ with what we have earlier called ‘defined class’. If ‘concept’ could be used systematically in this way in terminological circles, then this would, indeed, constitute progress of sorts, though the question would then arise why ‘defined class’ itself should not be used instead. Unfortunately, however, the proposal in question stands in conflict with the fact that ‘concept’ is used by its adherents to comprehend also putative referents even for terms – such as ‘surgical procedure not carried out because of patient’s decision’ – which do not designate defined classes because they designate nothing at all. Here again, we believe, a proper treatment would involve appeal to appropriate fiat classes, defined in terms of utterances, interrupted plans, expectations, etc. on the part of the subjects involved.

What, now, is to be said of terms such as ‘concept model’, ‘knowledge representation’, ‘information model’, and so forth referred to in our preamble above? To the extent that concept-based terminological artifacts consist in representations not of the reality on the side of the patient but rather of the entities in some putative ‘realm of concepts’, the term ‘concept model’ may be justified. This term is indeed used by SNOMED CT in its own self-descriptions, though given SNOMED’s scientific goals, we believe that, on the basis of the arguments given above, it should be abandoned. Still more problematic is the term ‘knowledge model’ or ‘knowledge representation’ (GALEN). For in the absence of a reference to reality to serve as benchmark, what could motivate a distinction between knowledge and mere belief?19 And what, in the absence of a reference to reality, could motivate adding or deleting terms in successive versions of a terminology, if every term is in any case guaranteed a reference to its own specially tailored ‘concept’?

As to ‘information model’, here one standard uncertainty concerns the relation between an entity in reality and the body of information used to ‘represent’ this entity in some information system. Is it information which is being ‘modeled’ in an information model, or the reality which this information is about? The documentation of the HL7 Reference Information Model (RIM)29 adds extra layers of uncertainty by conceiving its principal formulas as referring to the acts in which entities are observed for
example in a clinical context. Simultaneously, however, it conceives these formulas as referring also to the documentation of such acts for example in an information system. The apparent contradiction is to some degree resolved by the RIM on the basis of its assertion that there is in any case ‘no distinction between an activity and its documentation’.30

CONCLUSION

Drawing on our distinction of the three levels of reality, cognition and representational artifact we have sought to formulate an unambiguous terminology for describing ontologies and related artifacts. The proposed terminology allows us to characterize more precisely the sorts of things which go wrong when the distinction between these levels is ignored, or when one or other level is denied, so that the approach may also help in improving such artifacts in the future.

Acknowledgements

This work was supported by the Wolfgang Paul Program of the Humboldt Foundation, the Volkswagen Foundation, the European Union Semantic Mining Network, by BBSRC Grant BB/D524283/1, and by the NIH Roadmap Grant U54 HG004028. Thanks are due also to Jim Cimino, Chris Chute, Gunnar Klein, Alan Rector, Stefan Schulz, and Kent Spackman for fruitful discussions.

References

(URLs last accessed July 1, 2006)
1. Smith B. Beyond concepts, or: Ontology as reality representation, Formal Ontology in Information Systems (FOIS 2004), p. 73-84.
2. http://www.w3.org/2003/glossary.
3. Patel-Schneider PF, Hayes P, Horrocks I. OWL Web Ontology Language. 2004. http://www.w3.org/TR/owl-semantics.
4. Spackman KA, Reynoso G. Examining SNOMED from the perspective of formal ontological principles. Workshop on Formal Biomedical Knowledge Representation (KR-MED 2004), p. 72-80.
5. Johansson I. Bioinformatics and biological reality. J Biomed Inform. 2006;39(3):274-87.
6. Klein GO, Smith B. Concept systems and ontologies. http://ontology.buffalo.edu/concepts/ConceptsandOntologies.pdf.
7. http://ncbo.us/.
8. http://obo.sourceforge.net/.
9. http://obofoundry.org/.
10. Rosse C, Mejino JL, Jr. A reference ontology for biomedical informatics: the Foundational Model of Anatomy. J Biomed Inform 2003;36:478-500.
11. http://geneontology.org/.
12. Wittgenstein L. 1921 Tractatus Logico-Philosophicus, London: Routledge, 1961.
13. Smith B, Ceusters W, Klagges B et al. Relations in biomedical ontologies. Genome Biol, 2005;6(5):R46.
14. Bittner T, Donnelly M, Smith B. Individuals, universals, collections. Formal Ontology in Information Systems (FOIS 2004), p. 37-48.
15. Drummond N. Introduction to ontologies. http://www.cs.man.ac.uk/~drummond/presentations/IntroductionToOWL50mins.ppt.
16. Ceusters W, Smith B. A realism-based approach to the versioning and evolution of biomedical ontologies. Proc AMIA Symp 2006, in press.
17. Smith B. Fiat objects. Topoi, 2001;20(2):131-48.
18. Bittner T, Smith B. A theory of granular partitions. Foundations of Geographic Information Science, London, 2003, p. 117-51.
19. Smith B. From concepts to clinical reality, J Biomed Inform. 2006 Jun;39(3):288-98.
20. Ceusters W, Smith B. Strategies for referent tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
21. Bera P, Wand Y. Analyzing OWL using a philosophy-based ontology. Formal Ontology in Information Systems (FOIS 2004), p. 353-62.
22. Kazic T. Putting semantics into the semantic web: How well can it capture biology? Pac Symp Biocomputing 2006;11:140-51.
23. Cimino JJ. In defense of the desiderata. J Biomed Inform. 2006;39:299-306.
24. Franklin J. Stove’s discovery of the worst argument in the world. Philosophy 2002;77:615-24. www.maths.unsw.edu.au/~jim/worst.pdf.
25. Rector A, Nolan W, Kay S. Foundations for an electronic medical record. Methods Inf Med, 1991;30:179-86.
26. Smith B, Ceusters W, Temmerman R. Wüsteria, Stud Health Technol Inform. 2005;116:647-652.
27. Ogden CK, Richards IA. The Meaning of Meaning. 3rd ed. New York, 1930.
28. The UMLS Semantic Network. http://semanticnetwork.nlm.nih.gov/.
29. HL7 V3 Reference Information Model: Version V 01-20. Normative Ballot 11/22/2005.
30. Smith B, Ceusters W. HL7 RIM: An incoherent standard, Proc MIE, 2006, p. 133-138.