=Paper= {{Paper |id=None |storemode=property |title=When owl:sameAs isn't the Same: An Analysis of Identity Links on the Semantic Web |pdfUrl=https://ceur-ws.org/Vol-628/ldow2010_paper09.pdf |volume=Vol-628 |dblpUrl=https://dblp.org/rec/conf/www/HalpinH10 }} ==When owl:sameAs isn't the Same: An Analysis of Identity Links on the Semantic Web== https://ceur-ws.org/Vol-628/ldow2010_paper09.pdf
When owl:sameAs isn’t the Same: An Analysis of Identity Links on the Semantic Web

Harry Halpin
Institute for Communicating and Collaborative Systems
University of Edinburgh
2 Buccleuch Place
Edinburgh, United Kingdom
H.Halpin@ed.ac.uk

Patrick J. Hayes
Institute for Human and Machine Cognition
40 South Alcaniz St.
Pensacola, USA
phayes@ihmc.us

Copyright is held by the author/owner(s).
LDOW2010, April 27th, 2010, Raleigh, North Carolina.

ABSTRACT
In Linked Data, the use of owl:sameAs is ubiquitous in ‘inter-linking’ data-sets. However, there is a lurking suspicion within the Linked Data community that this use of owl:sameAs may be somehow incorrect, in particular with regard to its interactions with inference. In fact, owl:sameAs can be considered just one type of ‘identity link,’ a link that declares two items to be identical in some fashion. After reviewing the definitions and history of the problem of identity in philosophy and knowledge representation, we outline four alternative readings of owl:sameAs, showing with examples how it is being (ab)used on the Web of data. Then we present possible solutions to this problem by introducing alternative identity links that rely on named graphs.

Categories and Subject Descriptors
H.3.d [Information Technology and Systems]: Metadata

General Terms
Knowledge Representation

Keywords
Linked Data, ontology, resource, Web architecture

1.   INTRODUCTION
As large numbers of independently developed data-sets have been introduced to the Web as Linked Data, the vexing problem of identity has returned with a vengeance to the Semantic Web. As the ubiquitous owl:sameAs property is used as the RDF property to connect these data-sets, it has been dubbed the owl:sameAs problem. However, the problem of identity lies not within Linked Data or within the Semantic Web languages, but is an outstanding and well-known, if sometimes not precisely labeled, issue in pre-Semantic Web knowledge representation languages in artificial intelligence. What is new in the latest guise of this problem on the Web of Linked Data is that this is the first time the problem is being encountered by different individuals attempting to independently knit their knowledge representations together using the same standardized language. Much of the supposed “crisis” over the proliferation of owl:sameAs in Linked Data can be traced to the fact that these uses of owl:sameAs tend to be mutually incompatible, and almost always violate the rather strict logical semantics of identity demanded by owl:sameAs. However, the exact types of distinctions made by these individuals are important, even if they contradict the relevant specification of owl:sameAs. First, these uses and abuses of owl:sameAs demonstrate for the first time in the history of knowledge representation how precisely these problems play out in the wild. Second, as the Semantic Web is a project in development, it is always possible to specify new kinds of language constructs and more clearly specified best practices to align the specifications with the actual empirical use of the Semantic Web in the wild.

First, we will give an overview of the problem of identity and its somewhat dusty lineage in artificial intelligence, if only to show how what was already a known issue for knowledge representation becomes even more exacerbated when knowledge representation goes global for the Semantic Web. Then, four distinct uses of owl:sameAs are discussed in addition to the precise idea of “same thing as,” namely:

   • Same Thing As But Different Context
   • Same Thing As But Referentially Opaque
   • Represents
   • Very Similar To

Finally, a number of suggestions for how the current situation can be improved are sketched. The necessity of both semantic and theoretical work is given as well.

2. A BRIEF HISTORY OF IDENTITY
The problem of identity has a long and chequered history, spreading from philosophy and mathematics to linguistics and knowledge representation. In each of these fields, what it means for two things to be identical goes straight to the heart of semantics.

2.1 What is Identity?
The father of knowledge representation, Leibnitz, was not surprisingly also the first to phrase a coherent and formalizable definition of identity, often called ‘Leibnitz’s Law’ or ‘the Identity of Indiscernibles,’ namely that for every x
and every y, if whatever is true of x is true of y, then x is identical to y [1]. This notion of identity states that identity is composed of properties, so that in order for two things to be the same they must share all the same properties. This law can then be stated logically as ∀x∀y(∀P.(P(x) ↔ P(y)) → x = y). If x = y, then they are the same thing, so of course all the properties of x are also properties of y: there is only one thing there, to either have the property or not have it.

A number of classical problems already crop up in this analysis of identity. First, exactly what properties are being counted? Obviously, we can imagine worlds where things have the exact same properties but are nonetheless not identical, such as an exact clone (which is all too easy with digital objects). There are two obvious escape hatches: a thing may have a property of “being=itself” (a haecceity), or things having different temporal-spatial co-ordinates could be counted as different, even if they share the rest of their properties in common. While that sounds like a common-sense distinction, is it true that Tim Berners-Lee is the same person he was when he was a child? Or if he lost his arm? This leads straight into arguments about perdurance and endurance in philosophy. Are there two different kinds of properties, properties that are somehow intrinsic to identity and others that are extrinsic, i.e. purely relative to other things? Lastly, perhaps the real question is who or what determines the conditions of identity, namely that claims of identity can only be made in the context of an explicit theory of identity criteria. These theories can be formalized in terms of the semantic interpretation of sentences to referents. However, this does not mean that these theories are compatible. If one has a theory of identity where one is talking only about people as employees in a particular role, then two different people who have the same job will be identical, but if one has a more fine-grained interpretation the very same people would be different. One can even imagine theories of identity based on different criteria, where some theories of identity subsume weaker or stronger ones. There is also the problem of vagueness, and the inability to specify precise properties (such as the exact latitude and longitude boundaries of Mount Everest). Regardless of the problems, the point of Leibnitz’s Law is clear: When someone says two things are the same, they mean they share all the same properties.

Frege was the first to note what Galois called the “linguistic counterpart” to Leibnitz’s Law: the Principle of Substitution, which states that if x and y are identical, then x may be substituted for y without changing the truth-value of the sentence in which the substitution is made [6]. Formally, this is generally phrased in precisely the same manner as Leibnitz’s Law. However, there is a subtle difference, for while Leibnitz’s Law deals with the identity of concepts and individuals on the level of properties, the Principle of Substitution deals with the use of the name itself in the context of actual sentences.

There are two important consequences. The first is the classic division between use and mention of a name. Even if “Mount Everest” in English and “Qomolangma” in Tibetan mean the same thing, names from different languages often cannot be substituted for each other. Sentences like “Qomolangma is a word in Tibetan” mention a name, while the sentence “Mount Everest is the highest mountain in the world” uses the name. Obviously, “Mount Everest is a word in Tibetan” is false. Is “Qomolangma is the highest mountain in the world” true? This fact was not necessarily known in Tibet before the era of global geological surveys. So one could easily have a case of a geoscientist who has never visited the mountain knowing it is the highest mountain in the world and a Tibetan monk who lives not too far from the mountain not knowing, or caring, if it is the highest mountain in the world. This distinction was called the distinction between two names having the same referent but different senses, i.e. contexts that do or do not share certain information [7]. Often contexts where a name can be substituted are called extensional, while referentially opaque contexts where a name cannot be substituted are intensional. In general, indirect quotations and statements of belief, such as “Rajendra Pachauri believes the glaciers on Mount Everest are melting,” are considered opaque. Although in practice the principle of substitution is subtle and its use often fraught with confusion, the key point is straightforward: A name can identify different things in different contexts.

2.2 The IS-A Debate in Semantic Networks
It would be easy to dismiss these arguments over identity as being mandarin philosophical questions, until the ‘pedal hits the metal’ in the world of knowledge representation. This is precisely what happened to semantic networks, a predecessor of the Semantic Web in knowledge representation. Semantic networks, as pioneered by Quillian, were viewed as an alternative knowledge representation scheme to first-order logic in the early days of artificial intelligence [12]. In essence, a semantic network is similar to an RDF graph except that instead of using URIs, the nodes and edges were labeled purely using natural language or pseudo-natural language labels.

Semantic networks, by relying on words from natural language or pseudo-words to label their constructs, whose meanings were somehow supposed to be simply obvious, actually led to these constructs being ambiguous. The classic example was the infamous IS-A label examined by Brachman in his “What IS-A is and isn’t: An Analysis of Taxonomic Links in Semantic Networks” [3]. Often, two nodes were connected by an IS-A link. Were IS-A links assertional, such that somehow two nodes connected by an IS-A were identical? Or were they taxonomic, such that they meant a sub-class or subset relationship? Or a structural relationship between a concept and an instance? Brachman found that there was a proliferation of meanings of IS-A links in semantic networks, and that not only were they incompatible between different semantic networks, but that IS-A links were often given different meanings within a single network [3]. Given this lack of clarity about exactly what was being represented by the knowledge representation, semantic networks could not be transferred or combined with each other with any degree of reliability.

In an effort to remedy this crisis, Brachman and others decided to split what they called the “epistemological level” (the kinds of nodes and edges that remained neutral to the underlying primitives yet could be given a specific semantics) from the rest of the semantic network, whose meaning could only be grounded in some linguistic convention [4]. These logical constructs could be given a formal semantics (and thus a model theory) by mapping them to a language with a well-defined semantics, such as first-order predicate calculus. Therefore, semantic networks could be considered just an
intuitive (or slightly odd, depending on your preferences) notation for logic.

The Semantic Web seems to have learned from semantic networks. The formal semantics of RDF are important precisely because RDF statements can be given the same logical meaning uniformly across a distributed network, even if the semantics of RDF have relatively light inferential power and do not constrain the semantic interpretations very tightly [8]. This is also on purpose, as it allows RDF to be - in theory - used as a foundation or “glue” for other more constrained vocabularies. Furthermore, it was precisely the explorations of the semantics of “semantic” networks that led to description logic, and so OWL. By giving OWL and RDF a formal semantics - albeit a very limited one - it was imagined that the Semantic Web would not repeat the mistakes of semantic networks.

3.   THE IDENTITY CRISIS OF LINKED DATA
Contrary to popular belief in some circles, formal semantics are not a silver bullet. Just because a construct in a knowledge representation language is prescribed a behavior using formal semantics does not necessarily mean that people will follow those semantics when actually using that language “in the wild.” This can be put down to a wide variety of reasons. In particular, the language may not provide the facilities needed by people as they actually try to encode knowledge, so they may use a construct that seems close enough to their desired one. A combination of not reading specifications - especially formal semantics, which even most software developers and engineers lack training in - and the labeling of constructs with “English-like” mnemonics will naturally lead to the use of a knowledge representation language by actual users that varies from what its designers intended. In decentralized systems like the Semantic Web, this problem is naturally exacerbated. However, far from being a sign of abuse, it is a sign of success, as it means that the Semantic Web is actually being deployed outside academia and research labs.

The problem has definitely arisen on the Semantic Web in terms of the use of owl:sameAs in Linked Data. In Linked Data, each item of interest is given a URI, which in turn redirects to either human-readable HTML or machine-readable RDF depending on content negotiation. The URI for the item itself identifies what is called, rather confusingly, a “non-information resource” in Linked Data circles; a web-page or RDF graph, by contrast, would be an information resource, as the “distinguishing characteristic of these resources is that all of their essential characteristics can be conveyed in a message” [9]. Usually, this data is released in some sort of automated or semi-automated manner, often by mapping relational data to RDF. Somehow, a URI is chosen for each identifier in the data-set that is exported in Linked Data. Although the general thinking in RDF (and thus, the main idea behind the ability of RDF graph merge) was that URIs would be re-used, in practice URIs are simply minted anew for each identifier in a Linked Data set. As opposed to the simple exporting of data-sets into RDF, what puts the links in Linked Data is the use of what we term identity links - links that define two things to be identical or otherwise closely related - to link between diverse and heterogeneous data-sets. While there has been some research that deals with this problem [10], the scope of the problem is just beginning to be understood.

The most typical link used is owl:sameAs, which is in general used to say “that two URI references actually refer to the same thing” [2]. For example, the city of Paris is referenced in a number of different Linked Data-sets, ranging from OpenCyc to the New York Times. In DBpedia, a Linked Data export of Wikipedia, these data-sets are connected by owl:sameAs. In particular, dbpedia:Paris is linked by owl:sameAs to both opencyc:CityOfParisFrance and opencyc:Paris DepartmentFrance. OpenCyc distinguishes the department of Paris from Paris itself, as Paris DepartmentFrance is a distinct geopolitical entity from CityOfParisFrance despite the fact that both share the same territory, while Wikipedia does not make this distinction.

4. THE SEMANTICS OF OWL:SAMEAS AND ALTERNATIVES
At first, this use of owl:sameAs seems to be harmless. Its actual definition is that “the built-in OWL property owl:sameAs links an individual to an individual” and “Such an owl:sameAs statement indicates that two URI references actually refer to the same thing: the individuals have the same identity” [13]. Furthermore, OWL states that “It is unrealistic to assume everybody will use the same name to refer to individuals. That would require some grand design, which is contrary to the spirit of the web” [13].

However, owl:sameAs does have a particular semantics of individual identity, namely that the two individuals are exactly the same and so share all the same properties, and thus are equivalent in terms of Leibnitz’s identity of indiscernibles. Given that OWL has no unique name assumption, once owl:sameAs is applied to two different URIs, any statement that is made about one URI is true for every other URI linked to it by owl:sameAs. Furthermore, while in OWL Full owl:sameAs can be asserted between any URIs, since classes can be considered “individual” instances of other classes and properties can be considered individuals, in OWL DL individuals are strictly separated from classes in order to preserve decidability, and so one should use the OWL DL constructs equivalentClass and equivalentProperty instead. Therefore, quick-and-dirty use of owl:sameAs will almost always lead to OWL Full, for which very little work has been done on efficient implementations of inference. The real trick with owl:sameAs is that it works both ways: it is both symmetric and transitive, so anyone can link to your data-set with owl:sameAs from anywhere else on the Web without your permission, and any statement they make about their own URI will immediately apply to yours. As one can imagine, such transitive closures can quickly get very large. There have been considerable rumors in the Linked Data community that such use of owl:sameAs is somehow “wrong” with regard to the formal semantics of OWL. It does seem intuitively that the use of owl:sameAs may be the logical equivalent of a bulldozer. Since inference is rarely used on Linked Data, these possible side-effects have not been noticed. Does this really matter? Is the use of owl:sameAs an exploding time-bomb for Linked Data, or a harmless convention? What exactly is the point of linking data if nobody is going to draw any conclusions which use the links?

5. FOUR VARIATIONS OF IDENTITY IN LINKED DATA
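Before looking at the individual variations, it may help to make concrete the strict reading that each of them departs from. The following is a minimal sketch in Python, not an OWL reasoner. It treats owl:sameAs as symmetric and transitive, groups URIs into identity classes with a union-find structure, and then copies every statement about one URI to all URIs in its class (only subjects are rewritten, for brevity). The triples, the `ex:` property names, the `other:Na` URI and the helper names `same_as_classes` and `smush` are illustrative assumptions, not data taken from DBpedia or OpenCyc.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"


def same_as_classes(triples):
    """Group URIs into identity classes under symmetric, transitive owl:sameAs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)  # merge the two identity classes

    classes = defaultdict(set)
    for uri in list(parent):
        classes[find(uri)].add(uri)
    return list(classes.values())


def smush(triples):
    """Apply every non-sameAs statement to all URIs identical to its subject."""
    members = {}
    for cls in same_as_classes(triples):
        for uri in cls:
            members[uri] = cls
    inferred = set()
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for alias in members.get(s, {s}):
            inferred.add((alias, p, o))
    return inferred


# Hypothetical data: two publishers describe "sodium" differently,
# and a third party links in without asking either of them.
triples = [
    ("dbpedia:Sodium", SAME_AS, "opencyc:Sodium"),
    ("other:Na", SAME_AS, "opencyc:Sodium"),
    ("opencyc:Sodium", "ex:neutronCount", "12"),
    ("dbpedia:Sodium", "ex:includesIsotopes", "true"),
]

for triple in sorted(smush(triples)):
    print(triple)
```

With this hypothetical input, the neutron count stated about `opencyc:Sodium` is also inferred for `dbpedia:Sodium` and for `other:Na`, which is the kind of side-effect of the strict semantics described in Section 4.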
5.1 Same Thing As But Referentially Opaque
The first case is when the two URIs do refer to the same thing, but not all the properties ascribed to one URI are necessarily accepted by the other. This means that the use of the URI is referentially opaque, which means that one URI cannot be substituted for another (the Principle of Substitution is violated), i.e. the context is intensional. A classic example of this would be the concept of sodium in DBpedia, which has an owl:sameAs link to the concept of sodium in OpenCyc. The OpenCyc ontology says that an element is the set (class) of all pieces of the pure element, so that for example sodium in Cyc has a member which is a lump of pure metallic sodium. On the other hand, sodium as defined by DBpedia is used to also include isotopes, which have a different number of neutrons than “standard” sodium. So, one should not state the number of neutrons in DBpedia’s use of sodium, but one can with OpenCyc. Therefore, owl:sameAs here is in error, as it does not allow mutual substitutivity. Indeed, this use of URIs in an opaque referential context is likely what most uses of owl:sameAs actually are for, as it is unlikely that most deployers of Linked Data actually check whether or not all the properties and their associated inferences are shared amongst linked data-sets. This property is exceedingly important for Linked Data, as contrary to popular doctrine, it is possible that the Web is full of referentially opaque contexts. The problem is that there is currently no way to get a handle on contexts informally without descending into non-logical reasoning.

5.2 Same Thing As But Different Context
In this case, two URIs do refer to the same thing and all properties do hold of both URIs, but we cannot re-use the URI in a different context. The central intuition here is that there are ‘forms of reference’ appropriate to a context, especially in social contexts. To use an example from Lynn Stein, when at a meeting of the PTA (Parent-Teacher Association) she is Ms. Stein, Rachel’s mum, not Professor Stein of MIT. This does not mean that in the PTA meeting Ms. Stein is somehow not a professor, but that within that context those properties do not matter. At first, this distinction may not seem directly relevant to linked data, provided we keep ‘name’ in the social sense distinct from ‘identifier’ in the Web sense. However, this distinction raises other issues about what kind of ‘names’ URIs really are and precisely why certain properties for linked data are given in the RDF description of a certain URI and others are not.

5.3 Represents
Often identity is conflated with representation. While the term “representation” is often very contentious, its intuitive definition is that, just as a picture of something depicts something, a URI can be for a representation of a thing rather than the thing itself. Intuitively, there seems to be a clear-cut line between that which represents something (the representation) and that which is represented (the referent), sometimes called the relationship between a “sign” and a “signifier.” However, the relationship is often not as clear-cut as we would lead ourselves to believe. In fact, in human natural language use-mention confusions are ubiquitous and often useful. For example, often a web-page or an e-mail address is used to refer to a person. Rather than yell at the world to get an education in philosophical logic, it may be better to clarify this relationship. It also might be worth distinguishing between using a representation to refer to the represented, such as using a picture of Berners-Lee to refer to Tim Berners-Lee himself, and using something accidentally or contextually to refer to something, a phenomenon called displaced reference. The example of using an e-mail box to refer to a person is not an error but rather a displaced reference.

5.4 Very Similar To
Sometimes it is clear that two things are not identical but simply closely related in some manner. This, for example, is the relationship between the district of Paris and the Department of Paris in Cyc. Furthermore, there are often complex, structured, yet hard-to-specify relationships between things, such as the relationship between isotopes and elements, the quantity and a measurement of a quantity, and an image and a facsimile of that image. In web architecture, it is clear there is a close relationship These relationships that are ‘very similar to’ seem to deserve their own property, but are often currently lumped together in Linked Data under the all-encompassing use of owl:sameAs. Most of the more noticeable errors of owl:sameAs seem to come from this category, and it is likely that examples such as the relationship of sodium within DBpedia to sodium in OpenCyc are of this kind as well.

6. MOVING FORWARD

6.1 Same Thing As But Referentially Opaque
Surprisingly, most of the time that people use owl:sameAs, they are accidentally performing a sort of implicit import of statements about the subject of the owl:sameAs statement. Obviously, to address the weaker identity implied by the referentially opaque use of identity, a weaker version of owl:sameAs should be specified that does not import all the properties in a full transitive closure. Somewhat similar predicates already exist in SKOS as skos:exactMatch and skos:closeMatch, but their use seems rare in Linked Data [11] and they require domains and ranges of SKOS concepts. As most Linked Data does not actually do much inference, one in reality only imports the statements that are actually used. So one could continue using owl:sameAs with a kind of ‘importer beware’ principle. Informally, it is one thing to link to your URI, but it is another thing to believe what you say about it as though you were talking about my URI. Put another way, one should be wary of accepting conclusions over here that could have been drawn over there, so to speak.

6.2 Same Thing As But Different Context
There is already a notion of context built into RDF, namely named graphs [5]. Even though it is not part of the official standard (albeit snuck into RDF through SPARQL and implemented in almost every tool-set), it is clear that part of the problem with owl:sameAs usage on the Semantic Web is that sameAs should not always be a statement between two URIs in an unqualified manner, but may be qualified as holding only within a certain named graph. Furthermore, note that the use of owl:sameAs is somewhat equivalent to an accidental usage of owl:imports, although the exact behavior of this construct has only been intuitively (although not formally) specified. These implicit imports should probably be separated, so that one states at first that two items are identical using the weaker form of identity given
above, and then independently, if one feels strongly that the two URIs are not referentially opaque, one imports all (or even some of) the associated properties of the “identical” resource. Furthermore, there could be an inverse of this implicit importing of identity, where statements that would be imported due to the transitive closure of owl:sameAs are explicitly not imported. This would allow a more fine-grained measure of control over the use of identity in named graphs.

6.3 Represents
There is already a statement of this kind in the FOAF vocabulary, the foaf:isPrimaryTopicOf statement. One possible solution to this problem would be to wrap such a property into some core W3C approved standard. However, the problem is that it is unclear if a strict separation between mention and use is necessary or even desirable. In many contexts, as relevant experience in OpenID deployment shows, using an e-mail as an identifier for a person is often more natural than the URI of a home-page, or even a “non-information resource.” What is needed how-

It is possible to do empirical studies of exactly how people use owl:sameAs in the wild. Examples of owl:sameAs can be taken from the Linked Data Web in the wild in order to determine how experimentally robust these distinctions are, i.e. do people actually use owl:sameAs in the four ways that are outlined above, and are there more possible ways that we are not aware of? In fact, even the ability to recognize these kinds of distinctions may vary quite wildly by background and training. Lastly, if a number of empirical distinctions between identity links that are currently conflated by owl:sameAs can be made in a robust manner, then there is considerable formal semantic work to be done. Giving the Linked Data community well-defined (both formally and informally) predicates should be done even when one does not think of the properties given to URIs as absolute truths given by Linked Data publishers or W3C specifications, but as functions of their actual use. The (ab)use of owl:sameAs in Linked Data is not a threat, it’s an opportunity.
ever, is a way to make the distinctions that either conflate      8. REFERENCES
or separate mention and use or on the fly. The use of weak         [1] H. Alexander. The Leibniz-Clarke correspondence.
identity statements - and in this case, a “represents” state-          Manchester University Press, Manchester, United
ment - and explicit importing and de-importing of properties           Kingdom, 1956. Republished 1998.
                                                                   [2] C. Bizer, R. Cygniak, and T. Heath. How to publish
within the context of particular named graphs would allow              Linked Data on the Web, 2007.
us to do state things like “Within this named graph and only           http://www4.wiwiss.fu-
within this named graph, the e-mail address URI is identi-             berlin.de/bizer/pub/LinkedDataTutorial/ (Last
cal to the person and shares their properties” and “Within             accessed on May 28th 2008).
this other named graph, the e-mail address represents the          [3] R. Brachman. What IS-A is and isn’t: An analysis of
person, but does not have all the properties of that person.”          taxonomic links in semantic networks. IEEE
                                                                       Computer, 16(10):30–36, 1983.
6.4 Very Similar To                                                [4] R. Brachman and J. Schmolze. An overview of the
                                                                       KL-ONE knowledge representation system. Cognitive
   Again, the tempting easy solution is simply to introduce            Science, 9(2):151–160, 1985.
a new predicate for “very similar to.” The SKOS vocabu-            [5] J. Carroll, C. Bizer, P. Hayes, and P. Stickler. Named
lary has a number of “matching” predicates that are close              graphs. Journal of Web Semantics, 4(3):247–267, 2005.
in meaning to this, ranging from hierarchically structured         [6] G. Frege. Begriffsschrift, eine der arithmetischen
skos:broadMatch and skos:narrowMatch to the more suit-                 nachgebildete Formelsprache des reinen Denkens.
able skos:closeMatch. However, the main issue with these               Halle, Germany, 1879.
predicates is that again, their use may be a matter of opin-       [7] G. Frege. Uber Sinn und Bedeutung. Zeitshrift fur
ion, as someone’s close match may be another person’s iden-            Philosophie and philosophie Kritic, 100:25–50, 1892.
                                                                       Reprinted in The Philosophical Writings of Gottlieb
tical match. One is also tempted to engage with some sort              Frege (1956), Blackwell, Oxford, United Kingdom
of “fuzzy” or numerically weighted uncertainty measure be-             (1956), translated by Max Black.
tween one and zero of identity, but then the real hard ques-       [8] P. Hayes and H. Halpin. In defense of ambiguity.
tions of where precisely will these real values come from              International Journal of Semantic Web and
and their relationship to actual probability theory muddies            Information Systems, 4(3), 2008.
these conceptual waters quickly. It seems that beneath this        [9] I. Jacobs and N. Walsh. Architecture of the World
apparently simple property is likely a whole family of het-            Wide Web. Technical report, W3C, 2004.
                                                                       http://www.w3.org/TR/webarch/ (Last accessed Oct
erogeneous and semi-structured identity relationships that             12th 2008).
should be studied more carefully and empirically observed         [10] A. Jaffri, H. Glaser, and I. Millard. Managing URI
before any hasty judgments are made.                                   synonymity to enable consistent reference on the
                                                                       Semantic Web. In Proceedings of the Workshop on
7.   CONCLUSION                                                        Identity, Reference, and the Web (IRSW) at
                                                                       ESWC2008, 2008.
  Obviously, the issue of how to express relationships of         [11] A. Miles and S. Bechhofer. SKOS Simple Knowledge
identity on Linked Data is more complex than just apply-               Organization System reference. W3c recommendation,
ing owl:sameAs. At the same time, a more nuanced ap-                   W3C, 2008. http://www.w3.org/TR/skos-reference/.
proach that fulfills the current four additional possible uses    [12] M. R. Quillian. Semantic memory. In M. Minsky,
of identity beyond owl:sameAs would be a useful step for               editor, Semantic Information Processing, pages
                                                                       216–270. MIT Press, Cambridge, Massachusetts, USA,
the Linked Data community. However, what becomes clear                 1968.
even after a cursory glance at possible solutions is that solv-   [13] C. Welty, M. Smith, and D. McGuinness. OWL Web
ing the issue of identity in Linked Data may require a certain         Ontology Language Guide. Recommendation, W3C,
refactoring of some core constructs of RDF, including relat-           2004. http://www.w3.org/TR/2004/REC-owl-guide-
ing identity to named graphs and to the use of imports on              20040210.
the Semantic Web.