Sketches, Views and Pattern-Based Reasoning

                                                           Ralph L. Wojtowicz
                                 Baker Mountain Research Corporation (ralphw@bakermountain.org)
                                          Shepherd University (rwojtowi@shepherd.edu)

    Abstract—The mathematical theory of sketches provides a             aligning knowledge models that differ due to simple renaming
graphical framework for describing and relating knowledge               and more complex reformulations of concepts.
representations and their models. Maps between sketches can
extract domain-specific context from a sketch, express knowledge            Examples of knowledge representations include relational
dynamics and be used to manage representations created for              algebra and its implementation in database languages (such
distinct applications or by different analysts. There are precise       as SQL), entity-relation-attribute diagrams, ontologies, data
connections between classes of sketches and fragments of first-         specifications and sketches. Our interest in the latter results
order, infinitary predicate logic. EA sketches are a particular class   from its graphical nature, deep connections between sketch
that is related to entity-attribute-relation diagrams and can be        theory and logic and a rich notion of contextual view of a
implemented using features available in many relational database
                                                                        sketch. Sketch theory has proved to be a valuable tool in
systems. In this paper we illustrate sketch theory through devel-
opment of a simple human terrain model. We apply the theory to          mathematical logic and the theory of computer programming
an example of aligning sketch-based knowledge representations           languages. Its relationship to other semantic technologies,
and compare the approach to one using OWL/RDF. We describe              therefore, warrants further exploration.
the computational infrastructure that is available for working
with sketches and outline research challenges.                              The purpose of this paper is to introduce the sketch data
                                                                        model to researchers and practitioners of other semantic-based
                                                                        technologies and to describe a program for its application. We
                       I.   I NTRODUCTION                               seek to give an overview of the theory through discussion
                                                                        and examples without focusing on the mathematical details.
    We use the term knowledge representation to refer to a              The following themes emerge. (1) An ontology or sketch
mathematical model of the concepts that we use to understand,           is a presentation of knowledge. Different presentations of
reason about and navigate our environment. It evolves in                the same knowledge are possible. The theory of a sketch
response to new experiences, concept formulation and the                is the formal mathematical object that such presentations
mission at hand. Ownership, membership, amicability, people             generate. (2) Alignment of distinct knowledge representations
and plans are examples of interrelated entities in this network.        and derivation of views of particular ones are more appro-
We use decision space to refer to a sets of individuals and             priately formulated using theories than presentations. (3) The
relationships that our knowledge representation organizes. This         sketch model emphasizes the distinction between a knowledge
space is more dynamic, densely populated and uncertain than             representation and its models. Instances, incompleteness and
the knowledge representation. The concept of ownership, for             uncertainty may be more appropriately incorporated in models
example, encompasses a list of ephemeral connections between            rather than in knowledge representations themselves. (4) The
individuals and their possessions. Our understanding of owner-          software infrastructure available for working with sketches
ship persists while instances of this relationship come and go.         currently is meager compared to that which has been developed
Moreover, different people can share a common understanding             around other semantic technologies such as OWL/RDF.
of ownership even if the instances of this relationship that
they observe have little or no overlap. They apply the same
knowledge representation to distinct models.                            A. Concept of Operations

    Different knowledge representations may characterize the                Figure 1 illustrates an example concept of operations that
same concept in distinct ways. Renaming the concept ‘own-               shows how the sketch data model might be used in a decision
ership’ as Eigentum or propriété, for example, results in a           support system. Later in this paper we discuss details of partic-
new presentation of the concept. A complex idea may, more               ular aspects of the data pipeline. Data from distributed sources
generally, be decomposed into distinct, simpler concepts by             is marshaled into local data models S that are expressed as
different people. Finally, as we build a knowledge model to             sketches. The local sketches are aligned using sketch maps into
organize our observations of a greater range of phenomena,              a common parent T called a theory. T is the sketch generated
we frequently derive and extract parts of it that are suitable          by the local sketches taking into account potential overlaps.
for context-based reasoning about particular situations.                Parent sketches evolve over time as local ones are modified
                                                                        and new data sources come online. Within a particular mission
    A mathematical formulation of knowledge should dis-                 context, a view V of the system knowledge representation
tinguish between the knowledge representation and decision              T is extracted. The problem of mathematically characterizing
space models. It should support evolution of the former and             the context from event and decision histories is a challenging
the dynamics and uncertainty that are characteristics of the            one and is an active area of research for applications such
latter. The mathematical framework should support derivation            as Internet search. A view is then a sketch equipped with a
of context-specific views of a knowledge representation and             sketch map into the current parent theory. Models of sketches
a decision space. Finally, it should provide mechanisms for             (including views) are distinct from the sketches themselves.


                                                  STIDS 2013 Proceedings Page 117
They include the observed instances that populate the classes             sketch of human terrain knowledge. The vertices Person, For-
and relations that symbols in the sketch represent. Uncertain-            eign, Coalition, Resident, Village and TribalElement represent
ties and partial information are accounted for in the model, not          classes of entities. Individuals who populate these classes
in the sketch. Data artifacts relevant to a view are analyzed to          are typically not represented in G (although it is possible
estimate statistical metrics for potential future states. Figure 1        to include them explicitly). They instead occur in semantic
is conceptual and necessarily incomplete. It does not show, for           models of the sketch and may be realized as, for example,
example, the roles of user interface components, visualization            rows in database tables. The SeenIn vertex represents a relation
tools and query and reasoning engines, nor of event and                   between the two classes to which it has edges. It represents the
decision history archives which would be built into a real                situation in which foreigners may be observed in one or more
command decision system.                                                  local villages. Instances of this relation, like individuals who
                                                                          populate the classes, occur in models of the sketch instead of
                      data data                                           being represented in the sketch itself.
                     source source
                                                                                                            Person
                                                      Knowledge
                      S1        S2
                                                 representations:
                                                        sketches                                                                      has        TribalElement


                                                                                                                           Resident
          theories         T1        T2          ···


                                                                                         Foreign


                                                                                                                                                        settled by
                                                                                                              Coalition
                                                              context
                                                                                                                                      liv
                                                                                                                                         es
                                                                                                                                            in


                                                                                                   SeenIn
                           V1        V2          ···            view
                                                                                                                          based in                  Village

               M1               M2     M3        ···           models                                                     place
                                                                          Fig. 2.   Part of a graph representing human terrain knowledge
                                                            potential
                                                             futures
                                                                              Intuitively, the edges of G represent functions. As we dis-
                                                                          cuss below, however, edges may model incomplete or uncertain
                                                       cost/reward        information. We intend the edge from Resident to Village, for
                                                         volatility       example, to model a situation in which each resident of an
                                                                          area of interest is associated with a unique home village. Each
                                 Decision spaces: sketch models           instance of the class represented by the Coalition vertex is
Fig. 1.   Concept of operations for the sketch data model                 to be based in a specified village. Moreover, each village is
                                                                          settled by a unique tribal element. The two edges from SeenIn
                                                                          represent the process of identifying a foreigner and a village
                                                                          in which he or she was observed. Multiple or no observations
B. Historical Background                                                  of a particular individual are possible. Note that in a sketch,
                                                                          relations (properties) are modeled by vertices rather than edges
    Category theory is a mathematical field introduced by
                                                                          as is the idiom in OWL/RDF. A relation vertex, however, is the
Eilenberg and Mac Lane in the 1940s to manage transforma-
                                                                          domain of edges that specify the types of entities that it links.
tions between certain geometric and algebraic structures. It saw
                                                                          That is, the types of the variables that occur as the domain and
explosive growth after Kan discovered the unifying concept
                                                                          range of (binary) relations must be specified.
of adjoints in 1958 [22]. During subsequent decades it has
been applied across diverse areas of computer science and                     Various features of the human terrain that we seek to
mathematics including statistics [9], linguistics [8], dynamic            represent are not captured by the graph alone. We express these
systems, semantics of programming languages [3], topology                 using extra structures called diagrams, cones and cocones. In
and, in particular, logic [20] where it provides a non-set-               Section II-A below we give a general discussion of the way in
theoretic foundation for mathematics. The theory of sketches              which these constraints are specified using graph maps. In this
is a subdomain of category theory developed by C. Ehresmann               paper we describe examples of how these concepts are used
in 1968 [11]. It was almost exclusively a tool of the French              but do not define them precisely. For details, see [2].
school of category-theorists until publication of [2], [3]. A
category is a collection of objects (e.g., sets, probability spaces           The triangle involving Resident, Village and TribalElement
or vector spaces) and maps between them (e.g., functions,                 is an example of a diagram. It expresses the intuition that the
stochastic matrices or linear transformations). In this paper             tribal element of a resident R should coincide with the tribal
we use categories to construct models of sketches. Sketches               element that has settled the village in which R lives. This
themselves form a category having rich structure [13].                    semantics is imposed on models of the sketch by including
                                                                          an appropriate diagram in the sketch constraints and by the
                                                                          mathematical definition of sketch model. The constraint can
               II.     S KETCHES , M ODELS AND M APS                      be implemented, for example, using database triggers if the
                                                                          instances are stored in database tables.
    A sketch is a graph-based knowledge representation. It
consists of an underlying directed graph G together with                     The Foreign, Coalition and Resident classes are to be
extra structures that impose semantic constraints on models.              construed as subclasses of Person. Again, this intent is not
Figure 2 shows part of the underlying graph of a simple                   captured by the graph alone. We express subtype relations by


                                                        STIDS 2013 Proceedings Page 118
including particular cone constraints in our formulation of the     B. Semantic Models of Sketches
sketch. One such cone would be included for each of the three
                                                                        Individuals who populate the classes of a sketch are not
subtypes that occurs in Figure 2. Cones, like diagrams and
                                                                    typically represented in the sketch itself. They are elements
cocones, impose mathematical requirements on models.
                                                                    of models of the sketch. As we discuss below, this framework
    Finally, we may intend the classes Foreign, Coalition and       clarifies the distinction between the syntax of a knowledge
Resident to be mutually exclusive and to exhaust the possible       representation and its semantics. We can use this formulation
classifications of Person instances. This feature is not captured   to introduce partiality (i.e., missing data) and uncertainty into
by the graph but can be included in the sketch using three cones    models rather than requiring these features to be part of the
(to assert the subtype constraints) and a cocone to assert the      syntax. First, however, we describe deterministic, set-based
disjoint union constraint.                                          models. A (set-based) model of a graph G is an assignment
                                                                    of a set M (v) to each vertex v of G and a function M (e) :
                                                                    M (A) → M (B) to each edge e : A → B of G. There are no
A. Sketch Maps                                                      further restrictions on models of a graph.
    A map H → G from a graph H to a graph G is a pair                   A model of a sketch S is a model of its underlying graph
of functions that assigns a G vertex to each H vertex and a         G that satisfies the restrictions that are represented by the
G edge to each H edge in a way that respects the source             constraints (diagrams, cones and cocones) of S. In this paper
and target information for edges in the two graphs. Graph           we seek to give an overview of the theory and do not give a
maps play important roles in defining and applying sketches.        precise definition of these constraints or their semantics. For
First, each of the three types of semantic constraints (diagrams,   details, see [2]. Figure 3 shows a model of a fragment of the
cones and cocones) is defined as a type of graph map from a         human terrain sketch that was shown in Figure 2. The Resident
base graph B to the underlying graph G of the sketch.               and TribalElement vertices are interpreted as sets of instances.
                                                                    The edge labeled ‘has’ is interpreted as a function between
                          B        !G                               the sets. To constitute a model of the sketch, the functional
                                                                    interpretations of the edges lives in and settled by must be
The three classes of constraints are distinguished by the shapes    consistent with that of ‘has’.
of their base graphs. Second, maps between sketches are
defined to be maps between the underlying graphs that preserve                          Amina                           Dhulbahante
the semantic constraints. To illustrate this idea, observe that a                       Faysal
                                                                                                                            Isaaq
graph map                                                                               Bashir
                                   ! G′                                                  Said                              Darod
                          G
between the underlying graphs of two sketches S and S ′                              M (Resident)                   M (TribalElement)
                                                                                                        M (has)
converts each S constraint B −→ G into a graph map
                                                                    Fig. 3.   Functional model of a fragment of the sketch shown in Figure 2.
                     B        !G          ! G′

via composition of graph maps. If G → G′ is a sketch map,               By varying the semantic category in which sketch models
then this composite is also an S ′ constraint. Maps between         take their values, we may represent lack of information and
sketches give a rigorous, general framework for addressing          uncertainty. The edges of the underlying graph may, for
knowledge model dynamics, alignment and views. For exam-            example, represent partial functions rather than total functions.
ple, if a knowledge model S ′ subsumes another model S, we          Recall that a partial function from a set X to a set Y is a
can express this fact using a sketch map.                           function X ′ → Y for some subset of X and that composition
                                                                    g ◦ f of partial functions is associative (like composition of
                          S        ! S′                             total functions) and is defined by further restricting the domain
                                                                    of f . Figure 4 shows a partial function model of a fragment of
Not every sketch map expresses a parent-child relationship,         the human terrain sketch that was shown in Figure 2. In this
however. Those that do are called monomorphisms and satisfy         example, the tribal element membership of Faysal is unknown.
a condition that generalizes the notion of a one-to-one function.
A sketch map can, alternatively, merge distinct vertices or                             Amina
edges to eliminate redundancy such as equivalent classes that                                                           Dhulbahante
                                                                                        Faysal
have been given distinct names.                                                                                             Isaaq
                                                                                        Bashir
                                                                                         Said                              Darod
   Alignment of intersecting sketches S1 and S2 in a common
parent S is expressed by the following diagram of sketch maps
                                                                                     M (Resident)                   M (TribalElement)
                                                                                                        M (has)
                            !!! S """""
                               " #"
                       !!!                                          Fig. 4.   Partial functional model of a fragment of the Figure 2 sketch.
                    S1 #"""              " S2
                           """        !!!
                                   !!!
                                S0
                                                                       In Figure 5 we illustrate a probabilistic model of the edge
where S0 is a sketch representing the intersection of the two       ‘has’ that occurred in Figure 2. In this model, each point of
knowledge models. We discuss an example in Section II-E.            the source object (which for ‘has’ is the set M (Resident))


                                                 STIDS 2013 Proceedings Page 119
is mapped to a probability function on the target object (the                The definition of map between models requires the two paths
set M (TribalElement) in this case). That is, in this semantic               to define the same function. That is, the village of a resident
category, edges are interpreted as stochastic matrices (all                  who occurs in both models should be the same in both models.
entries are non-negative and columns sum to 1). Composition                  Of course, not every map of models represents an extension or
is matrix multiplication.                                                    subsumption relationship. As with alignment of sketches, we
                                                                             can express alignments of models using maps. For example,
                    Amina                    0.8
                                          0.2        Dhulbahante             alignment of intersecting models M1 and M2 in a common
                    Faysal         1.0                                       parent M is expressed by the following diagram of model
                                                          Isaaq
                    Bashir           1.0                                     maps where M0 is a model representing the intersection.
                     Said                                Darod
                                       1.0
                 M (Resident)                      M (TribalElement)                                     !!# M $""""""
                                   M (has)                                                           !!!!
                                                                                                 M1 %###            !!& M2
                                                                                                        ###
Fig. 5.   Probabilistic model of a fragment of the Figure 2 sketch.                                             !!!!
                                                                                                            M0

    There is a rich literature investigating classes of sketches, as
distinguished by the types of semantic constraints they include,             D. Presentations and Theories
and classes of semantic models. Examples include linear, finite
product, finite limit, EA (entity-attribute) and mixed sketches.                 A sketch (or an OWL ontology) is a compact presentation
The expressiveness of the class of sketch imposes requirements               of the much larger body of knowledge T that it generates. For
on the classes of structures that may be employed in semantic                example, if an ontology defines a class A, a subclass A′ of
models. Just as various OWL dialects are associated with                     A and a property P that is defined on A, then we can derive
different fragments of the predicate calculus, so too are classes            a subproperty P ′ by restricting P to A′ . This restriction P ′
of sketches.                                                                 may or may not be explicitly defined in the ontology. It is
                                                                             part of the larger body of knowledge T that the ontology is
C. Maps of Models                                                            designed to present. Sketch theory defines and provides tools
    The theory of sketches also provides a notion of maps                    for analyzing this generated body of knowledge.
between semantic models. We call these model maps. They
can be used to represent model dynamics, comparisons and                         The theory T of a sketch S is the sketch that S generates
combinations. For example, the fact that different people                    by recursive application of the constructions supported by
may populate our tabulations of the Resident class that is                   the type of sketch. These constructions can include property
represented by the corresponding vertex of Figure 2, should                  chains (i.e., composition), property inverses (i.e., reciprocals),
not require us to change the syntax of our knowledge represen-               property restrictions, products (ordered pairs) and coproducts
tation. In other words, our understanding of the concepts and                (unions) of classes, and extraction of subclasses and subprop-
relationships of the human terrain does not necessarily change               erties. The constructions are specified as types of diagrams,
when we observe a new individual to add to our information                   cones and cocones since these are the concepts used to specify
system. This modular approach to information and knowledge                   semantic constraints in sketches. T is usually much larger
management is a strength of the sketch framework.                            than S. It can be infinite even if S is finite. Consequently,
                                                                             when we write down a knowledge representation, we almost
    As with models themselves, model maps can introduce                      never write down T . We formulate a presentation S instead.
partiality and uncertainty. We focus on deterministic maps.
Let M and M ′ be models of a sketch S. For each vertex                           In Figure 6 we compute a small example. The underlying
v of S, the models have corresponding sets M (v) and M ′ (v)                 graph G of the sketch has two vertices and two edges.
of individuals. A map τ from M to M ′ is a collection of                     To make the example a bit more concrete, assume that P
functions                    τv
                                 ! M ′ (v)                                   represents a class of people and E represents a class of elected
                     M (v)
                                                                             officials who serve political districts. The edge r represents
between these sets of instances. In order to be a map of models,             an assignment of elected officials to people while u identifies
these functions must be consistent with the functions in the                 elected officials as particular instances of people. We impose
models themselves that arise from edges in the underlying                    one semantic constraint: the property chain (composite r ◦ u)
sketch graph. Two models M and M ′ of the Figure 2 sketch,                   of the two edges indicated in the triangle should coincide with
for example, each have associated sets of Resident and Village               the identity function on elected officials. That is, each elected
instances. If M ′ subsumes M by, for example, adding new res-                official serves his or her own political district. The finite graph
idents, then the new model should maintain the data about the                G could, potentially, generate an infinite family of property
previously-known residents. This is expressed by the following               chains: r ◦ u, u ◦ r, u ◦ r ◦ u, r ◦ u ◦ r, etc. The semantic
diagram in which we use τ to denote the two functions τVillage               constraint has the effect of truncating this list so that the only
and τResident .                                                              distinct properties are those shown in the path graph on the
                                      τ                                      right side of Figure 6. The path graph is the underlying graph
                 M (Resident)              ! M ′ (Resident)
                                                                             of the theory T1 of the sketch S1 whose underlying graph is
              M(lives in)                             M ′ (lives in)         shown on the left side of Figure 6 and whose only constraint
                       "                             "                       is the diagram shown in the center. The derived edge u ◦ r
                  M (Village)                ! M ′ (Village)                 connects each person to his or her elected representative.
                                      τ


                                                           STIDS 2013 Proceedings Page 120
                                 u                   id                          id
        u               E                P                      u                                             S1         V $$    S2
                                                          E                  P                                       ###     $$
 E                  P                        r
                                id                              r                                              ! $###m1        $
                                                                                                                            m2 $% !
        r                                E                                                                    T1 $$              T
                                                                             u◦r                                    $$          # 2
                                                                                                                      $$     ###
Fig. 6. The two-vertex graph G (left) together with the diagram D (center)                                              & '##
generate the theory T1 (right). The graph and diagram form a sketch S1 .                                                 T
                                                                                      Fig. 9. Alignment of the presentations (e.g., sketches or ontologies) S1 and
                                                                                      S2 into the common knowledge representation T using the union (pushout)
                                                                                      operation on sketches.
E. Alignment
    A body of knowledge may be presented in different ways
even within a fixed formalism (e.g., ontologies or sketches).                         sketches and maps. This illustrates a particular case of the
People use terms differently and use different words to describe                      data alignment step shown at the top left of Figure 1.
the same concepts. The notion of theory of a sketch provides
                                                                                          The need to use structures generated from the available
a framework for formulating the alignment problem. Consider
                                                                                      presentations, rather than using the presentations alone, is
the elected officials example discussed above. An alternative
                                                                                      evident in the alignment example discussed in Chapter 10
presentation is shown in Figure 7. This sketch has only a
                                                                                      of [14]. In this example, two OWL ontologies are aligned using
single vertex C that represents a class of citizens. It has a
                                                                                      OWL statements. The first ontology is defined below.
single edge e that represents the connection of each citizen to
his or her elected official. The semantic constraint (indicated                       ex1:Mother      rdfs:subClassOf      ex1:HomeDweller.
by the center diagram) again asserts that each elected official                       ex1:Father      rdfs:subClassOf      ex1:HomeDweller.
                                                                                      ex1:Son         rdfs:subClassOf      ex1:HomeDweller.
represents himself or herself. The theory generated by this                           ex1:Daughter    rdfs:subClassOf      ex1:HomeDweller.
sketch has one vertex and two edges. It is shown below right.                         ex1:hasChild    rdf:type          owl:ObjectProperty.
                                                                                      ex1:hasSon      rdfs:subPropertyOf      ex1:hasChild.
                                     e                                  id            ex1:hasDaughter rdfs:subPropertyOf      ex1:hasChild.
                            C                C
            C                                    e              C                     The second is defined with prefix ex2. It overlaps with but is
                                     e
                                             C                      e
                                                                                      visibly not equivalent to the first.
                e
                                                                                      ex1:Relative           rdf:type                           owl:class.
Fig. 7. An alternative formulation of the knowledge model shown in Figure 6.          ex1:Mother             rdfs:subClassOf                 ex1:Relative.
The one-vertex graph G (left) together with the diagram D (center) generates          ex1:Father             rdfs:subClassOf                 ex1:Relative.
the theory T2 (right). The graph and diagram form a sketch S2 .                       ex1:Child              rdfs:subClassOf                 ex1:Relative.
                                                                                      ex1:hasParent          rdfs:type                 owl:ObjectProperty.

    We seek to align these two formulations of the same                               We align the two using the OWL statements below.
concepts. To do this we can not use the presentations S1 and S2                       ex1:Mother   owl:equivalentClass                         ex2:Mother.
themselves. We must use theories. Although we can identify                            ex1:Father   owl:equivalentClass                         ex2:Father.
the vertex P of the S1 with the vertex C of S2 , the problem                          ex1:Son      rdfs:subclassOf                              ex2:Child.
is that there is no edge in S1 that corresponds to the edge e                         ex1:Daugher rdfs:subclassOf                               ex2:Child.
                                                                                      ex1:hasChild owl:inverseOf                            ex2:hasParent.
of S2 . The appropriate edge occurs in the theory T1 of S1 not
in the sketch S1 itself. Figure 8 illustrates how the alignment
problem is formulated using sketches. The task is to find a                               Although the Mother and Father classes coincide, there
sketch V and sketch maps into the theories generated by the                           are no appropriate classes in ex2 to identify with ex1:Son
two presentations that we seek to align. V can be construed as                        or ex1:Daughter. The classes occur in a knowledge rep-
the overlap between the two theories. It is a view (as defined                        resentation generated by ex2 not in ex2 itself. Similarly,
                                                                                      ex1:hasChild and ex2:hasParent have no corresponding
in the next section) of both presentations.
                                                                                      element in the other’s ontology. They are identified with
                        S1                       S2                                   elements constructed from the other.
                                         V"
                                   !       ""
                                 !!          ""
                               !!m        m2 ""                                       F. Contexts and Views
                          ! "!! 1               # !
                         T1                      T2                                       Decision making uses both general knowledge and specifics
                                                                                      of the decision space to balance the expected costs and risks
Fig. 8.    Formulation of the alignment problem using sketches. To align              of a program of actions. It focuses on the components that
presentations S1 and S2 (i.e., sketches or ontologies), we compute a sketch
V and maps m1 and m2 into the theories generated by the presentations.                are most relevant to a mission, its goals and tasks. It must
                                                                                      efficiently and effectively manage the available data.
                                                                                          Context carries information about intent. Views are imple-
   The sketch framework supports an operation called pushout                          mentations of context in a knowledge representation. A view of
which is essentially the union accounting for overlaps between                        a database is derived using a query. In SQL implementations,
sketches. With this union operation we can align the two                              a view is typically a single (virtual) table. The view update
presentations that we have been discussing into a single                              problem addresses the question of how to determine an appro-
knowledge representation T . Figure 9 shows the resulting                             priate update to the state of the total database when a view


                                                              STIDS 2013 Proceedings Page 121
is modified. The influential paper [1] developed the constant                     first-order logic is undecidable, NP-complete for propositional
complement approach to view updates. [4], [12] defined the                        logic, and P-complete and linear for propositional Horn logic
notion of lens that characterized a class of updates. At about                    (a property exploited in the Prolog language) [26].
the same time, [15] described update strategies for particular
update classes. Sketches were introduced into the study of data                       A sketch is an alternative, graphical way of presenting a
semantics in order to better understand database dynamics; in                     logical theory [2], [20]. In the sketch data model, we express
particular, the view update problem [24]. [19] uses sketches to                   relationships using diagrams, cones and cocones in directed
extend the lens concept and classes of view update strategies.                    graphs instead of with formulas and terms. Logical inference
                                                                                  employs graph properties associated with constraints. A sketch
    Within the sketch data model, views are particular sketch                     with no constraints is like a logical signature with no axioms.
maps and, in general, are much more expressive than a single
derived table. Specifically, a view of a knowledge representa-                                                                M (A)
tion S is a sketch V together with a sketch map
                           V       !T                                                                      Q       M (P )        pb     M (C)
where T is the theory generated by S (see Section II-D).
Consider, for example, the human terrain sketch shown in                                                                      M (B)
Figure 2. One view of interest is obtained by restricting the                     Fig. 11.   Universal mapping property that characterizes pullback cones
SeenIn relation to the subrelation in which the village is one
in which coalition personnel are based. This view involves
                                                                                  A pullback cone, for example, is characterized by the property
subclasses of each of the classes (in addition to the subrelation
                                                                                  illustrated in Figure 11 which shows a model M of such a
of SeenIn that we mentioned). None of these subclasses occur
                                                                                  constraint (see [2]). If the two outer functions from Q to M (C)
in the sketch itself but they are generated by applying cone
                                                                                  are equal, then there is a unique function from Q to M (P )
and cocone constraints.
                                                                                  for which the two paths from Q to M (A) are equal as are
    A challenge to implementing this framework in a decision-                     the two from Q to M (B). Such graph definitions are called
making context is using the available event history and other                     universal mapping properties [22]. From these we derive other
context-specific data to construct an appropriate sketch view in                  inference rules such as: If the function from M (B) to M (C) is
a semi-autonomous manner. The graphical nature of sketches                        a subtype (is a) relationship, then so is the edge from M (P )
may facilitate the adaptation of recent techniques developed                      to M (A) [22].
for context sensitive Internet search [6], [16], [21], [28] that
                                                                                      Sketches, like logics, are developed with varying levels
use the graph structure of the Web.
                                                                                  of fidelity. Linear sketches are the least expressive. Finite
                                                                                  limit, finite sum, EA (entity-attribute) and mixed sketches are
G. Support for n-ary Relations                                                    richer. Despite the distinct character of logical and sketch-
    A limitation of OWL/RDF described by practitioners is its                     based inference, they share deep connections. For various
lack of direct support for representing n-ary relations. Such                     classes of sketches, there are algorithms for constructing
properties can be represented directly in sketches. To express                    logical theories that have equivalent categories of models (see
a relation R that may hold among n entities that have types                       D.2.2 of [20]). Reasoning about a knowledge model expressed
A1 , · · · , An , we introduce vertices for each of these n + 1                   as a sketch, therefore, may be achieved either directly using
classes and we include n edges R → Ai . Any axioms that the                       the computational category theory techniques discussed below
relation is intended to satisfy would then be formulated using                    in IV or indirectly by converting to a first-order theory and
diagrams, cones and cocones.                                                      using a predicate calculus reasoner.
                                      R                                              The problem of pattern-based reasoning with sketches is
                                                                                  similar to the ontology alignment problem [18] that is solved
                    A1               ···               An                         mathematically via the theory (or syntactic category) of a
                                                                                  sketch [20]. We may align, for example, the human terrain
Fig. 10. Sketch for an n-ary relation R among entities of types A1 , . . . , An   sketches that are shown in Figures 13 and 2 via a sketch map
                                                                                  from the latter to the former. Refinement of the knowledge
                                                                                  base to represent levels in a tribal hierarchy (e.g., ethnic
                     III.    L OGICAL I NFERENCE                                  groups, tribes, clans and factions) is accomplished with a
                                                                                  sketch map from the Figure 2 sketch to a new sketch that
    The graphical nature of the sketch data model supports                        would include additional edges and constraints. The simple
implementation of pattern-matching reasoning capabilities that                    business knowledge representation shown in Figure 12 can be
emulate the process of experienced decision makers [23].                          mapped to a sub-sketch of our human terrain model.
In classical logic, we express properties and relationships
as terms and formulas that are recursively-constructed from
basic components. Inference is formulated as rules for deriving                                   IV.    S OFTWARE I NFRASTRUCTURE
valid expressions. Like models of physical phenomena, logics                         The software infrastructure available for working with
are developed with varying levels of fidelity based on their                      sketches is meager relative to that associated for other semantic
intended applications. Examples include classical, descriptive,                   models (e.g., OWL/RDF). The Easik tool1 is the most mature.
modal and linear logics. Expressiveness, however, comes at
the expense of higher computational complexity: inference for                       1 http://mathcs.mta.ca/research/rosebrugh/Easik


                                                          STIDS 2013 Proceedings Page 122
                                 Employee                            Division              The Easik tool supports read and write operations between
                                                                to                         knowledge representations and XML files, SQL files and SQL
                                                             ed


                                                                         home of
                                                          ign                              (MySQL or PostgreSQL) database connections. In Easik, a
                                                    ass
                                                                                           sketch is implemented as a database schema. Each sketch

                    Consultant


                                        Full-Time
                                                                                           entity (graph node) is a table created according to the schema.
                                                    works at          Office               Values that populate the tables form a model of the sketch.
                                                                                           Each table has an implicit, integer-valued primary key. In
Fig. 12.   Part of a sketch representing business structure knowledge                      general, a primary key constraint on a table expresses the fact
                                                                                           that the values in one or more columns together are a unique
It provides a graphical interface for building a collection of                             identifier of a row. This does not preclude the possibility
sketches and views. It implements procedures for reading and                               that two rows may refer, for example, to a single individual.
writing sketches to and from XML and SQL files. It also                                    Attributes are table columns. For example, an entity B with
provides an interface to models maintained in PostgreSQL                                   no attributes or outgoing edges is implemented in PostgreSQL
and MySQL databases. Easik does not implement a reasoning                                  as follows where the id column is an automatically-generated
engine. Figure 13 shows a sketch that is similar to the one                                (i.e., SERIAL), key.
whose graph is shown in Figure 2. It was developed using                                   CREATE TABLE B ( id SERIAL PRIMARY KEY );
Easik. The screen shot illustrates convenient abbreviations for
various semantic constraints. The decorated arrows to Person,                                  A foreign key constraint specifies that the values in one
for example, indicate subtype relationships. These are imple-                              or more columns must match the values occurring in some
                                                                                                                                                    e
mented as cones. Vertices connected to the + symbol form                                   row of another table. We implement a sketch edge A −→ B
a kind of cocone. Its base consists of the vertices Coalition,                             as a foreign key contained in the A-table and referencing the
Foreign and Resident. The paths connected to CD indicate a                                 primary key of the B-table. In PostgreSQL this is expressed
diagram constraint.                                                                        as follows.
                                                                                           CREATE TABLE A ( id SERIAL PRIMARY KEY,
                                                                                                            e INTEGER NOT NULL REFERENCES
                                                                                                            B (id) ON DELETE CASCADE
                                                                                                                   ON UPDATE CASCADE );
                                                                                               Insertion of a row into the A-table, therefore, involves
                                                                                           specifying values for the columns that are introduced into that
                                                                                           table for the edges having domain A. Deletion of a B-row
                                                                                           can impact the A-table if an A-row references the B-row via
                                                                                                         e
                                                                                           the edge A −→ B. Foreign keys serve to implement relations
                                                                                           (object properties) in the sketch data model since a relation is
                                                                                           simply an entity having edges to the nodes that correspond to
Fig. 13.   Human terrain sketch implemented using the Easik software tool                  the types of its participants.
    Category theory, despite its abstract nature, is highly                                   Although subtype relations (monic edges) are a particular
computational. All semantic constraints, for example, can be                               kind of cone constraint, they can be implemented as foreign
computed from four basic types: cones can be expressed                                     keys with unique references in the codomain table. In general,
using product cones and equalizer cones; cocones can be                                    however, sketch constraints are implemented using triggers.
expressed using coproducts and coequalizers. One freely-                                   A trigger for a database table or view executes a specified
available implementation of these and related computations                                 function whenever certain events occur. The simple diagram
is written in the ML programming language and is described                                 shown in Figure 14, for example, asserts that semantics of the
in [25]. This work and similar research activities could provide                           composite edge f followed by g should equal that of h. In
a basis for implementing a reasoning engine for a sketch-based                             PostgreSQL we express this as follows.
information system.
                                                                                           CREATE FUNCTION commutativeDiagram0()
    If the only constraints of the sketch are diagrams (i.e., the                                RETURNS trigger AS $commutativeDiagram0$
sketch has neither cones nor cocones), then the theory T (when                                DECLARE _cdTarget1 CONSTANT INTEGER := NEW.h;
T is, in fact, finite) generated by the sketch may be computed                                        _cdTarget2 CONSTANT INTEGER :=
                                                                                                                  (SELECT B.g FROM B
via the left Kan extension algorithm which generalizes the                                                         WHERE B.id = NEW.f);
Todd-Coxeter procedure from group theory [7], [27]. If the                                    BEGIN   IF    _cdTarget1 IS DISTINCT
generated sketch is infinite, the algorithm, of course, does not                                                          FROM _cdTarget2
terminate. Its complexity in the cases when it does terminate                                         THEN RAISE EXCEPTION
has not been characterized and is highly sensitive to small                                              ’Commutative diagram constraint
                                                                                                          failure’;
variations in the sketch [5].                                                                         END IF;
                                                                                                      RETURN NEW;
                                                                                              END;
   V.      I MPLEMENTING S KETCH M ODELS IN DATABASES                                      $commutativeDiagram0$ LANGUAGE plpgsql;
                                                                                           CREATE TRIGGER commutativeDiagram0
   Features available in major relational database systems in-                               BEFORE INSERT ON A
cluding PostgreSQL and MySQL provide an interface between                                      FOR EACH ROW EXECUTE
the mathematical theory of sketches and their application.                                       PROCEDURE commutativeDiagram0();


                                                                         STIDS 2013 Proceedings Page 123
    It is a trigger that fires before insertion of a row into               [2]  M. Barr and C. Wells. Toposes, Triples and Theories. Springer. 1985
the A-table to confirm that the values entered in the foreign               [3]  M. Barr and C. Wells. Category Theory for Computing Sciences.
key columns for f and h satisfy the commutativity constraint                     Prentice-Hall. 1990
taking into account the value in the g-column in an appro-                  [4] A. Bohannon, J. Vaughan and B. Pierce. Relational Lenses: A Language
priate row of the B-table. Cone and cocone constraints are                       for Updatable Views. In Proc. of the 25th ACM SIGMOD-SIGACT-
                                                                                 SIGART Symposium on Principles of Database Systems. ACM. 2006
all similarly implemented. All EA sketch constraints can be
                                                                            [5] J. J. Cannon, L. A. Dimino, G. Havas and J. M. Watson. Implementation
constructed from these [22].                                                     and Analysis of the Todd-Coxeter Algorithm. Mathematics of Computa-
                                      f                                          tion. 27(123):463–490. July 1973
                               B             A                              [6] H. Cao, D. Jiang, J. Pei, E. Chen and H. Li. Towards Context-Aware
                                      cd                                         Search by Learning a Very Large Variable Length Hidden Markov
                                 g                                               Model from Search Logs. In Proceedings of the 18th World Wide Web
                                            h                                    Conference (WWW’09). pp. 191–200. 2009
                                      C                                     [7] S. Carmody, M. Leeming, R. F. C. Walters. The Todd-Coxeter Procedure
                                                                                 and Left Kan Extensions. J. Symbolic Computation. 19:459–488. 1995
Fig. 14.   Diagram to implement using a SQL trigger
                                                                            [8] C. Casadio, P. J. Scott and R. A. G. Seely, Eds. Language & Gram-
                                                                                 mar: Studies in Mathematical Linguistics and Natural Language. CLSI
    A research question that arises is how might one utilize the                 Publications. 2005
sketch data model in a context of large-scale, distributed data.            [9] N. N. Čencov. Statistical Decisions Rules and Optimal Inference. Amer-
Broader demand for a scalable system that supports views and                     ican Mathematical Society. 1982
integrity constraints are well-known. Megastore, Tenzing, and               [10] E. F. Codd. Derivability, Redundancy, and Consistency of Relations
Spanner are Google products developed to meet this demand.                       Stored in Large Data Banks. IBM Research Report RJ599. 1969
Apache Cassandra is an open-source alternative.                             [11] C. Ehresmann. Esquisses et types de structures algébriques. Bul. Inst.
                                                                                 Politehn. Iaşi. 14:1–14. 1968
                                                                            [12] J. Foster et al. Combinators for Bi-Directional Tree Transformations:
                         VI.    C ONCLUSION                                      A Linguistic Approach to the View Update Problem. ACM Transactions
    OWL/RDF and related semantic web technologies have                           on Programming Languages and Systems. 29. 2007
established tenable positions in the intelligence, defense and              [13] J. W. Gray. The category of sketches as a model for algebraic semantics.
                                                                                 In Categories in Computer Science and Logic. V. 92 of Contemporary
security domains. The sketch data model, however, integrates a                   Mathematics. AMS. pp. 109–135. 1989
variety of features that can be leveraged. These include its deep
                                                                            [14] J. Hebeler, M. Fischer, R. Blace and A. Perez-Lopez. Semantic Web
connections with infinitary predicate logic, the ability to imple-               Programming. Wiley Publishing, Inc. 2009
ment sketches and their models using major relational database              [15] S. J. Hegner. An Order-Based Theory of Updates for Closed Database
systems, its graphical nature and, perhaps most significantly,                   Views. Annals of Math. and Artificial Intelligence. 40:63–125. 2004
its sophisticated notion of view of an information system. It               [16] T. Joachims. Optimizing Search Engines Using Clickthrough Data.
is possible to formulate OWL constructs using sketches. The                      SIGKDD’02, Alberta Canada. 2002.
classes of sketches that can be expressed in the dialects of                [17] M. Johnson and R. Rosebrugh. Sketch Data Models, Relational Schema
OWL2 is an open question. We have illustrated these concepts                     and Data Specifications. Electr. Notes Theor. Comp. Sci. 61(6):1–13. 2002
and their application to a simple ontology alignment problem.               [18] M. Johnson and R. Rosebrugh. Ontology Engineering, Universal Al-
                                                                                 gebra, and Category Theory. In Theory and Applications of Ontology:
    The sketch data model also clarifies the formulation of cer-                 Computer Aplications. Springer-Verlag. pp. 565–576. 2010
tain challenges that we encounter in applications of OWL/RDF.               [19] M. Johnson, R. Rosebrugh and R. J. Wood. Lenses, Fibrations and
In the sketch approach, uncertainty and lack of information are                  Universal Translations. Mathematical Structures in Computer Science.
aspects of models of the sketch. They are not features of the                    22:25–42. 2012
knowledge representation itself. Moreover, sketches (and OWL                [20] P. Johnstone. Sketches of an Elephant. Oxford University Press. 2005
ontologies) are more appropriately construed as presentations               [21] A. Kustarev, Y. Ustinovesky and P. Serdyukov. Measuring Usefulness
of the larger bodies of knowledge that they generate. In sketch                  of Context for Context-Aware Ranking. ACM WWW 2012 Companion,
                                                                                 Lyon, FR. 2012
theory, this larger knowledge base is called the theory of
                                                                            [22] S. Mac Lane. Categories for the Working Mathematician. 2nd Ed.
a sketch. Except in simple cases of renaming, alignment of                       Springer-Verlag. 1999
presentations involves maps into theories. The extent to which              [23] J. Morrison, R. T. Kelly, R. A. Moore and S. G. Hutchins. Implications
procedures for generating a theory from a sketch can support                     of Decision Making Research for Decision Support and Displays. In
partial-automation of alignment problems is an open research                     Decision Making Under Stress: Implications for Training and Simulation.
question.                                                                        J. A. Cannon-Bowers and E. Salas, Eds. pp. 375–406. 1998
                                                                            [24] R. Rosebrugh and R. J. Wood. Relational Databases and Indexed
    Finally, the concept of view of a knowledge representation                   Categories. Category Theory 1991: Proceedings of an International
is formulated as a sketch map to a theory. This generalizes the                  Summer Category Theory Meeting Held June 23–30. Vol. 13 of CMS
notion of view of a database. Recent techniques developed for                    Conference Proceedings. AMS. pp. 391–407. 1991
context sensitive Internet search exploit the graph structure of            [25] D. E. Rydeheard and R. M. Burstall. Computational Category Theory.
the Web and search histories. The extent to which these tech-                    Prentice-Hall. 1988
niques and the graphical nature of sketches can be exploited to             [26] A. Troelstra and H. Schwichtenberg. Basic Proof Theory. Cambridge
                                                                                 University Press. 2000
support semi-automated extraction of context-relevant views of
                                                                            [27] R. L. Wojtowicz and N. S. Yanofsky. Quantum Kan Extensions and
a knowledge representation is another open research challenge.                   Applications. DOI Contract D11PC20232 Final Report. 2013
                                                                            [28] B. Xiang, D. Jiang, J. Pei, X. Sun, E. Chen and H. Li. Context-
                             R EFERENCES                                         Aware Ranking in Web Search. ACM SIGIR’10 29–23 July, Geneva,
[1]   F. Bancilhon and N. Spyratos. Update Semantics of Relational Views.        Switzerland. 2010
      ACM Transactions on Database Systems. 6:557–575. 1981


                                                      STIDS 2013 Proceedings Page 124