<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Knowledge Representation and Automated Formal Reasoning in Description Logic ALC</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vasyl Lenko</string-name>
          <email>vs.lenko@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Volodymyr Pasichnyk</string-name>
          <email>vpasichnyk@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Natalia Kunanets</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yurii Shcherbyna</string-name>
          <email>yshcherbyna@yahoo.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ivan Franko National University of Lviv</institution>
          ,
          <addr-line>1 Universytetska str., Lviv, 79000</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <addr-line>12 S. Bandery str., Lviv, 79013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper presents a systematic analysis of the description logic formalism that has proved to be the distinguished approach for automated formal reasoning. The focus is on the principles of knowledge representation and reasoning within the framework of basic description logic ALC. An application of the specified formalism towards the domain of mass media reveals the benefits of automated reasoning for decision making.</p>
      </abstract>
      <kwd-group>
        <kwd>1 Description logic</kwd>
        <kwd>automated formal reasoning</kwd>
        <kwd>tableaux algorithm</kwd>
        <kwd>ALC</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Description logics (DLs) play a fundamental role in modern ontologies that are based on OWL 2.
They form the basis of direct model-theoretic semantics [1] for OWL 2 ontologies, which are de facto
the standard in knowledge bases engineering. The historical evolution of ontologies from a frame model
towards logic-based knowledge representation models is explained by the issues of an ambiguous
interpretation of syntactic constructions in frames and limited generality of the associated reasoning
methods [2]. Description logics, as decidable fragments of first-order logic, provide formal
modeltheoretic semantics and sound methods (forms) of reasoning over syntactic expressions.</p>
      <p>For OWL 2, the most appropriate semantics among description logics is SROIQ, while for OWL it
is description logic SHOIN. Description logics are named using abbreviations of sets of constructors
(axioms), which form the basis for their grammar. For instance, SROIQ consists of the following
constructors: S – denotes the set of constructors of the basic description logic ALC, extended with the
transitive roles; R – denotes an extended set of role axioms, also called RBox; O – denotes nominals; I
– denotes the inverse roles; Q – denotes qualified cardinality restrictions. The variety of description
logics is explained by the combinatorial possibilities of the constructors’ set selection but obtaining the
description logic with attractive expressiveness and computability is a difficult research task. The
reason is a fundamental trade-off between expressiveness and decidability of formal language – higher
expressiveness leads to a higher probability of undecidability of reasoning problems. Adoption of a new
description logic requires formal proofs of its properties as a decision procedure, as well as careful
selection of a set of constructors.</p>
      <p>We will conduct a systematic analysis of 1) the structure of description logics, based on the example
of the description logic ALC; 2) the main reasoning problems that can be stated and solved in ALC; 3)
the concept of a knowledge base in the context of description logics; 4) fundamental properties of
description logics; 5) model-theoretic semantics and tableaux-based method for the inference of logical
conclusion.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Description logics</title>
      <p>Description logics are a family of logic-based formal languages for knowledge representation and
reasoning that date back to the late 1980s. The notion “description logic” corresponds to the ontological
commitments of this formalism: the main aspects of the domain of interest are presented in the form of
a “description” of its entities (concepts) and relationships (roles), which are created by the constructors
(axioms) of the selected description logic. For the sake of knowledge reuse, syntactic expressions are
classified by their generality and divided into two parts – a set of terminological axioms, TBox, and a
set of statements about individuals, ABox, which together form the knowledge base, KB, also known
as an ontology.</p>
      <p>A fundamental property of description logics is the presence of formal semantics, which allows to
interpret correctly syntactic expressions, axioms, and operators, thus creating an unambiguous, shared
understanding of when the statement itself is a logical consequence from the knowledge base. Since
description logics are decidable fragments of predicate logic, the logical inference is executed using
automated reasoning methods, which are based on the principles of semantic admissibility and
deductive derivability of a statement. The peculiarity of reasoning in descriptive logics is that
computation is always carried out over the entire knowledge base. The conceptual trade-off between
the expressiveness of the representation language and the computational complexity of reasoning
problems, as well as the number of statements in the knowledge base, give rise to the need for a variety
of description logics. In particular, the family of description logics EL is characterized by limited
expressiveness of the representation language but guarantees a polynomial time solution for some
reasoning problems in large ontologies.</p>
      <p>The expressiveness of description logics is always limited. This property is forced by the need to
ensure the decidability of reasoning problems and can be tightened to ensure the decidability of
reasoning problems in polynomial time with regards to the size of the knowledge base. Decidability,
especially in polynomial time, is a major factor that promotes the use of description logics in knowledge
base engineering, therefore the design of varieties of specialized description logics requires careful
analysis and formal proofs of their properties. This need is caused by the fact that a synthesis of
decidable description logic extensions can lead to undecidability. If the expressiveness of description
logic is insufficient to represent the domain of interest, then instead of using undecidable description
logics, it is better to build a custom application around the decidable DL or use alternative knowledge
representation models. The definition of semantic structures that can be represented in description logic
is carried out within the framework of formal model theory, which in particular defines the semantics
of first-order logic.</p>
      <p>Some kinds of description logics go beyond first-order logic and can use operators of modality, the
concepts from fuzzy logic, or probability theory. Syntactically, the statements of description logics are
similar by form to the expressions of modal logics, but their truth values are established according to
A. Tarski’s semantic truth theory [3], which is the paradigm of truth definition in model theory. Despite
the differences that are present in the varieties of description logics, the process of ontology engineering
is mostly the same. At first, we define the dictionary of the domain of interest, which is later formalized
into the TBox statements. Some applications may use only TBox, while others may the whole ontology
and even databases. The logical consistency of an ontology structure, as well as processing of user
queries to a knowledge base, are ensured by a module of automatic or semi-automatic logical reasoning,
which is usually provided as part of an ontology development environment (i.e., HermiT reasoner [4]
in Protégé IDE [5]).</p>
    </sec>
    <sec id="sec-3">
      <title>3. Description logic ALC</title>
      <p>The use of description logics for the conceptual modeling of the domain of interest involves the
description of important abstractions that are inherent to it, as well as the individuals who are the
instances of these abstractions. Description logics have four major components that participate in the
process of creating a formal representation of the domain [2]:
1. Concepts are abstractions that are built from the concept names and role names using
constructors of description logic. In model theory, the concept is interpreted as a set of elements,
and it corresponds to a unary predicate in first-order logic.
2. Roles (role names) are the relationships between two concepts or individuals. In model theory,
the role is interpreted as a set of pairs of elements that belong to the appropriate concept extensions,
and it corresponds to a binary predicate in first-order logic.
3. Individuals (instances) are the embodiments of concept abstractions in the form of objects and
are designed to name the object of a particular abstraction. In model theory, the individual is
interpreted as an element of a concept’s extension set, and it corresponds to a constant in first-order
logic.
4. Concept language is a formal language that creates concepts and roles descriptions from
primitives like concept names and role names, by the means of description logic operators. An
interpretation of the concept language operators in a model theory puts them in line with the
operators on sets, while in first-order logic they are compared with the operators on logical
statements.</p>
      <p>An attributive language with complements, ALC, belongs to the family of description logics AL –
attributive languages that provide operators of atomic negation, concepts intersection, universal and
existential quantifiers. Unlike the AL family, where only the complements of atomic concepts are
available, the ALC language allows the complement of any well-formed concept [3]. The set of atomic
concepts consists of the concept names, “top” ⊤ and “bottom” ⊥ concepts; the rest are compound
(complex) concepts because they are formed using the concept language operators. Thus, a set of ALC
concept descriptions consists of concept names, role names, individual names, ⊤ and ⊥, concepts
intersection and union, concept negation (complement), existential and value restrictions. The syntax
of a formal language allows distinguishing well-formed expressions from arbitrary ones. An inductive
definition of ALC concepts’ syntax is as follows [2]:
• Every concept name is an ALC concept description;
• ⊤ and ⊥ are ALC concepts descriptions;
• If C and D are ALC concept descriptions and r is a role name, then the following are also
ALC concept descriptions:
o C ⊓ D (conjunction or intersection of concepts);
o C ⊔ D (disjunction or union of concepts);
o ¬C (negation or complement of concept);
o ∃r.C (existential restriction);
o ∀r.C (values restriction).</p>
      <p>The formal semantics of a language allows to establish unambiguously the meaning of its
wellformed expressions and is also required for the introduction of a truth predicate, which assigns a truth
value to the logical expressions in the language. Semantics can be considered as a metalanguage, which
contains the structures that are described by the syntax of the corresponding formal language. Since
metalanguage is also a language, it is possible to construct its own semantics, and so on. A. Tarski’s
semantic truth theory [3] states: to avoid the Russell’s or liar’s paradoxes in a formal language, the truth
predicate must be defined exclusively in a metalanguage. This paradigm preserves the consistency of
the logical systems, which allow forming the statements like “This sentence is a lie”; the truth value of
this statement is determined outside the language that was used for its formulation.</p>
      <p>The semantics of description logics is an “interpretation” – a structure that is characterized by the
following properties: 1) consists of a non-empty set of elements, which is called interpretation domain;
2) for each concept name defines its extension, i.e., a subset of elements from the interpretation domain;
3) for each role name defines a set of pairs of elements from the interpretation domain, which are related
to each other by this role. Formally, an interpretation  = (∆ , · ) consists of a non-empty set ∆ , which
is called the interpretation domain, and a mapping · , which maps each concept name  ∈  to a set
  ⊆ ∆ , and each role name  ∈  to a binary relation   ⊆ ∆ × ∆ . Interpretation of the other concept
descriptions is carried out by extending the mapping · with the following rules [2]:
• ⊤ = ∆ (entire interpretation domain),
• ⊥ = ∅ (empty set);
• ( ⊓  ) = C ∩   (intersection of concepts extensions);
• ( ⊔  ) = C ∪   (union of concepts extensions);
• (¬С) = ∆ \C (interpretation domain without concept extension);
• (∃ .  ) = { ∈ ∆ | exists  ∈ ∆ with ( ,  ) ∈   and  ∈ С };
• (∀ .  ) = { ∈ ∆ | for all  ∈ ∆ , if ( ,  ) ∈   , then  ∈ С }.</p>
      <p>The properties of an interpretation are limited only by the rules that are explicitly specified in its
definition, namely: the interpretation domain should not be empty, but can be of arbitrary power, even
infinite; concept extension may contain any number of elements, from zero to the entire interpretation
domain; the role name extension may contain any number of pairs of elements, from zero to all possible
pairs. Modification of the interpretation  , by adding or removing elements from interpretation domain,
concepts extensions, or roles extensions, gives rise to the alternative interpretations. For the sake of
interpretation structure analysis, it is convenient to visualize it in the form of an oriented graph, where
the arcs and vertices are labeled. In this graph, a vertex represents an element from the interpretation
domain, while its labels denote the concepts that contain the specified element in their extensions; an
arc represents a pair of elements contained in the extension of some role r, and the arc label corresponds
to the name of that role.</p>
      <p>Example 1. Create a model of the domain “IT project structure” using the description logic ALC and
define its interpretation in the context of the “Smart City” project.</p>
      <p>Solution. The first step of modeling is to determine the concept and role names that are important
for us in the domain of interest. Let the set of concept names  = {Engineer, Technology, Project}, the
set of role names  = {is_part_of, has}. The second step of modeling is to define an interpretation for
the elements of these sets. Concepts and roles extensions usually adhere to the desired context and
purposes of modeling:
∆ = {Andy, Mary, Javascript, UML, Smart_City};
Engineer = {Andy, Mary};
Technology = {JavaScript, UML, Smart_City};
Project = {Smart_City};
is_part_of  = {(Andy, Smart_City), (UML, Smart_City)};
has = {(Mary, UML), (Andy, Javascript)}.</p>
      <p>The third step of modeling consists in the formalization of interpretation elements using the syntax
of description logic ALC. It should be noted that there might be multiple options for interpretation
elements representation and each of them might be valid, but their efficiency for achieving the goals of
modeling might vary. For example, the instance “Smart_city” belongs to the extensions of the atomic
concept Project and the compound concept Project ⊓ Technology, as well as the extensions of the
concepts like ¬Engineer and ⊤. The instance “Andy” belongs to the extensions of the concepts
Engineer, ∀has. Technology, ∃is_part_of.Project and many others. Moreover, all elements of the
interpretation domain always belong to the extension of the top concept “⊤” and at the same time, none
of them belong to the extension of the bottom concept “⊥”, which corresponds to the state of absurdity
in the systems of logic.</p>
      <p>The specified interpretation of the conceptual model of the domain of interest can be visualized in
the form of a directed graph, which facilitates the analysis of the relationships between the elements of
concepts extensions and simplifies the task of clusters identification:
•
•
•
•
•
•
•
•
•
•
•
properties of interpretation  [2]:
⊤</p>
      <p>= ( ⊔ ¬С) (the law of excluded middle, LEM);
⊥ = ( ⊓ ¬С) (absurdity);
(¬¬С) = C (double negation elimination);
¬( ⊓  ) = (¬ ⊔ ¬ ) (de Morgan’s law);
¬( ⊔  ) = (¬ ⊓ ¬ ) (de Morgan’s law);
(¬(∃ .  )) = (∀ . ¬ ) (variation of de Morgan’s law);
(¬(∀ .  )) = (∃ . ¬ ) (variation of de Morgan’s law).</p>
    </sec>
    <sec id="sec-4">
      <title>4. Knowledge bases in ALC</title>
      <p>The ability to construct formal descriptions of the domain concepts and to combine them using
concept language operators is an important, but not sufficient achievement. The sequences of syntactic
transformations and combinations of formal descriptions can be infinite, so having limited time and
memory resources, it makes sense to remember useful (by naming) and connected (by relations)
descriptions, to avoid repetitive calculations. As a result, the following syntactic constructions are
introduced to ALC description logic:
 ⊑</p>
      <p>≡ 

 : 
( ,  ): 
= C</p>
      <p>⊆  
= C ⊆   and   ⊆ C
=   ∈ С
= (  ,   ) ∈  
(concept inclusion);
(concepts equivalence);
(concept assertion);
(role assertion).</p>
      <p>These relations allow structuring the elements of the knowledge base in accordance with the classical
approaches to systematization and reuse of concepts and their properties. For example, the equivalence
operator ≡ establishes the meaning of the concept  through the description of the concept  , which
corresponds to the encyclopedic format of the concept’s definition. The representation of background
knowledge is conveniently carried out through the operator of concept inclusion ⊑, which marks that
concept  belongs to the more general concept  , and thus implicitly inherits from  some properties
and operational restrictions. Representation of concrete contexts requires the specification of concept
and role instances in the form of assumptions. The operator “:” establishes the affiliation of individual 
to the concept  and a pair of individuals ( ,  ) to the role  , respectively. The usage of the specified
operators is presented in the following examples:</p>
      <sec id="sec-4-1">
        <title>Technology ⊑ Methods ⊔ Tools</title>
      </sec>
      <sec id="sec-4-2">
        <title>Engineer ≡ Human ⊓ ∃has. Technology</title>
      </sec>
      <sec id="sec-4-3">
        <title>Mary ∶ Engineer</title>
      </sec>
      <sec id="sec-4-4">
        <title>UML ∶ Technology</title>
        <p>(Mary, UML) ∶ has</p>
        <p>In description logics, the knowledge base is usually divided into two structural parts: terminological,
TBox, and declarative, ABox. In general-purpose knowledge bases or terminologies, the declarative part
may be missing. The terminological part contains statements that describe the concepts and
relationships between them (⊑, ≡), while declarative consists of statements that describe individuals of
concepts and relationships between them in the form of roles (:). Comparing the structure of the
knowledge base and the database, it is possible to establish a significant similarity between the TBox
and the database schema, as both contain general limitations: the form of the conceptual model of the
domain and the form of the data, respectively. The ABox is similar to the database content because it
contains expressions related to the concrete instances of concepts and roles.</p>
        <p>Formally, TBox consists of a finite set of statements of the form  ⊑  , where  and  can be
descriptions of both atomic and compound concepts. If  ⊑  and  ⊑  , the notation  ≡  is used
for the ease of operating. Each of these statements is called a general concept inclusion (GCI) or an
axiom [2]. An interpretation  satisfies GCI if C ⊆   is satisfied. If interpretation  satisfies every GCI
containing TBox  , then it is called a model of  . The larger number of GCI TBox contains, the fewer
models exist for it, because for two   ,  ′, if  ⊆  ′ then each model  ′ is a model  . TBox 
divides a set of interpretations into two parts: a set of  models and a set of interpretations that do not
satisfy all GCIs in  . Thus,  focuses the modeling efforts exclusively on interpretations that meet
the designer’s intuitive expectations about the structure of the domain.</p>
        <p>The declarative part of the knowledge base, ABox, consists of a finite set of assertions, which are
represented in the form  :  , the concept assertion, or ( ,  ):  , the role assertion. It is assumed that an
interpretation mapping · maps every instance  to the element   ∈ ∆ . An interpretation  satisfies the
concept assertion  :  , if   ∈ С , and satisfies the role assertion ( ,  ):  , if (  ,   ) ∈   . If
interpretation  satisfies all concept and role assertions contained in the ABox  , then it is called a
model of  .</p>
        <p>The combination of TBox  and ABox  forms an ALC knowledge base  = ( ,  ), in which
concepts and their definitions are specified in the terminological part, and then used in the declarative
part. An interpretation  , which is both a model of  and a model of  , is called a model of  . A
knowledge base is similar to a database by structure, but they significantly differ in the approach of
ensuring data consistency. If the data is not present in the database, then its validity is considered as
“false” and can be interpreted as a violation of data integrity. For the knowledge base, it is quite
acceptable to have a description of a concept in TBox without a description of the concept instances in
ABox. This paradigm is called the “open world assumption” [6].</p>
        <p>The syntax of description logic ALC allows the creation of cyclic concept descriptions. For instance,
the concept  ≡  ℎ ⊓ ∀ .  is cyclic since the concept name
 appears in the description of its own definition. Such expressions can be satisfied by
multiple interpretations and can lead to ambiguity of solutions for some reasoning problems. This issue
can be mitigated by the introduction of a syntactic restriction, which increases the defining power of
TBox and introduces the notion of acyclic TBox. It is worth noting that the form of cyclicity of the
 concept is simple and obvious, but there are hidden forms of cyclicity that arise due to
the nesting of concept descriptions.</p>
        <p>Let the expression  ≡  be a definition of the concept  , where  is a possibly compound concept,
and  is a finite set of concept descriptions. If  ≡  ∈  and the concept name  appears in  , then 
directly uses  . The concept  uses  if  directly uses  , or there is a concept name  ′, such that 
uses  ′, and  ′directly uses  . A finite set of concepts descriptions  is called an acyclic TBox, if: 1)
there are no concept names in  that use themselves; 2) there are no concept names that appear more
than once in the left part of the definition of the concept in  . If  is an acyclic TBox with  ≡  ∈  ,
then  is called strictly defined in  , and  is called the definition of  in  .</p>
        <p>The importance of an acyclic TBox is that it can be unfolded into ABox by interpreting concept
descriptions as macros. The mechanism of the process is to recursively replace all concept names in
ABox with their definitions from TBox. Let  = ( ,  ) be a knowledge base with acyclic TBox, which
has the form  = {  ≡   | 1 ≤  ≤  }. Let  0 =  and   +1 be the result of the following
substitutions: 1) find some  :  ∈   , in which  directly uses   , 1 ≤  ≤  ; 2) replace in  all  
with   . If no more substitutions can be applied in   , then   is the result of unfolding  into  .
Unfolding is an important step in the process of solving many reasoning problems, but it can cause an
exponential explosion of ABox size. Therefore, it is important to remember the strategies of the lazy
and greedy concept descriptions unfolding.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Reasoning problems in ALC</title>
      <p>The structure of a knowledge base in description logics allows to model concepts, roles, individuals,
and relations between them. The number of statements in the knowledge base, the nesting of concept
descriptions, model discovery, establish the need for reliable and effective methods of logical reasoning,
which would allow obtaining answers about the important properties of a knowledge base. Let  =
( ,  ) is an ALC knowledge base,  and  are possibly compound ALC concepts, and  is an individual
name. Then, the basic reasoning problems inherent to the ALC knowledge base can be defined as
follows [2]:
•  is satisfied with respect to  if there exists a model  of  and some element  ∈ ∆ such that
 ∈   ;
•  is included in  with respect to  , and is written as  ⊨  ⊑  if C ⊆   in all models  of
 ;
•  is equivalent to  with respect to  , and is written as  ⊨  ≡  if C =   in all models 
of  ;
•  is consistent if there is a model of  ;
•  is an instance of  with respect to  , and is written as  ⊨  :  if   ∈ С in all models  of
 .</p>
      <p>The sign of the semantic logical consequence ⊨, in contrast to the sign of the deductive logical
consequence ⊢, is based on the inclusion of models of statements that it connects. When  is fixed,
inclusion and equivalence with respect to  are binary relations of possibly compound concepts, so to
denote this fact, the notations  ⊑  and  ≡  are used, respectively. The problems of satisfaction,
inclusion and equivalence are defined with respect to TBox only, but the problems of consistency
between the knowledge base and concept’s instance are defined with respect to both, TBox and ABox.
The consistency property of TBox only, or ABox only, is considered as a case of consistency of the
knowledge base of the specific form:  = ( , ∅) for TBox and  = (∅,  ) for ABox.</p>
      <p>The relation of concept inclusion ⊑ is transitive, and the ALC description logic is monotonic, i.e.,
the more statements the knowledge base contains, the more logical consequences it allows.
Equivalences that can be derived from every TBox are called tautologies, for example,  ⊨  ⊑  if
and only if  ⊨ ⊤ ⊑ (¬ ⊔  ). It is important to note that there are also hidden relationships between
the basic problems of reasoning, which contribute to the process of their solution. Let  = ( ,  ) is an
ALC knowledge base,  and  are possibly compound ALC concepts, and  is an individual name.
Then the following statements hold:
• C ≡T D if and only if C ⊑T D and D ⊑T C;
• C ⊑T D if and only if C ⊓ ¬D is not satisfied with respect to T;
• C is satisfied with respect to  if and only if C ⋢ ⊥;
• C is satisfied with respect to  if and only if (T, {b: C}) is consistent;
• (T, A) ⊨ b: C if and only if (T, A ∪ {b: ¬C}) is inconsistent;
• If T is acyclic and A′ is the result of the unfolding of T into A, then K is consistent if and only
if (∅, A′) is consistent.</p>
      <p>The outcome of the specified relationships between the reasoning problems is the fact that all the
basic problems can be reduced to a common one – the consistency of a knowledge base. Thus, the
presence of an efficient algorithm for calculating the consistency property of a knowledge base allows
solving the other basic reasoning problems. It is worth noting that there are also reasoning problems
that can’t be reduced to the problem of knowledge base consistency or cause an exponential increase of
the problem size during the reduction, i.e., query answering using conjunction.</p>
      <p>Knowledge base engineering usually involves gradual filling of the TBox and ABox with statements
about the domain of interest. Since the formal syntax, and concepts nesting, introduce some complexity
for an engineer on establishing the properties of a knowledge base, there is a need for the auxiliary tools
that will evaluate the consequences of knowledge base extension or modification. In description logics,
this task is performed by the reasoning services, which injectively correspond to the basic problems of
reasoning:
• Given a TBox  and a concept  , determine whether  is satisfied with respect to  ;
• Given a TBox  and the concepts  and D, determine whether  is included in  with respect
to  ;
• Given a TBox  and the concepts  and  , determine whether  is equivalent to  with respect
to  ;
• Given a knowledge base K = ( ,  ), determine whether  is consistent;
• Given a knowledge base K = ( ,  ), a concept  , and an individual name  , determine whether
 is an instance of  with respect to  .</p>
      <p>These descriptions are specifications of the reasoning services, which might be implemented in
multiple ways. Additionally, the basic reasoning services can be reused to create more intelligent
services that reveal the potential of an automated retrieval of a set of homogeneous objects and solving
reasoning problems about them. The “generalized” reasoning services include the following ones [2]:
• Classification of a TBox: given a TBox  , determine the hierarchy of inclusions of all concept
names that are present in  with respect to  . That is, for each pair of concept names  and 
that are defined in  , determine whether  ⊑  and whether  ⊑  . The result of the service
is an inclusion hierarchy, which can be conveniently presented in the form of a graph or Hesse
diagram for partial order;
• Checking the satisfiability of concepts in T: given a TBox  , for each concept name  ,
determine whether  is satisfied with respect to  . The dissatisfaction of a concept is the sign
of a modeling error;
• Instance retrieval: given a concept  and a knowledge base  , determine all the individual
names  such that  is an instance of  with respect to  . The result of the service is a set of
individual names that affiliates with the concept  ;
• Realisation of an individual name: given an individual name  and a knowledge base  ,
determine all the concept names  that are defined in  , for which  is an instance of  with
respect to  . The result of the service is a set of concept names that are the realisation of an
individual name.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Reasoning in ALC with tableaux algorithm</title>
      <p>Reasoning over the statements in a knowledge base is a process that facilitates the achievement of
several goals: 1) substantiation or refutation of the existing knowledge; 2) generation of the new
knowledge; 3) finding solutions to decision-making problems; 4) shared ontologies engineering [7]. In
logic, the reasoning process is considered as a “mechanized” syntactic calculus over the statements of
a formal system and consists in the application of deductive derivation rules or constructive properties
of the associated semantics. A reasoning method should possess three qualitative properties,
decidability, completeness, and soundness, to be considered a decision-making procedure in a system
of formal logic. Decidability requires the presence of proof that a reasoning method terminates for all
admissible kinds of input data, and potentially proof of the existence of an “effective” algorithm, which
is characterized by acceptable time complexity. The completeness requires proof that a “valid” result
will be returned for all admissible input data with the desired property, while soundness requires proof
that a “valid” result will be returned only for the admissible input data with the desired property.</p>
      <p>There are a variety of reasoning methods associated with the description logic ALC, which are based
on different calculus approaches. The most widespread reasoning methods are the resolution method,
sequent calculus, automata-based calculus, tableaux methods, query rewriting [8]. For the expressive
description logics, the tableaux methods are a default choice, i.e., the hyper-tableaux method is a core
of HermiT reasoner, which integrates as a module into the Protégé ontology development environment.
Since most of the basic reasoning problems in the description logic ALC can be reduced to the problem
of knowledge base consistency, the properties of the tableaux method as a decision-making procedure
will be formulated relative to the input data of the knowledge base consistency problem. The proofs of
the tableaux method properties are based mainly on the semantics of the description logic ALC, namely
the structural features of those models that can be expressed in the ALC concept language.</p>
      <p>By definition, a knowledge base  is consistent if there is a model  of  ; this model is called a
witness to the consistency of K and is denoted by  ⊨  . The core idea of tableaux methods is to prove
the consistency of a knowledge base  = ( ,  ) by constructing a model  of  . A knowledge base 
is inconsistent, if during the construction of a witness to the consistency of  some obvious
contradictions are revealed that refute the possibility of the existence of model  of  . The process of
witness construction starts with an  and, if necessary, extends it with the outcomes of restrictions that
are imposed by the semantics and the axioms contained in  or  . The efficiency and validity of tableaux
methods are ensured by the structure of a witness they are designed to build, namely the tree model.
The tree model significantly limits the number of potential witnesses (efficiency), facilitates detection
of structural loops in tree branches (decidability), and allows proving the existence of possibly infinite
models using the method of structural induction (soundness). Moreover, the proof of absence of a tree
model for a knowledge base  indicates the absence of any model for  (completeness).</p>
      <p>Tableaux method’s principles can be conveniently analyzed in the context of three situations: for
 = (∅,  ), for  = ( ,  ), where  is acyclic, and in the general case for  = ( ,  ). Since all the
concepts contained in  or  can be transformed to a negated normal form using de Morgan’s laws or
duality between the quantifiers of existence and universality, the expression ¬̇ С will correspond to the
negated normal form of ¬ . An ABox  is considered normalized if the following conditions are met:
• all concepts contained in  or  are in the negated normal form, so the negation operator is
applied only to the concept names and is not preceded by parentheses, quantifiers, or operators;
• each individual name contained in  is present in at least one statement of the form  :  ;
•  is not empty, thus contains statements.</p>
      <p>For the case  = (∅,  ), when TBox is empty, the expansion rules mirror the semantics of
compound concepts and apply it to the statements in  . Since the main principle of the tableaux method
is to decompose compound concepts into sub-concepts, there always exists an iteration, where all
compound concepts are completely decomposed into atomic ones, and the algorithm terminates. Each
satisfied ALC concept has a tree-like model, and since the concept instances can be related by roles, a
consistent ALC knowledge base  has a forest-like model. If  is consistent, then it has a model that
consists of a nonempty set of disjointed trees, where the root of a tree represents some individual name
in  and can be connected by edges with the roots of other trees. The tableaux method builds a
forestlike model of an ABox  using syntactic expansion rules, which extend  with the new statements only
if their conditions are met.</p>
      <p>The tableaux method applies the expansion rules to the ABox statements until  is complete because
it is easier to detect logical contradictions or clashes in the complete ABox. An ABox  contains a clash
if for some individual name  and for some concept  , { :  ,  : ¬ } ⊆  ; if  does not contain a clash,
then  is called clash-free.  is called complete if it contains a clash or none of the expansion rules can
be applied due to unmet conditions.</p>
      <p>Since any ALC ABox can be transformed to a normalized form, without reducing the generality, the
algorithm for determining the consistency of an ABox will take as an input a normalized ABox. Then,
the tableaux algorithm for ALC ABox consistency can be presented via the following pseudo-code [2]:
Algorithm  ( ):</p>
      <p>If  ( ) ≠ ∅
Then return «consistent»</p>
      <p>Else return «inconsistent»
Procedure  ( ):</p>
      <p>If  is not complete
Then select expansion rule  , whose conditions are satisfied in  , and select few
statements  from  , which are eligible for  application
If exists  ′ ∈ exp( ,  ,  ), such that  ( ′) ≠ ∅</p>
      <p>Then return  ( ′)</p>
      <p>Else return ∅
Else</p>
      <p>If  contains a clash</p>
      <p>Then return ∅</p>
      <p>Else return</p>
      <p>Here, the auxiliary function exp( ,  ,  ) returns a set of ABox, which are formed by the application
of rule  to the statement  in  . All the expansion rules, except the ⊔-rule, are deterministic and return
a set with one ABox. Depending on the way of its application, the ⊔-rule can generate two different
ABox, and it is mandatory to verify their consistency. The way of selecting the sequence of expansion
rules  and statements  does not affect the result of the tableaux method but can drastically affect its
performance.</p>
      <p>The tableaux method for establishing the consistency of a knowledge base  = ( ,  ), where  is
an acyclic TBox, is largely based on the expansion rules and procedures used for  = (∅,  ). This
property is due to the fact that an acyclic TBox can be entirely unfolded into ABox  , and it actually
reduces the current consistency problem to the previous case. The greedy strategies of unfolding  into
 can lead to an exponential increase of the knowledge base size, so it is recommended to use the “lazy”
unfolding strategies, which unfold concept descriptions only when necessary [9]. Below are the lazy
unfolding rules, which together with the previous expansion rules form the basis of a tableaux method
for establishing the consistency of a knowledge base with an acyclic TBox.</p>
      <p>Establishing the consistency in the general case of a knowledge base  = ( ,  ), where  may
contain cyclic concept descriptions, requires some auxiliary constructions that will ensure decidability.
Without reducing the generality, the input parameter of a tableaux algorithm is a normalized knowledge
base  = ( ,  ) that consists of a normalized TBox  and a normalized ABox  . A TBox  is called
normalized if all its axioms have the form ⊤ ⊑  , where  is reduced to the negated normal form.
Reduction of axioms to the expected form can be carried out using the following properties:
•  satisfies  ⊑  ⟺  satisfies ⊤ ⊑  ⊔ ¬С;
•  satisfies  ≡  ⟺  satisfies ⊤ ⊑ ( ⊔ ¬С) ⊓ ( ⊔ ¬ ).</p>
      <p>The set of expansion rules for a knowledge base consistency consists of five elements, and three of
them, namely ⊓-rule, ⊔-rule, ∀-rule, are completely identical to the ABox expansion rules. The other
two rules are also similar to their counterparts but have some important changes: ⊑-rule uses a
normalized form of TBox axioms, while ∃-rule contains an additional requirement for a “non-blocking”
individual, which is a property that ensures decidability or algorithm termination.</p>
      <p>A termination problem of the tableaux algorithm for general knowledge base consistency is caused
by the possibility to combine ⊑-rule and ∃-rule, which without the blocking mechanism can generate
the infinite repetitive sequences of individuals [10]. If an ABox does not contain clashes, then its model
can describe the infinite sequences using loops. During the execution of the tableaux algorithm, only
the ∃-rule adds new vertices (individuals) and edges (roles) to the ABox tree model. If the ∃-rule adds
a new individual  and a new role assertion ( ,  ):  , then  is the successor of  and  is the predecessor
of  . The transitive closure of roles between successor and predecessor creates the notions of descendant
and ancestor, respectively. Let   ( ) = {  |  :  ∈  } is a set of concept names that are present in
the concept assertions of the form  :  . The individual name  is blocked in ALC ABox  if the
following conditions are met: 1)  is the ancestor of  ; 2)   ( ) ⊇   ( ). Thus, if an individual
name is blocked, then all its descendants are also blocked. Since root individuals don’t have ancestors,
they are never blocked.</p>
      <p>Example 2. Verify the consistency of an ALC knowledge base  = ( ,  ) of the “Mass media”
domain using the tableaux method. The structure of the knowledge base is as follows:
 = {</p>
      <sec id="sec-6-1">
        <title>News ⊑ ∀has. Author;</title>
      </sec>
      <sec id="sec-6-2">
        <title>Author ≡ Human ⊔ Organization ⊔ Bot;</title>
      </sec>
      <sec id="sec-6-3">
        <title>Bot ⊑ ¬Human ⊓ ∀has. Author;</title>
        <p>Organization ⊑ ∃has. Human;
};
 = {
a: News;
(a, b): has;
b: Bot;
(b, c): has;
c: Organization;
}.</p>
        <p>Solution. The first step consists in a normalization of the knowledge base, which serves as an input
argument to the tableaux method. Note that the ABox is already normalized, but the statements from
TBox still require normalization, which can be performed using the properties stated above and de
Morgan’s laws:
 = {
⊤ ⊑ ∀has. Author ⊔ ¬News;
⊤ ⊑ (Human ⊔ Organization ⊔ Bot) ⊔ ¬Author;
⊤ ⊑ (¬Human ⊓ ∀has. Author) ⊔ ¬Bot;
⊤ ⊑ ∃has. Human ⊔ ¬Organization;</p>
        <p>Upon providing the normalized knowledge base to the tableaux algorithm, its execution begins. At
each iteration, the algorithm selects an expanding rule and a statement (or a set of statements) that
satisfy its conditions. Let’s select ⊑-rule and apply it to the statement ⊤ ⊑ ∀has. Author ⊔ ¬News.
The specified statement belongs to T, a: News ∈ A, and a: ∀has. Author ∉ A, thus a new concept
assertion is added to ABox A = A ∪ {a: ∀has. Author}.</p>
        <p>Let’s apply the ⊑-rule to the rest of the statements contained in the TBox. The statement ⊤ ⊑
(Human ⊔ Organization ⊔ Bot) ⊔ ¬Author does not satisfy all conditions of the ⊑-rule, namely
d: Author ∉ A. The other statements do satisfy the rule conditions, so they can be expanded. After an
application of the ⊑-rule to the statements in T, the ABox A is as follows:
 = {
a: News;
(a, b): has;
b: Bot;
(b, c): has;
c: Organization;
a: ∀has. Author;
b: ¬Human ⊓ ∀has. Author;
c: ∃has. Human;
}.</p>
        <p>Next, let’s apply the ∃-rule to the statement c: ∃has. Human. Since the individual name c ∈ A and
corresponds to the root in a tree model, then c is not blocked. Because ∄d, {(c, d): has, d: Human} ⊆ A,
the statement c: ∃has. Human can be expanded by ∃-rule. As a result, the following statements are
added to A: (c, d): has and d: Human, where  is the new individual name in A.</p>
        <p>= {
a: News;
(a, b): has;
b: Bot;
(b, c): has;
c: Organization;
a: ∀has. Author;
b: ¬Human ⊓ ∀has. Author;
c: ∃has. Human;
(c, d): has;
d: Human;
}.</p>
        <p>At the next iteration, let’s select the ⊓-rule and apply it to the statement b: ¬Human ⊓
∀has. Author. Both conditions of the ⊓-rule are satisfied, b: ¬Human ∉ A and b: ∀has. Author ∉ A,
therefore A = A ∪ {b: ¬Human, b: ∀has. Author}. Then, let’s apply the ∀-rule to the statements
a: ∀has. Author and b: ∀has. Author. Both of them satisfy the conditions of ∀-rule, (a, b): has ∈ A,
b: Author ∉ A and (b, c): has ∈ A, c: Author ∉ A, thus A = A ∪ {b: Author; c: Author}. After the
application of ⊓-rule and ∀-rule, the ABox A is as follows:
 = {
a: News;
(a, b): has;
b: Bot;
(b, c): has;
c: Organization;
a: ∀has. Author;
b: ¬Human ⊓ ∀has. Author;
c: ∃has. Human;
(c, d): has;
d: Human;
b: ¬Human;
b: ∀has. Author;
b: Author;
c: Author;
}.</p>
      </sec>
      <sec id="sec-6-4">
        <title>Now let’s apply the ⊑-rule to the statement ⊤ ⊑ (Human ⊔ Organization ⊔ Bot) ⊔ ¬Author twice, considering that {b: Author; c: Author} ∈ A. Therefore, A = A ∪ {b: Human ⊔ Organization ⊔</title>
        <p>Bot; c: Human ⊔ Organization ⊔ Bot}. Conditions of the ⊔-rule are not satisfied because b: Bot ∈ A
and c: Organization ∈ A. The other rules are not applicable to the statements in A or T, so A is complete.
Moreover, A does not have any explicit clashes of the form {x: D; x: ¬D} ⊆ A, so A is clash-free.</p>
        <p>Answer. The ALC knowledge base  = ( ,  ) of the “Mass media” domain is consistent. The
tableaux method has created a witness to the consistency of the knowledge base  , namely a model 
of  . Its visual representation contains all individuals and the atomic concepts that they belong to, as
well as the roles that connect them.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>System analysis of the description logic ALC, which acts as a basis for more expressive description
logics like SROIQ(D), identified the main principles of knowledge representation in the description
logics, their model-theoretic semantics, the notion and structure of knowledge bases, core reasoning
problems and reasoning over them with the tableaux method. An application of the ALC formalism to
the representation of the “Mass media” domain highlighted the peculiarities of knowledge specification
and the inference of implicit knowledge from the existing one. The benefits of automated reasoning are
especially noticeable in large and complex knowledge bases.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Acknowledgements</title>
      <p>The authors would like to thank Franz Baader, Ian Horrocks, Carsten Lutz, and Uli Sattler for writing
“An Introduction to Description Logic” [2]. Many of the formal definitions and notations associated
with the described logic are given in this excellent textbook.</p>
    </sec>
    <sec id="sec-9">
      <title>9. References</title>
      <p>[1] W3C, OWL 2 Web Ontology Language Direct Semantics (Second Edition), 2012. URL:
https://www.w3.org/TR/owl2-direct-semantics.
[2] F. Baader, I. Horrocks, C. Lutz, U. Sattler, An introduction to description logic, Cambridge</p>
      <p>University Press, Cambridge, 2017. doi:10.1017/9781139025355.
[3] A. Tarski, Pojęcie prawdy w językach nauk dedukcyjnych, Prace Towarzystwa Naukowego</p>
      <p>Warszawskiego, Wydział III Nauk Matematyczno-Fizycznych, volume 34, 1933.
[4] The Data and Knowledge Group at the Oxford University, HermiT OWL Reasoner. URL:
http://www.hermit-reasoner.com.
[5] Stanford Center for Biomedical Informatics Research at the Stanford University School of
Medicine, Protégé: a free, open-source ontology editor &amp; framework for building intelligent
systems. URL: https://protege.stanford.edu.
[6] C. M. Keet, Open World Assumption, in: W. Dubitzky, O. Wolkenhauer, K. H. Cho, H. Yokota
(Eds.), Encyclopedia of Systems Biology, Springer, New York, NY, 2013, p. 1567.
doi:10.1007/978-1-4419-9863-7_734.
[7] R. Brachman, H. Levesque, Knowledge Representation and Reasoning, Morgan Kaufmann</p>
      <p>Publishers Inc, San Francisco, CA, 2004.
[8] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, P. F. Patel-Schneider, The description logic
handbook: theory, implementation, and applications, 2nd. ed., Cambridge University Press,
Cambridge, 2007. doi:10.1017/CBO9780511711787.
[9] D. Scott, The lattice of flow diagrams, in: E. Engeler (Ed.), Symposium on Semantics of
Algorithmic Languages, volume 188 of Lecture Notes in Mathematics, Springer, Berlin,
Heidelberg, 1971, pp. 311–366. doi:10.1007/BFb0059703.
[10] H. Levesque, R. Brachman, Expressiveness and tractability in knowledge representation and
reasoning, Computational Intelligence, volume 3, issue 1, (1987) 78–93.
doi:10.1111/j.14678640.1987.tb00176.x.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>