<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>From ER to Ontology with Natural Language Text Generation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Csaba Veres</string-name>
          <email>Csaba.Veres@idi.ntnu.no</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jennifer Sampson</string-name>
          <email>sampsonj@idi.ntnu.no</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Clare Atkins</string-name>
          <email>catkins@nmit.ac.nz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Business and Computer Technology Nelson Marlborough Institute of Technology Nelson</institution>
          ,
          <country country="NZ">New Zealand</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We describe the automation of a novel technique (NaLER) which was originally designed to facilitate legacy database model validation. The NaLER technique uses natural language sentences built from live database content to elicit validation judgments from domain experts. However, during implementation we discovered that the method we had adopted for the automation had a serendipitous side effect in that the legacy model first had to be mapped to an upper ontology. This normally difficult process was significantly eased by the sentence templates which are defined as part of the NaLER technique. It is this novel process of mapping, and the choice of ontology it entails, which forms the focus of the paper. We therefore describe here the process of mapping to the upper model, and investigate how the motivation for modeling impacted on the choice of modeling language. Finally we describe the prototype of a tool and how it fits with the development methodology.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The NaLER (Natural Language for ER) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] method is designed to help with
data schema validation by exposing users to natural language expressions about
instance data, allowing them to comprehend the way actual data is represented
in a live database. Since database schemas represent agreed meanings about the
data, user validated data models can be used as a valuable input to the
creation of a consensual ontology for corporate knowledge management. Mapping a
database to an ontology may be necessary when an organisation wishes to define
a common understanding of the structure of all data sources in the domain. The
resulting ontology conceptualises and structures the domain knowledge within
a community of interest. Ontologies are useful for making explicit logical
statements about a domain and for reasoning about those statements.
      </p>
      <p>
        Creating a quality ontology based on legacy databases can be difficult,
especially when the underlying business rules are incorrect or no longer valid. We
suggest that the NaLER method for maintenance and verification is useful in
clarifying business rules that are important in ontology creation. With this
assumption in mind we describe first the techniques that help ontology engineers
revisit and verify business rules represented in the legacy data model, and second
we provide a method and implementation for mapping a relational model
represented by an entity relationship model or extended entity relationship model
(E-R/R) to the Penman Upper Model [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], an abstract domain independent
upper ontology which we argue is ideal given the requirements of the task. We will
discuss this upper model more fully in subsequent sections.
      </p>
      <p>
        Originally developed to fill a gap in the relational database design process
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], NaLER is an appropriate tool for identifying the semantics of existing data
structures. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] suggests that a major issue is that “...[d]ata models...are so poor
at ‘capturing the meaning’ and people are so effective at intuitively
accommodating these weaknesses that [people] project meaning onto the data structures
rather than abstracting meaning out of them”. NaLER was intended to provide
a formal way of capturing the meaning represented by the database constructs
and thus to reduce, if not eliminate, the assumptions and individual
interpretations that can be projected on to them. Although several E-R/R methods
have created mechanisms for presenting model information as natural language
sentences (e.g. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]; [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]), they are concerned primarily with interpreting the
relationships between entities. NaLER extends the principle by requiring more
detail in these ‘relationship’ sentences, specifically primary key information, and
by constructing such sentences for all the objects in the data model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The
construction of NaLER sentences can also assist in determining semantic
equivalence between two different structural representations together with their
correspondence to ‘facts’ in the universe of discourse that they are describing, a
necessary prerequisite to the successful mapping of pre-existing data structures
to an ontology. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] suggested that “two data bases are equivalent if they represent
equivalent facts about a certain slice of reality.” However, as they also conclude,
this necessitates relying on “...a common understanding of natural language”.
Any reliance on natural language interpretations of such structures, while
increasing accessibility, will also introduce ambiguity and thus lead once again to
the situation described by [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        To assist in minimising this problem, NaLER, drawing on the NIAM-CSDP
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] method which it was designed to complement, advocates the use of ‘example
data’ to instantiate the sentences for the purpose of semantic validation. As a
database design aid, it was intended that the examples used would be those
collected during the initial stages of requirements analysis. However, to create
a meaningful description of existing database objects, data stored within those
objects would be required. This technique, a form of NIAM-CSDP in reverse,
was tested by [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and found to be effective in providing a means by which users
could evaluate the validity of their implemented databases. Use of the NaLER
technique thus provides for the creation of a set of natural language example
sentences, of minimised ambiguity, using the terms of the original data model,
the names of existing database objects and sets of data extracted from these
objects. These sentences can thus be used to both validate the data structures
themselves and to check for consistency with the universe of discourse and other
structural representations of it. As such we believe that it can facilitate the
mapping of a data model to an ontology and its integration with other ontologies.
      </p>
      <p>Exposure of NaLER as a technique has been limited because, until now, it
had not been implemented in a tool, so the large number of exemplar sentences
had to be constructed manually. The current work began as an implementation of
NaLER with the aim of allowing the full potential of the technique to be
realised. Instead of adopting a fixed, template-based algorithm for generating
sentences from instance data, our implementation is novel in that it uses a
flexible, domain independent, ontology-based system for sentence generation.
This results in a two-step process. First, the data model is mapped to an upper
ontology, which the system then uses to generate sentences about the instance
data. The first step is performed by a database administrator, who can use the
mapping phase to perform an initial verification of the data model by noting
whether any of the mappings are suspect. The second step is fully automated and
outputs an arbitrary number of natural language sentences for validation by
domain experts. While this process may appear overly complex, we feel it adds
two benefits. First, all legacy databases are mapped to the same upper model,
thereby facilitating interoperability. Second, the legacy database is validated
by end users before it is mapped to an ontology.</p>
      <p>The paper is structured as follows. In the following section we outline the
steps for generating the NaLER sentences. In section 3 we introduce a sentence
generation system that will be used for producing the NaLER sentences. In this
section we will also consider the ontology which forms our upper model. We argue
that the ontology is both useful as an upper model and necessary given the goals
of sentence generation. In section 4 we will discuss the process of mapping data
models to the upper model. In section 5 we describe an implementation, and
conclude in section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>NaLER steps</title>
      <p>
        NaLER is designed for use with a relational data model, represented
diagrammatically by an ER/EER diagram. (It is important to note that we are assuming
the existence of a data model which corresponds to the database. In the absence
of such a model, some amount of reverse engineering is clearly required before
the NaLER method can be applied). The primary objective is to provide a way
of translating relational data models into a format that can be verified by users.
This is important in its own right, as the ability to read and understand
accurately the information content of EER models has a much wider application.
It is a fundamental skill required by any person involved with EER models in
almost any capacity. Not only the modellers themselves and the users whose
requirements have been sought, but also other end users, such as domain experts,
auditors, systems analysts, database designers and administrators, need to
read a model. However, it can be notoriously difficult for non-experts to read
ER models accurately, which in itself makes NaLER a useful technique. Table 1
summarizes the seven steps of the NaLER method (from [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]).
      </p>
      <p>The following discussion will briefly describe the most relevant steps and
illustrate using the example student data model fragment in figure 1. Note that the
model is highly simplified and does not, for example, include entity attributes.</p>
      <p>
        [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] describes a set of pre-requisites and assumptions for EER models with
respect to the different CASE tools used. These assumptions are to be documented
and checked in steps 1 and 2. An example of one such assumption is as follows:
relationships are optional unless clearly annotated as mandatory (refer to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
for a full list of model assumptions).
      </p>
      <p>Below we illustrate some key steps in the process. Step 3 is concerned with
extracting the sentences that relate to the simple entities within the model by
completing three subtasks (note that Sn stands for “Sentence number”). Step 4
concerns hierarchically related entities. Step 6 involves populating the abstract
sentences with instance data.</p>
      <p>3. For simple entities:</p>
      <p>(a) Construct the sentence for the primary key:</p>
      <p>Sn: Each E<sub>name</sub> is uniquely identified by E<sub>primary key</sub>.</p>
      <p>Sentence 1: Each Student is uniquely identified by Student no.</p>
      <p>(b) Construct the sentence for the attributes:</p>
      <p>Sn: Each E<sub>name</sub> (E<sub>primary key</sub>) must have only one E<sub>attribute name</sub>.</p>
      <p>Sentence 2: Each Approved Program (Student no, Program code) must
have only one Head Approval.</p>
      <p>(c) Construct the sentence for the relationships, so that for each binary
relationship that the entity participates in, we construct two sentences:</p>
      <p>Sn: Each E<sub>1,name</sub> (E<sub>1,primary key</sub>) R<sub>optionality</sub> R<sub>name</sub> R<sub>cardinality</sub>
E<sub>2,name</sub> (E<sub>2,primary key</sub>).</p>
      <p>Sentence 3: Each Student (Student no) may enrol in many
Approved Programs (Student no, Program code).</p>
      <p>SnR: Each E<sub>2,name</sub> (E<sub>2,primary key</sub>) R<sub>optionality</sub> R<sub>name</sub> R<sub>cardinality</sub>
E<sub>1,name</sub> (E<sub>1,primary key</sub>).</p>
      <p>Sentence 3R: Each Approved Program (Student no, Program code) must be
enrolled by at least one Student (Student no).</p>
      <p>4. Construct super/sub-type sentences, so that for each subtype entity we
construct a sentence as:</p>
      <p>Sn: Each Sub E<sub>name</sub> (E<sub>primary key</sub>) is a Super E<sub>name</sub> (E<sub>primary key</sub>).</p>
      <p>Sentence 5: Each Student (Student no) is a Person (Person Number) (not
shown in the model due to space constraints).</p>
      <p>6. Populate the sentences with examples:</p>
      <p>This step is where the sentences are populated with valid examples. In some
cases there is a paradigmatic change to reflect the instantiated nature of the
sentences. For example, sentence 3 becomes:</p>
      <p>Student (111) may enrol in Approved Program (111, 666).</p>
      <p>An important point to reiterate is that the domain experts are only expected
to check and validate the natural language sentences produced in step 6. In the
next sections we describe how we implement step 3, which needs input from a
database administrator.</p>
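      <p>The sentence templates of steps 3, 4 and 6 can be illustrated with a minimal
sketch of plain string templates. This is only an illustration of the templates
themselves and not the ontology-based generation described in the following
sections. The class and function names (Entity, primary_key_sentence,
relationship_sentence, subtype_sentence, instantiate) are hypothetical, and
pluralisation of entity names is deliberately ignored.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for an entity in the data model."""
    name: str
    primary_key: tuple  # names of the attributes forming the primary key

    def label(self) -> str:
        # Renders the entity as "Name (key attribute, ...)".
        return f"{self.name} ({', '.join(self.primary_key)})"


def primary_key_sentence(entity: Entity) -> str:
    # Step 3(a): sentence for the primary key.
    keys = ", ".join(entity.primary_key)
    return f"Each {entity.name} is uniquely identified by {keys}."


def relationship_sentence(source, optionality, verb, cardinality, target) -> str:
    # Step 3(c): one direction of a binary relationship.
    return f"Each {source.label()} {optionality} {verb} {cardinality} {target.label()}."


def subtype_sentence(sub: Entity, sup: Entity) -> str:
    # Step 4: super/sub-type sentence.
    return f"Each {sub.label()} is a {sup.label()}."


def instantiate(source_name, source_key, optionality, verb, target_name, target_key) -> str:
    # Step 6: populate a relationship sentence with example key values.
    source = ", ".join(str(value) for value in source_key)
    target = ", ".join(str(value) for value in target_key)
    return f"{source_name} ({source}) {optionality} {verb} {target_name} ({target})."


student = Entity("Student", ("Student no",))
program = Entity("Approved Program", ("Student no", "Program code"))
person = Entity("Person", ("Person Number",))

sentence_1 = primary_key_sentence(student)
sentence_3 = relationship_sentence(student, "may", "enrol in", "many", program)
sentence_5 = subtype_sentence(student, person)
example_3 = instantiate("Student", (111,), "may", "enrol in", "Approved Program", (111, 666))
```
      </preformat>
      <p>Under these assumptions, example_3 reproduces the instantiated form of
sentence 3 given in step 6, while sentence_3 differs from the hand-written
version only in the missing plural of “Approved Program”.</p>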
    </sec>
    <sec id="sec-3">
      <title>The KPML Sentence Generation System</title>
      <p>The Komet-Penman Multilingual (KPML)<sup>1</sup> natural language generation system
was originally designed as a tool for exploring natural language grammar, but
also as a domain general system which could be used to provide plugin
functionality for system developers who needed to add text generation capabilities
to their applications. The key to this functionality is the Upper Model (UM), an
interface ontology that mediates between the formal components that determine
the grammatical expressions, and domain specific conceptual entities from
independently constructed application ontologies. The basic procedure, then, is to
find an appropriate UM concept to subsume each domain concept, which results
in the domain concept inheriting the necessary features for text generation in
the KPML system.</p>
      <p>
        The UM is a linguistically motivated upper ontology that contains only
concepts which have an impact on grammatical expression. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] argues that the UM is
suitable as a general upper ontology for organizing domain knowledge because it
provides a theoretical framework for the ontology structure. But its framework
is unique because the UM differs both from highly targeted domain specific
ontologies and from abstract, domain independent upper ontologies. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] argues
that if different ontologies are constructed for very different and highly specific
domains and reasoning tasks, then this results in highly diverse and possibly
irreconcilable ontologies. Hence the need for some unifying framework. But large,
general purpose ontologies that are meant to serve a diversity of domains and
reasoning tasks are too underspecified for this task: the requirement to simply
“represent the world” is not sufficient to construct a useful organization of the
knowledge. The linguistic UM is offered as a solution because its construction
derives from the most general but still formally specifiable task that is common to
all domains: expression of knowledge about the domain in natural language. The
interesting theoretical claim is that the concepts which are needed for generating
natural language can unify the conceptual spaces of different subject domains.
This seems intuitively plausible because natural language is clearly the most
expressive general language for expressing domain facts, and its conceptual core
a good candidate for a unifying ontology. It is this property that makes the UM
a suitable foundation for the conversion of legacy data models into ontologies.
      </p>
      <p>To the extent that upper ontologies are useful, there is considerable debate
about which might be the most appropriate one to use. For example, both SUMO<sup>2</sup>
and Cyc<sup>3</sup> offer very comprehensive but differing ontologies with their
corresponding upper ontology structures. And, while both can be used in natural language
applications, neither has the feature that the ontology is a direct reflection of
the possible linguistic expressions. In designing the current application we found
that the directness of the mapping from the data model to the ontology to the
natural language sentences made the implementation relatively straightforward.</p>
      <p><sup>1</sup> http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/README.html</p>
      <p><sup>2</sup> http://www.ontologyportal.org/</p>
      <p><sup>3</sup> http://www.cyc.com/</p>
      <p>The Upper Model which is used in KPML 4.0 consists of 410 concepts
organized in a single hierarchy. Part of the top level of this hierarchy is shown in
figure 2.</p>
      <p>[Figure 2: Part of the top level of the Upper Model hierarchy. Nodes shown:
UM-THING; OBJECT, PROCESS, QUALITY; CONSCIOUS-BEING, NON-CONSCIOUS-THING,
DECOMPOSABLE-OBJECT, NONDECOMPOSABLE-OBJECT, NAMED-OBJECT; MATERIAL-PROCESS,
MENTAL-PROCESS, RELATIONAL-PROCESS, VERBAL-PROCESS; MATERIAL-WORLD-QUALITY,
LOGICAL-QUALITY.]</p>
      <p>It may be noted that 410 concepts is relatively few for representing all
possible world concepts. However, it is an empirical question whether or not it
is sufficient to express required concepts in their veridical linguistic form, and
whether such an ontology is useful for facilitating inter-operation between data
models which have had subsumption relations defined against the same upper
ontology. Of course, the reader should remember that the nodes which are formed
in the ontology will include the properties that are transferred from the entity
attributes, which will be important in questions of interoperability. Nevertheless,
the interesting empirical and methodological point is that the choice of modeling
constructs and methods is determined to a large extent by the requirements of
the task.</p>
      <p>
        One unusual feature of the ontology is the ubiquitous use of reification for
modeling relationships between concepts. Inherently relational concepts such as
<italic>behind</italic>, <italic>under</italic>, and <italic>similar</italic> are modeled as concepts subsumed by two-place-relation.
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] justify this decision by noting that knowledge representation languages
typically provide mechanisms for defining concepts and relations among them, but
“... in Natural Language this distinction is often blurred, so that it is not always
clear whether a concept or a relation should be used to represent a given
property” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] (p. 6). For example, the spatial locating concept behind would normally
be taken as a relation holding between two concepts. Instead, the upper model
defines it as a regular concept by reifying the relation, then defining the domain
and range as the participants in the relationship. This affects the way ER
concepts are mapped to the UM, since all terms are mapped to UM concepts.
On the other hand, it eliminates a source of indeterminacy and confusion, since
the modeler does not need to decide whether a data model construct should be mapped
to an ontological concept or to a relation.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Mapping Domain Concepts to the Upper Model</title>
      <p>This phase of the process is responsible for ensuring that the correct sentences are
generated, and for mapping the data model onto the upper ontology structure.
As we will see, the sentence generation has two complementary roles. First, it
generates canonical sentences for verification by domain experts. Second, the
mapping necessary for generating the correct surface form of the expression
actually guides the selection of the most appropriate ontology concept.</p>
      <p>The mapping was conducted manually by first narrowing the possible UM
concepts and then selecting the one that provided the closest matching output.
The procedure then is to select a candidate UM concept as subsumer, generate
the sentence, and compare the target with the generated sentence. If there is
a mismatch, the candidate is refined, most commonly by choosing a super- or
sub-class of the original candidate. The interesting part of this process is to
use the generated linguistic string to arrive at the most suitable conceptual
mapping. The assumption is that the empirically derived linguistic constraints
on concept/text correspondence as embodied in the UM and KPML are to a large
extent veridical. It is a matter of design principle that the ontological components
of the Upper Model exist because they produce canonical surface forms for the
available conceptual specifications. It therefore follows that we have identified
the most acceptable ontological mapping when the generated form matches the
target form<sup>4</sup>.</p>
      <p>
        As an example, consider the relationship between Staff and Offering in figure 1,
expressed by the target NaLER sentence “each Staff may run one or more
Offering”. First we note that Staff is easy to categorize because it is subsumed
quite naturally by the UM concept Person, and there is no justification for
trying the more specific concepts female and male that are defined in the UM.
The relation in the model is named by a verb, as typically prescribed [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], in this
case run. Verbs in natural language introduce predicates that typically denote
an action, occurrence, or a state of being<sup>5</sup>, which gives us some clue about the
ontological status of the relationship. To determine which UM concept the
relationship name falls under, we note that the initial division in the UM is
between object, process, and quality. Of these, it is clear that object and
quality are not appropriate, leaving process or one of its subclasses as the
only possible subsumer. Further, since all relationships in the UM are reified,
every concept representing a relationship is itself subsumed by process, or more
specifically by the UM concepts one-place-relation and two-place-relation, which
are subsumed by process. All relationships in ER diagrams should therefore map
to one of these concepts.
      </p>
      <p><sup>4</sup> This is not to claim that the Upper Model ontology is a universally “correct” ontology
in some sense. On the contrary, like all empirical artifacts it undergoes constant
evaluation and revision. Nevertheless this evaluation is theory driven and traceable.</p>
      <p><sup>5</sup> This is a more or less linguistically unsophisticated description as commonly
construed, e.g. in http://en.wikipedia.org/wiki/Verb. The more technical syntactically
motivated definitions are, however, not necessary for our purposes.</p>
      <p>One strategy for finding the appropriate mapping is to begin with process or
one of its most general subclasses representing a relation. In this case “running
an Offering” seems like a process undertaken by the staff member so we begin
with process. But this generates the surface string “Each Staff may run”. Any
mention of the Offering is suppressed because the process is assumed to be
simply “running”. In order to bring in the affected entity we descend two levels
in the hierarchy to the concept directed-action which is the first to introduce an
ACTEE role. This time KPML generates the sentence “Each Staff may run at
least one Offering.” which is equivalent to the target sentence.</p>
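      <p>The select, generate and compare procedure used in this example can be
sketched as a simple search over candidate concepts. The sketch below does not
call KPML: toy_generate is a hypothetical stand-in that only models whether an
ACTEE role is available, and the intermediate concept name material-process is
an assumption made to represent the level between process and directed-action.</p>
      <preformat>
```python
# Toy fragment of the process branch, stored as child -> parent.
# "material-process" is an assumed name for the intermediate level.
PARENT = {
    "material-process": "process",
    "directed-action": "material-process",
}

# Concepts that first introduce an ACTEE role (per the discussion above).
INTRODUCES_ACTEE = {"directed-action"}


def has_actee(concept):
    # A concept has the ACTEE role if it or any ancestor introduces it.
    while concept is not None:
        if concept in INTRODUCES_ACTEE:
            return True
        concept = PARENT.get(concept)
    return False


def toy_generate(concept, actor, verb, actee):
    # Hypothetical stand-in for the generator: the actee is only
    # expressed when the chosen concept provides an ACTEE role.
    sentence = f"Each {actor} may {verb}"
    if has_actee(concept):
        sentence += f" at least one {actee}"
    return sentence + "."


def refine(candidates, target, generate):
    # Try candidates from general to specific and return the first
    # concept whose generated sentence matches the target sentence.
    for concept in candidates:
        if generate(concept) == target:
            return concept
    return None


target = "Each Staff may run at least one Offering."
candidates = ["process", "material-process", "directed-action"]
chosen = refine(candidates, target, lambda c: toy_generate(c, "Staff", "run", "Offering"))
```
      </preformat>
      <p>In this toy setting the search settles on directed-action, mirroring the
manual descent described above. In practice the comparison was carried out by
hand against real KPML output.</p>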
      <p>Finally, the entity “Offering” needs to be mapped to an appropriate UM
concept. The astute reader may have realized that an initial assignment must
have been made for “Offering” in order to generate the target sentences. In fact,
this means that the choice of assignments is often iterative because the surface
form is a function of all assignments. Ideally the assignments would be made
in parallel, simultaneously satisfying all requirements. In this case the default
assignment was the most general UM concept, UM-THING, which was obviously
sufficient to produce the required target. However, we try other assignments
because it is somewhat unsatisfying to remain with such a general one, and
because we must ensure that the generated string will not be changed with more
specific assignments. But it is not natural to construe “Offering” as either an
object, process, or quality, so we explore deeper in the ontology. We note that the
object hierarchy is associated with the following somewhat non-committal gloss:
“An entity which is not a process or a quality.”<sup>6</sup> Object therefore subsumes
concepts such as ordered-set, time-interval, and non-conscious-thing. Of these,
the latter immediately subsumes abstraction which is a reasonable candidate
to subsume “Offering”. In making this assignment we find the target sentence
generated as required.</p>
      <p>Consider now a somewhat more difficult example, the relationship
between Student and ApprovedProgram, for which the target NaLER sentence is “Each
Student must enrol in at least one ApprovedProgram”. As in the previous example,
Student is subsumed by the UM concept Person. But “enrol-in” is somewhat
trickier. One possibility is to try the previously successful assignment to
directed-action, where the “Student” acts on the “ApprovedProgram” (for now assigned
to UM-THING). However, this generates the following surface form: “A Student
enrols an ApprovedProgram.” From this we see that a directed-action must
describe an event in which the actor somehow causes something to happen to the
actee. This was appropriate only in the previous example. A further option is
to map to nondirected-action, which covers “... those material-actions which either
have no ‘actee’, or whose ‘actee’ is not created or affected by the action” (LOOM
knowledge base). While this seems plausible, it creates a further problem because
the “ApprovedProgram” can no longer be the actee. Browsing the UM
ontology reveals that spatio-temporal relationships receive prominent representation,
and suggests that an “ApprovedProgram” can be regarded as a temporal entity
which contains enrolments. This assignment generates the correct target
sentence and has the interesting side effect that it alerts the data modeler that
perhaps “ApprovedProgram” should have a property for the dates during which
it was offered, which is another aspect for verification.</p>
      <p><sup>6</sup> Definitions can be found in the LOOM knowledge base available at
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/resources/merged-uppermodel.zip</p>
      <p>Let us briefly mention another interesting aspect of this method. It is of
course entirely possible that the relationship in the original data model had
been labeled with a different word such as take: “Each Student must take at
least one ApprovedProgram.” In this case the assignment of “take” to
directed-action does generate the NaLER sentence: “Each STUDENT must take at least
one ApprovedProgram.” But the difference in the mappings to the UM concepts
should alert the architect to some subtle differences in the possible
interpretations of the documented models. We just argued that “ApprovedProgram” is
best viewed as a sort of container for “Students”, with possible attributes to
reflect this role. The alternative designation, in analogy with “Staff runs
Offering”, suggests a more active role in which the actor has a primary effect on
the actee. Of these two alternatives the former seems more appropriate,
showing that the structure of the ontology highlights possible design decisions and
inconsistencies in the data model.</p>
    </sec>
    <sec id="sec-5">
      <title>Implementation</title>
      <p>The specification that mediates between the domain ontology, the UM and the
rest of the KPML grammar and generation system is called the Sentence Plan
Language (SPL). Figure 3 shows a simple specification for the logical component
of the example “Each Staff may run at least one Offering.”</p>
      <preformat>(EN / DIRECTED-ACTION :LEX run
  :ACTOR (P1 / PERSON :LEX staff :set-totality-q total)
  :ACTEE (P2 / abstraction :LEX offering :AT-LEAST 1)
  :modality may)</preformat>
      <p>In this example the surface form of lexical items is directly introduced by
:LEX in the SPL expression, and its UM type by the expression immediately
preceding. But it is also possible to define a lexicon which contains this
information, in which case only the word needs to be specified in the SPL.</p>
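      <p>To indicate how a tool might assemble such a specification once the UM
assignments have been chosen, the following minimal sketch builds the SPL text
of figure 3 by string formatting. It does not use any KPML interface; the
helper names spl_term and directed_action_spl are hypothetical, and only the
keywords that appear in figure 3 are assumed.</p>
      <preformat>
```python
def spl_term(variable, um_concept, lex, **features):
    # Builds one participant term, e.g. "(P1 / PERSON :LEX staff ...)".
    # Keyword names use underscores in Python and hyphens in the SPL text.
    parts = [f"({variable} / {um_concept} :LEX {lex}"]
    for key, value in features.items():
        parts.append(f":{key.replace('_', '-')} {value}")
    return " ".join(parts) + ")"


def directed_action_spl(verb, actor, actee, modality):
    # Assembles the top-level expression for a DIRECTED-ACTION,
    # following the layout of the figure 3 example.
    return (
        f"(EN / DIRECTED-ACTION :LEX {verb}\n"
        f"  :ACTOR {actor}\n"
        f"  :ACTEE {actee}\n"
        f"  :modality {modality})"
    )


actor = spl_term("P1", "PERSON", "staff", set_totality_q="total")
actee = spl_term("P2", "abstraction", "offering", AT_LEAST=1)
spl = directed_action_spl("run", actor, actee, "may")
```
      </preformat>
      <p>Here spl holds the same expression as figure 3. A lexicon-based variant
would omit the UM concept from each term and look it up from the word
instead.</p>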
      <p>Note that the implementation of the generation mechanism falls somewhat
short of the intended goal, in that the SPL must often contain some
information outside the UM to produce the desired surface form. The SPL in figure 3
is quite simple, and some examples require even more non-UM terms to
generate the correct sentence. This requires application developers to know
something about the details of the KPML system. However, in the current work
we decided that these details should be hidden from the user as much as
possible, and we are developing a comprehensive tool that will simplify the mapping
process. Only some of the necessary functionality has been implemented at the
time of writing.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper we have outlined a system that can help automate the
production of exemplar NaLER sentences from legacy databases. Once the ontological
structure is determined the generation is fully automatic. This will help with
validation of the data model. But the novelty of the implementation is in the
required mapping to the UM ontology, which is a step towards converting a legacy
data model into an ontology which exposes the full range of rules and
relationships inherent in the database, thus facilitating the exchange of knowledge.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgement</title>
      <p>This work was sponsored by the Norwegian Research Council, WISEMOD project,
160126V30 in the IKT-2010 program, and the ADIS project in the same
program.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Atkins</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>INTECoM: An integrated approach to the specification and design of information requirements</article-title>
          . In Becker, S (ed.)
          <source>Designing Developing Quality Complex Database Systems: Practices, Techniques, and Technologies</source>
          . Idea Group Publishing, Hershey, PA,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Atkins</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Patrick</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          <article-title>NaLER: A natural language method for interpreting entity-relationship models</article-title>
          .
          <source>Campus-Wide Information Systems</source>
          <volume>17</volume>
          (
          <issue>3</issue>
          ),
          <fpage>85</fpage>
          -
          <lpage>93</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Barker</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <source>CASE*Method: Entity Relationship Modelling</source>
          , Addison Wesley, Wokingham, England,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bateman</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          <article-title>The Theoretical Status of Ontologies in Natural Language Processing</article-title>
          .
          <source>Proceedings of the workshop on 'Text Representation and Domain Modelling: Ideas from Linguistics and AI'</source>
          , held at the Technical University Berlin, October 9th - 11th,
          <year>1991</year>
          .
          <source>KIT Report 97</source>
          , edited by
          <string-name>
            <given-names>Susanne</given-names>
            <surname>Preuß</surname>
          </string-name>
          and
          <string-name>
            <given-names>Birte</given-names>
            <surname>Schmitz</surname>
          </string-name>
          . Cmp-lg Paper No: cmp-lg/9704010.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bateman</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kasper</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Whitney</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>A general organization of knowledge for natural language processing: The Penman Upper Model</article-title>
          .
          <source>Technical Report. Information Sciences Institute. Marina del Rey</source>
          , California, (
          <year>1989</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castano</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Capitani di Vimercati</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montanari</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Vincini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>An Intelligent Approach to Information Integration</article-title>
          . In Nicola Guarino, editor,
          <source>Formal Ontology in Information Systems</source>
          , pages
          <fpage>253</fpage>
          -
          <lpage>267</lpage>
          . IOS Press,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Biller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Neuhold</surname>
            ,
            <given-names>E.J.</given-names>
          </string-name>
          <article-title>Semantics of data bases: the semantics of data models</article-title>
          .
          <source>Information Systems</source>
          , Vol.
          <volume>3</volume>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>1978</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <article-title>English Sentence Structure and Entity-Relationship Diagram</article-title>
          .
          <source>Information Sciences</source>
          , Vol.
          <volume>1</volume>
          , No. 1, Elsevier, Pages
          <fpage>127</fpage>
          -
          <lpage>149</lpage>
          , May,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Dou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>LePendu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <article-title>Ontology-based Integration for Relational Databases</article-title>
          .
          <source>SAC'06</source>
          , Dijon, France, April 23-27,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Finkelstein</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>An Introduction to Information Engineering: From Strategy Planning to Information Systems</article-title>
          . Addison Wesley, Sydney,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Halpin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <source>Information Modeling and Relational Databases: From Conceptual Analysis to Logical Design</source>
          . Prentice Hall, London,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Marche</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Measuring the stability of data models</article-title>
          .
          <source>European Journal of Information Systems</source>
          Vol
          <volume>2</volume>
          no
          <issue>1</issue>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ryder</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          <article-title>Facilitating Evolution in Relational Database Design: A procedure to evaluate and refine novice database designers schemata</article-title>
          ,
          <source>Unpublished thesis</source>
          , Massey University, New Zealand,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>