=Paper=
{{Paper
|id=None
|storemode=property
|title= COMIC: A System for Conceptual Modelling and Information Construction
|pdfUrl=https://ceur-ws.org/Vol-961/paper27.pdf
|volume=Vol-961
|dblpUrl=https://dblp.org/rec/conf/caise/Kangassalo89
}}
== COMIC: A System for Conceptual Modelling and Information Construction==
COMIC: A SYSTEM FOR CONCEPTUAL MODELLING AND INFORMATION CONSTRUCTION*

Hannu Kangassalo, University of Tampere, Department of Computer Science, P.O. Box 607, SF-33101 Tampere 10, Finland

Abstract: The designer of a data base or an information system has to develop conceptual models describing the Universe of Discourse (UoD) in order to specify the meaning and semantic structure of the data. He also has to handle descriptions of data structures and perform the technical design of the system. Finally, the system must be implemented. Often the amount of work required is so large that a computerized tool to support the design process would be useful. In this paper such an integrated system is briefly described. The system, called COMIC, supports first the conceptual modelling phase, then the logical design phase, and finally the implementation of the data base; moreover, it can be used as a query and update system for the data base. The graphical language, CONCEPT D, used to describe a conceptual schema is new. CONCEPT D is a visual language that supports conceptual modelling of the UoD, the development of a conceptual schema, and the use of the application data base corresponding to the conceptual schema. The language is based on the intensional approach to conceptual modelling, i.e. the description of knowledge in concepts is emphasized instead of the extensions of these concepts. The language contains three sub-languages: one, called CONCEPT DID, for acquiring knowledge about concepts used in the UoD and developing graphic descriptions of concept definition hierarchies; one, called CONCEPT DICS, for describing conceptual schemata of the UoD; and one, called Conceptual Query Language (CQL), for interacting with the application data base. A conceptual schema is used as a basis for making queries to the application data base corresponding to the conceptual schema.
A graphic conceptual schema supports the user in recalling and understanding the conceptual structure of the UoD. It hides the relational data base schema from the user so that he works only with concepts of the UoD. He does not have to know, e.g., the relations or the foreign keys between relations. In this work some of the main features of CQL are presented. A query is formulated by pointing at the concepts required, and possibly by giving some selection criteria and instructions for formatting the output. CQL is unique in its capability to manipulate concept definition hierarchies. The core component of the COMIC system is the concept data base, on which other components, such as a conceptual schema editor, a schema translator and a conceptual query language, have been built.

=== 1. INTRODUCTION ===
In the design of data bases and information systems there has been a continuous trend to increase the amount and complexity of data processed by these systems. Accordingly, the utilization of semantic information describing the Universe of Discourse (UoD) and the meaning of data has been increasing [1,7,8]. There are several reasons for this, arising from different backgrounds:

1. The requirements from users to improve the quality of data produced by information systems.

2. In order to benefit from formal design methods and advanced design tools it is necessary to describe the object of design, i.e. data, in greater detail than before [13,28].

3. Attempts to apply and integrate methods from artificial intelligence to data management, software engineering and decision support systems [3,4,5,28,29].

4. Attempts to develop methods and tools for improving users' knowledge of the semantics of available data [1,6].

Conceptual modelling itself, by which we mean the study and development of concepts concerning the UoD, and the construction of models by using these concepts, has been studied very little.
The best introductions seem to be found in the literature concerning the construction of scientific theories and the analysis of the properties of these theories [e.g. 9,10]. More work has been done on modelling languages, such as the Entity-Relationship approach [13].

Because of the requirements described above, the design and management of the information resources of a company is becoming more and more important as well as more and more difficult. The effective management of the structure and meaning of data requires efficient and flexible design and management tools that are easy to use.

The goal of the COMIC system is to support the entire design process of a data base and an information system, as well as their implementation and use. The main emphasis is on the methods and tools for conceptual modelling. However, attention has also been given to the entire data base design methodology, because unintegrated methods and tools have often proved difficult to use and have been abandoned after short initial trials. The COMIC system releases a user from the burden of knowing data base technology in detail, so that he can concentrate on solving problems that relate to the domain of application.

In section 2 the phases of the data base design process are briefly described using the concepts and terminology of the approach used in this work. The architecture of the COMIC system in its current form is described in section 3. Section 4 contains a short summary.

=== 2. THE DATA BASE DESIGN PROCESS ===
Data base design ranges from the design and definition of the conceptual content of the UoD to the technical specification of a data base. In its first phases it concentrates strongly on the recognition, analysis and definition of concepts used in the UoD and later applied in the definition of the meaning of data and of the relationships between components of data. The definition of the meaning of data requires that the concepts are well known and defined.
From this it follows that conceptual modelling of the UoD has become an inseparable part of all design methodologies (see, e.g., [33]). Data base design can no longer be seen as a single, purely technical task. It pervades the whole information system design and implementation process. The conceptual structure of data cannot be implicitly built into the system or programs; it must be explicitly described and it must be possible to study it separately when needed. The knowledge about concepts of the UoD and their relationships has to be moved from programs into the conceptual description of the UoD and made available to all users or designers who may need it (see [36]).

The goal of data base design is thus, in addition to the technical design of a data base, to design the information content of data so that it is well defined, easily adaptable to changes of the UoD, and independent of changes in the technical implementation of the system. Reaching this goal requires that the technical structure of data be designed on the basis of a carefully analyzed and defined conceptual content of the UoD.

In figure 1 a model of the data base design process is shown. For the sake of clarity, the descriptions of possible iterations have been left out. Similar models have been presented elsewhere, e.g., in [12]. The main difference from other models is in the concepts and methodology applied in the first three phases of this model. The phases of work are symbolized by boxes and the results of each phase by ovals. The arrows show the main direction of the design process.

The first three phases are completely independent of any data base management system. Their goal is to develop a conceptual schema describing the UoD of the user community. It forms the conceptual basis for the whole design process.
Figure 1. A model of the data base design process. (Diagram not reproduced. Labels recoverable from it: definition of concept structures and definition of the conceptual schema for each user or user group; conceptual level; data structure level; distribution design; selection of data structures; selection of logical storage structures; mapping to the physical storage; physical level; physical data base.)

A conceptual schema contains a conceptual description of the UoD and data collected from the UoD. It is developed by using only concepts used in the UoD and it consists of concepts and rules of the UoD. It is important to note that it contains (or it should contain) all the concepts and integrity rules recognized in the UoD (100%-rule) [36].

The design is started by collecting, analyzing and defining the concepts of the UoD needed by users. In figure 1 the users or groups of users are symbolized by capital letters A, B, ..., T. The goal of this phase is to establish a well defined conceptual foundation for further work. The results are called concept structures. A concept structure is a hierarchical description (definition) of a concept. It gives a detailed analysis of the knowledge content of a concept. From experiments we have learned that, at least as far as abstract and complex concepts are concerned, the construction of concept structures is a very useful or even necessary step, instead of trying to build a conceptual schema directly [23]. Concept structures are building blocks from which the conceptual schema can be constructed. Concept structures are regarded as separate objects.

In the second phase the UoD of a user or of a group of users is described by using concepts defined by them. The result is a conceptual schema or view of the UoD. There is at least one view for each user or group of users.
It should be noted that a view is not a description of data alone, but is a local 'theory' of the UoD. It contains concepts and rules on different levels of abstraction. Only some of them will be implemented as data and programs in the information system. The rest are necessary for the correct functioning of the UoD and for understanding the data. Thus the view is also a local knowledge base, which can be used, e.g., for training new employees, for the design of new rules or concepts, etc.

The number of user views may be large. Because each of them is a subjectively biased description of the users' UoD, they must be integrated into a common conceptual schema from which all contradictions have been removed and in which relationships between different views, as well as common components of views, are described in a unified way. Integration is done in the third phase. It is a complex task, which needs good support for checking the completeness, integrity and consistency of the common conceptual schema [2,16,30].

Often a conceptual schema contains structures and rules for which there are no suitable means of implementation in the underlying data base management system. It must then be restricted to fit the rules of the data model and the DBMS. This is done in the fourth phase. Structures and rules which cannot be implemented by using the DBMS must be taken out of the conceptual schema and implemented in application programs.

=== 3. THE COMIC SYSTEM ===
==== 3.1. An overview of the system architecture ====
In the following a short description of the architecture and functions of the COMIC system is given. The COMIC system is more than just a collection of design tools. It is an integrated design support system for eliciting and acquiring knowledge from users, for creating, storing and manipulating a conceptual schema, and for using the conceptual schema and the corresponding application data base. The conceptual schema is the core of the architecture.
It is connected with an underlying data base management system in such a way that it can be used directly as a component of the conceptual query language. The COMIC system consists of several integrated components (see figure 2):

1. Concept editor for eliciting and acquiring knowledge from users and for creating, manipulating and storing graphical concept structures.

2. Conceptual schema editor for creating, storing, browsing and manipulating a conceptual schema based on concept structures.

3. Integrator for pairwise integration of conceptual schemata.

4. Schema translator for transforming a conceptual schema into a corresponding relational data base schema which can be directly given to a relational data base management system.

5. Conceptual query language CQL for making queries and updates to the relational data base by pointing at components in the conceptual schema.

6. Concept base management system for storing semantic information contained in concept structures and conceptual schemata.

7. Application data base which contains data corresponding to the conceptual schema.

Figure 2. The architecture of the COMIC system. (Diagram not reproduced. Labels recoverable from it: user; data base administrator; concept editor; schema editor; integrator; workspace; concept base management system; concept data base; schema translator; schema specification; data base schema; ORACLE; data base.)

The concept data base is used by components 1-5 as a dynamic design data base, as a concept definition library, as a store for the conceptual schema and as a dictionary needed by the system for the evaluation of CQL queries.

==== 3.2. Development of the conceptual schema ====
===== 3.2.1. Motivation for using the conceptual schema =====
The goal of conceptual modelling is to clearly recognize and describe the set of concepts of the UoD regarded as important by users, and to construct a conceptual description of the UoD as the users want to see it by using the set of well defined concepts.
On the other hand, the goal is to establish a firm foundation for the specification of the meaning of the data to be stored in the data base, and for the specification of relationships between these data.

===== 3.2.2. Concepts and concept structures =====
In the following a concept is defined to be an independently identifiable formal construct with an internal structure, consisting of structured semantic information, i.e. knowledge [17,18,22]. Concept structure diagrams describe important concepts of the working environment of users. They may describe data requirements, reports, data structures, objects, events, processes or relationships of the UoD. They can be as large and complex as needed to explain the essential content of concepts. Concepts are not classified into entities, attributes or relationships; we just have concepts, which are regarded as basic epistemological components of human knowledge.

The working hypothesis used in this work is that the basic epistemological relation between concepts is the relation of intensional containment, and that the methodology and notations used in conceptual modelling should be based on this relation. The relation of intensional containment is a binary relation defined within the set of concepts [26]. An intuitive explication can be given by saying that concept a contains intensionally concept b if the knowledge that forms concept a contains the knowledge that forms concept b. Note that we are talking about the knowledge required to recognize phenomena a and b in the UoD, not the way in which the definitions of these concepts are constructed. The relation of intensional containment makes it possible to develop all generally known modelling constructs in a systematic and consistent way, together with some novel modelling constructs [18].
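As a hedged illustration of the relation of intensional containment, the following minimal Python sketch stores direct containment as a mapping and derives everything a concept contains by following that mapping transitively. Treating the relation as transitive, the function names `contained_concepts` and `contains_intensionally`, and the dictionary representation are assumptions made for illustration; they are not part of COMIC. The concept names are borrowed from the EMPLOYEE example used later in the paper.

```python
from typing import Dict, Set

# Direct intensional containment: definiendum -> concepts it directly contains.
# The representation is an assumption; only the concept names come from the paper.
DIRECT: Dict[str, Set[str]] = {
    "EMPLOYEE": {"EMPNUMBER", "COMPETENCE", "WORK-PHASE"},
    "COMPETENCE": {"COURSE"},
}


def contained_concepts(concept: str, direct: Dict[str, Set[str]]) -> Set[str]:
    """Return every concept reachable from `concept` by following containment."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        # Basic (undefined) concepts have no entry and contribute nothing further.
        for child in direct.get(current, set()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def contains_intensionally(a: str, b: str, direct: Dict[str, Set[str]]) -> bool:
    """True if concept `a` contains concept `b`, directly or through other concepts."""
    return b in contained_concepts(a, direct)
```

Under these assumptions, `contains_intensionally("EMPLOYEE", "COURSE", DIRECT)` holds because EMPLOYEE contains COMPETENCE, which in turn contains COURSE, while the reverse direction does not hold.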
A concept structure is a construct which consists of a defined concept (definiendum) and of its definition hierarchy, and in which the properties of the definiendum derive from the properties of basic concepts [18]. The graphical layout is meaningful in a concept structure diagram. The definiendum is on top of the hierarchy and the concepts defining it are on the next or lower levels of the hierarchy. Structurally a defined concept is always a directed acyclic graph based on the relation of intensional containment. The definiendum contains intensionally other concepts which are some of its characteristics. A characteristic of a concept can be another concept contained in the defined concept, a relationship of intensional containment, or an additional inscription contained in the definition hierarchy. An inscription can be a condition, a constraint, a conditional constraint, or the definition of an identifying property or the limitation of its scope. The defining concepts are in turn defined by other concepts until the level of undefined, basic concepts is reached. For an occurrence of a concept to exist, occurrences of all its defining concepts must exist, but not vice versa.

The definition of a concept is represented as a diagram that associates the definiendum with the set of defining concepts. The type of the diagram, together with some attached inscriptions, specifies how the properties of the definiendum are to be derived from the properties of the defining concepts. The most commonly used types of definition are:

1. Aggregation, in which a concept is defined as a collection of its characteristics. There are several different types of aggregation.

2. Generalization, in which a concept is defined as a collection of common characteristics of its defining concepts. A generalization can be either an unconstrained generalization or a constrained generalization. Constraining can be done either implicitly or explicitly.

3.
Value transformation, in which a concept is 'defined' by specifying how the value representing it can be derived from values representing defining concepts.

It is important to note that a definition and the corresponding concept are different things. Structurally a concept is always an aggregation, i.e., a collection of its characteristics. A definition is a rule or instruction which specifies how the knowledge forming the defined concept is to be constructed from the knowledge given in the definition itself and in the defining concepts.

Concept structures in COMIC are described by using a graphical language CONCEPT DID [18,19,22]. Some extremely simple examples of concept structures are shown in the following.

In aggregation a definiendum is constructed by composing two or more characteristics together and by assigning a name to the resulting construct. At least one of the characteristics must be a concept. In the definition all the relations between contained concepts must be specified, as well as all inscriptions attached to them. The general pattern of aggregation is in Figure 3. For simplicity the symbols for exclusive-or (classification) and for identifiers have been omitted. A qualifier can be a condition list, a constraint list or a conditional constraint. A CR-qualifier can be a constraint list or a conditional constraint. The details of conditions and constraints are not described. To each concept name a list of attributes can be attached. An attribute can be, e.g., a value set, an occurrence constraint or a semantic rule.

Figure 3. Pattern of aggregations. (Diagram not reproduced.)

Figure 3 indicates how conditions, constraints, and conditional constraints can be attached to a definition. They may concern values of occurrences, value sets, extensions of concepts, equality or differences between occurrences, and the occurrence time or occurrence conditions of concepts.
They can be written using either a natural language or a formal language. Examples of simple aggregations are in Figure 4.

Figure 4. Examples of definitions by aggregation. (Diagram not reproduced.)

The concept EMPLOYEE has been defined with three concepts (EMPNUMBER, COMPETENCE, WORK-PHASE) and one cardinality constraint (1:n). Concept COMPETENCE contains concept COURSE and one cardinality constraint. The definition of PERSONNEL indicates that the personnel of a company consists of a set of employees. This kind of structure, sometimes called an association in the literature, is only a special case of aggregation in CONCEPT D.

An occurrence of the concept defined by aggregation exists if occurrences of all defining concepts exist and all constraints in the definition are true. This implements in a natural way the different kinds of subclass and specialization abstractions used in some other modelling techniques. They are represented by ordinary concept definitions in CONCEPT D. For a certain occurrence of the definiendum, the occurrence of a defining concept can be missing if the condition attached to the corresponding intensional containment line is false. By using this mechanism different versions of the same concept can be defined easily.

In generalization a definiendum is constructed by assigning to it either all common characteristics of the defining concepts, or some subset of all common characteristics of the defining concepts. The general graphic pattern of generalization is in Figure 5.

Figure 5. Pattern of generalizations. (Diagram not reproduced.)

Definition by generalization differs from definition by aggregation in one essential aspect. The definition must be evaluated before the intension of the definiendum is revealed. The resulting concept is structurally just an ordinary concept. This fact explains why it is important to differentiate between a concept and its definition. In the evaluation three alternatives exist.
These alternatives are expressed by the generalization type (G | GE | GI) in the general pattern. In unconstrained generalization (G) the definiendum will contain all common characteristics of its defining concepts. In explicit generalization (GE) the selection expression is applied to the result of unconstrained generalization and all characteristics for which it evaluates true are accepted as characteristics of the definiendum. During the evaluation of the definition, first all common characteristics are recognized, then the characteristics not selected are removed, and finally the implications of the removals are worked out. The last step has to be done because the removal of a characteristic may cause some structural inconsistencies in the definiendum, which must then be eliminated. In implicit generalization (GI) the selection expression is applied to the result of unconstrained generalization and all characteristics for which it evaluates false are accepted as characteristics of the definiendum. In this case the definiendum contains those characteristics which are common to all defining concepts and are not mentioned in the selection expression. Also in this case some implications must be taken into account. A selection expression can also be a list of concept names.

In value transformation a definiendum is constructed by specifying how the value representing the definiendum is derived from the values representing the defining concepts. In the definition some conditions, constraints and conditional constraints can also be used. The general graphic pattern of value transformation is in Figure 6.

Figure 6. Pattern of value transformations. (Diagram not reproduced.)

For example, the value representing concept SALARY is derived from the values representing concepts HOURS and WAGE according to the specification of a salary function.
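Returning to the three generalization types (G, GE, GI) described above, their evaluation can be sketched with plain sets. This is a minimal illustration under stated assumptions: characteristics are modelled as strings, the selection expression as a predicate, and the final step of working out the implications of removals is omitted. The function name `generalize` and the sample concepts CAR and BOAT are hypothetical and do not appear in the paper.

```python
from typing import Callable, Iterable, Set


def generalize(defining: Iterable[Set[str]],
               kind: str = "G",
               select: Callable[[str], bool] = lambda c: True) -> Set[str]:
    """Evaluate a generalization over the characteristic sets of the defining concepts.

    kind "G":  keep all common characteristics.
    kind "GE": keep the common characteristics for which `select` is true.
    kind "GI": keep the common characteristics for which `select` is false.
    """
    sets = list(defining)
    if not sets:
        return set()
    # Step 1: recognize the characteristics common to all defining concepts.
    common = set.intersection(*sets)
    if kind == "G":
        return common
    if kind == "GE":
        return {c for c in common if select(c)}
    if kind == "GI":
        return {c for c in common if not select(c)}
    raise ValueError(f"unknown generalization type: {kind!r}")


# Hypothetical defining concepts, each given as a set of characteristics.
CAR = {"REG-NO", "OWNER", "ENGINE"}
BOAT = {"REG-NO", "OWNER", "HULL"}

# A selection expression given as a list of concept names, as the text allows.
selected = {"OWNER"}

unconstrained = generalize([CAR, BOAT], "G")
explicit = generalize([CAR, BOAT], "GE", lambda c: c in selected)
implicit = generalize([CAR, BOAT], "GI", lambda c: c in selected)
```

With these inputs the unconstrained result is the two shared characteristics REG-NO and OWNER, the explicit variant keeps only OWNER, and the implicit variant keeps only REG-NO. A fuller treatment would also repair any structural inconsistencies caused by the removals, which this sketch does not attempt.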
In value transformation, cardinality specifications can be used to indicate the number of occurrences of each defining concept required by the function. A condition specification can be used to indicate that a defining concept is an optional argument for the function.

Concept structures are drafted on the screen of a graphical workstation under the control of the concept editor. The user employs a mouse and menus for constructing concept structures. There is a menu option for each different type of concept definition. Concepts stored in the concept base can be retrieved and edited.

As a result of the concept development phase there will be a library of concept definitions in the concept data base. From this fact an idea of standard concept definitions is developed [15]. All core concepts of an enterprise can be standardized and users can then use restricted versions of standard definitions. Standard concepts can be developed to completely follow the legislation. They contain all knowledge which must be observed in different systems. By using them the conceptual analysis of information systems can be simplified and the number of errors can be reduced.

===== 3.2.3. Conceptual schema and the schema editor =====
A conceptual schema (CS) is developed from concept structures. A conceptual schema is a formal construct composed of concepts, intensional relationships between concepts, and constraints and conditions between concepts. A CS not only describes the entities and their attributes and relationships in the UoD, but it defines the whole system of concepts that pertain to the UoD.

A conceptual schema can be seen as a three-dimensional construct which consists of concept definition hierarchies. Structurally it is a directed acyclic graph based on the relation of intensional containment. In fact, it has exactly the same formal structure as a concept structure. They differ only in their graphical representation. A conceptual schema may contain several hundred concept structures.
In the graphical representation a concept structure is viewed 'from the side'. This representation strongly supports the idea of hierarchical definition and helps the user to analyze the structure of the concept being defined. For a conceptual schema (and even for an exceptionally large concept structure) this representation is impossible because the diagram would contain a lot of lines that are almost horizontal and cross many other lines. The representation space must be used more effectively. The solution used in COMIC is to apply a three-dimensional representation, which is believed to correspond closely to the natural way of organizing human knowledge. The evidence collected so far seems to support this hypothesis, at least for some people.

The graphical representation of a CS consists of nodes representing concepts and of arcs representing relationships between concepts [20,22]. There are two types of arcs: arrows representing the relation of intensional containment, and dotted arcs representing various types of constraints. The head of an arrow points to the defined concept. To each arc additional inscriptions can be attached. An example is in figure 7.

Figure 7. A sample conceptual schema.

The conceptual schema editor is a tool for inspection and modification of the CS [32]. The CS or a part of it is shown on the screen from 'the top down through the levels'. The levels are stacked on each other in such a way that the highest level of the CS is on top of the stack and the lower levels are under it, sequenced according to the number of the level. The levels are semitransparent in such a way that several levels can be viewed at a time without the viewer being confused by the whole complexity of the CS. This feature creates an illusion of three-dimensionality which supports the understanding of the structure of the CS. The user has a possibility to 'navigate' in the CS.
The schema can be scrolled up and down, left and right, and new levels can be added on the top or on the bottom of the screen stack or they can be removed. In this way the information on the screen is focused very effectively and the amount of information visible to the user is considerably increased. Because some parts of the schema can be off the screen, it must be possible to project a miniaturized view of the schema on the screen in order to make the whole schema visible.

A concept structure can be transformed to the corresponding conceptual schema by using the conceptual schema editor. The resulting schema will be stored in the concept base, from which it can be retrieved for inspection or for integration with some other conceptual schema.

===== 3.2.4. Integration of local conceptual schemata =====
The set of local conceptual schemata or user views produced from separate concept structures must be integrated to make up a global conceptual schema. In the COMIC system the integration is quite a difficult task because of the rich modelling language. The integration is an interactive process performed by using the integration module, which works on two conceptual schemata at a time. The designer sees both schemata on the screen and gives commands on how the integrated conceptual schema is to be constructed. The design of the integration module is not yet completed, but the integration strategy will be a binary ladder strategy, in which the set of conceptual schemata is ordered according to their relative importance or weight, and in every integration cycle the heaviest remaining local schema is integrated into the partially integrated global conceptual schema until the whole set is exhausted. Detailed integration rules are given in [27].

==== 3.3. Use of the conceptual schema in COMIC ====
===== 3.3.1. Transformation of CONCEPT D conceptual schema into a relational database schema =====
A conceptual schema must be transformed to a relational data base schema before the data base is implemented.
The transformation process takes advantage of structural information inherent in the conceptual schema. While representing an application with a set of concepts, the CONCEPT D conceptual schema also includes structural definitions of these concepts, i.e., concept structures. Concept structures and conceptual schemata contain implicitly both functional and multivalued dependencies [35], as well as inclusion dependencies [11]. The transformation produces a relational data base schema in 4NF with the lossless join property. In certain cases the designer may decide that 3NF is sufficient.

The transformation consists of two phases [34]. In the first phase a conceptual schema is decomposed into several partial conceptual schemata on the basis of those conditions, constraints and conditional constraints that deal with cardinalities of occurrences of concepts. The decomposition algorithm goes recursively through the conceptual schema. For every concept encountered it checks whether the concept contains any defining concepts or is just a basic concept. In the case of a derived concept, its definition type determines how the decomposition is performed on that level. An identifier hierarchy consisting of a sequence of identifiers on consecutive levels must be associated with every partial conceptual schema generated.

In the second phase the partial conceptual schemata are transformed to relation schemes, which together constitute a relational database schema. In this schema every partial conceptual schema has a corresponding relation scheme. Basic concepts and possibly concepts defined by a transformation function are taken as attributes of relation schemes. Identifiers and identifier hierarchies are represented as key attributes. Finally it is checked that the resulting schema is nonredundant, that is, that no relation scheme is already contained in some other relation scheme.
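The final nonredundancy check can be illustrated with a small Python sketch. It assumes that a relation scheme is represented simply as a set of attribute names and that "contained in" means that the attribute set is a subset of the attribute set of one other scheme; both are simplifying assumptions, and the actual COMIC check may be more elaborate. The function name `remove_redundant` and the scheme STUDENT_NAME are hypothetical, while the attribute names follow the STUDENT example used later in the paper.

```python
from typing import Dict, FrozenSet


def remove_redundant(schemes: Dict[str, FrozenSet[str]]) -> Dict[str, FrozenSet[str]]:
    """Drop every relation scheme whose attributes are contained in a kept scheme."""
    kept: Dict[str, FrozenSet[str]] = {}
    # Visit larger schemes first (ties broken by name), so that any scheme able to
    # contain the current one has already been considered.
    ordered = sorted(schemes.items(), key=lambda item: (-len(item[1]), item[0]))
    for name, attributes in ordered:
        if not any(attributes <= other for other in kept.values()):
            kept[name] = attributes
    return kept


# Illustrative relation schemes; STUDENT_NAME is deliberately redundant.
SCHEMES: Dict[str, FrozenSet[str]] = {
    "STUDENT": frozenset({"ST-NO", "SSN", "S-NAME"}),
    "EXAM": frozenset({"ST-NO", "COURSE-ID", "DATE", "RESULT"}),
    "STUDENT_NAME": frozenset({"ST-NO", "S-NAME"}),
}

nonredundant = remove_redundant(SCHEMES)
```

Here STUDENT_NAME is dropped because its attributes are a subset of those of STUDENT, while STUDENT and EXAM are both kept because neither contains the other.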
The transformation process cannot be made completely automatic, because in some situations a decision by the user is needed.

The resulting relational database schema and a corresponding set of data description language statements for the relational DBMS are generated. In the present version of the COMIC system ORACLE and SQL are used. The mapping between the conceptual schema and the relational schema is constructed in the concept data base.

===== 3.3.2. Conceptual query language CQL =====
The use of a graphical conceptual schema opens a possibility to develop a user-friendly query interface [21,24,25]. The formulation of a query should be as easy as possible, i.e. it should require no knowledge of the structure of the data base and no mental mappings between the knowledge of the user and the conceptual schema. The conceptual query language CQL attempts to meet this requirement. In the implementation design only those features of the language which can be implemented by using SQL are taken into account.

The goal in the design of CONCEPT D and CQL is that the structure of the user's own internal conceptual model would be, as far as possible, isomorphic with the conceptual schema on the screen. Then the conceptual schema would reflect the mental structure of the knowledge the user has in his mind. That would spare the user a complex structural transformation when using the conceptual schema and CQL. A conceptual schema supports the user in recalling and comprehending the conceptual structure of the UoD and the data base.

The user of the CQL query facility will see a graphical conceptual schema on the screen of the workstation. The formulation of a query consists of selecting an object of interest (OI) from the schema and pointing at it with a cursor that can be moved with the mouse. Constraints can be imposed on the OI by use of the cursor, menus and the keyboard.

There are several benefits to using a conceptual level graphical query facility [21]:

1.
The user has to have no knowledge of the structure or technical features of the data base. 2. The query language has only very few syntactic rules. 3. The use of a conceptual schema for fOJmulating queries helps the user come to a better understan- ding of the subject matter. The user can see the three-dimensional conceptual schema around the 01 and study it in detail easily. An example of the screen is in figure 8. The menu is on the right, the conceptual schema is on the left, and the text line for condition specifications is below the schema. The schema contains three hierarchical levels. STUDENT is on the highest level and SCORE and some other concepts on the lowest level. * + = =/ < > >= <= ( ) and or sum cnt min max avg cond group by process output simple topical struct. exit Figure 8. User interface for making CQL queries [24]. The class of queries can be divided in simple queries and queries containing complex relationships. In simple queries the or is just one concept and the data representing its occurrences. In simple queries the user may want a list of all occurrences of the 01, or just a subset defined by some selection criteria. A selection criterion can be e.g. a given value of an identifier of the selected concept, some value or a range of values representing a concept contained in the selected concept, or a relationship the selected concept has with some other concept. Boolean expressions can be used in constraints. 12 For example, if a list of all students is wanted, one points at STUDENT and selects the process command from the menu. The system finds out that STUDENT contains other concepts (ST- NO, S-DAT A, EXAM) which in turn contain other concepts. The output list has six columns as follows: STUDENT(ST-NO,SSN,S-NAME,(COURSE-ID,DATE,RESULT)) ST-NO and SSN both identify OCCUlTences of students and COURSE-ID identifies occurrences of courses. 
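The way the six output columns arise from the nested concept structure can be sketched as a recursive walk that collects the basic concepts. The sketch below is in Python and is not the CQL implementation. The containment table is an assumption read off the output list above (in particular, that S-DATA contains SSN and S-NAME, and that EXAM contains COURSE-ID, DATE and RESULT), and the function name `output_columns` is hypothetical.

```python
# Assumed containment hierarchy for the STUDENT example. Only the first row
# (STUDENT contains ST-NO, S-DATA, EXAM) is stated explicitly in the text;
# the other two rows are inferred from the six-column output list.
CONTAINS = {
    "STUDENT": ["ST-NO", "S-DATA", "EXAM"],
    "S-DATA": ["SSN", "S-NAME"],
    "EXAM": ["COURSE-ID", "DATE", "RESULT"],
}


def output_columns(concept, contains=CONTAINS):
    """Collect the basic concepts below `concept`, left to right.

    A concept with no entry in `contains` is treated as a basic concept
    and becomes one output column.
    """
    parts = contains.get(concept)
    if not parts:
        return [concept]
    columns = []
    for part in parts:
        columns.extend(output_columns(part, contains))
    return columns


print(output_columns("STUDENT"))
# ['ST-NO', 'SSN', 'S-NAME', 'COURSE-ID', 'DATE', 'RESULT']
```

The walk flattens the hierarchy, so the grouping of the EXAM columns shown by the inner parentheses in the output list is not preserved here.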
Because of the 1:n relationship between STUDENT and EXAM, the occurrences of examinations are identified with the pair ST-NO, COURSE-ID (or with SSN, COURSE-ID).

If some subset of students is to be selected, then the user has to give qualifications. For example, to select a student whose ST-NO is 30504, touch STUDENT, select cond, ST-NO, and write '= 30504' and give a process command. The result will be a list like the following.

STUDENT(ST-NO, SSN, S-NAME, COURSE-ID, DATE, RESULT)
30504  230961X  MIKKO LAHTI  A2.1  6.5.88  2
30504  230961X  MIKKO LAHTI  A2.3  4.9.88  1
30504  230961X  MIKKO LAHTI  A2.4  7.9.88  3

If several students are to be selected, then a list of their student numbers is to be supplied. To make several qualifications on different concepts contained in the same superordinated concept, the user first has to touch the superordinated concept, and then repeat the qualification sequence described above for each contained concept. Accordingly, to select only some contained concepts for the output, one has to touch them all after the superordinated concept.

In some cases a user is interested in two or more concepts which belong to different, but interrelated concept structures. The relationship between them can be of three different types, or it can be a combination of these types.

1. The related concepts share at least one concept which is contained in all of them.
2. The related concepts are all contained in the same concept, i.e., they share a common identifier from the higher level.
3. The related concepts are connected through a chain of constraints of various types.

The evaluation of a query containing complex relationships can result in two basically different results. One alternative is that there is exactly one path of relationships between the related concepts. The system has to find the path, analyze it, generate the corresponding set of SQL data manipulation commands, and finally process the data base in order to find out whether there is data corresponding to that path.

The other alternative is that there are several relationships between the related concepts. The system can find them all and the user has to choose the action in the next step, by proceeding in one of the following ways:

1. The user can tell the system that the data in all relationships should be listed (i.e. full search is required).
2. The user can tell the system that the automatic mode should be used. This means that the system must try to find the 'most plausible' of the relationships the user is interested in.
3. The user can tell the system that the manual mode is to be used. The user must himself select the relationship between the selected concepts and indicate this to the system by touching the corresponding arcs in the schema.

A query can be qualified also in the case of complex relationships. The qualification is done similarly to that of a simple query. This alternative has not been implemented yet.

A similar approach is used for updates of the application data base [31]. A user selects the OI by pointing at it on the screen and possibly gives qualifications. The system analyses the OI, opens a small window under the concepts which must be updated and waits for the user's response. The user writes the data in each window and gives the process command. The system produces SQL commands and updates the data base.

4. SUMMARY

A new approach to conceptual modelling and a tool supporting the associated methodology are introduced. The approach is based on the idea that in conceptual modelling the concepts used to describe the UoD must first be constructed and a conceptual schema of the application shall be developed by using them. The construction of concepts requires that their internal structure be analyzed and described. Concepts are defined graphically by using concept structure diagrams. They describe the internal structure of concepts.
They are based on the use of the relationship of intensional containment. In a concept structure all knowledge about the concept can be given in an easily understandable form. Concept structures are manipulated by means of the concept editor.

By using the set of defined concepts a conceptual schema of the UoD is developed. The conceptual schema is a multilevel construct which is viewed from the top down on the screen of the workstation. The conceptual schema is manipulated by means of the schema editor. The user can navigate in the schema in all directions.

The set of local conceptual schemata must be integrated before the development of the data base schema. Integration is done with the schema integrator, which is based on the use of the ladder strategy. In the present version of the system the integrator has not yet been implemented.

The schema translator is used to transform a conceptual schema into a corresponding relational data base schema. The schema specification can be given to a relational data base management system. In the present version of the system ORACLE is the DBMS used.

A special feature of the COMIC system is its conceptual query facility based on the conceptual query language CQL. A user can make queries by pointing at the conceptual schema visible on the screen of the workstation, with no need to know the structure of the data base. Updates of the application data base can also be made from the conceptual schema level.

The system is being implemented on APOLLO DN3000 workstations at the University of Tampere. The work started in January 1984. The last phase of the development ended at the end of 1988. Continuation of the work is being planned. Several possibilities for extensions to previous work have been recognized.

ACKNOWLEDGEMENT: I would like to express my thanks to the many people who have contributed to the COMIC project either by commenting on my own work or by developing parts of the implementation.
Much of the work has been done in close cooperation and many solutions have evolved and been improved considerably in long discussions. In particular, I would like to mention Ossi Numminen and Jyrki Nummenmaa, who have been developing the schema editor. Sami Kari implemented the CQL, Raimo Mansikkaoja designed the rules for the integrator, and Arto Viitanen studied the possibility of using COMIC in a distributed environment and implemented the current version of the concept data base. Jari Poso developed an original method for schema translation. Jyrki Nummenmaa and Arto Viitanen developed the update function for CQL.

REFERENCES

[1]. G. Barber, P. DeJong and C. Hewitt, Semantic Support for Work in Organization. In: R.E.A. Mason (Ed.), Information Processing 83. North-Holland, 1983.
[2]. C. Batini, M. Lenzerini and M. Moscarini, Views Integration. In: [25].
[3]. M. Brodie and J. Mylopoulos (Eds.), On Knowledge Base Management Systems. Springer-Verlag, 1986.
[4]. M. Brodie, J. Mylopoulos and J.W. Schmidt (Eds.), On Conceptual Modelling. Springer-Verlag, 1984.
[5]. M.L. Brodie and S.N. Zilles (Eds.), Proceedings of Workshop on Data Abstraction, Databases and Conceptual Modelling. ACM SIGMOD Record, Vol. 11, No. 2, February 1981.
[6]. J. Bubenko, jr., On the Role of 'Understanding Models' in Conceptual Schema Design. In: Proc. of the 5th Int. Conf. on Very Large Data Bases. Rio de Janeiro, October 3-5, 1979.
[7]. J. Bubenko, jr., Information Modeling in the Context of System Development. In: S.H. Lavington (Ed.), Information Processing 80. North-Holland, 1980.
[8]. J. Bubenko, jr., Information and Data Modelling: State of the Art and Research Directions. In: H. Kangassalo (Ed.), Second Scandinavian Research Seminar on Information Modelling and Data Base Management. Acta Universitatis Tamperensis, Ser. B, Vol. 19, Tampere, 1983.
[9]. M. Bunge, Scientific Research I, The Search for System. Springer-Verlag, Berlin, 1967.
[10]. M. Bunge, Treatise on Basic Philosophy, Vol. I, Semantics I: Sense and Reference. D. Reidel Publishing Company, Dordrecht, 1974.
[11]. M. Casanova, R. Fagin and C.H. Papadimitriou, Inclusion Dependencies and Their Interaction with Functional Dependencies. Journal of Computer and System Sciences 28, 29-59 (1984).
[12]. S. Ceri (Ed.), Methodology and Tools for Data Base Design. North-Holland, 1983.
[13]. P. Chen (Ed.), Entity-Relationship Approach to Systems Analysis and Design. North-Holland, 1980.
[14]. E.F. Codd, Extending the Database Relational Model to Capture More Meaning. ACM Transactions on Database Systems, Vol. 4, No. 4, December 1979.
[15]. G. Di Battista, H. Kangassalo and R. Tamassia, Definition Libraries for Conceptual Modelling. In: C. Batini (Ed.), 7th International Conference on Entity-Relationship Approach (Rome, Italy, November 16-18). North-Holland, 1988.
[16]. R. Elmasri and S. Navathe, Object Integration in Logical Database Design. In: International Conference on Data Engineering. Los Angeles, April 24-27, 1984.
[17]. H. Kangassalo, On the Concept of Concept in a Conceptual Schema. In: H. Kangassalo (Ed.), First Scandinavian Research Seminar on Information Modelling and Data Base Management. Acta Universitatis Tamperensis, Ser. B, Vol. 17, Tampere, 1982.
[18]. H. Kangassalo, CONCEPT D - A Graphical Formalism for Representing Concept Structures. In: H. Kangassalo (Ed.), Second Scandinavian Research Seminar on Information Modelling and Data Base Management. Acta Universitatis Tamperensis, Ser. B, Vol. 19, Tampere, 1983.
[19]. H. Kangassalo, CONCEPT DID - A Technique for Graphical Description of Concept Structures. University of Tampere, Department of Mathematical Sciences, October 1984. (Unpublished working paper), (In Finnish, English version is in preparation). 73 pages.
[20]. H. Kangassalo, CONCEPT DICS - A Technique for Graphical Description of Conceptual Schemata. University of Tampere, Department of Mathematical Sciences, October 1984. (Unpublished working paper), (In Finnish, English version is in preparation). 92 pages.
[21]. H. Kangassalo, A Definitional Conceptual Schema as a Query Interface of the Information System. (Third Scandinavian Research Seminar on Information Modelling and Data Base Management, Finland, June 1985). In: H. Kangassalo (Ed.), Information Modelling and Data Base Management. Lecture Notes in Computer Science. Springer-Verlag, Berlin, (to appear).
[22]. H. Kangassalo, CONCEPT D: A Graphical Language for Conceptual Modelling and Data Base Use. Invited paper, IEEE 1988 Workshop on Visual Languages (Pittsburgh, USA, October 10-12). IEEE, New York, 1988.
[23]. H. Kangassalo and P. Aalto, Experiences on User Participation in the Development of a Conceptual Schema by Using a Concept Structure Interface. In: B. Shackel (Ed.), Human-Computer Interaction INTERACT '84. North-Holland, 1985.
[24]. S. Kari, A Plan for the Conceptual Query Language. MS Thesis. University of Tampere, Department of Mathematical Sciences, November 1986. (In Finnish).
[25]. S. Kari, H. Kangassalo and J. Poso, CQL - Conceptual Query Language: A Visual User Interface to Application Data Bases. Proceedings of the Joint Scandinavian-Japanese Seminar on Information Modelling and Knowledge Bases. Ellivuori, Finland, June 6-10, 1988. Acta Universitatis Tamperensis, Ser. B, (to appear).
[26]. R. Kauppi, Einführung in die Theorie der Begriffssysteme. Acta Universitatis Tamperensis, Ser. A, Vol. 15, Tampereen Yliopisto, Tampere, 1967.
[27]. R. Mansikkaoja, Integration of CONCEPT D Conceptual Schemata. MS Thesis, University of Tampere, Department of Computer Science, April 1987. (In Finnish).
[28]. R. Meersman and G.M. Nijssen, From Data Bases to Knowledge Bases. In: Infotech State of the Art Review, London, November 1983.
[29]. T. Moto-Oka (Ed.), Fifth Generation Computer Systems. North-Holland, 1982.
[30]. S. Navathe, R. Elmasri and J. Larson, Integrating User View in Database Design. Computer, January 1986.
[31]. J. Nummenmaa and A. Viitanen, Data Base Updates from Conceptual Level. Proceedings of the Joint Scandinavian-Japanese Seminar on Information Modelling and Knowledge Bases. Ellivuori, Finland, June 6-10, 1988. Acta Universitatis Tamperensis, Ser. B, (to appear).
[32]. O. Numminen and J. Nummenmaa, Graphic Editors for Knowledge Acquisition and Conceptual Schema Design. Proceedings of the Joint Scandinavian-Japanese Seminar on Information Modelling and Knowledge Bases. Ellivuori, Finland, June 6-10, 1988. Acta Universitatis Tamperensis, Ser. B, (to appear).
[33]. T.W. Olle, H.G. Sol and A.A. Verrijn-Stuart (Eds.), Information Systems Design Methodologies: A Comparative Review. North-Holland, 1982.
[34]. J. Poso, Translation of COMIC Conceptual Schema into the Relational Database Schema. University of Tampere, Department of Mathematical Sciences, Report C44, October 1986.
[35]. J.D. Ullman, Principles of Database Systems. Pitman, London, 1980.
[36]. J. van Griethuysen, Concepts and Terminology for the Conceptual Schema and the Information Base. ISO/TC97/SC5/WG3 report. Available from ANSI under publication number ISO/TC97/SC5-N695.

The work described in this paper has been supported by the Academy of Finland, the Center for Technological Development (TEKES) and the University of Tampere.