Unveiling and Conceptual-Logical Modeling of Phase Sequences
in Data Engineering
Aleksandr Rodionova and Georgiy Tsoya
a
    Computer center of far eastern branch of the Russian academy of sciences, Khabarovsk, 68000, Russia


                 Abstract
                 The acceptance of the concept, according to which object types of the conceptual level are
                 differed by functional heterogeneity, leads to the realization of the fact that there are no entity
                 types that wouldn’t exchange their instances and wouldn’t form numerous sequences in their
                 domains-phase sequences. Formally, such sequences are the directed graphs, in nodes of
                 which are placed entity types, and arcs set sources and receivers of moving instances. The
                 article aims at developing the method for revealing phase sequences at conceptual schemas
                 by means of using a number of types of adjunct categories and subsequent representation of
                 found sequences in databases logical structures. Along with simplest sequences with disjoint
                 types the paper concerns the issuers of formalization and logical modeling of sequences,
                 whose items can simultaneously be present in several types.

                 Keywords 1
                 Phase types, phase sequences, direct and reverse type conversion, categories of types,
                 markers of phase sequences

1. Introduction
    One of the key tasks that are addressed during conceptual data modeling is to establish a set of
                                  𝐽
classes of interactions С = {𝑐𝑗 }1occurring in the domain, as well as constraints on these interactions.
An arbitrary interaction class 𝑐𝑗 always contains some set of types – {T1,..Tт}, whose objects
participate and (or) can participate in the j-th interaction (figure 1).
    No less relevant are the “dynamic " relationships that arise between classes and are associated with
the movement of objects from one type to another, including those belonging to other classes. Such
types in domains constitute sequences (phase sequences – PS), elements of which are termed phase
types as, for example, in [8]. The same article points out that phase types appear when the source type
are partitioned. It follows the types of PS aren’t overlapped. References to phase types as components
of evolutionary, circulating, incremental, loop, and networked object lifespans, can also be found in
[6].
    Meanwhile, phase sequences with intersecting types are often revealed in domains. For example,
in the “Academic degrees” PS, a candidate of sciences in one realm remains a candidate of science
even if he received a doctor of sciences degree in another field. Or, the title of master of sports does
not cancel the title of candidate for master of sports, which can be observed in the "Sports titles"
phase sequence
In addition to identifying such sequences, which is the subject of yet another problem of conceptual
data modeling, the mechanisms of PS formation are also of interest. Since a current work is
exclusively addressed to the information modeling issues, the certain groups of specific objects and
interactions resulting to deriving phase sequences elements will be implied under the modeling

VI International Conference Information Technologies and High-Performance Computing (ITHPC-2021),
September 14–16, 2021, Khabarovsk, Russia
EMAIL: ran@newmail.ru(A. 1)
ORCID: 0000-0003-43N-8562 (A. 1)
            ©️ 2021 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)


                                                                                    54
mechanisms. Some classes (categories) of similar objects were found and elaborated earlier in [1]. It’s
is the rank- and profile-classes. In the paper, the pointed list will be expanded due to addition to it
types of stage and status categories.

                              PS2                       C3
                                             𝑇5


                                                                          𝑇6


                         C1                                                         PS1
                                                                     C2
                    𝑇1
                                                   𝑇2                          𝑇4

                          PS2
                                    𝑇3
                                                              PS1


Figure 1: Interaction classes in the style of Object-Role Modeling (ORM) notation

    The concept of the phase type considered in the article also applies to the types of the role category
[5, 7], allowing us to interpret the latter as phase types with a nominally constant composition.
    This work is aimed at developing methods for identifying phase sequences in conceptual schemas,
their subsequent representation in logical database structures, and reverse transformation of logical
model types into the phase types of the conceptual model.

2. Procedures for               conceptual-logical           and      logical-conceptual           data
   transformation
    At first glance, it may seem that mapping classes of conceptual level interactions into logical
structures (the direct conversion) is not a difficult task, unless one circumstance is taken into account.
Conceptual level types can overlap. Intersections of logical level types are forbidden. This is not the
only requirement for the organization of logical data models. The designed data structures should also
be compact and lack of redundancy. If the suite of conceptual types didn’t systemize in advance –
type’s categories aren’t determined, the number of potential logic schemas will be excessively large.
Given the preliminary carried out categorization of types, the conversion procedure may be come to
the form shown in Figure 2. (The purpose and functionality of all categories listed in Fig.2 will be
elaborated on below as they are arisen in model constructions.)
    One should notice that a trivial biunique correspondence are ascertained between most conceptual
and logical types and therefore direct conversion doesn’t represent any hurdles. The exception is the
types of entity categories, for which variant ambiguity is preserved, as the “intersection” issue is
relevant only for the entity types.
    The task becomes even more complicated if we take into account that entities in domains are
present in two kinds - in Prototype and in Sample/Instance – formats (kinds) [2]. (The same is true for
processes, with the exception of the Sample format because types of process category can’t be in the
Sample format.) Prototype-instances compose a Prototype-kind of an entity type and represent
prototypes, models of real entities. Instances of other format, being real entities, are generally
dispersed among one Sample- and several Instance-kinds, but can form either Sample - or Instance-
kinds separately. Entities that participate in interactions can belong to different kinds. Prototype-kinds
are associated with Sample/Instance-kinds by associations that have cardinality 1:M. Entity types of
the conceptual level can only subsume into one of kinds listed the above, and the correct identification


                                                     55
of kinds is no less important than the categorization of types themselves, since both delineate
predefined classes of interactions that instances of the individual entity “types/kinds" can engage into.
                                                                                               sts-types
  status-types (sts)
                                  rank-types (rnk)
                                                                                                            rnk-types
                                                                                                                                 smn-types
                                                   semantic-types
                                                   (smn)
 stage-types (stg)                                                                      stg-types
                              profile-types (prf)                                                          prf-types

                                                                                                                                spl-types
                                                        split-types (spl)


     Entity-types (Ent)                                                 DIRECT CONVERSION                    Base-types (Base)

      P                   P                         P                       P                      P                             P
      1                   1                         1                   1                      1                            1     1
     М                    М                         М               М                          М                     subP
       I                      I                             I               I                      I                   P
                                                                                                                            1
                                                                                                                     subP            М
                                                                                                                                      I
                                                                                                                            1
                                                                                                                     subP                 1
                                                                                                                                          1
               P      P                                 P                                                                                     subI
                                                                    P
           1              1                         1                   1                                                                 1
                                                                                                                                              subI
           М              М                                                     INVERSE CONVERSION
                                                   М                    М
          I                   I                                                                                                           1          1
                                                         I          I                                                                         subI


    process-                       relator-types        collection-types               process-types relator-types              collection-types
    types (prc)                        (rlt)                (clt)                         (prc)      (rlt)                      (clt)
               P                       P                        P                          P                  P                           P
              1                        1                        1                          1                 1                        1

           М                          М                         М                         М                 М                         М

                  I                        I                        I                          I                 I                        I


     document-types                                                                                    document-types
        (doc)                                                                                             (doc)


Figure 2: Direct and reverse conversions of categorized types of the conceptual and logical levels


                                                                                  56
   Another important point is that entity instances can migrate from one type to another. Options for
potential migrations are signified in Figure 1 with a dashed line. In addition, subsets that exchange
their elements may appear within individual types. In order not to complicate the principle diagram of
conceptual-logical modeling, such subsets are not included in the scheme.
   Note that the abstract conceptual construction in Figure 2 is already a transformed initial
conceptual model, in which all classes of interactions in the modeled domain were to be represented.
   Regardless of how entity types are structured in the logical scheme, in which their suites are
named as base types, the task of inverse transformation of base types into dynamic types remains key
for the functional component of any information system with a database.

3. Localization of phase sequences in the conceptual modeling stage. Phase
   sequence markers
    The objects of the world around us are constantly in motion. For specific modeled domains, the
vast majority of all possible “motion” kinds are not of any interest. Only a small part of them, which
is important in the context of a particular universe of discourse, is the subject to fixing. Transitions
can be orthogonal, “natural” such as, for example, changes in the age or weight of an object or
“programmed”, motivated by certain circumstances. In the first case, the influence of some objects on
dynamic of other objects is either absent, limited, or insignificant, in the second - it matters. Many
methods and models have been developed for setting and consequent tracking motion trajectories.
With respect to information systems, workflows are the most widespread [3, 6]. In the scope of data,
workflows can be represented in the format of phase sequence models. Formally, the latter are
directed graphs with the set of states and transitions between them. One of the information modeling
tasks is to reveal and represent, first at the conceptual and then logical level, an object type, the
instances of which will transfer between states, forming dynamic types associated with these states.
The formalization of ways of forming dynamic types is especially important if we take into account
that a particular such type also defines the classes of interactions, in which instances of this type can
engage.
    Detecting of phase sequences among numerous types of conceptual level without using auxiliary
types, mostly non-entity ones, is a rather serious issue. Here is a list of types that indirectly indicate
the presence of phase sequences in the simulated space:
    -     age groups by sport – stg-+spl-/prc-types;
    -     weight categories – stg-type;
    -     semesters at universities – sts-type;
    -     track scientific paper in a journal – sts-type + adjacency matrix;
    -     academic degrees and ranks – rnk-type;
    -     military ranks of Russian Navi – rnk- type+ prf-type;
    All types marked with underscores, if they are linked to one or more entity types, are able to
specify phase sequences. Moreover, the very presence of the listed categories types in domains makes
sense only when they are combined with the entity types. The types marked as rnk are the rank types,
which are ordinary ordered lists. An instance of any similar type is nothing more than the name of one
of a phase sequence item. The rank, being by definition the position of an element in relation to other
elements, can be set either explicitly or implicitly. Implicitly-as the ordinal number of the item in the
list. Explicitly-in the form of some kind of a rating scale.
    Another category of types, by which it is often customary to describe phase sequences, we have
designated as status. Statuses can be either nodes, or transitions between process nodes, or both,
depending on what meaning is attached to these elements in the domain. The objects involved in the
process move from one state (status) to another, forming both phrases (statuses) and phase sequences.
In general, if the conceptual model of a domain contains prc-types, it is worth to look for phase
sequences that correspond to this types. In the role of typical rank types are served, for example,
"Military Ranks” and "Scientific Positions".
    If we exclude the presence of order in the types that are able to set transitions, we get ordinary
lists, such as the type - "Role of players in team sports".


                                                     57
    Types subsumed into the stage category also model dimensions of processes, but only the
processes, in which transitions occur the natural way, as in the case with age groups or weight
categories. Here, too, we can find traces of processes. Stage category is introduced only in order to
somehow differ “natural” and “programmed” processes. Since stage-, status -, and rank-type objects
are involved in setting phase sequences, it makes sense to reduce them all into the general class
(including for the subsequent references to them), and to term this class, for example as the chain (ch)
class.
    Regardless of which type of the listed categories is involved in specifying the phase sequence, any
phase type will represent one of the variants of the entity collections classified in [9]. Depending on
the presence or absence of duplicates, as well as the significance or insignificance of the order, the
authors of this work suggest to distinguish four kinds of collections: list, set, bag, and ranking.
    Phase sequences in domains not often are revealed at the first attempt. First, some phase sequences
can be elementary omitted during the conceptual modeling. Secondly, some transitions between
individual types may not turn out as obvious as they actually are. For example, this applies to such a
set of types as: "Bachelor”, "Master" and "Doctoral". And, thirdly, there is simply no need to keep
track any transitions. In some measure, the severity of this problem is reduced while using the
markers of phase sequences represented by the same ch-types. But wholly, any entity type need
consider also the phase type. And in general, any single entity type should be considered as a phase
type - the only one in the corresponding phase sequence. At least, in virtue of the fact that there
always exist sources and receivers of entity instances.

4. Modeling phase sequences at the logical level
    Earlier, it was noticed that ordinary adhering types of chain classes such as rnk or prs to base types
sets the phase sequences. The formal schemas covering all possible ways for specifying the phase
sequences (resulting ultimately from the permissible permutation of logical scheme tools) are depicted
in Fig. 3. Their number is limited.
    Let us proceed from abstract schemas (fig. 3a) to practice situations that better illustrate the
meaning and content the issues under consideration. Two rnk-types in fig. 3b are connected with the
base type “Persons”. In both cases, the cardinality of the connections is M:M. This is a redundant
cardinality, since it would be possible to limit it to 1:M, which would reflect and at the same time
implement the restriction of the disjointness for individual types in the two phase sequences: “Military
ranks” and “Academic degrees”. The M:M cardinality indicates that it is necessary to track the history
of transitions of elements between phase types. To fix transitional tracks the weak entity
corresponding to M:M must content the mandatory "temporary" attribute as one of the components of
its composite primary key.
     A)                                                            B)                        rnk
                      1
      Ch-type                               Ch-type                         Military ranks
             1                                 М                                        М
             М                      М          М                                        М          bs
          Base-type                     Base-type                                   Persons
          е
                      1             1                                               М
                      М             М                                                        rnk
                                                                                    М
                          Ch-type                                           Academic degrees


Figure 3: Variants and examples of modeling phase sequences in logic schemas

   Rank-types, as a rule, rarely act as independent units. They are usually consolidated with types of
other categories, for example, with profile-, split- or base-types, and only then are connected with
those base-types, whose instances participate in phase sequences. Similar concatenations are

                                                      58
particular true while modeling phase sequences with overlapping types (PAOT). Moreover, the very
existence of PAOT depends on the presence of consolidated types, one of which must pertain to the
rank category.
   To reveal the concatenation mechanism, we again resort to the "Academic degrees". The specified
type is always primarily attached to the “Academic degrees”, and then, attracting an intermediary –
the “Thesis committees”, is connected with the “Persons”. The evolution of the transformations for
the logical subschema containing all relevant situations for the concrete domain looks like this, as
shown in Figure 4.
                      rnk                                                    prf                                      spl
                                                                                       1       М
                              М            М
  Academic degrees                                   Scientific specialties            1               Branches             of
                                                     М
                                                       е                                               sciences
        М         М                                                   М                                            base
                   М                                 М                                             М
                                                                                                         Educational
                         Thesis committees                                                               institutions
                                                                     М      base
                                               М
                                                               Persons М
                                                     М
                        rnk                                           prf                                             spl
                                                                                   1       М
    Academic degrees                       Scientific specialties                                  Branches of sciences
                                                                               1
                    1                      1
                        М              М           rlt, qualifier                                                 base
                                                                                                          М
                              R1=Ad+Sc                                                         М       Educational
                                  1
                                       М                 rlt                                           institutions
                                      R2=R1+Tс
                                  М            1
                                                    М               base
                                                     М
                                           Thesis committees
                                                                            base
                                                    М
                                                                Persons

Figure 4: Evolution of logical construction

   For the first time prf- and spl-types appeared on the schema. The first type specifies an abstract
"scope" where the rnk-type is applicable. The second classifies this area by partitioning it into disjoint
subsets. The pair of rnk-prf-types, as shown in [1], generates a new type concerning to the qualifier
category, which formally is a “weak entity” in the format of a rlt-type-materialized relation [10]. In
the diagram, it is designated as R1. The inclusion of another rlt-type R2, as well as the previous R1 in
general, is caused by the fact that their instances manifest themselves as self-sufficient objects, i.e.,
they participate in interactions with “Tc” (R1) and “Persons” (R2).
   The principle diagram for modeling PAOT at the logical level (fig. 4) is obvious. The latter must
contain from one to several target base-types - suppliers of elements for phase types, rnk-type, and
from one to several “auxiliary” types represented by base-types, which distinguish from target types,
prf- and types of other categories.


                                                                 59
                                   BASE                        BASE
                           М               М               М           М
                           М                                          М
             terminal relator          М
                                                                М     terminal relator
                   М           1                                                         1
                                       М        base                                             М      base
                   1                                                           1

               relator             1   М       аdjacency                   relator           1    М    аdjacency
         М             М                          matrix                                                  matrix
                                                                       М           М

         1             1                                              1            1

       RANK                    profile                                RANK                   profile


Figure 5: Elements of a logic subschema for modeling multiphase sequences with intersecting types

    If there are transition matrixes (adjacency matrixes) that reflect the coherence of rlt-type instances,
we can obtain several PAOT. For an example from the practice used in the article, it can be such
PAOT as, for example, “Candidates of Sciences – Doctors of Sciences”, "Candidates of Sciences by
branches of Sciences – Doctors of Sciences by branches of sciences”, etc.
    All adjacency matrixes should be attributed to the prototype-kind for an obvious reason, since they
define the allowed tracks for instances of the target base-types. Real transitions are tracked by means
of instances of weak entities that implement relationships between the “terminal relator” and target
base-types.
    If there are several rlt-types in a structural cluster, it is appropriate to articulate the issue of the
relations between different terminal relators and the constraints on these relations. Suppose that the
schema in the figure 5 contains two connected terminal relators. But the hierarchical nature of the
links connecting the rank-type with the terminal relator suggests that the higher adjacency matrix
should "absorb" all the lower-lying adjacency matrix. It follows that the number of independent rank-
types concentrated in a structural cluster also determine the maximum number of adjacency matrix in
it.

5. Conclusion
    There are no entity types in domains that wouldn’t exchange their instances, and thus wouldn’t
form phase sequences. Even if phase sequence consists of only one single type, there must exist (most
likely outside the modeled domain) some “input” and “output” types that point to the sources and
receivers of entity instances. Specific types in phase sequences are ordinary role types, in the
configuration of which, as well as the phase sequences themselves, types of certain non-entity
categories participate. First of all, this applies to the types of rank, stage or status categories, which
are put together in a chain-group.
    The appearance of types of listed categories at conceptual schemas immediately indicates the need
for identification and consequent modeling of a particular phase sequence. The article describes in
detail kinds, assignments and various options for the attachment of chain-types to the entity types, as
well as logical constructions for modeling the composition of phase types and the movement of
particular elements between them.
    Separately, the issues of detecting and modeling phase sequences with overlapping types reflecting
the fact that the same instance can simultaneously be in more than one phase type are discussed. We
ascertained that in order to generate the corresponding sequences it needs a mandatory rank-type and
from one to several concomitant types that related to types of profile, entity or split categories.

                                                           60
    Types of corresponding categories, apart from the aforementioned role, bear also the basic load in
ensuring transformation of base types (base types are the logical model types containing entities) into
the entity types of a conceptual level. All data needed for transformation are concentrated in the
associations that connect base and auxiliary types. The paper generally reveals the mechanism of
direct and inverse transformations for types of conceptual and logical models, which allows us to
form a holistic system view of the purpose, basic properties and interrelation of conceptual and logical
structures.

6. Acknowledgements
   The studies were carried out using the resources of the Center for Shared Use of Scientific
Equipment       "Center      for     Processing    and      Storage      of    Scientific     Data
of the Far Eastern Branch of the Russian Academy of Sciences", funded by the Russian Federation
represented by the Ministry of Science and Higher Education of the Russian Federation under project
No. 075-15-2021-663.

7. References
[1] A. N. Rodionov, The abstract roles and the primitives of role modeling in the conceptual, logical,
    and physical data models system Information technology, 2019, №4, V. 25, pp. 451–466
[2] A. N. Rodionov, Semantic identification, configuration and entities types modeling for the data
    model engineering Vestnik NSU. Series: Information technologies, 2014, V. 12, №.1, pp. 64–78.
[3] N. Sidorova, C. Stahl, N. Trcka, Soundness verification for conceptual workflow nets with data:
    Early detection of errors with the most precision possible // Information system – 2011. vol. 36. –
    P. 1026-1043.
[4] B. Thalheim, Component development and construction for database design // Data and
    knowledge engineering, 2005, V.54, pp. 77-95.
[5] F. Steimann, On the representation of roles in object-oriented and conceptual modeling // Data
    and knowledge engineering. 2000. V.35. P.83-106.
[6] W. van der Aalst, van Hee, Workflow management: models and systems The MIT press
    Cambridge, Massachusetts London, England 2002
[7] T. Halpin, T. Morgan, Information modeling and relational databases. Morgan Kaufmann
    Publishers, 2008. 970 p.
[8] G. Guizzardi, Ontological patterns, anti-patterns and pattern languages for next-generation
    conceptual modeling ER 2014, LNCS 8824, pp. 13-27, 2014
[9] S. Hartmann, S. Link, Collection type constructors in entity-relationship modeling ER 2007, pp.
    307-322.
[10] G. Guizzardi, Ontological foundations for structural conceptual models, Ph.D., thesis, Center
    for Telematics and Information Technology, University of Twente, The Nethelands, 2005, 441 p.


                                                    61