=Paper= {{Paper |id=Vol-1458/F06_CRC59_Mueller |storemode=property |title=POQL: A New Query Language for Process-Oriented Case-Based Reasoning |pdfUrl=https://ceur-ws.org/Vol-1458/F06_CRC59_Mueller.pdf |volume=Vol-1458 |dblpUrl=https://dblp.org/rec/conf/lwa/MullerB15 }} ==POQL: A New Query Language for Process-Oriented Case-Based Reasoning== https://ceur-ws.org/Vol-1458/F06_CRC59_Mueller.pdf
          POQL: A New Query Language for
        Process-Oriented Case-Based Reasoning

                        Gilbert Müller and Ralph Bergmann

                           Business Information Systems II
                                  University of Trier
                                54286 Trier, Germany
                        [muellerg][bergmann]@uni-trier.de,
                           http://www.wi2.uni-trier.de



        Abstract. Sharing and reuse of best-practice process models is an im-
        portant knowledge management approach for business process modelling.
        Process-oriented Case-Based Reasoning (PO-CBR) supports this by re-
        trieving and adapting processes or workflows based on models stored in
        the repository, which requires an expressive query language. Hence, we
        present a novel query language for workflows that enables to express
        generalized query terms and negation. Further, it allows a ranking of the
        repository workflows.

        Keywords: Process-oriented Case-based Reasoning, Business Process
        Querying, Workflows


1     Introduction
Nowadays, business processes have to be organized highly flexible due to in-
creasing globalization and competitive pressure. Thus, business processes and
workflows implementing them have to be promptly defined or adapted to new
circumstances. For this purpose, sharing and reuse of existing best-practice pro-
cess models is an important approach that introduces knowledge management
concepts into business process modelling [11]. Thus, important knowledge man-
agement tools are searchable repositories of process models that enable the re-
trieval of reusable processes and may in addition propose ways of reusing them.
Process-oriented Case-based Reasoning (POCBR) [18] is a research area that
deals with applying case-based reasoning (CBR) to experiential knowledge rep-
resented in process models and workflows and thus provides the foundations for
building knowledge management tools supporting process modelling. Current
research in POCBR addresses the similarity-based retrieval and adaptation of
process and workflow models. However, the formulation of queries for the pur-
pose of modelling and adaptation support has not been discussed extensively.
    Copyright c 2015 by the papers authors. Copying permitted only for private and
    academic purposes. In: R. Bergmann, S. Görg, G. Müller (Eds.): Proceedings of
    the LWA 2015 Workshops: KDML, FGWM, IR, and FGDB. Trier, Germany, 7.-9.
    October 2015, published at http://ceur-ws.org




                                          247
2

Current POCBR research usually assumes that a query just consists of a partial
workflow/process currently being designed. Then retrieval searches for similar
workflows that mostly match the current query. Thus, the found workflows can
be considered a source of the auto-completion of the current partial workflow,
which can be supported by case-based adaptation methods. However, for ap-
propriate retrieval and adaptation, a query language is needed that is able to
capture as best as possible all current requirements on the workflow/process to
be created. In this paper, we address this issue by proposing a new, more ex-
pressive approach for the formulation of queries in POCBR and we sketch a way
of tweaking existing retrieval methods to deal with this language.


2     Foundations

We build this research upon our previous work in POCBR which addresses
the similarity-based retrieval and adaptation of semantic workflows [7,19,20].
The methods developed so far are implemented in the prototypical software
system CAKE [6] and analyzed in various application domains. In this paper,
we illustrate our approach in the domain of cooking recipes. A cooking recipe
is represented as a workflow describing the instructions for cooking a particular
dish. We now briefly outline previous relevant work as the main foundation of
this paper.
    Broadly speaking, workflows consist of a set of activities (also called tasks)
combined with control-flow structures like sequences, parallel (AND) or alterna-
tive (XOR) branches, as well as repeated execution (LOOPs). Tasks and control-
flow structures form the control-flow. In addition, tasks exchange certain data
items, which can also be of physical matter, depending on the workflow domain.
Tasks, data items, and relationships between them form the data flow. In our
work we extend this traditional view of workflows by adding semantic anno-
tations to (potentially) all workflow items as a means to support case-based
reasoning. In formal terms, a semantic workflow is defined as a directed graph
W = (N, E, S, T ) where N is a set of nodes and E ⊆ N × N is a set of edges.
Nodes and edges have types assigned by the function T , which partitions the
nodes N of a workflow into a single workflow node N W and several data nodes
N D , task nodes N T , and control-flow nodes N C . Likewise we distinguish data-
flow edges E D , control-flow edges E C and part of edges N P . Further, nodes have
a semantic description from a semantic meta data language Σ, which is assigned
by the function S : N → Σ.


2.1   Semantic Workflows

Figure 1 shows a simple fragment of a workflow graph from the cooking domain.
Here, the tasks represent the cooking steps and the data items refer to the in-
gredients being processed by the cooking steps. The main source of knowledge
in POCBR is a repository of semantic workflows (in CBR terminology called the
case base) available for reuse. In order to obtain a reusable workflow similarity




                                       248
                                                                                             3


                                    n1                          n8: task: simmer
                                                                n8: duration: until tender
    n2


    n3              n5                     n6                   n7                    n8

    n4              n5: task: saute                              n7: task: add
                    n5: duration: 5 min.
                                                  n6: ingredient: Mushrooms
          n4: ingredient: onion                   n6: status: sliced
          n4: status: chopped


            Workflow node                   Data node                     Task node

            Data flow edge                  Control flow edge             Part-of edge


                         Fig. 1. An example workflow graph


search or process model querying [11,21] can be applied. According to Dijkman
et al. [11] “the main difference between querying and similarity search is that
querying searches for exact matches of a query to a part of a process model,
while similarity searches for inexact matches of the query to a complete process
model”. We focus on similarity search as it is able to provide results even if
exact matches are not available, which is very likely in many application scenar-
ios. Various approaches for similarity search of workflows have been proposed in
the literature, such as graph edit metrics, graph/subgraph isomorphism, most
common subgraph approaches [3,7,10,14,15]. In our research [7], we developed
an approach that follows the tradition of CBR and uses explicitly modelled local
similarity measures for task and data items, based on task and data ontologies,
which are also used for the semantic annotation of the workflows. The over-
all similarity sim(QW, CW ) ∈ [0, 1] between a query workflow QW and a case
workflow CW from the repository is defined as an optimization problem aiming
at finding the best possible type-preserving, partial, injective mapping of the
nodes and edges of QW to those of CW. The optimization target is the average
similarity of the mapped nodes and edges. This similarity measure assesses how
well the query workflow is covered by the case workflow. In particular, the sim-
ilarity is 1 if the query workflow is exactly included in the case workflow as a
subgraph.


3    Query Language Requirements

Previous work on POCBR (including our own) is limited by the type of queries
that can be considered. As described in the previous section, a query is a single




                                           249
4

(partial) workflow that describes task and data items and structural relationships
of the desired workflow. This can be roughly considered a conjunctive query, as
the ideal workflow contains the whole query workflow, i.e., all components it
contains. Thus, disjunction and negation cannot be expressed. However, the
user may also want to express undesired workflow elements and structures. For
example, some tasks or data elements must not occur in the workflow or a certain
sequence of activities is undesired. Also, disjunction/generalization is sometimes
required, e.g. by providing more general conditions, such as specifying a class
of tasks (from the ontology) or by specifying that a certain task must occur
some time (but not necessarily directly) before another one. More expressive
queries are not only desirable for retrieving more suitable workflows but are
also essential to guide automatic adaptation methods from POCBR as they can
provide hints concerning which workflow elements need to be added, deleted, or
moved to a different position. Besides these usage scenarios, literature also lists
a wide range of additional purposes for queries [17,1], e.g., dependency analysis
between workflow elements [9] and decision making support [12]. However, in the
following we focus our investigations on retrieval and adaptation support. Based
on this, we derive the following main requirements for a new query language,
which have also been partially mentioned in the literature [16,2]:

    – Expressiveness: The query language must be expressive enough to be able
      to represent the relevant requirements of the user. Thus, the query language
      should not only be able to represent what the user desires, it should be
      also able to handle undesired workflow elements, which requires a kind of
      negation. Additionally, generalization of structure and items is required.
    – Intuitiveness: The query language should be easy to understand. Thus,
      new notations should be only introduced if required and it should be based
      on the already known concepts. Additionally, a visual query language is
      preferable as workflows can become very complex and thus also its queries.
    – Ranking: For the specified query language it must be able to identify all
      matching workflows. Moreover, as fully matching workflows are very unlikely
      in many application scenarios, it must be possible to rank the workflows
      w.r.t. suitability for the query, if they don’t match perfectly.

Thus, the similarity-based retrieval must be extended towards a retrieval con-
sidering suitability w.r.t. a more expressive query.


4      POQL: A New Query Language for Workflows

POQL is a workflow query language developed according to the previously men-
tioned requirements. It extends the purely conjunctive query approach by in-
troducing negation and generalized query workflows. Formally, a POQL query
Q is defined as follows: Q = DW ∧ ¬RW1 . . . RWn . Here, DW is the desired
workflow representing properties that the searched workflow should fulfill, RWi
are restriction workflows, each of which represents one undesired situation that




                                        250
                                                                                     5

should be avoided. The desired workflow and the restriction workflows are so-
called generalized query workflows. They are in principle workflows in the pre-
viously introduced sense, but they are extended to represent generalizations by
two means. As a consequence, generalized query workflows are not executable
workflows anymore, but they are just a means to express a query in a way similar
to a workflow. Thereby, the intuitiveness requirement can be fulfilled as building
a query is very similar to building a workflow.

Generalized Task and Data Labels: While workflows from the repository
usually contain ontology instances for tasks and data items specifying the flow of
activities in executable terms, a generalized workflow [20] is allowed to contain
concept labels of the data- and task ontology, thus specifying classes of them. For
example, a recipe workflow may specify that meat is desirable without specifying
in detail which meat should be used. For the representation of generalized task
and data labels, the meta-data language used for semantic annotation must also
include the ontology concepts.

Transitive Control-Flow and Dataflow Connectedness: While workflows
exactly specify the control- and dataflow of a workflow, a generalized query
workflow may just specify that a certain tasks must be executed some time
before another task or that one task produces data that at some later point is
used by another tasks. In formal terms these relations are defined as follows.

 1. Let t1 < t2 define that there exists a control-flow edge between t1 ∈ N T
    and t2 ∈ N T defining that t1 is executed before t2 . We define that two tasks
    t1 , t2 ∈ N T are transitively control-flow connected t1 <∗ t2 iff t1 < t2 ∨ ∃t ∈
    N T : t1 < t < ∗t2 . We represent t1 < ∗t2 in a generalized query graph by
    introducing a new type of edge leading from t1 to t2 .
 2. Let t1 n t2 denote that the tasks t1 , t2 ∈ N T are data-flow connected, i.e., it
    holds t1 n t2 iff t1 < t2 ∨ ∃d ∈ N D : ((t1 , d) ∈ E D ∧ (d, t2 ) ∈ E D ). Based on
    this we define that two tasks t1 , t2 ∈ N T are transitively data-flow connected
    t1 n∗ t2 , if t1 n t2 ∨ ∃t ∈ N T : t1 n t n∗ t2 . To represent transitive data-flow
    connectedness, a new type of edge is introduced. For example, this edge can
    be used to express in the query that after a specific preparation step using
    an ingredient another preparation step follows that uses the same ingredient.

Figure 2(a) shows an example of a generalized query workflow. This query spec-
ifies that a workflow is desired that requires less than 20 minutes of preparation
time, which contains some kind of meat (generalized data label) , and the prepa-
ration steps stuff and bake. Furthermore, it is specified that the tasks for stuffing
the meat must occur before the baking tasks (transitive control-flow connected).
     The restriction workflows can be constructed in a similar manner as illus-
trated in figure 2(b). The four restrictions shown specify that the searched
workflow should require less than 30 minutes of preparation time (RW1 ), that
it should not contain any seafood (RW2 ) or deep-fry preparation steps (RW3 ),
as a frying machine is maybe not available. Furthermore, it is required that no




                                         251
6



                                                                  n1                                n2
                  n1       n1: duration: < 20 min
                                                         n1: duration: < 30 min         n2: ingredient: seafood

                                                                 RW1                           RW2

    n4           n5 n5 <* n6 n6                             n3             n4          n5                 n7

                                                                                   n5: task: add     n7: task: bake
                                                      n3: task: deep-fry
            n5: task: stuff    n6: task: bake                              n4: ingredient: cheese

    n4: ingredient: meat                                   RW3                              RW4
         (a) Query Workflow Example                     (b) Restriction Workflow Examples

                                  Fig. 2. POQL Query Example


cheese is added to a dish component which is later baked (RW4 ), thus casserole
recipes are undesired.
    Please note that in principle it would be possible to allow more than one
desired workflow in a query. However, this is not necessary as it would not extend
the expressiveness of the language, as several desired workflows could be easily
merged into a single desired workflow representing the same query semantics.
    The processing of a POQL query requires ranking all workflow from the
repository and presenting the best ranked results. For a query which does not
include any restriction workflows, our approach for workflow similarity can be
extended in a straight forward manner. Generalized task and data labels are
addressed by applying taxonomic similarity measures, which are well-established
in CBR [5].
    To consider transitive control-flow and dataflow connectedness, the workflow
graph of all workflows in the repository is extended to represent all relations
t1 < ∗t2 and t1 n∗ t2 which must be pre-computed during the initialization of
the repository. Given this, these relations can be matched in the same man-
ner as all other edges of the graph. However, this approach obviously increases
the complexity of the representation and thus the complexity of the similarity
assessment, which is an issue of our future research.
    To handle restriction workflows in POQL requires dealing with negation and
conjunction. We propose an approach adopted from fuzzy logic [8] by treating
the similarity value computed by comparing a case workflow with a generalized
query workflow as a fuzzy membership value. Then we can apply fuzzy negations
and a t-norm to compute the conjunction. In particular, we compute the rank
of a workflow as follows:


rank(Q, CW ) = min{sim(DW, CW ), 1−sim(RW1 , CW ), . . . , 1−sim(RWn , CW )}
                                                                       (1)




                                                252
                                                                                 7

   The resulting rank(Q, CW ) ∈ [0, 1] reflects how well the workflow matches
the query, while the following conditions hold:

 1. Rank = 1 if the case workflow exactly matches the desired workflow and
    does not contain any subworkflow which is somehow similar to any of the
    restriction workflows.
 2. Rank = 0 if the case workflow contains a subworkflow which exactly matches
    one of the restriction workflows.
 3. Rank ∈]0, 1[ if the case workflow partially matches the desired workflow and
    if all restriction workflows are at most matching partially (not exactly).


5   Conclusions

We presented the novel query language POQL for POCBR which is highly in-
tuitive and enables not just the retrieval but also the adaptation of workflows.
A POQL query can be easily constructed using a graphical workflow editor by
introducing additional link types among tasks as well as ontological concepts for
semantic annotation. We presented an approach that allows an ordering of the
best workflows from the repository by a ranking approach.
    In process model querying, BPMN-Q [1,21] is a related approach applied
to BPMN business processes. The approach is able to identify processes that
match the partial modelled workflow. However, the approach is so far neither
able to consider the dataflow nor undesired data or tasks. Furthermore, there is
no ranking between the processes found. Awad et al. [2] extend BPMN-Q by re-
garding semantics between workflow elements. The related approaches presented
by Beeri et al. [4] or by Markovic et al. [16,17] are not able to support negations
or to rank the results by similarity which both is required for the modelling and
adaptation support of workflows. Recently, PQL [13] has been presented, which
also addresses the querying and changing of process models. However, in con-
trast to our work where required changes are implicitly derived from the query,
PQL required to define those changes explicitly in a SQL-like statement.
    Future work will extend the query language to support other usage scenarios
(see section 3), i.e., scenarios in which only fragments of workflows are searched
rather than complete workflows. Additionally, an evaluation of the presented
POQL will be undertaken.


Acknowledgements. This work was funded by the German Research Founda-
tion (DFG), project number BE 1373/3-1.


References

 1. Awad, A.: BPMN-Q: A language to query business processes. In: EMISA. vol. 119,
    pp. 115–128 (2007)




                                       253
8

 2. Awad, A., Polyvyanyy, A., Weske, M.: Semantic querying of business process mod-
    els. In: Enterprise Distributed Object Computing Conference, 2008. EDOC’08.
    12th International IEEE. pp. 85–94. IEEE (2008)
 3. Bae, J., Liu, L., Caverlee, J., Rouse, W.B.: Process mining, discovery, and inte-
    gration using distance measures. In: Web Services, 2006. ICWS’06. International
    Conference on. pp. 479–488. IEEE (2006)
 4. Beeri, C., Eyal, A., Kamenkovich, S., Milo, T.: Querying business processes. In:
    Proceedings of the 32nd international conference on Very large data bases. pp.
    343–354. VLDB Endowment (2006)
 5. Bergmann, R.: Experience management: foundations, development methodology,
    and internet-based applications. Springer-Verlag (2002)
 6. Bergmann, R., Gessinger, S., Görg, S., Müller, G.: The collaborative agile knowl-
    edge engine cake. In: Proceedings of the 18th International Conference on Support-
    ing Group Work. pp. 281–284. GROUP ’14, ACM, New York, NY, USA (2014)
 7. Bergmann, R., Gil, Y.: Similarity assessment and efficient retrieval of semantic
    workflows. Inf. Syst. 40, 115–127 (Mar 2014)
 8. Burkhard, H.D., Richter, M.M.: On the notion of similarity in case based reasoning
    and fuzzy theory. In: Soft computing in case based reasoning, pp. 29–45. Springer
    (2001)
 9. Dai, W., of Waterloo. School of Computer Science, U.: A Query-based Approach
    to Workflow Process Dependency Analysis. University of Waterloo (2005)
10. Dijkman, R., Dumas, M., Garcı́a-Bañuelos, L.: Graph matching algorithms for
    business process model similarity search. In: Business process management, pp.
    48–63. Springer (2009)
11. Dijkman, R.M., La Rosa, M., Reijers, H.A.: Managing large collections of business
    process models-current techniques and challenges. Computers in Industry 63(2),
    91–97 (2012)
12. Hepp, M., Leymann, F., Domingue, J., Wahler, A., Fensel, D.: Semantic business
    process management: A vision towards using semantic web services for business
    process management. In: e-Business Engineering, 2005. ICEBE 2005. IEEE Inter-
    national Conference on. pp. 535–540. IEEE (2005)
13. Kammerer, K., Kolb, J., Reichert, M.: PQL - A Descriptive Language for Querying,
    Abstracting and Changing Process Models. In: BPMDS’15. pp. 135–150. No. 214
    in LNBIP, Springer (June 2015)
14. Kapetanakis, S., Petridis, M., Knight, B., Ma, J., Bacon, L.: A case based reason-
    ing approach for the monitoring of business workflows. In: Case-Based Reasoning.
    Research and Development, pp. 390–405. Springer (2010)
15. Ma, Y., Zhang, X., Lu, K.: A graph distance based metric for data oriented work-
    flow retrieval with variable time constraints. Expert Systems with Applications
    41(4, Part 1), 1377 – 1388 (2014)
16. Markovic, I.: Advanced querying and reasoning on business process models. In:
    Abramowicz, W., Fensel, D. (eds.) Business Information Systems, Lecture Notes
    in Business Information Processing, vol. 7, pp. 189–200. Springer Berlin Heidelberg
    (2008)
17. Markovic, I., Costa Pereira, A., de Francisco, D., Muoz, H.: Querying in business
    process modeling. In: Di Nitto, E., Ripeanu, M. (eds.) Service-Oriented Computing
    - ICSOC 2007 Workshops, Lecture Notes in Computer Science, vol. 4907, pp. 234–
    245. Springer Berlin Heidelberg (2009)
18. Minor, M., Montani, S., Recio-Garca, J.A.: Process-oriented case-based reasoning.
    Information Systems 40(0), 103 – 105 (2014)




                                         254
                                                                                   9

19. Müller, G., Bergmann, R.: Workflow streams: A means for compositional adapta-
    tion in process-oriented cbr. In: Case-Based Reasoning Research and Development,
    pp. 315–329. Springer (2014)
20. Müller, G., Bergmann, R.: Generalization of Workflows in Process-Oriented Case-
    Based Reasoning. In: 28th International FLAIRS Conference. AAAI, Hollywood
    (Florida), USA (2015)
21. Sakr, S., Awad, A., Kunze, M.: Querying process models repositories by aggregated
    graph search. In: Business Process Management Workshops. pp. 573–585. Springer
    (2013)




                                        255