<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data-driven and Context-aware Process Provisioning</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Queensland University of Technology</institution>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Business process provisioning deals with the execution of a process by providing
sufficient resources (people, technology, and information) to realize its goals.
To ensure effective execution of the business process, process execution logs or
event logs are analyzed from multiple perspectives: i) ordering of
activities (process), ii) completion time of tasks and process (timing), iii) variances
based on case attributes (case), and iv) capabilities and roles of resources and their
allocation (organization). A large part of existing work has focused on these
perspectives in isolation. This thesis is mainly concerned with analyzing
variance in resource efficiency based on case attributes and its impact on
process performance, thus considering the organizational, case, and timing perspectives
together. Further, the dimension of operational context is considered,
as event log data suggests that human workers or resources with the same
organizational role and capabilities can have heterogeneous efficiencies depending on
their operational context. In practice, experienced managers use this knowledge
and consider these heterogeneous resource<sup>1</sup> efficiencies when allocating tasks to
meet the required process performance. The thesis is mainly concerned with
answering operational questions such as "How does resource efficiency vary with
case attributes and resource behavior (which manifests as context), further impacting
task allocation?", "How do we support task allocation based on process context,
for push-based and pull-based dispatching scenarios?", and "How do we identify
useful contextual information from process data?" Hence, we refer to this work
as context-aware process provisioning.</p>
      <p>
        The importance of context has been recognized by many researchers in
disciplines such as mobile computing, ubiquitous computing, information retrieval, as
well as business process management [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In BPM, context has been
considered as "the initial state of the process and the set of external events that affect
the process instance at run time" [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The context can also contain important
information about resource behavior that is external to the process
execution and yet could influence the resource and the process. Early work on dynamic
resource allocation alludes to the importance of considering certain resource
and case attributes such as suitability, availability, conformance, and urgency [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
Footnote 1: This work focuses on human resources only.
      </p>
      <p>
        The influence of individual contextual parameters such as workload and
cooperation on resource performance has been studied [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. As several contextual
parameters can influence resources and their efficiency, we learn the influence
of operational context and case attributes on resource efficiency from event logs. The key
argument of this thesis is that effective resource allocation requires
considering the influence of several factors related to the process instance and the
operational context of resources in conjunction. To this end, we present a set of
data-driven methods (machine learning models) relevant to different dispatching
scenarios.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Contributions</title>
      <p>The overall contribution of this thesis is to validate the influence of multiple
factors related to case attributes and context on resource performance, to develop
solutions that learn the influence of these factors from event data using supervised
machine learning algorithms, and to explore the possibility of mining contextual
factors from additional process data sources, such as textual information, to
support process analysis. In this thesis, in line with other definitions of context, we
consider context as exogenous knowledge potentially relevant to the execution of
the process that is available at the start of the execution of the process and that
is not impacted or modified by the execution of the process.</p>
      <p>
        The key contributions to context-aware resource performance analysis and
allocation are as follows:
        <list list-type="bullet">
          <list-item><p>An evaluation of the influence of context and case attributes on resource performance
using event logs, which leads to a context model comprising resource behavior
indicators such as experience, cooperation, and preference [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and other task-related contextual factors.</p></list-item>
          <list-item><p>A recommender system that uses the context model, in addition to task and
resource attributes, for pull-based dispatching scenarios.</p></list-item>
          <list-item><p>A supervised machine learning-based method to derive resource allocation
policies in push-based dispatching scenarios.</p></list-item>
          <list-item><p>A method to explore and discover contextual information that impacts
performance outcome from unstructured textual data available in the
communication and message logs of business process tasks.</p></list-item>
        </list>
      </p>
      <sec id="sec-2-1">
        <title>Heterogeneous Resource Performance</title>
        <p>In a business process where contractual service levels have to be maintained, the
time taken to complete a task (completion time) is a critical performance metric.
The completion time comprises: (a) the queue waiting time in the process, and (b)
the service time of the resource (the time required to complete a single unit of work).
The queue waiting time depends on the amount of work that exists in the system
and the resources available for doing that work. The service time of a worker
(or resource) is influenced by several factors. In this work, multiple factors are
considered: (a) the complexity of the work, (b) the importance or priority of the work, and
(c) contextual factors such as the expertise level of the worker required for the work. It
is observed that, while experts have lower service times than novices for complex
and important work, they tend to have the same efficiency as novices for
less important work. The study illustrates that the efficiency of human resources
in any process depends on multiple factors that go beyond the role or availability
of the workers. A simulation model is built to account for the behavior of experts
and novices for varying work complexity and priority. A search-based optimizer
uses the simulation model to arrive at an optimal staffing. The simulation model
results in different staffing solutions depending on the contextual factors considered.
Hence, the results presented serve as the basis for guidelines on using data-driven
analysis that takes the operational context into consideration when allocating tasks
to resources.</p>
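        <p>The decomposition of completion time into queue waiting time and service time can be illustrated with a minimal sketch. It is not the simulation model or the optimizer of the thesis: the service times, the expert speed-up factor, the FIFO dispatching rule, and all names (BASE_SERVICE_TIME, EXPERT_SPEEDUP, service_time, simulate) are assumptions chosen only to show how staffing mix can change completion times.</p>
        <preformat><![CDATA[
```python
import heapq

# Hypothetical base service times (hours) by work complexity.
BASE_SERVICE_TIME = {"simple": 1.0, "complex": 3.0}
# Assumption: experts halve the service time on high-priority work only.
EXPERT_SPEEDUP = 0.5


def service_time(expertise, complexity, priority):
    """Service time of one task for a worker (illustrative numbers only)."""
    base = BASE_SERVICE_TIME[complexity]
    if expertise == "expert" and priority == "high":
        return base * EXPERT_SPEEDUP
    return base  # experts behave like novices on less important work


def simulate(tasks, staff):
    """FIFO dispatch: each task goes to the worker who becomes free first.

    tasks: list of (arrival_time, complexity, priority), sorted by arrival.
    staff: list of expertise levels, e.g. ["expert", "novice"].
    Returns one completion time (waiting time + service time) per task.
    """
    # Heap of (time at which the worker becomes free, worker index).
    free_at = [(0.0, i) for i in range(len(staff))]
    heapq.heapify(free_at)
    completion_times = []
    for arrival, complexity, priority in tasks:
        available, worker = heapq.heappop(free_at)
        start = max(arrival, available)
        waiting = start - arrival
        service = service_time(staff[worker], complexity, priority)
        heapq.heappush(free_at, (start + service, worker))
        completion_times.append(waiting + service)
    return completion_times


def mean(values):
    return sum(values) / len(values)


tasks = [(0.0, "complex", "high"), (0.0, "complex", "high"),
         (1.0, "simple", "low"), (2.0, "complex", "high")]
novices_only = simulate(tasks, ["novice", "novice"])
mixed = simulate(tasks, ["expert", "novice"])
print(mean(novices_only), mean(mixed))
```
]]></preformat>
        <p>With these made-up numbers, replacing one novice by an expert lowers the mean completion time from 3.25 to 2.0 hours, partly through shorter service times and partly through shorter waiting times for the tasks queued behind them.</p>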
      </sec>
      <sec id="sec-2-2">
        <title>Context-aware Resource Allocation</title>
        <p>In a process where task allocation follows a pull-based dispatching policy, the
responsibility for selecting the right task to work on lies with the resource. The
proposed approach recommends tasks to resources, taking into
consideration the context of the resource and the task. Such decision support
is useful to novice service workers. To this end, we build a context-aware
recommender system (CARS). The event logs are used to infer the context and the outcome
of the execution. The outcome is the performance indicator defined for the task.
The recommender predicts the suitability of a task for a resource by providing
a rating. Prediction is based on the assumption that resources who have similar
ratings on certain tasks are likely to have similar ratings on other tasks.
The rating of a task is predicted by identifying resources who have had similar
ratings on tasks under similar context (hence considering the notion of context).
A context model based on resource characteristics or behavior and task context
such as 'time of the day' is defined. To conduct the evaluation, two
Business Process Intelligence Challenge (BPIC) event logs are used (BPIC 2012 and
BPIC 2013). Based on the identified performance outcome (completion time,
quality), ratings are computed for each resource-task pair. The event logs are
enriched by inferring contextual factors such as resource workload, experience,
and preference. The rating of a task for a resource is predicted and compared
with the actual rating. Predictions with and without contextual information are
compared to evaluate the importance of context in deriving a suitable rating
of a resource for a task. Results indicate that the mean absolute error (MAE)
considering contextual factors is lower than the MAE without considering context<sup>2</sup>.
Further, the improvement in MAE is studied for individual contextual factors.
The factors that influence the rating vary for each business process.
Footnote 2: https://github.com/renuka98/context_aware_allocation</p>
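        <p>The comparison of predictions with and without context can be sketched as follows. This is a minimal illustration, not the recommender built in the thesis: it assumes a neighbourhood-based predictor with contextual pre-filtering (only ratings observed under the same context are used), a single hypothetical context dimension (workload), an ad hoc similarity function, and invented ratings.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict


def build_profiles(ratings, context=None):
    """Map resource -> {task: mean rating}, optionally keeping one context only."""
    collected = defaultdict(lambda: defaultdict(list))
    for resource, task, ctx, rating in ratings:
        if context is None or ctx == context:
            collected[resource][task].append(rating)
    return {res: {task: sum(v) / len(v) for task, v in tasks.items()}
            for res, tasks in collected.items()}


def similarity(profile_a, profile_b):
    """Similarity in (0, 1] from the mean absolute rating gap on shared tasks."""
    shared = set(profile_a) & set(profile_b)
    if not shared:
        return 0.0
    gap = sum(abs(profile_a[t] - profile_b[t]) for t in shared) / len(shared)
    return 1.0 / (1.0 + gap)


def predict(ratings, resource, task, context=None):
    """Similarity-weighted mean of other resources' ratings for the task."""
    profiles = build_profiles(ratings, context)
    target = profiles.get(resource, {})
    weighted, total = 0.0, 0.0
    for other, profile in profiles.items():
        if other == resource or task not in profile:
            continue
        weight = similarity(target, profile)
        weighted += weight * profile[task]
        total += weight
    if total == 0.0:
        # Fallback: mean of the ratings in this context, else of all ratings.
        values = ([r for _, _, c, r in ratings if context is None or c == context]
                  or [r for _, _, _, r in ratings])
        return sum(values) / len(values)
    return weighted / total


def mean_absolute_error(train, test, use_context):
    errors = [abs(predict(train, res, task, ctx if use_context else None) - actual)
              for res, task, ctx, actual in test]
    return sum(errors) / len(errors)


# Invented ratings: (resource, task, workload context, rating).
train = [
    ("r1", "t1", "low", 5), ("r1", "t1", "high", 2),
    ("r2", "t1", "low", 5), ("r2", "t1", "high", 2),
    ("r2", "t2", "low", 4), ("r2", "t2", "high", 1),
    ("r3", "t1", "low", 3), ("r3", "t1", "high", 3),
    ("r3", "t2", "low", 2), ("r3", "t2", "high", 3),
]
test = [("r1", "t2", "low", 4), ("r1", "t2", "high", 1)]

mae_without_context = mean_absolute_error(train, test, use_context=False)
mae_with_context = mean_absolute_error(train, test, use_context=True)
print(mae_without_context, mae_with_context)
```
]]></preformat>
        <p>In this toy data the ratings depend strongly on workload, so averaging over contexts blurs the signal and the context-aware MAE is lower. Whether this holds for a real event log depends on the process.</p>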
        <p>Sindhgatta et al.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Learning Influence of Context on Resource Allocation</title>
        <p>In push-based dispatching, a system or a person is responsible for allocating
the tasks of the process to relevant human resources. To study the influence of
context on resource allocation, two common explainable machine learning
algorithms are used: (1) decision tree learning, and (2) k-nearest neighbor
(k-NN). With the former, context inferred from event logs is used as the input
features and the process outcome is used as the target variable to derive a decision
tree that predicts the performance of a process instance. The decision tree thus
obtained can also be used to extract rules correlating contextual knowledge with
process data when the intent is to guarantee a certain set of outcomes (in other
words, a certain performance profile). With the k-NN approach, k-NN regression
is used to determine, from the nearest neighbors of a process instance, those
values of the process context (particularly those that characterize resources)
that would likely lead to the desired outcomes. The approach is evaluated
using both a real-world dataset (BPIC 2013) and a synthetic dataset.
The evaluation on the synthetic dataset aims to verify the ability of the
approach to discover context-dependent task allocation rules that are synthetically
introduced in the log. The real-world data is used to validate the influence of
context when allocating tasks. The decision tree model helps explain the influence
of contextual factors on process outcome (and hence resource efficiency). The
model predicts process outcome with an F1-score of 67.13% on the hold-out test
data.</p>
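        <p>The k-NN part of the approach can be sketched in a few lines. The sketch below is one possible reading, not the thesis implementation: it assumes numeric, pre-scaled features (case complexity, resource experience, resource workload), completion time as the outcome, Euclidean distance, and invented historical instances. For a new case, each candidate resource context is scored by k-NN regression and the one with the lowest predicted completion time is suggested.</p>
        <preformat><![CDATA[
```python
import math


def knn_regress(history, query, k):
    """Predict an outcome as the mean outcome of the k nearest instances."""
    nearest = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)


def recommend_context(history, case_features, candidates, k):
    """Return the candidate context with the lowest predicted completion time."""
    predictions = {
        name: knn_regress(history, case_features + context, k)
        for name, context in candidates.items()
    }
    return min(predictions, key=predictions.get), predictions


# Invented history:
# (case complexity, resource experience, resource workload) -> completion time (h)
history = [
    ((0.9, 0.9, 0.2), 4.0),
    ((0.8, 0.8, 0.3), 5.0),
    ((0.9, 0.2, 0.2), 9.0),
    ((0.8, 0.1, 0.4), 10.0),
    ((0.2, 0.9, 0.3), 2.0),
    ((0.1, 0.2, 0.2), 2.5),
]

# Hypothetical candidate contexts: (experience, workload) of available resources.
candidates = {"experienced": (0.85, 0.25), "novice": (0.15, 0.30)}

# New case with complexity 0.85.
best, predictions = recommend_context(history, (0.85,), candidates, k=2)
print(best, predictions)
```
]]></preformat>
        <p>Because the prediction is an average over concrete historical instances, the neighbors themselves can be shown to a manager as the justification for an allocation, which is one reason such a model is considered explainable.</p>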
      </sec>
      <sec id="sec-2-4">
        <title>Context Mining</title>
        <p>
          Observing and analyzing the impact of the context of a process, or of
environmental factors, on its execution outcome helps in adapting and improving the process.
There can be two views of context [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. The first is explicit context, which is
information that is identified by domain experts and can be defined a priori. The second is
implicit context, where some situations arise as part of performing a task or an
activity and may not be known prior to task or process execution. These implicit
contextual dimensions need to be discovered from various sources of information
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. In this work, the problem of exploiting unstructured textual data to discover
implicit context is studied. In the proposed approach, phrases are
extracted from relevant textual logs of process instances. Phrases are
used instead of entire user comments, as a comment can cover several
aspects of task execution. The extracted phrases or segments of information are
clustered. Since the phrases are very short documents, different clustering
approaches are evaluated to identify the most suitable method for short-document
clustering. The clusters are semi-automatically pruned by applying filtering rules
based on size and process outcome, to arrive at a subset of textual clusters that are
likely to relate to implicit contextual information and to impact process outcome.
The final decision on whether the information is a contextual dimension is
made by domain experts. Different clustering algorithms and textual similarity
measures are evaluated on an annotated dataset containing cluster labels. This
enables identifying a suitable clustering algorithm and similarity metric. Next,
the selected clustering algorithm and similarity metric are used to identify phrases
in a business process log. Clusters of phrases that show statistically significant
variance in the performance outcome are identified. The phrases mined by
the approach offer insights into the possibility of identifying implicit
context through unstructured textual data.
        </p>
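        <p>The clustering and pruning steps can be illustrated with a small sketch. The thesis compares several clustering algorithms and similarity measures without naming them here, so the choices below (greedy leader clustering, Jaccard similarity on word sets, a minimum cluster size as the only pruning rule) are stand-in assumptions, and the phrases and outcomes are invented. The statistical test on outcome variance is not shown; the sketch only reports the mean outcome per retained cluster.</p>
        <preformat><![CDATA[
```python
def tokens(phrase):
    """Lower-cased word set of a phrase."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def leader_clusters(records, threshold):
    """Greedy clustering: join the first cluster whose first phrase is similar enough.

    records: list of (phrase, outcome) pairs.
    """
    clusters = []
    for phrase, outcome in records:
        for cluster in clusters:
            leader_phrase = cluster[0][0]
            if jaccard(tokens(phrase), tokens(leader_phrase)) >= threshold:
                cluster.append((phrase, outcome))
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([(phrase, outcome)])
    return clusters


def prune(clusters, min_size):
    """Keep large enough clusters as (leader phrase, size, mean outcome)."""
    return [
        (cluster[0][0], len(cluster), sum(o for _, o in cluster) / len(cluster))
        for cluster in clusters if len(cluster) >= min_size
    ]


# Invented phrases with the completion time (hours) of their process instance.
records = [
    ("waiting for customer response", 30.0),
    ("waiting for customer reply", 28.0),
    ("customer response waiting", 32.0),
    ("password reset done", 2.0),
    ("reset password done", 3.0),
    ("server rebooted", 5.0),
]

clusters = leader_clusters(records, threshold=0.5)
kept = prune(clusters, min_size=2)
for leader, size, mean_outcome in kept:
    print(leader, size, mean_outcome)
```
]]></preformat>
        <p>A cluster such as the "waiting for customer" phrases, with a much higher mean completion time than the others, is the kind of candidate that would then be passed to domain experts for the final decision.</p>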
        <p>However, a key challenge in the evaluation of this study is the use of
unsupervised learning algorithms on business process logs that do not contain the
gold standard (or ground truth) that other natural language processing tasks
rely on for evaluation. The lack of process event logs containing textual information
limits the ability to evaluate on multiple process logs and to assess the
repeatability of results. Hence, we consider this work a first step in mining contextual
data from process logs, one that could be further improved with the use of supervised
learning on an expert-annotated or labeled dataset.</p>
        <p>In conclusion, this thesis compiles a set of methods that are data-driven and
context-aware to help managers and resources allocate suitable tasks and help
improve the overall process outcome.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Saidani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rolland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Nurcan</surname>
          </string-name>
          , "
          <article-title>Towards a generic context model for BPM," in 48th HICSS 2015, Hawaii</article-title>
          , USA,
          <year>2015</year>
          , pp.
          <fpage>4120</fpage>
          –
          <lpage>4129</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosemann</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Recker</surname>
          </string-name>
          , "
          <article-title>Context-aware process design exploring the extrinsic drivers for process flexibility,"</article-title>
          <source>in Proc. of the CAISE*06 Workshop on BPMDS '06</source>
          , Luxemburg,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ghattas</surname>
          </string-name>
          , P. Soffer, and M. Peleg, "
          <article-title>A formal model for process context learning," in BPM 2009 International Workshops</article-title>
          , Ulm, Germany, September 7,
          <year>2009</year>
          . Revised Papers,
          <year>2009</year>
          , pp.
          <fpage>140</fpage>
          –
          <lpage>157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. M. W.</given-names>
            <surname>Verbeek</surname>
          </string-name>
          , "
          <article-title>Dynamic work distribution in workflow management systems: How to balance quality and performance,"</article-title>
          <source>J. Manage. Inf. Syst.</source>
          , vol.
          <volume>18</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>157</fpage>
          –
          <lpage>193</lpage>
          , Jan.
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Nakatumba</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , "
          <article-title>Analyzing resource behavior using process mining," in BPM Intl</article-title>
          . Workshops, Germany,
          <year>2009</year>
          , pp.
          <fpage>69</fpage>
          –
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Dijkman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Song</surname>
          </string-name>
          , "
          <article-title>Optimal resource assignment in workflows for maximizing cooperation,"</article-title>
          <source>in BPM</source>
          <year>2013</year>
          , China,
          <year>2013</year>
          , pp.
          <fpage>235</fpage>
          –
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Wynn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Fidge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. H. M.</given-names>
            <surname>ter Hofstede</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leyer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , "
          <article-title>An extensible framework for analysing resource behaviour using event logs," in CAiSE, Greece</article-title>
          , June,
          <year>2014</year>
          , pp.
          <fpage>564</fpage>
          –
          <lpage>579</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Dourish</surname>
          </string-name>
          , "
          <article-title>What we talk about when we talk about context,"</article-title>
          <source>Personal Ubiquitous Comput.</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>19</fpage>
          –
          <lpage>30</lpage>
          , Feb.
          <year>2004</year>
          , ISSN: 1617-4909.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiseleva</surname>
          </string-name>
          , "
          <article-title>Context mining and integration into predictive web analytics,"</article-title>
          <source>in 22nd International World Wide Web Conference, WWW '13</source>
          , Rio de Janeiro, Brazil, May 13-17,
          <year>2013</year>
          , Companion Volume,
          <year>2013</year>
          , pp.
          <fpage>383</fpage>
          –
          <lpage>388</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>