<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IRCDL</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Simulating User Interaction and Search Behaviour in Digital Libraries</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saber Zerhoudi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Granitzer</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christin Seifert</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joerg Schloetterer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Duisburg-Essen</institution>
          ,
          <addr-line>Duisburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Passau</institution>
          ,
          <addr-line>Passau</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>18</volume>
      <fpage>24</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>Providing a satisfying user experience is key to successful digital libraries. An eficient search user interface design requires a detailed understanding of user behaviour to match their needs. A common approach to modelling users is to transform the recorded users' navigation behaviour to Markov models and analyse them to provide personalised content. However, personalisation based on log data is hard to achieve due to the lack of suficient daily user interactions in comparison to web search engines, even when achieved, it's usually on the cost of users' privacy and the confidentiality of their personal information potentially leading to privacy violation. In order to allow data analysis while retaining users' privacy, we propose a Markov approach to simulate user sessions and help reducing the amount of collected user data while preserving the profiling eficiency. Specifically, given log data of search sequences performed by users in a digital library, we explore basic, conditional, contextual, time-aware and query-change Markov models to simulate similar search sequences. Our results on an academic search engine log dataset show that our methods reliably simulate global and type specific search behaviour. Simulated search sequences representing an approximation of real behaviour can be used to generate log data at any desired volume and variety such that A/B-testing digital libraries becomes scalable.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Simulating user session</kwd>
        <kwd>Markov models</kwd>
        <kwd>User search behaviour</kwd>
        <kwd>User modelling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The main aim for search engines is to ofer a satisfying user experience. Their success can be
measured by how well they are able to predict user actions and provide users with the right
content at the right time. Modelling user behaviour accurately also allows search engines to
adapt their user interface design to match users’ needs. This includes deciding where each
user interface component should be placed and which content should be provided to the user.
However, problems arise when improving platforms such as digital libraries (e.g., academic
search engines) that aim to provide personalised content to their user-base, namely researchers
and students. In fact, it is often not possible to build personalised user models due to the lack
of suficient daily user interactions in comparison to web search engines [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and even when
available, such models would be generated at a significantly smaller scale and would be less
accurate.
      </p>
      <p>Recently, user behaviour has been subject to controversy debates in information security
and privacy related areas, especially with the large portion of collected user data in web search
engines and digital libraries. Analysing this wealth of information without privacy-concerns
allowed - web search engines on a larger scale and digital libraries on a minor one - to improve
their search experience. However, concerns started to rise regarding the usage of collected
sessions and the impact of storing personal, privacy-sensitive data without the user knowing.</p>
      <p>
        To reduce the amount of collected user data while preserving the profiling eficiency and
help domain-specific search engines, where the amount of user data is limited, to provide
accurate information to users while protecting their privacy and the confidentiality of their
personal information, we propose to use simulation. Simulation ofers a way to overcome the
lack of experimental real-world data, especially when the acquisition of such data is costly or
challenging [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Simulating user search behaviour also allows information retrieval systems (IR)
to conduct A/B-tests for the purpose of monitoring user interactions and parameterising the
a-priori distribution of search types using diferent back-end configurations and user interface
variants (e.g., changing facets placement in the user interface to more prominent place) to
improve the retrieval functionality.
      </p>
      <p>To better simulate users’ search behaviour and model their information needs, search engines
ofer contextual information with data such as previous queries and click-actions, which is very
useful for understanding user search behaviour and improving the retrieval system
configurations involved in the search process. In fact, digital libraries should consider user’s browsing
context (e.g., which users are more likely to formulate more queries) and sequentiality (e.g.,
which search actions are performed knowing that the user has formulated its -th query) to
improve the precision of results. Users in the same group are most likely to behave in a similar
manner. Therefore, it gets easier to recognise users’ intentions and simulate their behaviour
when exploiting the available contextual information.</p>
      <p>
        In this paper, we propose the use of Markov models to simulate global and search-type
specific behaviour to support users in their search. Markov models have been widely used for
discovering meaningful patterns in browsing data due to their good interpretability [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6">3, 4, 5, 6</xref>
        ].
In particular, they capture sequences in search patterns using transitional probabilities between
states and translate user sessions into Markov processes.
      </p>
      <p>Our contributions are the following: (1) we conduct experiments on a real-wold dataset and
simulate user search behaviour using a basic Markov approach which we consider as baseline, (2)
we then demonstrate that while using user’s browsing sequentiality, the Markov model simulates
user sessions marginally better than the baseline, (3) we also showcase that Markov model
perform well when considering user’s actions or the user’s intent-level query-change as context,
even when the contexts are derived based on common sense assumptions and (4) modelling the
dwell time and using it as an additional feature for session simulation yielded better results.
Our contributions also include (5) an empirical validation that utilises the Kolmogorov-Smirnov
statistical test to compare the similarity between real log and simulated data distribution and (6)
an evaluation technique that uses classification to assess the quality of simulated search session
and calculate the model’s improvement in comparison to the baseline.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        While Markov models can be used in other domains such as credit card fraud detection [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
computational biology [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] or cyber-security [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], we restrict the discussion here to understanding,
modelling and predicting users’ search behaviour on search engines.
      </p>
      <p>
        Several Markov models variations and extensions were proposed for modelling user
navigation behaviour. For example, Cadez et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] used first-order Markov models to cluster user
behaviour and reported some interesting insights (e.g., order in which users request pages) that
improved the website which the data was taken from. Liu et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] explored Continuous-Time
Markov Processes for modelling users’ browsing behaviour in web search to help computing
page importance. Azzopardi introduced Search Economic Theory (SET) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to model the
search process and estimate users’ efort during a search process using a cost function. He
compared the interaction costs of diferent search strategies using simulated user interactions.
Hybrid-order tree-like Markov models [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Selective Markov models [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and variable order
Markov model [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] were also used to extract users’ interactions from logs and capture both
small and large order Markov dependencies.
      </p>
      <p>
        Besides, the most commonly used techniques are Hidden Markov Models (HMM) based.
Hidden Markov Models are used to model exploratory search sessions. Lucas found HMM to be
suitable for pattern recognition problems on the basis of sequential data [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] (e.g., credit card
fraud detection, handwriting recognition or biological sequence analysis). However, working
with HMM usually requires an understanding of the domain since we do not have control over
the state transitions and the states are not completely observable. Kiseleva et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] suggested
to use a hierarchical clustering approach to decompose users’ web sessions into non-overlapping
temporal segments. Authors also identified temporal context and utilised it to predict future
user’s actions more accurately using Markov models.
      </p>
      <p>
        Our discussion of related work reveals a strong focus on predicting and modelling user’s
search behaviour. Despite this fact, the application of Markov model as a simulation model
outside the healthcare domain [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is still in its infancy.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>The main purpose of this study is to simulate user search behaviour based on user’s behaviour
patterns. In other words, given log data of search sequences performed by users in a digital
library, our goal is to simulate similar search sequences. In this work, we use Markov model
as a simulation model. In general, the input for this problem is a sequence of search sessions
produced by real users, and the goal is to build Markov models using diferent approaches that
model user’s search behaviour.</p>
      <sec id="sec-3-1">
        <title>3.1. Basic Markov model</title>
        <p>
          We propose investigating the use of Markov Chains to model the search dynamics. The
theoretical model is based on first-order Markov models [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Our model is defined as a weighted
directed graph, where user actions are represented as nodes and the transition probability
between actions as edges. In fact, let  = (, , ) where:
•  is the set of all possible user search actions including START/END node, with each
action defined as state ,
•  ⊆  ×  is the set of all possible transitions between two actions,
•  :  → [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ] is a function that assigns a weight  to every pair of states ( ,  )
which represents the likelihood of a transition from state  to state .
        </p>
        <p>Let  be the random variable that models actions in a user search session. Basic Markov
approach models the transition probability between a pair of states using maximum likelihood
estimation as follows:
 ( = |− 1 = ) = , (1)

where  is the total amount of how many times the state  occurred in the training data
and , is the amount of how many times the transition from state  to state  has been
observed.</p>
        <p>We use first-order Markov model as a baseline to model the global user search behaviour
and simulate user search sessions in such a way that each action that is performed by a user
corresponds to a state in the model.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Conditional Markov model</title>
        <p>
          In conditional Markov approach, we aim to model the probability to perform an action given the
current action and query, and the probability to formulate a query given the current action and
query. We extend the work that has been done in [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] by adjusting it to the simulation scenario.
We utilise the user search sessions in the log data to calculate the transition probabilities
between diferent states resulting in a Markov model that takes into consideration the following
definitions:
•  ∈ N user query: the -th query formulated in a user search session
•  ∈ N user action: the -th action performed for the -th query in a user search session
Every user action in a search session is related to the preceded query formulation. With the
exception of the first query formulation that marks the start of a user search session, every
query is also related to the preceded action performed by the user.
        </p>
        <p>
          Let  be the random variable that models query formulation in a user search session and 
the random variable that models user actions performed for the last query in a search session
after  transitions. In order to model user sessions, we define the pair (  = ,  = ) in a
way that it represents when a user performs -th action after formulating an -th query. We
introduce two diferent types of transitions. The first one models the transition probability of
formulating -th query given that  actions were performed in the (  − 1) -th query [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]:
and the second one models the transition probability of performing  actions given that -th
query was formulated [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]:
 (  = |− 1 = , − 1 =  − 1) =
where the first event in a user search session is a query formulation followed by either a user
action or a new query reformulation. After  events, the transition probability is calculated
using the Markov transition matrix   of -th order.
        </p>
        <p>In the basic approach, the random variable  represents the action/state (regardless whether
it’s a query or an action) and the transition probability from state  to  is  ( = |− 1 = ).
The Markov property (i.e., future state depends only on the current one) is also preserved. For
the conditional approach, we add a second condition, i.e., the future action always depends on
the current action and query, which makes it necessary to distinguish queries  and actions
. We then have  ( = |− 1 =  − 1, − 1 = ) and  ( = |− 1 = , − 1 =  − 1)
as transition probability.</p>
        <p>We compute the transition probabilities based on these two definitions (i.e., Eq. 2 and 3)
by looking at the current action and the current formulated query when trying to model the
probability to perform another action or formulate another query.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Contextual Markov model</title>
        <p>
          During a search session, a user performs diferent search actions to find documents that fulfill
his/her information need. The technique that we propose here aims to categorise users into
diferent groups based on their search behaviour. Search tasks are commonly divided into two
major types of user’s behaviour [
          <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
          ]: (1) "exploratory search" where users tend to explore
the search result list exhaustively, rephrase queries or extensively use filtering operations on
the results, and (2) "known-item" where users only investigate the first few results and rephrase
their queries quickly to match their need.
        </p>
        <p>In this approach, we utilise search-type context that splits the training data into smaller
portions. More precisely, we build a first-order Markov model for each search type (i.e., exploratory
search and known-item) by adopting context and compare the simulation’s performance to the
Markov model built from the whole data. This would allow us to evaluate the impact of context
on the accuracy of simulated sessions.</p>
        <p>
          Kumaripaba et al. [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] extended the work of Marchionini [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] and provided a few simple
indicators of information search behaviours to categorise users into exploratory and known-item
searchers. They showed that the most distinctive indicators that characterise exploratory search
behaviours are query length, maximum scroll depth, and task completion time. They proved
empirically that exploratory tasks can be distinguished within in the first query session by these
indicators. Following their findings, we identify known-item search sessions by fine-tuning
their indicators to fit our dataset and capping the first query duration to 200 seconds, the dwell
duration of the first three actions to 120 seconds, the cumulative actions tally in the first query
iteration to 4 actions and the the task completion time to 800 seconds. The remaining user
sessions are considered as exploratory search.
        </p>
        <p>The transition probability between any two states  and  is modelled similarly to (Eq. 1).</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Time-aware Markov model</title>
        <p>Our time-aware Markov model addresses time distribution between actions by taking the dwell
time spent between each transition into account. We assume the existence of a distinctive
1 (︂</p>
        <p>which can be estimated from the training data. With the
, ln( ) −  ( ) ≈
a realistic manner.
|− 1 = ) =
training data.</p>
        <p>gamma distribution parameters estimated to fit the data, we aim to represent the dwell times in</p>
        <p>
          Similarly to the Basic Markov, time-aware Markov approach models the transition probability
between any two states  and  using maximum likelihood estimation as follows:  ( =
, , where , is the number of times we observed a transition from
state  to  and  is the total number of transitions from state  to any other state in the
 ∼
follows [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]:  ( ; ;  ) =
distribution that governs how much time a user needs to move from state  to state  for each
possible transition  → . Each transition time is collected from training data and used to
estimate the time distribution of that transition. In order to calculate this time distribution, we
utilise the gamma distribution which is governed by two parameters [
          <xref ref-type="bibr" rid="ref23 ref24">23, 24</xref>
          ].
        </p>
        <p>Let  be the random variable that models time between transitions.  is gamma-distributed
where  is the scale parameter and  the shape parameter.  is denoted by: ( ,  ) =
Γ( ,  ). The probability density function of gamma distribution can be expressed as
1
Γ( )  − 1− / ,  &gt; 0; ,  &gt; 0.</p>
        <p>Let  be a set of independent observations ( 1, ...,  ) of transition times between two
states ( , ). The likelihood function is:
( ;  ) = ∏︁  ( ; ;  )

=1
and thus, the maximum likelihood function with respect to  is (obtained by setting the derivative
of the log of equation (4) to zero) ^ =
∑︀=1  , and similarly the maximum with respect to
(4)</p>
        <p>In this experiment, we seek to describe user action sequences while taking dwell time into
consideration. We use the maximum likelihood to estimate the parameters of the gamma
distribution for every transition from the log data. We then used the most likely transition time
as a feature.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Query-change Markov model</title>
        <p>
          Users tend to generate search sessions with borderline behaviours and mixed characteristics. In
fact, users may have precise search goals, yet the search process is not straightforward. Several
studies have shown that the query change which occurs in a search session is heavily related
to the contents of previously viewed search results [
          <xref ref-type="bibr" rid="ref26 ref27 ref28">26, 27, 28</xref>
          ]. Therefore, looking into how
users change their query from − 1 to  can partially explain how they express their dynamic
information need in the session. As consequence, such terms may be used for categorising users
into finer groups to better simulate search-type specific sessions.
        </p>
        <p>
          In this approach, we expand the findings of Huang et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] and adopt user’s session-level
query change process as an implicit indicator for categorising search behaviour and estimating
the change in user’s information need. We develop a syntactic taxonomy by analysing the query
change Δ as the syntactic change which occurs between two consecutive queries − 1 and :
Δ =  − − 1.
        </p>
        <p>
          Let  () be the bag of words for  and , − 1 two successive queries. A query  can be
decomposed into three main parts in relation to − 1, namely theme, added and removed terms
and written as follows:  = Δ= + Δ+ − Δ− , where: Δ= = { |  ∈  (),  ∈
 (− 1)}; Δ+ = { |  ∈  (),  ̸∈  (− 1)}; Δ− = { |  ̸∈  (),  ∈
 (− 1)}. We refer to the common terms that appear in both query  and − 1 as theme
terms and denote it (Δ=) as they usually represent the main thematic of a session. Added
terms (Δ+) represent the new terms that users add to the previous query and removed terms
(Δ− ) refer to the terms that users delete from the previous query as they seek to improve bad
performing queries. Theme terms (Δ=) are generated using the longest common subsequence
(LCS) approach [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] in both queries , − 1. The LCS can be the common part or parts of
two queries as long as the subsequence appears in both queries in the same relative order
but not necessarily continuous. For instance, 1 : 1 →− 2 is a search session where 1 =
"tire recycling technique in Europe" and 2 = "tire recycling facilities in Europe". Δ= = "tire
recycling in Europe", Δ+ = "facilities" and Δ− = "technique".
        </p>
        <p>In query-change Markov model, we employ queries as states. Specially, we define the pair
(  = ,  = ) in a way that it represents when a user performs  action (i.e., keeping,
adding or removing query terms) after formulating an -th query. The transition probability of
formulating -th query given that  actions were performed in the (  − 1) -th query is modelled
as follows:</p>
        <p>Similarly to the first type of transition in the conditional approach, we compute the transition
probabilities based on the first definition (Eq. 2) by looking at the current query-change (i.e.,
adding, removing or keeping query terms) and the current formulated query when trying to
model the probability to formulate another query.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>
          We use Sowiport1 User Search Session data set (SUSS) [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] for our experiments. The digital
library provides records and information for the social sciences in English and German. We used
data that was collected over a period of one year (from April 2014 to April 2015). The dataset
contains over 480,000 individual search sessions and around 8 million log entries. we filter logs
to remove any session that does not contain a query (i.e., users having searched nothing) or
has invalid query annotations. In order to describe users’ search actions, SUSS ofers a list of
58 diferent actions that covers all user’s activities while interacting with the interface of the
digital library (e.g., formulating a query, clicking on a document, viewing the full document’s
content, selecting a facet, using search filters, etc.). For each user interaction, a session id, date
stamp, length of the action and other additional information are stored to describe user’s path
during the search process.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Evaluation Metrics</title>
        <sec id="sec-4-2-1">
          <title>4.2.1. Kolmogorov-Smirnov-based Evaluation</title>
          <p>
            Several statistical tests exist to evaluate the goodness of fit of two statistical models such as
Likelihood ratio and the Person chi-square [
            <xref ref-type="bibr" rid="ref32 ref33 ref34">32, 33, 34</xref>
            ]. In this work, we utilise the
KolmogorovSmirnov goodness-of-fit test [
            <xref ref-type="bibr" rid="ref35">35</xref>
            ] with its two-sample test (KS-2) variant. The advantage of this
test is that it is one of the most useful and non-parametric methods of comparing two sample
distributions. Essentially, we test the null hypothesis that the two independent samples belong
to the same continuous distribution and proceed with calculating the absolute value of the
distance between two data samples which we refer to as the test statistic  to compare their
distribution for similarities. The test statistic  is calculated as follow:
 = sup |1() − 2()|

(6)
where, 1 and 2 are observations from the first and second sample respectively. 1() and
2() are the empirical distributions of the first and second sample, obtained from n1 and n2
observations. We then compare the test statistic value against the critical value derived from
the KS-2 table value to either accept or reject the hypothesis. The null hypothesis is accepted
when the test statistic value is less than the table value and rejected otherwise.
          </p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Classification-based Evaluation</title>
          <p>In addition to the KS-2 test, we define a classification-based evaluation to (1) evaluate the
simulation performance of our models and (2) calculate their improvement in comparison to
the baseline.</p>
          <p>In order to evaluate how good our model simulates user search behaviour, we first develop a
set of features that represent the sequentiality of a user search session in the form of a feature
vector. Then we train a classifier to distinguish simulated sessions from real log data sessions
and report the results.</p>
          <p>
            Inspired by the findings in [
            <xref ref-type="bibr" rid="ref36">36</xref>
            ] about what kinds of engineered features are best suited
to various machine learning model types, we develop a set of features that represent the
sequentiality of the search session (i.e., query search types; search types; advanced search types;
clicking, viewing and exporting actions) and discard those that only describe the user’s overall
search behaviour (e.g., tally of search actions, queries formulation and clicks).
          </p>
          <p>We used one-cold-encoding to indicate the presence of a feature (i.e., (0) if present and (1) if
not) and ordered features in the sequence (i.e., _feature where  refers to the sequence order of
the query in a session , e.g., 1_search, 2_view_record). The resulting feature vector has a higher
dimensionality (170) than the raw feature categories (58) due to the inclusion of the sequence
order in the binary feature vector. Fig.1 gives an example of a user session in the form of a
feature vector.
[ "1_search", "1_search_change_paging", "1_view_record", "2_search", "2_search_change_paging",
"3_search", "3_view_record", "3_view_description", "end"]</p>
          <p>Another important feature that we also considered in our third approach as described in
section 3.4 is the dwell time. Dwell time in a search action refers to the amount of time that the
user spends in between the current and the future action (formulating a query, browsing a search
result page, assessing a document, end of session.. etc.). We estimated the dwell times using
gamma distribution for each state transition and added them to the list of features described
above.</p>
          <p>Each user session is converted to a feature vector, labelled and fed to a classifier. This task
was repeated separately for each of the Markov approaches defined in section 3. We combined
an equal amount of data from log sessions and simulated sessions for a balanced classification
and split our sequential dataset of user’s actions into a training set (consist of 80%), on which
we trained the classifiers, and a test set (consist of 20%), on which we evaluated the model.</p>
          <p>
            To evaluate how well the classification results generalise, we adopted 10-fold cross-validation
that we performed in all classification experiments. We ran all experiments using the simulated
sessions that we have generated from each approach and reported the average score over 10
independent runs. We used the five most popular algorithms in binary classification, namely,
Logistic Regression [
            <xref ref-type="bibr" rid="ref37">37</xref>
            ], Support Vector Machine [
            <xref ref-type="bibr" rid="ref38">38</xref>
            ], K-Nearest Neighbors [
            <xref ref-type="bibr" rid="ref39">39</xref>
            ], Decision
Trees [
            <xref ref-type="bibr" rid="ref40">40</xref>
            ], Random Forest [
            <xref ref-type="bibr" rid="ref41">41</xref>
            ] and reported the average score. We also used automated
machine learning (Auto-sklearn [
            <xref ref-type="bibr" rid="ref42">42</xref>
            ]) as it employs an ensemble of top performing models
discovered during the optimisation process and reports the best result.
          </p>
          <p>Since we are interested in finding a classifier that is close to 100% recall on the real log sessions
(i.e., successful in detecting all real log sessions) and a high recall on the simulated sessions
(i.e., good at detecting most of simulated sessions), we incorporate a bias in the classifier by
weighting the class of real data ( = 104,  = 1) as we need to penalise bad real
log sessions predictions.</p>
          <p>In order to calculate improvements of our models in comparison to the baseline, we utilise
metrics such as Precision, Recall, F-measure and Accuracy which are common for objectively
measuring the classifier’s performance. In our case, we consider True Positive (TP) to be the
scenario where the model classifies simulated sessions as simulated. A score of 0.5 (chance
level of balanced binary classification) means that the classifier cannot distinguish between
simulated and real log sessions and therefore, the simulated sessions are similar to real log data
sessions. With a score below 0.5, the classifier performs even worse than chance-level, whereas
with a score of 1, simulated sessions and log data are completely diferent.</p>
          <p>Since we can distinguish between real log and simulated sessions, reporting the accuracy
alone can obfuscate some of the performance that F-measure would highlight. F-measure tells
how precise the classifier is (i.e., how many instances it classifies correctly), as well as how
robust it is (i.e., does not miss a significant number of instances). In fact, if F-measure showed
low precision/recall along with a low accuracy, we can have better confidence in the results.
Therefore, we utilise all four metrics to demonstrate relative performance and consistency of
the results.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Results</title>
      <sec id="sec-5-1">
        <title>5.1. Kolmogorov-Smirnov-based Evaluation</title>
        <p>For this evaluation test, we used the baseline Markov model approach described in section 3.1
to simulate global user search sessions and a separate model for each approach to simulate
type-specific user search sessions. For each model, we utilise the transition probabilities between
states which are drawn from the log sessions and the simulated sessions separately to generate
two independent samples. By feeding these data points to the given formula (Eq. 6) , we got
the results reported in Table 1. The p-value provides a measure of how much we can trust
the statistical test. In general, the closest the p-value to zero is, the more reliable the test is.
Likewise, the critical value for the two samples can be approximated using the following formula:
1,2 = ( )√︁ 11+·22 where  refers to the level of significance and 1, 2 are observations
from the first and second sample respectively. In our experiment, we set 1 = 2 = 300, 000
and  = 0.01.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Classification-based Evaluation</title>
        <p>Since the KS-2 critical values are all significant, it means that our diferent approaches do not
improve the simulation or at least it is hard to quantify the improvement using a KS-2 test.
Therefore, we need to adopt a second evaluation method: we investigate whether we can train
a classifier and try to distinguish between real log and simulated sessions through controlled
scenarios. For each scenario, we simulate an equal amount of sessions as present in the log
data to balance class distribution for each approach, namely, baseline, conditional, contextual,
time-aware and query-change.</p>
        <p>We investigate whether we can distinguish between log sessions and simulated sessions that
were generated using our predefined approaches in section 3. Table 2 shows that conditional
Markov model, which simulate user actions based on the query-action approach, achieves
higher performance in comparison to the baseline with a relative improvement of 22% for
the F-measure score. This demonstrates that separating user search actions depending on the
query’s ordinal position help improving the simulation quality.</p>
        <p>Table 2 also shows that when using the contextual Markov approach, the model did better
while simulating sessions for known-item search types with an F-measure score of 0.509 in
comparison to exploratory search search types with a score of 0.568. One possible explanation
for this is that known-item sessions are probably easier to simulate since there is less variation.
The exploratory group of users generate longer sessions, thus higher total of state transitions
which results in a diverse number of simulated sessions. Using search types to discover context
helped creating simulated sessions more similar to original ones (i.e., reducing the accuracy of
the classifier). In addition, table 3a also reports low precision (i.e., 0.573 for known-item and
0.636 for exploratory)/recall (i.e., 0.487 for known-item and 0.527 for exploratory) values along
with a low accuracy score (i.e., 0.564 for known-item and 0.608 for exploratory) when using the
exploratory-known-item approach in comparison to the baseline, which indicates that we can
have better confidence in the results.</p>
        <p>
          Our initial hypothesis was that the baseline model should outperform the time-aware Markov
model since the baseline is less complex to learn (e.g., excluding dwell time features). The
less information needed to encode in the model the easier it is to get "similar" simulated data.
However, adding time improves our model’s simulation quality as we gained around 2.67%/1.21%
in term of the relative improvement over the baseline. Another hypothesis that we have tested
is to replace the estimated time distribution between actions using gamma distribution with
a calculated average of dwell time of the same state transition and feed it as a feature to the
model [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ]. We concluded that the gamma estimation scored much better than averaging the
dwell time value.
        </p>
        <p>Although these results do not undoubtfully prove that we succeeded in simulating look-alike
user search sessions, they show that users in our simulated sessions behave in a way that comes
closer to the original behaviour from the log data.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>
        In this paper we have presented various variants of Markov model-based simulation techniques
that simulate global and user-type specific behaviour models. In our approaches, we first
introduced a formal baseline that consists of exploring a first-order Markov model which we
argue to be the best fit for our dataset. Then we extended the work of Baeza-Yates et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
and presented a conditional Markov approach that is adapted for simulation modeling. Next,
we initiated the problem of learning contextual Markov models where we partitioned users
according to their navigation behaviour. In particular, we grouped users by learning a mixture
of search characteristics that categorise them into exploratory search and known-item search
types. Then we explored the work of Huang et al. [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] and adopted user’s session-level query
change to develop a syntactic Markov model that models the syntactic change which occurs
between two consecutive queries. Finally, in our time-aware approach, we modeled the dwell
time that users spend in between actions using the gamma distribution and then fed it as a
feature to our evaluation model. There are several straightforward extensions to our work.
We can use better clustering techniques for context discovery in contextual Markov models.
Another extension is to model the duration of state transition diferently.
      </p>
      <p>We performed experiments on a real-wold dataset with conditional, contextual, time-aware
and query-change Markov models and provided more empirical results showing that our Markov
approaches succeeded in simulating sessions that are deemed to be good approximations of
actual user sessions, making our Markov models approaches a suitable choice for user session
simulation.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work has been partially carried out within the project “SINIR: Simulating INteractive
Information Retrieval” funded by the DFG (grant HA 5851/3-1).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wolfram</surname>
          </string-name>
          ,
          <article-title>State digital library usability: Contributing organizational factors</article-title>
          ,
          <source>Journal of the American society for information science and technology 53</source>
          (
          <year>2002</year>
          )
          <fpage>1085</fpage>
          -
          <lpage>1097</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , X. Liu,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <article-title>Information retrieval evaluation as search simulation: A general formal framework for ir evaluation</article-title>
          ,
          <source>in: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>200</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Cadez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Heckerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Meek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. White</surname>
          </string-name>
          ,
          <article-title>Visualization of navigation patterns on a web site using model-based clustering</article-title>
          ,
          <source>in: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          ,
          <year>2000</year>
          , pp.
          <fpage>280</fpage>
          -
          <lpage>284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P. L.</given-names>
            <surname>Pirolli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Pitkow</surname>
          </string-name>
          ,
          <article-title>Distributions of surfers' paths through the world wide web: Empirical characterizations</article-title>
          ,
          <source>World Wide Web</source>
          <volume>2</volume>
          (
          <year>1999</year>
          )
          <fpage>29</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Manavoglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pavlov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Giles</surname>
          </string-name>
          ,
          <article-title>Probabilistic user behavior models</article-title>
          ,
          <source>in: Third IEEE International Conference on Data Mining</source>
          , IEEE,
          <year>2003</year>
          , pp.
          <fpage>203</fpage>
          -
          <lpage>210</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Haider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiarandini</surname>
          </string-name>
          , U. Brefeld,
          <article-title>Discriminative clustering for market segmentation</article-title>
          ,
          <source>in: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>417</fpage>
          -
          <lpage>425</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lucas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-E.</given-names>
            <surname>Portier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Laporte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>He-Guelton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Caelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Granitzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Calabretto</surname>
          </string-name>
          ,
          <article-title>Towards automated feature engineering for credit card fraud detection using multiperspective hmms</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>102</volume>
          (
          <year>2020</year>
          )
          <fpage>393</fpage>
          -
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Krogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Larsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Von Heijne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Sonnhammer</surname>
          </string-name>
          ,
          <article-title>Predicting transmembrane protein topology with a hidden markov model: application to complete genomes</article-title>
          ,
          <source>Journal of molecular biology 305</source>
          (
          <year>2001</year>
          )
          <fpage>567</fpage>
          -
          <lpage>580</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xie</surname>
          </string-name>
          , S.-
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>A large-scale hidden semi-markov model for anomaly detection on user browsing behaviors</article-title>
          ,
          <source>IEEE/ACM transactions on networking 17</source>
          (
          <year>2008</year>
          )
          <fpage>54</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>I.</given-names>
            <surname>Cadez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Heckerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Meek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. White</surname>
          </string-name>
          ,
          <article-title>Visualization of navigation patterns on a web site using model-based clustering</article-title>
          ,
          <source>in: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          ,
          <year>2000</year>
          , pp.
          <fpage>280</fpage>
          -
          <lpage>284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gao</surname>
          </string-name>
          , T.-Y. Liu,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ma</surname>
          </string-name>
          , S. He,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Browserank: letting web users vote for page importance</article-title>
          ,
          <source>in: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>451</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          ,
          <article-title>The economics in interactive information retrieval</article-title>
          ,
          <source>in: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>X.</given-names>
            <surname>Dongshan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Junyi</surname>
          </string-name>
          ,
          <article-title>A new markov model for web access prediction</article-title>
          ,
          <source>Computing in Science &amp; Engineering</source>
          <volume>4</volume>
          (
          <year>2002</year>
          )
          <fpage>34</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Deshpande</surname>
          </string-name>
          , G. Karypis,
          <article-title>Selective markov models for predicting web page accesses, ACM transactions on internet technology (TOIT) 4 (</article-title>
          <year>2004</year>
          )
          <fpage>163</fpage>
          -
          <lpage>184</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Borges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Levene</surname>
          </string-name>
          ,
          <article-title>Evaluating variable-length markov chain models for analysis of user web navigation sessions</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>19</volume>
          (
          <year>2007</year>
          )
          <fpage>441</fpage>
          -
          <lpage>452</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lucas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-E.</given-names>
            <surname>Portier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Laporte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>He-Guelton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Caelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Granitzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Calabretto</surname>
          </string-name>
          ,
          <article-title>Towards automated feature engineering for credit card fraud detection using multiperspective hmms</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>102</volume>
          (
          <year>2020</year>
          )
          <fpage>393</fpage>
          -
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiseleva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. Thanh</given-names>
            <surname>Lam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pechenizkiy</surname>
          </string-name>
          , T. Calders,
          <article-title>Discovering temporal hidden contexts in web sessions for user trail prediction</article-title>
          ,
          <source>in: Proceedings of the 22nd International Conference on World Wide Web</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>1067</fpage>
          -
          <lpage>1074</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Budd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Burns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. L'Italien</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Lapuerta</surname>
          </string-name>
          ,
          <article-title>Impact of early intervention and disease modification in patients with predementia alzheimer's disease: a markov model simulation</article-title>
          ,
          <source>ClinicoEconomics and outcomes research: CEOR</source>
          <volume>3</volume>
          (
          <year>2011</year>
          )
          <fpage>189</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <article-title>Markov models and optimization</article-title>
          , Routledge,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hurtado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mendoza</surname>
          </string-name>
          , G. Dupret,
          <article-title>Modeling user search behavior</article-title>
          , in: Third Latin American Web
          <string-name>
            <surname>Congress (LA-WEB</surname>
          </string-name>
          '
          <year>2005</year>
          ), IEEE,
          <year>2005</year>
          , pp.
          <fpage>10</fpage>
          -pp.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G.</given-names>
            <surname>Marchionini</surname>
          </string-name>
          ,
          <article-title>Exploratory search: from finding to understanding</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>49</volume>
          (
          <year>2006</year>
          )
          <fpage>41</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>K.</given-names>
            <surname>Athukorala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Głowacka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Jacucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oulasvirta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vreeken</surname>
          </string-name>
          ,
          <article-title>Is exploratory search diferent? a comparison of information search behavior for exploratory and lookup tasks</article-title>
          ,
          <source>J. Assoc. Inf. Sci. Technol</source>
          .
          <volume>67</volume>
          (
          <year>2016</year>
          )
          <fpage>2635</fpage>
          -
          <lpage>2651</lpage>
          . URL: https://doi.org/10.1002/asi.23617. doi:
          <volume>10</volume>
          .1002/asi.23617.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mun</surname>
          </string-name>
          ,
          <article-title>Understanding and choosing the right probability distributions</article-title>
          ,
          <source>Adv. Anal. Model</source>
          (
          <year>2015</year>
          )
          <fpage>899</fpage>
          -
          <lpage>917</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. L.</given-names>
            <surname>Klinkner</surname>
          </string-name>
          ,
          <article-title>Beyond dcg: user behavior as a predictor of a successful search</article-title>
          ,
          <source>in: Proceedings of the third ACM international conference on Web search and data mining</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>221</fpage>
          -
          <lpage>230</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Hogg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>McKean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Craig</surname>
          </string-name>
          , Introduction to mathematical statistics,
          <source>Pearson Education</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ni</surname>
          </string-name>
          ,
          <article-title>What afects word changes in query reformulation during a task-based search session</article-title>
          ?,
          <year>2016</year>
          , pp.
          <fpage>111</fpage>
          -
          <lpage>120</lpage>
          . doi:
          <volume>10</volume>
          .1145/2854946.2854978.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>H.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Guan</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. Zhang,</surname>
          </string-name>
          <article-title>The query change model: Modeling session search as a markov decision process</article-title>
          ,
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>33</volume>
          (
          <year>2015</year>
          ). URL: https://doi.org/10.1145/2747874. doi:
          <volume>10</volume>
          .1145/2747874.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sloan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>A term-based methodology for query reformulation understanding</article-title>
          ,
          <source>Information Retrieval Journal</source>
          <volume>18</volume>
          (
          <year>2015</year>
          )
          <fpage>145</fpage>
          -
          <lpage>165</lpage>
          . URL: http://dx.doi.org/10.1007/ s10791-015-9251-5. doi:
          <volume>10</volume>
          .1007/s10791-015-9251-5.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. N.</given-names>
            <surname>Efthimiadis</surname>
          </string-name>
          ,
          <article-title>Analyzing and evaluating query reformulation strategies in web search logs</article-title>
          ,
          <source>in: Proceedings of the 18th ACM conference on Information and knowledge management</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>77</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Hirschberg</surname>
          </string-name>
          ,
          <article-title>Algorithms for the longest common subsequence problem</article-title>
          ,
          <source>J. ACM</source>
          <volume>24</volume>
          (
          <year>1977</year>
          )
          <fpage>664</fpage>
          -
          <lpage>675</lpage>
          . URL: https://doi.org/10.1145/322033.322044. doi:
          <volume>10</volume>
          .1145/322033. 322044.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>P.</given-names>
            <surname>Mayr</surname>
          </string-name>
          ,
          <article-title>Sowiport user search sessions data set (suss)</article-title>
          ,
          <source>GESIS Datorium</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>P.</given-names>
            <surname>Billingsley</surname>
          </string-name>
          ,
          <article-title>Statistical methods in markov chains</article-title>
          ,
          <source>The annals of mathematical statistics</source>
          (
          <year>1961</year>
          )
          <fpage>12</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Hiscott</surname>
          </string-name>
          ,
          <article-title>Chi-square tests for markov chain analysis</article-title>
          ,
          <source>Journal of the International Association for Mathematical Geology</source>
          <volume>13</volume>
          (
          <year>1981</year>
          )
          <fpage>69</fpage>
          -
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>T. W. Anderson</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          <string-name>
            <surname>Goodman</surname>
          </string-name>
          ,
          <article-title>Statistical inference about markov chains</article-title>
          ,
          <source>The Annals of Mathematical Statistics</source>
          (
          <year>1957</year>
          )
          <fpage>89</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Massey</surname>
          </string-name>
          <string-name>
            <surname>Jr</surname>
          </string-name>
          ,
          <article-title>The kolmogorov-smirnov test for goodness of fit</article-title>
          ,
          <source>Journal of the American statistical Association</source>
          <volume>46</volume>
          (
          <year>1951</year>
          )
          <fpage>68</fpage>
          -
          <lpage>78</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>J.</given-names>
            <surname>Heaton</surname>
          </string-name>
          ,
          <article-title>An empirical analysis of feature engineering for predictive modeling</article-title>
          ,
          <source>SoutheastCon</source>
          <year>2016</year>
          (
          <year>2016</year>
          ). URL: http://dx.doi.org/10.1109/SECON.
          <year>2016</year>
          .
          <volume>7506650</volume>
          . doi:
          <volume>10</volume>
          .1109/ secon.
          <year>2016</year>
          .
          <volume>7506650</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Kleinbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dietz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gail</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Klein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Klein</surname>
          </string-name>
          , Logistic regression, Springer,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>W. S.</given-names>
            <surname>Noble</surname>
          </string-name>
          ,
          <article-title>What is a support vector machine?</article-title>
          ,
          <source>Nature biotechnology 24</source>
          (
          <year>2006</year>
          )
          <fpage>1565</fpage>
          -
          <lpage>1567</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Dudani</surname>
          </string-name>
          ,
          <article-title>The distance-weighted k-nearest-neighbor rule</article-title>
          ,
          <source>IEEE Transactions on Systems, Man, and Cybernetics</source>
          (
          <year>1976</year>
          )
          <fpage>325</fpage>
          -
          <lpage>327</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Quinlan</surname>
          </string-name>
          ,
          <article-title>Simplifying decision trees</article-title>
          ,
          <source>International journal of man-machine studies 27</source>
          (
          <year>1987</year>
          )
          <fpage>221</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>M.</given-names>
            <surname>Belgiu</surname>
          </string-name>
          , L. Drăguţ,
          <article-title>Random forest in remote sensing: A review of applications and future directions</article-title>
          ,
          <source>ISPRS journal of photogrammetry and remote sensing 114</source>
          (
          <year>2016</year>
          )
          <fpage>24</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>M.</given-names>
            <surname>Feurer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Klein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Eggensperger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Springenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hutter</surname>
          </string-name>
          ,
          <article-title>Auto-sklearn: eficient and robust automated machine learning</article-title>
          ,
          <source>in: Automated Machine Learning</source>
          , Springer, Cham,
          <year>2019</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>134</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>V.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maxwell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fuhr</surname>
          </string-name>
          , L. Azzopardi,
          <article-title>Personalised search time prediction using markov chains</article-title>
          ,
          <source>in: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>237</fpage>
          -
          <lpage>240</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>