<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploring Bayesian belief network for risky behavior modelling: discretization and latent variables</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A Suvorova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>TICS Lab, SPIIRAS</institution>
          ,
          <addr-line>39 14</addr-line>
        </aff>
      </contrib-group>
      <fpage>63</fpage>
      <lpage>70</lpage>
      <abstract>
        <p>Decision making in many areas relies on data about individual behavior, which is often measured using surveys. This study investigates a previously proposed approach to behavior modelling based on Bayesian belief networks, which allows behavior characteristics to be predicted from small and incomplete survey data about behavior episodes. We explored the characteristics of the models using an automatically generated dataset of 44350 cases. In our experiment, we considered three model structures and compared three discretization strategies. We found that simpler structures showed better prediction quality on all measures (average accuracy, precision, recall, F1 score). The observed difference was statistically significant but did not exceed 1%, which can be considered unimportant if the cost of an error is low. Our findings suggest that data transformation, particularly the discretization strategy for the input data, has a significant impact on prediction quality: incorporating background knowledge about distributions and theoretical assumptions about behavior led to higher prediction quality.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Decision making in many areas is based on data about individual behavior: personnel behavior [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1–3</xref>
        ],
customer behavior [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], user behavior [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], patient behavior [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. Most studies measure
behavior frequency or behavior rate [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ], using special devices, diaries or surveys. Since the diary
method (recording every episode) is extremely time-consuming, resource-intensive and sometimes hardly
feasible, surveys are especially popular in psychology, sociology and public health research (for
example, see [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]). However, surveys cannot be very detailed or very long: respondents become tired, less
attentive and unwilling to continue [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. As a result, researchers have to deal with small and
sometimes incomplete data about behavior. In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] the authors proposed a method based on data about
several behavior episodes.
      </p>
      <p>
        The analysis of this kind of data is based on Bayesian belief network (BBN) theory, which
allows empirical data to be complemented with inputs from other models and expert knowledge [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Owing
to these features, BBNs are widely used in decision making in many areas (for example, see [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]).
      </p>
      <p>
        A Bayesian belief network is a type of probabilistic graphical model that represents a set of random
variables and their conditional dependencies [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. It consists of two components: structure and
parameters. A network structure is presented in the form of a directed acyclic graph where nodes
correspond to the random variables and directed edges represent dependencies among variables.
Parameters are represented as a set of conditional probability distributions, one for each variable,
characterizing the dependencies represented by the edges [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
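      <p>For reference, the structure and the parameters together define the joint distribution through the standard Bayesian network factorization. The notation is introduced here only for illustration: X_1, ..., X_m are the network variables and pa(X_i) is the set of parents of X_i in the graph.</p>
      <disp-formula><tex-math>P(X_1, \ldots, X_m) = \prod_{i=1}^{m} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)</tex-math></disp-formula>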
      <p>
        Previous studies showed high prediction quality of BBN models for estimating risky behavior
characteristics [
        <xref ref-type="bibr" rid="ref16 ref17 ref18">16–18</xref>
        ]. This study explores the structure of the models and the influence of structural
changes on prediction quality.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Model description</title>
      <p>
        The input data for the model [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] include the lengths of the intervals between the last three episodes of risky
behavior and the lengths of the minimum and maximum intervals between episodes during a given period
of interest T. In most applications, the data about episodes are obtained from respondents’ self-reports
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In addition, the model includes a latent variable that corresponds to the number of episodes
during T, and the behavior rate, which is the key variable that we want to estimate. We assume that
for each respondent the occurrence of episodes follows a Poisson random process.
      </p>
      <p>
        Adding data about the minimum and maximum intervals decreases the influence of recent behavior
represented by the last episodes. However, combining all the data about episodes leads to a very
complicated joint distribution [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] that cannot be expressed as an elementary function.
      </p>
      <p>
        In contrast, Bayesian belief networks allow complex relationships to be described in terms of
simpler dependencies between small parts. Modelling risky behavior as a BBN makes it possible to add all
available data to the model and to include expert assumptions about the relationships between the variables
and their distributions. Moreover, existing software tools for BBN representation, visualization,
structure and parameter learning, inference and analysis, for example [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] or [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], allow researchers
to focus on describing the model while the calculations are performed automatically.
      </p>
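      <p>The structure of the BBN model is a directed acyclic graph <inline-formula><tex-math>G(V, L)</tex-math></inline-formula> with vertices
<inline-formula><tex-math>V = \{le_1, le_2, le_3, t_{\min}, t_{\max}, rate, n\}</tex-math></inline-formula> and edges (or links) <inline-formula><tex-math>L = \{(u, v) : u, v \in V\}</tex-math></inline-formula>, where rate is the random
variable for the behavior rate; <inline-formula><tex-math>le_i</tex-math></inline-formula> is the random variable for the length of the interval between the (i-1)-th and i-th
episodes from the end (0 corresponds to the interview moment); <inline-formula><tex-math>t_{\min}</tex-math></inline-formula> and <inline-formula><tex-math>t_{\max}</tex-math></inline-formula> (min and max in figures)
are the random variables for the lengths of the minimum and maximum intervals; n is the random variable for the
number of episodes during the period of interest.</p>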
      <p>
        Previous studies [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ] proposed a BBN risky behavior model where the edges were defined
theoretically under the assumption of a Poisson random process. The corresponding structure is
shown in figure 1. The theoretical assumptions require the latent variable n to be included in the
model although it is not observed in the input data.
      </p>
      <p>
        Further studies [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] explored other structures, including data-based ones obtained with
structure-learning algorithms. In many experiments, the Hill-Climbing algorithm and other score-based methods
produced a simplified structure close to a naive Bayes classifier (figure 2). In this structure
the behavior rate is related to the number of episodes, and all other variables depend on the number of
episodes only. This structure has a simple interpretation because the number of episodes can be directly
calculated from the rate.
      </p>
      <p>The aim of this study is to explore the next step: if the number of episodes can be directly estimated
from the rate, and vice versa, when T is given, then one of these variables can be excluded, which yields
a new, even simpler model (figure 3). In this way, we estimate the influence of the latent
variable n.</p>
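      <p>The link between the rate and the number of episodes can be made explicit under the Poisson assumption used in the model. As a sketch, writing λ for the behavior rate (a symbol introduced here only for illustration):</p>
      <disp-formula><tex-math>n \mid \lambda \sim \mathrm{Poisson}(\lambda T), \qquad \mathrm{E}[n \mid \lambda] = \lambda T, \qquad \hat{\lambda} = n / T</tex-math></disp-formula>
      <p>Thus n gives a point estimate of the rate, while the rate determines n in expectation rather than exactly.</p>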
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <p>
        Since collecting real data about behavior episodes is extremely time-consuming and requires
financial and organizational resources, we explored the characteristics of the models using an
automatically generated dataset. The generation process assumes that the behavior follows a Poisson
random process: the occurrence of the next episode is independent of the previous ones, and the length of
the interval between consecutive episodes follows an exponential distribution. This assumption corresponds to
the features of real-life risky behavior and has been widely used in previous studies [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. A detailed
description of the theoretical background and the previously designed software is provided in [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. The
use of an automatically generated dataset provides an important advantage: we can compare the
theoretical (a priori given) rate with the estimated rate.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Data description</title>
        <p>In the first step we generated 1500 values of the behavior rate from a Gamma distribution (shape k = 1.2,
scale θ = 0.3). The parameter values were chosen to obtain a dataset that corresponds to
“real-life” risky behavior: in most cases (94% of the cases in our dataset) the behavior rate was less than 1 and
produced about one episode every 3–4 days.</p>
        <p>In the next step we generated 30 “respondents” (sequences of behavior episodes) for each rate
value over a period of 365 days, which gave 45000 sequences of episodes in total. Then we
calculated the input data for the model: the lengths of the minimum and maximum intervals between episodes and
the lengths of the intervals between the last three episodes. After deleting incomplete cases (e.g. cases with
only one episode during the 365 days), the final dataset included 44350 cases (or “respondents”).</p>
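        <p>The generation procedure described above can be sketched in Python as follows. This is a minimal illustration and not the software used in the study: the function names, the dictionary keys, the random seed and the rule that a case needs at least three episodes to be complete are assumptions made for the example.</p>
        <preformat>
```python
import random


def simulate_respondent(rate, period=365.0, rng=random):
    """Simulate one respondent; return the model inputs, or None if incomplete."""
    times = []
    t = 0.0
    while True:
        # Exponential waiting time to the next episode of a Poisson process.
        t += rng.expovariate(rate)
        if t > period:
            break
        times.append(t)
    # Assumption: at least three episodes are needed to define le1, le2 and le3.
    if len(times) >= 3:
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        return {
            "le1": period - times[-1],  # last episode to the interview moment
            "le2": gaps[-1],            # between the last two episodes
            "le3": gaps[-2],            # between the second and third from the end
            "tmin": min(gaps),
            "tmax": max(gaps),
            "n": len(times),
        }
    return None


def generate_dataset(n_rates=1500, per_rate=30, shape=1.2, scale=0.3, seed=0):
    """Draw rates from a Gamma distribution and simulate respondents for each rate."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n_rates):
        rate = rng.gammavariate(shape, scale)  # alpha is the shape, beta is the scale
        for _ in range(per_rate):
            case = simulate_respondent(rate, rng=rng)
            if case is not None:
                case["rate"] = rate
                cases.append(case)
    return cases
```
        </preformat>
        <p>Calling generate_dataset() with the default arguments reproduces the sizes given in the text (1500 rates, 30 sequences each); the number of complete cases depends on the seed and on the completeness rule assumed here.</p>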
        <p>To estimate the prediction quality of the models, we randomly selected 5000 cases for the test dataset and
did not use them in model learning.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Discretization strategies</title>
        <p>As mentioned in Section 2, we conducted a series of experiments to explore the influence of the
latent variable n (the number of behavior episodes during the period of interest T) on the prediction
quality of the model. Hence, we considered three different model structures (figures 1–3). Since we
explored discrete Bayesian networks, the discretization strategy can influence the prediction quality
measures, so in this study we compared three different discretization strategies. A detailed
description of each strategy is provided below. To make the strategies comparable, we used the same number of
intervals for all of them.</p>
        <p>The first discretization is based on breaks provided by experts. The general idea is to make smaller
intervals for more frequent values. In addition, this discretization uses more interpretable breaks. For
example, a rate between 0.5 and 1 means that there was one behavior episode every 1–2 days. Interpretability
becomes even more important for the variables that correspond to interval lengths: a value of le1 between 7 and 14
means that the last behavior episode was 1–2 weeks ago. The use of week / month / year notation
can simplify questionnaire design for real-world applications.</p>
        <p>The second discretization strategy is quite simple: we divided the possible range into intervals of
equal length and then added a last interval with infinity as the upper limit. We included background
knowledge about behavior characteristics in this strategy too: the infinite interval started at 1 for the rate
variable (since we assumed that most cases had a rate of less than 1) and at 180 for the
time-length variables (since we assumed at least two behavior episodes).</p>
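        <p>A minimal Python sketch of the equal-length strategy is given below. The function names are hypothetical, and the use of eight intervals is an assumption taken from the 8-class accuracy mentioned in the conclusion; inputs are assumed to be non-negative.</p>
        <preformat>
```python
import bisect
import math


def equal_length_breaks(upper, n_bins):
    """Breaks for n_bins - 1 equal intervals on [0, upper) plus one open-ended interval."""
    n_equal = n_bins - 1
    return [upper * i / n_equal for i in range(n_equal + 1)] + [math.inf]


def assign_bin(value, breaks):
    """Return the 0-based index of the left-closed interval that contains value."""
    index = bisect.bisect_right(breaks, value) - 1
    return min(index, len(breaks) - 2)


# Assumed number of intervals: 8 (seven equal intervals plus the open-ended one).
rate_breaks = equal_length_breaks(1.0, 8)    # open-ended interval starts at 1
time_breaks = equal_length_breaks(180.0, 8)  # open-ended interval starts at 180 days
```
        </preformat>
        <p>With these breaks the first two rate intervals end at 2/7, which is about 0.29, so a strongly right-skewed rate distribution concentrates most cases in very few classes.</p>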
        <p>The last strategy uses intervals of equal probability rather than equal length and is based on quantile
calculation. In the experiment we used sample quantiles, but for real-world problems where the rate
distribution is unknown it is possible to use theoretical quantiles based on distribution assumptions.</p>
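        <p>The equal-probability strategy can be sketched with sample quantiles from the Python standard library. The function names are hypothetical, and the interpolation rule is the default of statistics.quantiles, which may differ from the quantile definition used in the study.</p>
        <preformat>
```python
import bisect
import statistics


def quantile_cuts(sample, n_bins):
    """Return the n_bins - 1 inner cut points that split the sample into equal-probability bins."""
    return statistics.quantiles(sample, n=n_bins)


def assign_quantile_bin(value, cuts):
    """Return the 0-based bin index; a value equal to a cut point goes to the upper bin."""
    return bisect.bisect_right(cuts, value)
```
        </preformat>
        <p>For example, quantile_cuts(rates, 8) returns seven cut points, and every resulting class then holds roughly one eighth of the sample regardless of how skewed the distribution is.</p>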
        <p>An example of variable discretization according to the three strategies is shown in figure 4.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Experiment design</title>
        <p>
          First, we applied the described discretization strategies to the training dataset, which consisted of 39350
cases in total. In each iteration we randomly selected 10000 cases from the training dataset. Then we
learned the parameters of the Bayesian belief network for all three model structures described in
Section 2: 1) the initial model, 2) the simplified model with fewer links and 3) the model without the
n variable. Thus, we had nine different settings in each iteration (all possible combinations of three
model structures and three discretization strategies). Finally, we used the test dataset with the corresponding
discretization to estimate prediction quality (accuracy, precision, recall, F1 score). All measures were
calculated as multi-class classification metrics (average accuracy, macro precision, macro
recall, macro F1 score) [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]. To summarize the results, we repeated the experiment 50 times.
        </p>
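        <p>The multi-class measures can be sketched as below. This is an illustration under stated assumptions, not the code used in the study: average accuracy is taken as the mean of the per-class one-vs-rest accuracies, macro F1 is computed from macro precision and macro recall, and a class with no predicted or no true cases contributes zero to the corresponding average.</p>
        <preformat>
```python
def macro_metrics(y_true, y_pred, labels):
    """Average accuracy and macro precision, recall and F1 for a multi-class problem."""
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    acc_sum = prec_sum = rec_sum = 0.0
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        tn = total - tp - fp - fn
        acc_sum += (tp + tn) / total
        # Assumption: an undefined ratio (empty denominator) counts as zero.
        prec_sum += tp / (tp + fp) if tp + fp else 0.0
        rec_sum += tp / (tp + fn) if tp + fn else 0.0
    k = len(labels)
    precision = prec_sum / k
    recall = rec_sum / k
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": acc_sum / k, "precision": precision, "recall": recall, "f1": f1}
```
        </preformat>
        <p>Because the per-class accuracy counts true negatives, average accuracy can be high even when F1 is low, which is worth keeping in mind when the two measures are compared.</p>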
        <p>
          The calculations and statistical analysis were performed using R [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]; in particular, for Bayesian
network analysis we used the bnlearn [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] and gRain [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] packages.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>The average accuracy over 50 iterations is shown in figure 5. The discretization with equal
probabilities gave the highest accuracy for all model structures, while the discretization with
equal interval lengths was the worst approach.</p>
      <p>The F1 score showed the same pattern: the highest values for the discretization with equal
probabilities and the lowest for the one with equal interval lengths (figure 6). The differences in F1 score were even more
pronounced: it was about 0.6 for the equal-probability strategy, about 0.54 for expert-defined breaks and only
0.16–0.22 for the equal-length strategy. The latter is explained by further analysis of the precision and
recall measures. The discretization with equal lengths had poor prediction quality for all classes
except the first two. Owing to the extremely skewed distribution (see figure 4), none of the models,
regardless of structure, predicted rates higher than 0.3, so all predictions fell into the first two
classes.</p>
      <p>Mean quality measures and their standard deviations (SD) for all model structures and discretization
strategies are summarized in table 1. Since the discretization with equal lengths showed similar
results for the initial structure and the structure without the n variable, and slightly better but still poor
results for the simplified model (the structure in figure 2), we focused on the other discretization strategies.</p>
      <p>To compare all quality measures among the proposed models, we ran pairwise t-tests for multiple
comparisons with Bonferroni correction. With expert-defined discrete intervals, the simplified model had
statistically significantly higher prediction quality measures than both the initial model and the model
without the n variable (table 1 (mean values) and table 2 (p-values)). There was no
statistically significant difference in quality measures between the initial model and the model without the latent
n variable.</p>
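      <p>The Bonferroni correction itself is simple to state: each raw p-value is multiplied by the number of comparisons and capped at 1. A minimal sketch follows; the function name is hypothetical, and with three model structures there would be three pairwise comparisons per measure.</p>
      <preformat>
```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```
      </preformat>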
      <p>With equal-probability discretization, both the model with the simplified
structure and the model without the n variable outperformed the initial model. The
simplified model also had significantly higher precision than the model without the n variable,
with the same levels of the other measures.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>
        Risky behavior modelling and the estimation and prediction of behavior parameters have become the focus of
studies in many research areas. Increasing attention is given to the investigation of online
behavior and the prediction of social media users’ characteristics [
        <xref ref-type="bibr" rid="ref29 ref30">29, 30</xref>
        ], but traditional
survey-based studies continue to be promising, helpful and efficient.
      </p>
      <p>The current study explores a previously proposed behavior model based on Bayesian belief
networks, which allows incomplete data about behavior episodes to be processed and behavior
characteristics to be predicted.</p>
      <p>Our findings suggest that data transformation, particularly the discretization strategy for the
input data, has a significant impact on prediction quality: incorporating background knowledge about distributions
and theoretical assumptions about behavior led to higher prediction quality.</p>
      <p>We found that for the same period of interest the simpler structures showed better prediction
quality, but it is important to note that the difference in quality did not exceed 1% (for example,
89.7% vs 89.9% for average 8-class accuracy with equal-probability discretization). The effect of this
difference can be assessed only in practical settings: sometimes the cost of an error is high; in other
circumstances it is negligible.</p>
      <p>In general, the proposed model showed high prediction quality and has great potential for
analyzing real-life behavior problems.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was partially supported by RFBR according to research projects No. 16-31-60063
and No. 18-01-00626.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Hasanbeigi</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Menke</surname>
            <given-names>C</given-names>
          </string-name>
          and
          <string-name>
            <surname>Du Pont</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2010</year>
          <article-title>Barriers to energy efficiency improvement and decision-making behavior in Thai industry</article-title>
          <source>Energy Efficiency</source>
          <volume>3</volume>
          (
          <issue>1</issue>
          ) pp
          <fpage>33</fpage>
          -
          <lpage>52</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>van Ryn</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burgess</surname>
            <given-names>DJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dovidio</surname>
            <given-names>JF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phelan</surname>
            <given-names>SM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saha</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malat</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Griffin</surname>
            <given-names>JM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fu</surname>
            <given-names>SS</given-names>
          </string-name>
          and
          <string-name>
            <surname>Perry</surname>
            <given-names>S</given-names>
          </string-name>
          <year>2011</year>
          <article-title>The impact of racism on clinician cognition, behavior, and clinical decision making</article-title>
          <source>Du Bois Review: Social Science Research on Race</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ) pp
          <fpage>199</fpage>
          -
          <lpage>218</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Rubin</surname>
            <given-names>EV</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kellough</surname>
            <given-names>JE</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Does civil service reform affect behavior? Linking alternative personnel systems, perceptions of procedural justice, and complaints</article-title>
          <source>Journal of Public Administration Research and Theory</source>
          <volume>22</volume>
          (
          <issue>1</issue>
          ) pp
          <fpage>121</fpage>
          -
          <lpage>141</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Mohan</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sivakumaran</surname>
            <given-names>B</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sharma</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2013</year>
          <article-title>Impact of store environment on impulse buying behavior</article-title>
          <source>European Journal of Marketing</source>
          <volume>47</volume>
          (
          <issue>10</issue>
          ) pp
          <fpage>1711</fpage>
          -
          <lpage>1732</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Farzan</surname>
            <given-names>R</given-names>
          </string-name>
          and
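          <string-name>
            <surname>Brusilovsky</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Encouraging user participation in a course recommender system: An impact on user behavior</article-title>
          <source>Computers in Human Behavior</source>
          <volume>27</volume>
          (
          <issue>1</issue>
          ) pp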
          <fpage>276</fpage>
          -
          <lpage>284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Beutel</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Akoglu</surname>
            <given-names>L</given-names>
          </string-name>
          and
          <string-name>
            <surname>Faloutsos</surname>
            <given-names>C</given-names>
          </string-name>
          <year>2015</year>
          <article-title>Fraud detection through graph-based user behavior modelling</article-title>
          <source>Proc. of the 22nd ACM SIGSAC Conf. on Computer and Communications Security</source>
          ACM pp
          <fpage>1696</fpage>
          -
          <lpage>1697</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Amundsen</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nordøy</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lingen</surname>
            <given-names>KE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sørlie</surname>
            <given-names>T</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bergvik</surname>
            <given-names>S</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Is patient behavior during consultation associated with shared decision-making? A study of patients' questions, cues and concerns in relation to observed shared decision-making in a cancer outpatient clinic</article-title>
          <source>Patient Education and Counseling</source>
          <volume>101</volume>
          (
          <issue>3</issue>
          ) pp
          <fpage>399</fpage>
          -
          <lpage>405</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Hojilla</surname>
            <given-names>JC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koester</surname>
            <given-names>KA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            <given-names>SE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buchbinder</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ladzekpo</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matheson</surname>
            <given-names>T</given-names>
          </string-name>
          and
          <string-name>
            <surname>Liu</surname>
            <given-names>AY</given-names>
          </string-name>
          <year>2016</year>
          <article-title>Sexual behavior, risk compensation, and HIV prevention strategies among participants in the San Francisco PrEP demonstration project: a qualitative analysis of counseling notes</article-title>
          <source>AIDS and Behavior</source>
          <volume>20</volume>
          (
          <issue>7</issue>
          ) pp
          <fpage>1461</fpage>
          -
          <lpage>1469</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Chiauzzi</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodarte</surname>
            <given-names>C</given-names>
          </string-name>
          and
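          <string-name>
            <surname>DasMahapatra</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2015</year>
          <article-title>Patient-centered activity monitoring in the self-management of chronic health conditions</article-title>
          <source>BMC Medicine</source>
          <volume>13</volume>
          (
          <issue>1</issue>
          ) p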
          <fpage>77</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Ehlers</surname>
            <given-names>AP</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Drake</surname>
            <given-names>FT</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kotagal</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simianu</surname>
            <given-names>VV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Achar</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agrawal</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joslyn</surname>
            <given-names>SL</given-names>
          </string-name>
          and
          <string-name>
            <surname>Flum</surname>
            <given-names>DR</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Factors influencing delayed hospital presentation in patients with appendicitis: the APPE survey</article-title>
          <source>Journal of Surgical Research</source>
          <volume>207</volume>
          pp
          <fpage>123</fpage>
          -
          <lpage>130</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Hardigan</surname>
            <given-names>PC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popovici</surname>
            <given-names>I</given-names>
          </string-name>
          and
          <string-name>
            <surname>Carvajal</surname>
            <given-names>MJ</given-names>
          </string-name>
          <year>2016</year>
          <article-title>Response rate, response time, and economic costs of survey research: a randomized trial of practicing pharmacists</article-title>
          <source>Research in Social and Administrative Pharmacy</source>
          <volume>12</volume>
          (
          <issue>1</issue>
          ) pp
          <fpage>141</fpage>
          -
          <lpage>148</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Tulupyeva</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paschenko</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tulupyev</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krasnoselskikh</surname>
            <given-names>T</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kazakova</surname>
            <given-names>O</given-names>
          </string-name>
          <year>2008</year>
          <source>HIV risky behavior models in the context of psychological defense and other adaptive styles</source>
          <publisher-name>Nauka</publisher-name>
          , SPb
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Pearl</surname>
            <given-names>J</given-names>
          </string-name>
          <year>2000</year>
          <source>Causality: Models, Reasoning, and Inference</source>
          <publisher-name>Cambridge University Press</publisher-name>
          ,
          <publisher-loc>Cambridge</publisher-loc>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Barton</surname>
            <given-names>DN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benjamin</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cerdan</surname>
            <given-names>CR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DeClerck</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Madsen</surname>
            <given-names>AL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rusch</surname>
            <given-names>GM</given-names>
          </string-name>
          and
          <string-name>
            <surname>Villanueva</surname>
            <given-names>C</given-names>
          </string-name>
          <year>2016</year>
          <article-title>Assessing ecosystem services from multifunctional trees in pastures using Bayesian belief networks</article-title>
          <source>Ecosystem Services</source>
          <volume>18</volume>
          pp
          <fpage>165</fpage>
          -
          <lpage>174</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Darwiche</surname>
            <given-names>A</given-names>
          </string-name>
          <year>2009</year>
          <article-title>Modelling and reasoning with Bayesian networks</article-title>
          Cambridge: Cambridge University Press
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Suvorova</surname>
            <given-names>A</given-names>
          </string-name>
          <year>2013</year>
          <article-title>Socially significant behavior modeling on the base of super-short incomplete set of observations</article-title>
          <source>Information-measuring and Control Systems</source>
          <volume>9</volume>
          (
          <issue>11</issue>
          ) pp
          <fpage>34</fpage>
          -
          <lpage>38</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Suvorova</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tulupyev</surname>
            <given-names>AL</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sirotkin</surname>
            <given-names>AV</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Bayesian belief networks for risky behavior rate estimates</article-title>
          .
          <source>Nechetkie sistemy i myagkie vychisleniya [Fuzzy Systems and Soft Computing]</source>
          <volume>9</volume>
          (
          <issue>2</issue>
          ) pp
          <fpage>115</fpage>
          -
          <lpage>129</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Suvorova</surname>
            <given-names>A</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tulupyev</surname>
            <given-names>AL</given-names>
          </string-name>
          <year>2016</year>
          <article-title>Evaluation of the model for individual behavior rate estimate: Social network data</article-title>
          <source>XIX IEEE Int. Conf. on Soft Computing and Measurements (SCM)</source>
          IEEE pp
          <fpage>18</fpage>
          -
          <lpage>20</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Stepanov</surname>
            <given-names>DV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musina</surname>
            <given-names>VF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suvorova</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tulupyev</surname>
            <given-names>AL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sirotkin</surname>
            <given-names>AV</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tulupyeva</surname>
            <given-names>TV</given-names>
          </string-name>
          <year>2012</year>
          <article-title>Risky behavior Poisson model identification: heterogeneous arguments in likelihood</article-title>
          <source>Trudy SPIIRAN</source>
          <volume>23</volume>
          pp
          <fpage>157</fpage>
          -
          <lpage>184</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <article-title>GeNIe &amp; SMILE. Decision Systems Laboratory</article-title>
          . School of Information Sciences. University of Pittsburgh http://genie.sis.pitt.edu/
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <article-title>AgenaRisk Bayesian network tool</article-title>
          http://www.agenarisk.com
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Suvorova</surname>
            <given-names>A</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tulupyev</surname>
            <given-names>A</given-names>
          </string-name>
          <year>2016</year>
          <article-title>Learning Bayesian Network Structure for Risky Behavior Modelling</article-title>
          <source>Proc. of the 3rd Int. Scientific Conf. on Intelligent Information Technologies for Industry (IITI'16)</source>
          . Springer International Publishing pp
          <fpage>95</fpage>
          -
          <lpage>102</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Spiegelman</surname>
            <given-names>D</given-names>
          </string-name>
          and
          <string-name>
            <surname>Hertzmark</surname>
            <given-names>E</given-names>
          </string-name>
          <year>2005</year>
          <article-title>Easy SAS calculations for risk or prevalence ratios and differences</article-title>
          <source>American Journal of Epidemiology</source>
          <volume>162</volume>
          (
          <issue>3</issue>
          ) pp
          <fpage>199</fpage>
          -
          <lpage>200</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Suvorova</surname>
            <given-names>A</given-names>
          </string-name>
          <year>2015</year>
          <article-title>Test data generator for risky behavior probabilistic graphical model</article-title>
          <source>Proc. of the VIII Int. Scientific Conf. on Integrated models and soft computing in artificial intelligence</source>
          Fizmatlit
          <volume>2</volume>
          pp
          <fpage>799</fpage>
          -
          <lpage>805</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Sokolova</surname>
            <given-names>M</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lapalme</surname>
            <given-names>G</given-names>
          </string-name>
          <year>2009</year>
          <article-title>A systematic analysis of performance measures for classification tasks</article-title>
          <source>Information Processing &amp; Management</source>
          <volume>45</volume>
          (
          <issue>4</issue>
          ) pp
          <fpage>427</fpage>
          -
          <lpage>437</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <collab>R Core Team</collab>
          <year>2017</year>
          <article-title>R: A language and environment for statistical computing</article-title>
          .
          <source>R Foundation for Statistical Computing</source>
          , Vienna, Austria, http://www.R-project.org/
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Scutari</surname>
            <given-names>M</given-names>
          </string-name>
          <year>2010</year>
          <article-title>Learning Bayesian Networks with the bnlearn R Package</article-title>
          <source>Journal of Statistical Software</source>
          <volume>35</volume>
          (
          <issue>3</issue>
          ) pp
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Hojsgaard</surname>
            <given-names>S</given-names>
          </string-name>
          <year>2012</year>
          <article-title>Graphical Independence Networks with the gRain Package for R</article-title>
          <source>Journal of Statistical Software</source>
          <volume>46</volume>
          (
          <issue>10</issue>
          ) pp
          <fpage>1</fpage>
          -
          <lpage>26</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Preoţiuc-Pietro</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Volkova</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lampos</surname>
            <given-names>V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bachrach</surname>
            <given-names>Y</given-names>
          </string-name>
          and
          <string-name>
            <surname>Aletras</surname>
            <given-names>N</given-names>
          </string-name>
          <year>2015</year>
          <article-title>Studying user income through language, behaviour and affect in social media</article-title>
          <source>PLoS ONE</source>
          <volume>10</volume>
          (
          <issue>9</issue>
          ) e0138717
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Ruths</surname>
            <given-names>D</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pfeffer</surname>
            <given-names>J</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Social media for large studies of behavior</article-title>
          <source>Science</source>
          <volume>346</volume>
          pp
          <fpage>1063</fpage>
          -
          <lpage>1064</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>