=Paper= {{Paper |id=Vol-1663/bmaw2016_paper_2 |storemode=property |title=Measuring the Risk of Public Contracts Using Bayesian Classifiers |pdfUrl=https://ceur-ws.org/Vol-1663/bmaw2016_paper_2.pdf |volume=Vol-1663 |authors=Leonardo Jorge Sales,Rommel Novaes Carvalho |dblpUrl=https://dblp.org/rec/conf/uai/SalesC16 }} ==Measuring the Risk of Public Contracts Using Bayesian Classifiers== https://ceur-ws.org/Vol-1663/bmaw2016_paper_2.pdf
        Measuring the Risk of Public Contracts Using Bayesian Classifiers



                                   Leonardo J. Sales1,3 and Rommel N. Carvalho1,2
       1
           Department of Research and Strategic Information at the Brazilian Office of the Comptroller General ⇤
                            2
                              Department of Computer Science at the University of Brası́lia †
                                3
                                  Department of Economics at the University of Brası́lia ‡
                                     {leonardo.sales,rommel.carvalho}@cgu.gov.br


                        Abstract                                are the institutional means by which consumption mate-
                                                                rializes, having important role in the search for efficiency
                                                                and effectiveness of public spending.
     Bayesian Classifiers are widely used in ma-
     chine learning supervised models where there               Given the huge number of contracts and purchasing pro-
     is a reasonable reliability in the dependent vari-         cesses to audit, this context raises the challenge of act-
     able. This work aims to create a risk measure-             ing effectively in the pursuit of management problems,
     ment model of companies that negotiate with                fraud, and corruption. This is the responsibility of the
     the government using indicators grouped into               governmental control units, which specially in Brazil has
     four risk dimensions: operational capacity, his-           limited resources.
     tory of penalties and findings, bidding profile,           Take the example of the Office of the Comptroller Gen-
     and political ties. It is expected that this model         eral (CGU), the central unit of internal control of the
     contributes to the selection of contracts to be            Brazilian federal government, which is responsible for
     audited under the central unit of internal con-            auditing any transaction that represents federal spending.
     trol of the Brazilian government, responsible              The CGU should audit both spending conducted directly
     for auditing more than 30,000 public contracts             (by the central units of the ministries) as the ones con-
     per year.                                                  ducted indirectly (by almost 20,000 decentralized units),
                                                                including all payments made by any state or municipal-
                                                                ity that receives federal funds through voluntary transfers
1   INTRODUCTION                                                (Brazil, 2003). Nevertheless, the CGU has only 1,200
                                                                auditors working directly in the oversight of these ex-
Public contracts can be understood as adjustments made          penditures.
between public administration and private sector for the        In this context, a big issue arises involving the need to ra-
attainment of public interest objectives (Di Pietro, 1999).     tionalize the use of auditing capabilities. There is a clear
The contract terms are set by the governmental unit, this       need to optimize the choice of what will be effectively
being understood as any body or public authority of fed-        audited, since the complete census is impossible and un-
eral, municipal, or state level.                                economical. Acting in a preventive way to avoid future
Government spending coming from public contracts and            problems is also important since most of the errors found
direct purchases of goods and services account for ap-          generate irrecoverable damage, such as paralysis of a en-
proximately 19% of the Brazilian GDP in recent years.           gineering project or the need to redo it.
Data from the Brazilian Institute of Geography and              Both the rationalization of choices (in a subsequent op-
Statistics (IBGE), published in National Accounts Re-           eration) and the understanding and treatment of vulner-
port in the 2015 fourth quarter, quantifies in R$ 1.07 tril-    ability (in preventive action) can be analyzed within the
lion the amount of government consumption expenditure           more general concept of risk assessment. After all, what
in that year (IBGE, 2016). The bidding and procurement          is sought in both cases is to identify factors or char-
    ⇤
    SAS, Quadra 01, Bloco A, Edificio Darcy Ribeiro Brasilia,
                                                                acteristics of purchases or contracts which increase the
DF, Brazil                                                      chance of future problems such as mismanagement or
    †
    Campus Darcy Ribeiro Brası́lia, DF, Brazil                  even fraud.
    ‡
    Campus Darcy Ribeiro Brası́lia, DF, Brazil




                                                       BMAW 2016 - Page 7 of 59
Supervised learning models have been used in similar           tails of the methodology used in the study, including
problems in private sector. Financial institutions assess      the understanding of data modeling, the creation of the
the risk of potential borrowers, among many suitors with       networks, and the validation of the models. Section 4
different characteristics and history using such models,       presents and discusses the results. Finally, Section 5 pro-
in this case called credit scoring (Lessmann et al., 2015).    vides conclusions and considerations on gaps and oppor-
Insurance companies also use such statistical models to        tunities for future work.
assign the value of insurance for a certain good. The
techniques learn from the transaction history and quan-        2     THEORETICAL REFERENCES
tify the weight of certain characteristics in determining
the risk of a client or specific process. Thus, the auto in-   In this section we describe the public bidding process
surance company knows that unmarried young men offer           in Brazil, the Bayesian classifiers used for learning the
more risk than married women with children.                    predictive models, and some related works.
In practice, these models are applications of statistical
and computational techniques of regression and classi-         2.1    PUBLIC BIDDING IN BRAZIL
fication using databases that have information of trans-
action history and labeled cases of “success” and “fail-       The whole process of buying products or hiring services
ure” (Friedman et al., 2001). A good condition in the          in the Brazilian federal government takes place accord-
construction of this type of risk analysis model is the ex-    ing to the rules of Law 8666/1993 (Brazil, 1993), called
istence of information on transaction history, with vari-      Procurement Law. Other regulatory acts complement
ables representing different characteristics of each trans-    this law, such as Law 10520/2005 (Brazil, 2002), estab-
action. Thus, one can distinguish and identify correla-        lishing the types of Auction and Complementary Law
tions between groups.                                          123/2006 (Brazil, 2006) establishing privileges for mi-
                                                               cro and small businesses in bidding. Law 8666/1993
This paper proposes to create a predictive model of risk       (Brazil, 1993) details the stages of the bidding process
in contracts based on Bayesian classifiers. It will re-        itself, the bidding types allowed, types of contracts, as-
sult in the quantification of the propensity that a sup-       pects of qualification of companies, and also defines ad-
plier has problems in government contracts, according          ministrative and criminal penalties to be applied to sup-
to the company’s characteristics. Learning models using        pliers in case of noncompliance.
Bayesian networks are especially useful when you need
to organize or discover the knowledge of a particular area     The Procurement Law, together with other mentioned
through the construction of cause and effect relationships     legislation, defines the following administrative penalties
captured from a set of data (Spiegelhalter et al., 1993).      to suppliers, due to total or partial non-performance of
Besides this, Bayesian Classifiers have been incorpo-          contracts:
rated into risk measurement studies, especially when it
is important to capture and explain the relationships of           • warning;
cause and effect between the different prediction param-           • pecuniary penalty;
eters, avoiding the “black box” issue, common in other
techniques.                                                        • temporary suspension of bid;
The model will be used to select high-risk contracts to            • declaration of non-trustworthiness; and
be audited by the CGU and will be based on the estima-
                                                                   • impediment to bid and hire.
tion of the relations of cause and effect between various
indicators that are related to the propensity of contrac-      The whole process of procurement and contracting in the
tual risk. The dependent variable is the occurrence of         federal government is done using the government’s Gen-
more severe punishment that can be given to a supplier         eral Services Administration System (SIASG). Each pur-
in Brazil: the impediment to bidding. The indicators           chase or contract is recorded in this system, since the
that will be used as predictors represent characteristics      opening of the process to the issue of commitment.
grouped into four risk dimensions: operational capacity,
history of penalties and findings, bidding profile, and po-    Existing since 1994, the SIASG started to be used by the
litical ties.                                                  government gradually and it already has more than 5 mil-
                                                               lion purchases. All federal administration is required to
This work is divided into 5 sections. Besides this in-         use this system. Annually it records over 700,000 bids.
troduction, Section 2 presents the theoretical framework       Some of these bids representing continued provision of
that supports the central idea of the work and the method-     services or delivery of goods turns into contracts, gener-
ological approach adopted. Section 3 contains the de-          ating nearly 30,000 new contracts per year.



                                                      BMAW 2016 - Page 8 of 59
2.2        BAYESIAN CLASSIFIERS MODELS                                  • the logarithm of the Bayesian Dirichlet equivalent
                                                                          score (bde); and
Since Bayesian networks (BNs) have been successfully
used in classification problems – e.g., see (Sahami et al.,             • the logarithm of the modified Bayesian Dirichlet
1998; Friedman et al., 1997; Goldszmidt et al., 2010;                     equivalent score (mbde).
Friedman and Goldszmidt, 1996; Cheng and Greiner,
1999; Ceccon et al., 2014; Ye et al., 2014; Shi et al.,             2.3    RELATED WORKS
2013) –, we decided to experiment with different BN
learning algorithms in order to classify the companies              Many studies use supervised learning models in order to
that sell service and goods to the government with high             predict risk in business transactions. The area where it
likelihood of noncompliance.                                        is more common this type of approach is the bank credit
                                                                    (Lessmann et al., 2015; Hand and Henley, 1997).
Score-based learning is a popular method for inducing
BNs. The main idea is to assign a score to a model based            These learning models attempt to quantify how the char-
on how well it represents the data set used for learning.           acteristics of potential borrowers influence the probabil-
Thus, the purpose of the algorithm is to maximize the               ity of default. Classically, the techniques most used for
goodness-of-fit score.                                              this purpose are Logistic Regression and Discriminant
                                                                    Analysis (Ghodselahi, 2011). Other studies have been
In this work we use standard and well-known Bayesian                testing and comparing some modern techniques (Baesens
network classifiers, which are aimed at classification.             et al., 2002). In other areas, such as insurance, such mod-
More specifically, we use two algorithms available in the           els are also widely used.
bnlearn R package1 (Scutari, 2009):
                                                                    Bayesian Classifiers have been incorporated into these
                                                                    studies, especially when you want to capture and explain
  • Naı̈ve Bayes (naive.bayes): a simple algorithm
                                                                    the relationships of cause and effect between the differ-
    that assumes that all explanatory variables are in-
                                                                    ent prediction parameters, avoiding the “black box” is-
    dependent of each other. In other words, the target
                                                                    sue, common in other techniques (Jiang and Wu, 2009;
    variable is the only parent of all other variables.
                                                                    Zonneveldt et al., 2010; Baesens et al., 2002). Bayesian
  • Tree-Augmented Naı̈ve Bayes (tree.bayes): an al-                algorithms provide more clear insights when modeling
    gorithm that relaxes the simple Naı̈ve Bayes as-                causal relationships.
    sumption of independence, by allowing the explana-
                                                                    A new approach to credit scoring by synthesizing Sim-
    tory variables to have one other variable as parent
                                                                    ple Naı̈ve Bayesian Classifier (SNBC) and the Rough Set
    besides the target one.
                                                                    Theory is presented by (Jiang and Wu, 2009). A compar-
                                                                    ison between Naı̈ve Bayes (NB) models, different aug-
Besides that, we also tried two different score-based               mented NB models, and a handcrafted causal network is
learning algorithms, which are also available in the                made by (Zonneveldt et al., 2010).
bnlearn R package used in this work (Scutari, 2009):
                                                                    In the context of public procurement, some initiatives al-
  • Hill-Climbing (hc): a hill climbing greedy search               ready exist in order to implement similar models in pre-
    on the space of the directed graphs.                            dicting irregularities or contractual problems. For exam-
                                                                    ple, Naı̈ve Bayes algorithms are used by (Balaniuk et al.,
  • Tabu Search (tabu): a modified hill-climbing able               2012) in an unsupervised approach to quantify the com-
    to escape local optima.                                         bined risk of private companies and government units in
                                                                    the execution of contracts.
The bnlearn package implements random restart with
                                                                    (Sales, 2014) built a model with the same objective of
configurable perturbing operations for both algorithms.
                                                                    this work (to measure the risk of public contracts) and
A number of different scores were used to fine tune the             with similar data. In that case the accuracy using Logistic
models learned from the score-based algorithms and to               Regression and Decision Tree were compared, resulting
improve their performance, which are also available in              in the best accuracy of 64%.
the bnlearn package (Scutari, 2009):
                                                                    3     METHODS AND PROCEDURES
  • the Akaike Information Criterion score (aic);
                                                                    The first step in building the Bayesian classification
  • the Bayesian Information Criterion score (bic);
                                                                    model was the definition of the criteria for characteriza-
      1
          The package is available at http://www.bnlearn.com/.      tion of the companies with the highest risk (the “Bad”).



                                                            BMAW 2016 - Page 9 of 59
In this sense, we chose to characterize the “Bad” group             • Bidding profile: company profile when participat-
all companies that suffered the following punishments in              ing in bids, as the average quantity of offers, and the
the years 2015 and 2016: temporary suspension of bid,                 degree of success of business (percentage of wins).
declaration of non-trustworthiness, and impediment to
bid and hire. The group of low-risk companies (here-                     – Quantity of indicators: 12.
inafter “Good”) are companies with existing contracts in                 – Examples of indicators: quantity of purchases,
the same period but without such punishment.                               purchase quantity of items, average amount of
                                                                           offers, number of units of the federation, num-
The database used contained 1,448 companies, of which                      ber of wins, percentage of victory, value of
724 were previously classified as “Bad” and other 724                      contracts, the difference in days between the
previously classified as “Good”2 .                                         opening of the company and the first participa-
From this initial setting, the second step was the creation                tion in a public procurement.
of risk indicators, which cover the past of relations be-
tween companies and government, considering the pe-                 • Political ties: company relationship with politi-
riod since 2011, as well as other information that are                cians, via donations in campaigns.
independent of the period, such as those from the reg-                   – Quantity of indicators: 01.
istry of companies. The idea is to answer the following
                                                                         – Examples of indicators: amount donated in po-
question: What happened in the recent past of the com-
                                                                           litical campaigns.
panies that contributed to its contractual default in 2015
and 2016?
                                                                  The next step was the transformation of all variables in
These indicators were obtained from the four dimensions           factors (categories), using a simple process of discretiza-
of risk: operational capacity, history of penalties and           tion, where values of each variable were divided into
findings, bidding profile, and political ties. The mean-          three intervals of equal size. Once complete, the database
ing of each of the risk dimensions and some indicators            has been divided in training set (70%) and test (30%).
used are described below:                                         The discretization was carried out due to the limitation
                                                                  of some algorithms used. In future experiments, we will
  • Operational capacity: irregularities related to the           learn models using algorithms that allows continous vari-
    existence or insufficient physical and operational            ables.
    structure of the contracted company.
                                                                  At first, we used standard Bayesian classifiers available
        – Quantity of indicators: 11.
                                                                  in the bnlearn R package, Naı̈ve Bayes (NB) and Tree-
        – Examples of indicators: number of employ-               Augmented Naı̈ve Bayes (TAN).
          ees, number of partners, the total amount re-
          ceived from the government, amount received             As the database does not have a very large number of ob-
          from the government per employee, value re-             servations, we used a process of estimation with cross-
          ceived from the government for partner, av-             validation in the training subset for both algorithms. The
          erage salary of employees, average salary of            Cross-Validation procedure applied was the random di-
          the partners, company size, number of activi-           vision of training based on 10 sample partitions of equal
          ties carried out by the company, age from the           size, for use in cycles of modeling where 9 partitions are
          company.                                                used for training and one for testing. Error measures are
                                                                  then combined to have a single measurement error.
  • History of penalties and findings: pre-existence of
    punishment or audit findings related to the com-              The estimation with cross-validation was performed us-
    pany.                                                         ing a Score-based learning algorithm, which ranks the
                                                                  network structures created with emphasis on model fit.
        – Quantity of indicators: 04.                             In these algorithms, various parameters can be adjusted
        – Examples of indicators: quantity of received            in search of the best results forecast.
          punishments, number of alerts generated in
          CGU monitoring.                                         The loss function used to measure the model results was
    2                                                             the misclassification, where the dependent variable value
      The 724 companies in the “Bad” group are all companies
that meet the criteria described for this class. The 724 compa-   is the result of local distributions (from its parents) and
nies in “Good” group was obtained by sampling in the set of       the error function is measured by coincidence or not with
41,000 companies that meet the requirements described. Sam-       the actual values (hit rate).
pling the second group was made in order to solve the dominant
class issue, in a process called undersampling (see (Japkowicz    Since an important aspect of machine learning is the pa-
et al., 2000) for more details of this process).                  rameter tuning and both NB and TAN in bnlearn do



                                                        BMAW 2016 - Page 10 of 59
not have any parameters to be tuned, we decided to also         performance with the test set. The 95% confidence in-
try another set of algorithms. In bnlearn, a set of al-         terval of the accuracy was (0.69, 0.77), which shows that
gorithms that allow many different configurations is the        the model generalizes well. The sensitivity of the model
score-based learning algorithms, namely: Hill-Climbing          (prediction ability of “BAD” companies) was 76%. Ta-
(HC) and Tabu Search (Tabu), both using incremental             ble 2 shows the results of prediction on the test set.
search. Tabu introduces changes in HC in order to avoid
local optima.
                                                                Table 2: The table shows the model results, that pre-
In score-based algorithms, it is critical to set the network    sented a total accuracy of 73%, with higher quality in
score calculation method, which measures the quality of         the identification of “Bad” cases.
the network created using the quantification of poste-
                                                                                              Real values
rior probability. Two variables were used in the score
parameterization: type of score and penalty parameter.            Prediction           Good               Bad
The tested scores types were AIC (Akaike Information                 Good               170                47
Criterion Score), BIC (Bayesian Information Criterion                 Bad                69               148
Score), BDE (Bayesian Dirichlet Equivalent Score), and                 %          71% (specificity) 76% (sensitivity)
MBDE (Modified Bayesian Dirichlet Equivalent Score),
suitable for categorical variables. Besides that, we also
                                                                We consider this a good result in the context of gov-
tried many different penalty parameters.
                                                                ernment contracts, especially when compared with other
The central idea was to try different values of each pa-        similar works. Taking as reference the results obtained
rameter in order to find the setting that present the best      by (Sales, 2014), you can see a reasonable gain in predic-
predictive ability. For better understanding, Table 1           tive ability. The sensitivity of the model is particularly
shows some of these tested settings and its accuracy mea-       important since what really matters is the identification
sure, aiming to compare the Naı̈ve Bayes (NB) algorithm         of high-risk cases, even assuming the cost of auditing
setting with different configurations3 of Score-Based al-       some low risk contracts, which were misclassified.
gorithms.
                                                                5   CONCLUSION AND FUTURE WORK
Table 1: The table shows that despite our efforts in set-
ting up the Score-Based Algorithms, there was no sig-           This work is consistent with a great effort that has been
nificant difference than the Naı̈ve Bayes and TAN algo-         developed by government control institutions to rational-
rithms. The Accuracy here is the proportion of true re-         ize the use of their human and material resources in order
sults, either true positive or true negative.                   to provide more effective results at lower operating and
                                                                financial costs.
    Algorithm      Setting           Accuracy - 95% CI
                                                                Considering the current Brazilian context, where a se-
       NB              -                (0.70, 0.76)            vere economic crisis has been treated through large cuts
      TAN              -                (0.72, 0.78)            in public budgets (reducing the sending of resources to
      Tabu        AIC, K=0.1            (0.67, 0.73)            control bodies), the efficient use of resources should be a
      Tabu       BDE, ISS=25            (0.65, 0.70)            permanent goal.
       HC       MBDE, ISS=10            (0.64, 0.70)
                                                                The attempt to use statistical models based on Bayesian
                                                                networks is in addition to other initiatives presented in
                                                                Section 2. The main purpose of these studies is to extract
4        RESULTS                                                knowledge from various databases that government con-
                                                                trol institutions have access in order to facilitate the se-
Since the best models did not present a statistically sig-      lection of audit objects more likely to present problems.
nificant difference in performance and usually the sim-         The classification results are slightly better than other su-
pler the model the better the generalization, we chose          pervised models applied in government databases with
the Naı̈ve Bayes algorithm to run the final model with          the same goal (see (Sales, 2014), described in section
all the data from the training set in order to check the        2.3). However, we believe that there is room for im-
     3                                                          provement in two possible ways: the inclusion of new
     The parameters used to set the algorithm were the score-
based algorithm, Hill-Climbing (HC) or Tabu Search (Tabu),      indicators that capture aspects ignored by this model and
the score types (AIC, BIC, BDE or MBDE) and the penalty         the use of optimization algorithms in the parameteriza-
parameter (ISS or K).                                           tion of score-based networks.



                                                      BMAW 2016 - Page 11 of 59
Each step in direction of improving these models is a per-      http://link.springer.com/article/
manent gain for the public auditing activity, and conse-        10.1023/A%3A1007465528199.
quently to society.                                           Ahmad Ghodselahi. A hybrid support vector machine
                                                                ensemble model for credit scoring. International Jour-
References                                                      nal of Computer Applications, 17(5):1–5, 2011.
                                                              Moises Goldszmidt, James J. Cochran, Louis A.
Bart Baesens, Michael Egmont-Petersen, Robert
                                                               Cox, Pinar Keskinocak, Jeffrey P. Kharoufeh,
  Castelo, and Jan Vanthienen. Learning bayesian
                                                               and J. Cole Smith.    Bayesian network classi-
  network classifiers for credit scoring using markov
                                                               fiers. In Wiley Encyclopedia of Operations Re-
  chain monte carlo search. In Pattern Recognition,
                                                               search and Management Science. John Wiley &
  2002. Proceedings. 16th International Conference on,
                                                               Sons, Inc., 2010. ISBN 9780470400531. URL
  volume 3, pages 49–52. IEEE, 2002.
                                                               http://onlinelibrary.wiley.com/doi/
Remis Balaniuk, Pierre Bessiere, Emmanuel Mazer, and           10.1002/9780470400531.eorms0099/
  Paulo Cobbe. Risk based Government Audit Plan-               abstract.
  ning using Nave Bayes Classifiers. In Advances in
                                                              David J Hand and William E Henley. Statistical clas-
  Knowledge-Based and Intelligent Information and En-
                                                                sification methods in consumer credit scoring: a re-
  gineering Systems, 2012. URL https://hal.
                                                                view. Journal of the Royal Statistical Society: Series
  archives-ouvertes.fr/hal-00746198/.
                                                                A (Statistics in Society), 160(3):523–541, 1997.
Brazil. Lei n 8666, de 1993, 1993.                            IBGE. Indicadores do Instituto Brasileiro de Geografia
Brazil. Lei n 10520, de 2002, 2002.                             Estatstica, Contas Nacionais Trimestrais, 2016.
Brazil. Lei n 10683, de 2003, 2003.                           Nathalie Japkowicz et al. Learning from imbalanced
                                                                data sets: a comparison of various strategies. In AAAI
Brazil. Lei Complementar n 123, de 2006, 2006.
                                                                workshop on learning from imbalanced data sets, vol-
S. Ceccon, D.F. Garway-Heath, D.P. Crabb, and                   ume 68, pages 10–15. Menlo Park, CA, 2000.
  A. Tucker. Exploring early glaucoma and the visual
                                                              Yi Jiang and Li Hua Wu. Credit scoring model based
  field test: Classification and clustering using bayesian
                                                                on simple naive bayesian classifier and a rough set.
  networks. IEEE Journal of Biomedical and Health In-
                                                                In 2009 International Conference on Computational
  formatics, 18(3):1008–1014, May 2014. ISSN 2168-
                                                                Intelligence and Software Engineering, 2009.
  2194. doi: 10.1109/JBHI.2013.2289367.
                                                              Stefan Lessmann, Bart Baesens, Hsin-Vonn Seow, and
Jie Cheng and Russell Greiner. Comparing bayesian               Lyn C. Thomas. Benchmarking state-of-the-art clas-
   network classifiers.  In Proceedings of the Fif-             sification algorithms for credit scoring: An up-
   teenth Conference on Uncertainty in Artificial In-           date of research.     European Journal of Opera-
   telligence, UAI’99, page 101108, San Francisco,              tional Research, 247(1):124–136, November 2015.
   CA, USA, 1999. Morgan Kaufmann Publishers Inc.               ISSN 03772217.         doi: 10.1016/j.ejor.2015.05.
   ISBN 1-55860-614-9. URL http://dl.acm.                       030. URL http://linkinghub.elsevier.
   org/citation.cfm?id=2073796.2073808.                         com/retrieve/pii/S0377221715004208.
Maria Sylvia Zanella Di Pietro. Direito administrativo,       Mehran Sahami, Susan Dumais, David Heckerman, and
 volume 22. Atlas São Paulo, 1999.                            Eric Horvitz. A bayesian approach to filtering junk
Jerome Friedman, Trevor Hastie, and Robert Tibshi-             e-mail. In Learning for Text Categorization: Papers
   rani.    The elements of statistical learning, vol-         from the 1998 workshop, volume 62, page 98105,
   ume 1. Springer series in statistics Springer, Berlin,      1998.
   2001. URL http://statweb.stanford.edu/                     Leonardo Jorge Sales. Risk prevention brazilian gov-
   ˜tibs/book/preface.ps.                                       ernment contracts using credit scoring. In Interdisci-
Nir Friedman and Moises Goldszmidt. Building clas-              plinary Insights on Fraud, chapter 11, pages 264–286.
   sifiers using bayesian networks. In Proceedings of           Cambridge Scholars Publishing, 2014.
   the national conference on artificial intelligence, page   Marco Scutari. Learning bayesian networks with the
   12771284, 1996.                                             bnlearn r package. arXiv preprint arXiv:0908.3817,
Nir Friedman, Dan Geiger, and Moises Goldszmidt.               2009.
  Bayesian network classifiers. Machine Learning, 29          Wei Shi, Yao Wu Pei, Liang Sun, Jian Guo Wang, and
  (2-3):131–163, November 1997. ISSN 0885-6125,                Shao Qing Ren. The defect identification of LED
  1573-0565. doi: 10.1023/A:1007465528199. URL                 chips based on bayesian classifier. Applied Mechanics



                                                    BMAW 2016 - Page 12 of 59
  and Materials, 333-335:1564–1568, July 2013. ISSN
  1662-7482. doi: 10.4028/www.scientific.net/AMM.
  333-335.1564. URL http://www.scientific.
  net/AMM.333-335.1564.
David J. Spiegelhalter, A. Philip Dawid, Steffen L. Lau-
  ritzen, and Robert G. Cowell. Bayesian Analysis in
  Expert Systems. Statistical Science, 8(3):219–247,
  1993. URL http://www.jstor.org/stable/
  2245959.
Ye Ye, Fuchiang (Rich) Tsui, Michael Wagner,
  Jeremy U. Espino, and Qi Li.       Influenza de-
  tection from emergency department reports us-
  ing natural language processing and bayesian
  network classifiers.  Journal of the American
  Medical Informatics Association, pages amiajnl–
  2013–001934, January 2014.       ISSN , 1527-
  974X. doi: 10.1136/amiajnl-2013-001934. URL
  http://jamia.bmj.com/content/early/
  2014/01/09/amiajnl-2013-001934.
S Zonneveldt, K Korb, and A Nicholson. Bayesian
  network classifiers for the german credit data. Tech-
  nical report, Technical report, 2010/1, Bayesian
  Intelligence. http://www. Bayesian-intelligence.
  com/publications. php, 2010.




                                                 BMAW 2016 - Page 13 of 59