<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Science Methods and Techniques for Goods and Services Trading Taxation: a Systematic Mapping Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Douglas Silva</string-name>
          <email>douglas.bernardes@inf.ufg.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergio Carvalho</string-name>
          <email>sergiocarvalho@ufg.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Federal University of Goias</institution>
          ,
          <addr-line>Goiania-GO</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <fpage>231</fpage>
      <lpage>243</lpage>
      <abstract>
        <p>Taxation on goods and services trading operations is the main revenue source for States and Provinces around the world. Collecting such taxes, however, constantly faces a series of challenges, ranging from the incorrect filling of the tax documents involved (which leads to the incorrect calculation of the tax due) to attempts at tax fraud. As this context involves analyzing a very large amount of data, data science techniques appear as an interesting alternative to provide effective solutions to the problems that arise. This article describes a systematic mapping of the literature that aims to identify how data science methods and techniques have been applied to this context and how the problems inherent in this domain are being handled. Results show that there are very well-defined categories of problems being researched in this area, and that data science can efficiently be used to improve the collection of these types of taxes.</p>
      </abstract>
      <kwd-group>
        <kwd>value-added tax</kwd>
        <kwd>goods and services tax</kwd>
        <kwd>sales tax</kwd>
        <kwd>data science</kwd>
        <kwd>systematic mapping study</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Taxes are compulsory financial charges imposed by the government on an individual or legal-entity
taxpayer in order to fund public expenditures
        <xref ref-type="bibr" rid="ref27 ref28 ref29 ref36 ref42">(Mathews et al. 2018, Rad et al. 2015)</xref>
        . They are regulated by
specific laws that describe their composition, their collection and compliance processes and, where
needed, even how the resulting revenue is to be applied.
      </p>
      <p>
        Taxation on the sale of goods and provision of services is the main source of revenue for most
states and provinces around the world and is therefore an important kind of indirect tax. It is
applied to each link in the consumption chain and generates, at each transaction, one or more
tax documents with a complete record of the items and parties involved in the transaction
        <xref ref-type="bibr" rid="ref50">(Yu et al. 2019)</xref>
        , including their tax classification and the tax rate due. This information is usually registered in an
electronic invoice.
      </p>
      <p>
        Trading operations taxation has been implemented in different ways, but usually as a
noncumulative tax due proportionally from each taxpayer that composes the consumption chain. Some
taxes of this nature are Sales Tax in the United States
        <xref ref-type="bibr" rid="ref7">(Buxton et al. 2019)</xref>
        , ICMS (which stands for Tax
on Circulation of Goods and Services) in Brazil, GST (Goods and Services Tax) in countries like Australia,
Canada, Singapore and, more recently, India
        <xref ref-type="bibr" rid="ref27 ref28 ref29 ref36">(Mathews et al. 2018, Mehta et al. 2018)</xref>
        , and variations of VAT
(Value-Added Tax), which is used in most countries, such as China
        <xref ref-type="bibr" rid="ref50">(Yu et al. 2019)</xref>
        and those of the European Union.
      </p>
      <p>
        Tax law applied to goods and services trading, however, in addition to being complex, constantly
changes, and the taxpayer is not always up to date on the tax rules applicable to each product he
sells, or to each service he is willing to provide
        <xref ref-type="bibr" rid="ref25">(Lahann et al. 2019)</xref>
        . Tax benefits and exemptions are
also granted seasonally and for a specific period of time to specific segments of taxpayers, and all of
these possibilities directly impact the tax bookkeeping declared by all of them.
      </p>
      <p>
        These situations allow taxpayers, intentionally or unintentionally, to cause losses to the public
treasury and consequently undermine the provision of public services to citizens. It then becomes
necessary not only to collect the taxes, but also to verify whether taxpayers have paid them properly and to proceed with
debt collection when necessary
        <xref ref-type="bibr" rid="ref1">(Abe et al. 2010)</xref>
        .
      </p>
      <p>
        The analysis of tax compliance information is currently a tax auditor's responsibility. However, the limited
human resources available, combined with the volume of information generated, make
conventional procedures ineffective and inefficient (Wang 2012). It is necessary to direct the auditor's
focus away from formalities and toward signs of anomalies or fraud
        <xref ref-type="bibr" rid="ref5">(Basta et al. 2009)</xref>
        .
      </p>
      <p>Although technological development has enabled the automation of operational processes,
analysis of massive amounts of data aimed at identifying anomalies, inconsistencies and behavior
patterns for detecting evidence of fraud and tax non-compliance is still a challenge.</p>
      <p>Methods traditionally used to solve the aforementioned problems are time-consuming, costly and
imprecise, and in a big data scenario they become impractical.</p>
      <p>
        Although governments have long analyzed tax data, and analytics, AI and modern
technology help them do it better, big data in this domain is recent. GST itself, for example, was implemented
in India only in 2017
        <xref ref-type="bibr" rid="ref11">(Das et al. 2017)</xref>
        , and electronic invoicing was made mandatory for Italian
companies only in January 2019
        <xref ref-type="bibr" rid="ref4">(Bardelli et al. 2020)</xref>
        . Problems related to data characteristics (such as
volume, inconsistency and incompleteness) are hence also recent, and mapping how
computing deals with this domain has become necessary.
      </p>
      <p>Data science models and strategies are, in general, useful in contexts like this one. However,
their applicability varies according to the characteristics of the available data. Therefore, it is necessary
to identify which techniques could be used and for which reasons, and also to identify aspects of
these models and strategies that have not yet been addressed.</p>
      <p>This article presents a systematic mapping of the literature that seeks to understand the
domain of tax collection in goods and services trading operations and how data science has been
used to solve problems that emerge in this context, identifying a possible consensus or good
practices in handling these situations. It is important to note, however, that this is a
systematic mapping and not a systematic review. Its main objective is to map the domain area, the
characteristics of its datasets and how they influence researchers' choice of techniques,
so that it can help to guide future researchers. The techniques themselves, and how they handle the
problems found in this domain, would be better explored in a systematic review, with research
questions aimed at this end.</p>
      <p>The remainder of the paper is organized as follows. Section 2 describes the review protocol,
including the research questions, the search strategy and the study selection criteria. Section 3 presents
the results of the systematic mapping performed, while Section 4 analyzes these results. Section 5
discusses the results found and our perceptions of them. Section 6 addresses threats to validity, and
Section 7 presents the final considerations on the performed procedure.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Review Protocol</title>
      <p>
        This systematic mapping of literature followed the procedure described in
        <xref ref-type="bibr" rid="ref40">(Petersen et al. 2008)</xref>
        . As
part of the process, a research protocol was defined, which is detailed in the following subsections.
      </p>
      <sec id="sec-2-1">
        <title>2.1 Research Questions and Search Strategy</title>
        <p>This mapping sought to establish the state of the art in scientific research conducted in the field of
data science in the domain of tax data from goods and services trading operations. The specific
review questions addressed were:
1) What problems in the domain of goods and services trading taxation have been studied in
the area of data science?
2) What types of techniques and learning strategies have been applied?
3) Which data sources are used in the analysis?
4) Do the selected attributes vary according to the region / location (where the problem
occurs)?
5) Which datasets are used?
6) How big are these datasets?
7) Has the volume of data been a complicating factor for the analysis?
8) How has the problem of data volume been dealt with in this context?</p>
        <p>From the main keywords identified in these research questions, an initial string was defined and
calibrated through a pilot search in the IEEE Xplore, ACM and Scopus digital libraries, in order to
reduce the likelihood of bias.</p>
        <p>The assessment also took into account that taxes with these characteristics are called Sales Tax in the
United States, GST in India (among other countries) and VAT in the European Union and China. By
adding these three variations, the search appears to reach all, or at least most, of the targeted publications.</p>
        <p>The evaluation of pilot search results led to the following search string:
("value-added tax" OR "goods and services tax" OR "sales tax") AND
("data science" OR "artificial intelligence" OR "data mining" OR "machine learning"
OR "neural network")</p>
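        <p>As a minimal sketch, the search string above can be assembled from its two groups of terms. The function and variable names below are illustrative assumptions, and each digital library may require adjustments to its own query syntax.</p>
        <preformat>
```python
# Term groups copied from the search string above.
TAX_TERMS = ["value-added tax", "goods and services tax", "sales tax"]
METHOD_TERMS = [
    "data science",
    "artificial intelligence",
    "data mining",
    "machine learning",
    "neural network",
]


def or_group(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_search_string(*groups):
    """Join the OR-groups with AND (hypothetical helper, for illustration)."""
    return " AND ".join(or_group(group) for group in groups)


SEARCH_STRING = build_search_string(TAX_TERMS, METHOD_TERMS)
print(SEARCH_STRING)
```
        </preformat>
        <p>Keeping the tax names and the technique names in separate lists makes it easier to test variations of either group during a pilot search.</p>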
        <p>
          After defining the string, we selected the publication databases most commonly used to perform systematic
mappings and reviews of the literature in the area of Software Engineering
          <xref ref-type="bibr" rid="ref13">(Dyba et al. 2005)</xref>
          ,
namely: ACM Digital, IEEE Xplore, ScienceDirect, SpringerLink and Scopus.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Criteria for Study Selection</title>
        <p>Once primary studies were obtained from the aforementioned research sources, inclusion and
exclusion criteria were applied to them in order to select those clearly relevant to the objective of the
systematic mapping.</p>
        <p>Studies were considered eligible if they had tax collection in the trade of goods and
services as their motivation and as the scenario for implementing or validating the proposed method, or
if their method was applicable to a similar context.</p>
        <p>Selected studies were also evaluated for their relevance (they should bring up data science
techniques) and formality. Publications that did not meet the
aforementioned eligibility criteria were excluded from the review, as were:
• Papers that do not propose the use of data science methods or techniques to solve a problem
found in the mentioned domain;
• Papers that do not present the method proposed to solve the problem;
• Publications that have not been subjected to peer review;
• Publications that are not in English or Portuguese;
• Publications whose full text is unavailable;
• Repeated publications.</p>
        <p>The number of excluded papers and the reason for each exclusion were recorded as the
articles were evaluated.</p>
        <p>
          The process for selecting studies followed the one proposed by
          <xref ref-type="bibr" rid="ref37">Meline (2006)</xref>
          :
• Step 1 (screening): eligibility criteria were applied to the search results through a preliminary
evaluation of their title, abstract and keywords;
• Step 2: studies were then discarded if they met one or more exclusion criteria, based on
the same elements evaluated in Step 1;
• Step 3 (full text review): eligibility and exclusion criteria were then applied to the
remaining/accepted studies, now evaluating their full text.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3 Methods for Data Extraction and Study Synthesis</title>
        <p>After evaluating the full text of the accepted articles, we recorded them using a data extraction form, which
standardized the results found in each study and allowed their analysis and summarization.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>The following information was extracted from the selected articles: (i) title, authors and year of
publication; (ii) research problem; (iii) proposed data analysis technique; (iv) learning paradigm and
strategy, where applicable; (v) data sources used; (vi) datasets used; (vii) dataset volume (and inherent
problems); and (viii) gaps observed by the researchers.</p>
      <p>Following the protocol described in Section 2, we carried out a literature search on December 29,
2020, which initially returned 867 papers. Of these, 24 papers came from the IEEE Xplore digital
library, 66 from the ACM database, 258 from Scopus, 218 from the ScienceDirect digital database
and 301 from the SpringerLink database.</p>
      <p>After the initial reading of abstract, keywords and title, 71 duplicate articles were found and
discarded, and 747 articles were also rejected because they did not meet the eligibility criteria. Of
these 747, 742 were excluded because neither their motivation nor their validation scenario involved problems
related to tax collection in the trade of goods and services, not even by similarity, and another 5 were
rejected for proposing computational techniques not related to data science (such as blockchain or
ontologies) aimed at some other aspect of the mentioned tax domain.</p>
      <p>It is important to highlight that, in our view, the string contains only the terms
necessary to direct the results: three variations of what this consumption tax is called around the
world, and the names of techniques or areas that could indicate the application of data science to this
domain. To avoid false positives, even the (known) abbreviations of these taxes were removed from
the search string. Nevertheless, several articles mention, often only once, the tax itself, or how useful it
would be to use data science to deal with it. Neither that application nor the domain itself was
the focus of these articles, and whenever that happened, they were discarded.</p>
      <p>
        Thus, of the 867 articles returned, 49 remained for full-text evaluation. The full texts of these 49 articles were obtained and
evaluated, and we found that four of them should be rejected because they did not
have, as their main motivation, the improvement of tax collection in goods and services trading
operations
        <xref ref-type="bibr" rid="ref19 ref23 ref38">(Hoglund 2017, Kong et al. 2014, Krzikallová 2020, Meservy 1992)</xref>
        , and three of them
were discarded under the exclusion criterion related to the non-use of data science methods or
techniques to solve a problem found in the mentioned domain
        <xref ref-type="bibr" rid="ref2 ref6 ref9">(Akinboade et al. 2009, Bogdanov et al. 2015, Cai et al. 2011)</xref>
        . In addition, two of them were rejected for not being written in Portuguese
or English
        <xref ref-type="bibr" rid="ref17 ref8">(Cadena et al. 2019, Hasanli et al. 2014)</xref>
        . Finally, three of them could not be accessed
        <xref ref-type="bibr" rid="ref26 ref27 ref28 ref29 ref36 ref45">(Loan et al. 2018, Mathews et al. 2018, Vicente et al. 2016)</xref>
        .
      </p>
      <p>After examining the full texts, 37 articles remained. We then applied the data extraction form
defined in Section 2.3 and carried out the analyses shown below.</p>
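      <p>The selection counts reported above can be cross-checked with a short tally. This is only a sketch: the numbers are those given in the text, while the dictionary keys and variable names are our own labels.</p>
      <preformat>
```python
# Counts reported in the text for the search of December 29, 2020.
returned_by_source = {
    "IEEE Xplore": 24,
    "ACM": 66,
    "Scopus": 258,
    "ScienceDirect": 218,
    "SpringerLink": 301,
}
duplicates = 71
rejected_on_screening = {"outside the tax domain": 742, "not data science": 5}
rejected_on_full_text = {
    "tax collection not the main motivation": 4,
    "no data science method": 3,
    "not in English or Portuguese": 2,
    "full text not accessible": 3,
}

total_returned = sum(returned_by_source.values())
after_screening = total_returned - duplicates - sum(rejected_on_screening.values())
accepted = after_screening - sum(rejected_on_full_text.values())

print(total_returned, after_screening, accepted)  # prints: 867 49 37
```
      </preformat>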
    </sec>
    <sec id="sec-4">
      <title>4. Results Analysis</title>
      <p>
        The evaluated publications start, chronologically, with the proposal of
        <xref ref-type="bibr" rid="ref46">Voorhees (2006)</xref>
        to carry out
a forecast of goods and services trading revenue through neural networks. He argues that using
a neural network for this purpose is better than performing a regression analysis, since regression is limited
by its assumptions: independent variables cannot be correlated, residuals must be independent and
errors must be equally distributed.
      </p>
      <p>
        Defa and Jing (2010) and
        <xref ref-type="bibr" rid="ref7">Buxton et al. (2019)</xref>
        also present approaches to forecasting revenue from
this tax. Defa and Jing combine three prediction models (a regression equation model, a time series
model and a gray model), maximizing their combined accuracy and reaching less than 5% error.
Buxton et al., in a more recent work, also combine two models, an Auto-regressive MultiLayer
Perceptron and an LSTM, and are effective in forecasting the collection for different product
categories, such as fuels, construction and medicines.
      </p>
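      <p>To illustrate the general idea of combining forecasts, the sketch below weights each model by the inverse of its past error. This weighting rule is an assumption chosen for simplicity and is not the procedure of Defa and Jing or of Buxton et al.; the function names and the numbers are hypothetical.</p>
      <preformat>
```python
def inverse_error_weights(past_errors):
    """Weight each model by the inverse of its mean absolute past error."""
    inverse = [1.0 / max(error, 1e-12) for error in past_errors]
    total = sum(inverse)
    return [value / total for value in inverse]


def combine_forecasts(forecasts, weights):
    """Weighted average of the individual model forecasts."""
    return sum(f * w for f, w in zip(forecasts, weights))


# Hypothetical numbers, for illustration only (not taken from any cited study).
past_errors = [4.0, 2.0, 4.0]      # mean absolute error of each model
forecasts = [100.0, 110.0, 120.0]  # next-period revenue forecast of each model

weights = inverse_error_weights(past_errors)
combined = combine_forecasts(forecasts, weights)
print(weights, combined)  # prints: [0.25, 0.5, 0.25] 110.0
```
      </preformat>
      <p>The model with the smallest past error receives the largest weight, so the combined forecast leans toward the historically most accurate model.</p>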
      <p>
        The expected tax, however, does not always match the tax actually collected. The process of verifying, and
seeking, the correctness of the tax declared by the taxpayer is known as tax compliance. In this
sense,
        <xref ref-type="bibr" rid="ref25">Lahann et al. (2019)</xref>
        present an anomaly detection approach to identify
transactions that have a high probability of being associated with a false tax code (and that,
consequently, lead the taxpayer to pay an undue tax and, in most cases, a smaller one). In the same
line,
        <xref ref-type="bibr" rid="ref14">Fjeldstad et al. (2020)</xref>
        propose a model based on a decision tree that verifies whether the
taxpayer's expected behavior and documents correspond to the tax operation planned for that taxpayer.
        <xref ref-type="bibr" rid="ref32 ref33 ref34 ref35">Mehta et al. (2019)</xref>
        , to increase compliance levels, propose a regression model to identify defaulting
debtors and user-friendly Android apps to assist auditors in collecting tax. However, they also deal with
another aspect of the quest to guarantee correct collection: the verification of tax evasion. To do
this, they explore the detection and analysis of a tax evasion mechanism known as circular trading,
using advanced social network and algorithmic analytical techniques.
      </p>
      <p>
        <xref ref-type="bibr" rid="ref32 ref33 ref34 ref35">Mehta et al. (2019)</xref>
        have published a series of studies involving the analysis of tax data and the
detection of tax fraud and tax evasion behavior by the taxpayer. From their research group alone,
apparently eight other articles were selected for full-text review in this systematic mapping.
      </p>
      <p>
        <xref ref-type="bibr" rid="ref27 ref28 ref29">Mathews et al. (2018)</xref>
        had already started exploring the circular trading problem. In this type of
transaction, a group of merchants "manufactures" sales and (or) purchases among themselves,
which results in a circular flow of goods without any added value. From the point of view of the collecting
entity, the taxpayer (or the group) is entitled to a deduction from the tax to be paid, since the nature
of the tax is such that each taxpayer pays tax only on the value it added to the product. However, as
there was no genuine acquisition in the first place, this "credit" costs nothing, and in fact the taxpayer is only withholding
the tax that it would owe for selling the goods.
      </p>
      <p>To solve this problem, the entire series of articles published by the group seeks to model the
relationships between taxpayers, as well as the commercial transactions that take place between
them, in the form of a graph (where the taxpayers are the vertices and their relations, the edges),
so that machine learning models can identify patterns and outliers in these relationships.</p>
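      <p>As a simplified illustration of this graph view, the sketch below stores sales as a directed graph and searches for one cycle of taxpayers using a depth-first search. It only shows the basic structure involved; the cited works use more elaborate models, and the data layout, function name and taxpayer labels here are hypothetical.</p>
      <preformat>
```python
def find_cycle(sales):
    """Return one directed cycle of taxpayers as a list, or None if there is none.

    `sales` maps each seller to the list of buyers it issued invoices to.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for buyer in sales.get(node, []):
            state = color.get(buyer, WHITE)
            if state == GRAY:
                # Back edge: the goods return to a taxpayer already on the path.
                return path[path.index(buyer):] + [buyer]
            if state == WHITE:
                cycle = visit(buyer)
                if cycle is not None:
                    return cycle
        color[node] = BLACK
        path.pop()
        return None

    for seller in list(sales):
        if color.get(seller, WHITE) == WHITE:
            cycle = visit(seller)
            if cycle is not None:
                return cycle
    return None


# Hypothetical taxpayers A to E; A, B and C sell to each other in a loop.
sales = {"A": ["B"], "B": ["C", "D"], "C": ["A"], "D": ["E"]}
print(find_cycle(sales))  # prints: ['A', 'B', 'C', 'A']
```
      </preformat>
      <p>A cycle alone does not prove fraud, since legitimate trade can also be circular; it only marks a group of taxpayers whose transactions may deserve closer analysis.</p>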
      <p>
        In another paper by
        <xref ref-type="bibr" rid="ref27 ref28 ref29">Mathews et al. (2018)</xref>
        , the classification of suspect taxpayers is carried out in
three steps. In the first, taxpayers are clustered based on 7 correlations between variables such as tax
paid, the total amount of sales, the amount of tax paid in cash and the amount of tax-free sales. In the second,
an application of Benford's law classifies the taxpayers in each cluster as "trusted" or
"suspect". Finally, data from trusted taxpayers are used to build a linear regression model, which is
then applied to suspect taxpayers to predict the amount of tax each tends to evade in the next period.
      </p>
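      <p>Benford's law states that the leading digit d of many naturally occurring amounts appears with probability log10(1 + 1/d). The sketch below computes the mean absolute deviation between observed and expected leading-digit shares for a list of amounts. It is a generic illustration rather than the procedure of the cited papers; the function names and sample amounts are hypothetical, and no decision threshold is implied.</p>
      <preformat>
```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value):
    """First significant digit of a nonzero number."""
    mantissa = f"{abs(value):e}"  # scientific notation, e.g. '1.890000e+03'
    return int(mantissa[0])


def benford_mad(values):
    """Mean absolute deviation between observed and expected digit shares."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("at least one nonzero value is required")
    counts = Counter(digits)
    n = len(digits)
    return sum(
        abs(counts.get(d, 0) / n - benford_expected(d)) for d in range(1, 10)
    ) / 9


# Hypothetical invoice amounts, for illustration only.
amounts = [120.5, 1890.0, 23.4, 310.0, 1450.0, 99.9, 17.0, 2040.0, 560.0, 130.0]
print(round(benford_mad(amounts), 4))
```
      </preformat>
      <p>A larger deviation indicates that the amounts depart more from the expected distribution. With only a handful of values, as in this example, the deviation is dominated by sampling noise, so the measure is meaningful only for larger sets of transactions.</p>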
      <p>
        <xref ref-type="bibr" rid="ref36">Mehta et al. (2018)</xref>
        try to predict whether a taxpayer tends to declare the tax appropriately in the
next reference period. Their prediction is based on the behavior of each company's statements in previous years, on
the turnover of the current month, on the value of interactions with other taxpayers and on the
mean absolute deviation obtained from Benford's law when applied to taxpayer sales
transactions. They also use information from transport communications to make associations (every
transport of products requires this auxiliary document).
      </p>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>Table 1 shows a comparison of the accepted papers. As can be seen, there is a predominance of
unsupervised machine learning techniques in two major categories of tax problem, which are
themselves related: fraud and tax evasion.</p>
      <p>
        Evasion occurs when any action by the taxpayer leads to the non-collection by the Public
Administration of the taxes due to it. It can occur intentionally or not, but regardless it causes
damage to the treasury, and for this reason it is combated. Fraud is a more specific case of evasion,
in which the taxpayer (or a group of taxpayers) intentionally uses techniques or subterfuge to avoid
being held responsible for the purchase and sale of goods they carry out. The most prominent of
these, according to the results of systematic mapping, is Circular Trading
        <xref ref-type="bibr" rid="ref27 ref28 ref29 ref30 ref32 ref33 ref34 ref35 ref36 ref41">(Mathews et al. 2018,
Mathews et al. 2018, Mehta et al. 2019, Mehta et al. 2019, Priya et al. 2019, Mathews et al. 2021)</xref>
        .
However, there are other actions, such as the indication of a false operating address to get rid of tax
obligations — known as Residence Fraud
        <xref ref-type="bibr" rid="ref15">(Junqué de Fortuny et al. 2014)</xref>
        — and clandestine
transportation of goods without a tax document.
      </p>
      <p>As we analyze the results of the mapping, it is clear that the techniques and learning paradigms
vary widely, but in general are associated with the characteristics of the data available in each
context.</p>
      <p>When it comes to a problem that involves a history of past operations, such as audits already
carried out or collection from previous months, the paradigm is usually supervised, since the data
tend to be labeled. This is also the case for tax compliance, as it inherently requires knowing the expected
tax classification for each item and checking whether the proper rate has been assigned to it.</p>
      <p>Tax fraud or evasion cases, on the other hand, can be approached under either paradigm. If data
analysis makes use of information from audits already carried out, with proof that a certain
behavior was actually due to a "fraudulent" taxpayer, learning will be supervised and the
algorithm will use the characteristics associated with the given label to rank the next taxpayers.</p>
      <p>This is a rarer case, however, as the volume of audits performed and recorded is still small
compared to the volume of tax documents issued. Therefore, the trend observed in systematic
mapping is that the algorithms and learning techniques use the relationships between the taxpayers,
and the commercial transactions carried out by them, to identify patterns and outliers that indicate
suspicious behavior in an effective and efficient way.</p>
      <p>It is also worth noting that the use of machine learning in this domain is recent. According to the
mapping, 75% of the selected works in this area were published in the last 5 years.</p>
      <p>This is due, in part, to the fact that the tax documents processed in the operations of trade in
goods and services have only recently become electronic. In the state of Goias, for example, they have been
fully electronic only since 2018.</p>
      <p>Finally, it is necessary to highlight that the volume of tax data to be processed during the learning
process was not mentioned as a problem. However, this may be due to a fact mentioned in several
studies: fiscal secrecy prevents researchers outside Revenue agencies from having access to data
from commercial transactions, limiting the scope of the proposals.</p>
      <p>This, however, could be a new opportunity when it comes to evaluating new learning techniques,
if access to tax data is granted.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Threats to Validity</title>
      <p>
        Despite the mapping's systematic character, some aspects are threats to its validity. The main one is
due to a characteristic inherent to a systematic mapping or review: when addressing specific
research questions, and for this purpose choosing the most appropriate terms for the search string,
the search may fail to return results relevant to the purpose of the review or mapping simply because
they do not match the chosen terms
        <xref ref-type="bibr" rid="ref21">(Kitchenham et al. 2007)</xref>
        .
      </p>
      <p>For this work, we defined that one of the mandatory expressions would be value-added
tax (with its syntactic variations), due to its recurrence as a tax on operations in the trade of goods
and services in different parts of the world. However, its acronym (VAT) was not included because,
although well known, it is also associated with many unrelated expressions (such as Visceral Adipose Tissue, in
medical articles). Consequently, articles of interest to this research that use only the known acronyms
of the surveyed taxes (VAT, GST), without naming them in full, were not returned by this review.</p>
      <p>Another threat to validity is that the mapping was carried out by a single reviewer,
which may have biased the interpretation of the papers in some way.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Concluding Remarks</title>
      <p>The systematic mapping study presented here showed, within the scope of the main digital libraries
used to index studies published in the field of Computing, the state of the art of the proposed
approaches to deal with aspects related to tax collection in operations of trade in goods and services
through data science methods and techniques.</p>
      <p>The mapping showed that there are five major problems researched by the scientific community in
this context, with a greater focus on identifying and predicting tax evasion behaviors by
taxpayers, whether due to incorrect filling of tax documents or intentional attempts at tax fraud and
evasion.</p>
      <p>The mapping also showed that each of these problems requires specific data analysis methods
and techniques, and that the nature of these data leads to the choice of the appropriate learning
technique for each case. To address tax compliance (verifying if proper rate is being applied to each
product), for example, characteristics related to each tax class are labeled and a supervised learning
algorithm is needed to classify products and taxpayers. In order to detect tax evasion or fraud
attempts, such as circular trading, not only purchase and sale operations are analyzed, but also the
relationships between taxpayers, in order to identify outliers in their behavior. For this, an
unsupervised learning technique for clustering these taxpayers seems to be more suitable.</p>
      <p>Regarding the datasets used, there are two considerations. Contrary to the initial suspicion, the
volume of data was not mentioned, in general, as an issue to be handled. On the other hand,
this may be due to the fact that most returned papers found it difficult to access tax data, due to
the confidentiality involved, which limited the amount and variability of data used in the validation of
the proposed methods. It also guided, and maybe biased, the choice of the learning technique
to be used in some cases.</p>
      <p>Major implications for future research include a need for more varied analyses of taxpayer
behavior. As data is limited in amount and depth by confidentiality, only some aspects of
taxpayer behavior, such as the amount of sales and the related tax, are usually investigated. Some work has
been done on fraud techniques such as circular trading and residence fraud, but it is still limited.
Taxpayers use regulation gaps in the tax domain to commit fraud without breaking tax procedures,
and therefore without being seen as an anomaly. Tax benefits and exemptions, granted seasonally and
for a specific period of time to specific segments of taxpayers, are also a huge opportunity for tax
evaders. These exceptions and unusual behaviors must be taken into account and added to current
models for improvement and performance analysis.</p>
      <p>Furthermore, it would be interesting to systematically evaluate the techniques currently proposed to
handle tax evasion, how they cope with incomplete and inconsistent tax data and whether a
consensus emerges from them. This could properly be done with a systematic literature review focused on
data science methods and techniques specifically proposed for tax evasion and fraud behavior.</p>
      <p>Finally, what is still lacking is an evaluation of the efficiency loss caused by incomplete tax data, a consequence of the
confidentiality issue, and a definition of how to deal with this problem. This could be achieved through a
comparison of performance and effectiveness between complete and incomplete data scenarios.</p>
      <p>Defa, C., &amp; Jing, C. (2010). Construction of combination forecasting model and related validation – based on combined
forecast of sales tax and enterprise income tax in heilongjiang province. pp. 328–331.</p>
      <p>Wang, G.L. (2012). Research on sampling method of tax-checking based on neural network. pp. 1541–1546.</p>
      <sec id="sec-7-1">
        <title>About the Authors</title>
        <p>Douglas Silva
Sergio Carvalho
Douglas B. Silva is a PhD student in Computer Science at the Federal University of Goias, Goiania-GO, Brazil.
His research interests include Data Science, Artificial Intelligence, Computer Systems and E-Government.
He currently works at the Public Treasury of the State of Goias, Brazil, analyzing goods and services trading
operations data.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Abe</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Melville</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pendus</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Reddy</surname>
            ,
            <given-names>C.K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jensen</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>V.P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bennett</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>G.F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Cooley</surname>
            ,
            <given-names>B.R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kowalczyk</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Domick</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gardinier</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Optimizing debt collections using constrained reinforcement learning</article-title>
          .
          <source>Proceedings of the 16th ACM SIGKDD</source>
          . pp.
          <fpage>75</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Akinboade</surname>
            ,
            <given-names>O.A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kinfack</surname>
            ,
            <given-names>E.C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mokwena</surname>
            ,
            <given-names>M.P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumo</surname>
            ,
            <given-names>W.L.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Benchmarking tax compliance efficiency among South African retail firms using stochastic frontier approach</article-title>
          .
          <volume>32</volume>
          (
          <issue>13</issue>
          ),
          <fpage>1124</fpage>
          -
          <lpage>1146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Assylbekov</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Melnykov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bekishev</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Baltabayeva</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bissengaliyeva</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mamlin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Czarnowski</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Caballero</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Howlett</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Detecting Value-Added Tax Evasion by Business Entities of Kazakhstan</article-title>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Bardelli</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rondinelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Vecchio</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Figini</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Automatic electronic invoice classification using machine learning models</article-title>
          .
          <source>Machine Learning and Knowledge Extraction</source>
          <volume>2</volume>
          (
          <issue>4</issue>
          ),
          <fpage>617</fpage>
          -
          <lpage>629</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Basta</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Fassetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Guarascio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Manco</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Giannotti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pedreschi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Spinsanti</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Papi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pisani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>High quality true-positive prediction for fiscal fraud detection</article-title>
          . pp.
          <fpage>7</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Bogdanov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jõemets</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Siim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Vaht</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>How the Estonian Tax and Customs Board evaluated a tax fraud detection system based on secure multi-party computation</article-title>
          . vol.
          <volume>8975</volume>
          , pp.
          <fpage>227</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Buxton</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kriz</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Cremeens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jay</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>An auto regressive deep learning model for sales tax forecasting from multiple short time series</article-title>
          .
          <source>Intern. Conf. on Machine Learning and Applications</source>
          .
          pp.
          <fpage>1359</fpage>
          -
          <lpage>1364</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Cadena</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Morán</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Analysis for possible tax evasions from the value added tax in Ecuador using an stochastic model with a non-parametric technique</article-title>
          . pp.
          <fpage>428</fpage>
          -
          <lpage>438</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>The improvement on China's regional standard value added tax revenue estimate method - the construction, application and verification of standard rate model</article-title>
          . pp.
          <fpage>783</fpage>
          -
          <lpage>786</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Castellón González</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Velásquez</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Characterization and detection of taxpayers with false invoices using data mining techniques</article-title>
          .
          <volume>40</volume>
          (
          <issue>5</issue>
          ),
          <fpage>1427</fpage>
          -
          <lpage>1436</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kolya</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Sense GST: Text mining &amp; sentiment analysis of GST tweets by naive Bayes algorithm</article-title>
          . pp.
          <fpage>239</fpage>
          -
          <lpage>244</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Didimo</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Grilli</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Liotta</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Menconi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Montecchiani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pagliuca</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Combining network visualization and data mining for tax risk assessment</article-title>
          . pp.
          <fpage>16073</fpage>
          -
          <lpage>16086</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Dyba</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kitchenham</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jorgensen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Evidence-based software engineering for practitioners</article-title>
          .
          pp.
          <fpage>58</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Fjeldstad</surname>
            ,
            <given-names>O.H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kagoma</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mdee</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sjursen</surname>
            ,
            <given-names>I.H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Somville</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>The customer is king: Evidence on VAT compliance in Tanzania</article-title>
          .
          <volume>128</volume>
          ,
          <fpage>104841</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Junqué de Fortuny</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Stankova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Moeyersoms</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Minnaert</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Provost</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Martens</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Corporate residence fraud detection</article-title>
          . pp.
          <fpage>1650</fpage>
          -
          <lpage>1659</lpage>
          . KDD '14.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>González-Martel</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hernández</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Manrique-de-Lara-Peñate</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Identifying business misreporting in VAT using network analysis</article-title>
          . p.
          <fpage>113464</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Hasanli</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Agayev</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Assessment of tax evasion risks for VAT payers</article-title>
          .
          <volume>153</volume>
          (
          <issue>3</issue>
          ),
          <fpage>487</fpage>
          -
          <lpage>495</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zeng</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Attention and memory-augmented networks for dual-view sequential learning</article-title>
          .
          <source>Proceedings of the 26th ACM SIGKDD</source>
          . pp.
          <fpage>125</fpage>
          -
          <lpage>134</lpage>
          . KDD '20.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Hoglund</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Tax payment default prediction using genetic algorithm-based variable selection</article-title>
          .
          <volume>88</volume>
          ,
          <fpage>368</fpage>
          -
          <lpage>375</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Holkova</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Falat</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Statistical learning as a tool for optimizing the level of excise tax of mineral oils in Slovakia</article-title>
          .
          <volume>192</volume>
          ,
          <fpage>318</fpage>
          -
          <lpage>323</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Kitchenham</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Charters</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Guidelines for performing systematic literature reviews in software engineering</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Kleanthous</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chatzis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Gated mixture variational autoencoders for value added tax audit case selection</article-title>
          .
          <volume>188</volume>
          ,
          <fpage>105048</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Kong</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Saar-Tsechansky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Collaborative information acquisition for data-driven decisions</article-title>
          .
          <volume>95</volume>
          ,
          <fpage>71</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Krzikallová</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Tošenovský</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Is the value added tax system sustainable? The case of the Czech and Slovak Republics</article-title>
          .
          <volume>12</volume>
          (
          <issue>12</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Lahann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Scheid</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Fettke</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Utilizing machine learning techniques to reveal VAT compliance violations in accounting data</article-title>
          .
          <source>IEEE 21st Conference on Business Informatics (CBI)</source>
          . vol.
          <volume>01</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Loan</surname>
            ,
            <given-names>N.T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hac</surname>
            ,
            <given-names>L.D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Anh</surname>
            ,
            <given-names>N.V.H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Anh</surname>
            ,
            <given-names>L.H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>L.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kreinovich</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Thach</surname>
            ,
            <given-names>N.N.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Application of Statistical Methods for Tax Inspection of Enterprises: A Case Study in Vietnam</article-title>
          . pp.
          <fpage>648</fpage>
          -
          <lpage>655</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>S.V.K.V.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>An algorithmic approach to handle circular trading in commercial taxation system</article-title>
          . pp.
          <fpage>67</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>S.V.K.V.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Clustering collusive dealers in commercial taxation system</article-title>
          .
          <source>Advances in Intelligent Systems and Computing</source>
          , vol.
          <volume>869</volume>
          , pp.
          <fpage>703</fpage>
          -
          <lpage>717</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kuchibhotla</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bisht</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chintapalli</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>S.V.K.V.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Regression analysis towards estimating tax evasion in goods and services tax</article-title>
          . IEEE/WIC/ACM WI.
          pp.
          <fpage>758</fpage>
          -
          <lpage>761</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryamukhi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Link prediction techniques to handle tax evasion</article-title>
          .
          <source>8th ACM IKDD CODS and 26th COMAD</source>
          . pp.
          <fpage>307</fpage>
          -
          <lpage>315</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bisht</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryamukhi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Detecting tax evaders using TrustRank and spectral clustering</article-title>
          . vol.
          <volume>389</volume>
          LNBIP, pp.
          <fpage>169</fpage>
          -
          <lpage>183</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>S.V.K.V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryamukhi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Identifying malicious dealers in goods and services tax</article-title>
          . pp.
          <fpage>312</fpage>
          -
          <lpage>316</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryamukhi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Curtailing the tax leakages by nabbing return defaulters in taxation system</article-title>
          . vol.
          <volume>1127</volume>
          CCIS, pp.
          <fpage>183</fpage>
          -
          <lpage>195</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryamukhi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>S.V.K.V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Shivapujimath</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bisht</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Big data analytics for tax administration</article-title>
          . vol.
          <volume>11709</volume>
          LNCS, pp.
          <fpage>47</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryamukhi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>S.V.K.V.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Big data analytics for nabbing fraudulent transactions in taxation system</article-title>
          . vol.
          <volume>11514</volume>
          LNCS, pp.
          <fpage>95</fpage>
          -
          <lpage>109</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryamukhi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Predictive modeling for identifying return defaulters in goods and services tax</article-title>
          . pp.
          <fpage>631</fpage>
          -
          <lpage>637</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          <string-name>
            <surname>Meline</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Selecting studies for systematic review: inclusion and exclusion criteria</article-title>
          .
          <volume>33</volume>
          ,
          <fpage>21</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          <string-name>
            <surname>Meservy</surname>
            ,
            <given-names>R.D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Denna</surname>
            ,
            <given-names>E.L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hansen</surname>
            ,
            <given-names>J.V.</given-names>
          </string-name>
          (
          <year>1992</year>
          ).
          <article-title>Application of artificial intelligence to accounting, tax, and audit services</article-title>
          .
          <volume>4</volume>
          (
          <issue>2</issue>
          ),
          <fpage>213</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          <string-name>
            <surname>Mittal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Reich</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mahajan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Who is bogus? using one-sided labels to identify fraudulent firms from tax returns</article-title>
          .
          <source>In: Proceedings of COMPASS '18</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          <string-name>
            <surname>Petersen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Feldt</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mujtaba</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mattsson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Systematic mapping studies in software engineering</article-title>
          .
          <source>In: 12th EASE</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          <string-name>
            <surname>Priya</surname>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mathews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>S.V.K.V.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>A collusion set detection in value added tax using Benford's analysis</article-title>
          . vol.
          <volume>858</volume>
          , pp.
          <fpage>909</fpage>
          -
          <lpage>921</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          <string-name>
            <surname>Rad</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Shahbahrami</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>High performance implementation of tax fraud detection algorithm</article-title>
          . pp.
          <fpage>6</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          <string-name>
            <surname>Rahimikia</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mohammadi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rahmani</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ghazanfari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Detecting corporate tax evasion using a hybrid intelligent system: A case study of Iran</article-title>
          .
          <volume>25</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          <string-name>
            <surname>Vanhoeyveld</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Martens</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Peeters</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Value-added tax fraud detection with scalable anomaly detection techniques</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          <string-name>
            <surname>Vicente</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mateos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jiménez-Martín</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Torra</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Narukawa</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Navarro-Arribas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yañez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Complicity Functions for Detecting Organized Crime Rings</article-title>
          . vol.
          <volume>9880</volume>
          , pp.
          <fpage>205</fpage>
          -
          <lpage>216</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Neural networks and revenue forecasting: a smarter forecast?</article-title>
          <volume>1</volume>
          (
          <issue>4</issue>
          ),
          <fpage>379</fpage>
          -
          <lpage>388</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ou</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>H.Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>S.I.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yen</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Using data mining technique to enhance tax evasion detection performance</article-title>
          .
          <volume>39</volume>
          (
          <issue>10</issue>
          ),
          <fpage>8769</fpage>
          -
          <lpage>8777</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>A novel tax evasion detection framework via fused transaction network representation</article-title>
          . pp.
          <fpage>235</fpage>
          -
          <lpage>244</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Z.</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>He</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>TEDM-PU: A tax evasion detection method based on positive and unlabeled learning</article-title>
          . pp.
          <fpage>1681</fpage>
          -
          <lpage>1686</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Qiao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Shu</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Neural network based transaction classification system for Chinese transaction behavior analysis</article-title>
          .
          <source>2019 IEEE BigData Congress</source>
          . pp.
          <fpage>64</fpage>
          -
          <lpage>71</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          <string-name>
            <surname>Zha</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>TaxAA: A reliable tax auditor assistant for exploring suspicious transactions</article-title>
          .
          <source>WWW '20</source>
          . pp.
          <fpage>240</fpage>
          -
          <lpage>244</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ruan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>IRTED-TL: An inter-region tax evasion detection method based on transfer learning</article-title>
          . pp.
          <fpage>1224</fpage>
          -
          <lpage>1235</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>