<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Pizzo Calabro (VV), Italy
cauteruccio@mat.unical.it (F. Cauteruccio); e.corradini@pm.univpm.it (E. Corradini); terracina@mat.unical.it
(G. Terracina); d.ursino@univpm.it (D. Ursino); l.virgili@pm.univpm.it (L. Virgili)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An investigation on Not Safe For Work adult content in Reddit</article-title>
        <subtitle>(Discussion Paper)</subtitle>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesco Cauteruccio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Corradini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgio Terracina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenico Ursino</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Virgili</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DEMACS, University of Calabria</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DII, Polytechnic University of Marche</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Reddit is one of the few social platforms that handles NSFW (Not Safe For Work) content in an explicit and well-structured way. Despite this, the issue has been largely neglected by the researchers who have studied this social network. In this paper, we aim to contribute to this setting by proposing an approach to extract and analyze text patterns from NSFW content in Reddit. An important peculiarity of our approach is that patterns are extracted based not only on their frequency (as generally happens in the past literature), but also, and especially, on one or more utility measures.</p>
      </abstract>
      <kwd-group>
        <kwd>Reddit</kwd>
        <kwd>NSFW posts and comments</kwd>
        <kwd>Text patterns</kwd>
        <kwd>Pattern utility measures</kwd>
        <kwd>Social Network Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>[Figure 1: Workflow of our approach. Data Cleaning and Annotation: removing bot content, cleaning text, sentiment annotation, lexical annotation. Pattern Extraction and Enrichment: frequent pattern mining, filtering patterns by utility, enriching patterns with related data information. Network-based Pattern Analysis: building pattern-based and user-based networks, network analysis.]</p>
      <p>Our approach consists of three steps, namely: (i) Data Cleaning and Annotation, (ii) Pattern Extraction and Enrichment,
and (iii) Network-based Pattern Analysis. Applying our approach on Reddit allowed us to make
several contributions to this research scenario. They involve: (i) discovering that traditional
approaches to sentiment computation are unreliable in the context of NSFW adult content; (ii)
defining and finding opinion leaders in real communities sharing NSFW adult content; (iii)
discovering text patterns representing the building blocks of NSFW posts and comments on
Reddit; (iv) determining new virtual communities of users sharing NSFW adult content; (v)
identifying opinion leaders who could influence such communities.</p>
      <p>The rest of this paper is structured as follows: Section 2 provides a general description of our
approach and the dataset used for our experiments. Section 3 illustrates the Pattern Extraction
and Enrichment step. Section 4 describes the Network-based Pattern Analysis step. Finally,
Section 5 presents our conclusions and possible future developments of our research efforts.</p>
    </sec>
    <sec id="sec-2">
      <title>2. General overview of our approach</title>
      <p>The general workflow of our approach is shown in Figure 1, which highlights the three steps
composing it (i.e., Data Cleaning and Annotation, Pattern Extraction and Enrichment, and
Network-based Pattern Analysis).</p>
      <p>
        The Data Cleaning and Annotation step removes irrelevant content and standardizes text
representations. It also performs lexical (e.g., part-of-speech and named entities) and sentiment
annotations. The latter highlight the polarity of the sentiments expressed in the texts, represented
in terms of a compound score computed by applying VADER [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Due to space limitations, we do
not illustrate this step in detail in this paper.
      </p>
      <p>The Pattern Extraction and Enrichment step extracts a set of text patterns from the posts
and comments identified in the previous step; they are the basis for the next Network-based
Pattern Analysis step. To this end, it first extracts frequent patterns. Then, it associates each
pattern with a rich set of features regarding the posts and comments it derives from, as well
as the users who published them. Afterwards, it defines some utility measures and associates
the corresponding values with each pattern. Finally, it selects only those patterns with high
frequency and high utility. Our approach allows the definition of different concepts and utility
measures and, consequently, the selection of different sets of useful patterns based on them.
This allows us to analyze the available NSFW content from very different perspectives, while
adopting a uniform methodology.</p>
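      <p>The Network-based Pattern Analysis step applies the concepts and approaches of Social
Network Analysis to the patterns obtained during the previous step with the goal of extracting
information and knowledge from them. Specifically, it constructs and uses three social networks,
namely: (i) User Interaction Network, in which a node <inline-formula><tex-math>n_i</tex-math></inline-formula> represents a user <inline-formula><tex-math>u_i</tex-math></inline-formula> who published at
least one post or comment. An arc <inline-formula><tex-math>(n_i, n_j, w_{ij})</tex-math></inline-formula> denotes that <inline-formula><tex-math>u_i</tex-math></inline-formula> commented on a post of <inline-formula><tex-math>u_j</tex-math></inline-formula>; <inline-formula><tex-math>w_{ij}</tex-math></inline-formula>
indicates how many times this happened. (ii) Pattern Network, in which a node <inline-formula><tex-math>n_i</tex-math></inline-formula> represents
a pattern <inline-formula><tex-math>p_i</tex-math></inline-formula> extracted in the previous step. An arc <inline-formula><tex-math>(n_i, n_j, w_{ij})</tex-math></inline-formula> indicates that <inline-formula><tex-math>p_i</tex-math></inline-formula> and <inline-formula><tex-math>p_j</tex-math></inline-formula> were
adopted by at least one user in common; <inline-formula><tex-math>w_{ij}</tex-math></inline-formula> indicates the number of users who adopted both <inline-formula><tex-math>p_i</tex-math></inline-formula>
and <inline-formula><tex-math>p_j</tex-math></inline-formula>. (iii) User Content Network, in which a node <inline-formula><tex-math>n_i</tex-math></inline-formula> represents a user <inline-formula><tex-math>u_i</tex-math></inline-formula> who published at least
one post or comment. An arc <inline-formula><tex-math>(n_i, n_j, w_{ij})</tex-math></inline-formula> indicates that there is at least one comment posted by
<inline-formula><tex-math>u_i</tex-math></inline-formula> and at least one comment posted by <inline-formula><tex-math>u_j</tex-math></inline-formula> containing the same pattern; <inline-formula><tex-math>w_{ij}</tex-math></inline-formula> denotes the number
of times this happened. Once these networks are built, this step proceeds by applying Social
Network Analysis concepts and approaches to them in order to extract information and knowledge
on the Reddit users who publish, comment on and read NSFW adult posts, and on the content these
users exchange. Due to space limitations, in this paper we focus only on the User Interaction
Network.</p>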
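      <p>As an illustration of the Pattern Network definition, the following minimal sketch derives arc weights from a mapping between users and the patterns they adopted. The input mapping, the user names and the pattern identifiers are hypothetical and serve only as an example; this is not the implementation used in our experiments.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from itertools import combinations

# Hypothetical input: for each user, the set of patterns that user adopted.
# User names and pattern identifiers are illustrative only.
adopted = {
    "user_a": {"p1", "p2", "p3"},
    "user_b": {"p1", "p2"},
    "user_c": {"p2", "p3"},
}


def build_pattern_network(adopted_by_user):
    """Return {(p_i, p_j): w_ij}, where w_ij counts the users adopting both patterns."""
    weights = Counter()
    for patterns in adopted_by_user.values():
        # Every unordered pair of patterns adopted by this user gains one unit of weight.
        for pair in combinations(sorted(patterns), 2):
            weights[pair] += 1
    return dict(weights)


pattern_network = build_pattern_network(adopted)
```
]]></preformat>
      <p>An arc exists only for pairs of patterns with at least one user in common, so pairs that never co-occur for any user are simply absent from the resulting dictionary.</p>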
      <p>
        To perform our experiments for evaluating our approach, we downloaded a dataset from
the pushshift.io [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] website, which represents one of the main data repositories for Reddit.
Specifically, we considered 449 NSFW adult subreddits listed at https://www.reddit.com/r/
ListOfSubreddits/wiki/nsfw and extracted all posts and the corresponding comments published
on them from January 1, 2020 to March 31, 2020. The dataset contains 3,064,758 posts and
11,627,372 comments.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Pattern Extraction and Enrichment</title>
      <p>This step extracts text patterns from the posts and comments in the dataset and then enriches
them with additional information concerning their frequency and utility. Pattern mining plays
a key role in this activity. It is a well-known task in the literature, which extracts from a dataset
some (hopefully interesting and/or unexpected) information that can be understood by humans.</p>
      <p>
        Many pattern mining approaches are based on the concept of pattern frequency and aim at
identifying the most frequent patterns in the texts received as input. They are based on the
assumption that frequent patterns are interesting [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. This is true in many application contexts.
However, there are cases where it does not hold. To handle these cases, the notion of pattern
utility has been introduced. It shifts the emphasis from frequent pattern mining to High Utility
Pattern Mining (hereafter, HUPM) [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. In this case, a utility function is defined; the patterns
with a high value of this function are considered interesting. Recall that a utility function
denotes a user preference ordering over a set of choices [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. It is clearly a subjective measure
allowing us to state the usefulness of a text pattern from different perspectives, depending on
our preferences and/or needs.
      </p>
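      <p>After having introduced the concepts of frequency and utility of a pattern, we can illustrate
our approach to pattern extraction and the model on which it operates. Let <inline-formula><tex-math>C = \{c_1, c_2, \ldots, c_n\}</tex-math></inline-formula>
be a set of lemmatized comments, obtained at the end of the Data Cleaning and Annotation step.
Each comment <inline-formula><tex-math>c_j \in C</tex-math></inline-formula> corresponds to a post and is written by a Reddit user. We can represent
<inline-formula><tex-math>c_j</tex-math></inline-formula> as a set of lemmas <inline-formula><tex-math>c_j = \{l_1, l_2, \ldots, l_m\}</tex-math></inline-formula>. Therefore, if we denote by <inline-formula><tex-math>\mathcal{L} = \{l_1, l_2, \ldots, l_q\}</tex-math></inline-formula>
the set of all possible lemmas, then <inline-formula><tex-math>c_j</tex-math></inline-formula> is a subset of <inline-formula><tex-math>\mathcal{L}</tex-math></inline-formula>. From the HUPM perspective, each
lemma is an item. A pattern <inline-formula><tex-math>p</tex-math></inline-formula> is a set of items and, therefore, <inline-formula><tex-math>p \subseteq \mathcal{L}</tex-math></inline-formula>. A pattern <inline-formula><tex-math>p</tex-math></inline-formula> can occur in zero,
one or more comments in our dataset. We denote by <inline-formula><tex-math>C_p \subseteq C</tex-math></inline-formula> the subset of the comments of
<inline-formula><tex-math>C</tex-math></inline-formula> in which <inline-formula><tex-math>p</tex-math></inline-formula> is present, and by frequency of <inline-formula><tex-math>p</tex-math></inline-formula> the cardinality of <inline-formula><tex-math>C_p</tex-math></inline-formula>. The pattern <inline-formula><tex-math>p</tex-math></inline-formula> inherits the set of
features characterizing the comments of <inline-formula><tex-math>C_p</tex-math></inline-formula>, and the utility of <inline-formula><tex-math>p</tex-math></inline-formula> can be defined as an appropriate
function of all or some of these features. The choice of the features and of the utility function to
adopt determines the perspective one wishes to consider in the analysis of patterns.</p>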
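      <p>The notions of supporting comments and frequency of a pattern can be sketched as follows. Comments and patterns are modeled as sets of lemmas; the sample lemmas and the function names are assumptions made for illustration only, not part of our system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Hypothetical lemmatized comments: each comment is modeled as a set of lemmas.
comments = [
    {"love", "this", "post"},
    {"love", "post"},
    {"nice", "picture"},
    {"love", "this", "picture"},
]


def supporting_comments(pattern, comments):
    """Return the comments that contain every lemma of the pattern."""
    return [c for c in comments if pattern <= c]


def frequency(pattern, comments):
    """Frequency of a pattern: the number of comments in which it is present."""
    return len(supporting_comments(pattern, comments))


freq_love_post = frequency(frozenset({"love", "post"}), comments)
freq_love = frequency(frozenset({"love"}), comments)
```
]]></preformat>
      <p>Since a pattern is a set of items, the order of the lemmas inside a comment is ignored in this sketch; a pattern is present in a comment whenever it is a subset of the lemmas of that comment.</p>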
      <p>
        For example, consider the features score_comm (denoting the score of a comment) and
compound (indicating the sentiment value extracted from the text of a comment). Suppose that
the utility function is the Pearson's correlation [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] between them, which allows us to say
whether there is a form of correlation between the two features, such that a high (resp., low)
score of a comment is associated with a positive (resp., negative) sentiment about it. This function allows
us to select those patterns whose presence in comments with high (resp., low) scores is accompanied
by a positive (resp., negative) sentiment. We point out that this correlation between score and
sentiment is not obvious for comments, because there could exist comments with a high (resp.,
low) score and a null or negative (resp., positive) sentiment. In the following, we call <inline-formula><tex-math>f</tex-math></inline-formula> the utility
function computing the Pearson correlation. It is worth investigating both the patterns having
a positive value of <inline-formula><tex-math>f</tex-math></inline-formula> and those having a negative value of that function. Indeed, a positive
(resp., negative) value of <inline-formula><tex-math>f</tex-math></inline-formula> indicates that there is a direct (resp., inverse) correlation between
the sentiment aroused by a comment and the score it obtains. Consequently, we denote by <inline-formula><tex-math>f^{+}</tex-math></inline-formula>
(resp., <inline-formula><tex-math>f^{-}</tex-math></inline-formula>) the function that selects those patterns having a value of <inline-formula><tex-math>f</tex-math></inline-formula> greater (resp., lower)
than a threshold <inline-formula><tex-math>h^{+}</tex-math></inline-formula> (resp., <inline-formula><tex-math>h^{-}</tex-math></inline-formula>).
      </p>
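      <p>A minimal sketch of a Pearson-based utility function and of the two threshold-based selections is reported below. The feature names score_comm and compound are those introduced above; the sample values, the function names and the dictionary layout are hypothetical. The thresholds default to 0.1 and -0.1, the values adopted later in this section. This is only one possible way of organizing the computation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences; None if undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant feature has no defined correlation
    return cov / sqrt(var_x * var_y)


def utility(supporting):
    """Correlation between score_comm and compound over the comments of a pattern."""
    scores = [c["score_comm"] for c in supporting]
    compounds = [c["compound"] for c in supporting]
    return pearson(scores, compounds)


def select(patterns, h_plus=0.1, h_minus=-0.1):
    """Split patterns into those with utility above h_plus and those below h_minus."""
    positive, negative = [], []
    for name, supporting in patterns.items():
        value = utility(supporting)
        if value is None:
            continue  # skip patterns whose utility cannot be computed
        if value > h_plus:
            positive.append(name)
        elif value < h_minus:
            negative.append(name)
    return positive, negative


# Hypothetical feature values for the comments supporting two patterns.
patterns = {
    "p1": [{"score_comm": 1, "compound": -0.5},
           {"score_comm": 5, "compound": 0.1},
           {"score_comm": 9, "compound": 0.7}],
    "p2": [{"score_comm": 1, "compound": 0.6},
           {"score_comm": 5, "compound": 0.0},
           {"score_comm": 9, "compound": -0.6}],
}
positive, negative = select(patterns)
```
]]></preformat>
      <p>In this toy example the first pattern shows a direct correlation between score and sentiment, while the second shows an inverse one, so they end up in the positive and in the negative selection, respectively.</p>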
      <p>
        Figure 2 shows the trend of the number of extracted patterns as <inline-formula><tex-math>h^{-}</tex-math></inline-formula> (resp., <inline-formula><tex-math>h^{+}</tex-math></inline-formula>) decreases
(resp., increases). Patterns are also grouped based on their length. This figure provides us with
non-obvious and interesting knowledge. In fact, the number of patterns extracted by
<inline-formula><tex-math>f^{-}</tex-math></inline-formula> is much greater than the number extracted by <inline-formula><tex-math>f^{+}</tex-math></inline-formula>. This allows us to say that, given a pattern, a
positive (resp., negative) sentiment is not necessarily accompanied by a high (resp., low)
score of the comments in which it is present. This phenomenon is very evident for moderately
positive or negative values of <inline-formula><tex-math>f</tex-math></inline-formula>, while it weakens strongly for extreme values. It can be explained
by considering that, given the nature of the terms used in the reported texts, NSFW posts and comments tend
to be associated with a negative sentiment by any sentiment analysis tool. This happens even
when such terms are used in goliardic comments, which are actually appreciated by this type
of audience. For instance, consider the text pattern {ℎ,  }, possibly accompanied by an
emoticon with two little hearts instead of eyes. Applying the sentiment analysis tools available
in the literature to our dataset [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], we obtained a sentiment value of -0.1280 for this pattern.
Instead, the corresponding comments have a very high score. As a consequence, the value of
<inline-formula><tex-math>f</tex-math></inline-formula> is negative. This allows us to obtain a first important outcome of our approach, namely
that traditional sentiment computation tools do not work well in the presence of NSFW posts
and comments. As for the choice of the values of <inline-formula><tex-math>h^{+}</tex-math></inline-formula> and <inline-formula><tex-math>h^{-}</tex-math></inline-formula>, all the reasoning above, and
the presence of a low number of extracted patterns, led us to choose low values for the two
thresholds. In particular, we set <inline-formula><tex-math>h^{+} = 0.1</tex-math></inline-formula> and <inline-formula><tex-math>h^{-} = -0.1</tex-math></inline-formula>.
      </p>
      <p>We end this discussion by pointing out that many other utility functions could be defined on
many different features concerning posts and users. This important peculiarity of our approach
allows us to analyze the phenomenon of NSFW content from many different perspectives.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Analysis of User Interaction Networks</title>
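      <p>In this section, we formally introduce the User Interaction Network and show how information
can be derived from it. Let <inline-formula><tex-math>P_f</tex-math></inline-formula> be the set of patterns obtained by applying the utility function <inline-formula><tex-math>f</tex-math></inline-formula>,
and let <inline-formula><tex-math>U_f</tex-math></inline-formula> be the set of users who have written at least one comment or post containing at least
one of the patterns of <inline-formula><tex-math>P_f</tex-math></inline-formula>. A User Interaction Network <inline-formula><tex-math>\mathcal{N}_f</tex-math></inline-formula> can be defined as: <inline-formula><tex-math>\mathcal{N}_f = \langle N_f, A_f \rangle</tex-math></inline-formula>.
<inline-formula><tex-math>N_f</tex-math></inline-formula> is the set of nodes of <inline-formula><tex-math>\mathcal{N}_f</tex-math></inline-formula>. There exists a node <inline-formula><tex-math>n_i \in N_f</tex-math></inline-formula> for each user <inline-formula><tex-math>u_i \in U_f</tex-math></inline-formula>. <inline-formula><tex-math>A_f</tex-math></inline-formula> is
the set of arcs of <inline-formula><tex-math>\mathcal{N}_f</tex-math></inline-formula>. An arc <inline-formula><tex-math>(n_i, n_j, w_{ij}) \in A_f</tex-math></inline-formula> denotes that <inline-formula><tex-math>u_i</tex-math></inline-formula> made comments on a post
published by <inline-formula><tex-math>u_j</tex-math></inline-formula>; <inline-formula><tex-math>w_{ij}</tex-math></inline-formula> measures how many times this happened.</p>
      <p><inline-formula><tex-math>\mathcal{N}_f</tex-math></inline-formula> allows us to characterize the behavior of users, who interact with each other by posting
and commenting on NSFW adult content.</p>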
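      <p>The construction of a User Interaction Network can be sketched as follows, starting from a log of pairs made up of a commenter and the author of the commented post. The log, the user identifiers and the helper names are hypothetical; the density formula is the usual one for directed graphs without self-loops, which is an assumption of this sketch.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter

# Hypothetical interaction log: (commenter, author of the commented post).
interactions = [
    ("u1", "u2"), ("u1", "u2"), ("u3", "u2"),
    ("u2", "u1"), ("u4", "u2"),
]


def build_interaction_network(interactions):
    """Return (nodes, arcs); arcs maps (commenter, author) to the arc weight."""
    arcs = Counter(interactions)
    nodes = {user for pair in arcs for user in pair}
    return nodes, dict(arcs)


def density(nodes, arcs):
    """Density of a directed graph without self-loops: arcs / (n * (n - 1))."""
    n = len(nodes)
    return len(arcs) / (n * (n - 1)) if n > 1 else 0.0


def indegree(node, arcs):
    """Number of distinct users who commented on posts of the given user."""
    return sum(1 for (_, target) in arcs if target == node)


nodes, arcs = build_interaction_network(interactions)
```
]]></preformat>
      <p>In this toy log, the arc from u1 to u2 has weight 2 because u1 commented twice on posts of u2, and u2 has the highest indegree, which is the behavior expected of a user receiving comments from many other ones.</p>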
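      <p>Table 1 lists some parameters of the User Interaction Networks built exploiting the dataset
outlined in Section 2, the utility functions <inline-formula><tex-math>f^{+}</tex-math></inline-formula> and <inline-formula><tex-math>f^{-}</tex-math></inline-formula>, and the thresholds <inline-formula><tex-math>h^{+}</tex-math></inline-formula> and <inline-formula><tex-math>h^{-}</tex-math></inline-formula>. This
table confirms the information derived in Section 3. For instance, <inline-formula><tex-math>\mathcal{N}_{f^{-}}</tex-math></inline-formula> contains a larger number
of nodes and arcs than <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula>. Note that the cardinalities of <inline-formula><tex-math>P_{f^{-}}</tex-math></inline-formula> and <inline-formula><tex-math>P_{f^{+}}</tex-math></inline-formula> follow the same trend as
the number of nodes (and users) in the User Interaction Networks, even though they represent
patterns instead of users. It is worth observing the high density coupled with a high clustering
coefficient characterizing <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula>. This tells us that, in this network, users tend to form very tight
communities, whose structure resembles that of a clique. Instead, <inline-formula><tex-math>\mathcal{N}_{f^{-}}</tex-math></inline-formula> has a high density but
a low clustering coefficient; this suggests the presence of very strong power users, i.e., users
receiving comments from many other ones, who do not actually interact with each other. In
both networks, the number of nodes composing the maximum connected component is very
high, i.e., about 60%.</p>
      <p>[...] parameters <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\delta</tex-math></inline-formula> of these power law distributions; they are: <inline-formula><tex-math>\alpha = 1.371</tex-math></inline-formula>, <inline-formula><tex-math>\delta = 0.062</tex-math></inline-formula> for <inline-formula><tex-math>\mathcal{N}_{f^{-}}</tex-math></inline-formula>
and <inline-formula><tex-math>\alpha = 1.507</tex-math></inline-formula>, <inline-formula><tex-math>\delta = 0.063</tex-math></inline-formula> for <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula>. The results of these analyses show that there is a low
number of pairs of users in which one of them comments on a post of the other at least twice
(we call them interacting users in the following). This can be considered a minimum condition
to detect non-random relationships between pairs of users. We compared the values of the
average indegree, outdegree and clustering coefficient of the interacting users, on the one hand,
and of all the users, on the other hand. The bottom of Table 1 shows this comparison for the
two considered networks. It is easy to observe that, in both networks, interacting users have
indegrees and outdegrees much higher than the other users. Therefore, they can be considered
power users. Also, their clustering coefficient is very high, indicating that they are able to
promote the generation of communities. Therefore, it can be said that they are community
leaders in the distribution of NSFW adult content in Reddit.</p>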
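      <p>At this point, we found it interesting to investigate the possible existence of mutual
relationships between interacting users. For this purpose, we determined the fraction of interacting users
such that a user <inline-formula><tex-math>u_i</tex-math></inline-formula> comments on the posts of a user <inline-formula><tex-math>u_j</tex-math></inline-formula>, and vice versa. We found that this fraction
is low (i.e., 0.141) for <inline-formula><tex-math>\mathcal{N}_{f^{-}}</tex-math></inline-formula>, whereas it is higher (i.e., 0.433) for <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula>. Furthermore, although
the number of nodes of <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula> is considerably lower than that of <inline-formula><tex-math>\mathcal{N}_{f^{-}}</tex-math></inline-formula>, the two networks have
similar average indegree and outdegree for both normal and interacting users. Moreover, <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula>
has a much higher clustering coefficient and a much higher fraction of interacting users than
<inline-formula><tex-math>\mathcal{N}_{f^{-}}</tex-math></inline-formula>. Based on all these outcomes, we can state that, although the number of users of <inline-formula><tex-math>\mathcal{N}_{f^{-}}</tex-math></inline-formula> is
much greater than that of <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula>, the users of the latter are more inclined to be opinion leaders.
Taking also the utility function underlying <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula> into account, we can deduce that the users of
this network are particularly able to maintain a positive correlation between the sentiment of
their comments and the associated scores. Finally, the results obtained so far allow us to state
that the users of <inline-formula><tex-math>\mathcal{N}_{f^{+}}</tex-math></inline-formula> are the most dynamic ones, as they publish posts attracting interest (since
these posts are commented on by other users) and comment on the posts of the other users. This feature
makes them particularly important, as they are not only content producers, but also dynamic
participants who contribute to keeping their communities active, and act as opinion leaders
for these communities. We call them proactive users.</p>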
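      <p>The selection of interacting users and the measurement of mutual relationships can be sketched as follows. The arc dictionary and the function names are hypothetical. The sketch assumes that a relationship is interacting when the arc weight is at least 2, and that it is mutual when an arc in the opposite direction exists with any weight; the latter is an interpretation made for this example and other readings are possible.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def interacting_arcs(arcs, min_weight=2):
    """Keep the arcs whose weight is at least min_weight."""
    return {pair: w for pair, w in arcs.items() if w >= min_weight}


def mutual_fraction(arcs, min_weight=2):
    """Fraction of interacting arcs whose opposite arc also exists (any weight)."""
    strong = interacting_arcs(arcs, min_weight)
    if not strong:
        return 0.0
    mutual = sum(1 for (src, dst) in strong if (dst, src) in arcs)
    return mutual / len(strong)


# Hypothetical weighted arcs: (commenter, post author) mapped to a weight.
arcs = {
    ("u1", "u2"): 2,
    ("u2", "u1"): 1,
    ("u3", "u2"): 3,
    ("u4", "u1"): 1,
}
fraction = mutual_fraction(arcs)
```
]]></preformat>
      <p>Here two arcs satisfy the interaction condition and only one of them is reciprocated, so half of the interacting relationships are mutual in this toy example.</p>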
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This paper presented an approach to analyze NSFW users, comments and posts on Reddit,
taking into account and exploiting the knowledge extracted by investigating text patterns. The
methods and results considered in this paper can represent the basis for many new applications.
In fact, posts and subreddits of other target categories (e.g., vegetarian or vegan users) could
be examined by means of the same methodology. Moreover, the analysis of text patterns
could represent the engine of an automatic classifier aiming at tagging posts and comments
containing unsuitable content. Furthermore, our approach can be adapted to other social
networks managing NSFW content less explicitly than Reddit. Finally, we plan to design an
automatic tool that exploits a knowledge base built by integrating utility patterns and semantic
analysis tools to automatically classify new content and then suggest the most pertinent
communities to which it should be directed.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Tiidenberg</surname>
          </string-name>
          ,
          <article-title>Boundaries and conflict in a NSFW community on tumblr: The meanings and uses of selfies</article-title>
          ,
          <source>New Media &amp; Society</source>
          <volume>18</volume>
          (
          <year>2016</year>
          )
          <fpage>1563</fpage>
          -
          <lpage>1578</lpage>
          . Sage Publications.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Matias</surname>
          </string-name>
          ,
          <article-title>Going dark: Social factors in collective action against platform operators in the Reddit blackout</article-title>
          ,
          <source>in: Proc. of the International Conference on Human Factors in Computing Systems (ACM CHI 2016)</source>
          , San Jose, CA, USA,
          <year>2016</year>
          , pp.
          <fpage>1138</fpage>
          -
          <lpage>1151</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B. K.</given-names>
            <surname>Narayanan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nirmala</surname>
          </string-name>
          ,
          <article-title>Adult content filtering: Restricting minor audience from accessing inappropriate Internet content</article-title>
          ,
          <source>Education and Information Technologies</source>
          <volume>23</volume>
          (
          <year>2018</year>
          )
          <fpage>2719</fpage>
          -
          <lpage>2735</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Corradini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nocera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ursino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Virgili</surname>
          </string-name>
          ,
          <article-title>Investigating the phenomenon of NSFW posts in Reddit</article-title>
          ,
          <source>Information Sciences</source>
          <volume>566</volume>
          (
          <year>2021</year>
          )
          <fpage>140</fpage>
          -
          <lpage>164</lpage>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Hutto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          ,
          <article-title>Vader: A parsimonious rule-based model for sentiment analysis of social media text</article-title>
          ,
          <source>in: Proc. of the International AAAI Conference on Weblogs and Social Media (ICWSM'14)</source>
          , Ann Arbor, MI, USA,
          <year>2014</year>
          , pp.
          <fpage>216</fpage>
          -
          <lpage>225</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Baumgartner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zannettou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Keegan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Squire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <article-title>The pushshift Reddit dataset</article-title>
          ,
          <source>in: Proc. of the International AAAI Conference on Web and Social Media (ICWSM'20)</source>
          , volume
          <volume>14</volume>
          , Atlanta, GA, USA,
          <year>2020</year>
          , pp.
          <fpage>830</fpage>
          -
          <lpage>839</lpage>
          . AAAI Press.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Fournier-Viger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.-W.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <article-title>A survey of itemset mining</article-title>
          ,
          <source>WIREs Data Mining and Knowledge Discovery</source>
          <volume>7</volume>
          (
          <year>2017</year>
          )
          <elocation-id>e1207</elocation-id>
          . doi:https://doi.org/10.1002/widm.1207, Wiley.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhuiyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <article-title>Frequent pattern mining algorithms: A survey</article-title>
          , in: C. Aggarwal, J. Han (Eds.),
          <source>Frequent Pattern Mining</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>64</lpage>
          . doi:10.1007/978-3-319-07821-2_2, Springer, Cham.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Fournier-Viger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-W.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nkambou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tseng</surname>
          </string-name>
          ,
          <source>High-Utility Pattern Mining</source>
          ,
          <year>2019</year>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fournier-Viger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tseng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>A survey of utility-oriented pattern mining</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>33</volume>
          (
          <year>2021</year>
          )
          <fpage>1306</fpage>
          -
          <lpage>1327</lpage>
          . doi:https://doi.org/10.1109/TKDE.2019.2942594, IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>K.</given-names>
            <surname>Pearson</surname>
          </string-name>
          ,
          <article-title>Note on regression and inheritance in the case of two parents</article-title>
          ,
          <source>Proceedings of the Royal Society of London</source>
          <volume>58</volume>
          (
          <year>1895</year>
          )
          <fpage>240</fpage>
          -
          <lpage>242</lpage>
          . The Royal Society.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>