<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>LSACoNet: A Combination of Lexical and Conceptual Features for Analysis of Fake News Spreaders on Twitter</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hamed Babaei Giglou</string-name>
          <email>h.babaei98@ms.tabrizu.ac.ir</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jafar Razmara</string-name>
          <email>razmara@tabrizu.ac.ir</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mostafa Rahgouy</string-name>
          <email>mostafa.rahgouy@partdp.ai</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahsa Sanaei</string-name>
          <email>mahsasanaei97@ms.tabrizu.ac.ir</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Tabriz</institution>
          ,
          <addr-line>Tabriz</addr-line>
          ,
          <country country="IR">Iran</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Part AI Research Center</institution>
          ,
          <addr-line>Tehran</addr-line>
          ,
          <country country="IR">Iran</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>Fake news detection on social media has attracted a large body of research as one of the most important tasks of social analysis in recent years. In this task, given a Twitter feed, the goal is to identify fake/real news authors or spreaders. We assume that fake news authors mostly manipulate the semantic aspect of news rather than deliberately changing their style. However, changing the semantic aspect of news can cause unintended changes in style. We hypothesize that, by relying on news content, a combination of semantic and coarse-grained features can reveal common information about an author's style while also capturing the conceptual aspect of the author's documents. In this paper, we propose the LSACoNet representation, used with a fully connected neural network (FCNN) classifier, which combines different levels of document representation to investigate this hypothesis. The experimental results show that a combination of representations plays an important role in identifying fake/real news spreaders. We achieved accuracies of 72.5% and 74.5% on the English and Spanish test datasets, respectively, using the LSACoNet representation and the FCNN classifier.</p>
      </abstract>
      <kwd-group>
        <kwd>Fake News</kwd>
        <kwd>False Information</kwd>
        <kwd>Feature Combination</kwd>
        <kwd>Suspicious Fake News Authors</kwd>
        <kwd>Fully Connected Neural Network</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>False information such as fake news is one of the main threats to our society. In recent
years, big social networks like Facebook and Twitter have admitted that their networks
contained fake and duplicate accounts. Fake news is not a new phenomenon,
but the exponential growth of social media has offered an easy way to propagate it quickly.
Fake news usually tries to deceive users into expressing specific opinions. Users play a
critical role in the creation and spread of fake news by influencing people to make a
decision, to support or attack an idea, or even to back an election candidate.</p>
      <p>This year, the author profiling task series introduced a new task that reflects the concern
to stop the spread of fake news: Profiling Fake News Spreaders on Twitter [16]. In
this task, we aim to identify possible fake news spreaders on social media as a first step
towards preventing fake news from being propagated among online users.</p>
      <p>Task: Given a Twitter feed, determine whether its author is keen to be a spreader
of fake news.</p>
      <p>The main goal is to investigate whether it is possible to discriminate authors who
have shared some fake news in the past from those who, to the best of our knowledge,
have never done so. The task is multilingual, covering English
and Spanish.</p>
      <p>The rest of the paper is organized as follows. Section 2 presents related work. Section
3 describes the proposed method. Section 4 describes the baselines and
experiments and discusses the obtained results. Finally, Section 5 presents our conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>The Fake News Challenge (FNC-1) shared task, analyzed in [6], attracted 50
participating teams. The authors of [6] performed a detailed feature analysis of the participating systems and
concluded that identifying high-performing features for the task yields a new model that
relies mostly on lexical overlap for classification. They argue that the task is
challenging because even the best-performing features cannot yet resolve difficult cases.
Thus, more sophisticated machine learning techniques with a deeper
semantic understanding are needed. In [18], the authors studied user profiles
on social media for fake news detection and proposed a principled way to understand
which features of user profiles are helpful for this purpose. They concluded that,
first, there are specific users who are more likely to trust fake news than real news and,
second, these users exhibit different features from those who are more likely to trust real
news. These observations show the importance of feature construction for fake news
detection.</p>
      <p>Based on the studies in [6] and [18], we believe that this task is
sensitive to feature dimensionality; that is, low-quality features can reduce overall model
performance. Feature combination is a common way to enhance
features. In combination methods, different feature vectors are concatenated into a single long
composite vector, optionally followed by a reduction of the dimension
of the feature space. From an NLP perspective, many methods have been proposed that employ
feature combinations for tasks such as the Fake News Challenge.</p>
      <p>The work in [21] studied false information on Twitter. The authors found that real
tweets contain fewer bias markers, hedges, and subjective terms, and less harmful words.
They built a model that combines graph-based, cue-word, and syntactic features.
They concluded that incorporating linguistic features and social network interactions into
neural network models improves the classification of suspicious news. For future work, they
plan to use more sophisticated discourse and pragmatic features and to
infer degrees of credibility.</p>
      <p>The authors of [7] used a Long Short-Term Memory (LSTM) network
combined with other features such as bag-of-characters (BOC), bag-of-words (BOW), and topic model
features based on non-negative matrix factorization, Latent Dirichlet Allocation, and
Latent Semantic Indexing. They achieved a state-of-the-art macro F1 of 60.9% on
the Fake News Challenge (FNC-1) dataset. Similarly, [4] presented an approach
that combines lexical, word embedding, and n-gram features to detect
stance in fake news. The approach was tested on the FNC-1 dataset and achieved
a macro F1 of 59.6%, close to the state of the art, using a simple feature
representation. Most approaches evaluated on the FNC-1 dataset
incorporate some combination of features such as word or character n-grams,
bag-of-words, word embeddings, and latent semantic analysis features [17] [8].</p>
      <p>Another work [13] used a set of linguistic features including n-grams,
punctuation, psycholinguistic, readability, and syntax features. The proposed linguistics-driven
approach suggests that, to differentiate between fake and genuine content, it is
worthwhile to look at the lexical, syntactic, and semantic levels of the news item in question.
The authors achieved an accuracy of up to 76% on a dataset they collected themselves.</p>
    </sec>
    <sec id="sec-3">
      <title>Proposed Approach</title>
      <p>We assume that authors may convey different concepts when they tweet, so
differences in concepts can help to separate fake from real news. Fake news spreaders can be
skilled at shaping the semantics of their tweets so that the concepts appear real, but in doing so
they may write in a different style than usual. According to [15], coarse-grained features are the most likely to
capture an author's style, so taking author fingerprint features into account can be
useful for finding author styles. To formalize the hypothesis, let \((X_i, y_i)\) be the
definition of each user's tweets, where \(X_i\) refers to the tweets of user \(i\) and \(y_i\) is the fake/real news
spreader label. Suppose \(i \in [1, m]\) and \(j \in [1, n]\), where \(m\) and \(n\) are the maximum numbers of users
and of tweets per user, respectively. We define \(X_i = \bigcup_{j=1}^{n} \omega_j\), in which \(\omega_j\) refers to the
array of words belonging to the \(j\)-th tweet of the \(i\)-th user, and \(\omega_{jk}\) is the \(k\)-th word of
array \(\omega_j\), which has length \(|\omega_j|\). In the following, we use this notation to introduce
our proposed approach in more detail.</p>
      <sec id="sec-3-1">
        <title>Data Preprocessing</title>
        <p>
          In the first stage of preprocessing, we used Preprocessor (https://github.com/s/preprocessor), a preprocessing
library for tweet data written in Python. It was used to remove URLs, hashtags, mentions,
reserved words (RT, FAV), emojis, smileys, and numbers from \(X_i\), including those
already masked in the dataset. Next, punctuation removal, stop-word removal, and
stemming were applied to every word \(\omega_{jk}\), \(k \in [1, |\omega_j|]\), using the NLTK 3.0 Toolkit [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
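        <p>The cleaning steps above can be sketched as follows. This is a minimal illustration that uses only regular expressions from the Python standard library rather than the Preprocessor and NLTK packages used in the paper; the patterns, the tiny stop-word set, and the function name clean_tweet are assumptions made for the example, and emoji/smiley removal and stemming are left out.</p>
        <preformat>
```python
import re
import string

# Hypothetical stop-word set for illustration; the paper relies on NLTK instead.
STOPWORDS = {"the", "a", "is", "to", "of", "and"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
RESERVED_RE = re.compile(r"\b(?:RT|FAV)\b")
NUMBER_RE = re.compile(r"\d+")


def clean_tweet(text):
    """Return the lowercase tokens that remain after cleaning one tweet."""
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE, RESERVED_RE, NUMBER_RE):
        text = pattern.sub(" ", text)
    # Drop ASCII punctuation, then split on whitespace.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]


tokens = clean_tweet("RT @user: The vaccine is a HOAX!!! https://t.co/abc #fake 2020")
print(tokens)  # ['vaccine', 'hoax']
```
        </preformat>
        <p>URLs are removed before mentions and hashtags so that fragments of a link are not mistaken for other entities.</p>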
      </sec>
      <sec id="sec-3-2">
        <title>Data Representation Methods</title>
        <p>I. ConceptNet Numberbatch. In contrast to word embeddings that represent only
distributional semantics, such as Word2Vec or GloVe, and embeddings that represent only
relational knowledge, such as ConceptNet, ConceptNet Numberbatch is a hybrid word
embedding built using an ensemble approach. It combines data from ConceptNet, Word2Vec,
GloVe, and OpenSubtitles 2016 using a variation on retrofitting [19].</p>
        <p>– ConceptNet [19] is a knowledge graph that connects words and phrases of natural
language with labeled edges. It includes both symmetric and
asymmetric relations. Its knowledge is collected from many sources, including
expert-created resources, crowd-sourcing, and games with a purpose. It is designed to
allow applications to better understand the meanings behind the words people
use [19].
– GloVe [12] is a vector space with meaningful substructure, pre-trained on
various datasets.
– Word2Vec [11] is a set of word vectors pre-trained on the Google News dataset.
– OpenSubtitles 2016 [20] is a collection of movie subtitles, used as part of the
data for training ConceptNet Numberbatch.</p>
        <p>ConceptNet Numberbatch is a multilingual word embedding and represents 78
different languages in 300 dimensions. Words in different languages share a common
semantic space, and that semantic space is informed by all of the languages. We denote by \(f\) the
representation of this semantic space:</p>
        <p>\[ f : \mathit{Word} \mapsto V_{1}^{300} \]
In this work, we used ConceptNet Numberbatch version 19.08, with a vocabulary size
of 651859 for Spanish and 516782 for English. The function \(\phi\) uses \(f\) to represent word vectors for
both words in the Numberbatch vocabulary and out-of-vocabulary (OOV) words:</p>
        <p>\[ \phi(\mathit{word}) = \begin{cases} \vec{f}(\mathit{word}), \quad \mathit{word} \in f \\ \vec{0}, \quad \mathit{word} \notin f \end{cases} \]
Finally, CoNet formulates how we extract the averaged feature vector for \(X_i\):</p>
        <p>\[ \mathrm{CoNet} : \bigcup_{j=1}^{n} \omega_j \mapsto \frac{\sum_{j=1}^{n} \sum_{k=1}^{|\omega_j|} \dfrac{\vec{\phi}(\omega_{jk})}{\sqrt[2]{\vec{\phi}(\omega_{jk}) \cdot \vec{\phi}(\omega_{jk})}}}{\sum_{j=1}^{n} |\omega_j|} \]
We skipped stemming in the preprocessing stage for \(\omega_{jk}\) here because it led to low accuracy
in our experiments. Our investigation showed that stemming decreases word usage
frequency in the data, which leads to poor CoNet vectors.</p>
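        <p>As an illustration of the CoNet averaging, the following sketch looks up each word, maps OOV words to the zero vector, normalizes each vector by its Euclidean norm, and divides the sum by the total number of words. The reading of the normalization term, the 3-dimensional toy embedding table (standing in for the 300-dimensional Numberbatch vectors), and the names phi and conet are assumptions made for the example.</p>
        <preformat>
```python
import math

# Toy 3-dimensional vectors with made-up values; Numberbatch uses 300 dimensions.
EMBEDDINGS = {
    "news": [3.0, 0.0, 4.0],
    "fake": [0.0, 2.0, 0.0],
}
DIM = 3


def phi(word):
    """Look up a word vector, falling back to the zero vector for OOV words."""
    return EMBEDDINGS.get(word, [0.0] * DIM)


def conet(tweets):
    """Average the norm-scaled word vectors over all words of one user."""
    total = [0.0] * DIM
    n_words = 0
    for tweet in tweets:
        for word in tweet:
            vec = phi(word)
            norm = math.sqrt(sum(v * v for v in vec))
            n_words += 1
            if norm == 0.0:
                continue  # OOV word: adds nothing to the sum but counts as a word
            total = [t + v / norm for t, v in zip(total, vec)]
    if n_words == 0:
        return total
    return [t / n_words for t in total]


user_tweets = [["fake", "news"], ["news", "unknownword"]]
user_vector = conet(user_tweets)
print(user_vector)
```
        </preformat>
        <p>Because OOV words still count in the denominator, a user whose tweets contain many unknown words ends up with a vector of smaller magnitude.</p>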
        <p>II. Latent Semantic Analysis (LSA) [9] is a statistical approach that extracts relations
among words by means of their contexts of use in documents. LSA can be
performed by applying a low-rank Singular Value Decomposition (SVD) to the
n-gram/TF-IDF matrices to reduce the number of rows while preserving the similarity
structure among columns.
LSA is a dimension reduction technique that captures and represents the significant
components of the lexis and of passage meanings. It also reduces
noise in the data and the sparseness of the matrix. For these
reasons, we applied SVD to the n-gram and TF-IDF matrices for dimensionality
reduction with 200 components. In our case, SVD transforms
\(M_{tfidf}\) and \(M_{ngram}\) into the latent space.</p>
        <p>
          \[ \mathrm{SVD} : V_{i}^{d} \rightarrow V_{i}^{200} \]
We used the scikit-learn [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] Python library for our experiments and for training the n-gram
models for both languages. The n-gram and TF-IDF parameters were tuned by experimental
search using 5- and 10-fold cross-validation. Table 1 shows a
summary of the best parameters found for both languages.
        </p>
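        <p>The SVD projection can be sketched with NumPy on a toy matrix. This is only an illustration under stated assumptions: the paper uses scikit-learn with 200 components, whereas the matrix values, the choice of 2 components, and the helper name lsa_project below are made up for the example.</p>
        <preformat>
```python
import numpy as np


def lsa_project(matrix, n_components):
    """Project the rows of a document-term matrix onto the top singular directions."""
    # Thin SVD: matrix = U @ diag(s) @ Vt, singular values sorted in descending order.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(n_components, len(s))
    # Coordinates of each document in the k-dimensional latent space.
    return u[:, :k] * s[:k]


# Toy document-term matrix: 4 users (rows) by 5 terms (columns), made-up weights.
m_tfidf = np.array([
    [0.9, 0.1, 0.0, 0.0, 0.2],
    [0.8, 0.2, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.9, 0.1],
    [0.1, 0.0, 0.8, 0.7, 0.0],
])
latent = lsa_project(m_tfidf, n_components=2)
print(latent.shape)  # (4, 2)
```
        </preformat>
        <p>Keeping only the leading singular directions is what discards noise and sparsity: users with similar term usage end up close together in the latent space.</p>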
      </sec>
      <sec id="sec-3-3">
        <title>Input Representation</title>
        <p>According to our experiments, single representations stop improving
after reaching a certain accuracy because their features overlap and are similar.
We discuss this in more detail in Section 4. According to the ensemble hypothesis, combining weak
learners can boost performance. We hypothesize that combining representations
does the same in most cases. To overcome the issues of single representations while keeping
the combination simple, LSACoNet is introduced as a concatenation
of representations. \(\Psi\) is a transformer that represents a combination of
feature vectors for the tweets of a given user in 700 dimensions.</p>
        <p>\[ \Psi : X_i \mapsto \left( \mathrm{CoNet}(X_i),\ \mathrm{SVD}(M_{tfidf}(X_i)),\ \mathrm{SVD}(M_{ngram}(X_i)) \right) \]</p>
        <p>\[ \Psi(X_i) \in V_{i}^{700} \]</p>
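        <p>The concatenation itself is straightforward; the sketch below joins a 300-dimensional CoNet vector with two 200-dimensional LSA vectors into one 700-dimensional input. The placeholder values and the function name lsaconet are assumptions made for the example.</p>
        <preformat>
```python
import numpy as np


def lsaconet(conet_vec, lsa_tfidf_vec, lsa_ngram_vec):
    """Concatenate the three per-user representations into one feature vector."""
    return np.concatenate([conet_vec, lsa_tfidf_vec, lsa_ngram_vec])


# Placeholder inputs with the dimensionalities stated in the text.
features = lsaconet(np.zeros(300), np.ones(200), np.full(200, 2.0))
print(features.shape)  # (700,)
```
        </preformat>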
      </sec>
      <sec id="sec-3-4">
        <title>Model Architecture</title>
        <p>A fully connected feed-forward neural network [5] (FCNN) is introduced to
tackle the fake/real news spreader detection challenge. The proposed FCNN contains an input
layer with 1024 neurons, ReLU activation, dropout, and BatchNormalization. It is
followed by 3 hidden layers with 256, 128, and 64 neurons, respectively,
each with sigmoid activation, and an output layer with 2 neurons and BatchNormalization.
At the input layer, BatchNormalization is used to normalize the combined features from the
different representations. To reduce overfitting, dropout is applied
with a probability of 40% at the input layer. The network is compiled with the sparse categorical
cross-entropy loss function. As the optimizer, Adam is used with a
learning rate of 0.002. The experiments with deep neural networks were
carried out using Keras [3], a deep learning API written in Python.</p>
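        <p>To make the layer stack concrete, the following NumPy sketch runs an inference-time forward pass through layers of the stated widths. It is not the Keras model used in the paper: the random weights, the softmax output, and the reading of the 1024-neuron input layer as a dense layer on the 700-dimensional LSACoNet vector are assumptions, and dropout (inactive at inference) and BatchNormalization are omitted for brevity.</p>
        <preformat>
```python
import numpy as np

LAYER_SIZES = [700, 1024, 256, 128, 64, 2]  # input features, then layer widths
ACTIVATIONS = ["relu", "sigmoid", "sigmoid", "sigmoid", "softmax"]


def init_params(rng):
    """Small random weights and zero biases for every dense layer."""
    return [
        (rng.normal(0.0, 0.05, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
    ]


def forward(x, params):
    """Inference-time forward pass without dropout or batch normalization."""
    for (w, b), act in zip(params, ACTIVATIONS):
        z = x @ w + b
        if act == "relu":
            x = np.maximum(z, 0.0)
        elif act == "sigmoid":
            x = 1.0 / (1.0 + np.exp(-z))
        else:  # softmax over the two classes (an assumed output activation)
            e = np.exp(z - z.max(axis=1, keepdims=True))
            x = e / e.sum(axis=1, keepdims=True)
    return x


rng = np.random.default_rng(0)
params = init_params(rng)
batch = rng.normal(size=(4, 700))  # four made-up LSACoNet vectors
probs = forward(batch, params)
print(probs.shape)  # (4, 2)
```
        </preformat>
        <p>Under these assumptions the dense layers hold 1,021,506 trainable weights and biases, most of them in the first layer.</p>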
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments and Results</title>
      <p>This year, the task organizers provided a training corpus. The corpus is composed of
documents in English and Spanish, where each document contains 100 tweets of one
author. The statistics of this corpus are presented in Table 2.</p>
      <sec id="sec-4-1">
        <title>Baselines</title>
        <p>To compare the proposed methods, we implemented 3 baselines, described
below. Table 3 (group 0 of the detailed experimental results) shows the
evaluation results for them.</p>
        <p>– RANDOM: a random prediction model that predicts 1 if a random value is in \([0, 0.5]\) and
0 otherwise.
– TFIDFLSVM: a TF-IDF representation that contains all words, without
preprocessing or parameter tuning, with a linear SVM classifier with C = 1.
– STATLSVM: statistical features such as the numbers of characters, URLs,
mentions, hashtags, RTs, and emojis, with a linear SVM classifier with C = 1.</p>
        <p>We conducted several experiments with different classifiers (Multi-layer Perceptron,
Linear/RBF SVM, Logistic Regression, K Nearest Neighbors, Naive Bayes, Ridge
classifier, i.e. a classifier using ridge regression, and Stacking Ensemble) and different
representations (N-gram, TF-IDF, LSA, ConceptNet Numberbatch). The experiments are
compared mainly by their 5/10-fold cross-validation mean accuracy
and confidence interval (CI). Most of the models in the experiments suffered from a
high confidence interval. We therefore concentrated on reducing overfitting
by monitoring the confidence intervals while improving model performance on
validation with the 5/10-fold cross-validation scheme.</p>
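        <p>The paper does not state how the confidence interval is computed. As one possible reading, the sketch below reports the mean fold accuracy together with a half-width of two standard deviations of the fold accuracies; both this convention and the fold accuracies themselves are assumptions made for the example.</p>
        <preformat>
```python
import statistics


def mean_and_ci(fold_accuracies):
    """Return the mean accuracy and a half-width of two standard deviations."""
    mean = statistics.fmean(fold_accuracies)
    half_width = 2.0 * statistics.pstdev(fold_accuracies)
    return mean, half_width


# Made-up accuracies from a 5-fold run.
mean, ci = mean_and_ci([0.70, 0.80, 0.75, 0.72, 0.78])
print(f"accuracy = {mean:.3f} +/- {ci:.3f}")
```
        </preformat>
        <p>A wide interval means the model's accuracy varies strongly between folds, which is how the interval serves as an overfitting signal in the experiments.</p>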
        <p>Experiment 1: TF-IDF Modeling. In Experiment 1, we used a TF-IDF representation
with a word usage factor when building the vocabulary. With the word
usage factor, we can use the author's fingerprint words as the representation,
ignoring the least and most used words by setting lower and upper bound thresholds on
each term frequency. The lower/upper bound term frequency thresholds are 2/2000 for
English and 3/4000 for Spanish. Finally, only terms whose frequency falls in the range \([L_{tf}, U_{tf}]\)
are included in the TF-IDF vocabulary. The results of this experiment are
recorded in Table 3 (in the section for detailed experimental results using
cross-validation), group 1. We achieved a CI close to 0.05 with a linear SVM
classifier. The ridge classifier achieved an average accuracy close to that of the linear
SVM; however, it suffers from a high CI.</p>
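        <p>The vocabulary filtering of Experiment 1 can be sketched as follows, assuming that the thresholds apply to corpus-wide term counts. The toy documents, the thresholds 2/3, and the function name build_vocabulary are made up for the example.</p>
        <preformat>
```python
from collections import Counter


def build_vocabulary(documents, lower, upper):
    """Keep terms whose corpus-wide frequency lies in [lower, upper]."""
    counts = Counter(token for doc in documents for token in doc)
    return sorted(t for t, c in counts.items() if c >= lower and upper >= c)


docs = [
    ["the", "election", "was", "rigged"],
    ["the", "election", "the", "vote"],
    ["the", "vote", "count"],
]
vocab = build_vocabulary(docs, lower=2, upper=3)
print(vocab)  # ['election', 'vote']
```
        </preformat>
        <p>Here "the" is dropped as too frequent and the one-off words as too rare, leaving the mid-frequency terms as candidate fingerprint words.</p>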
        <p>Experiment 2: Character N-gram Modeling. In Experiment 2, similar to the
previous analysis, we investigated a character n-gram representation to
look for better features by keeping only the author's most valuable terms. We used
character 3-grams with the word usage factor when building the vocabulary.
Less valuable terms were excluded from the vocabulary by setting a
lower bound term frequency threshold of 5 for both languages. Finally, only terms whose frequency falls in
the range \([L_{tf}, \infty)\) are included in the vocabulary. The
results of this experiment are recorded in Table 3, group 2. They are not
very promising because of high CI and lower accuracy than the models of the previous
experiment. Most importantly, the average results and CIs are close to those of the baseline models except
in 2 cases, and most models suffer from high CI. Further investigation revealed that,
for Spanish, the logistic regression and ridge classifiers perform well, whereas for
English they perform considerably worse than the baseline and group-1 models.
According to the results, the character n-gram representation fails to capture fake/real
news spreaders.</p>
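        <p>A minimal sketch of the character 3-gram vocabulary with a lower bound of 5 is given below. The toy texts and the helper names char_ngrams and ngram_vocabulary are assumptions made for the example.</p>
        <preformat>
```python
from collections import Counter


def char_ngrams(text, n=3):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_vocabulary(texts, n=3, lower=5):
    """Character n-grams that occur at least `lower` times across all texts."""
    counts = Counter(g for text in texts for g in char_ngrams(text, n))
    return {g for g, c in counts.items() if c >= lower}


texts = ["fake news"] * 5 + ["real story"]
vocab = ngram_vocabulary(texts)
print(sorted(vocab))
```
        </preformat>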
      </sec>
      <sec id="sec-4-2">
        <title>Experiment 3: Punctuation/Character N-gram Modeling</title>
        <p>In Experiment 3, we ran another study with character 5-grams that considers only punctuation marks. We
replaced the letters in tweets with *. Next, we used the settings of Experiment 2 to train
logistic regression and linear SVM models. The results recorded in Table 3, group 3, for both
classifiers confirm that character n-gram features are too poor for models
to capture fake/real news spreaders.</p>
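        <p>The letter masking of Experiment 3 can be illustrated as follows. The sketch treats every character for which str.isalpha() is true as a letter, which is an assumption about how masking was done; the example sentence is made up.</p>
        <preformat>
```python
def mask_letters(text):
    """Replace every alphabetic character with '*', keeping marks and spaces."""
    return "".join("*" if ch.isalpha() else ch for ch in text)


def char_ngrams(text, n=5):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


masked = mask_letters("Hi, you!")
print(masked)               # **, ***!
print(char_ngrams(masked))  # ['**, *', '*, **', ', ***', ' ***!']
```
        </preformat>
        <p>After masking, the 5-grams only encode where punctuation sits relative to words, not which words were used.</p>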
        <p>Experiment 4: Ensemble Learning. In Experiment 4, we investigated combining
weak learners by applying a stacking ensemble approach with a majority voting
scheme. For English, we combined linear SVM, k-nearest neighbors, and ridge
classifiers on the TF-IDF representation; for Spanish, only the third learner was replaced by
a logistic regression classifier on the character 3-gram representation. We achieved
average accuracies of 0.768/0.764 for 5/10-fold cross-validation, respectively. The ensemble
outperforms the previous models; however, it suffers from a high CI for English.</p>
        <p>Experiment 5: Concept Modeling. In Experiment 5, we examined linear/RBF SVM
and logistic regression classifiers with the ConceptNet Numberbatch word embedding.
The results are reported in Table 3, group 5. They show that Numberbatch is
likely to perform similarly to the TF-IDF representation.</p>
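        <p>The majority voting step of Experiment 4 can be sketched in a few lines. Only the voting is shown, not the training of the individual learners, and the predictions below are made up for the example.</p>
        <preformat>
```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine label predictions from several classifiers by majority vote."""
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Made-up predictions of three classifiers for five users (1 = spreader).
svm_pred = [1, 0, 1, 1, 0]
knn_pred = [1, 1, 0, 1, 0]
ridge_pred = [0, 0, 1, 1, 1]
ensemble_pred = majority_vote([svm_pred, knn_pred, ridge_pred])
print(ensemble_pred)  # [1, 0, 1, 1, 0]
```
        </preformat>
        <p>With three binary classifiers there is never a tie, which is one reason to use an odd number of learners.</p>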
        <p>Experiment 6: Concatenation of Features. In Experiment 6, we analyzed the LSACoNet
representation with a linear SVM. For the analysis, we built a baseline without any
specific parameter setting, using the maximum feature dimensions. Interestingly, we
achieved a low CI for both languages with this baseline, which shows how capable the combination of
features is. Next, the LSACoNet representation was evaluated with the
parameter settings listed in Table 1. The results show that feature combination
is a powerful technique for boosting performance: we reached accuracies of
0.785/0.765 for 5/10-fold cross-validation with the lowest CI. Detailed results
are recorded in Table 3, group 6.</p>
        <p>Experiment 7: FCNN. In Experiment 7, we carried out a different analysis using the
LSACoNet and CoNet representations. To determine whether the
FCNN performs better than the models described in the previous experiments,
the CoNet representation is used as a baseline. We used 5 different test split sizes
in this experiment to evaluate the LSACoNetFCNN and CoNetFCNN models. The
results of this experiment are recorded in Table 1 (in detailed experimental results
with FCNN). We obtained an average accuracy of 0.79 for LSACoNetFCNN. The results
of both Experiments 6 and 7 are promising, but comparing the accuracies and CIs of these
2 experiments is not meaningful because the evaluation schemes differ.
Since we had no test set on which to compare the LSACoNetLSVM and LSACoNetFCNN
models for the final evaluation, we simply
chose LSACoNetFCNN as the final model.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Final Evaluation</title>
        <p>Following the previous results, for the final evaluation on the TIRA platform [14], we
applied the LSACoNet method with the FCNN to classify real/fake news
spreaders. The accuracies obtained in the final evaluation were
0.745 for Spanish, 0.725 for English, and 0.735 averaged over both languages. The official results are
shown in Table 3 (in detailed results of submissions) for the early bird and final
evaluations. We obtained a better result with LSACoNet and FCNN for English in the
final evaluation. For Spanish, however, the TF-IDF representation with a linear SVM
performed better, with an accuracy of 0.765 in the early bird evaluation. In the final
evaluation metrics, the best score among the early bird and final
submissions of each participant is considered for each language. This means
that in our case the best score for Spanish comes from the early bird submission and the best score
for English from the final submission, so the overall accuracy is 0.745.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper, we proposed a model for the Profiling Fake News Spreaders on Twitter
task at PAN 2020. We presented a feature combination model, LSACoNet, which
combines different representations of the documents and feeds them to an FCNN to detect
fake/real news spreaders on Twitter. We achieved an average accuracy of
0.745. According to our own evaluation, our approach is capable of
distinguishing fake/real news spreaders. In future work, we plan to add
feature weighting for the representations, to use different deep neural network models
such as RNNs, and to enrich the current features with emotionalized word or character n-gram features
to boost the performance of the existing representation.
</p>
      <p>2. Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., Niculae, V., Prettenhofer, P., Gramfort, A., Grobler, J., Layton, R., VanderPlas, J., Joly, A., Holt, B., Varoquaux, G.: API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning. pp. 108–122 (2013)
3. Chollet, F., et al.: Keras. https://keras.io (2015)
4. Ghanem, B., Rosso, P., Rangel, F.: Stance detection in fake news: a combined feature representation. In: Proceedings of the First Workshop on Fact Extraction and VERification (FEVER). pp. 66–71 (2018)
5. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016), http://www.deeplearningbook.org
6. Hanselowski, A., PVS, A., Schiller, B., Caspelherr, F., Chaudhuri, D., Meyer, C.M., Gurevych, I.: A retrospective analysis of the fake news challenge stance-detection task. In: Proceedings of the 27th International Conference on Computational Linguistics. pp. 1859–1874. Association for Computational Linguistics, Santa Fe, New Mexico, USA (Aug 2018), https://www.aclweb.org/anthology/C18-1158
7. Hanselowski, A., S., A.P.V., Schiller, B., Caspelherr, F., Chaudhuri, D., Meyer, C.M., Gurevych, I.: A retrospective analysis of the fake news challenge stance detection task. CoRR abs/1806.05180 (2018), http://arxiv.org/abs/1806.05180
8. Karadzhov, G., Gencheva, P., Nakov, P., Koychev, I.: We built a fake news &amp; click-bait filter: What happened next will blow your mind! CoRR abs/1803.03786 (2018), http://arxiv.org/abs/1803.03786
9. Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis. Discourse Processes 25(2-3), 259–284 (1998)
10. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, USA (2008)
11. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 26, pp. 3111–3119. Curran Associates, Inc. (2013), http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
12. Pennington, J., Socher, R., Manning, C.D.: GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). pp. 1532–1543 (2014)
13. Pérez-Rosas, V., Kleinberg, B., Lefevre, A., Mihalcea, R.: Automatic detection of fake news. arXiv preprint arXiv:1708.07104 (2017)
14. Potthast, M., Gollub, T., Wiegmann, M., Stein, B.: TIRA Integrated Research Architecture. In: Ferro, N., Peters, C. (eds.) Information Retrieval Evaluation in a Changing World. The Information Retrieval Series, Springer (Sep 2019)
15. Rahgouy, M., Giglou, H., Rahgooy, T., Sheykhlan, M., Mohammadzadeh, E.: Cross-domain Authorship Attribution: Author Identification using a Multi-Aspect Ensemble Approach. In: Cappellato, L., Ferro, N., Losada, D., Müller, H. (eds.) CLEF 2019 Labs and Workshops, Notebook Papers. CEUR-WS.org (Sep 2019), http://ceur-ws.org/Vol-2380/
16. Rangel, F., Giachanou, A., Ghanem, B., Rosso, P.: Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter. In: Cappellato, L., Eickhoff, C., Ferro, N., Névéol, A. (eds.) CLEF 2020 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings (Sep 2020), CEUR-WS.org
17. Riedel, B., Augenstein, I., Spithourakis, G.P., Riedel, S.: A simple but tough-to-beat baseline for the fake news challenge stance detection task. CoRR abs/1707.03264 (2017), http://arxiv.org/abs/1707.03264
18. Shu, K., Wang, S., Liu, H.: Understanding user profiles on social media for fake news detection. In: 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). pp. 430–435 (2018)
19. Speer, R., Chin, J., Havasi, C.: ConceptNet 5.5: An open multilingual graph of general knowledge. pp. 4444–4451 (2017), http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14972
20. Tiedemann, J.: Parallel data, tools and interfaces in OPUS. In: Calzolari, N. (Conference Chair), Choukri, K., Declerck, T., Dogan, M.U., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S. (eds.) Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC’12). European Language Resources Association (ELRA), Istanbul, Turkey (May 2012)
21. Volkova, S., Shaffer, K., Jang, J.Y., Hodas, N.: Separating facts from fiction: Linguistic models to classify suspicious and trusted news posts on Twitter. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). pp. 647–653 (2017)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bird</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loper</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <source>Natural language processing with Python: analyzing text with the natural language toolkit</source>
          .
          <publisher-name>O'Reilly Media, Inc.</publisher-name>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Buitinck</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Louppe</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mueller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Niculae</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grobler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Layton</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , VanderPlas, J.,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holt</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
          </string-name>
          , G.:
          <article-title>API design for machine learning software: experiences from the scikit-learn</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>