<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Introducing a Framework for Automatically Differentiating Witness Accounts of Events from Social Media</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marie Truelove</string-name>
          <email>truelove@student.unimelb.edu.au</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Vasardani</string-name>
          <email>maria.vasardani@unimelb.edu.au</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stephan Winter</string-name>
          <email>winter@unimelb.edu.au</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The University of Melbourne</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Identifying Witnesses of events from social media is an opportunity to crowdsource real-time information to enhance numerous applications including emergency response in a crisis, filtering sources for journalism, and enhancing marketing services. Using a sporting event broadcast live to a proportionally much larger audience, this research demonstrates a significant increase in the number of Witnesses identified posting from the event venue, in comparison to the number identified from geotags alone. This is achieved by considering the text and image content of micro-blogs as additional evidence. This paper also reports progress towards the automatic categorisation of the additional text and image evidence, and modelling and testing this evidence for corroboration or conflict, using Dempster-Shafer Theory of Evidence.</p>
      </abstract>
      <kwd-group>
        <kwd>Crowdsourcing</kwd>
        <kwd>Social Media</kwd>
        <kwd>Witness Accounts</kwd>
        <kwd>Supervised Machine Learning</kwd>
        <kwd>Dempster-Shafer Theory of Evidence</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
Crowdsourcing information about events from social networks such as Twitter is recognised as an
opportunity to harvest detailed real-time information, for example enhancing situational awareness for
emergency response and management [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and creating news summaries of large sporting spectacles
[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. However, these opportunities come with many problems to solve, including detecting the fraction
of relevant micro-blogs, and assessing the credibility and location of the micro-bloggers who posted
them. This research makes unique contributions by proposing a framework towards distinguishing those
micro-blogs which are Witness Accounts (WA) of events. WA are defined as those micro-blogs which
contain an observation of the event or its effects [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], for example the statement <italic>I see the bushfire smoke!</italic>
or an image conveying the same information. The micro-blogger who posted the WA is considered a
potential Witness to the event, and it can be inferred they are on-the-ground (OTG) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], that is, they are in
close proximity to the event [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Impact Accounts (IA) are defined as those micro-blogs which do
not contain an observation of the event, but from which it can also be inferred that the micro-blogger
who posted them is OTG. IA statements may be as explicit as <italic>I’m being evacuated from my home due to
the bushfire</italic>. Formally modelling the witnessing fundamentals of observation and spatial relationship
separately enables a generic model for a range of event types, from unpredicted natural disasters to
scheduled events broadcast live from dedicated venues, such as the case study presented in this paper.
All micro-bloggers who post observations of the event, whether viewed directly from the grandstands or
via television, are by definition Witnesses. The research in this paper questions whether it is possible to
differentiate those Witnesses who are physically at the event from those watching a broadcast. Such
differentiation is supported by geotags, but typically these are present in only a fraction
of micro-blogs, for example 1% [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This research demonstrates that, by including the text content and
linked images as evidence, the sample of micro-blogs posted from the event location can be increased
significantly beyond those identified by geotags alone. Additionally, this research questions whether text
content and linked images can be automatically categorised, and used to test whether they corroborate
the inference that they were posted from the event.
      </p>
      <p>In order to automatically differentiate those micro-blogs which are WA or IA, and to test the Witness
categorisation of the micro-bloggers who posted them, a framework is proposed with the following parts:</p>
      <list list-type="order">
        <list-item><p>Extract and categorise the evidence contained in each individual micro-blog;</p></list-item>
        <list-item><p>Combine the evidence extracted for each individual micro-blog to determine those which can be
ranked as containing corroborating or conflicting evidence;</p></list-item>
        <list-item><p>For each micro-blogger found to have posted micro-blogs containing evidence, combine these to
rank their likely status as a Witness OTG; and</p></list-item>
        <list-item><p>For likely Witnesses, seek further evidence, for example from micro-blogging history posted during
the event.</p></list-item>
      </list>
      <p>
        This paper presents progress to date on parts 1) and 2). To demonstrate part 1) supervised machine
learning approaches are used to categorise the text and image content. A model of the micro-blog text,
linked images and geotags using Dempster-Shafer Theory of Evidence [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is developed to demonstrate
part 2). The results indicate a significant improvement in the recognition rate of micro-blogs posted from
an event compared with geotags alone. Where multiple pieces of evidence are present for an individual micro-blog, their
combination produces intuitive results, including identifying conflict due to GPS error. Enhancements
and alternative approaches to those presented in this paper, together with parts 3) and 4) of the framework, are
the subject of future work.
      </p>
    </sec>
    <sec id="sec-1-background">
      <title>Background</title>
      <p>
Communication technologies have been described as space-adjusting techniques [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], as they enable
events to be witnessed by proportionally much larger audiences than the capacity of the venues in which
they are held. In these scenarios, unlike previous case studies such as those in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], it is not possible to
infer a Witness is OTG for the dominating category of observations, that is of the play on the field [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
It has also been determined that the live broadcast delay of approximately 12 seconds cannot be detected
in micro-blogs, ruling this feature out as a method to distinguish those witnessing via a broadcast [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
In addition to sport, differentiating Witnesses of crisis events has gained much interest from researchers.
A journalistic approach describes extracting observation features from text to identify Witnesses [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
whereas spatial presence in the city of the event is the criterion in other work [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>Supervised Machine Learning for Categorisation</title>
      <p>
        Natural language processing (NLP) using bag-of-words approaches from unigram, bigram and
part-of-speech (POS) models can be utilised as baseline text categorisation features [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. These studies
report success, comparable in many scenarios to more sophisticated features [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. A visual
bag-of-words approach to categorise images linked to micro-blogs has also been tested [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The disadvantage
of bag-of-words approaches is that although the methodology can be applied generically, the resulting
model is not generic, for example, a model developed from training data for a football game cannot be
used for a bushfire. Approaches which extract semantic meaning, for example locative expressions from
text [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] would enable a generic model, but their success to-date is limited in domains such as social
media [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Detecting micro-blogs posted from OTG is also recognised as an imbalanced class problem
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Approaches taken to mitigate class imbalance typically involve balancing the data via sampling
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], or algorithmically introducing a misclassification cost to the under-represented class [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
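      <p>As an illustration of the baseline features discussed above, the following minimal sketch builds unigram and bigram bag-of-words counts for a single micro-blog. The whitespace tokeniser and the function names are assumptions made for illustration only; they are not the tokeniser or toolkit used in the cited studies.</p>
      <preformat>
```python
from collections import Counter


def tokenise(text):
    """Lower-case and split on whitespace (a deliberately naive, hypothetical tokeniser)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return the n-grams of a token list, in order, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(text, max_n=2):
    """Count every n-gram of length 1..max_n occurring in the text."""
    tokens = tokenise(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Example micro-blog text; the counts become the feature vector for a classifier.
features = bag_of_words("In front of TV with chips for next 3 hours! #AFLDonsPies")
```
      </preformat>
      <p>Because the vocabulary is learned from the training corpus, counts of this kind are specific to the event they were collected for, which is the limitation of bag-of-words models noted above.</p>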
    </sec>
    <sec id="sec-3">
      <title>Dempster-Shafer Theory of Evidence</title>
      <p>
        Dempster-Shafer Theory of Evidence is one method that has found application in classifier fusion, and
managing uncertainty and incomplete reasoning [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The theory models the power set for the frame of
discernment of the hypothesis [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. A mass function is assigned for each subset in the power set from
which the belief interval can be derived [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The mass function can be assigned from various classifier
results, including the overall accuracy, class statistics or individual instances [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The mass functions
for independent evidence can then be combined [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Dempster’s Rule of Combination has been shown
to produce unintuitive results in scenarios with conflict [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], resulting in many enhancements being
proposed including PCR6 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] based on proportional conflict resolution.
      </p>
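      <p>The combination step can be sketched for a two-element frame of discernment. The example below implements the classical Dempster's Rule of Combination, not PCR6, and the hypotheses A and B and their mass values are hypothetical. It is intended only to show how the conflict mass arises when two sources disagree.</p>
      <preformat>
```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dicts keyed by frozenset) with Dempster's rule.

    Returns the normalised combined masses and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a.intersection(b)
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # product mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}, conflict


A = frozenset({"A"})
B = frozenset({"B"})
THETA = A | B  # the whole frame, representing ignorance

# Hypothetical masses for two sources that disagree.
source_1 = {A: 0.8, THETA: 0.2}
source_2 = {B: 0.6, THETA: 0.4}
fused, k = dempster_combine(source_1, source_2)
```
      </preformat>
      <p>In this sketch almost half of the product mass is conflicting and is removed by normalisation, which is the behaviour that motivates alternative rules based on proportional conflict redistribution.</p>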
      <sec id="sec-3-1">
        <title>Data Collection and Training Set Creation</title>
        <p>
          The case study event is an Australian Football League (AFL) match played at the Melbourne Cricket
Ground (MCG) on the annual ANZAC Day public holiday. In 2015, this match attracted a near-capacity
crowd of 88,398<sup>1</sup> and television ratings of 1.298 million<sup>2</sup>. The corpus was collected using
the AFL’s promoted hashtag #afldonspies, utilising the Twitter Data Analytics software packages [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
Pre-processing restricts the sample to micro-blogs which can be identified as individual and original, that is,
neither a retweet nor posted by a non-individual such as the media [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. To collect a sample of linked images,
all micro-blogs in the corpus with a URL to Twitter or Instagram were inspected as these are more likely
to contain WA [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. To create the training set, two expert annotators coded the tweet text and linked
images with one of three categories, examples of which are presented in Table 1. The three categories
are:
          <list list-type="order">
            <list-item><p>No Evidence (NE): no evidence of being posted from OTG or from another place could be detected.</p></list-item>
            <list-item><p>Evidence OTG (E-OTG): evidence is detected that the micro-blog was posted from OTG.</p></list-item>
            <list-item><p>Evidence not OTG (E-NOTG): counter-evidence is detected indicating that the micro-blog was not posted from OTG.</p></list-item>
          </list>
        </p>
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr>
                <th>No Evidence (NE)</th>
                <th>Evidence OTG (E-OTG)</th>
                <th>Evidence not OTG (E-NOTG)</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Fletcher goes bang with a 60 metre monster! #AFLDonsPies</td>
                <td>Not the best seats in the house but just glad to be here at @MCG #AFLDonsPies</td>
                <td>In front of TV with chips for next 3 hours! #AFLDonsPies</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-2">
        <title>Supervised Machine Learning for Categorisation of Text and Images</title>
        <p>
          Pre-processing of the text included word tokenisation and parts-of-speech tagging using Ark NLP [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
WEKA’s [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] default pre-processing filters were used to experiment with unigram and bigram models.
WEKA default feature selection filters were used to reduce the number of redundant dimensions, and
experiments were run with a range of classifiers indicated by previous research, including Naive Bayes, Random
Forest and Support Vector Machines (SVM). All experiments were completed with 10-fold
cross-validation. As expected, class imbalance was an issue, in particular for the text corpus. To mitigate
the imbalance, the sample was restricted to micro-blogs posted by micro-bloggers with at least one piece of
evidence detected in the training set. The classifier selected was the one that maximised precision for the E-OTG and
E-NOTG classes, at the expense of recall if necessary, to minimise conflict due to misclassification in
the Dempster-Shafer modelling.
        </p>
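        <p>The selection criterion above depends on per-class precision and recall. A minimal sketch of these two measures follows. The label lists are invented for illustration and are not results from this study, and the function name is an assumption.</p>
        <preformat>
```python
def precision_recall(actual, predicted, label):
    """Per-class precision and recall; returns 0.0 where a ratio is undefined."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == label and p == label)
    predicted_pos = sum(1 for _, p in pairs if p == label)
    actual_pos = sum(1 for a, _ in pairs if a == label)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Invented labels, for illustration only.
actual = ["E-OTG", "E-OTG", "NE", "E-NOTG", "NE", "E-OTG"]
predicted = ["E-OTG", "NE", "NE", "E-NOTG", "E-OTG", "E-OTG"]
p_otg, r_otg = precision_recall(actual, predicted, "E-OTG")
```
        </preformat>
        <p>Favouring precision for the two evidence classes means that fewer micro-blogs are wrongly labelled as evidence, so fewer spurious conflicts are passed to the evidence combination step.</p>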
      </sec>
      <sec id="sec-3-3">
        <title>Categorisation of Geotags</title>
        <p>Geotag evidence was categorised as E-OTG or E-NOTG based on whether it was contained within or in
the immediate vicinity of the MCG, the place of the event. It was necessary to create a decision boundary
for this categorisation, which was informed primarily by the boundaries of places bordering the MCG,
for example train lines, roads and other venues.</p>
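        <p>A decision boundary of this kind can be represented as a polygon and tested with a standard ray-casting point-in-polygon check. The sketch below uses a hypothetical rectangular boundary in arbitrary planar units; it is not the boundary used in this study, and real geotags would first need to be projected to a planar coordinate system.</p>
        <preformat>
```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (a list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside


def categorise_geotag(x, y, boundary):
    """Label a geotag E-OTG when it falls inside the boundary, otherwise E-NOTG."""
    return "E-OTG" if point_in_polygon(x, y, boundary) else "E-NOTG"


# Hypothetical boundary in arbitrary planar units.
boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
label_inside = categorise_geotag(2.0, 1.5, boundary)
label_outside = categorise_geotag(5.0, 1.5, boundary)
```
        </preformat>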
        <p><sup>1</sup> twitter.com/MCG/status/591859347891748865</p>
        <p><sup>2</sup> http://footyindustry.com/files/afl/media/tvratings/2015/2015AFLRatings.png</p>
      </sec>
      <sec id="sec-3-4">
        <title>Dempster-Shafer Modelling of Evidence Extracted from Micro-blogs</title>
        <p>
The frame of discernment is modelled as {E-OTG, E-NOTG}, with power set {∅, {E-OTG}, {E-NOTG},
{E-OTG, E-NOTG}}. The categorisation NE is not modelled in the frame of discernment. For example,
if a micro-blog has a geotag categorised as E-OTG, and text and image categorised as NE, the text and
image neither corroborate nor conflict with the E-OTG categorisation provided by the geotag. For
demonstration, mass functions are set manually, with derivation from classifier results left to future work.
The mass functions assigned to geotags represent greater certainty than those assigned to images, which
in turn represent greater certainty than those assigned to text. The combination rule PCR6 implemented in Matlab [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] is then
used to compute the combinations for analysis, with decision algorithm testing again left for future work.
        </p>
      </sec>
      <sec id="sec-3-5">
        <title>Results and Discussion</title>
        <p>
The corpus contained 3260 micro-blogs, 265 with linked images and 133 with geotags. Table 2 presents
the categorisation results, both training and predicted by classifiers. The annotator agreement for the text
and image content was high with Cohen’s Kappa of 0.895 and 0.929 respectively. Combining the three
content sources, or evidence, from the training data indicates the number of micro-blogs categorised as
E-OTG and E-NOTG can be increased significantly from those with geotags alone. The increase for
E-OTG is from 21 to 176 micro-blogs, and the increase for E-NOTG is from 112 to 241 micro-blogs.
This corresponds to an additional 125 potential Witnesses OTG, on top of the 16 identified from geotags alone. A total of 54 tweets had more than one
piece of evidence which could be checked for conflict. Conflict existed for a fraction of these tweets and was found to
be due to GPS error: the geotag indicated the micro-blog was posted from a nearby venue, whereas the
image and text indicated it was posted from the MCG.
        </p>
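        <p>The kind of geotag and image conflict described above can be illustrated with proportional conflict redistribution. The results in this paper were computed with the cited Matlab implementation of PCR6; the Python sketch below is not that implementation. It covers only the two-source case, for which PCR6 coincides with PCR5, and the mass values are hypothetical, chosen so that the geotag carries more certainty than the image.</p>
        <preformat>
```python
from itertools import product


def pcr_two_sources(m1, m2):
    """Conjunctive combination of two mass functions, with each partial
    conflict redistributed to the two focal elements that produced it."""
    fused = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        common = x.intersection(y)
        if common:
            fused[common] = fused.get(common, 0.0) + mx * my
        elif mx + my > 0.0:
            # Split the conflicting product mx * my in proportion to mx and my.
            fused[x] = fused.get(x, 0.0) + mx * mx * my / (mx + my)
            fused[y] = fused.get(y, 0.0) + my * my * mx / (mx + my)
    return fused


OTG = frozenset({"E-OTG"})
NOTG = frozenset({"E-NOTG"})
EITHER = OTG | NOTG  # the whole frame, representing ignorance

# Hypothetical masses: a geotag placing the post at a nearby venue,
# and an image indicating it was taken inside the event venue.
geotag = {NOTG: 0.9, EITHER: 0.1}
image = {OTG: 0.7, EITHER: 0.3}
fused = pcr_two_sources(geotag, image)
```
        </preformat>
        <p>No mass is discarded in this sketch: the conflicting mass is shared between E-OTG and E-NOTG, so the combined result still reflects that the two sources disagree.</p>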
        <p>The combined results that were correctly predicted are fewer than those in the training data, but still represent a significant
increase over those with geotags alone. The number of micro-blogs categorised as E-OTG increased
from 21 to 125, and the number of micro-blogs categorised as E-NOTG increased from 112 to 182. This
corresponded to an additional 77 potential Witnesses posting from OTG and an additional 50 potential
Witnesses via the broadcast. From the predicted results of the classifiers, 26 micro-blogs had more
than one piece of evidence, with five in conflict. In addition to GPS error, these conflicts are now also
attributed to misclassification.</p>
        <p>
          From the classifier experimentation it was found that the WEKA default SVM, feature selection filter,
and a unigram model maximised precision of the E-OTG and E-NOTG classes for text content. Using
the methodology described by [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] the SVM classifier was additionally selected for the image content.
The precision and recall results for each class are presented in Table 3. These results indicate the image
categorisation failed for the E-NOTG class, which is attributed to the insufficient number of samples in
the training data. For future experiments this category could be removed, or the sample increased using
other events; both options are to be tested in future work. In comparison, the E-OTG category proved
acceptable for both precision and recall. The better precision for text E-NOTG compared to E-OTG
could in part be explained by the fact that these micro-blogs were dominated by explicit
statements critiquing the television coverage or the medium via which the broadcast was being viewed,
enabling a more representative unigram model. In comparison, there was no dominant topic for E-OTG.
More robust feature development based on previously identified witnessing characteristics [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] is
planned for future work. Additionally, the results indicate improvements could be made if the class
imbalance were further addressed.
        </p>
      </sec>
      <sec id="sec-3-6">
        <title>Conclusion</title>
        <p>
This paper presented progress on a framework to automatically extract WA and IA of events from
social media. Baseline supervised machine learning techniques to categorise text and images were
demonstrated, enabling micro-blogs posted from OTG or via the broadcast to be identified in significantly
greater numbers than with geotags alone. Additionally, a method based on Dempster-Shafer Theory
of Evidence was demonstrated to combine the extracted evidence to test corroboration or conflict in
the categorisation of the micro-blogs. Many areas for enhancements are identified, including machine
learning approaches that further mitigate class imbalance and enable generic model development. In
addition to seeking these enhancements, future work will include modelling the combination of evidence
from multiple micro-blogs to identify the status of potential Witnesses.</p>
        <p>Proc. of the 3rd Annual Conference of Research@Locate 17</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><surname>CHENG</surname>, <given-names>Z.</given-names></string-name>,
          <string-name><surname>CAVERLEE</surname>, <given-names>J.</given-names></string-name>, AND
          <string-name><surname>LEE</surname>, <given-names>K.</given-names></string-name>
          <article-title>You are where you tweet: A content-based approach to geo-locating Twitter users</article-title>
          .
          <source>In Proceedings of the 19th ACM International Conference on Information and Knowledge Management</source>
          (
          <year>2010</year>
          ), ACM, pp.
          <fpage>759</fpage>
          -
          <lpage>768</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><surname>DIAKOPOULOS</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>DE CHOUDHURY</surname>, <given-names>M.</given-names></string-name>, AND
          <string-name><surname>NAAMAN</surname>, <given-names>M.</given-names></string-name>
          <article-title>Finding and assessing social media information sources in the context of journalism</article-title>
          .
          <source>In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems</source>
          (
          <year>2012</year>
          ), pp.
          <fpage>2451</fpage>
          -
          <lpage>2460</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><surname>GARGIULO</surname>, <given-names>F.</given-names></string-name>,
          <string-name><surname>MAZZARIELLO</surname>, <given-names>C.</given-names></string-name>, AND
          <string-name><surname>SANSONE</surname>, <given-names>C.</given-names></string-name>
          <article-title>Multiple Classifier Systems: Theory, Application and Tools</article-title>
          . Springer-Verlag,
          <year>2013</year>
          , ch. 10, pp.
          <fpage>335</fpage>
          -
          <lpage>378</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><surname>HALL</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>FRANK</surname>, <given-names>E.</given-names></string-name>,
          <string-name><surname>HOLMES</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>PFAHRINGER</surname>, <given-names>B.</given-names></string-name>,
          <string-name><surname>REUTEMANN</surname>, <given-names>P.</given-names></string-name>, AND
          <string-name><surname>WITTEN</surname>, <given-names>I. H.</given-names></string-name>
          <article-title>The WEKA data mining software: An update</article-title>
          .
          <source>SIGKDD Explorations</source>
          <volume>11</volume>
          ,
          <issue>1</issue>
          (
          <year>2009</year>
          ),
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><surname>KUMAR</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>HU</surname>, <given-names>X.</given-names></string-name>, AND
          <string-name><surname>LIU</surname>, <given-names>H.</given-names></string-name>
          <article-title>A behaviour analytics approach to identifying tweets from crisis regions</article-title>
          .
          <source>In Proceedings of the 25th ACM Conference on Hypertext and Social Media</source>
          (
          <year>2014</year>
          ), pp.
          <fpage>255</fpage>
          -
          <lpage>260</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><surname>KUMAR</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>MORSTATTER</surname>, <given-names>F.</given-names></string-name>, AND
          <string-name><surname>LIU</surname>, <given-names>H.</given-names></string-name>
          <source>Twitter Data Analytics</source>
          . Springer,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><surname>LIU</surname>, <given-names>F.</given-names></string-name>,
          <string-name><surname>VASARDANI</surname>, <given-names>M.</given-names></string-name>, AND
          <string-name><surname>BALDWIN</surname>, <given-names>T.</given-names></string-name>
          <article-title>Automatic identification of locative expressions from social media text: A comparative analysis</article-title>
          .
          <source>In Proceedings of the 4th International Workshop on Location and the Web (LocWeb)</source>
          (
          <year>2014</year>
          ), pp.
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>MARTIN</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>Implementing general belief function framework with a practical codification for low complexity</article-title>
          .
          <source>In Advances and Applications of DSmT for Information Fusion</source>
          ,
          <string-name><given-names>F.</given-names> <surname>Smarandache</surname></string-name>
          and
          <string-name><given-names>J.</given-names> <surname>Dezert</surname></string-name>, Eds., vol.
          <volume>3</volume>
          . American Research Press, Rehoboth,
          <year>2009</year>
          , pp.
          <fpage>217</fpage>
          -
          <lpage>273</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>MCLEAN</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Identifying witness account in social media using imagery</article-title>
          .
          <source>Master's thesis</source>
          ,
          <year>2015</year>
          . The University of Melbourne.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><surname>MORSTATTER</surname>, <given-names>F.</given-names></string-name>,
          <string-name><surname>LUBOLD</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>PON-BARRY</surname>, <given-names>H.</given-names></string-name>,
          <string-name><surname>PFEFFER</surname>, <given-names>J.</given-names></string-name>, AND
          <string-name><surname>LIU</surname>, <given-names>H.</given-names></string-name>
          <article-title>Finding eyewitness tweets during crises</article-title>
          .
          <source>In Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><surname>OWOPUTI</surname>, <given-names>O.</given-names></string-name>,
          <string-name><surname>O'CONNOR</surname>, <given-names>B.</given-names></string-name>,
          <string-name><surname>DYER</surname>, <given-names>C.</given-names></string-name>,
          <string-name><surname>GIMPEL</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>SCHNEIDER</surname>, <given-names>N.</given-names></string-name>, AND
          <string-name><surname>SMITH</surname>, <given-names>N. A.</given-names></string-name>
          <article-title>Improved part-of-speech tagging for online conversational text with word clusters</article-title>
          .
          <source>In Proceedings of NAACL 2013</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><surname>PARIKH</surname>, <given-names>C. R.</given-names></string-name>,
          <string-name><surname>PONT</surname>, <given-names>M. J.</given-names></string-name>, AND
          <string-name><surname>JONES</surname>, <given-names>N. B.</given-names></string-name>
          <article-title>Application of Dempster-Shafer theory in condition monitoring applications: a case study</article-title>
          .
          <source>Pattern Recognition Letters</source>
          <volume>22</volume>
          (
          <year>2001</year>
          ),
          <fpage>777</fpage>
          -
          <lpage>785</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name><surname>SMARANDACHE</surname>, <given-names>F.</given-names></string-name>, AND
          <string-name><surname>DEZERT</surname>, <given-names>J.</given-names></string-name>
          <article-title>On the consistency of PCR6 with the averaging rule and its application to probability estimation</article-title>
          .
          <source>In Proceedings of the 16th International Conference on Information Fusion</source>
          (
          <year>2013</year>
          ), pp.
          <fpage>1119</fpage>
          -
          <lpage>1126</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>SPENCER</surname>
            ,
            <given-names>J. E.</given-names>
          </string-name>
          , AND
          <string-name>
            <surname>THOMAS</surname>
            ,
            <given-names>W. L. J.</given-names>
          </string-name>
          <source>Cultural Geography</source>
          . John Wiley &amp; Sons, Inc.,
          <year>1969</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>STARBIRD</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>GRACE</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , AND
          <string-name>
            <surname>LEYSIA</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <article-title>Learning from the crowd: Collaborative filtering techniques for identifying on-the-ground Twitterers during mass disruptions</article-title>
          .
          <source>In Proceedings of the 9th International ISCRAM Conference</source>
          (
          <year>2012</year>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>TRUELOVE</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>VASARDANI</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , AND
          <string-name>
            <surname>WINTER</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Testing a model of witness accounts in social media</article-title>
          .
          <source>In Proceedings of the 8th Workshop on Geographic Information Retrieval</source>
          (
          <year>2014</year>
          ), no.
          <issue>10</issue>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>TRUELOVE</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>VASARDANI</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , AND
          <string-name>
            <surname>WINTER</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Towards credibility of micro-blogs: characterising witness accounts</article-title>
          .
          <source>GeoJournal</source>
          <volume>80</volume>
          ,
          <issue>3</issue>
          (
          <year>2015</year>
          ),
          <fpage>339</fpage>
          -
          <lpage>359</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>VERMA</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>VIEWEG</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>CORVEY</surname>
            ,
            <given-names>W. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>PALEN</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>MARTIN</surname>
            ,
            <given-names>J. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>PALMER</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>SCHRAM</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , AND
          <string-name>
            <surname>ANDERSON</surname>
            ,
            <given-names>K. M.</given-names>
          </string-name>
          <article-title>Natural language processing to the rescue? Extracting “situational awareness” tweets during mass emergency</article-title>
          .
          <source>In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media</source>
          (
          <year>2011</year>
          ), pp.
          <fpage>385</fpage>
          -
          <lpage>392</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>ZHAO</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ZHONG</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>WICKRAMASURIYA</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , AND
          <string-name>
            <surname>VASUDEVAN</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <article-title>Human as real-time sensors of social and physical events: A case study of Twitter and sports games</article-title>
          .
          <source>Tech. rep.</source>
          , Rice University and Motorola Labs,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>