=Paper= {{Paper |id=Vol-1688/paper-21 |storemode=property |title=Is Readability a Valuable Signal for Hashtag Recommendations? |pdfUrl=https://ceur-ws.org/Vol-1688/paper-21.pdf |volume=Vol-1688 |authors=Ion Madrazo,Maria Soledad Pera |dblpUrl=https://dblp.org/rec/conf/recsys/AzpiazuP16 }} ==Is Readability a Valuable Signal for Hashtag Recommendations?== https://ceur-ws.org/Vol-1688/paper-21.pdf
Is Readability a Valuable Signal for Hashtag Recommendations?

Ion Madrazo Azpiazu
Computer Science Dept.
Boise State University
Boise, Idaho, USA
ionmadrazo@boisestate.edu

Maria Soledad Pera
Computer Science Dept.
Boise State University
Boise, Idaho, USA
solepera@boisestate.edu

ABSTRACT
We present an initial study examining the benefits of incorporating readability indicators in social network-related tasks. To do so, we introduce TweetRead, a readability assessment tool specifically designed for Twitter, and use it to inform the hashtag prediction process, highlighting the importance of a readability signal in recommendation tasks.

CCS Concepts
•Human-centered computing → Social recommendation; Social networks;

Keywords
Hashtag Recommendation; Readability

1.   INTRODUCTION
   Readability is a measure of the ease with which a text can be read. Usually represented by a number, it is an indicator used by teachers to classify and find appropriate resources for students. Several studies have demonstrated the benefits of using readability indicators in education-related applications, such as book recommendation, text simplification, or automatic translation. However, applying readability indicators outside this environment remains relatively unexplored. Social networks could benefit from readability assessment. Twitter is a social network where users and texts are the main focus. For this reason, it is natural to think that, on Twitter, the ease with which a tweet can be understood by a user may affect their interest in it, and therefore influence actions taken, such as re-tweeting, giving a like, or replying to the tweet.
   The authors of [6] examined the degree to which the age of a user, a feature strongly correlated with readability, influences who people follow on Twitter, and demonstrated that Twitter users have a higher chance to follow people of similar age. Using standard readability measures on text from Twitter, which constrains tweets to at most 140 characters in length, is not a trivial task. The lack of structure and the shortness of those texts make standard natural language analysis techniques inefficient. With that in mind, we developed TweetRead, a novel readability assessment tool specifically designed for tweets. TweetRead takes advantage of social information, such as hashtags or mentions, for predicting the text complexity levels of tweets. Furthermore, in order to highlight the usefulness of such a tool in social networking environments, we developed a simple, yet effective, hashtag recommendation strategy that takes advantage of TweetRead-generated complexity levels of tweets to inform the hashtag recommendation process.

2.   TWEETREAD
   TweetRead’s goal is to estimate the readability of any given tweet T. TweetRead is based on a logistic regression technique (footnote 1) that fuses simple indicators describing T from different perspectives and determines its text complexity. The indicators considered by TweetRead include: (i) T’s readability level, estimated using Flesch (footnote 2) [1], (ii) T’s similarity with respect to word distributions generated from a large Twitter corpus C labeled by age groups, (iii) the average readability of each hashtag h in T, computed based on the average readability levels, estimated using Flesch, of the tweets in C that include h, (iv) the average readability level of the users mentioned in T, estimated using Flesch on tweets written by the mentioned users, and (v) the frequency of mentions, emoticons, and hashtags in T.
   Unlike traditional readability formulas, which tend to map readability levels to school grades, we tailor TweetRead to the Twittersphere by considering six levels of text complexity that follow Levinson’s [3] adult development stages.

3.   HASHTAG RECOMMENDATION
   Hashtags are character strings, starting with a # symbol, used to represent concepts on Twitter. They are a core Twitter feature and serve classification and search purposes. Their unrestricted nature, however, creates difficulties, including the fact that the same concept can be represented by different hashtags, which hinders searching for a concept [5]. For example, tweets related to the Monaco Formula 1 Grand Prix can be searched using #monacoGP, #monacoF1GP or #monacoF1, each retrieving different results. Hashtag recommendation aims at identifying suitable hashtags a user can include in their tweet to reduce the space of tags generated [5] and to make it easier for them and other users to locate the corresponding tweet.
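To make the surface markers discussed so far concrete, the following minimal sketch extracts hashtags, mentions, and emoticons from a tweet and computes a Flesch score on the remaining wording, roughly in the spirit of indicators (i) and (v) of Section 2. It is not the TweetRead implementation: it uses the standard Flesch Reading Ease formula, and the syllable heuristic, the emoticon pattern, and the function names are assumptions introduced only for illustration.

```python
import re

HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
# Hypothetical, deliberately small emoticon pattern (e.g. ":)", ";-D"); not from the paper.
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")


def count_syllables(word):
    """Crude heuristic (an assumption): one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Standard Flesch Reading Ease; returns None when the text has no words."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words:
        return None
    words_per_sentence = len(words) / max(1, len(sentences))
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def tweet_indicators(tweet):
    """Collect the social markers of a tweet and score its plain wording."""
    hashtags = HASHTAG_RE.findall(tweet)
    mentions = MENTION_RE.findall(tweet)
    emoticons = EMOTICON_RE.findall(tweet)
    # Strip the markers so that only ordinary words reach the Flesch score.
    plain = tweet
    for pattern in (HASHTAG_RE, MENTION_RE, EMOTICON_RE):
        plain = pattern.sub(" ", plain)
    return {
        "flesch": flesch_reading_ease(plain),
        "hashtags": hashtags,
        "mentions": mentions,
        "n_emoticons": len(emoticons),
    }
```

For instance, `tweet_indicators("Great race today :) #monacoGP #F1 @user1")` reports two hashtags, one mention, and one emoticon. Values of this kind could serve as inputs to a classifier such as the logistic regression model mentioned in Section 2; how TweetRead actually encodes them is not specified in the text.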
   Given that (i) the scope of this paper is to validate the importance of considering a text complexity signal to enhance a recommendation task and (ii) multiple and increasingly complex systems have been developed for hashtag recommendation [2], we base our study on an existing framework for hashtag recommendation presented in [5]. Given a tweet T, the framework identifies existing hashtags to recommend by following two major steps: (1) generate candidate hashtags by retrieving the hashtags present in similar tweets, using tf-idf based cosine similarity, and (2) rank the hashtags from the retrieved candidate tweets using different strategies. The strategies presented in [5] include:

     • Similarity. Prioritizes hashtags included in tweets that have the closest similarity to T, as estimated using the well-known tf-idf and cosine similarity measure.

     • Global popularity. Prioritizes hashtags based on their respective frequency of occurrence on Twitter.

     • Local popularity. Prioritizes hashtags based on their frequencies of occurrence among the tweets retrieved in response to T.

   We enhance the proposed strategies by taking advantage of TweetRead, as follows:

     • TweetRead. Prioritizes candidate hashtags that have the same or similar text complexity (estimated using TweetRead) with respect to T.

     • PopularityTweetRead. Prioritizes hashtags based on their frequencies of occurrence among tweets whose readability level is estimated to match T’s.

     • SimilarityTweetRead. Prioritizes candidate hashtags based on their respective ranking scores computed using Similarity only on tweets whose readability level is estimated to match T’s.

4.   INITIAL ASSESSMENT
   In this section, we discuss an initial evaluation of TweetRead, as well as its applicability for suggesting hashtags.
   TweetRead. Given that readability of social content is an unexplored area, benchmark datasets that can be used for evaluation purposes are unavailable. For this reason, we built our own dataset. We initially gathered 172M tweets over an 8-month period using the Twitter streaming API. For the purpose of this experiment, we assume that the age of people exactly corresponds to their readability level, and that each tweet written by a user has the same readability level as its author. With that in mind, we followed the framework presented in [6], which examines patterns such as “happy xth birthday”, for determining the age of Twitter users. In doing so, we eliminated from our dataset the users (and their corresponding tweets) whose age could not be determined. Thereafter, we grouped the labeled tweets into 6 age groups, which translates into a uniformly distributed dataset of 22k tweets with their corresponding readability levels. We followed a 10-fold cross-validation strategy and measured the accuracy of the predicted readability levels with respect to the ground truth. As shown in Table 1, TweetRead significantly outperforms the baselines considered for this assessment: Flesch [1] and Spache [4], which are two well-known, traditional readability measures. The reported results demonstrate the need for readability strategies that examine information beyond standard text analysis, if they are meant to be successfully used in the social networking context.

                   Flesch    Spache    TweetRead
     Accuracy       27%       31%        81%

     Table 1: Performance evaluation of TweetRead vs. baselines.

   Hashtag recommendation. For evaluating the strategies for hashtag recommendation presented in Section 3, we used the aforementioned dataset. We treated the hashtags of each corresponding tweet as the ground truth. In other words, for each tweet T, we generated the corresponding top-N hashtag recommendations and considered relevant the ones matching the hashtags in T. As in [5], we used the recall measure to evaluate performance and determine to what extent the correct hashtags were recommended within the top N generated suggestions. As shown in Figure 1, even if readability on its own is not a sufficient factor to suggest hashtags, when combined with other content-based and/or popularity strategies, it leads to the improvement of the overall hashtag recommendation process.

     Figure 1: Hashtag recommendation assessment.

5.   CONCLUSION AND FUTURE WORK
   In this paper, we presented TweetRead, a novel readability assessment tool specifically designed to predict the readability of tweets. We also discussed the initial study conducted to demonstrate the benefit of using a readability signal in the hashtag recommendation task, which yielded promising results. In the future, we plan to explore other applications of readability in social networks, such as user recommendation, advertisement targeting, or re-tweet prediction. We will also explore techniques to further enhance TweetRead and adapt it to other social networks beyond Twitter.

6.   REFERENCES
[1] R. Flesch. A new readability yardstick. Journal of Applied Psychology, 32(3):221, 1948.
[2] F. Godin, V. Slavkovikj, W. De Neve, B. Schrauwen, and R. Van de Walle. Using topic models for twitter hashtag recommendation. In WWW, pages 593–596. ACM, 2013.
[3] D. J. Levinson. A conception of adult development. American Psychologist, 41(1):3, 1986.
[4] G. Spache. A new readability formula for primary-grade reading materials. The Elementary School Journal, 53(7):410–413, 1953.
[5] E. Zangerle, W. Gassler, and G. Specht. Recommending #-tags in twitter. In SASWeb 2011, volume 730, pages 67–78, 2011.
[6] J. Zhang, X. Hu, Y. Zhang, and H. Liu. Your age is no secret: Inferring microbloggers’ ages via content and interaction analysis. In AAAI ICWSM, 2016.

Footnote 1: We empirically verified that among numerous supervised techniques, logistic regression was the most promising one.
Footnote 2: Flesch estimates the readability of a text/tweet t by examining its length and the average length of terms in t.

Copyright held by the authors.
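As a supplementary illustration, the two-step framework of Section 3 (candidate generation by tf-idf cosine similarity, followed by ranking) and the recall measure of Section 4 can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: each corpus tweet is assumed to be a dictionary with hypothetical keys `tokens`, `hashtags`, and `level` (a precomputed readability level from 1 to 6), the tf-idf weighting is one simple variant, and reading the TweetRead strategy as "smallest level gap first" is our own interpretation. Global popularity is omitted because it needs Twitter-wide counts.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists into {term: weight} dicts with a simple tf-idf variant."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: tf * (math.log(n / df[t]) + 1.0) for t, tf in Counter(doc).items()}
            for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def candidate_tweets(query_tokens, corpus, k=10):
    """Step 1: the k corpus tweets most similar to the query, as (similarity, tweet)."""
    vectors = tfidf_vectors([t["tokens"] for t in corpus] + [query_tokens])
    query_vec = vectors[-1]
    scored = [(cosine(query_vec, v), t) for v, t in zip(vectors, corpus)]
    scored = [(s, t) for s, t in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def rank_similarity(cands):
    """Similarity: order hashtags by the best similarity of a tweet containing them."""
    best = {}
    for sim, tweet in cands:
        for tag in tweet["hashtags"]:
            best[tag] = max(sim, best.get(tag, 0.0))
    return sorted(best, key=best.get, reverse=True)


def rank_local_popularity(cands):
    """Local popularity: order hashtags by frequency among the retrieved tweets."""
    counts = Counter(tag for _, tweet in cands for tag in tweet["hashtags"])
    return [tag for tag, _ in counts.most_common()]


def same_level(cands, level):
    """Keep only candidates whose readability level matches the query's level."""
    return [(s, t) for s, t in cands if t["level"] == level]


def rank_tweetread(cands, level):
    """TweetRead (our interpretation): smallest readability-level gap first."""
    gap = {}
    for _, tweet in cands:
        for tag in tweet["hashtags"]:
            d = abs(tweet["level"] - level)
            gap[tag] = min(d, gap.get(tag, d))
    return sorted(gap, key=gap.get)


def recall_at_n(recommended, relevant, n):
    """Fraction of the ground-truth hashtags found in the top-n recommendations."""
    relevant = set(relevant)
    if not relevant:
        return None
    return len(relevant & set(recommended[:n])) / len(relevant)
```

Under these assumptions, PopularityTweetRead corresponds to `rank_local_popularity(same_level(cands, level))` and SimilarityTweetRead to `rank_similarity(same_level(cands, level))`, where `cands` is the output of `candidate_tweets` and `level` is the readability level estimated for T. Filtering before ranking keeps the original strategies unchanged, which matches the description of the readability signal as an add-on to the framework of [5].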