=Paper= {{Paper |id=Vol-2068/humanize2 |storemode=property |title=You Are What You Post: What the Content of Instagram Pictures Tells About Users' Personality |pdfUrl=https://ceur-ws.org/Vol-2068/humanize2.pdf |volume=Vol-2068 |authors=Bruce Ferwerda,Marko Tkalcic |dblpUrl=https://dblp.org/rec/conf/iui/FerwerdaT18 }} ==You Are What You Post: What the Content of Instagram Pictures Tells About Users' Personality== https://ceur-ws.org/Vol-2068/humanize2.pdf
You Are What You Post: What the Content of Instagram Pictures Tells About Users’ Personality

Bruce Ferwerda∗
Department of Computer Science and Informatics, School of Engineering, Jönköping University
P.O. Box 1026, SE-551 11, Jönköping, Sweden
bruce.ferwerda@ju.se

Marko Tkalcic
Faculty of Computer Science, Free University of Bozen-Bolzano
Piazza Domenicani 3, I-39100, Bozen-Bolzano, Italy
marko.tkalcic@unibz.it


* Also affiliated with the Department of Computational Perception, Johannes Kepler University, Altenberger Strasse 69, 4040, Linz (Austria), bruce.ferwerda@jku.at

©2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes.
HUMANIZE ’18, March 11, 2018, Tokyo, Japan

ABSTRACT
Instagram is a popular social networking application that allows users to express themselves through the uploaded content and the different filters they can apply. In this study we look at the relationship between the content of the uploaded Instagram pictures and the personality traits of users. To collect data, we conducted an online survey where we asked participants to fill in a personality questionnaire, and grant us access to their Instagram account through the Instagram API. We gathered 54,962 pictures of 193 Instagram users. Through the Google Vision API, we analyzed the pictures on their content and clustered the returned labels with the k-means clustering approach. With a total of 17 clusters, we analyzed the relationship with users’ personality traits. Our findings suggest a relationship between personality traits and picture content. This allows for new ways to extract personality traits from social media trails, and new ways to facilitate personalized systems.

Author Keywords
Personality, Instagram, picture content, social media

INTRODUCTION
Personality traits have been shown to be a useful concept to rely on when considering personalization of user experiences in a system. This is because personality has been shown to be a stable construct over time, and reflects the coherent patterning of one’s affect, cognition, and desires (goals) as it leads to behavior [22]. The stability and coherency that personality brings have been shown to be useful for systems to infer users’ preferences and to provide personalized experiences to users (e.g., [6]). Systems that use personality-based personalization have been shown to have an advantage over systems not using personality information [15]; an advantage is created in terms of increased users’ loyalty towards the system and decreased cognitive effort.

The usefulness of personality for personalization is shown in its domain independence: once the personality of users is known, it can be used across domains for personalization [1]. This allows for personality extraction in one domain and implementation in another. Hence, the relationships between personality traits and users’ behavior, preferences, and needs are increasingly being investigated (e.g., health [14, 25], education [3, 19], movies [4], music [6, 8, 5, 7, 11, 26], marketing [20]) in order to learn about the connection between personality traits and specific behaviors.

Since personality traits of users are increasingly being used to provide a personalized experience to users, there is an increased interest in how to implicitly acquire these traits for implementation. A useful source of information is social networking services (SNSs). SNSs are increasingly interconnected with applications through so-called "single sign-on buttons" (SSO buttons). 1 The abundance of information that becomes available from the connected SNSs can be used to infer users’ personality traits (e.g., Facebook [7], Twitter [21, 24], and Instagram [9, 10]).

1 Buttons that allow users to easily register and log in to a system with their social media account.
2 https://instagram.com/press/ (accessed: 08/12/2017)

In this work we join the personality extraction research. We specifically focus on Instagram, a popular mobile photo-sharing application and SNS, with currently over 800 million users. 2 With the content as well as with the filters that Instagram allows users to apply to their pictures, users are able to express a personal style and create a seeming distinctiveness. Hence, personality information about users may be hidden in the pictures that users upload to Instagram. Whereas prior work on Instagram focused on the picture properties (i.e., hue, saturation, valence relationship) [9, 10], we focus on the content of the posted pictures on Instagram and explore the relationship with the personality traits of Instagram users. By analyzing the Instagram pictures on their content using the Google Vision
API 3, we were able to find distinct correlations between users’ personality traits and the content of the pictures they post on Instagram.

3 https://cloud.google.com/vision/

RELATED WORK
There is an increasing body of work that looks at how to implicitly acquire personality traits of users. Since all kinds of information can relate to personality traits, even information that is not directly relevant for a specific purpose may contain information that is useful for the extraction of personality (e.g., Facebook [7], Twitter [21, 24], and Instagram [9, 10]). The increased connectedness between SNSs and applications through SSO buttons provides an abundance of information that can be exploited to implicitly acquire personality traits of users.

Quercia et al. [21] looked at Twitter profiles and were able to predict users’ personality traits by using their number of followers, following, and listed counts. With these three characteristics they were able to predict personality scores with a root-mean-square error of 0.88 on a [1,5] scale. Similar work has been done by Golbeck, Robles, and Turner [13] on Facebook profiles. They mainly looked at the sentiment of posted content and were able to create a reliable personality predictor with that information. A more comprehensive work on the prediction of personality and other user characteristics using Facebook likes has been proposed by Kosinski, Stillwell and Graepel [18].

Besides posted content on SNSs, the characteristics of pictures have been shown to contain personality information as well. Celli, Bruni, and Lepri [2] showed that Facebook profile pictures contain indicators of users’ personality. An extension of this work has been recently published [23]. Work of Ferwerda, Schedl, and Tkalcic [9, 10] on Instagram pictures showed that the way filters are applied creates a certain distinctiveness that can be used to predict personality traits of the poster.

In this work we expand the work of Ferwerda et al. [9, 10] on Instagram pictures. Instead of looking at the picture characteristics (i.e., how filters are applied), we look at the posted content itself.

METHOD
To investigate the relationship between personality traits and picture features, we asked participants to fill in the 44-item BFI personality questionnaire (5-point Likert scale; Disagree strongly - Agree strongly [16]). The questionnaire includes questions that aggregate into the five basic personality traits of the five-factor model (FFM). Additionally, we asked participants to grant us access to their Instagram account through the Instagram API, in order to crawl their pictures.

We recruited 233 participants through Amazon Mechanical Turk, a popular recruitment tool for user-experiments [17]. Participation was restricted to those located in the United States, and also to those with a very good reputation (≥95% HIT approval rate and ≥1000 HITs approved) 4 to avoid careless contributions. Several control questions were used to filter out fake and careless entries. This left us with 193 completed and valid responses. Age (18-64, median 30) and gender (104 male, 89 female) information indicated an adequate distribution. Pictures of each participant were crawled after the study. This resulted in a total of 54,962 pictures.

4 HITs (Human Intelligence Tasks) represent the assignments a user has participated in on Amazon Mechanical Turk prior to this study.

To analyze the content of the pictures, we used the Google Vision API. The Google Vision API uses a deep neural network to analyze the pictures and assign tags ("description") with a confidence level ("score": r ∈ [0,1]) to classify the content (example given in Listing 1).

    [{
        "score": 0.8734813,
        "mid": "/m/06__v",
        "description": "snowboard"
    }, {
        "score": 0.8640924,
        "mid": "/m/01fklc",
        "description": "pink"
    }, {
        "score": 0.81754106,
        "mid": "/m/0bpn3c2",
        "description": "skateboarding equipment and supplies"
    }, {
        "score": 0.8131781,
        "mid": "/m/06_fw",
        "description": "skateboard"
    }, {
        "score": 0.7329241,
        "mid": "/m/05y5lj",
        "description": "sports equipment"
    }, {
        "score": 0.64866644,
        "mid": "/m/02nnq5",
        "description": "longboard"
    }]

Listing 1. Example JSON file returned by the Google Vision API for one picture

Using the Google Vision API, we were able to retrieve 4090 unique labels from the Instagram pictures. In order to create an initial clustering of the labels, we used a k-means clustering method that is applied to the vectors that represent the terms in the joint vector space. The vectors were generated with the doc2vec approach using a set of embeddings that are pre-trained on the English Wikipedia. 5 Using this method we collated the labels into 400 clusters. 6 After that, the output of the k-means was manually checked and the clusters were further (manually) collated into similar categories. This resulted in 17 categories.

5 https://github.com/jhlau/doc2vec
6 The k-means clustering method allows for setting a parameter for the number of clusters to be forced. Different numbers of clusters were tried out. Setting the k-means to automatically define 400 clusters resulted in clusters with the fewest errors in clustering the labels.
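The label-processing steps can be illustrated with a minimal Python sketch: it reads a response shaped like Listing 1, keeps the label descriptions, and groups label vectors with a plain k-means (Lloyd's algorithm). The function names, the `min_score` threshold, and the two-dimensional toy vectors are assumptions made for illustration only. The study itself used doc2vec embeddings pre-trained on the English Wikipedia and 400 clusters, neither of which is reproduced here.

```python
import json
import random

# Response shaped like Listing 1 (three of the six labels, values copied from the listing).
VISION_RESPONSE = """[
  {"score": 0.8734813, "mid": "/m/06__v", "description": "snowboard"},
  {"score": 0.8640924, "mid": "/m/01fklc", "description": "pink"},
  {"score": 0.64866644, "mid": "/m/02nnq5", "description": "longboard"}
]"""


def extract_labels(response_text, min_score=0.0):
    """Return label descriptions whose confidence score is at least min_score.

    min_score is a hypothetical filter; the paper does not report a threshold.
    """
    return [
        item["description"].strip()
        for item in json.loads(response_text)
        if item["score"] >= min_score
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per input vector."""
    rng = random.Random(seed)
    # Start from k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: attach every vector to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged: no vector changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Toy 2-D vectors standing in for the doc2vec label embeddings (made-up values).
LABEL_VECTORS = {
    "snowboard": [0.90, 0.10],
    "skateboard": [0.80, 0.20],
    "longboard": [0.85, 0.15],
    "pink": [0.10, 0.90],
    "red": [0.15, 0.85],
}

labels = list(LABEL_VECTORS)
clusters = dict(zip(labels, kmeans([LABEL_VECTORS[name] for name in labels], k=2)))
```

With these toy vectors the board-related labels end up in one cluster and the color labels in the other. The manual step in which the 400 automatic clusters were merged into 17 categories has no counterpart in this sketch.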
The 17 categories are:

 1. Architecture
 2. Body parts
 3. Clothing
 4. Music instruments
 5. Art
 6. Performances
 7. Botanical
 8. Cartoons
 9. Animals
10. Foods
11. Sports
12. Vehicles
13. Electronics
14. Babies
15. Leisure
16. Jewelry
17. Weapons

For each participant, we accumulated the number of category occurrences in their Instagram picture-collection. Since the number of Instagram pictures in each picture-collection is different, we normalized the number of category occurrences to represent a range of r ∈ [0,1]. This was done in order to be able to compare users with differences in the total number of pictures.

RESULTS
We used Spearman’s correlation analysis to analyze the correlations between the picture content categories and personality traits. Alpha levels were adjusted using the Bonferroni correction to limit the chances of a Type I error. The reported significant results adhere to alpha levels of p <.001 (see Table 1). Several correlations were found that indicate a higher usage of posting pictures with a certain content depending on personality traits. The correlations between the picture content categories and personality traits are discussed below.

  #   Category               O         C         E         A         N
  1   Architecture        -0.009    -0.009     0.044    -0.002    -0.043
  2   Body parts          -0.039    -0.075     0.023     0.115     0.108
  3   Clothing             0.040     0.148*    0.110     0.234*   -0.184*
  4   Music instruments    0.156*    0.133     0.034     0.049    -0.081
  5   Art                  0.048    -0.003     0.122     0.111    -0.065
  6   Performances         0.105     0.113     0.088     0.051    -0.027
  7   Botanical            0.002    -0.034    -0.074     0.099     0.057
  8   Cartoons             0.027    -0.040     0.053     0.050    -0.076
  9   Animals              0.008    -0.003    -0.008    -0.015     0.112
 10   Foods               -0.069     0.027    -0.012    -0.029    -0.016
 11   Sports              -0.087     0.156*    0.023    -0.003    -0.135
 12   Vehicles            -0.067     0.054     0.024     0.054    -0.028
 13   Electronics         -0.057     0.097     0.167*    0.062    -0.132
 14   Babies              -0.009     0.024    -0.026     0.010     0.058
 15   Leisure             -0.042     0.112     0.085     0.180*   -0.124
 16   Jewelry             -0.055    -0.070    -0.052    -0.017     0.188*
 17   Weapons              0.009     0.096    -0.019     0.041     0.032

Table 1. Spearman’s correlation between picture content categories and personality traits (O: openness to experience, C: conscientiousness, E: extraversion, A: agreeableness, N: neuroticism). Significant correlations after Bonferroni correction are marked with an asterisk (p <.001).

Openness to experience: Openness to experience was found to correlate with the music instruments category (category #4). This shows that those scoring high in the openness to experience trait in general post more pictures containing music instruments.

Conscientiousness: A positive correlation was found between conscientiousness and the categories #3 (clothing) and #11 (sports). This indicates that conscientious participants more frequently shared pictures with clothing or sports content.

Extraversion: We found a correlation between electronics (category #13) and extraversion. Extraverts tend to post pictures of electronics on their Instagram account.

Agreeableness: Positive correlations were found between agreeableness and the categories #3 (clothing) and #15 (leisure). This means that the Instagram picture-collections of agreeable participants contain more pictures with clothing or leisure content.

Neuroticism: A negative correlation was found between neuroticism and category #3 (clothing), and a positive correlation was found with category #16 (jewelry). The results show that people who score high on neuroticism tend to have fewer pictures with clothing content, but in general have more content with jewelry.

CONCLUSION AND OUTLOOK
We found the content of Instagram pictures to be correlated with personality. A summary of the correlations between the picture content and personality traits can be found in Table 2.

 Personality                Picture content
 Openness to experience     Music instruments
 Conscientiousness          Clothing, sports
 Extraversion               Electronics
 Agreeableness              Clothing, leisure
 Neuroticism                Clothing (-), jewelry

Table 2. Interpretation and summary of the correlations found between personality traits and picture properties. Unless indicated with "(-)," the results indicate positive correlations. The content correlations apply for the pictures of participants who score high in the respective personality trait.
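As an illustration of the analysis behind these correlations, the following Python sketch normalizes per-user category counts and computes Spearman's rank correlation together with a Bonferroni-adjusted alpha. Several details are assumptions, since the paper does not state them: counts are normalized by dividing by the user's total number of category occurrences, the family-wise level is taken to be 0.05, and the number of tests is taken to be 17 categories times 5 traits. All function names and data values are hypothetical. P-values are not computed here; a statistics library such as SciPy (`scipy.stats.spearmanr`) returns both the coefficient and its p-value.

```python
def normalize_counts(category_counts):
    """Scale a user's category counts to shares in [0, 1] (assumed normalization)."""
    total = sum(category_counts.values())
    if total == 0:
        return {category: 0.0 for category in category_counts}
    return {category: n / total for category, n in category_counts.items()}


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for position in range(i, j + 1):
            ranks[order[position]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha level after Bonferroni correction."""
    return alpha / n_tests


# Made-up example: share of clothing pictures and agreeableness scores of four users.
clothing_share = [0.10, 0.20, 0.30, 0.40]
agreeableness = [2.0, 3.0, 3.5, 4.5]
rho = spearman(clothing_share, agreeableness)

# 17 categories x 5 traits = 85 tests; the 0.05 family-wise level is an assumption.
adjusted_alpha = bonferroni_alpha(0.05, 17 * 5)
```

Under these assumptions the adjusted per-test level is roughly 0.0006, which is of the same order as the p <.001 threshold reported with Table 1.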
The identification of the correlations between image categories and user personality is the first step towards unobtrusive personality detection and personalization. In future work we plan to use the automatically detected categories as features for the unobtrusive prediction of personality using machine learning techniques. With this work we are complementing prior work of Ferwerda et al. [9, 10] in which they used the picture properties of Instagram pictures to find relations with personality traits as well as to create a predictive model of personality traits. Future work will focus on combining the relevant picture features of prior work with the categories that we laid out in this work to improve the predictive models that can be created for personality prediction.

ACKNOWLEDGEMENTS
We would like to thank Marcin Skowron for his help and expertise on processing the data into clusters.

REFERENCES
 1. Iván Cantador, Ignacio Fernández-Tobías, and Alejandro Bellogín. 2013. Relating personality types with user preferences in multiple entertainment domains. In CEUR Workshop Proceedings. Shlomo Berkovsky.
 2. Fabio Celli, Elia Bruni, and Bruno Lepri. 2014. Automatic personality and interaction style recognition from facebook profile pictures. In Proceedings of the 22nd ACM international conference on Multimedia. ACM, 1101–1104.
 3. Guanliang Chen, Dan Davis, Claudia Hauff, and Geert-Jan Houben. 2016. On the impact of personality in massive open online learning. In Proceedings of the 2016 conference on user modeling adaptation and personalization. ACM, 121–130.
 4. Li Chen, Wen Wu, and Liang He. 2013. How personality influences users’ needs for recommendation diversity?. In CHI’13 Extended Abstracts on Human Factors in Computing Systems. ACM, 829–834.
 5. Bruce Ferwerda, Mark Graus, Andreu Vall, Marko Tkalcic, and Markus Schedl. 2016. The influence of users’ personality traits on satisfaction and attractiveness of diversified recommendation lists. In 4th Workshop on Emotions and Personality in Personalized Systems (EMPIRE) 2016. 43.
 6. Bruce Ferwerda and Markus Schedl. 2014. Enhancing Music Recommender Systems with Personality Information and Emotional States: A Proposal. In UMAP Workshops.
 7. Bruce Ferwerda and Markus Schedl. 2016. Personality-Based User Modeling for Music Recommender Systems. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 254–257.
 8. Bruce Ferwerda, Markus Schedl, and Marko Tkalcic. 2015a. Personality & Emotional States: Understanding Users’ Music Listening Needs. In UMAP Workshops.
 9. Bruce Ferwerda, Markus Schedl, and Marko Tkalcic. 2015b. Predicting personality traits with instagram pictures. In Proceedings of the 3rd Workshop on Emotions and Personality in Personalized Systems 2015. ACM, 7–10.
10. Bruce Ferwerda, Markus Schedl, and Marko Tkalcic. 2016. Using instagram picture features to predict users’ personality. In International Conference on Multimedia Modeling. Springer, 850–861.
11. Bruce Ferwerda, Marko Tkalcic, and Markus Schedl. 2017. Personality Traits and Music Genres: What Do People Prefer to Listen To?. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization. ACM, 285–288.
12. Bruce Ferwerda, Emily Yang, Markus Schedl, and Marko Tkalcic. 2015. Personality traits predict music taxonomy preferences. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 2241–2246.
13. Jennifer Golbeck, Cristina Robles, and Karen Turner. 2011. Predicting personality with social media. In CHI’11 extended abstracts on human factors in computing systems. ACM, 253–262.
14. Sajanee Halko and Julie A Kientz. 2010. Personality and persuasive technology: an exploratory study on health-promoting mobile applications. In International Conference on Persuasive Technology. Springer, 150–161.
15. Rong Hu and Pearl Pu. 2009. Acceptance issues of personality-based recommender systems. In Proceedings of the third ACM conference on Recommender systems. ACM, 221–224.
16. Oliver P John and Sanjay Srivastava. 1999. The Big Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of personality: Theory and research 2, 1999 (1999), 102–138.
17. Aniket Kittur, Ed H Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk. In Proceedings of the SIGCHI conference on human factors in computing systems. ACM, 453–456.
18. Michal Kosinski, David Stillwell, and Thore Graepel. 2013. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences of the United States of America 110, 15 (mar 2013), 5802–5. DOI: http://dx.doi.org/10.1073/pnas.1218772110
19. Michael J Lee and Bruce Ferwerda. 2017. Personalizing online educational tools. In Proceedings of the 2017 ACM Workshop on Theory-Informed User Modeling for Tailoring and Personalizing Interfaces. ACM, 27–30.
20. S C Matz, M Kosinski, G Nave, and D J Stillwell. 2017. Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences 114, 48 (nov 2017), 12714–12719. DOI: http://dx.doi.org/10.1073/pnas.1710966114
21. Daniele Quercia, Michal Kosinski, David Stillwell, and Jon Crowcroft. 2011. Our Twitter profiles, our selves: Predicting personality with Twitter. In Proceedings of the International Conference on Social Computing (SocialCom). IEEE, 180–185.
22. William Revelle. 2009. Personality structure and measurement: The contributions of Raymond Cattell. British Journal of Psychology 100, S1 (2009), 253–257.
23. Cristina Segalin, Fabio Celli, Luca Polonio, Michal Kosinski, David Stillwell, Nicu Sebe, Marco Cristani, and Bruno Lepri. 2017. What your Facebook Profile Picture Reveals about your Personality. In Proceedings of the 2017 ACM on Multimedia Conference, MM 2017, Mountain View, CA, USA, October 23-27, 2017. 460–468. DOI: http://dx.doi.org/10.1145/3123266.3123331
24. Marcin Skowron, Marko Tkalčič, Bruce Ferwerda, and Markus Schedl. 2016. Fusing social media cues: personality prediction from twitter and instagram. In Proceedings of the 25th international conference companion on world wide web. International World Wide Web Conferences Steering Committee, 107–108.
25. Kirsten A Smith, Matt Dennis, and Judith Masthoff. 2016. Personalizing reminders to personality for melanoma self-checking. In Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization. ACM, 85–93.
26. Marko Tkalčič, Bruce Ferwerda, David Hauger, and Markus Schedl. 2015. Personality correlates for digital concert program notes. In International Conference on User Modeling, Adaptation, and Personalization. Springer, 364–369.