You Are What You Post: What the Content of Instagram Pictures Tells About Users’ Personality

Bruce Ferwerda*
Department of Computer Science and Informatics
School of Engineering
Jönköping University
P.O. Box 1026
SE-551 11, Jönköping, Sweden
bruce.ferwerda@ju.se

Marko Tkalcic
Faculty of Computer Science
Free University of Bozen-Bolzano
Piazza Domenicani 3
I-39100, Bozen-Bolzano, Italy
marko.tkalcic@unibz.it

* Also affiliated with the Department of Computational Perception, Johannes Kepler University, Altenberger Strasse 69, 4040, Linz (Austria), bruce.ferwerda@jku.at

©2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. HUMANIZE ’18, March 11, 2018, Tokyo, Japan

ABSTRACT
Instagram is a popular social networking application that allows users to express themselves through the uploaded content and the different filters they can apply. In this study we look at the relationship between the content of the uploaded Instagram pictures and the personality traits of users. To collect data, we conducted an online survey where we asked participants to fill in a personality questionnaire, and grant us access to their Instagram account through the Instagram API. We gathered 54,962 pictures of 193 Instagram users. Through the Google Vision API, we analyzed the pictures on their content and clustered the returned labels with the k-means clustering approach. With a total of 17 clusters, we analyzed the relationship with users’ personality traits. Our findings suggest a relationship between personality traits and picture content. This allows for new ways to extract personality traits from social media trails, and new ways to facilitate personalized systems.

Author Keywords
Personality, Instagram, picture content, social media

INTRODUCTION
Personality traits have been shown to be a useful concept to rely on when considering personalizations of user experiences in a system. This is because personality has been shown to be a stable construct over time, and reflects the coherent patterning of one’s affect, cognition, and desires (goals) as it leads to behavior [22]. The stability and coherence that personality brings have been shown to be useful for systems to infer users’ preferences and to provide personalized experiences to users (e.g., [6]). Systems that use personality-based personalizations have been shown to have an advantage over systems not using personality information [15]; an advantage is created in terms of increased users’ loyalty towards the system and decreased cognitive effort.

The usefulness of personality for personalization is shown in its domain independency: once the personality of users is known, it can be used across domains for personalization [1]. This allows for personality extraction in one domain and implementation in another. Hence, the relationships between personality traits and users’ behavior, preferences, and needs are increasingly being investigated (e.g., health [14, 25], education [3, 19], movies [4], music [6, 8, 5, 7, 11, 26], marketing [20]) in order to learn about the connection between personality traits and specific behaviors.

Since personality traits of users are increasingly being used to provide a personalized experience to users, there is an increased interest in how to implicitly acquire these traits for implementation. Social networking services (SNSs) are a useful source of information. SNSs are increasingly interconnected with applications through so-called "single sign-on buttons" (SSO buttons).¹ The abundance of information that becomes available from the connected SNSs can be used to infer users’ personality traits (e.g., Facebook [7], Twitter [21, 24], and Instagram [9, 10]).

¹ Buttons that allow users to easily register and log in to a system with their social media account.

In this work we join the personality extraction research. We specifically focus on Instagram, a popular mobile photo-sharing application and SNS, with currently over 800 million users.² With the content as well as with the filters that Instagram allows users to apply to their pictures, users are able to express a personal style and create a seeming distinctiveness. Hence, personality information about users may be hidden in the pictures that users upload to Instagram. Whereas prior work on Instagram focused on the picture properties (i.e., hue, saturation, valence relationship) [9, 10], we focus on the content of the posted pictures on Instagram and explore the relationship with the personality traits of Instagram users. By analyzing the Instagram pictures on their content using the Google Vision API³, we were able to find distinct correlations between users’ personality traits and the content of the pictures they post on Instagram.

² https://instagram.com/press/ (accessed: 08/12/2017)
³ https://cloud.google.com/vision/

RELATED WORK
There is an increasing body of work that looks at how to implicitly acquire personality traits of users. Since all kinds of information can relate to personality traits, even information that is not directly relevant for a specific purpose may contain information that is useful for the extraction of personality (e.g., Facebook [7], Twitter [21, 24], and Instagram [9, 10]). The increased connectedness between SNSs and applications through SSO buttons provides an abundance of information that can be exploited to implicitly acquire personality traits of users.

Quercia et al. [21] looked at Twitter profiles and were able to predict users’ personality traits by using their number of followers, following, and listed counts. With these three characteristics they were able to predict personality scores with a root-mean-square error of 0.88 on a [1,5] scale. Similar work has been done by Golbeck, Robles, and Turner [13] on Facebook profiles. They mainly looked at the sentiment of posted content and were able to create a reliable personality predictor with that information. A more comprehensive work on the prediction of personality and other user characteristics using Facebook likes has been proposed by Kosinski, Stillwell and Graepel [18].

Besides posted content on SNSs, the characteristics of pictures have been shown to contain personality information as well. Celli, Bruni, and Lepri [2] showed that Facebook profile pictures contain indicators of users’ personality. An extension of this work has been recently published [23]. Work of Ferwerda, Schedl, and Tkalcic [12, 10] on Instagram pictures showed that the way filters are applied creates a certain distinctiveness that can be used to predict personality traits of the poster.

In this work we expand the work of Ferwerda et al. [12, 10] on Instagram pictures. Instead of looking at the picture characteristics (i.e., how filters are applied), we look at the posted content itself.

METHOD
To investigate the relationship between personality traits and picture features, we asked participants to fill in the 44-item BFI personality questionnaire (5-point Likert scale; Disagree strongly - Agree strongly [16]). The questionnaire includes questions that aggregate into the five basic personality traits of the FFM. Additionally, we asked participants to grant us access to their Instagram account through the Instagram API, in order to crawl their pictures.

We recruited 233 participants through Amazon Mechanical Turk, a popular recruitment tool for user-experiments [17]. Participation was restricted to those located in the United States, and also to those with a very good reputation (≥95% HIT approval rate and ≥1000 HITs approved)⁴ to avoid careless contributions. Several control questions were used to filter out fake and careless entries. This left us with 193 completed and valid responses. Age (18-64, median 30) and gender (104 male, 89 female) information indicated an adequate distribution. Pictures of each participant were crawled after the study. This resulted in a total of 54,962 pictures.

⁴ HITs (Human Intelligence Tasks) represent the assignments a user has participated in on Amazon Mechanical Turk prior to this study.

To analyze the content of the pictures, we used the Google Vision API. The Google Vision API uses a deep neural network to analyze the pictures and assign tags ("description") with a confidence level ("score": r ∈ [0,1]) to classify the content (example given in Listing 1).

    [{
      "score": 0.8734813,
      "mid": "/m/06__v",
      "description": "snowboard"
    }, {
      "score": 0.8640924,
      "mid": "/m/01fklc",
      "description": "pink"
    }, {
      "score": 0.81754106,
      "mid": "/m/0bpn3c2",
      "description": "skateboarding equipment and supplies"
    }, {
      "score": 0.8131781,
      "mid": "/m/06_fw",
      "description": "skateboard"
    }, {
      "score": 0.7329241,
      "mid": "/m/05y5lj",
      "description": "sports equipment"
    }, {
      "score": 0.64866644,
      "mid": "/m/02nnq5",
      "description": "longboard"
    }]

Listing 1. Example JSON file returned by the Google Vision API for one picture

Using the Google Vision API, we were able to retrieve 4090 unique labels from the Instagram pictures. In order to create an initial clustering of the labels, we used a k-means clustering method that is applied to the vectors that represent the terms in the joint vector space. The vectors were generated with the doc2vec approach using a set of embeddings that are pre-trained on the English Wikipedia.⁵ Using this method we collated the labels into 400 clusters.⁶ After that, the output of the k-means was manually checked and the clusters were further (manually) collated into similar categories. This resulted in 17 categories representing:

1. Architecture
2. Body parts
3. Clothing
4. Music instruments
5. Art
6. Performances
7. Botanical
8. Cartoons
9. Animals
10. Foods
11. Sports
12. Vehicles
13. Electronics
14. Babies
15. Leisure
16. Jewelry
17. Weapons

⁵ https://github.com/jhlau/doc2vec
⁶ The k-means clustering method allows for setting a parameter for the number of clusters to be forced. Different numbers of clusters were tried out. Setting the k-means to define 400 clusters resulted in the clusters with the fewest errors in clustering the labels.

For each participant, we accumulated the number of category occurrences in their Instagram picture-collection. Since the number of Instagram pictures in each picture-collection is different, we normalized the number of category occurrences to represent a range of r ∈ [0,1]. This was done in order to be able to compare users with differences in the total amount of pictures.

RESULTS
We used the Spearman’s correlation analysis to analyze the correlations between the picture content categories and personality traits. Alpha levels were adjusted using the Bonferroni correction to limit the chances of a Type I error. The reported significant results adhere to alpha levels of p < .001 (see Table 1). Several correlations were found that indicate a higher usage of posting pictures with a certain content depending on personality traits. The correlations between the picture content categories and personality traits are discussed below.

Category        O         C         E         A         N
 1           -0.009    -0.009     0.044    -0.002    -0.043
 2           -0.039    -0.075     0.023     0.115     0.108
 3            0.040     0.148*    0.110     0.234*   -0.184*
 4            0.156*    0.133     0.034     0.049    -0.081
 5            0.048    -0.003     0.122     0.111    -0.065
 6            0.105     0.113     0.088     0.051    -0.027
 7            0.002    -0.034    -0.074     0.099     0.057
 8            0.027    -0.040     0.053     0.050    -0.076
 9            0.008    -0.003    -0.008    -0.015     0.112
10           -0.069     0.027    -0.012    -0.029    -0.016
11           -0.087     0.156*    0.023    -0.003    -0.135
12           -0.067     0.054     0.024     0.054    -0.028
13           -0.057     0.097     0.167*    0.062    -0.132
14           -0.009     0.024    -0.026     0.010     0.058
15           -0.042     0.112     0.085     0.180*   -0.124
16           -0.055    -0.070    -0.052    -0.017     0.188*
17            0.009     0.096    -0.019     0.041     0.032

Table 1. Spearman’s correlation between picture content categories (rows, numbered as in the category list) and personality traits (O: openness to experience, C: conscientiousness, E: extraversion, A: agreeableness, N: neuroticism). Significant correlations after Bonferroni correction are marked with an asterisk (p < .001).

Openness to experience: Openness to experience was found to correlate with the music instruments category (category #4). This shows that those scoring high in the openness to experience trait in general post more pictures consisting of music instruments.

Conscientiousness: A positive correlation was found between conscientiousness and the categories #3 (clothing) and #11 (sports). This indicates that conscientious participants more frequently shared pictures consisting of content with clothing or sports.

Extraversion: We found a correlation between electronics (category #13) and extraversion. Extraverts tend to post pictures on their Instagram account consisting of electronics.

Agreeableness: Positive correlations were found between agreeableness and the categories #3 (clothing) and #15 (leisure). This means that the Instagram picture-collections of agreeable participants consist of pictures with clothing or leisure content.

Neuroticism: A negative correlation was found between neuroticism and category #3 (clothing), and a positive correlation was found between neuroticism and category #16 (jewelry). The results show that people who score high on neuroticism tend to have fewer pictures with clothing content, but in general have more content with jewelry.

CONCLUSION AND OUTLOOK
We found the content of Instagram picture features to be correlated with personality. A summary of the correlations between the picture content and personality traits can be found in Table 2.

Personality               Picture content
Openness to experience    Music instruments
Conscientiousness         Clothing, sports
Extraversion              Electronics
Agreeableness             Clothing, leisure
Neuroticism               Clothing (-), jewelry

Table 2. Interpretation and summary of the correlations found between personality traits and picture properties. Unless indicated with "(-)," the results indicate positive correlations. The content correlations apply for the pictures of participants who score high in the respective personality trait.

The identification of the correlations between image categories and user personality is the first step towards unobtrusive personality detection and personalization. In future work we plan to use the automatically detected categories as features for the unobtrusive prediction of personality using machine learning techniques. With this work we are complementing prior work of Ferwerda et al. [12, 10] in which they used the picture properties of Instagram pictures to find relations with personality traits as well as to create a predictive model of personality traits. Future work will focus on combining the relevant picture features of prior work with the categories that we laid out in this work to improve the predictive models that can be created for personality prediction.

ACKNOWLEDGEMENTS
We would like to thank Marcin Skowron for his help and expertise on processing the data into clusters.

REFERENCES
1. Iván Cantador, Ignacio Fernández-Tobías, and Alejandro Bellogín. 2013. Relating personality types with user preferences in multiple entertainment domains. In CEUR Workshop Proceedings. Shlomo Berkovsky.
2. Fabio Celli, Elia Bruni, and Bruno Lepri. 2014. Automatic personality and interaction style recognition from facebook profile pictures. In Proceedings of the 22nd ACM international conference on Multimedia. ACM, 1101–1104.
3. Guanliang Chen, Dan Davis, Claudia Hauff, and Geert-Jan Houben. 2016. On the impact of personality in massive open online learning. In Proceedings of the 2016 conference on user modeling adaptation and personalization. ACM, 121–130.
4. Li Chen, Wen Wu, and Liang He. 2013. How personality influences users’ needs for recommendation diversity? In CHI’13 Extended Abstracts on Human Factors in Computing Systems. ACM, 829–834.
5. Bruce Ferwerda, Mark Graus, Andreu Vall, Marko Tkalcic, and Markus Schedl. 2016. The influence of users’ personality traits on satisfaction and attractiveness of diversified recommendation lists. In 4th Workshop on Emotions and Personality in Personalized Systems (EMPIRE) 2016. 43.
6. Bruce Ferwerda and Markus Schedl. 2014. Enhancing Music Recommender Systems with Personality Information and Emotional States: A Proposal. In UMAP Workshops.
7. Bruce Ferwerda and Markus Schedl. 2016. Personality-Based User Modeling for Music Recommender Systems. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 254–257.
8. Bruce Ferwerda, Markus Schedl, and Marko Tkalcic. 2015a. Personality & Emotional States: Understanding Users’ Music Listening Needs. In UMAP Workshops.
9. Bruce Ferwerda, Markus Schedl, and Marko Tkalcic. 2015b. Predicting personality traits with instagram pictures. In Proceedings of the 3rd Workshop on Emotions and Personality in Personalized Systems 2015. ACM, 7–10.
10. Bruce Ferwerda, Markus Schedl, and Marko Tkalcic. 2016. Using instagram picture features to predict users’ personality. In International Conference on Multimedia Modeling. Springer, 850–861.
11. Bruce Ferwerda, Marko Tkalcic, and Markus Schedl. 2017. Personality Traits and Music Genres: What Do People Prefer to Listen To? In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization. ACM, 285–288.
12. Bruce Ferwerda, Emily Yang, Markus Schedl, and Marko Tkalcic. 2015. Personality traits predict music taxonomy preferences. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 2241–2246.
13. Jennifer Golbeck, Cristina Robles, and Karen Turner. 2011. Predicting personality with social media. In CHI’11 extended abstracts on human factors in computing systems. ACM, 253–262.
14. Sajanee Halko and Julie A Kientz. 2010. Personality and persuasive technology: an exploratory study on health-promoting mobile applications. In International Conference on Persuasive Technology. Springer, 150–161.
15. Rong Hu and Pearl Pu. 2009. Acceptance issues of personality-based recommender systems. In Proceedings of the third ACM conference on Recommender systems. ACM, 221–224.
16. Oliver P John and Sanjay Srivastava. 1999. The Big Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of personality: Theory and research 2, 1999 (1999), 102–138.
17. Aniket Kittur, Ed H Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk. In Proceedings of the SIGCHI conference on human factors in computing systems. ACM, 453–456.
18. Michal Kosinski, David Stillwell, and Thore Graepel. 2013. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences of the United States of America 110, 15 (mar 2013), 5802–5. DOI: http://dx.doi.org/10.1073/pnas.1218772110
19. Michael J Lee and Bruce Ferwerda. 2017. Personalizing online educational tools. In Proceedings of the 2017 ACM Workshop on Theory-Informed User Modeling for Tailoring and Personalizing Interfaces. ACM, 27–30.
20. S C Matz, M Kosinski, G Nave, and D J Stillwell. 2017. Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences 114, 48 (nov 2017), 12714–12719. DOI: http://dx.doi.org/10.1073/pnas.1710966114
21. Daniele Quercia, Michal Kosinski, David Stillwell, and Jon Crowcroft. 2011. Our Twitter profiles, our selves: Predicting personality with Twitter. In Proceedings of the International Conference on Social Computing (SocialCom). IEEE, 180–185.
22. William Revelle. 2009. Personality structure and measurement: The contributions of Raymond Cattell. British Journal of Psychology 100, S1 (2009), 253–257.
23. Cristina Segalin, Fabio Celli, Luca Polonio, Michal Kosinski, David Stillwell, Nicu Sebe, Marco Cristani, and Bruno Lepri. 2017. What your Facebook Profile Picture Reveals about your Personality. In Proceedings of the 2017 ACM on Multimedia Conference, MM 2017, Mountain View, CA, USA, October 23-27, 2017. 460–468. DOI: http://dx.doi.org/10.1145/3123266.3123331
24. Marcin Skowron, Marko Tkalčič, Bruce Ferwerda, and Markus Schedl. 2016. Fusing social media cues: personality prediction from twitter and instagram. In Proceedings of the 25th international conference companion on world wide web. International World Wide Web Conferences Steering Committee, 107–108.
25. Kirsten A Smith, Matt Dennis, and Judith Masthoff. 2016. Personalizing reminders to personality for melanoma self-checking. In Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization. ACM, 85–93.
26. Marko Tkalčič, Bruce Ferwerda, David Hauger, and Markus Schedl. 2015. Personality correlates for digital concert program notes. In International Conference on User Modeling, Adaptation, and Personalization. Springer, 364–369.
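The per-user normalization of category occurrences and the Spearman analysis with a Bonferroni-adjusted threshold, as described in the Method and Results sections, can be illustrated with a minimal sketch. Several details are assumptions rather than the authors' procedure: the paper does not say how counts were scaled to [0,1], so the sketch divides each count by the user's total category occurrences; the function names (`normalize_counts`, `average_ranks`, `spearman`, `bonferroni_alpha`) are hypothetical; and the family-wise alpha of 0.05 over 17 × 5 = 85 tests is only an illustrative choice.

```python
from math import sqrt


def normalize_counts(counts):
    """Turn one user's raw category counts into shares in [0, 1].

    Assumption: shares of the user's total category occurrences.
    """
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in counts}
    return {category: n / total for category, n in counts.items()}


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


# Hypothetical data: clothing shares and agreeableness scores of five users.
users = [
    {"clothing": 3, "sports": 1},
    {"clothing": 1, "sports": 3},
    {"clothing": 2, "sports": 2},
    {"clothing": 4, "sports": 0},
    {"clothing": 0, "sports": 4},
]
clothing_share = [normalize_counts(u)["clothing"] for u in users]
agreeableness = [3.9, 2.8, 3.1, 4.4, 2.5]  # made-up BFI trait scores

rho = spearman(clothing_share, agreeableness)
threshold = bonferroni_alpha(0.05, 17 * 5)  # assumed alpha; 17 categories x 5 traits
print(f"rho = {rho:.3f}, per-test alpha = {threshold:.6f}")
```

The sketch only computes the correlation coefficient. A p-value to compare against the adjusted threshold would in practice come from a statistics library, for example `scipy.stats.spearmanr`, which returns both the coefficient and the p-value.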