=Paper= {{Paper |id=None |storemode=property |title=Colors of the street: color as an image visualization parameter of Twitter pictures from Brazil's 2013 protests |pdfUrl=https://ceur-ws.org/Vol-1210/datawiz2014_06.pdf |volume=Vol-1210 |dblpUrl=https://dblp.org/rec/conf/ht/HonoratoCGC14 }} ==Colors of the street: color as an image visualization parameter of Twitter pictures from Brazil's 2013 protests== https://ceur-ws.org/Vol-1210/datawiz2014_06.pdf
Colors of the street: color as an image visualization parameter of Twitter pictures from Brazil's 2013 protests

Johanna I. Honorato (johonorato@labic.net)
Lucas O. Cypriano (lucascypriano@gmail.com)
Fábio Goveia (fabiogoveia@labic.net)
Lia Carreira (liacarreira@labic.net)

Labic, Universidade Federal do Espírito Santo
514, Fernando Ferrari Ave. - Vitória, ES – Brazil – CEP: 29.075-910
+55 27 4009-2752

ABSTRACT
This paper discusses color as a methodological tool in the analysis of large quantities of images. To that end, it presents a series of studies carried out by two data analysis labs, the Software Studies Initiative (USA) and Labic, the Laboratory of Image and Cyberculture Studies (Brazil), in order to illustrate the different uses of color. It also presents Labic's recent research on color as a parameter for the analysis of 85,595 images linked to the Twitter hashtag #vemprarua, an important hashtag related to Brazil's 2013 protests. The paper thus highlights the importance of colors as parameters, while identifying issues and contributions to contemporary data science.

Categories and Subject Descriptors
I.4.8 [Image Processing And Computer Vision]: Scene Analysis – color.

General Terms
Measurement, Documentation, Design, Standardization.

Keywords
Big Data, Colors, Data visualization, Image, #Vemprarua, Image analysis.

1. INTRODUCTION
The production, dissemination and storage of digital images have reached a large scale thanks to rapid technological advances and wider accessibility in contemporary society. Image production, with its multiplying variety of tools and apps for online sharing, has boosted this ever-changing scenario, which is therefore an important and complex contemporary context to be studied and better understood.

Unlike contemporary semantic studies (already a well-developed research field, with well-established tools and software), the analysis of large amounts of images is still underexplored: fewer tools and studies are currently available for image data mining, visualization and analysis. Image processing and storage require large memory capacity and powerful devices, as well as specialized professionals. Although in recent years these processes have become more accessible to all sorts of researchers (given the vast developments and lower prices of the technology), the extraction and analysis of large amounts of images remain a challenge due to their peculiarities.

In this research scenario, images are analyzed through different data parameters, such as sharing frequency, time and/or size, in order to create all sorts of visualizations. This paper, however, focuses on studies that use different types of color information (such as hue, brightness and saturation) as parameters for the analysis and visualization of image data¹. Our goal here is to study the importance of color in revealing a variety of patterns and dissonances that can help us better understand the context and modes of image production today.

¹ This paper, due to its size limits and practical purposes, does not aim to present an analysis of the context of visual studies and visual perception, although Labic recognizes its importance to the field.

Our paper therefore focuses on data collected from the #vemprarua hashtag of the 2013 Brazilian protests on Twitter, retrieved from the 15th of June to the 15th of July of that year. The 2013 protests became a large movement that gained the support and participation of millions of people around the world. With this large engagement, social media websites gained great relevance, enabling protesters to rapidly share pictures and ideas and to promote a variety of debates and events. They also enabled people at home to become part of this social movement by sharing information, thus helping to promote the events and spread the news. Given the importance of the protests to Brazil's social and political context, this paper also aims to better understand their contexts and repercussions through image analysis using color parameters.

2. INTERNATIONAL STUDIES USING COLOR AS A METHOD FOR IMAGE ANALYSIS IN BIG DATA RESEARCH
2.1 Color Analysis in Visual Arts
Visual art, such as painting, is one of many spheres in which patterns can be revealed through color analysis. Painters make use of a variety of colors to produce their works of art and to establish themselves within a specific artistic style. Using colors as parameters when creating visualizations of these artworks, we can perceive and analyze differences between artists and their works, enabling comparative analysis or even the analysis of what can be called the "stylistic development" of a particular painter.

On this matter, the Software Studies Initiative published in June 2011 a study analyzing two visual art collections, one by Piet Mondrian and the other by Mark Rothko. The study was based on the visual elements of their images (such as hue, brightness and saturation), revealing patterns not only among the works themselves but also between the artists. The purpose of that particular study was to compare a certain number of Mondrian's paintings to Rothko's produced in similar periods of time in their
careers. Through this comparison, the study found that their initial predominant artistic styles were related to the styles of their predecessors. It also found that, as the years went by, Mondrian and Rothko began to differentiate their color palettes, which the study interpreted as the artists' concern with developing their own styles and thus diverging from figurativism².

² Images and more information on the study are available at http://lab.softwarestudies.com/2011/06/mondrian-vs-rothko-footprints-and.html

Another image analysis by the Software Studies Initiative related color to time. That study compared Van Gogh's period in Paris with his time in Arles, and showed that the set of images from the latter contrasted with the set from the former due to its higher saturation and brightness, a result of the painter's new color experiments. The visualization proposed by Software Studies thus suggests that Van Gogh's paintings were influenced by the spatial changes in his life, as he moved from one city to the other.

2.2 Phototrails' Color Analysis
In July 2013, Nadav Hochman, Lev Manovich and Jay Chow, researchers from the Software Studies Initiative, developed a project called Phototrails. Their goal was to explore, on a planetary scale, visual and dynamic patterns and structures of user-generated content on Instagram. Through visualizations created using images from that photo-sharing network, the study showed how temporal changes and visual features of different locations can reveal their social, cultural and political characteristics, as well as people's habits around the world. In one of their analyses, the researchers drew, from millions of images captured from Instagram, random samples for various cities, each containing 50,000 images. From that dataset, basic visual information (such as average color, brightness, saturation, number of edges, contrast, etc.) was extracted to create different visualizations and thus highlight each city's visual identity in a specific period of time³.

³ The research results and images are available at http://firstmonday.org/ojs/index.php/fm/article/view/4711/3698

2.3 Flickr Flow's Color Analysis
Flickr Flow is a 2009 project by data visualization researchers Fernanda Viégas and Martin Wattenberg, which serves as an example of image visualization by color that uses contemporary photographs as its data. The study started from collections of photographs of Boston Common found on and extracted from the photo-sharing network Flickr. With that data, the researchers divided the photos by month and calculated the relative proportions of their colors. The project's next step was to plot a wheel-shaped dataviz using both color and time as its parameters⁴.

⁴ The research results and images are available at http://hint.fm/projects/flickr/

In the resulting visualization, differences between the seasons of that particular year can be identified through the pattern of color variation. At the bottom of the visualization, one can identify a large amount of grays, whites and lighter colors, which represent winter. Moving clockwise, one can then observe an increase in more vivid colors (variations of pink, purple, green and yellow), representing spring. Following this pattern, one can also observe the other seasons, with fall indicated by yellows and oranges and summer by large amounts of bright colors and very few white tones.

3. #VEMPRARUA's COLOR ANALYSIS
3.1 2013 Brazilian Protests and Their Hashtag #vemprarua
The previous studies conducted by the Software Studies Initiative have shown the vast possibilities of image analysis using colors as parameters, bringing great contributions to data science. Taking into consideration the contribution they have also made to the analysis of social and cultural behavior and patterns through image visualization, Labic has been developing, using other tools and visualizations, a study through which we can better understand the complexity and variety of political and social issues involved in the emergence of the June 2013 protests.

The objective of this study, named "Visagem", is to analyze the Twitter hashtag #vemprarua (which can be translated as "come to the streets"), an iconic expression of the Brazilian protests and thus the most used hashtag to refer to this particular social movement on social media websites. The 2013 protests in general were against government corruption and poor government financial administration associated with the 2014 World Cup, and for better transport, security and education throughout the country. It is thus an important social and political movement that deserves special attention and research.

3.2 #Vemprarua's Datamining Process
Our datamining method was based on retrieving data from the popular social networking service Twitter through a software called yourTwapperKeeper (a.k.a. YTK), which uses the Twitter API to access and extract the necessary data. With this method, all tweets containing the hashtag #vemprarua were collected, creating a csv file with all the available information (such as who tweeted, date of publication, number of retweets, etc.).

With this csv file, Labic used a Java-based script called Crawler, developed by our lab, whose function is to separate tweets that contain links from those that don't. After this step, the script accesses each tweeted link and captures the images that meet the parameters previously set by our researchers, such as a minimum size of 15 kB or 200 x 200 px, and the file extensions PNG, JPG, JPEG, TIF or TIFF.

Between June 15 and July 15 of 2013 (a critical period in that year's political and social protests), we extracted 85,595 images, originating from a total of 404,006 tweets. These images, despite being retrieved only from Twitter, came originally from a variety of websites and apps, such as online news websites, blogs and other social networking websites, and were then shared by several social media profiles.

3.3 #Vemprarua's Color Parameters
In order to analyze this large amount of images, this research used the HSB color scale, in which the color of each pixel in an image is composed of three numeric values: hue, brightness and saturation. Hue values go from 0 to 255 (equivalent to 0° to 360°), thus forming a color circle. Brightness ranges from 0 to 100, where 0 means no light (black) and 100 means maximum presence of light (white). Finally, saturation follows the same numerical range as brightness, from 0 to 100, with 0 being an absence of tone (presence of grays) and 100 being fully saturated colors (no grays). This color scale helps identify groups of images with close measurements, and also allows images to be organized using these same parameters.

With this settled, "Measure" (a plug-in of the software ImageJ) was used to read the values of each pixel and then calculate its hue, brightness and saturation. Thus, it
was through these three color parameters that this study was able to develop different visualizations and analyses of large amounts of images from the #vemprarua movement, enabling the researchers to identify certain patterns and characteristics.

3.4 Analysis of #vemprarua's Image Visualizations
With the images captured through YTK and the visual information gathered by the Measure plug-in, it was then possible to plot different visualizations in which these large volumes of data can be compared. To make these plots, we used a software called ImagePlot, which was developed by the Software Studies Initiative, housed within the UCSD Division of the California Institute for Telecommunications and Information Technology.

3.4.1 Brightness x Saturation
These plots, which visually highlight sets of images separated by their color parameters, made an analysis possible. In the visualization below (Figure 1), #vemprarua's images are distributed across three major groups: a whiter set, mostly found in the upper left quadrant; a darker set, found at the base of the dataviz; and a more colorful set, in the upper right quadrant.

Figure 1 - 85,595 images sorted by X-axis (saturation median) and Y-axis (brightness median)⁵

⁵ It is important to note that the visualizations presented here were created to be viewed on larger digital displays that allow zooming and user interaction. For a better visualization experience, this image is available in high definition at http://zoom.it/G8jg

In the first group, with predominantly white images, one can notice a greater presence of posters, prints of documents and newspaper covers that were shared throughout the months in which the protests occurred. The distribution and circulation of this type of information was predominant from June 15 to July 15, when, among other content, it is possible to find: posters aiming to motivate people's participation in the protests, as well as to share more information on the objectives and schedules of the events; documents and newspaper covers containing information from mass media; and others.

The second group consists of images taken during the protests. They have a grayish tone, due to the predominance of the color of the streets' asphalt, visible in the daytime as well as at night (although in a darker tone), making it one of the most striking features of these events.

The third group aggregates posters and advertisements attached to the content shared with the #vemprarua hashtag. The posters in this group move away from the previous black-and-white pattern towards more vivid colors, mostly blue, green and yellow, and are thus largely associated with Brazil's national flag.

3.4.2 Hue x Brightness and Hue x Saturation
The next visualizations (Figures 2 and 3) arrange the images in #vemprarua's dataset according to their color bands, with the parameters changed to hue (X axis) and brightness or saturation (Y axis). Groups of similar images are thus clearly marked, and the frequency with which certain types of images appear throughout the collection is easier to understand. The bands that stand out are red, orange, yellow, green, blue, and the combination of purple and pink.

Figure 2 - 85,595 images sorted by X-axis (hue median) and Y-axis (brightness median)⁶

⁶ Available in high definition at http://zoom.it/QCYi

Figure 3 - 85,595 images sorted by X-axis (hue median) and Y-axis (saturation median)⁷

⁷ Available in high definition at http://zoom.it/tAaW

In these visualizations, the first color range has a larger number of images than the rest of the dataset. The predominant images in this group are photos taken at the time of the protests, even if they were shared later on by users. A closer look at the predominantly orange area reveals images characterized by the yellow-orange street lighting, as well as photos of confrontations between police and protesters, which often involved fires, explosions and rubber bullets fired by police, captured by the lenses of vigilant photographers and protesters.

The green color range is basically formed by reproductions of the national flag of Brazil and by its re-appropriations: these images vary in size, color and type, and are occasionally inserted into green posters. On some of these posters, the white band that bears the inscription "Order and Progress" (Ordem e Progresso) in Brazil's flag was replaced by "In Progress" (Em Progresso), meaning that the country was in a state of change led by the people.

The blue color range is also mostly composed of images of the Brazilian flag, focusing on its inner circle. This group also contains many photos from Instagram, due to one of its available filters. The last color range, covering the pink and purple tones, consists of pictures of the protests that were intentionally faded (with the use of filters, for example) and posters intended to represent a more feminine approach.

3.4.3 Color Visualization by Hue with Static Brightness and Saturation

Figure 4a and 4b - Visualization using the median values of hue, saturation and brightness

Instead of placing each image in a specific position determined by its color parameters, we propose in Figure 4 two different kinds of visualization, each using a different set of color parameters. For the first dataviz, we used the hue median, saturation median and brightness median to compose the color of the square representing each image. For the second dataviz, we used just the hue median of each image; saturation and brightness were set to a standard value. So in Figure 4a we have a visualization of all color parameters of each image, and in Figure 4b it is possible to see just the hue value more clearly. These visualizations were created using a script developed in our lab with Processing, in order to visualize the color medians previously calculated by ImageJ. In these dataviz, each image is represented by a square of 2 by 2 pixels, with images of hue median 0 at the top and images of hue median 255 at the bottom. As a result, these recent visualizations developed at Labic highlight the large color variations of Brazil's 2013 social and political protests, showing their characteristic visual aspect.

4. CONCLUSION
Throughout this paper, we aimed to understand the importance of using colors as parameters for visualizing large amounts of images. Both in the artistic field and in studies of social movements, color values can reveal more than numeric information: they can also highlight characteristic visual aspects or styles, as well as point out patterns and singularities in the datasets. As shown in this paper, image sets can be viewed on their own or in comparison to other image sets. In both cases, color parameters present themselves as a relevant and simple method for big data analysis.

However, visualizations made with software such as ImageJ have certain limitations that can hinder a deeper analysis of the datasets. With this kind of visualization tool, image plots can be made using only two coordinates (X and Y). Thus, when an image has the same coordinates as another, they overlap and visual information is lost. Considering this problem, we perceive a need for a tool capable of adding a Z axis, allowing a 3D environment in which image information would not be lost and user interaction would be enhanced.

On the other hand, when using Processing, each image is represented by its corresponding pixels, and thus no overlap occurs. In the dataviz presented in 3.4.3 (Figure 4a), each color range in the dataset is clearly represented, confirming what had previously been observed through the analysis of the first visualizations created with ImageJ (Figures 1, 2 and 3): that mostly orange-toned pictures were shared during the protests. This emphasizes the frequent need for different visualizations of the same dataset, compared with each other, in order to identify or confirm certain characteristics.

The second dataviz in 3.4.3 (Figure 4b) follows the same principles as the first, in which the image is represented by its pixels' color, brightness and saturation values (HSB), showing that despite the predominant color being orange, the protests' general tone is dark, which again points to a specific characteristic of the June protests: they were predominantly evening events.

Thus, this paper acknowledges that visual characteristics (such as hue, brightness and saturation), when used as parameters to organize large amounts of images, can reveal artistic patterns as well as social, cultural and behavioral patterns. Image visualizations using color parameters can therefore present more than numerical values, also pointing to various perspectives on a given event or practice.

5. ACKNOWLEDGMENTS
Our thanks to the Federal University of Espírito Santo (UFES), the National Council for Scientific and Technological Development (CNPq) and the Espírito Santo Research Support Foundation (FAPES) for the financial support of this research. This research is part of the "Visagem" project of the Laboratory of Image and Cyberculture Studies (Labic).

6. REFERENCES
[1] Hochman, N., Manovich, L., Chow, J. 2014. Phototrails. Available at: <http://phototrails.net/>.
[2] Manovich, L. 2012. Data Visualization and Computational Art History. Available at:
[3] Manovich, L. 2011. Style Space: How to compare image sets and follow their evolution. Available at:
[4] Manovich, L., Hochman, N. 2013. Zooming into an Instagram City: Reading the local through social media. First Monday, v. 18, n. 7, 2013. Chicago. Available at
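
As an illustration of the color parameters described in Section 3.3 and of the hue ordering used in Section 3.4.3, the following Python sketch computes the median hue, saturation and brightness of an image given as a list of RGB pixels, and then sorts images by hue median. It is not the ImageJ or Processing code used in this study: the function names, the toy pixel data and the use of Python's standard colorsys module are assumptions made only for this example. Hue is rescaled to 0-255 and saturation and brightness to 0-100, following the ranges given in Section 3.3.

```python
import colorsys
from statistics import median


def pixel_to_hsb(r, g, b):
    """Convert one 8-bit RGB pixel to (hue 0-255, saturation 0-100, brightness 0-100)."""
    # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 255.0, s * 100.0, v * 100.0


def image_medians(pixels):
    """Return the median hue, saturation and brightness of a list of RGB pixels."""
    hsb = [pixel_to_hsb(r, g, b) for r, g, b in pixels]
    hues, sats, brights = zip(*hsb)
    return {
        "hue": median(hues),
        "saturation": median(sats),
        "brightness": median(brights),
    }


def order_by_hue(images):
    """Sort image names by hue median, lowest first (hue 0 at the top of the strip)."""
    stats = {name: image_medians(pixels) for name, pixels in images.items()}
    return sorted(stats, key=lambda name: stats[name]["hue"])


# Hypothetical toy "images" of three pixels each (illustrative data only).
images = {
    "flag_green": [(0, 200, 0), (0, 180, 0), (10, 220, 10)],
    "street_orange": [(255, 128, 0), (240, 120, 0), (250, 130, 10)],
    "sky_blue": [(0, 0, 255), (10, 10, 240), (0, 20, 250)],
}

print(order_by_hue(images))  # ['street_orange', 'flag_green', 'sky_blue']
```

Note that a plain median treats hue as a linear value, while hue is in fact circular: reds sit at both ends of the 0-255 range, so a predominantly red image may need special handling. The sketch keeps the simple median only because it mirrors the description in the text.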