IMAGE DIVERSITY ANALYSIS:
                   CONTEXT, OPINION AND BIAS

                          P. Zontone, G. Boato, F. G. B. De Natale (1)
                                A. De Rosa, M. Barni, A. Piva (2)
                            J. S. Hare, D. Dupplaw, P. H. Lewis (3)

  (1) Dep. of Information Engineering and Computer Science, University of Trento, Trento, ITALY
  (2) CNIT - National Inter-University Consortium for Telecommunications, Firenze/Siena, ITALY
  (3) School of Electronics and Computer Science, University of Southampton, Southampton, UK

       Abstract. The diffusion of new Internet and web technologies has increased the
       distribution of different digital content, such as text, sounds, images and videos.
       In this paper we focus on images and their role in the analysis of diversity. We
       consider diversity as a concept that takes into account the wide variety of
       information sources, and their differences in perspective and viewpoint. We
       describe a number of different dimensions of diversity; in particular, we analyze
       the dimensions related to image searches and context analysis, emotions
       conveyed by images and opinion mining, and bias analysis.


1 Introduction

With the advent of digital media, the number of images on the Web has rapidly
increased and consequently the role of images in the communication process has
gained importance. It is well known that an image can capture the attention of a
viewer more than a long sentence. Perhaps the most powerful and meaningful way to
inform, educate and persuade an individual is through the combination of memorable
visual messages with text [1]. A single image may not always convey precise
information or detailed data on a given subject, but an image can often transmit, in a
more effective and immediate way, a message or an emotion. Combining visual data
with textual data therefore greatly enriches the writer's communication. In other
instances, photography is a means for the faithful reproduction of real events, and
photographic images are used for documenting facts. In this scenario, two aspects are
worth mentioning. On one hand, photographers, like writers, choose their own way of
reporting an event when they take pictures. On the other hand, pictures may be
manipulated before use and thus convey information that differs from their original
intent; therefore, the value of photography as a record of events must be established
carefully.
To summarize, images can have three main roles within a communication process.
Pictures can be used to: i) attract the attention of observers: a picture can be included
in a document for attracting attention and making the document more appealing; ii)
convey opinions and emotional messages: an image can be used for conveying an
emotional message with a positive or negative implication; iii) convey information for
documenting a given claim: images can be used to reproduce and document a claim.
Diversity plays an important role on the Internet and in all scenarios characterized by
a large amount of information input from different sources. The information derived
from multimedia content is the result of a clear diversity in cultural backgrounds,
religious beliefs, political beliefs, ideologies and temporal contexts, and has an
evident effect on opinions and bias of every person using such content. We will
consider different dimensions of diversity, and the effect of image diversity on
opinions and bias.
From the previous considerations it is clear that one of the dimensions of diversity in
images is related to the intent or role of images within a communication process.
Considering the impact of this type of diversity on bias, the use of images for
conveying a message or for illustrating a textual message has a strong potential for
bias due to the subtle message that can be conveyed.
In the following, we introduce the concept of diversity and present its image-related
dimensions. We show how the results of image searches and context analysis might
be analyzed for diversity (Section 2.1). Then, we provide an overview of the research
activity in the area of opinion mining and sentiment analysis (Section 2.2). Finally,
we describe bias analysis methods that allow the detection of bias in images (Section
2.3).


2 Diversity analysis

Let us define diversity as the co-existence of contradictory opinions and/or statements
(typically with some being non-factual or referring to opposing beliefs/opinions) [2].
There are several forms and aspects of diversity to be considered:
  − the existence of opinions with different polarity (see footnote 1) about the same
     entity, e.g., at different times;
  − diversity of themes, speakers, arguments, opinions, claims and ideas/frames;
  − diversity of norms, values, behavior patterns, and mentalities;
  − diversity in terms of geographical (local, regional, national, international, global
     focus of information), social (between individuals, between and within groups),
     and systemic (organizational and societal) aspects in media content;
  − static (at one point in time) and dynamic (long-term) diversity;
  − internal diversity (within one source) and external diversity (between sources).
Several dimensions of diversity that can be distinguished in images are also
applicable to text: diversity of sources (e.g., suppliers in commercial search);
diversity of resources (e.g., images, text); diversity of topic; diversity of viewpoint;
diversity of genre (e.g., blogs, news, comments); diversity of language; geographical
diversity; and temporal diversity. In addition, other dimensions specific to images
include:
  − author/holder (person or professional agency who took the picture);
  − time (date and time the picture was taken);
  − location (where it was taken);
  − source (where it was published, e.g., web site, blog, forum, PDF document);

1 The polarity of an opinion is the degree to which a statement is positive, negative or neutral.


  − source producing the picture (if the picture is computer generated or natural, if it
     comes from a digital camera or a scanner);
  − intent (it is used to attract the attention of observers, to convey emotional
     messages, to give information for documenting a given claim);
  − sentiment/opinion (positive, negative or neutral);
  − context (characteristics of the text surrounding the picture, e.g., background of
     author, considered aspect, theme of the text);
  − subject (words that describe what the picture shows and that can be linked to the
     same-similar terms contained in the surrounding text);
  − time of day and season (night/day, summer/winter);
  − style (words describing the style of the photos, e.g., photorealistic, pictorial);
  − pure visual diversity (how visually similar or dissimilar images are).
Some values for these dimensions can be directly extracted from the EXIF
information in the picture, which may have been inserted automatically (by the digital
camera) or manually (by the photographer). If EXIF tags are unavailable, some
features can be derived using image retrieval techniques [3], forensic techniques [4],
and algorithms for automatically annotating images with high-level semantic concepts
[5]. An example of values extracted along these dimensions can be seen in Fig. 2,
which is a clear example of temporal diversity. These pictures have been extracted from
a PDF document entitled ‘Global Warming's Increasingly Visible Impacts’ (see footnote 2).
No EXIF information is available for these pictures, so the features reported below are
the ones that could be derived using the algorithms introduced above. Another example is
shown in Fig. 3, where the subject of global warming is used in a very different way
by the Italian design company DIESEL in its 2007 advertising campaign:
the picture is a composition of computer generated data and photography, with the
intent of attracting attention (a common intent in advertisement images) and
incorporating a glamorous style.
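The image-specific dimensions listed above can be collected in a simple record. The following is only a minimal sketch: the class name, field names and helper method are our own illustrative choices, not part of any existing tool, and the example values are those given in the caption of Fig. 3.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class ImageDiversityRecord:
    """One image described along the image-specific diversity dimensions.

    Every field is optional because a value may be missing from the EXIF
    data and not recoverable by retrieval, forensic or annotation methods.
    """
    author: Optional[str] = None            # person or agency who took the picture
    time: Optional[str] = None              # date and time the picture was taken
    location: Optional[str] = None          # where the picture was taken
    source: Optional[str] = None            # where it was published
    producing_source: Optional[str] = None  # camera, scanner, computer generated
    intent: Optional[str] = None            # attract attention, convey emotion, document
    sentiment: Optional[str] = None         # positive, negative or neutral
    context: Optional[str] = None           # characteristics of the surrounding text
    subject: List[str] = field(default_factory=list)  # words describing the content
    style: Optional[str] = None             # e.g. photorealistic, pictorial

    def known_dimensions(self) -> dict:
        """Return only the dimensions for which a value is available."""
        return {k: v for k, v in asdict(self).items() if v not in (None, [])}


# Values taken from the caption of Fig. 3.
fig3 = ImageDiversityRecord(
    author="Terry Richardson",
    source="web site",
    producing_source="digital camera + computer generated data",
    sentiment="positive",
    subject=["buildings", "water", "woman", "man"],
)
print(fig3.known_dimensions())
```

Leaving unknown dimensions empty, instead of guessing them, keeps visible which values still have to be derived by other means.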


    Fig. 2: Diversity dimensions: source: PDF document, context: climate change, topic/theme:
                                    mountain, lake, rocks, etc.


2   Environmental Defense Fund:
    http://www.edf.org/documents/4891_GlobalWarmingImpacts.pdf


Fig. 3: Diversity dimensions: author: Terry Richardson, source: web site, source producing the
picture: digital camera + computer generated data, sentiment: positive, content: buildings,
water, woman, man, etc.

In the following subsections we will focus on some diversity dimensions, analyzing
the state-of-the-art and possible research directions.

2.1 Diversity in image search and context analysis
Images can play several roles in the analysis of diversity. They might be used along
with text to try to distribute documents along a diversity axis where the documents are
primarily text based and the images play secondary roles both in the context of the
document and in the analysis of their diversity. However, in some searches, image
retrieval may be the main goal and in this section we show how the results of image
searches might be analyzed for diversity.
Diversity in image search is usually considered as a problem of result diversification.
Because web image search engines rely on the textual information associated with an
image, they often neglect the visual diversification of the final results. However, a
user's information need is often better satisfied when the result set for a particular
query shows many different aspects of that query; this is especially important when the
query is poorly specified or ambiguous [6, 7].
The diversification of search in image search engines is a relatively new area of
research. In terms of image search, one particular way of increasing diversity is to
ensure that duplicate or near-duplicate images in the retrieved set are hidden from the
user [8].
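As an illustration of hiding near-duplicates, the sketch below uses a simple average hash compared by Hamming distance. This technique is our own choice for the example and is not claimed to be the method used in [8]; the function names, the distance threshold and the tiny 2x2 thumbnails are all hypothetical, and all thumbnails are assumed to have the same size.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # small grayscale thumbnail, values 0..255


def average_hash(thumb: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the mean."""
    pixels = [p for row in thumb for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def hide_near_duplicates(ranked: List[Grid], max_distance: int = 1) -> List[int]:
    """Return the indices of the results to show, keeping rank order.

    A result is hidden when its hash is within max_distance bits of the
    hash of a result that has already been kept.
    """
    kept: List[int] = []
    kept_hashes: List[int] = []
    for i, thumb in enumerate(ranked):
        h = average_hash(thumb)
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(i)
            kept_hashes.append(h)
    return kept


a = [[10, 200], [220, 15]]
b = [[12, 198], [225, 11]]  # slightly altered copy of a
c = [[200, 10], [15, 220]]  # different picture
shown = hide_near_duplicates([a, b, c])
print(shown)  # [0, 2]
```

Because the filter walks the list in rank order, the highest-ranked member of each group of near-duplicates is the one that remains visible.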
We have been considering how to make use of semantic web technologies to help
increase diversity of search results. Using the Yahoo BOSS (Build your Own Search
Service) API, we have developed a tool that is capable of providing image search with
different axes of diversity. The tool requires a user to input a query in the form of a
subject (e.g., “David Beckham”), an optional context (e.g., “football”), and an axis of
diversity (e.g., “football clubs”). Currently the axis must be specified as a DBpedia
resource URI (e.g., “http://dbpedia.org/ontology/clubs”); however, that constraint will
be relaxed in future versions. The search engine works by using DBpedia to infer a
list of topics along the diversity axis that are related to the subject. These topics (both
the English name of the topic and its synonyms are considered) are then combined with
the subject and context to generate a (potentially large) number of queries that can be
fed to BOSS and structured into results. The results are presented as columns
corresponding to the particular topics discovered during the semantic inference.
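The query-generation step can be sketched as follows. This is an illustrative sketch only: the function name, the quoting of terms and the hard-coded topic list (standing in for the output of the DBpedia inference step) are assumptions, and no call to DBpedia or BOSS is made.

```python
from typing import Dict, List, Optional


def build_queries(
    subject: str,
    topics: Dict[str, List[str]],
    context: Optional[str] = None,
) -> Dict[str, List[str]]:
    """Build one list of query strings per topic on the diversity axis.

    topics maps each topic name to its synonyms; the topic name itself is
    always used as a query term as well.
    """
    queries: Dict[str, List[str]] = {}
    for topic, synonyms in topics.items():
        terms = [topic] + [s for s in synonyms if s != topic]
        parts = [f'"{subject}"'] + ([context] if context else [])
        queries[topic] = [" ".join(parts + [f'"{term}"']) for term in terms]
    return queries


# Hypothetical result of the inference step for the axis "football clubs".
topics = {
    "Manchester United F.C.": ["Manchester United", "Man Utd"],
    "Real Madrid C.F.": ["Real Madrid"],
}
columns = build_queries("David Beckham", topics, context="football")
for topic, topic_queries in columns.items():
    print(topic, topic_queries)
```

Keeping the queries grouped by topic mirrors the column layout of the results, since every query in a group contributes to the same column.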


Context analysis is also considered for the study of diversity [9, 10]. It involves
investigating the relevant information behind the content in order to better understand
the context in which it was created. From a diversity point of view, we may wish, for
example, to identify the location of events referred to in documents; if the location is
not explicit, related documents or contextual information may provide it. As another
example, we may wish to classify
the documents in terms of the location of the writer as views may vary
geographically, and although the writer's geographical location may not be explicit in
the article, secondary searches or contextual information analysis (such as a semantic
web search) may provide this information. The same is true for many other
dimensions of diversity, such as time and general political affiliation. This could be a
way not only of deriving opinions and sorting by diversity, but also a way of
determining possible bias in documents.

2.2 Diversity on opinions and emotions conveyed by images
As described previously, an important role of images in the communication process is
to convey opinions and emotional messages. With the growing availability of images
and opinion-rich resources, such as online review sites and web blogs, the area of
opinion mining and sentiment analysis has recently enjoyed a huge burst of research
activity [11]. The activity in this area deals with the computational treatment of
opinion, sentiment, and subjectivity in text and images. In particular, research on
opinion mining for opinions and sentiments expressed in images is still at
an early stage. The key problem is to select meaningful features that have a close
relationship with human emotions and to convert them into numerical features. Some
features (e.g., color, hue, luminance, saturation, etc.) have been proposed but their
effectiveness has not yet been evaluated. Emotional semantic image retrieval is a new
and promising research direction in this field. Emotional semantics refer to the
highest level of abstract semantics, i.e., the semantics that describe intensity and type
of feelings, moods, emotions evoked in humans when they are viewing images [12].
One of the first emotional image retrieval systems was designed by Colombo et al.
[13]. They proposed an innovative method to obtain a high-level representation of art
images, which allowed the derivation of emotional semantics such as action,
relaxation, joy and uneasiness. Since then, other research approaches and emotion-
based retrieval systems have been proposed. In [14] a novel scheme to automatically
annotate the image emotional semantics and realize emotional image retrieval using
semantic words is described. In [15] the authors present an emotion categorization
system, trained by ground truth from psychological studies and applied to a collection
of masterpieces. In [16], by contrast, only one aspect of aesthetic appeal is
analyzed: harmony, i.e., the pleasing or congruent arrangement of parts producing
internal calm or tranquility. The authors conducted a series of experiments to identify
which low-level features could predict harmony in an image.
However, emotional semantic image retrieval research is still at an early stage,
both because emotion is subjective, i.e., strongly linked to human personality, and
because it is difficult to find relations between image features and emotions.
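To make the idea of converting color properties into numerical features concrete, the sketch below computes three simple descriptors from a list of RGB pixels using the standard HLS color model. The choice of descriptors, the function name and the hue range treated as "warm" are assumptions made for illustration; no claim is made that these features predict emotion.

```python
import colorsys
from typing import Dict, Iterable, Tuple

RGB = Tuple[int, int, int]  # 8-bit channels


def color_features(pixels: Iterable[RGB]) -> Dict[str, float]:
    """Average saturation and luminance, plus the share of warm-hued pixels."""
    n = 0
    sat_sum = 0.0
    lum_sum = 0.0
    warm = 0
    for r, g, b in pixels:
        h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
        sat_sum += s
        lum_sum += l
        # Hue is an angle, so it is summarized as a share of "warm" pixels
        # (roughly reds to oranges) instead of being averaged directly.
        if s > 0 and (h < 1 / 8 or h > 11 / 12):
            warm += 1
        n += 1
    if n == 0:
        raise ValueError("no pixels given")
    return {
        "mean_saturation": sat_sum / n,
        "mean_luminance": lum_sum / n,
        "warm_fraction": warm / n,
    }


# Pure red, pure blue, mid gray and orange.
features = color_features([(255, 0, 0), (0, 0, 255), (128, 128, 128), (255, 128, 0)])
print(features)
```

Descriptors of this kind are cheap to compute, but, as noted above, whether they relate to the emotions evoked in a viewer still has to be evaluated.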
From our point of view, it is crucial to develop a system that allows the categorization
of pictures in terms of distinct emotions, taking into account the subjective
characteristics of emotions (the same image can lead to different emotions based on
the cultural background of the viewer) and the differences in the intensity of
emotions (e.g., happiness versus joyfulness). It is also important to determine which
extracted features best represent emotional semantics.

2.3 Bias on images
Let us define bias as a correlation between the polarity of an opinion and the context
of the opinion holder. Focusing on images, we believe that in order to understand how
the use of a given image within a given context can influence bias, it is
crucial to know the history of the image itself. In particular, important historical
aspects include the type of device used to produce the digital content and whether,
and in what way, the image or its sub-parts have been tampered with. For instance,
discovering the semantic information within an image derived from a photomontage
may highlight how the exploitation of a particular image in a communication process
aims to polarize opinions, and may provide evidence that a biased view is being
projected.
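The definition of bias given at the start of this subsection, a correlation between opinion polarity and the context of the opinion holder, can be illustrated with a small worked example. The sketch assumes that polarity is expressed as a score in [-1, 1] and that context is reduced to a 0/1 indicator of membership in one of two groups; the use of the Pearson coefficient and the data are our own choices and are entirely made up.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Made-up data: polarity of six opinions, and a 0/1 indicator of which of
# two contexts each opinion holder belongs to.
polarity = [0.8, 0.6, 0.7, -0.5, -0.7, -0.9]
context = [1, 1, 1, 0, 0, 0]
bias_score = pearson(polarity, context)
print(round(bias_score, 3))
```

A score close to +1 or -1 indicates that polarity is largely explained by the holder's context, while a score near 0 indicates no linear relationship between the two.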
Recently, image forensics has been widely proposed as a valid technological means
for ensuring the credibility of digital images, by both extracting knowledge about the
origin of the content and detecting the application of a wide variety of manipulations
[4]. Image forensics is based on the idea that inherent traces (like digital fingerprints)
are left behind in digital media during both the creation phase and any
subsequent processing [17]. Because they rely only on the data under analysis, digital
forensic techniques can be seen as ‘forensic blocks’ that take an image as input and
provide as output intrinsic information carried by the digital asset, which permits better
evaluation, understanding and validation of pictures used in the communication
process.
Regarding the information on the content origin, the aim of a forensic block is to
identify the source that produced the picture, e.g. the forensic block can determine
whether the picture is computer generated or natural [18, 19], or whether the picture
comes from a digital camera or a scanner [20]. Such information can be used to
assess whether a picture is an accurate and trustworthy representation of
reality: a picture found to be computer generated, for example, does not document a
real scene.
Regarding the detection of a wide variety of manipulations, different forensic blocks
are able to distinguish different processing operations, for example:
  − re-sampling operation: when geometric transformations are applied (e.g. rotation,
     scaling), the original image is re-sampled onto a new sampling grid
     [21, 22];
  − double JPEG compression: when creating a digital forgery, it is often necessary
     to resave the modified image, so the tampered image often undergoes double JPEG
     compression [23];
  − copy-move forgery: a part of the image is copied and pasted on another part of
     the same image [24].
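As a toy illustration of the last item, the sketch below detects copy-move by exact block matching: every small block is used as a dictionary key, and two non-overlapping positions holding identical pixels are reported. This is a deliberately naive scheme chosen for brevity and is not the method of [24]; exact matching would fail after recompression or noise, which is why practical detectors rely on more robust block features. All names and the example image are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # grayscale image as rows of pixel values
Pos = Tuple[int, int]            # (row, column) of a block's top-left corner


def find_copied_blocks(img: Image, size: int = 2) -> List[Tuple[Pos, Pos]]:
    """Return pairs of non-overlapping size x size blocks with identical pixels."""
    seen: Dict[tuple, List[Pos]] = defaultdict(list)
    rows, cols = len(img), len(img[0])
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            block = tuple(tuple(img[r + i][c:c + size]) for i in range(size))
            # Flat blocks (e.g. uniform sky) match everywhere and are skipped.
            if len({p for row in block for p in row}) == 1:
                continue
            seen[block].append((r, c))
    pairs: List[Tuple[Pos, Pos]] = []
    for positions in seen.values():
        for i, a in enumerate(positions):
            for b in positions[i + 1:]:
                if abs(a[0] - b[0]) >= size or abs(a[1] - b[1]) >= size:
                    pairs.append((a, b))
    return pairs


image = [
    [1, 2, 0, 0, 0, 0],
    [3, 4, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 2],
    [0, 0, 0, 0, 3, 4],
]
matches = find_copied_blocks(image)
print(matches)  # [((0, 0), (3, 4))]
```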
Such forensic blocks can also be applied block-wise in order to spatially localize
specific characteristics that could be different from one block to another. If we are
studying some features that should be coherent across the whole image (e.g. source,
JPEG compression, etc.), an inconsistency in such features implies that some processing
has been locally applied to the content. The most common example of this is the
creation of photomontages that are usually considered as a cut-and-paste composite of
fragments coming from different images. This functionality could be very useful for
understanding what semantic information has been altered.
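The block-wise consistency check described above can be sketched as follows, assuming that a forensic block has already produced one numerical value per image region. The use of the median as the "coherent" value, the tolerance parameter and the example numbers are assumptions made for illustration only.

```python
from statistics import median
from typing import List, Sequence, Tuple


def inconsistent_blocks(
    feature_map: Sequence[Sequence[float]], tolerance: float
) -> List[Tuple[int, int]]:
    """Return (row, column) of blocks whose feature deviates from the median.

    feature_map holds one value per image block, e.g. the output of a
    forensic block run on that region. The median stands in for the value
    that is coherent across the untouched part of the image.
    """
    values = [v for row in feature_map for v in row]
    typical = median(values)
    return [
        (r, c)
        for r, row in enumerate(feature_map)
        for c, v in enumerate(row)
        if abs(v - typical) > tolerance
    ]


# Made-up per-block estimates of a compression-related feature.
estimates = [
    [75, 75, 74, 75],
    [75, 90, 91, 75],
    [76, 75, 75, 75],
]
suspect = inconsistent_blocks(estimates, tolerance=5)
print(suspect)  # [(1, 1), (1, 2)]
```

The flagged positions indicate where in the image the content is likely to have been altered, which is the starting point for asking what semantic information was changed.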
A new way of exploiting image forensic technologies would be to apply them
to groups of images instead of single images, with the aim of discovering
dependencies between different images, used in different places, representing similar
or identical content, thus constructing a graph that describes picture relationships.
For a pair of images, the idea is to determine whether one image derives from the other
and, if so, which processing produced the transformation. Knowing how a
set of images are related to each other could allow the clustering of images sharing the
same root image. In this way, we could discover that several images regarding an
event have been actually produced from a limited set of source images, thus
permitting isolation of the original information. In other situations, knowing how a
few source images have evolved into a large set of derived pictures could allow us to
reconstruct how the usage of the information contained in the original images has
evolved in time and space. For instance, this could permit us to identify how these
images have been used by groups of people with different opinions about the original
event.
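Once pairwise "derived from" relations are available, grouping images by their root image is straightforward. The sketch below assumes that each image has at most one parent and that the relations contain no cycles; how the pairs are obtained is outside its scope, and the file names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cluster_by_root(derived_from: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group images by the root image they ultimately derive from.

    derived_from holds (child, parent) pairs meaning that child was produced
    by processing parent. Each image is assumed to have at most one parent
    and the relation is assumed to be free of cycles.
    """
    parent: Dict[str, str] = {}
    images = set()
    for child, par in derived_from:
        parent[child] = par
        images.update((child, par))

    def root(img: str) -> str:
        # Follow parent links until an image without a parent is reached.
        while img in parent:
            img = parent[img]
        return img

    clusters: Dict[str, List[str]] = defaultdict(list)
    for img in sorted(images):
        clusters[root(img)].append(img)
    return dict(clusters)


# Hypothetical pairwise results for images found on different sites.
edges = [
    ("blog_crop.jpg", "agency_original.jpg"),
    ("forum_resized.jpg", "blog_crop.jpg"),
    ("poster_montage.jpg", "second_original.jpg"),
]
clusters = cluster_by_root(edges)
for root_image, members in clusters.items():
    print(root_image, members)
```

In this example five images reduce to two source images, which is the kind of isolation of original information discussed above.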


3 Conclusions

This paper gives an overview of the role of images in the analysis of diversity. We
have considered how the results of image searches might be analyzed for diversity,
and how context analysis can be used to better understand the context in which some
information is created. Opinion mining, which has recently attracted interest from
different research communities, has also been introduced. Finally, some methods to
investigate the impact of diversity on bias have been presented.


Acknowledgements

The authors wish to thank the European Union, which supported this work under the
Seventh Framework project LivingKnowledge (IST-FP7-231126).


References

1. Lester, P.M., Visual Communication: Images with Messages, Fourth Edition, 2006,
    http://commfaculty.fullerton.edu/lester/.
2. Living Knowledge, http://livingknowledge-project.eu/.
3. Datta, R., Joshi, D., Li, J., Wang, J.Z., Image Retrieval: Ideas, influences, and trends of the
    new age. ACM Comput. Surv., 40(2):1-60, 2008.
4. Digital Forensics. IEEE Signal Processing Magazine, 26(2), 2009.
5. Tsai, C.-F., Hung, C., Automatically annotating images with keywords: A review of image
   annotation systems. Recent Patents on Computer Science, 1:55–68, 2008.
6. Tian, S. K., Gao, Y., Huang, T., Diversifying the image retrieval results. In proc. ACM
   Multimedia ’06. Santa Barbara, CA, USA. Oct 23-27 2006. pp 707-710, 2006.
7. Chen, H., and Karger, D. R., Less is more: probabilistic models for retrieving fewer
    relevant documents. In proc. ACM SIGIR ’06. Seattle, Washington, USA. Aug 06-11 2006.
    pp 429-436, 2006.
8. Arni, T., Clough, P., Sanderson, M., and Grubinger, M. (2008) Overview of the
    ImageCLEFphoto 2008 Photographic Retrieval Task. Retrieved 18-06-2009.
    http://www.clef-campaign.org/2008/working_notes/ImageCLEFphoto2008-final.pdf
9. Smith, A., Carr, L., Hall, W. (2005) An Opportunistic Approach to Adding Value to a
    Photograph Collection. Proceedings of the 4th International Semantic Web Conference.
    November 2005, Galway Ireland.
10. Tuffield, M., Harris, S., Dupplaw, D P., Chakravarthy, A., Brewster, C., Gibbins, N.,
    O'Hara, K., Ciravegna, F., Sleeman, D., Wilks, Y., Shadbolt, N R. (2006) Image
    Annotation with Photocopain. Proceedings of the First International Workshop on Semantic
    Web Annotations for Multimedia. WWW2006, May 2006, Edinburgh.
11. Pang, B., Lee, L., Opinion Mining and Sentiment Analysis. Foundations and Trends in
    Information Retrieval, Vol. 2, 2008.
12. Wang, W., He, Q., A Survey on Emotional Semantic Image Retrieval. IEEE International
    Conference on Image Processing, San Diego, California, U.S.A., October 12–15, 2008.
13. Colombo, C., Bimbo, A. D., Pala, P., Semantics in Visual Information Retrieval. IEEE
    Multimedia, Vol.6, No.3, pp.38-53, 1999.
14. Wei-ning, W., Ying-lin, Y., Sheng-ming, J., Image Retrieval by Emotional Semantics: A
    Study of Emotional Space and Feature Extraction. IEEE International Conference on
    Systems, Man, and Cybernetics, Taipei, Taiwan, October 8-11, 2006.
15. Yanulevskaya, V., van Gemert, J.C., Roth, K., Herbold, A.K., Sebe, N., Geusebroek, J.M.,
    Emotional Valence Categorization Using Holistic Image Features. IEEE International
    Conference on Image Processing, San Diego, California, U.S.A., October 12–15, 2008.
16. Fedorovskaya, E., Neustaedter, C., Hao, W., Image Harmony For Consumer Images. IEEE
    International Conference on Image Processing, San Diego, California, U.S.A., October 12–
    15, 2008.
17. Swaminathan, A., Wu, M., Liu, K.J.R., Digital Image Forensics via Intrinsic Fingerprints.
    IEEE Transactions on Information Forensics and Security, 3(1), pp. 101-117, 2008.
18. Lyu, S., Natural Image Statistics for Digital Image Forensics. Ph.D. Dissertation,
    Department of Computer Science, Dartmouth College, 2005.
19. Ng, T.T., Statistical and Geometric Methods for Passive-blind Image Forensics. PhD
    Dissertation, Graduate School of Arts and Sciences, Columbia University, 2007.
20. McKay, C., Swaminathan, A., Gou, H., Wu, M., Image acquisition forensics: forensic
    analysis to identify imaging source. IEEE International Conference on Acoustics, Speech,
    and Signal Processing, Las Vegas, NV, March 2008.
21. Mahdian, B., Saic, S., Blind authentication using periodic properties of interpolation. IEEE
    Transactions on Information Forensics and Security, 3(3), pp. 529-538, 2008.
22. Kirchner, M., Fast and reliable resampling detection by spectral analysis of fixed linear
    predictor residue. In Proc. 10th ACM Workshop on Multimedia and Security, Oxford, UK,
    pp. 11-20, September 2008.
23. Farid, H., Exposing digital forgeries from JPEG ghosts. IEEE Transactions on Information
    Forensics and Security, 4(1), pp. 154-160. 2009.
24. Bayram, S., Sencar, H.T., Memon, N., An Efficient and Robust Method For Detecting
    Copy-Move Forgery. In Proc. of IEEE International Conference on Acoustics, Speech and
    Signal Processing, ICASSP 2009, 2009.

