<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Assessing Content Diversity in Medical Weblogs</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>L3S Research Center</institution>
          ,
          <addr-line>Appelstr. 9a, 30167 Hannover</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we consider weblogs focusing on medicine and health. Given this general topic, posts can deal with diseases, medical treatments, medications and the like, each of which can in turn be considered under different aspects. Within a diversity-aware medical search engine, knowledge about this kind of diversity can support the grouping of search results by the diversity dimensions covered by a text. We therefore introduce a method for studying topic and content diversity. The approach is based on information extraction technologies and domain knowledge and is applied to a set of medical weblog posts. The diversity of topics described in this dataset is studied in more detail.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Weblogs, or blogs, have become a popular way to share experiences and
information, to engage in discussions and to form communities. In this paper, we
consider medical weblogs, i.e., blogs whose topical focus is health and
medicine. The medical blogging community consists of healthcare professionals
writing about their daily practice and current issues related to medicine on the
one hand, and of patients providing information about health-related issues and
experiences of living with medical conditions on the other hand. The content of
these blogs therefore varies considerably. It is of particular interest to find ways to
automatically analyse the content diversity in these blog posts to enable better
search and retrieval facilities. Consider the following scenario:</p>
      <p>A person searching with the query breast cancer might be interested in the
disease itself, in possible treatments, in medications and so on. Other dimensions
are, for example, the content type (e.g., experience vs. information), the author
(e.g., physician, patient) or the polarity. A user might be interested in experiences
of persons living with breast cancer, or in experiences of physicians who treat
patients suffering from breast cancer.</p>
      <p>Current medical weblog search engines such as Medlogs or Medworm<sup>1</sup> only
list search results matching query keywords in a flat list. Sometimes, results can
be restricted to posts of specific author groups (e.g., physicians, patients). However,
additional content dimensions such as the aspects considered or the expressed sentiment
remain hidden in the posts. Methods that analyse and
detect the different dimensions would help to present search results according to
these dimensions. The work presented in this paper targets the analysis of
diversity in medical weblog posts. The focus is on analysing the diversity of
content, in particular the topic diversity and the diversity of the aspect considered.</p>
      <p><sup>1</sup> http://www.medlogs.com, http://www.medworm.com</p>
      <p>This work is funded by the EU under 231126.</p>
      <p>The remainder of the paper is structured as follows. Section 2 presents related
work. Section 3 gives an overview of relevant diversity dimensions in medical texts.
Section 4 introduces methods and measures to study topic diversity in medical texts,
which are applied to a real-world data set in Section 5.
The paper finishes with conclusions and remarks on future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Diversity of search results in text retrieval has been considered as a problem of
result diversification, i.e., finding the right balance between having more relevant
results of the 'correct' intent and having more diverse results in the top positions.
Existing approaches to this problem combine measures of diversity and similarity
to improve the recommendation diversity. In order to improve user satisfaction,
the top N search results are either ranked by diversity [
        <xref ref-type="bibr" rid="ref1 ref4 ref7 ref9">1, 4, 7, 9</xref>
        ] or diversified by
clustering them according to the different diversity dimensions covered [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>Clustering of search results is done within the search engines Newssift and
FairSpin<sup>2</sup>. Within Newssift, content from major news and business sources is
grouped into high-level categories such as Business Topic, Organizations, Place,
Person and Theme. It leverages semantic technology, but also relies on manual
work. FairSpin collects the latest news and opinion from across the Web and
organizes them by political bias based on community votes.</p>
      <p>
        In faceted classification, a set of category hierarchies is built [
        <xref ref-type="bibr" rid="ref11 ref8">8, 11</xref>
        ]. These
capture the different facets, i.e., dimensions or features, relevant to the collection.
Facets can, for example, be derived based on WordNet or Wikipedia, as shown
in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The work presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] describes several faceted search algorithms and
employs collaborative filtering and personalization techniques to customize the
search interface to each user.
      </p>
      <p>In contrast to existing work, we intend to determine and analyse diversity in
more detail and automatically. Instead of providing hierarchical browsing
facilities, a potential application would be the grouping of search results according to
diversity dimensions. Presenting different topical aspects would enable
users to see different aspects of their query at the same time and to get deeper
insights into all the facets of the topic under consideration. Furthermore, we
focus on medical weblog posts, since this is a domain from
which many people can benefit. For the medical domain, sophisticated ontologies
exist. We will show how these can support the analysis of topic diversity.</p>
      <p>
        In this paper, we study the topic diversity of texts, considering a document
as a mixture of topics. Topic models as introduced by Blei et al. also consider
documents as mixtures of topics [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Each topic is represented by a set of keywords
together with a probability indicating each word's contribution to the topic. In our
approach, topics are medical concepts that are provided by a medical
ontology. While automatic labeling of clusters is difficult within clustering
approaches such as Latent Dirichlet Allocation
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], the use of domain knowledge helps
to label document clusters and describe topics with concrete concept names.
      </p>
      <p><sup>2</sup> http://www.newssift.com, http://fairspin.org/</p>
    </sec>
    <sec id="sec-3">
      <title>Diversity Dimensions in Medical Weblogs</title>
      <p>
        In this paper, we focus on processing medical weblogs written in English. A
medical weblog deals with diseases, medical treatments, medications or health care
politics, i.e., its main topic is medicine or health care. Medical weblogs can be differentiated
with regard to their author into blogs written by health care professionals and
blogs written by authors who are not health care professionals [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Diversity in medical weblogs can be considered along several dimensions or
facets, including Time, Author, Location, Resource, Topic, Aspect considered,
Information Content, and Information Type. Example values for these
dimensions, identified for a medical post, are given in Figure 1. In this paper, we
examine the diversity of topic and the diversity of the aspect considered in more
detail.</p>
      <p>Given a topic T, topic diversity concerns the correlation between T and
other topics that are frequently used together with T. In this paper, a topic is
considered to be a medical concept, in particular a UMLS concept (see Section
4.1) dealing with diseases, medical treatments or medications. A topic is highly
diverse if it co-occurs with a large number of other concepts. Even if posts
have the same topic, different medical aspects can be considered. While one
post talks about the treatment of asthma, others may focus on its
symptoms. We refer to this diversity dimension as diversity of the aspect
considered. A post is highly diverse when its content covers different semantic
groups (e.g., symptoms, drugs, procedures).</p>
      <p>[...] Most parents are understandably reluctant to start their kids on psychiatric
medications. While caution is indicated, I feel such a rigid rejection of their use in children
and teenagers is a mistake and may deprive your child of a therapy that could help
a lot. Used appropriately and wisely, with ongoing follow-up and psychotherapy, I
have seen such medications make a huge positive difference lifting a child's mood and
improving the quality of his/her life. [...]</p>
      <p>Time: January 2005; Author: Dr. J; Resource: Blog; Topic: Depression;
Aspect considered: Medication; Information type: experience;
Information content: 50% informative, 50% affective</p>
      <p>Fig. 1. Example for diversity in medical texts</p>
    </sec>
    <sec id="sec-4">
      <title>Method</title>
      <p>We now present a method and measures to study diversity in topic and
aspect considered. First, medical content is extracted (see Section 4.1), which is
then exploited for analysing the diversity (see Sections 4.2 and 4.3). Since even in a
medical weblog, posts can be completely unrelated to health (e.g., dealing with
holiday plans or the weather), we exclude posts that are unrelated to medicine
and health in a preprocessing step using a classifier based on language models
(LingPipe classifier<sup>3</sup>).</p>
      <sec id="sec-4-1">
        <title>Extracting Medical Content</title>
        <p>
          The medical content of a post is determined by extracting medical concepts,
i.e., concepts describing diagnoses, treatments or medications. For this purpose,
an existing mapping algorithm is exploited: MetaMap, in its Java
implementation MMTx [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. It is based on natural language processing techniques and
maps natural language text to concepts of the UMLS Metathesaurus.
        </p>
        <p>The Unified Medical Language System (UMLS<sup>4</sup>) is a biomedical terminology
that consists of around 1.7 million biomedical concepts and integrates several
biomedical vocabularies such as SNOMED CT or MeSH. Each concept defined in
the UMLS is assigned to at least one of the 135 specified semantic types. The
semantic types are in turn grouped into 15 main groups. The concept atrial
fibrillation, for example, belongs to the semantic types Finding and Pathologic
Function, which in turn belong to the main group Disorders.</p>
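        <p>The three-level hierarchy (concept, semantic type, main group) can be pictured as two lookup tables. The following is a minimal sketch only: the table names and the function are hypothetical, and the tables contain just the atrial fibrillation example from the text, whereas a real system would load the mappings from the UMLS distribution.</p>
        <preformat>
```python
# Hypothetical, hand-filled excerpt of the UMLS hierarchy (assumption:
# a real system would load these mappings from the UMLS release files).
SEMANTIC_TYPES_OF_CONCEPT = {
    "Atrial Fibrillation": {"Finding", "Pathologic Function"},
}
MAIN_GROUP_OF_TYPE = {
    "Finding": "Disorders",
    "Pathologic Function": "Disorders",
}


def main_groups(concept):
    """Return the main groups reachable from a concept via its semantic types."""
    types = SEMANTIC_TYPES_OF_CONCEPT.get(concept, set())
    return {MAIN_GROUP_OF_TYPE[t] for t in types if t in MAIN_GROUP_OF_TYPE}


print(main_groups("Atrial Fibrillation"))  # {'Disorders'}
```
        </preformat>
        <p>A concept with two semantic types can thus still map to a single main group, which is why type diversity and group diversity are measured separately.</p>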
        <p>In order to determine UMLS concepts for a text, MMTx works in several
steps. First, it parses a text into paragraphs, sentences, phrases, lexical elements
and tokens. From the resulting phrases, a set of (lexical) variants is generated.
For the phrases, candidate concepts from the UMLS Metathesaurus are retrieved
and evaluated. The best candidates are organized into a final mapping in such
a way as to best cover the text. Out of the possible candidates provided by
MetaMap, the first proposal from the highest-ranked candidate set is selected
as the list of extracted concepts, which is exploited in our diversity study.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Determining the Diversity of the Aspect Considered</title>
        <p>The diversity of the aspect considered is studied by means of the semantic types
and main groups of the concepts extracted from a post and by applying formulas
(2) and (3). The formulas determine the proportion of different semantic types
(main groups) contained in a text relative to the overall number of possible types (or
groups). A value close to 1 indicates a high diversity, while a value close to 0
corresponds to a low diversity.</p>
        <p>In addition, we consider the concept diversity div<sub>concept</sub>, i.e., how many
different concepts co<sub>d</sub> a post contains relative to the number of extracted concepts
co (formula (1)). A value close to zero indicates that the same concepts are
used several times, i.e., the diversity of concepts is small. For example, a post
can contain only a few different concepts, but these concepts belong to
different semantic types and main groups, i.e., it deals with several aspects. In this
case, the concept diversity is small, but a high diversity in semantic types and
main groups would be detected. Furthermore, we calculate the diversity for
the single main groups Disorders, Procedures and Chemicals and Drugs by considering
in formula (1) only the frequency of concepts of one of these three main groups.</p>
        <disp-formula id="eq-1"><label>(1)</label><tex-math>div_{concept} = \frac{co_d}{co}</tex-math></disp-formula>
        <disp-formula id="eq-2"><label>(2)</label><tex-math>div_{type} = \frac{types}{135}</tex-math></disp-formula>
        <disp-formula id="eq-3"><label>(3)</label><tex-math>div_{group} = \frac{groups}{15}</tex-math></disp-formula>
        <p><sup>3</sup> http://alias-i.com/lingpipe/</p>
        <p><sup>4</sup> http://www.nlm.nih.gov/research/umls</p>
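        <p>To make the three measures concrete, they can be computed as in the following minimal sketch. The Mention container, the function names and the example post are assumptions made for illustration; the semantic type assignments in the example were not looked up in the UMLS, and returning 0 for a post without concepts is likewise an assumption.</p>
        <preformat>
```python
from dataclasses import dataclass

N_SEMANTIC_TYPES = 135  # number of UMLS semantic types given in the text
N_MAIN_GROUPS = 15      # number of UMLS main groups given in the text


@dataclass(frozen=True)
class Mention:
    """One extracted concept mention (hypothetical container, not MMTx output)."""
    concept: str
    semantic_types: frozenset
    main_groups: frozenset


def div_concept(mentions, group=None):
    """Formula (1): distinct concepts / extracted concepts, optionally for one main group."""
    if group is not None:
        mentions = [m for m in mentions if group in m.main_groups]
    if not mentions:
        return 0.0  # assumption: a post without concepts has diversity 0
    return len({m.concept for m in mentions}) / len(mentions)


def div_type(mentions):
    """Formula (2): distinct semantic types / 135."""
    types = set().union(*(m.semantic_types for m in mentions))
    return len(types) / N_SEMANTIC_TYPES


def div_group(mentions):
    """Formula (3): distinct main groups / 15."""
    groups = set().union(*(m.main_groups for m in mentions))
    return len(groups) / N_MAIN_GROUPS


# Illustrative post: six mentions of three distinct concepts.
asthma = Mention("Asthma", frozenset({"Disease or Syndrome"}), frozenset({"Disorders"}))
wheezing = Mention("Wheezing", frozenset({"Sign or Symptom"}), frozenset({"Disorders"}))
albuterol = Mention("Albuterol", frozenset({"Pharmacologic Substance"}),
                    frozenset({"Chemicals and Drugs"}))
post = [asthma, asthma, asthma, wheezing, albuterol, albuterol]

print(div_concept(post))               # 3 distinct concepts / 6 mentions
print(div_concept(post, "Disorders"))  # 2 distinct concepts / 4 mentions
print(div_type(post))                  # 3 / 135
print(div_group(post))                 # 2 / 15
```
        </preformat>
        <p>Because the denominators in formulas (2) and (3) are fixed, a single post can only reach small values of these two measures, whereas formula (1) reaches 1 as soon as no concept is repeated.</p>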
      </sec>
      <sec id="sec-4-3">
        <title>Determining the Topic Diversity</title>
        <p>For studying diversity of topics, we have to identify relevant topics in the data
collection under consideration. We assume that a post deals with a main topic to
which in turn a set of subtopics is related. Therefore, the concept representation
of texts determined by the method described in section 4.1 is used to (1)
determine topics of posts and (2) to identify co-occurring concepts to study topical
diversity. For this purpose, we follow a three-step approach:
1. Calculation of concept frequencies per post,
2. Selection of the most frequent concept as "topic concept",
3. Identification of relevant concepts related to the topic concept.</p>
        <p>Concept frequencies are determined by counting the number of mentions
of a concept in a text. Then, the most frequent concept within the post under
consideration is selected. We consider this concept as the topic-describing concept
(also referred to as "topic") of a post. Assuming that the topic of a medical
weblog post deals with a disease, a clinical procedure or a medication, a topic
concept needs to belong to one of the UMLS main groups Disorders, Procedures or
Chemicals and Drugs.</p>
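        <p>Steps 1 and 2 can be sketched as follows. This is an illustration under assumptions rather than the implementation used in the study: the function name, the input format (a list of concept names plus a concept-to-main-group dictionary) and the example data are hypothetical, and the text does not say how ties between equally frequent concepts are resolved.</p>
        <preformat>
```python
from collections import Counter

# Main groups a topic concept must belong to, as stated in the text.
TOPIC_GROUPS = {"Disorders", "Procedures", "Chemicals and Drugs"}


def topic_concept(mentions, group_of):
    """Select the most frequent concept that belongs to an admissible main group.

    mentions: concept names in order of appearance (hypothetical input format)
    group_of: dict mapping a concept name to its main group
    """
    counts = Counter(c for c in mentions if group_of.get(c) in TOPIC_GROUPS)
    if not counts:
        return None  # the post has no admissible topic concept
    return counts.most_common(1)[0][0]


mentions = ["Asthma", "Cough", "Asthma", "Inhaler",
            "Physician", "Physician", "Physician"]
group_of = {"Asthma": "Disorders", "Cough": "Disorders",
            "Inhaler": "Devices", "Physician": "Living Beings"}

# "Physician" is the most frequent concept overall, but it is not in an
# admissible main group, so the topic concept is "Asthma".
print(topic_concept(mentions, group_of))  # Asthma
```
        </preformat>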
        <p>Given a document collection, we obtain a list of topics together with the
documents for which each topic has been determined. Documents with a joint
topic concept are considered in the next step, when concept co-occurrence pairs
are determined for each topic concept. A pair is considered relevant when it
occurs at least twice in one document and in at least ten documents of this
topic. This results in a set of concept pairs for each topic concept.</p>
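        <p>The relevance criterion admits more than one reading. The sketch below assumes the following one: a pair (topic, concept) counts for a document when the co-occurring concept is mentioned at least twice in that document, and the pair is relevant when this holds in at least ten documents of the topic. Function names, parameter names and the input format are hypothetical.</p>
        <preformat>
```python
from collections import Counter, defaultdict


def relevant_pairs(docs, min_mentions=2, min_docs=10):
    """Map each topic to the set of concepts forming a relevant pair with it.

    docs: iterable of (topic, Counter of concept frequencies) per post
    """
    support = defaultdict(Counter)
    for topic, freqs in docs:
        counts = support[topic]  # registers the topic even if no pair qualifies
        for concept, n in freqs.items():
            if concept != topic and n >= min_mentions:
                counts[concept] += 1  # one more document supports this pair
    return {topic: {c for c, d in counts.items() if d >= min_docs}
            for topic, counts in support.items()}


def topic_diversity(pairs):
    """Number of relevant co-occurring concepts per topic."""
    return {topic: len(concepts) for topic, concepts in pairs.items()}


# Ten illustrative posts on one topic; "Metformin" is mentioned only once
# in the first post and therefore reaches nine supporting documents.
docs = []
for i in range(10):
    freqs = Counter({"Diabetes": 5, "Insulin": 2, "Metformin": 2 if i else 1})
    docs.append(("Diabetes", freqs))

pairs = relevant_pairs(docs)
print(pairs)                   # {'Diabetes': {'Insulin'}}
print(topic_diversity(pairs))  # {'Diabetes': 1}
```
        </preformat>
        <p>Under this reading, the size of each returned set is the quantity by which topics can be compared with respect to their diversity.</p>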
        <p>The concept pairs provide information on how diverse a topic is: if a large
number of pairs can be identified for one topic within one document collection,
this topic is highly diverse. If a topic-describing concept frequently co-occurs
with only a few other concepts, its diversity is rather low, i.e., only a few
additional aspects are of interest to this topic.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Application Example</title>
        <p>Having information about different diversity dimensions at hand allows for
creating more sophisticated search engines and presentations of results, including
grouping search results according to the aspect considered (diagnosis,
treatment, medication...). Consider the following scenario: A woman just diagnosed
with breast cancer enters breast cancer into her favourite search engine and
receives results grouped into the clusters disease, medications, treatment and the
like. This result structure offers her the opportunity to get a general
impression of the different facets of the topic. She can now decide which kind of
information she is most interested in. The introduced measures of diversity can be
exploited in ranking, e.g., by ranking more diverse texts higher.</p>
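        <p>The scenario can be sketched in a few lines. The result records, their keys ('title', 'aspect', 'diversity') and the scores are invented for illustration; which diversity measure feeds the ranking is left open in the text.</p>
        <preformat>
```python
from collections import defaultdict


def group_and_rank(results):
    """Cluster search results by aspect and rank each cluster by diversity.

    results: list of dicts with the hypothetical keys 'title', 'aspect'
    and 'diversity' (any of the diversity measures could be used here).
    """
    clusters = defaultdict(list)
    for result in results:
        clusters[result["aspect"]].append(result)
    for aspect in clusters:
        # More diverse texts are ranked higher within their cluster.
        clusters[aspect].sort(key=lambda r: r["diversity"], reverse=True)
    return dict(clusters)


results = [
    {"title": "My first chemotherapy", "aspect": "treatment", "diversity": 0.21},
    {"title": "What is breast cancer?", "aspect": "disease", "diversity": 0.30},
    {"title": "Surgery or radiation?", "aspect": "treatment", "diversity": 0.34},
]

for aspect, hits in group_and_rank(results).items():
    print(aspect, [hit["title"] for hit in hits])
```
        </preformat>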
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experiments</title>
      <sec id="sec-5-1">
        <title>Material</title>
        <p>For the analysis in this paper, a set of different medical weblogs written in
English and all their posts have been crawled. The resulting data set consists of
5480 posts (4343 patient-written, 1137 physician-written). The weblogs have
been selected randomly by collecting addresses of weblogs from the two
(medical) weblog search engines Medworm and Medlogs. For comparison, we
use articles from the Yahoo! Encyclopedia<sup>5</sup> as an additional data set. 2777
articles on different topics related to illnesses, treatments and drugs have been
collected from this resource. In the following sections, we report the results
obtained when the introduced methods are applied to these data sets.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Diversity of Aspect Considered</title>
        <p>When comparing the concept diversity of the three data sets (see Table 1), it
can be seen that the diversity value of the encyclopedia data set is significantly
smaller than that of the weblog data sets. In contrast, the diversity of
semantic types and main groups is much higher for the encyclopedia data set. We can
conclude that the concepts extracted from the encyclopedia data set belong to
more different semantic types, i.e., a larger variety of thematic aspects is covered.
In the weblog data sets, the considered aspects are more restricted. Nevertheless,
the values for div<sub>type</sub> are quite small for all three data sets. This shows that only
one fourth to one third of the 135 possible UMLS semantic types is covered.</p>
        <p>The concept diversity values for the categories Disorders, Procedures and Chemicals
&amp; Drugs are similar for all three data sets. For Procedure concepts in particular,
only a small diversity could be ascertained. For the
Encyclopedia articles, the diversity in disorder-related concepts is higher than
for blogs. This shows that the spectrum of diseases covered in these articles
is broader than in blog posts.</p>
        <p><sup>5</sup> http://health.yahoo.com/ency/</p>
        <p>For studying the diversity of topics in our data collections, we determine the
number of different topics identified for each collection as well as the
average number of co-occurring concepts per topic and collection. Taking into
account the different data set sizes, the largest numbers of different topics are found in
the collection of physician-written posts (385 topics) and the encyclopedia
articles (787 topics). Nevertheless, the topics in the encyclopedia articles are
more diverse: on average, 60 co-occurring concepts are determined for each topic.
For topics in patient-written posts, 43 co-occurring concepts were identified, and
for topics in physician-written posts only 24 concepts were related. This might
be due to the different text lengths (encyclopedia articles are longer than weblog
posts), but is also an indicator of the larger topical diversity in encyclopedia
articles.</p>
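        <p>The text does not state how the data set sizes are taken into account. One plausible normalisation, shown here purely as an assumption, is the number of distinct topics per document, computed from the counts reported above (the topic count for patient-written posts is not reported and is therefore left out).</p>
        <preformat>
```python
# Counts reported in the text; the normalisation itself is an assumption.
documents = {"physician-written posts": 1137, "encyclopedia articles": 2777}
topics = {"physician-written posts": 385, "encyclopedia articles": 787}


def topics_per_document(n_topics, n_documents):
    """Distinct topics relative to the size of the collection."""
    return n_topics / n_documents


for name in documents:
    ratio = topics_per_document(topics[name], documents[name])
    print(f"{name}: {ratio:.2f} topics per document")
# physician-written posts: 0.34 topics per document
# encyclopedia articles: 0.28 topics per document
```
        </preformat>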
        <p>To study the topical diversity in more detail, we selected the topic
Diabetes Mellitus, Non-Insulin-Dependent. From the patient-written blogs, 788
co-occurring concepts could be detected that belong to 97 different semantic types.
In this dataset, the topic Diabetes is highly diverse: many different aspects are
considered. In contrast, from the Yahoo! Encyclopedia dataset 259 related concepts
of 58 different semantic types were extracted, and only 60 related concepts of 25
different types were identified in the physician-written posts. We can conclude
that, with respect to this topic, the patient-written dataset contains additional
information, i.e., covers additional aspects. Figure 2 shows some of the aspects
and concepts extracted from the physician-written posts and related to Diabetes.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper, topic and thematic diversity in medical weblogs have been
considered. We described how results of entity extraction together with a domain
ontology can be exploited for studying these aspects and applied the methods
to a data collection of medical texts. In order to apply the introduced method
to documents of other domains, the underlying domain knowledge has to be
replaced or the topics need to be discovered by alternative technologies. In future
work, we will work in this direction to arrive at a more general
approach. Nevertheless, the approach presented here can serve as a baseline
when testing other topic detection algorithms in the medical domain.</p>
      <p>Having information about topic diversity offers the opportunity to gain
insights into relevant aspects related to a topic, which can, for example, be presented
to a user. Furthermore, it can help to improve the retrieval of documents:
documents that consider the same topic but different aspects can be recommended to
a user.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>R.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gollapudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Halverson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Ieong</surname>
          </string-name>
          .
          <article-title>Diversifying search results</article-title>
          .
          <source>In WSDM '09</source>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Aronson</surname>
          </string-name>
          .
          <article-title>Effective mapping of biomedical text to the UMLS Metathesaurus: The MetaMap program</article-title>
          .
          <source>Proc AMIA Symp</source>
          , pages
          <fpage>17</fpage>
          -
          <lpage>21</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Lafferty</surname>
          </string-name>
          .
          <article-title>Latent Dirichlet allocation</article-title>
          .
          <source>JMLR</source>
          ,
          <volume>3</volume>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Clarke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kolla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. V.</given-names>
            <surname>Cormack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vechtomova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ashkan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Büttcher</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>MacKinnon</surname>
          </string-name>
          .
          <article-title>Novelty and diversity in information retrieval evaluation</article-title>
          .
          <source>In SIGIR '08</source>
          , pages
          <fpage>659</fpage>
          -
          <lpage>666</lpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M.</given-names>
            <surname>Cohen</surname>
          </string-name>
          .
          <article-title>Family medicine meets the blogosphere. American Academy of Family Physicians. Family Practice Management Web site</article-title>
          , http://www.aafp.org/fpm,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>W.</given-names>
            <surname>Dakka</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Ipeirotis</surname>
          </string-name>
          .
          <article-title>Automatic extraction of useful facet hierarchies from text databases</article-title>
          .
          <source>In ICDE '08</source>
          , pages
          <fpage>466</fpage>
          -
          <lpage>475</lpage>
          , Washington, DC, USA,
          <year>2008</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>S.</given-names>
            <surname>Gollapudi</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          .
          <article-title>An axiomatic approach for result diversification</article-title>
          .
          <source>In WWW '09</source>
          , pages
          <fpage>381</fpage>
          -
          <lpage>390</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hearst</surname>
          </string-name>
          .
          <article-title>Clustering versus faceted categories for information exploration</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>49</volume>
          (
          <issue>4</issue>
          ):
          <fpage>59</fpage>
          -
          <lpage>61</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>C.-S.</given-names>
            <surname>Hwang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kuo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>Representative-based diversity retrieval</article-title>
          .
          <source>In ICICIC '08, page 155</source>
          , Washington, DC, USA,
          <year>2008</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>J.</given-names>
            <surname>Koren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Personalized interactive faceted search</article-title>
          .
          <source>In WWW '08</source>
          , pages
          <fpage>477</fpage>
          -
          <lpage>486</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>E.</given-names>
            <surname>Oren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Delbru</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          .
          <article-title>Extending faceted navigation for RDF data</article-title>
          .
          <source>In ISWC 2006</source>
          , pages
          <fpage>559</fpage>
          -
          <lpage>572</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>C.-N.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>McNee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Lausen</surname>
          </string-name>
          .
          <article-title>Improving recommendation lists through topic diversification</article-title>
          .
          <source>In WWW '05</source>
          , New York, NY, USA,
          <year>2005</year>
          . ACM Press.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>