<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Identifying Consumers' Arguments in Text</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jodi Schneider</string-name>
          <email>jodi.schneider@deri.org</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adam Wyner</string-name>
          <email>adam@wyner.info</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Liverpool</institution>
          ,
          <addr-line>Liverpool</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Digital Enterprise Research Institute, National University of Ireland</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2012</year>
      </pub-date>
      <abstract>
        <p>Product reviews are a corpus of textual data on consumer opinions. While reviews can be sorted by rating, there is limited support to search in the corpus for statements about particular topics, e.g. properties of a product. Moreover, where opinions are justified or criticised, statements in the corpus indicate arguments and counterarguments. Explicitly structuring these statements into arguments could help better understand customers' disposition towards a product. We present a semi-automated, rule-based information extraction tool to support the identification of statements and arguments in a corpus, using: argumentation schemes; user, domain, and sentiment terminology; and discourse indicators.</p>
      </abstract>
      <kwd-group>
        <kwd>argumentation schemes</kwd>
        <kwd>information extraction</kwd>
        <kwd>product reviews</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Product reviews such as those found on Amazon or eBay represent a source of data on
consumer opinions about products. Current online tools allow reviews to be sorted by star
rating and comment threads. Yet, there is no support to search through the data for
statements about particular topics, e.g. properties of the product. Such statements are
distributed throughout the corpus, making it difficult to gain a coherent view. Moreover,
reviewers justify their opinions as well as support or criticise the opinions of others; that
is, reviewers provide arguments and counterarguments. Extracting data about particular
topics and structuring it into arguments would be informative: it could help producers
better understand consumers’ disposition to their products; and, it could help consumers
make sense of the product options, as reported in the reviews, and so then decide what
to buy. We present an information extraction tool to support the extraction of arguments
from reviews, using: user, domain, and sentiment terminology; discourse indicators;
and argumentation schemes.</p>
      <p>To set the context, consider the information in the reviews from the point of view
of a consumer or manufacturer. We use product reviews from the Amazon consumer
web site about buying a camera as a use case. From the consumer side, suppose a photo
enthusiast wants to buy a new camera that gives quality indoor pictures. The enthusiast
consults a shopping website and reads the reviews of camera models. The information
in product reviews about this topic is dispersed through a number of reviews, using
different terminology, and expressing opinions on different sides of the situation. So,
currently, the enthusiast must read through the reviews, keeping in mind the relevant
statements, organising and relating them; this is a difficult task. Instead, the enthusiast
would like all statements bearing on the camera’s indoor picture quality to be reported
and sorted according to whether the statement supports the claim that the camera gives
quality indoor pictures or supports the claim that it does not. Moreover, it is not
sufficient for the enthusiast to be provided with one ‘layer’ of the argument, since those
statements which support or criticise the claim may themselves be subject to support or
criticism. From the manufacturer’s side, there is a related problem since she wishes to
sell a product to a consumer. Looking at the reviews, the manufacturer must also extract
information about specific topics from the corpus and structure the information into a
web of claims and counterclaims. With this information, the manufacturer could have
feedback about the features that the consumer does or doesn’t like, the problems that
the consumer experiences, as well as the proposed solutions.</p>
      <p>There are a variety of complex issues to address. For instance, to overcome the
linearity of the corpus and terminological variation, we want a tool that searches and
extracts information from across the corpus using semantic annotations, allowing us to
find statements about the same semantic topic; searches for strings do not suffice since
the same semantic notion might be expressed with different strings. Sentiment
identifiers, which signal approval or disapproval, are relevant. Discourse markers indicate
relationships between statements, e.g. premise or claim. In addition, users argue from a
point of view: different user classes, e.g. amateurs and professionals, argue differently
about the same object.</p>
      <p>While a fully automated system to reliably extract and structure all such information
remains a goal for future work, we propose a semi-automated, rule-based text analytic support tool.
We first manually analyse the corpus, identifying the sorts of semantic information
to be annotated. We develop reasoning patterns, argumentation schemes, and identify
slots in these schemes to be filled. The schemes represent different aspects of how users
reason about a decision to buy a product. We structure the schemes into a decision tree,
hypothesising a main scheme which is used to argue for buying the product. This main
scheme is supported by subsidiary schemes that argue for premises of the main scheme.
In turn, the subsidiary schemes are grounded in textual information related to the user
and the representation of the product. In effect, we reverse engineer an argumentative
expert system which takes as input material from the corpus. Thus, the schemes give us
targets for information extraction in the corpus, namely, those components that can be
used to instantiate the argumentation schemes. The information extraction tool supports
the identification of relevant information to instantiate the argumentation schemes. As
a result of the analysis and instantiation, we gain a rich view on the arguments for
or against a particular decision. The novelty is that the tool systematically draws the
analyst’s attention to relevant terminological elements in the text that can be used to
ground defeasible argumentation schemes.</p>
      <p>The outline of the paper is as follows. In Section 2, we discuss our use case and
materials. Several components of the analysis are presented in Section 3: user, domain,
and sentiment terminology; and discourse indicators. The argumentation schemes that
we propose to use are given in Section 4. The tool is outlined in Section 5, followed by
sample results in Section 6. Related work is discussed in Section 7, and we conclude in
Section 8 with some general observations and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Use Case and Materials</title>
      <p>As a use case, we take reviews about buying the (arbitrarily chosen) Canon PowerShot
SX220 HS Digital Camera from the Amazon UK e-commerce website
(https://www.amazon.co.uk/product-reviews/), where a very
typical question is: Which camera should I buy? There are 99 reviews in our corpus,
distributed as shown in Table 1.</p>
      <p>In these product reviews, many topics are discussed. By careful reading and
analysis, we find comments about cameras such as their features and functions. Further,
accessories, such as memory cards, batteries, and cases are also discussed – both with
regard to their necessity or utility, and their suitability. The brand reputation and
warranty are discussed. Users also give conditions of use – recommendations for who the
camera would or would not suit, and warnings and advice about how to get the best
results. These incorporate the purpose or context in which the camera is or could be used
(e.g. “traveling”) or values that the camera fulfils (e.g.“portable”). Users also give clues
to their own experience and values, by talking about how they evaluated the camera,
their experience with photography, or personal characteristics (e.g. “ditzy blonde”).</p>
      <p>Point of view is key to making sense of the overall discussion. For subjective
aspects, the impact of a statement may depend on the extent to which consumers share
values and viewpoints. Such qualitative aspects of the reviews are not captured by
quantitative measures of the discussion since the most popular comment may not advance the
analysis with respect to that user or may only sway individuals who are susceptible to
popular opinion. Given this, we focus on representing justifications and disagreements
with respect to classes of users.</p>
      <p>In the course of our manual examination of the corpus, we identified five
“components” of an analysis: several consumer argumentation schemes; a set of discourse
indicators, and user, domain, and sentiment terminology. The user and domain terminology
are used to instantiate the schemes, while the discourse indicators and sentiment
terminology structure the interrelationships between the statements within a scheme (e.g.
premises, claim, and exception) and between schemes (e.g. disagreement). We begin by
discussing the last four components, then turn to argumentation schemes in Section 4.
</p>
    </sec>
    <sec id="sec-3">
      <title>Components of Analysis</title>
      <p>The objective of information extraction in our context is to extract statements about a
topic (e.g. a camera takes good pictures indoors) and structure them into arguments for
(e.g. justifications for this claim) or against it (counterclaims and their justifications).</p>
      <p>In the following, we briefly outline the components of our analysis, which are
implemented in the tool discussed in Section 5. In our approach, we identify a
terminological pool that helps us investigate the source text material for relevant passages; thus,
we presume that we can search throughout the corpus to instantiate an argumentation
scheme using the designated terminology.</p>
      <p>In our approach to analysis of the source material, we have presumed that in the
context of product reviews, contributors are trying to be as helpful, informative, and
straightforward as possible, so the interpretation of language is at face value. In other
contexts, problematic, interpretive aspects of subjectivity may arise, e.g. irony or
sarcasm, which require significant auxiliary, extra-textual knowledge to accurately
understand. For our purposes, we do not see irony or sarcasm as a significant problem, as we
can rely on the normative reading of the text that is shared amongst all readers.</p>
      <p>Camera Domain We have terminology from the camera domain that specifies the
objects and properties that are relevant to the users. Analysing the corpus, consumer report
magazines (e.g. Which?), and a camera ontology
(http://www.co-ode.org/ontologies/photography/), we identified some of the prominent
terminology. These terms refer both to parts of the camera (e.g. lens, li-ion battery) as well
as to its properties (e.g. shutter speed). While users may dispute particular factual matters
about a camera, these remain objective aspects of the camera under discussion.</p>
      <p>User Domain Users discuss topics relative to their point of view, knowledge, and
experience. This introduces a subjective aspect to the comments. For instance, an
amateur who finds that a particular model of camera takes very poor pictures
indoors may not agree with an expert who finds that the same model takes good pictures
indoors; each is evaluating the quality of the resulting pictures relative to their own
parameters of quality and experience with camera settings. To allow such user-relative
judgements, we introduce user terminology bearing on a user’s attributes (e.g. age),
context of use (e.g. travel), desired camera features (e.g. weight), quality expectations
(e.g. information density), and social values (e.g. prestige).</p>
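<p>As a rough illustration, the camera and user terminology pools can be thought of as keyword sets grouped by category. The following Python sketch is our own illustration, with example entries taken from the text; it is not the tool's actual data format.</p>

```python
# Sketch of terminology pools for the camera and user domains.
# Entries are examples from the paper; the layout is illustrative.
CAMERA_TERMS = {
    "part": {"lens", "li-ion battery", "flash"},
    "property": {"shutter speed", "zoom"},
}

USER_TERMS = {
    "attribute": {"age"},
    "context_of_use": {"travel"},
    "desired_feature": {"weight"},
    "quality_expectation": {"information density"},
    "value": {"prestige"},
}

def classify_term(term):
    """Return (domain, category) pairs under which a term is listed."""
    hits = []
    for domain, table in (("camera", CAMERA_TERMS), ("user", USER_TERMS)):
        for category, terms in table.items():
            if term.lower() in terms:
                hits.append((domain, category))
    return hits
```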
      <p>
        Discourse Indicators Discourse indicators express discourse relations within or
between statements [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and help to organise statements into larger scale textual units such
as an argument. The analysis of discourse indicators and relations is complex: there
are many classes of indicators, multiple senses for instances of indicators depending on
context, and implicit discourse relations. However, in this study, we keep to a closed
class of explicit indicators that signal potentially relevant passages; it remains for the
analyst to resolve ambiguities in context.
      </p>
      <p>
        Sentiment Terminology We use sentiment terminology that signals lexical semantic
contrast: The flash worked poorly is the semantic negation of The flash worked flawlessly,
where poorly is a negative sentiment and flawlessly is a positive sentiment. An extensive
list of terms is classified according to a sentiment scale from highly negative to highly
positive [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Text analytic approaches to sentiment analysis are well-developed, but for
our purposes we take this relatively simple model to integrate with other components.
      </p>
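<p>The sentiment model can be sketched as a mapping from terms to positions on a scale, so that semantic contrast shows up as a difference in sign. The scale values and word list below are our own illustration, not the classified terminology of [2].</p>

```python
# Illustrative sentiment scale from highly negative (-2) to highly
# positive (+2); the actual classified term lists come from the cited
# sentiment lexicon, not this toy dictionary.
SENTIMENT_SCALE = {
    "terrible": -2, "poorly": -1,
    "acceptable": 0,
    "well": 1, "flawlessly": 2, "astound": 2, "excellent": 2,
}

def polarity(word):
    """Scale position of a term; unknown words are treated as neutral."""
    return SENTIMENT_SCALE.get(word.lower(), 0)

def semantic_contrast(word_a, word_b):
    """True when two sentiment terms lie on opposite sides of the scale,
    as with 'worked poorly' versus 'worked flawlessly'."""
    return polarity(word_a) * polarity(word_b) < 0
```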
      <p>In the following, we provide argumentation schemes that use the camera and user
terminology. The discourse indicators and sentiment terminology are only used in the
tool to identify relevant passages to instantiate the schemes.</p>
    </sec>
    <sec id="sec-4">
      <title>Argumentation schemes</title>
      <p>
        Argumentation schemes represent prototypical patterns of defeasible reasoning [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
They are like logical syllogisms in that they have premises, an implicational rule (e.g.
If…Then…), and a conclusion that follows from the premises and rule. Moreover, they
can be linked as in proof trees. Yet, unlike classical syllogisms, the conclusion only
defeasibly follows since the rule or the conclusion may not hold. Argumentation schemes
have been formalised [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and can be used for abstract argumentation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Example
schemes include practical reasoning, expert opinion, and analogy. However, schemes
are not widely used to support text analysis, are not tied to user terminology, and not
usually tied to some particular domain. This paper makes progress in addressing these
issues. In this section we develop a number of argument schemes found in customer
reviews, based on manual review of the corpus. Our approach is to remain grounded in
the source, and to choose example schemes based on their relevance to arguing for or
against purchase of the product. In this way, the schemes give us targets for information
extraction in the corpus: in particular, the targets are those textual passages that can be
used to instantiate the argumentation schemes.
      </p>
      <sec id="sec-4-1">
        <title>Argumentation Schemes - Abstract</title>
        <p>We present the schemes as propositions with variables such as aP1; the list of premises is
understood to hold conjunctively and the conclusion follows; the rule is left implicit.</p>
        <p>User Classification With this scheme, we reason from various attributes of a user to
the class of the user. This scheme is tied to the particular data under consideration, but
could be generalised. We have a variety of users, such as amateur or professional.</p>
        <p>User Classification Argumentation Scheme (AS1)
1. Premise: Agent x has user’s attributes aP1, aP2, ….
2. Premise: Agent x has user’s context of use aU1, aU2, ….
3. Premise: Agent x has user’s desirable camera features aF1, aF2, ….
4. Premise: Agent x has user’s quality expectations aQ1, aQ2, ….
5. Premise: Agent x has user’s values aV1, aV2, ….
6. Premise: User’s desirable camera features aF1, aF2, … promote/demote user’s
values aV1, aV2, ….</p>
        <p>Conclusion: Agent x is in class X.</p>
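<p>Viewed computationally, a scheme such as AS1 is a template whose premise slots are filled from extracted terminology. The sketch below is our own illustration of this view, not the formalisation of [4]; the binding names are invented.</p>

```python
from dataclasses import dataclass

@dataclass
class Scheme:
    """An argumentation scheme as a template: premise patterns with
    named slots, plus a conclusion pattern. Illustrative only."""
    name: str
    premises: list
    conclusion: str

    def instantiate(self, bindings):
        # All premises hold conjunctively; the implicational rule is
        # left implicit, as in the paper's presentation of the schemes.
        filled = [p.format(**bindings) for p in self.premises]
        return filled, self.conclusion.format(**bindings)

# A cut-down AS1 with two of its six premises, for illustration.
AS1 = Scheme(
    name="User Classification (AS1)",
    premises=[
        "Agent {x} has user's attributes {attrs}.",
        "Agent {x} has user's context of use {context}.",
    ],
    conclusion="Agent {x} is in class {cls}.",
)

premises, conclusion = AS1.instantiate(
    {"x": "Reviewer1", "attrs": "little experience",
     "context": "portrait", "cls": "Novice"})
```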
        <p>Camera Classification We have a scheme for classifying the camera. Note that we have
distinguished a user’s context of use from a camera’s context of use (and similarly for
other aspects); in a subsequent scheme (AS3), these are correlated.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Camera Classification Argumentation Scheme (AS2)</title>
        <p>1. Premise: Camera y has camera’s context of use cU1, cU2, ….
2. Premise: Camera y has camera’s available features cF1, cF2, ….
3. Premise: Camera y has camera’s quality expectations cQ1, cQ2, ….</p>
        <p>Conclusion: Camera y is in class Y.</p>
        <p>Combining Schemes for Camera Evaluation To reason about the camera and the course
of action, we use some ontological reasoning, i.e. the class of the camera and of the
user, plus argumentation. Given that a user is in class X with certain requirements and a
camera is in class Y with certain features, and the features meet the requirements, then
that camera is appropriate. The argument that conjoins the user and camera schemes
works as a filter on the space of possible cameras that are relevant to the user. We
realise this as follows.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Appropriateness Argumentation Scheme (AS3)</title>
        <p>1. Premise: Agent x is in user class X.
2. Premise: Camera y is in camera class Y.
3. Premise: The camera’s contexts of use satisfy the user’s context of use.
4. Premise: The camera’s available features satisfy the user’s desirable features.
5. Premise: The camera’s quality expectations satisfy the user’s quality expectations.</p>
        <p>Conclusion: Cameras of class Y are appropriate for agents of class X.</p>
        <p>Premises (1) and (2) of the appropriateness scheme (AS3) are the conclusions of
the user (AS1) and camera (AS2) classification schemes, respectively. The other
premises (3)-(5) have to be determined by subsidiary arguments which nonetheless
ground variables in the same way (in Logic Programming terms, the variables are
unified). Each of these subsidiary schemes has a similar form, where premises correlate
elements from AS1 and AS2 and conclude with one of premises (3)-(5). The
redundancy ensures that the variables match across schemes. We leave such intermediary
schemes as an exercise for the reader.</p>
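<p>The grounding of premises (3)-(5) can be sketched as a matching step in which the camera's attributes must cover the user's requirements. The Python below is a sketch under our own simplifying assumption that satisfaction is set inclusion; the example user and camera data are invented.</p>

```python
def satisfies(camera_values, user_values):
    """Premises (3)-(5) of AS3: every requirement on the user side must
    be met on the camera side. Sketch: satisfaction is set inclusion."""
    return set(user_values) <= set(camera_values)

def appropriateness(user, camera):
    """AS3: conclude that cameras of the camera's class are appropriate
    for agents of the user's class when all three correlations hold;
    if any premise fails, the conclusion is not drawn (defeasibility)."""
    if (satisfies(camera["context_of_use"], user["context_of_use"])
            and satisfies(camera["features"], user["desired_features"])
            and satisfies(camera["quality"], user["quality_expectations"])):
        return (f"Cameras of class {camera['cls']} are appropriate "
                f"for agents of class {user['cls']}.")
    return None

# Invented example bindings for illustration.
user = {"cls": "Novice", "context_of_use": {"portrait"},
        "desired_features": {"zoom", "flash"},
        "quality_expectations": {"good detail"}}
camera = {"cls": "daylight zoom",
          "context_of_use": {"daylight", "portrait"},
          "features": {"zoom", "flash", "HD video"},
          "quality": {"good detail", "vibrant colours"}}
```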
        <p>
          Practical Reasoning The objective of reasoning in this case is for the user to decide
what camera to buy. The reasoning is based on the user and the camera. This information
is then tied to the decision to buy the camera. Since reasoning about the camera relative
to the user is addressed elsewhere in the reasoning process, our scheme (AS4) is a
simplification of [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
      </sec>
      <sec id="sec-4-4">
        <title>Consumer Relativised Argumentation Scheme (AS4)</title>
        <p>1. Premise: Cameras of class Y are appropriate for agents of class X.
2. Premise: Camera y is of class Y.
3. Premise: Agent x is of class X.</p>
        <p>Conclusion: Agent x should buy camera y.</p>
        <p>The important point is that if the class of the camera and user do not align, or if there
are counterarguments to any of the premises or conclusions, then the conclusion from
AS4 would not hold.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Components of the Tool</title>
      <p>
        To build an analytic tool to explore and extract arguments, we operationalise the
components needed to recognise in the text some of the relevant elements identified in
Section 4. In this section, we briefly describe the relevant aspects of the General
Architecture for Text Engineering (GATE) framework [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and samples of how we operationalise
the components. In Section 6, we show results of sample queries.
GATE is a framework for language engineering applications, which supports efficient
and robust text processing [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; it is an open source desktop application written in Java
that provides a user interface for professional linguists and text engineers to bring
together a wide variety of natural language processing tools and apply them to a set of
documents. The tools are formed into a pipeline (a sequence of processes) such as a
sentence splitter, tokeniser, part-of-speech tagger, morphological analyser, gazetteers,
Java Annotation Patterns Engine (JAPE) rules, among other processing components.
For our purposes, the important elements of the tool to emphasise are the gazetteers
and JAPE rules: a gazetteer is a list of words that are associated with a central concept;
JAPE rules are transductions that take annotations and regular expressions as input and
produce annotations as output. Our methodology in using GATE is described elsewhere
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and in this paper, we focus on the key relevant elements: the gazetteers and
JAPE rules.
      </p>
      <p>
        Once a GATE pipeline has been applied to a corpus, we can either view the
annotations of a text by using the ANNIC (ANNotations In Context) corpus indexing and
querying tool [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or view them in situ in a whole text. We illustrate both.
      </p>
      <sec id="sec-5-1">
        <title>Gazetteers and JAPE Rules</title>
        <p>In Section 3, we presented terminology for discourse indicators and the camera domain.
The terminology is input to text files such as cameraFeatures.lst for terms relating to the
camera domain and conclusion.lst for terms that may indicate conclusions. The lists are
used by a gazetteer that associates the terms with a majorType such as cameraproperty
or conclusion. JAPE rules convert these to annotations that can be visualised and used
in search. For example, suppose a text has a token term “lens” and GATE has a gazetteer
list with “lens” on it; GATE finds the string on the list, then annotates the token with
majorType as cameraproperty; we convert this into an annotation that can be visualised
or searched for such as CameraProperty. A range of terms that may indicate conclusions
are all annotated with Conclusion. We can also create annotations for complex concepts
out of lower level annotations. In this way, the gazetteer provides a cover concept for
related terms that can be queried or used by subsequent annotation processes.</p>
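<p>The gazetteer-to-annotation step just described can be sketched as a lookup followed by a renaming rule. The Python below mimics, in a simplified way, what the gazetteer and a JAPE transduction do; the term lists and type names are examples from the text, and the function itself is our own illustration, not GATE's API.</p>

```python
# Gazetteer: term -> majorType, as loaded from list files such as
# cameraFeatures.lst and conclusion.lst.
GAZETTEER = {"lens": "cameraproperty", "consequently": "conclusion"}

# JAPE-like transduction: majorType -> searchable annotation type.
ANNOTATION_TYPE = {"cameraproperty": "CameraProperty",
                   "conclusion": "Conclusion"}

def annotate(tokens):
    """Return (token, annotation) pairs for tokens found in the gazetteer,
    with the majorType converted to a visualisable annotation type."""
    out = []
    for tok in tokens:
        major = GAZETTEER.get(tok.lower())
        if major:
            out.append((tok, ANNOTATION_TYPE[major]))
    return out
```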
        <p>In the implementation, we have gazetteer lists for camera domain terminology and
for user domain terminology, one list each for conclusions, premises, and contrast, and
a range of sentiment terminology lists. Samples of the lists (with number of items) are:
conclusion.lst (26): be clear, consequent, consequently, deduce, deduction, ....
cameraFeatures.lst (130): 14X Optical Zoom, action shots, AF tracking, ....
posThree.lst (172): astound, best, excellent, splendid, ....
userContextOfUse.lst (32): adventure, ambient light indoors, astronomy photos, ....</p>
        <p>In the next section, we show sample results.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Sample Results</title>
      <p>To identify passages that can be used to instantiate the argumentation schemes, we use
ANNIC searches to investigate the entire corpus. Figure 1 shows a result of a search for
negative sentiment, followed by up to 5 tokens, followed by a user context; the search
returns six different strings that match the annotation pattern.</p>
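<p>The pattern behind such a search can be sketched as a scan over annotated token sequences: a negative-sentiment annotation, up to five intervening tokens, then a user-context annotation. The sketch below is our own illustration of the idea, not ANNIC's query language; the annotations and example sentence are invented.</p>

```python
def match_pattern(annotated, max_gap=5):
    """Find spans of the form: NegativeSentiment, up to `max_gap`
    intervening tokens, UserContext. `annotated` is a list of
    (token, annotation-or-None) pairs."""
    spans = []
    for i, (_, ann) in enumerate(annotated):
        if ann != "NegativeSentiment":
            continue
        # Look ahead for a UserContext within max_gap tokens.
        for j in range(i + 1, min(i + 2 + max_gap, len(annotated))):
            if annotated[j][1] == "UserContext":
                spans.append(" ".join(t for t, _ in annotated[i:j + 1]))
                break
    return spans

# Invented example: a negative sentiment two tokens before a user context.
sentence = [("poor", "NegativeSentiment"), ("pictures", None),
            ("when", None), ("travelling", "UserContext")]
```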
      <p>We can also look at annotations in situ in a text. Figure 2 shows one review
document, with a variety of annotation types, where different highlights indicate different
annotation types (differentiated with colour in the original); from this review we extract
instantiations for the user and camera schemes. This passage makes the argument that
the camera is not appropriate since the user’s context of use – baby pictures – does not
match the camera context of use. In other words, we use the annotations to instantiate
the two schemes below.</p>
      <p>User Classification Argumentation Scheme (AS1) - Baby Picture Reviewer
1. Premise: Agent x has user’s attributes little experience.
2. Premise: Agent x has user’s constraints single camera.
3. Premise: Agent x has user’s context of use portrait.
4. Premise: Agent x has user’s desirable camera features easy to hold, flash doesn’t
require user attention, zoom.
5. Premise: Agent x has user’s quality expectations good pictures of pale objects, good
pictures of objects that don’t have contrast.
6. Premise: Agent x has user’s values good reviews, photo quality, photo detail.
Conclusion: Agent x is in class Novice.</p>
      <sec id="sec-6-1">
        <title>Camera Classification Argumentation Scheme (AS2) - Baby Picture Reviewer</title>
        <p>1. Premise: Camera y has camera’s context of use daylight.
2. Premise: Camera y has camera’s available features zoom, flash.
3. Premise: Camera y has camera’s quality expectations annoying flash, amazing for
bright colours, poor when colours do not contrast (people, pale objects), good
quality with zoom, good detail with zoom.</p>
        <p>Conclusion: Camera Canon PowerShot SX220 is in class daylight, contrast-oriented,
zoom camera.</p>
        <p>One argument against the above camera classification is given by another reviewer:
“This camera takes amazing low light photos...”. Based on the full text of that review, we
can instantiate the camera classification argumentation scheme differently, as follows:</p>
      </sec>
      <sec id="sec-6-2">
        <title>Camera Classification Argumentation Scheme (AS2) - Great low light</title>
        <p>1. Premise: Camera y has camera’s context of use video, photos.
2. Premise: Camera y has camera’s available features HD video recording, screen,
zoom, flash, colour settings.
3. Premise: Camera y has camera’s quality expectations lens shadow, awkward flash
location, vibrant colours.</p>
        <p>Conclusion: Camera Canon PowerShot SX220 is in class video, general photo
camera.</p>
        <p>This shows some advantages of argumentation schemes. First of all, they can help
an analyst make explicit the points of contention between reviews. The reviews disagree
on the camera’s quality expectations: this particular disagreement could not easily be
discovered statistically from the text. Second, we can separate out different levels of
subjective information to be found in the reviews. The user classification scheme
separates the purely subjective information that cannot be attacked from the camera
classification scheme, which can be fruitfully attacked. Further, by classifying cameras and
users, an entire line of reasoning follows: we only need to instantiate those two schemes.</p>
        <p>Some issues do arise, and will need to be considered in future work. First, we cannot
always instantiate some premises. For example, users may not indicate user attributes
or constraints in a review. In that situation, presumptive values could be used, or found
elsewhere in the corpus. Second, there are other arguments and counterarguments that
are made. For instance, some reviews suggest ways of dealing with the popup flash so
that it’s not annoying, making the camera more comfortable to use indoors. To handle
more types of arguments and counterarguments, we will want to develop further
argumentation schemes. Some negative implications depend on a deeper analysis of the
camera domain, for instance: “You need to learn all functions in order to shoot really
good photos.” or “People look either washed out or with a flat looking red/orange
complexion.” Other arguments, such as arguments from expertise, are common, and should
be analysed further to provide support for information extraction.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Related Work and Discussion</title>
      <p>
In this section, we outline related work, which includes opinion and review mining, user
preferences, ontological approaches, and the use of argumentation. What makes our
proposal novel is the combination of rule-based text analytics, user models,
and defeasible argumentation schemes, which together highly structure the
representation of information from the source materials. In previous work we have introduced
argumentation schemes for understanding evaluative statements in reviews as arguments
from a point of view [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Our earlier, preliminary implementation used a single
argumentation scheme [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]; this paper extends that work by implementing user terminology
and increasing the specification of camera terminology, and by using a cascade of
argumentation schemes, where the conclusions of two schemes are the premises of the
appropriateness scheme.
      </p>
      <p>
        Opinion and Review Mining Existing work includes review mining – information
extraction using sentiment terminology [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] – and feature extraction of pros and cons
[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Matching customers to the most appropriate product based on the heterogeneity
of customer reviews, rather than just statistical summaries, is an important problem;
Zhang et al. develop sentiment divergence metrics, finding that the central tendency or
polarity of reviews is insufficient [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Our goals, in matching customers to products
by distinguishing views based on a customer profile, are similar; unlike that study, we
focus on textual analysis, rather than statistical summarization of the text.
User Preferences Case-based reasoning has been used to incorporate critique-based
feedback and preference-based feedback into recommendation systems [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. To
predict ratings in Chinese-language restaurant reviews, Liu et al. model how frequently
users comment on features (‘concern’) and how frequently they rate features lower than
average (‘requirement’) [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. Rather than inferring user
preferences from multiple reviews written by a user, we extract user information from a
single review; although some personal information (such as the user demographics) is
consistent across items in different departments (such as books, movies, consumer
electronics, clothing, etc.), the key information about the user is that related to the product,
which depends on the category, and in some cases on the item being purchased. For
instance, preferences about an item having a flash or a viewfinder are not universal
amongst consumer electronics, but apply mainly to cameras.
      </p>
      <p>
        <bold>Ontology-related Approaches</bold> Yu et al. automatically construct a hierarchical
organization for aspects from product reviews and domain knowledge [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. This approach could
further enhance our extraction tool, and GATE offers tools to support this:
OwlExporter is a GATE plugin for populating OWL ontologies via NLP
pipelines [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]; KIM uses an ontology and knowledge base to add semantic annotations
based on information extraction [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
      </p>
      <p>
        <bold>Argumentation</bold> Argumentation schemes have been used as a theoretical framework for
reviews [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Another closely related problem is argumentation mining – using natural
language processing to detect disagreement [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11,12,13</xref>
        ] or stance [
        <xref ref-type="bibr" rid="ref14 ref15">14,15</xref>
        ].
      </p>
    </sec>
    <sec id="sec-8">
      <title>Conclusions and Future Work</title>
      <p>We have presented an information extraction tool that supports the identification of
relevant information to instantiate argumentation schemes, by annotating discourse
indicators as well as user, domain, and sentiment terminology. Textual fragments are
associated with annotation types, highlighting the role the text may play in instantiating
an argumentation scheme. As we can identify positive and negative sentiment, we can
find statements that contribute to arguments for or against other statements. The novelty
of our proposal is the combination of rule-based text analytics, terminology for the
various components of the analysis, and defeasible argumentation schemes, which
together yield a highly structured representation of information from the source materials.
As a result of the analysis and instantiation, we can provide a rich, articulated analysis
of the arguments for or against a particular decision.</p>
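      <p>The rule-based annotation idea can be illustrated with a minimal sketch: match sentiment terms and discourse indicators in review sentences, and label each sentence with the role it could play in an argumentation scheme. The sketch below is hypothetical Python for illustration only; the term lists and labels are invented, and the actual tool uses GATE gazetteers and JAPE rules rather than this code.</p>

```python
# Hypothetical sketch of rule-based annotation for review sentences.
# Term lists are invented for illustration; the tool described in the
# paper uses GATE gazetteers and JAPE rules instead.

POSITIVE = {"great", "sharp", "excellent", "love"}
NEGATIVE = {"blurry", "poor", "disappointing", "broken"}
PREMISE_INDICATORS = {"because", "since"}
CONTRAST_INDICATORS = {"but", "however", "although"}

def annotate(sentence: str) -> dict:
    """Attach sentiment and discourse-indicator annotations to one sentence."""
    words = {w.strip(".,!?;").lower() for w in sentence.split()}
    return {
        "text": sentence,
        "sentiment": ("positive" if words & POSITIVE else
                      "negative" if words & NEGATIVE else "neutral"),
        "premise": bool(words & PREMISE_INDICATORS),
        "contrast": bool(words & CONTRAST_INDICATORS),
    }

review = [
    "I love this camera because the pictures are sharp.",
    "However, the battery life is disappointing.",
]
annotations = [annotate(s) for s in review]
```

      <p>In this sketch, a positive sentence with a premise indicator suggests a statement supporting a conclusion, while a contrast indicator flags a potential counterargument; instantiating a scheme would then combine such annotated fragments.</p>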
      <p>In future work, we plan to further instantiate the schemes using the tool, noting
where they work as intended and where they stand to be improved. Along with this,
conceptual issues will be addressed, for instance to clarify distinctions between the
camera’s quality expectations and features as well as to support matches between a
user’s values and camera properties. We will develop additional schemes bearing on, for
example, expertise, comparison, or particular features (e.g. warranties). An evaluation
exercise will be carried out using a web-based annotation editor and evaluation tool,
GATE Teamware, to measure the extent of interannotator agreement on the annotation
types. Important logical developments would be an ontology for users and cameras to
support text extraction, and the import of scheme instances into an argumentation
inference engine to test inferences.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>The first author’s work was supported by Science Foundation Ireland under both Grant
No. SFI/09/CE/I1380 (Líon2) and a Short-term Travel Fellowship. The second author
gratefully acknowledges support by the FP7-ICT-2009-4 Programme, IMPACT Project,
Grant Agreement Number 247228. The views expressed are those of the authors.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Webber</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Egg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kordoni</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Discourse structure and language technology</article-title>
          .
          <source>Natural Language Engineering</source>
          (December
          <year>2011</year>
          ), online first.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Nielsen</surname>
            ,
            <given-names>F.Å.</given-names>
          </string-name>
          :
          <article-title>A new ANEW: Evaluation of a word list for sentiment analysis in microblogs</article-title>
          .
          <source>Making Sense of Microposts at ESWC 2011</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Walton</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reed</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macagno</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <source>Argumentation Schemes</source>
          . Cambridge Univ. Press (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Wyner</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atkinson</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bench-Capon</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A functional perspective on argumentation schemes</article-title>
          .
          <source>In: Proceedings of the 9th International Workshop on Argumentation in Multi-Agent Systems (ArgMAS 2012)</source>
          (
          <year>2012</year>
          )
          <fpage>203</fpage>
          -
          <lpage>222</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Prakken</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>An abstract framework for argumentation with structured arguments</article-title>
          .
          <source>Argument and Computation</source>
          <volume>1</volume>
          (
          <issue>2</issue>
          ) (
          <year>2010</year>
          )
          <fpage>93</fpage>
          -
          <lpage>124</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Wyner</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atkinson</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bench-Capon</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Semi-automated argumentative analysis of online product reviews</article-title>
          .
          <source>In: Proceedings of the 4th International Conference on Computational Models of Argument (COMMA 2012)</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maynard</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tablan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>GATE: A framework and graphical development environment for robust NLP tools and applications</article-title>
          .
          <source>In: Proceedings of the Association for Computational Linguistics (ACL'02)</source>
          . (
          <year>2002</year>
          )
          <fpage>168</fpage>
          -
          <lpage>175</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Wyner</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>On rule extraction from regulations</article-title>
          .
          <source>In: Legal Knowledge and Information Systems - JURIX 2011</source>
          . IOS Press (
          <year>2011</year>
          )
          <fpage>113</fpage>
          -
          <lpage>122</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Aswani</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tablan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Indexing and querying linguistic metadata and document content</article-title>
          .
          <source>In: Proceedings of 5th International Conference on Recent Advances in Natural Language Processing</source>
          , Borovets, Bulgaria (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Opinion mining and sentiment analysis</article-title>
          .
          <source>Foundations and Trends in Information Retrieval</source>
          <volume>2</volume>
          (
          <issue>1</issue>
          -2) (January
          <year>2008</year>
          )
          <fpage>1</fpage>
          -
          <lpage>135</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Albert</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amgoud</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Saint-Cyr</surname>
            ,
            <given-names>F.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saint-Dizier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costedoat</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Introducing argumentation in opinion analysis: Language and reasoning challenges</article-title>
          .
          <source>In: Sentiment Analysis where AI meets Psychology (SAAIP at IJCNLP'11)</source>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Saint-Dizier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Processing natural language arguments with the &lt;TextCoop&gt; platform</article-title>
          .
          <source>Argument &amp; Computation</source>
          <volume>3</volume>
          (
          <issue>1</issue>
          ) (
          <year>2012</year>
          )
          <fpage>49</fpage>
          -
          <lpage>82</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Wyner</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mochales-Palau</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moens</surname>
            ,
            <given-names>M.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milward</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Approaches to text mining arguments from legal cases</article-title>
          .
          <source>In Semantic Processing of Legal Texts. Volume 6036 of Lecture Notes in Computer Science</source>
          . Springer (
          <year>2010</year>
          )
          <fpage>60</fpage>
          -
          <lpage>79</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Abbott</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anand</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tree</surname>
            ,
            <given-names>J.E.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bowmani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>King</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>How can you say such things?!?: Recognizing disagreement in informal political argument</article-title>
          .
          <source>In: Proceedings of the NAACL HLT 2011</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anand</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abbott</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tree</surname>
            ,
            <given-names>J.E.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martell</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>King</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>That's your evidence?: Classifying stance in online political debate</article-title>
          .
          <source>Decision Support Systems</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Mann</surname>
            ,
            <given-names>W.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thompson</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          :
          <article-title>Rhetorical structure theory: Toward a functional theory of text organization</article-title>
          .
          <source>Text-Interdisciplinary Journal for the Study of Discourse</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ) (
          <year>1988</year>
          )
          <fpage>243</fpage>
          -
          <lpage>281</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Somasundaran</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Recognizing stances in ideological on-line debates</article-title>
          .
          <source>In: Proceedings of the NAACL HLT 2010</source>
          (
          <year>2010</year>
          )
          <fpage>116</fpage>
          -
          <lpage>124</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Wyner</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Arguing from a point of view</article-title>
          .
          <source>In: First International Conference on Agreement Technologies, AT '12</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Opinion Mining and Sentiment Analysis</article-title>
          .
          <source>In: Web Data Mining</source>
          . Springer (
          <year>2011</year>
          )
          <fpage>459</fpage>
          -
          <lpage>526</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Deciphering word-of-mouth in social media: Text-based metrics of consumer reviews</article-title>
          .
          <source>ACM Trans. Manage. Inf. Syst</source>
          .
          <volume>3</volume>
          (
          <issue>1</issue>
          ) (April
          <year>2012</year>
          ) Article 5,
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Smyth</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Case-based recommendation</article-title>
          .
          <source>The Adaptive Web. Volume 4321 of Lecture Notes in Computer Science</source>
          . Springer (
          <year>2007</year>
          )
          <fpage>342</fpage>
          -
          <lpage>376</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Combining user preferences and user opinions for accurate recommendation</article-title>
          .
          <source>Electronic Commerce Research and Applications</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zha</surname>
            ,
            <given-names>Z.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chua</surname>
            ,
            <given-names>T.S.</given-names>
          </string-name>
          :
          <article-title>Domain-assisted product aspect hierarchy generation: towards hierarchical organization of unstructured consumer reviews</article-title>
          .
          <source>In: Proceedings of EMNLP '11</source>
          (
          <year>2011</year>
          )
          <fpage>140</fpage>
          -
          <lpage>150</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Witte</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khamis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rilling</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Flexible ontology population from text: The OWL exporter</article-title>
          .
          <source>In: LREC 2010</source>
          (
          <year>2010</year>
          )
          <fpage>3845</fpage>
          -
          <lpage>3850</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Popov</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiryakov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirilov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ognyanoff</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goranov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>KIM semantic annotation platform</article-title>
          .
          <source>In The Semantic Web - ISWC 2003. Volume 2870 of Lecture Notes in Computer Science</source>
          . Springer (
          <year>2003</year>
          )
          <fpage>834</fpage>
          -
          <lpage>849</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Heras</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atkinson</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Botti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grasso</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Julián</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McBurney</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Applying argumentation to enhance dialogues in social networks</article-title>
          .
          <source>In: CMNA 2010</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Sporleder</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lascarides</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Using automatically labelled examples to classify rhetorical relations: A critical assessment</article-title>
          .
          <source>Natural Language Engineering</source>
          <volume>14</volume>
          (
          <issue>3</issue>
          ) (
          <year>2008</year>
          )
          <fpage>369</fpage>
          -
          <lpage>416</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>