=Paper= {{Paper |id=Vol-1876/paper10 |storemode=property |title=Argumentation Devices in Reasoning About Health |pdfUrl=https://ceur-ws.org/Vol-1876/paper10.pdf |volume=Vol-1876 |authors=Sally Jackson,Jodi Schneider |dblpUrl=https://dblp.org/rec/conf/ijcai/JacksonS16 }} ==Argumentation Devices in Reasoning About Health== https://ceur-ws.org/Vol-1876/paper10.pdf
                                                      Proceedings of CMNA 2016 - Floris Bex, Floriana Grasso, Nancy Green (eds)




                            Argumentation Devices in Reasoning About Health
                                          Sally Jackson and Jodi Schneider
                            University of Illinois at Urbana-Champaign, sallyj@illinois.edu
                                       University of Pittsburgh, jos188@pitt.edu




Abstract

Health controversies are infused with products of expert reasoning, often interpreted by non-experts. To understand these controversies, we must pay closer attention both to the field-dependent devices that characterize expert reasoning, and to how non-experts engage with experts’ evidence and reasoning in their own argumentative practices. We describe two argumentation devices that have emerged in medical research and discuss the role of these devices within health controversies.

1   Introduction

Argumentation is a constantly evolving social practice, one that builds on thousands of years of human experience. The ubiquitous human practice of seeking advice from experts, for example, has very long historical roots, but it is also a basis for decision-making that is in constant flux as the grounds for expert opinion change. Expert fields do not just accumulate information; they also invent specialized ways of reasoning about information. Toulmin [1958] noted this fact and discussed at length the possibility that warrants (or backing for warrants) might justify the movement from data to claim only within particular fields. The Argument Interchange Format [Chesñevar et al., 2006] acknowledges field dependence in argumentation by including context in the core model and assuming that context may include domain-specific argumentation rules that are direct counterparts of domain-independent schemes. Our goal in this paper is to explore field-dependent patterns of reasoning in health and medicine and to consider how these can be modeled.

Several examples drawn from a contemporary health controversy illustrate an important fact: As expert fields innovate in their own reasoning practices, arguments built by non-experts on the prior arguments of experts may take forms quite unlike the canonical form of argument from expert opinion. Each example mentions a conclusion drawn by an expert or group of experts, and at first glance, it would seem that each would pass all of the tests defined by standard lists of critical questions for the expert opinion scheme [Walton et al., 2008, p. 15], including the “backup evidence question.”

Example 1, Press Release from AutismSpeaks non-profit¹

    In the largest-ever study of its kind, researchers again found that the measles-mumps-rubella (MMR) vaccine did not increase risk for autism spectrum disorder (ASD). This proved true even among children already considered at high risk for the disorder. In all, the researchers analyzed the health records of 95,727 children, including more than 15,000 children unvaccinated at age 2 and more than 8,000 still unvaccinated at age 5. Nearly 2,000 of these children were considered at risk for autism because they were born into families that already had a child with the disorder.

    The report appears today in JAMA, the Journal of the American Medical Association.

Example 2, The New York Times²

    According to Dr. Paul Offit, an infectious disease specialist at Children's Hospital of Philadelphia, young children readily handle the immune challenges of multiple vaccines. For example, studies have shown the five-in-one vaccine Pediarix against hepatitis B, polio, tetanus, diphtheria and pertussis is as safe and effective as giving each of these vaccines individually.

Example 3, The Guardian³

    The evidence of no link between MMR and autism is now extremely strong. In February 2012, the Cochrane Collaboration - which compiles gold-standard reviews of medical evidence - conducted a huge study into the safety of MMR. This mega-review brought together evidence from 54 difference(sic) scientific studies using a variety of methodologies and involving 14.7 million children from around the world.

¹ https://www.autismspeaks.org/science/science-news/no-mmr-autism-link-large-study-vaccinated-vs-unvaccinated-kids
² http://well.blogs.nytimes.com/2015/08/10/not-vaccinating-children-is-the-greater-risk/?_r=0
³ http://www.theguardian.com/society/2013/apr/25/measles-mmr-the-essential-guide

These passages are typical of the appearance of expert knowledge in the public discussion of childhood vaccination. But the critical questions associated with the argument from expert opinion scheme will not provide the kind of searching evaluation that these examples require.
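
The limitation of the expert opinion scheme noted above can be made concrete with a small sketch. It assumes the six critical-question labels commonly listed for argument from expert opinion (the question wording here is paraphrased, not quoted), and it adds a hypothetical `device` field that the standard scheme does not have. All class, field, and variable names are illustrative assumptions, not part of any published model.

```python
from dataclasses import dataclass, field

# Labels follow the commonly listed critical questions for argument from
# expert opinion; the wording of each question is a paraphrase.
EXPERT_OPINION_QUESTIONS = {
    "expertise": "How credible is the source as an expert?",
    "field": "Is the source an expert in the field of the claim?",
    "opinion": "What did the source assert that implies the claim?",
    "trustworthiness": "Is the source personally reliable?",
    "consistency": "Is the claim consistent with what other experts assert?",
    "backup_evidence": "Is the source's assertion based on evidence?",
}


@dataclass
class ExpertOpinionArgument:
    source: str
    claim: str
    # Hypothetical field: the field-dependent device behind the expert's
    # conclusion. The standard scheme has no slot for it, so none of the
    # checks below ever looks at it.
    device: str
    answers: dict = field(default_factory=dict)  # question label -> bool

    def unanswered(self):
        """Return the labels of critical questions that have no answer yet."""
        return [q for q in EXPERT_OPINION_QUESTIONS if q not in self.answers]

    def passes_scheme(self):
        """True only if every critical question is answered favorably."""
        return not self.unanswered() and all(self.answers.values())


# Illustration only: Example 3 as it appears "at first glance", with every
# standard question answered favorably.
example_3 = ExpertOpinionArgument(
    source="Cochrane Collaboration",
    claim="There is no link between MMR and autism",
    device="Cochrane Review",
    answers={label: True for label in EXPERT_OPINION_QUESTIONS},
)
print(example_3.passes_scheme())  # True
```

In this sketch Example 3 passes every standard question, yet nothing in the check inspects the review procedure named in `device`. That unexamined slot is where device-specific critical questions would have to attach.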








The basis for the expert opinion is in each case not only field-specific information (“backup evidence”) but also some field-dependent inference strategy, applied directly by the expert source mentioned in Examples 1 and 3, and indirectly (by the expert’s own expert sources) in Example 2. How should the differences among texts like these be represented, and what new critical questions do these arguments invite?

2 Field-dependent argumentation devices

Expert fields may build up repertoires of reasoning strategies over time, resulting in field-dependent inference rules. When any such new inference rule is proposed, other experts may challenge it, describing undercuts or rebuts to the strategy (as we will describe in 2.1 and 2.2). Iterative repair and critique continue, often over long periods of time, until the strategy is defeated, abandoned, or stabilized.

We will use the term argumentation device to describe a stable inference rule, currently accepted within a given field as a repeatable method for generating new, valid arguments within the field’s domain. An argumentation device may contain material components that augment human reasoning in various ways and institutional components that underwrite its dependability.

In many respects, argumentation devices resemble argumentation schemes. Schemes, though, are generally assumed to be domain-independent and stable over long periods of time [Chesñevar et al., 2006, p. 297], while the inventions we call argumentation devices are deeply entwined with the state of knowledge in a given domain. They work like schemes (as rules that justify drawing a conclusion from data); and like schemes, they have specifiable critical questions. However, the critical questions needed to evaluate the output of an argumentation device need to be discovered for each such device, often by seeing how the device fares in actual debate among experts, and then again, in larger contexts (like public debate) where the output of the device may be used as evidence for some further conclusion. They may change in response to change in the substantive knowledge of the field, as when some newly discovered fact about the phenomena exposes a previously undetectable way for the device to go wrong.

Argumentation devices can be extremely complex, incorporating material and institutional components that simply do not figure in ordinary schemes. For domains advancing high-stakes claims, like medical research, there are many different motivations for critical scrutiny (scientific commitment to empirical adequacy, pragmatic interest in quality of health care, patient concern for safety, financial interest in health care products, and more), and any of these motivations can lead either to the discovery of new critical questions or to the invention of new strategies for disarming them. In the next two sections, we introduce two argumentation devices that have emerged over the past half-century and co-evolved rapidly, supported by significant investment in material and institutional resources.

2.1 Randomized controlled trials

Establishing and defending claims about medical treatments is central to health science and practice. Although the problem has existed throughout human history, our standards for defense of such claims have changed dramatically in the last century, with the invention of the Randomized Controlled Trial (RCT). RCTs combine three features: (1) a comparison of a treatment of interest with a control condition (or with an alternative treatment); (2) random allocation of patients to treatment conditions; and (3) “blinding” of patients and researchers to the treatment any given individual receives.

Meldrum [2000] provides an illuminating account of the emergence of RCTs, documenting the series of innovations that, when combined into a single experimental design, became the standard against which all other medical evidence has come to be compared. We summarize her account here to highlight the fact that specific innovations (like random allocation) serve specific argumentative functions, so much so that their omission is said to make the experiment invalid as evidence for a conclusion about the effect of a treatment.

Prior to the 1900s, controlled experiments in human health were rare, and according to Meldrum, even more rarely conducted on treatments that could be administered to individual patients. Medical practitioners engaged in careful observation and sharing of results, and the literature was filled with case reports of what had worked in individual cases, but without procedural controls needed for strong inference from these observations.

Proliferation of treatments (particularly drugs and patent medicines) led to the formation of assessment agencies in the early 1900s, including the American Medical Association’s Council on Pharmacy and Chemistry, and the first U.S. federal bureau empowered to review “the extravagant claims” made by the pharmaceutical industry of the time [Meldrum, 2000, p. 749]. Of central importance to our treatment of RCTs as an argumentation device is the role agencies played in challenging these extravagant claims.

To understand RCTs as an argumentation device, it is important to understand how profoundly doubt, disagreement, and error have affected the elaboration of this device over time. Scientists working with human subjects had to discover the need for randomization in the assignment of patients or other subjects to experimental conditions; the general superiority of comparisons based on randomly assigned groups is counterintuitive, but is nowadays universally acknowledged to be the best defense against bias or suspicion of bias. Other innovations like double-blinding were added as standard features of experiments on human subjects, not because logic requires them, but because of the practical discovery that patients’ and experimenters’ expectations could affect health outcomes, leading to novel criticisms of experiments for falling prey to “the placebo effect.” RCTs with various forms of blinding are the present standard for evidence in medicine, but they achieved their present status only slowly,
and only incrementally. At each stage of development, it has been a device meant to disarm known objections to the conclusions drawn from a set of observations.

RCTs stabilized into a standardized, widely accepted form only in the late 1950s [Meldrum, 2000, p. 754], about ten years after the first large-scale trials were initiated (1946 in the US, 1947 in the UK). A decade later RCTs gained institutional status. In the wake of thalidomide-associated birth defects, the U.S. Food and Drug Administration began to investigate new approaches for reviewing drugs for safety [Meldrum, 2000]. This led to a 1970 regulation enshrining the RCT in U.S. law.

RCTs are not by any means a secure defense for a claim about a treatment effect. A series of RCTs, each competently executed, can come to different conclusions about a treatment. And each one remains vulnerable to subtle counterarguments that only expert researchers are likely to discover (previously unknown confounds, for example). However, RCTs handily defeat most other forms of evidence that might be advanced for the same class of claims. They are a “package deal” of evidence for a claim and evidence against a standard set of possible rebuttals, creating a strong but still defeasible conclusion.

2.2 Cochrane Reviews

As noted briefly above, RCTs on a particular treatment may accumulate within a scientific literature, each reporting some measurement of the effect of the treatment. Despite the widely acknowledged value of RCTs for evaluating treatment effects, expertise in interpretation is still necessary. One of the things experts know is that random variability is always present in the results of any series of identically designed experiments on human subjects. This creates an opportunity for confirmation bias to operate as readers cherry-pick results that support their beliefs and ignore or discount results that do not. Accompanying the rise of RCTs in medicine is another important invention, the systematic research review designed to aggregate evidence from many individual studies into a statement of what the research as a whole may be taken to support. Over just the past three decades, a highly standardized form of systematic review has emerged, known as the Cochrane Review.

Cochrane Reviews are named for Archie Cochrane, a Scottish doctor and epidemiologist, who championed the use of RCTs for guidance of clinical practice. In 1989, the publication of a 2-volume work on pregnancy and childbirth marked what Cochrane regarded as “a real milestone in the history of randomised trials and in the evaluation of care” [Chalmers et al., 1989; and Cochrane’s Foreword]. This was the first major systematic review in health science, a massive undertaking involving ten years of effort to review over 3000 controlled trials published since 1950 [Review, 1990]. A Cochrane Review is a review of literature conducted using very well-defined procedures outlined in an official handbook.⁴ These procedures include exhaustive search for relevant studies; use of scoring rubrics for evaluation of the relevance and strength of evidence in each study; prescribed methods for combining information quantitatively; preferred methods for presentation of findings; and more.

Unlike RCTs, systematic reviews do not generate new observations. They assemble evidence that already exists in a scientific literature and draw inferences from this evidence in a highly disciplined way. Evidence that would be considered inconsistent from a common-sense point of view is taken as input to the review, and interpreted in light of what experts know about variability. A Cochrane Review treats study-to-study variation in findings from multiple RCTs as normal and unremarkable, and because all relevant evidence is included, it offers good defense against any charge of cherry-picking. New reviewing standards emerge in response to problems noticed in the quality of argumentation produced by a review. For example, the Cochrane handbook includes cautions against “common mistakes” made in reviewing, such as concluding that there is evidence of no effect of an intervention when all that is really justified by the literature is that there is no evidence of an effect.⁵ Against a charge that the Cochrane Review is only as good as the body of primary research available for aggregation, the Cochrane Collaboration (more than 37,000 contributors from over 130 countries) has adopted a formal practice of “grading” the strength of the evidence base itself.

Although systematic review methods are still in a period of rapid methodological innovation, the Cochrane Review has already achieved the status of a trusted argumentation device, largely because its procedures are so explicitly linked to critical questions on which earlier styles of research synthesis regularly failed. The methodical search procedures required for a Cochrane Review make it hard for a critic to object that evidence was assembled to fit the reviewer’s own hypothesis. Counter-arguing individual studies (a once-common practice in narrative reviews of literature) is replaced with careful and explicit coding decisions applied impartially to the entire corpus of potentially relevant studies. Reviewer bias is further minimized through highly structured reporting methods: For example, if the review includes meta-analysis (a technique for transforming results of each individual study into a quantitative effect size measure), the results must be displayed as a “forest plot” that allows readers to inspect results on a study-by-study basis.

⁴ http://handbook.cochrane.org
⁵ Cochrane Handbook Part 2 section 12.7.2

3 Modeling the role of argumentation devices

In a very preliminary way, we want to consider the challenges of including argumentation devices like these in formal models. Argumentation devices resemble schemes in most respects; they serve as reusable links between different collections of data and conclusions drawn from these data. They are applied to data, and although devices do not need defense in each application, they do have a context-independent defense that can be attacked either in the particular occasion of use or in a general critique of all
arguments using the device. In an argument network, they would be better represented as a scheme node than as an information node. In a Toulmin diagram, the device is the warrant for conclusions drawn from data. The field-dependence of an argumentation device will commonly be most apparent in what appears in the backing for the device.

An argumentation device gains its status through incorporation of various assurances of its own ability to deliver reliable conclusions, including new institutional resources that underwrite the device as a whole. The Cochrane device is a particularly clear example, since it depends very openly on the growth of institutional resources to assure that a conclusion from a Cochrane Review is based on the most exhaustive search possible for relevant evidence. Although a machine-searchable database of medical research literature (MEDLINE, from the US National Library of Medicine) has been available since the 1960s, the Cochrane Collaboration has created a specialized database specifically for controlled trials, known as CENTRAL (Cochrane Central Register of Controlled Trials), that includes both a subset of MEDLINE entries and other items retrieved from a variety of sources, including manual search of conference programs by members of the Cochrane Collaboration. Reviewers are expected to search both MEDLINE and CENTRAL to identify every possible relevant item, and to examine each item for whether it meets inclusion criteria. A typical Cochrane Review will identify thousands of potentially relevant items and winnow these to a few dozen studies that actually provide relevant data.

The resources that are required for an argumentation device to operate at all need some presence in any graph, diagram, or other formal representation of an argument from expertise that is itself an argument from some field-dependent device, such as the arguments presented in Examples 1, 2, and 3. These resources are meant as strengtheners of the expert argument, but they are also a system of delegations in which responsibility for the validity of any one conclusion has been spread throughout a huge collective of participants. The individual performers of Cochrane Reviews take responsibility for faithful adherence to Cochrane procedures, but responsibility for the exhaustiveness of the search is delegated to databases; the responsibility for what is available to be retrieved is delegated to funding agencies that set research priorities; and the responsibility for establishing hierarchies of evidence is delegated to trusted working groups within the Cochrane Collaboration. These delegations are themselves an interesting fact about contemporary argumentation [Jackson, 2015a] that could be better understood if they were explicitly included in formal models of argumentation. Figure 1 illustrates how these delegations might be incorporated in a Toulmin diagram, as forms of backing for the Cochrane Review procedure.

The most distinctive differences between argumentation devices and familiar argumentation schemes are their field-specificity and their openness to redesign [Jackson, 2015b]. The primary purpose of an argumentation device is to provide convincing evidence for a conclusion to people who understand the workings of the device and have confidence in it. RCTs and Cochrane Reviews share a well-defined context consisting of an audience of medical experts, a pre-existing literature, and other features whose argumentative relevance is as yet unclear. Both have developed iteratively from critique within the field, and both are still being elaborated to eliminate vulnerabilities in their conclusions. Argumentation devices demand consideration of context: not only the community within which they emerge but also the state of play within that community.

Figure 1. General form of a Cochrane Review’s argument, with delegations of responsibility in the backing for the warrant.

4 Critical questions about devices

A Cochrane Review is organized around both presentation of data and response to critical questions about the gathering and interpretation of the data. In other words, much of the text of a Cochrane Review consists of explicit answers to the questions other experts would be presumed to have. An enormous advantage that comes with use of an established argumentation device is that the device itself does not need defense for each occasion of use. It can function as a warrant for many specific conclusions, each of which has its own unique body of evidence.

Although an argumentation device may be applied in a completely uncontroversial way within an expert field, that is no protection against questions or challenges from beyond the field. The fact that a device has earned the confidence of a group of experts is not quite sufficient to earn trust from other potential audiences. The testing ground for any new argumentation device is argumentation itself. The device must earn its status by withstanding critique. We end by considering what kinds of questions might arise, reasonably or even unreasonably, as devices like Cochrane Reviews enter new testing grounds.

To begin with, critical questions relevant to arguments supported by Cochrane Reviews share some similarities with critical questions for arguments from expert opinion. The accuracy of an arguer’s understanding of expert opinion
is always relevant. Consider again Example 3—the                   involves questions that may need to be asked to correct an
Guardian’s appeal to a Cochrane Review of 54 studies as            unsuspected bias. Such questions can sometimes be
evidence against any link between autism and MMR. The              formulated more easily by non-experts than by the experts
review [Demicheli et al., 2012] did in fact look at 54             themselves, by coming from a perspective with its own
studies, but only 10 included autism as an outcome variable,       biases, but different ones.
and by the reviewers’ assessments of quality, it does not             In health controversies where much is at stake, both
appear that they would agree that the 10 studies relevant to       experts and non-experts will fully explore the possible
this particular claim provide “extremely strong” evidence.         grounds for disagreement with conclusions drawn from
(None of the ten were RCTs, and none individually offered          experts’ argumentation devices, and the devices themselves
a strong design for detecting a link between MMR and               will improve in order to better withstand critique. An
autism. Reviewers classified all ten of the autism-related         important goal in modeling argumentation devices is to
studies as containing either “high” risk of bias or                expose avenues for productive examination of the devices
“moderate/unknown” risk of bias.) Where the Guardian has           by non-experts, and to assist experts in responding
gone wrong here is in assuming that a “gold standard”              productively to even the most skeptical critique.
procedure can produce “extremely strong” evidence from a
research literature that is inadequate, a failure to understand    Acknowledgments
that any limitations of the primary research literature are        The second author was supported by training grant
inherited by the review.                                           5T15LM007059-29 from the National Library of Medicine
   But in addition to questions similar to those relevant to       and National Institute of Dental and Craniofacial Research.
assessment of argument from expert opinion, any device of
this kind will be vulnerable to challenges specific to the
device. A significant feature of the current design of the
                                                                   References
Cochrane Review is that it aggregates evidence from                [Chalmers et al., 1989] Iain Chalmers, Murray Enkin, and
scientific literature (sometimes including unpublished data,          Marc J.N.C. Keirse. Effective care in pregnancy and
but mostly from reports published in some form and                    childbirth: Pregnancy. Oxford University Press, 1989.
included in a database). By design, a Cochrane Review               [Chesñevar et al., 2006] Carlos Chesñevar et al. Towards
ignores evidence that could, in principle, be relevant. This          an Argument Interchange Format, The Knowledge
includes the very wide range of evidence types that can be            Engineering Review 21(4): 293–316, December 2006.
supplied by ordinary people paying attention to their own
health and their own reactions to treatments. For the              [Demicheli et al., 2012] Vittorio Demicheli, Alessandro
vaccination controversy, this includes evidence that is               Rivetti, Maria Grazia Delabine, and Carlo di Pietrantonj.
highly credible to many members of the public (first-hand             Vaccines for measles, mumps and rubella in children.
parent observations of adverse reactions to vaccines); the            Cochrane Database of Systematic Reviews 2012 Issue 2.
fact that no serious effort has been made to systematically           Art. No.: CD004407.
review these reports is a reason for those affected to             [Jackson, 2015a] Sally Jackson. Deference, distrust, and
question the credibility of the institutions that back the            delegation: Three design hypotheses. Reflections on
Cochrane device. So one class of critical questions have to           Theoretical Issues in Argumentation Theory (pp. 227-
do with whether there are forms of evidence the device does           243). Springer International Publishing, August 2015.
not (or cannot) ingest.                                            [Jackson, 2015b] Sally Jackson. Design thinking in
   Another class of critical questions have to do with biases         argumentation theory and practice. Argumentation,
built into the device. The device is always designed to               29(3): 243-263, August 2015.
answer some set of questions but not others, and to assume
those things that its expert users assume. To illustrate, a         [Meldrum, 2000] Marcia L. Meldrum. A brief history of the
common notion within anti-vaccination discourse is that the           randomized controlled trial: From oranges and lemons to
institutions responsible for the production of the primary            the gold standard. Hematology/oncology clinics of North
research have so strong an interest in mass immunization              America, 14(4), 745-760, August 2000.
that they conceal or suppress evidence of serious risks—           [Oliver and Wood, 2014] J. Eric Oliver and Thomas Wood.
characterized as conspiracy thinking by Oliver and Wood               Medical conspiracy theories and health behaviors in the
[2014]. While no one seriously expects scientists to respond          United States. JAMA Intern Med, 174(5):817-818, 2014.
to conspiracy theories, it is certainly reasonable to ask what
                                                                    [Review, 1990] Book review of [Chalmers et al., 1989].
interests and assumptions shared within an expert
                                                                      Birth, 17(1): 55–62, March 1990.
community might make the community blind to certain
evidence or deaf to certain arguments.                              [Toulmin, 1958] Stephen E. Toulmin. The uses of
      Seeing argumentation devices as an encapsulation of             argument. Cambridge University Press, 1958.
how the expert community reasons, questions can be asked           [Walton et al., 2008] Douglas Walton, Chris Reed, and
not only about the individual use of the device in one               Fabrizio Macagno, Argumentation Schemes, 1st edn,
argument, but also about the assumptions the device                  Cambridge University Press, August 2008.
encapsulates. This is an important shift of scale that



