=Paper= {{Paper |id=Vol-289/paper-10 |storemode=property |title=Syntactic Disambiguation for the Semantic Web |pdfUrl=https://ceur-ws.org/Vol-289/po02.pdf |volume=Vol-289 |dblpUrl=https://dblp.org/rec/conf/kcap/PoolC07 }} ==Syntactic Disambiguation for the Semantic Web== https://ceur-ws.org/Vol-289/po02.pdf

Syntactic Disambiguation for the Semantic Web
Jonathan Pool
Turing Center, University of Washington
Seattle, Washington, USA
pool@cs.washington.edu

S. M. Colowick
Utilika Foundation
Seattle, Washington, USA
smc@utilika.org

ABSTRACT
Are people willing and able to disambiguate content for the Semantic Web? We asked subjects to use two methods (paraphrasal and truth-conditional selection) to disambiguate sentences from the Web. Native speakers did better with the paraphrasal method, and non-native speakers with the truth-conditional method. Unpaid volunteers performed better than paid subjects. Subjects’ average disambiguation time was about 20 seconds per sentence.

Categories and Subject Descriptors
H.1.2 User/Machine Systems – Human factors, human information processing
H.5.2 User Interfaces – Natural language
I.2.4 Knowledge Representation Formalisms and Methods – Semantic networks
I.2.6 Learning – Knowledge acquisition
I.7.2 Documentation Preparation – Markup languages
J.5 Arts and Humanities – Linguistics

General Terms
Economics, Experimentation, Human Factors, Languages

Keywords
Ambiguity, Annotation, Disambiguation, Distributed Human Computation, Metadata, Semantic Web

INTRODUCTION
Ambiguity and vagueness pervade the unstructured Web. The Semantic Web initiative proposes to rely on humans to create unambiguous content, metadata, and queries, but people have limited ability to recognize and prevent ambiguity in what they express [2, 6]. While machine understanding of unannotated text may become feasible [3], researchers are working to develop practical interfaces for human disambiguation of Web content [4]. To investigate methods of resolving one of the more difficult kinds of ambiguity, we conducted an experiment in which subjects disambiguated English sentences that contained syntactically ambiguous quantification [5].

METHOD
We selected 25 sentences from the Web (a small sample designed to encourage completion in an online, unmonitored testing environment). For each sentence, we identified two possible meanings and wrote a pair of paraphrases and an equivalent pair of truth conditions (situation descriptions) for them. For example, “Drinking almost always followed a dinner-party” had these restatements:
Paraphrases: (1) “Almost all drinking followed dinner-parties.” (2) “Drinking followed almost all dinner-parties.”
Truth conditions: (1) “In the activity diaries, 900 episodes of drinking were reported, and 875 of them followed dinner-parties.” (2) “In the activity diaries, 900 dinner-parties were reported, and drinking followed 875 of them.”
We asked some subjects (for method comparison) to choose between the paraphrases or between the truth conditions, and others (for consistency measurement) to choose both a paraphrase and a truth condition for each sentence. These two-task subjects might see the equivalent restatements in the same or in the opposite order.
We recruited 386 subjects: 208 through a Web contracting service [1], paid $0.75 each; and 178 through Internet discussion groups on language and writing, unpaid.
The ability to read and write English was the only participation requirement; 88% of the subjects had English as a native language. Subjects had opportunities to give us comments after each trial, after each block of 5 trials, and at the end of the experiment.

RESULTS

Satisfaction
Satisfaction was measured both by questionnaire responses, which indicated moderate satisfaction for all subjects (on three dimensions: ease, interest, and usefulness), and by completion rate. There were slight differences in satisfaction favoring paraphrasal over truth-conditional disambiguation and one-task over two-task conditions. For example, 90% of one-task subjects, compared with only 83% of two-task subjects, completed the experiment (p < 0.04).

Consistency, Speed, and Agreement
The choices made by a two-task subject in a trial were consistent if the chosen truth condition was equivalent to the chosen paraphrase. Choices were consistent in 82% of the trials, regardless of whether the paraphrasal or the truth-conditional task appeared first. But opposite-order trials (with the first paraphrase equivalent to the second truth condition and vice versa) showed less consistency (76%) than same-order trials (86%). Of 159 subjects whose consistency rates differed between same- and opposite-order trials, 69% (109) were less consistent on opposite-order trials (two-tailed p < 0.00001).
The median time to perform a disambiguation was 20 seconds on one-task trials and 31 seconds on two-task trials. Truth-conditional selection typically took 23 percent longer than paraphrasal selection, perhaps because of the greater length and complexity of the truth conditions. Overall, the speed of disambiguation increased with experience.
The fastest subject to achieve 100% consistency finished in a total of 709 seconds. Others achieved 90% consistency in about 500 seconds, or 20 seconds per trial (see Figure 1).

Figure 1. Consistency by Duration

Insofar as the majority correctly guesses intended meanings, the size of the majority is a measure of the subjects’ collective success. We define a method-majority choice as the choice made by the majority of subjects (in all treatment groups) who disambiguated the same sentence with the same method in any trial. Of 13,859 choices made by all subjects, 77% were method-majority choices. This proportion was larger for paraphrasal selection (79%) than for truth-conditional selection (75%). Paraphrasing was the better method (it had higher method-majority rates) for 223 subjects, while truth-conditional selection was better for only 116 subjects (p < 0.00000001).

Subsample Analysis
By most measures, the unpaid volunteers performed better than the paid subjects. Of 79 two-task volunteers, 42 were more consistent than the overall median, vs. 37 of 95 paid subjects (2-tailed p = 0.0608). Of 178 volunteers, 87 made more than 1 comment, vs. 45 out of 208 paid subjects (2-tailed p < 0.0002). However, volunteers took longer: 84 of 178 volunteers took more than the overall median time to finish, vs. 52 of 208 paid subjects (2-tailed p < 0.0002).
Native and non-native speakers of English differed most strikingly in the disambiguation method that worked better for them. Most native speakers (202 of 340) agreed more often with the majority when using the paraphrasal method, but most (25 of 45) non-native speakers did so when using the truth-conditional method (2-tailed p = 0.0561). The truth conditions’ emphasis on numerical rather than verbal reasoning may explain some of this difference.

DISCUSSION
One-task subjects resolved ambiguities in 15-25 seconds, with approximately 80% inter-method consistency and 80% majority agreement. Volunteers performed even better than paid subjects, reaching 99% agreement on the most consensual sentence. Many subjects, particularly in the volunteer subsample, described the disambiguation tasks as both challenging and enjoyable.
Our subjects guessed others’ intended meanings, with no context but with the opportunity to choose between carefully crafted restatements. In future experiments, we intend to study disambiguation by authors, rather than readers, with more scalable methods of interactive disambiguation. We surmise that authors will be motivated to limit their ambiguity, just as our volunteers demonstrated their enthusiasm for disambiguation. Thus, we anticipate that the barriers to author disambiguation will be more technical than motivational. Our focus will be on developing methods that help motivated authors to recognize and reduce ambiguity.

REFERENCES
[1] Amazon.com, “Amazon Mechanical Turk” (Web site), 2007; http://www.mturk.com/mturk/welcome.
[2] Arnold, J. E., Wasow, T., Asudeh, A., and Alrenga, P., “Avoiding Attachment Ambiguities: The Role of Constituent Ordering”, Journal of Memory and Language, 51, 2004, 55-70; http://www-csli.stanford.edu/~wasow/AWAA_final.pdf.
[3] Etzioni, O., Banko, M., and Cafarella, M. J., “Machine Reading”, 2007 AAAI Spring Symposium on Machine Reading, 2007; http://turing.cs.washington.edu/papers/SS06EtzioniO.pdf.
[4] Kaufmann, E., and Bernstein, A., “How Useful are Natural Language Interfaces to the Semantic Web for Casual End-Users?”, 6th International Semantic Web Conference (ISWC 2007), 2007; http://www.ifi.uzh.ch/ddis/staff/goehring/btw/files/Kaufmann_Bernstein_ISWC2007.pdf.
[5] Pool, J., and Colowick, S. M. (in press), “Disambiguating for the Web: A Test of Two Methods”, Proc. 4th Intl. Conf. on Knowledge Capture (ACM Press, 2007); http://turing.cs.washington.edu/papers/disambweb.pdf.
[6] Wasow, T., Perfors, A., and Beaver, D., “The Puzzle of Ambiguity”, in Morphology and the Web of Grammar: Essays in Memory of Steven G. Lapointe, ed. O. Orgun and P. Sells (Stanford: CSLI Publications, 2005); http://montague.stanford.edu/~dib/Publications/lapointe_paper_9-4.pdf.