Why-type Question Classification in Question Answering System

Manvi Breja
National Institute of Technology, Kurukshetra
Kurukshetra, Haryana
manvi.breja@gmail.com

Sanjay Kumar Jain
National Institute of Technology, Kurukshetra
Kurukshetra, Haryana
skj_nith@yahoo.com

ABSTRACT
The fundamental requisite of acquiring information on any topic has become increasingly important. The need for Question Answering Systems (QAS), which are nowadays replacing traditional search engines, stems from the user's requirement for the most accurate answer to a question or query. Interpreting the information need of the user is therefore crucial for designing and developing a question answering system. Question classification is an important component of question answering systems that helps to determine the type of a question and the corresponding type of its answer. In this paper, we present a new way of classifying Why-type questions, aimed at understanding the questioner's intent. Our taxonomy classifies Why-type questions into four separate categories. In addition, to allow a parser to detect the categories of these questions automatically, we differentiate them at the lexical level.

CCS CONCEPTS
• Information systems → Question answering;

KEYWORDS
Question answering system, why-questions, question classification, answer types

1 INTRODUCTION
The rapid advancement of the Web has allowed researchers to store information on a wide variety of topics. Search engines [5] return a relevant list of web pages according to the user's need, but finding the most appropriate and precise answer to a given question has motivated the development of Question Answering Systems (QAS). These days, QA is an actively researched topic in the fields of NLP and IR. A question answering system [8] is an information retrieval system that automatically generates an accurate answer to a natural language question. Questions elicit information in the form of answers, and the answer to a question depends on the question's type. In English, there are several types of questions, starting with the words what, when, who, where, why, how, etc. Questions beginning with what, when, who, and where are factoid-type questions [13] and can be answered in a single phrase or sentence, whereas questions starting with why and how are non-factoid questions. Such questions are complex and involve variation in their answers: why-type questions require reasoning and explanations in their answers, and how-type questions involve procedures or manners, which vary among individuals. Their answers range from a sentence to a paragraph or even a whole document. Though past studies have addressed the classification of questions starting with what, when, where, etc., few of them have addressed the classification of Why-type questions [12, 22, 31, 34, 35, 37, 38] and How-type questions [3, 23]. Extracting one unique answer to a Why-type question is an open research challenge in the Question Answering community. We therefore aim to work on Why-type questions, so as to contribute to the development of QAS dealing with all types of questions.

Question classification is a crucial component of modern QAS. It classifies questions into several semantic categories, which further determine the expected semantic types of their answers. The semantic category helps to filter out irrelevant answer candidates and to determine the one accurate answer.

As an attempt to understand the questioner's intent in why-questions asked on QAS, we propose a classification of why-type questions which plays an important role in the development of QAS. We begin with an analysis of 1000 why-questions, randomly sampled from QA sites and from datasets available on the Web. Based on this analysis, we propose a classification with four categories: (1) Informational Why-questions, (2) Historical Why-questions, (3) Contextual/Situational Why-questions, and (4) Opinionated Why-questions. To enable the automatic detection of these four types of questions by a parser [2], we discuss the features that differentiate them and allow them to be recognized.

Our proposed taxonomy can serve as a crucial step in the development of Why-type QAS: first, by automatically differentiating questions, it can help decide which knowledge source should be consulted to find an answer; second, it can help determine the expected answer type of a question.

The rest of this paper is organized as follows. Section 2 gives a brief overview of QA systems. Section 3 discusses the motivation for carrying out research in why-QA. Section 4 discusses related work on question classification. Section 5 describes the research issues faced in why-QA. Section 6 introduces the research objectives. Section 7 describes the research methodology. Section 8 describes the data collection procedure. Section 9 discusses the proposed classification of why-questions and the analysis of their distinguishing features. Finally, Section 10 concludes our work with future plans.
2 QUESTION ANSWERING SYSTEM
Question answering systems answer questions asked in natural language. They use information retrieval and natural language processing techniques to find an appropriate answer. The architecture of a QAS includes four modules, namely question processing, document retrieval, answer extraction, and answer re-ranking, as illustrated in Figure 1.

Figure 1: Architecture of Question Answering System

The question processing module performs two activities: (1) question classification and (2) question reformulation. Question classification is an important module of a QAS, as it affects the subsequent answer extraction module and hence determines the accuracy and performance of the QAS. Question classification assigns a label to a question, categorizing it into one of the predefined classes; this further helps in predicting the answer type for the given question [33]. The question reformulation module reformulates a question (Q) into a new question (Q') by adding appropriate terms and deleting punctuation marks, thus highlighting the information needs of the user. After question processing, the document retrieval module of a QAS returns a ranked list of relevant documents in response to the reformulated question. A document is considered relevant if its contents are relevant to the answer and fulfill the needs of the user. The retrieval of appropriate documents is important in a QAS, as the system searches those documents for correct answers. The answer extraction module extracts from the documents a candidate set of answers that match the answer types given by the question classification module. The answer re-ranking module ranks the obtained answer candidates using various techniques and returns the highest-scored answer to the user.
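To make the interaction of the four modules concrete, the following is a minimal sketch of this architecture as a plain Python pipeline. All function names and the deliberately naive internals are our own illustrative assumptions; the paper does not prescribe any particular implementation.

```python
# Minimal skeleton of the four-module QAS architecture described above.
# Module names follow the text; all internals are naive stand-ins.

from dataclasses import dataclass
from typing import List

@dataclass
class ProcessedQuestion:
    text: str          # reformulated question Q'
    category: str      # label assigned by question classification
    answer_type: str   # expected answer type predicted from the label

def process_question(question: str) -> ProcessedQuestion:
    """Question processing: classify, then reformulate (drop punctuation)."""
    category = "REASON" if question.lower().startswith("why") else "FACTOID"
    answer_type = "reason" if category == "REASON" else "entity"
    return ProcessedQuestion(question.rstrip("?").strip(), category, answer_type)

def retrieve_documents(pq: ProcessedQuestion, corpus: List[str]) -> List[str]:
    """Document retrieval: rank documents by naive term overlap with Q'."""
    terms = set(pq.text.lower().split())
    return sorted(corpus, key=lambda d: -len(terms & set(d.lower().split())))

def extract_answers(pq: ProcessedQuestion, docs: List[str]) -> List[str]:
    """Answer extraction: keep sentences compatible with the answer type."""
    sentences = [s for d in docs[:3] for s in d.split(". ")]
    if pq.answer_type == "reason":
        return [s for s in sentences if "because" in s.lower()]
    return sentences

def rerank_answers(candidates: List[str]) -> List[str]:
    """Answer re-ranking: prefer concise candidates (a stand-in for a learned ranker)."""
    return sorted(candidates, key=len)

def answer(question: str, corpus: List[str]) -> str:
    pq = process_question(question)
    ranked = rerank_answers(extract_answers(pq, retrieve_documents(pq, corpus)))
    return ranked[0] if ranked else "No answer found."

corpus = ["The sky looks blue because air scatters short wavelengths of sunlight."]
print(answer("Why is the sky blue?", corpus))
```

In a real system each stage would be a learned component; the value of the skeleton is only to show where question classification sits and why its label propagates forward to answer extraction.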
3 MOTIVATION
Many researchers have worked on the different modules of question answering systems. According to Moldovan [9, 27], the accuracy of a QAS depends on its question classification module: if questions are properly classified, the accurate answer can be extracted. Questions beginning with why and how are very complex, and it is very difficult to extract one accurate answer to such questions, whereas questions beginning with what, where, who, which, etc. are simple and can be answered by named entity tagging. Very few question answering systems deal with why-type questions, because their answers are complex and differ from one user to another depending on the context in which the question is asked. Extracting one answer to a why-question is therefore an open research area in the field of IR. Much research has been carried out on the classification of What-type questions [10, 11, 18, 24, 41], questions posted on social networking sites [7, 14, 19–22], questions asked in Community QAS [4, 17, 40], etc., but less work has been done on classifying why-type questions.

4 RELATED WORK
In the literature, many researchers have addressed the issue of classifying questions asked in different domains. Zhang et al. [41] followed the taxonomy for TREC-style questions, which contains 6 coarse-grained categories (ABBR, DESC, ENTY, HUM, LOC, NUM) and 50 fine-grained categories. They considered only the syntactic structure of the question, so the performance of their system could be improved by incorporating semantic knowledge. Aunimo [1] developed a typology for general-domain question answering systems in which questions are evaluated on a set of 7 features, consisting of lemmatized words, part-of-speech (POS) tags, punctuation marks, semantic tags, and target tags. Metzler and Croft [24] used question words and types and found correlations between them to train word-specific question classifiers: they first identified question words and then trained a separate classifier for each question word. Nguyen et al. [15] proposed a subtree mining method for question classification. Fangtao Li et al. [18] classified what-type questions using the head noun's tag; their system cannot produce correct results when the head noun is not present in the question. Zhiheng Huang et al. [11] presented five binary feature sets for question classification, namely the question wh-word, the head word, WordNet semantic features (hypernyms) for the head word, word grams, and word shape. Ambiguity arises in classifying questions: inconsistent labeling in training and test data produces incorrect parse trees, which results in wrong head word extraction. Eduard Hovy et al. [10] created a QA typology consisting of 5 types of Qtargets: Abstract, Semantic, Syntactic, Role, and Slot. Mizuno et al. [26] introduced Universal Question Answering, in which answer types are detected according to the criteria that (1) the correct answer shares the same topic as its question and (2) it has the same answer type as that expected by its question. Harper et al. [7] automatically classified questions into conversational and informational. Kim et al. [14] classified questions from Yahoo! Answers into four categories: informational, suggestion, opinion, and other. Zhao and Mei [42] classified question tweets into two categories: tweets conveying information needs and tweets not conveying information needs. Morris et al. [28] manually labeled a set of questions posted on social networking platforms and identified eight question types, including recommendation, opinion, factual knowledge, rhetorical, invitation, favor, social connection, and offer. Zhe Liu and Bernard J. Jansen [19, 20] proposed a taxonomy of questions posted on social networking sites, called ASK.
In accuracy questions, people ask for facts or common sense; in social questions, people ask for coordination or companionship; and in knowledge questions, people seek personal opinions or advice. The performance of their system could be improved by employing a semi-supervised learning algorithm such as co-EM support vector learning. The authors continued their research in 2016 [21] and modeled intent detection as a binary classification problem, classifying questions into subjective and objective with a classifier built on lexical, syntactic, and contextual features. Long Chen et al. [4] classified the questions asked on Community Question Answering systems into 3 categories according to user intent: subjective, objective, and social. Li et al. [17] investigated how to automatically determine the subjectivity orientation of questions posted in community QA portals, which helps in evaluating the correct answer. They explored supervised machine learning algorithms with features like character 3-grams, words, words plus character 3-grams, word n-grams, and word POS n-grams to predict question subjectivity.

With regard to the classification of why-type questions, Moldovan et al. [27] considered the answers of all why-questions to be of only one type, i.e., the reason type. Ferret et al. [6] proposed a syntactic categorization of factoid questions to determine the expected answer type; in their view, the answers to why- and how-type questions are difficult to reduce to a syntactic pattern. Suzan Verberne [34, 35, 37–39] used Ferret's approach for syntactically categorizing why-questions and determining their expected answer types. The author formed a set of hand-written rules based on the words and classes of verbs used in why-questions; a parser [32] generates a parse tree, and the rules are used to choose the syntactic category of a why-question. The author defines six syntactic categories of why-questions: (1) action questions, e.g., Why did Ratan Tata write a letter to Narendra Modi?; (2) process questions, e.g., Why has Dixville grown famous since 1964?; (3) intensive complementation questions, e.g., Why is Microsoft Windows a success?; (4) monotransitive have questions, e.g., Why do cats have slits in their ears?; (5) existential there questions, e.g., Why is there a need of resource planning?; and (6) declarative layer questions, e.g., Why did they say that migration occurs?. The author subdivides the answer types of why-questions into cause, motivation, circumstance, and purpose, on the basis of the classification of adverbial clauses given by Quirk [16]. The system could not categorize the following groups of questions: (1) action questions in which the subject was incorrectly not marked as agentive, (2) questions with an action verb as the main verb but a non-agentive subject, (3) passive questions, and (4) monotransitive have questions, for which there is no general rule.

5 RESEARCH ISSUES FACED IN WHY-QA
A few research issues faced in why-QAS are described as follows:
(1) Problems in appropriate question classification: Correctly classifying why-questions and determining their expected answer type is one of the open research problems [27, 36]. Almost all why-questions have the 'Reason' answer type; Suzan Verberne, in 2007, subdivided the 'Reason' answer type into purpose, motivation, circumstance, and cause.
(2) Problems in determining one unique answer: Why-questions require reasons, elaborations, explanations, etc. in their answers. Answers to why-questions are generally subjective: different people answer the same question differently, depending on the questioner and the context in which the question has been asked [25, 39]. Thus, retrieving one accurate answer is a challenging task.
(3) Problems in paraphrasing why-type questions: Paraphrasing is the process of restating the given statement or question in other words without changing its actual meaning. Hence, determining the semantic class of a question is necessary to answer why-type questions [35].
(4) Question focus and semantics of why-QA: A why-QAS should be able to handle questions such as "Why do our ears ring?", where the correct answer passage does not contain the words ears and ring; the question instead concerns a phenomenon called tinnitus, and the answer passage returns the reason for tinnitus [39].
(5) Problems related to answer extraction in why-QAS: Many conventional QAS are based on the bag-of-words model, which faces problems in retrieving appropriate answers due to semantic relations between words such as polysemy, homonymy, and synonymy [25]; a small illustration follows this list. Thus, discourse relationships between sentences and a bag-of-concepts model are needed to retrieve an appropriate answer to why-questions.
(6) Problems related to answer re-ranking in why-QAS: Candidate answers are re-ranked by classifiers, which are usually trained on features according to which they return a score for each answer. Different features like causal relations, semantic word classes, sentiment polarities, morpho-syntactic information, bag-of-words, etc. have already been utilized [29, 30]. Thus, deciding the importance of the features on which classifiers are trained is itself another challenging task.
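To illustrate issue (5) with the ears/tinnitus example from issue (4): under a bag-of-words representation, the question and its correct answer passage can share no content words at all, so any overlap-based ranker scores the passage as irrelevant. The toy tokenizer, stopword list, and example passage below are our own assumptions.

```python
# Toy demonstration of the bag-of-words vocabulary mismatch (issue 5):
# the correct answer passage shares no content words with the question.

STOPWORDS = {"why", "do", "our", "is", "a", "an", "of", "the", "in", "by"}

def bag_of_words(text: str) -> set:
    """Lowercase, strip basic punctuation, and drop stopwords."""
    tokens = text.lower().replace("?", " ").replace(".", " ").split()
    return {t for t in tokens if t not in STOPWORDS}

question = "Why do our ears ring?"
passage = "Tinnitus is a perception of sound caused by damage in the inner ear."

print(bag_of_words(question) & bag_of_words(passage))
# -> set(): zero overlap, so an overlap-based ranker rejects the correct
#    passage; semantic or discourse knowledge must bridge 'ears ring'
#    and 'tinnitus', as argued above.
```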
6 RESEARCH OBJECTIVES
To address the gaps identified in the related work, we aim to pursue the following research objectives:
(1) Propose a taxonomy of why-questions with a view to identifying the questioner's need, extracting a correct answer, and thus maximizing the response probability.
(2) Understand the different features of why-type questions at the lexical level.

7 RESEARCH METHODOLOGY
We follow a qualitative research methodology, which collects, analyzes, and interprets data by observing what people do and say. Qualitative research is subjective in nature and uses quite different methods of collecting information, mainly individual in-depth interviews and focus groups; research of this type is exploratory and open-ended. Accordingly, we collect a dataset of why-questions and their answers and analyze it to propose a taxonomy for why-type questions.

8 DATA COLLECTION
To fulfill the above-mentioned research objectives, we collected why-type questions from various question answering sites such as Yahoo! Answers (https://in.answers.yahoo.com/), Quora (https://www.quora.com/), and Twitter (https://twitter.com/search). We also consulted a dataset of why-questions and their answers used by Suzan Verberne in her research, available at http://liacs.leidenuniv.nl/~verbernes/. This process resulted in our dataset, consisting of 1000 why-questions.
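As one way to make this step reproducible, the sketch below reduces a raw pool of scraped questions to a deduplicated sample of why-questions. The input file name and the cleanup rules are our own assumptions, not the exact procedure used for our dataset.

```python
# Minimal sketch: reduce a raw pool of scraped questions (one per line)
# to a deduplicated sample of why-questions. The file name and cleanup
# rules are illustrative assumptions.

import random
import re

def load_why_questions(path: str) -> list:
    seen, questions = set(), []
    with open(path, encoding="utf-8") as f:
        for line in f:
            q = re.sub(r"\s+", " ", line).strip()   # collapse whitespace
            if not q.lower().startswith("why"):     # keep why-questions only
                continue
            key = q.lower().rstrip("?")             # case/punctuation-insensitive key
            if key not in seen:                     # drop duplicates
                seen.add(key)
                questions.append(q)
    return questions

if __name__ == "__main__":
    pool = load_why_questions("raw_questions.txt")  # hypothetical input file
    random.seed(7)                                  # fixed seed for a stable sample
    print(len(random.sample(pool, min(1000, len(pool)))))
```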
9 PROPOSED CLASSIFICATION OF WHY-QUESTIONS
In this paper, we address the research issue of appropriate classification, which helps to categorize why-questions. With a view to identifying the main focus of a question and determining the context in which it should be answered, why-questions are categorized into four categories, as illustrated in Figure 2: (1) informational (factual) why-questions, which ask for reasoning about some fact (either scientific or non-scientific); (2) historical why-questions, which ask for reasoning about some event or action that happened in the past; (3) situational why-questions, which ask for the reason for an event that occurred in a particular context of time; and (4) opinionated why-questions, which ask for personal opinions about some other person or product.

Figure 2: Categorization of why-questions

9.1 Informational Why-Questions
The intent of an informational why-question is to receive answers describing the reason for some fact asked about in the question. These questions look for factual or prescriptive knowledge. The data sources used to answer such questions are the WWW, domain knowledge, expert knowledge, books, etc., because their answers are fixed and easily available from the Web. There is only one possible answer to such a question, and no ambiguous or conflicting answers are possible. Etymology questions starting with Why also belong to this category. For example: a. Why are rabbits' eyes red? b. Why is Indiglo called Indiglo? c. Why do scuba divers go into the water backwards? These questions contain one or more facts, which might involve comparative reasoning in their answers.

9.2 Historical Why-Questions
The intent of a historical why-question is to receive the reasoning behind an event or action that occurred in the past. These questions generally relate to domains like wars, inventions, law, rights, etc. They generally have one correct answer, and justification and evidence are required in answering them. Examples of historical why-questions are: a. Why were people recruited for the Vietnam War? b. Why did the Globe Theatre burn down? c. Why were medieval castles built?

9.3 Situational Why-Questions
The intent of a situational why-question is to receive the reasoning for an action that occurred in a particular context of time or in different situations. These questions generally involve the condition or circumstance under which a particular event happened, and they relate to domains like day-to-day circumstances, personal life, travelling, education, science, etc. There can be one, multiple, or ambiguous answers to such questions, depending on the context of the user and the context in which the question is asked. Thus, the main focus of these questions is on the condition or context of time at which the event happened. Examples of such questions are: a. Why do the clouds darken when it rains? b. Why do you say "God bless you" when people sneeze? c. Why does the moon turn orange?

9.4 Opinionated Why-Questions
The intent of an opinionated why-question is to receive reasoning about some person or product. These questions seek responses reflecting the answerer's personal opinions, advice, preferences, desires, or experiences, and they encourage respondents to justify their personal answers. Because of this, there can be multiple possible answers to one question, which can be ambiguous or controversial in some cases. These questions usually ask for reviews of some product, or about personal life, travelling, education, etc. Examples of opinionated why-questions are: a. Why was my payment in a message cancelled? b. Why are some people 'double-jointed'? c. Why do we laugh?
Continuing our research work, we will analyze lexical features in detail to distinguish the above categories of why-questions. Since the different terms in a question depict different information needs, we will use part-of-speech tagging to identify the different categories; for POS tagging, we will make use of the Stanford Tagger [38]. For example, opinionated why-questions contain personal pronouns other than 'it', common nouns pointing to a person (boy, girl, man, woman, lady, etc.), and concrete nouns referring to a person, followed by an action verb. Historical why-questions use auxiliary verbs and main action verbs in the past tense, like did, was, were, had, could, would, should, etc. Informational why-questions use 'there', which is tagged as EX (representing existential there) by the Stanford Tagger; etymology questions, which use terms like 'called', 'named', 'represented as', 'referred', and 'considered to be', also belong to the informational why-questions. Situational why-questions use 'when', 'if', 'while', 'though', 'after', 'before', 'during', etc. as conjunctions.

Some why-questions might have features belonging to more than one category. To remove this ambiguity, we will identify rules that help to assign exactly one category to a why-question. This classification of questions will further help to identify the intent and the main focus of a question.
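The lexical cues above translate directly into a first-cut rule-based categorizer, sketched below with NLTK's off-the-shelf tagger standing in for the Stanford Tagger. The keyword lists and the rule precedence (one simple way to resolve the multi-category ambiguity just mentioned) are our own simplified assumptions, not the final rules we intend to derive.

```python
# First-cut rule-based categorizer built from the lexical cues above.
# NLTK's averaged-perceptron tagger stands in for the Stanford Tagger;
# keyword lists and rule precedence are simplified assumptions.

import nltk

nltk.download("punkt", quiet=True)                       # resource names for
nltk.download("averaged_perceptron_tagger", quiet=True)  # classic NLTK releases

PAST_CUES = {"did", "was", "were", "had", "could", "would", "should"}
ETYMOLOGY_CUES = {"called", "named", "represented", "referred", "considered"}
SITUATION_CONJ = {"when", "if", "while", "though", "after", "before", "during"}

def categorize(question: str) -> str:
    tokens = nltk.word_tokenize(question.lower())
    tagged = nltk.pos_tag(tokens)
    # Opinionated: a personal pronoun other than 'it' (PRP/PRP$ tags).
    if any(tag in ("PRP", "PRP$") and tok != "it" for tok, tag in tagged):
        return "opinionated"
    # Informational: existential 'there' (EX tag) or an etymology cue term.
    if any(tag == "EX" for _, tag in tagged) or ETYMOLOGY_CUES & set(tokens):
        return "informational"
    # Historical: past-tense auxiliaries or a past-tense main verb (VBD).
    if any(tok in PAST_CUES or tag == "VBD" for tok, tag in tagged):
        return "historical"
    # Situational: temporal/conditional conjunctions.
    if SITUATION_CONJ & set(tokens):
        return "situational"
    return "unclassified"

for q in ["Why is Indiglo called Indiglo?",
          "Why did the Globe Theatre burn down?",
          "Why do the clouds darken when it rains?",
          "Why do we laugh?"]:
    print(categorize(q), "-", q)
```

Precedence matters here: "Why do the clouds darken when it rains?" contains the pronoun 'it', which the opinionated rule deliberately ignores so that the situational conjunction 'when' can fire.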
10 CONCLUSION AND FUTURE WORK
This paper has presented a new classification of why-questions for question answering systems. We have classified why-questions into four categories and are continuing to identify the distinguishing features of these why-questions. We will implement a parser that categorizes why-questions according to their features. We will also analyze the answers to why-questions and determine their expected answer types.

REFERENCES
[1] Lili Aunimo. 2005. A Question Typology and Feature Set for QA. Knowledge and Reasoning for Answering Questions (2005), 53.
[2] Josef Bayer. 1994. Sentence Processing and the Nature of the Human Syntactic Parser: Introduction. (1994).
[3] Payal Biswas, Aditi Sharan, and Rakesh Kumar. 2014. Question classification using syntactic and rule based approach. In Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on. IEEE, 1033–1038.
[4] Long Chen, Dell Zhang, and Mark Levene. 2012. Understanding user intent in community question answering. In Proceedings of the 21st International Conference on World Wide Web. ACM, 823–828.
[5] W. Bruce Croft, Donald Metzler, and Trevor Strohman. 2010. Search Engines: Information Retrieval in Practice. Vol. 283. Addison-Wesley, Reading.
[6] Olivier Ferret, Brigitte Grau, Martine Hurault-Plantet, Gabriel Illouz, Laura Monceaux, Isabelle Robba, and Anne Vilnat. 2001. Finding an Answer Based on the Recognition of the Question Focus. In TREC.
[7] F. Maxwell Harper, Daniel Moy, and Joseph A. Konstan. 2009. Facts or friends? Distinguishing informational and conversational questions in social Q&A sites. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 759–768.
[8] Lynette Hirschman and Robert Gaizauskas. 2001. Natural language question answering: the view from here. Natural Language Engineering 7, 4 (2001), 275–300.
[9] Konrad Höffner, Sebastian Walter, Edgard Marx, Ricardo Usbeck, Jens Lehmann, and Axel-Cyrille Ngonga Ngomo. 2017. Survey on challenges of question answering in the semantic web. Semantic Web 8, 6 (2017), 895–920.
[10] Eduard Hovy, Ulf Hermjakob, and Deepak Ravichandran. 2002. A question/answer typology with surface text patterns. In Proceedings of the Second International Conference on Human Language Technology Research. Morgan Kaufmann Publishers Inc., 247–251.
[11] Zhiheng Huang, Marcus Thint, and Zengchang Qin. 2008. Question classification using head words and their hypernyms. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 927–936.
[12] R. Jayashree and N. Niveditha. 2017. Natural Language Processing Based Question Answering Using Vector Space Model. In Proceedings of the Sixth International Conference on Soft Computing for Problem Solving. Springer, 368–375.
[13] Daniel Jurafsky and James H. Martin. 2015. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. (2015).
[14] Soojung Kim, Jung Sun Oh, and Sanghee Oh. 2007. Best-answer selection criteria in a social Q&A site from the user-oriented relevance perspective. Proceedings of the Association for Information Science and Technology 44, 1 (2007), 1–15.
[15] Minh Le Nguyen, Nguyen Thanh Tri, and Akira Shimazu. 2007. Subtree Mining for Question Classification Problem. In IJCAI. 1695–1700.
[16] Randolph Quirk, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1985. A Comprehensive Grammar of the English Language. (1985).
[17] Baoli Li, Yandong Liu, Ashwin Ram, Ernest V. Garcia, and Eugene Agichtein. 2008. Exploring question subjectivity prediction in community QA. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 735–736.
[18] Fangtao Li, Xian Zhang, Jinhui Yuan, and Xiaoyan Zhu. 2008. Classifying what-type questions by head noun tagging. In Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1. Association for Computational Linguistics, 481–488.
[19] Zhe Liu and Bernard J. Jansen. 2015. Subjective versus objective questions: Perception of question subjectivity in social Q&A. In International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction. Springer, 131–140.
[20] Zhe Liu and Bernard J. Jansen. 2015. A Taxonomy for Classifying Questions Asked in Social Question and Answering. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 1947–1952.
[21] Zhe Liu and Bernard J. Jansen. 2017. ASK: A taxonomy of accuracy, social, and knowledge information seeking posts in social question and answering. Journal of the Association for Information Science and Technology 68, 2 (2017), 333–347.
[22] Zhe Liu and Bernard J. Jansen. 2017. Identifying and predicting the desire to help in social question and answering. Information Processing & Management 53, 2 (2017), 490–504.
[23] Nobuhito Marumo, Takashi Beppu, and Takahira Yamaguchi. 2014. A knowledge-transfer system integrating workflow, a rule base, domain ontologies and a goal tree. In International Conference on Knowledge Science, Engineering and Management. Springer, 357–367.
[24] Donald Metzler and W. Bruce Croft. 2005. Analysis of statistical question classification for fact-based questions. Information Retrieval 8, 3 (2005), 481–504.
[25] Amit Mishra and Sanjay Kumar Jain. 2016. A survey on question answering systems with classification. Journal of King Saud University - Computer and Information Sciences 28, 3 (2016), 345–361.
[26] Junta Mizuno, Tomoyosi Akiba, Atsushi Fujii, and Katunobu Itou. 2007. Non-factoid Question Answering Experiments at NTCIR-6: Towards Answer Type Detection for Real-world Questions. In NTCIR.
[27] Dan Moldovan, Sanda Harabagiu, Marius Pasca, Rada Mihalcea, Roxana Girju, Richard Goodrum, and Vasile Rus. 2000. The structure and performance of an open-domain question answering system. In Proceedings of the 38th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 563–570.
[28] Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich. 2010. What do people ask their social networks, and why? A survey study of status message Q&A behavior. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1739–1748.
[29] Jong-Hoon Oh, Kentaro Torisawa, Chikara Hashimoto, Takuya Kawada, Stijn De Saeger, Jun'ichi Kazama, and Yiou Wang. 2012. Why question answering using sentiment analysis and word classes. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 368–378.
[30] Jong-Hoon Oh, Kentaro Torisawa, Chikara Hashimoto, Motoki Sano, Stijn De Saeger, and Kiyonori Ohtake. 2013. Why-Question Answering using Intra- and Inter-Sentential Causal Relations. In ACL (1). 1733–1743.
[31] Jong-Hoon Oh, Kentaro Torisawa, Canasai Kruengkrai, Ryu Iida, and Julien Kloetzer. 2017. Multi-column convolutional neural networks with causality-attention for why-question answering. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. ACM, 415–424.
[32] Nelleke Oostdijk. 1996. Using the TOSCA analysis system to analyse a software manual corpus. Industrial Parsing of Software Manuals 17 (1996), 179.
[33] Håkan Sundblad. 2007. Question Classification in Question Answering Systems. Ph.D. Dissertation. Institutionen för datavetenskap.
[34] Suzan Verberne. 2006. Developing an approach for why-question answering. In Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop. Association for Computational Linguistics, 39–46.
[35] Suzan Verberne. 2010. In Search of the Why: Developing a System for Answering Why-Questions. [S.l.: s.n.].
[36] Suzan Verberne, L.W.J. Boves, N.H.J. Oostdijk, and P.A.J.M. Coppen. 2006. Data for question answering: the case of why. (2006).
[37] Suzan Verberne, L.W.J. Boves, N.H.J. Oostdijk, and P.A.J.M. Coppen. 2007. Discourse-based answering of why-questions. (2007).
[38] Suzan Verberne, Lou Boves, Nelleke Oostdijk, and Peter-Arno Coppen. 2008. Using syntactic information for improving why-question answering. In Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1. Association for Computational Linguistics, 953–960.
[39] Suzan Verberne, Lou Boves, Nelleke Oostdijk, and Peter-Arno Coppen. 2010. What is not in the Bag of Words for Why-QA? Computational Linguistics 36, 2 (2010), 229–245.
[40] Yang Xiang, Qingcai Chen, Xiaolong Wang, and Yang Qin. 2017. Answer Selection in Community Question Answering via Attentive Neural Networks. IEEE Signal Processing Letters 24, 4 (2017), 505–509.
[41] Dell Zhang and Wee Sun Lee. 2003. Question classification using support vector machines. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 26–32.
[42] Zhe Zhao and Qiaozhu Mei. 2013. Questions about questions: An empirical analysis of information needs on Twitter. In Proceedings of the 22nd International Conference on World Wide Web. ACM, 1545–1556.