<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Dowsing for Math Answers with Tangent-L</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yin Ki NG</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dallas J. Fraser</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Besat Kassaie</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>George Labahn</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mirette S. Marzouk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank Wm. Tompa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kevin Wang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>David R. Cheriton School of Computer Science, University of Waterloo</institution>
          ,
          <addr-line>Waterloo, ON</addr-line>
          ,
          <country country="CA">Canada</country>
          ,
          <addr-line>N2L 3G1</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DigitalEd</institution>
          ,
          <addr-line>630 Weber Street North, Suite 100, Waterloo, ON</addr-line>
          ,
          <country country="CA">Canada</country>
          ,
          <addr-line>N2V 2N2</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>We present our application of the math-aware search engine Tangent-L to the ARQMath Community Question Answering (CQA) task. Our approach performs well, placing in the top three positions out of all 23 submissions, including the baseline runs. Tangent-L, built on the text search platform Lucene, handles math formulae by first converting a formula's Presentation MathML representation into a Symbol Layout Tree, followed by extraction of math tuples from the tree that serve as search terms. It applies BM25+ ranking to all math tuples and natural language terms in a document during searching. For the CQA task, we index all question-answer pairs in the Math Stack Exchange corpus. At query time, we first convert a topic question into a bag of formulae and keywords that serves as a formal query. We then execute the queries using Tangent-L to find the best matches. Finally, we re-rank the matches by a regression model that was trained on metadata attributes from the corpus. Our primary run produces an nDCG′ value of 0.278 and MAP′ value of 0.063, where these are two common measures of quality for ranked retrieval. However, our best performance, an nDCG′ value of 0.345 and MAP′ value of 0.139, is achieved by an alternate run without re-ranking. Follow-up experiments help to explain which aspects of our approach lead to our success.</p>
      </abstract>
      <kwd-group>
        <kwd>Community Question Answering (CQA)</kwd>
        <kwd>Mathematical Information Retrieval (MathIR)</kwd>
        <kwd>Symbol Layout Tree</kwd>
        <kwd>Lucene</kwd>
        <kwd>Mathematics Stack Exchange</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>Mathematical Information Retrieval (MathIR) focuses on using mathematical
formulae and terminology to search and retrieve documents that include
mathematical content [11, 22]. MathIR is important because content expressed in
formal mathematics and formulae is often crucial and non-negligible in STEM
papers [27]. In the recent decade, the MathIR research community has been
growing and developing ever-improved math-aware search systems (e.g. [7, 10, 13,
14, 23–28, 30, 32]). Most of these efforts have been encouraged through a series
of MathIR evaluation workshops at NTCIR-10 [2], NTCIR-11 [3], and
NTCIR-12 [29]. The workshops have provided corpora derived from arXiv and Wikipedia
for traditional ad-hoc retrieval tasks and formula search tasks, and the data and
tasks have since served as benchmarks for the research community.</p>
      <p>The ARQMath Lab (ARQMath) [16, 31] is the first Community Question
Answering (CQA) task with questions involving math data, using collections
from the Math Stack Exchange (MSE),3 a math question answering site. The
training corpus contains approximately 1.1 million mathematical questions and
1.4 million answers, covering MSE threads from the year 2010 to 2018. Like the
NTCIR workshops that preceded it, ARQMath is centered around an evaluation exercise
that aims to advance math-aware search and the semantic analysis of
mathematical notation and texts. The main task of ARQMath is the answer retrieval task,
in which participating systems need to find answers to a set of mathematical
questions among previously posted answers on MSE. A secondary task considers
matching relevant formulae drawn from the mathematical questions from the
same collection. Participating teams were asked to submit up to five runs for
either or both tasks, and selected results received relevance assessments from
human evaluators.</p>
      <p>Related tasks to the ARQMath Lab task have been held previously: a recent
math question answering task was held as part of SemEval-2019 [12],
following CQA challenge series held at SemEval-2015 [18], SemEval-2016 [19], and
SemEval-2017 [17]. The math question answering task at SemEval-2019
considered a math question set that was derived from Math SAT practice exams. This
task was different from the ARQMath CQA task, since the data does not involve
question-answer threads from a community forum, and the task targeted
identification of one uniquely correct answer by multiple-choice selection or by
numerical computation, instead of retrieving all relevant answers from a corpus. On
the other hand, the earlier CQA challenge series at SemEval involved
question-comment threads from the Qatar Living Forum, which is a data collection that is
similar to the MSE collection. This CQA challenge series, however, differs from
the ARQMath CQA task in that the questions are not necessarily mathematical,
and the task objective is answer-ranking instead of answer retrieval from a
corpus. Besides SemEval tasks, related tasks under the question-answering context
were also held previously at TREC, CLEF, and NTCIR ([1], [20]), but the data
involved was not drawn from the mathematical domain and the data did not
follow a community-forum structure in general.</p>
      <p>Our team of MathDowsers4 participated in the ARQMath CQA task, with an
approach based on the Tangent-L system, a math-aware search engine proposed
by Fraser et al. [9]. Tangent-L is a traditional math-aware query system
developed after NTCIR-12, using the data provided for all three NTCIR math search
workshops, and appeared to be competitive with the systems participating in
those workshops. We wished to determine whether it could, in fact, perform well
against other traditional math-aware query systems and whether a traditional
math-aware query system could compete with modern machine-learning
approaches that might be adopted by other workshop participants in a
question-answering task.
3 https://math.stackexchange.com
4 Waterloo researchers dowsing for math (Fig. 1. Dowsing for math.)</p>
      <p>Our experiments included five submitted runs that were designed to address
the following research objectives:
RQ1 What is an effective way to convert each mathematical question (expressed
in mathematical natural language) into a formal query consisting of keywords
and formulae?
RQ2 Should keywords or math formulae be assigned heavier weights in a query?
RQ3 What is the effect of a re-ranking algorithm that makes use of metadata?</p>
      <p>We present an overview of Tangent-L in Section 2. In Section 3, we describe
our approach to CQA and provide details on how we retrieve and rank answer
matches for a mathematical question from the MSE corpus with the use of
Tangent-L. The submitted runs and the results are discussed in Section 4. In
Section 5, we present conclusions and propose future work.
</p>
    </sec>
    <sec id="sec-2">
      <title>Overview of Tangent-L</title>
      <p>The Tangent-L system is the core component of our submission. It is a traditional
query system built on the popular Lucene [4] text search platform, adding
methods adapted from the Tangent-3 math search system [30] so that it is capable of
matching documents based on queries with keywords and formulae.</p>
      <p>Tangent-L handles formulae by converting them into a bag of “math terms”
to be matched in the same way that natural language terms are handled by
Lucene. More specifically, Tangent-L takes as input a formula in Presentation
MathML [5] format5 and converts it into a symbol layout tree (SLT) [30] where
nodes represent the math symbols and edges represent spatial relationships
between these symbols (Figure 2). Thereafter, this tree-like representation is
traversed to extract a set of features, or “math tuples,” of four types to capture
local characteristics of a math formula as depicted in Figure 3. In preparation
for search, the math tuples replace the formula itself in the document and are
then considered by Tangent-L as if each were a term in the text to be matched.</p>
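      <p>As an illustration, tuple extraction from an SLT can be sketched in Python. The nested-tuple SLT encoding and the edge labels below ("n" for the next symbol on the writing line, "a" for a superscript) are hypothetical stand-ins, and only a symbol-pair style of tuple is shown; Tangent-L extracts math tuples of four types (Figure 3).</p>

```python
def symbol_pairs(slt):
    """Emit (parent symbol, child symbol, edge label) triples from a toy SLT.

    The SLT is encoded as a nested tuple: (symbol, {edge_label: child_slt}).
    This is a simplified stand-in for Tangent-L's traversal; the real system
    derives the tree from Presentation MathML and extracts four tuple types.
    """
    sym, edges = slt
    out = []
    for label, child in edges.items():
        out.append((sym, child[0], label))   # record the local symbol pair
        out.extend(symbol_pairs(child))      # recurse into the subtree
    return out

# x^2 + 1: "x" has a superscript "2" and is followed on the line by "+", then "1"
slt = ("x", {"a": ("2", {}), "n": ("+", {"n": ("1", {})})})
pairs = symbol_pairs(slt)
```

      <p>For this toy tree, the extracted pairs are ("x", "2", "a"), ("x", "+", "n"), and ("+", "1", "n"); in the indexed document, such tuples replace the formula itself.</p>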
      <p>After formula conversion, Tangent-L applies BM25+ ranking [15] to all the
terms in a document. Specifically, given a collection of documents D containing
jDj documents and a query q consisting of a set of query terms, the score for a
document d 2 D is given by</p>
      <p>BM25+(q, d) = Σ_{w∈q} [ ((k + 1) · tf_{w,d}) / (k · (1.0 − b + b · |d| / Δ) + tf_{w,d}) + δ ] · log((|D| + 1) / |D_w|)   (1)</p>
      <p>5 A formula in LaTeX representation can be converted into MathML by using
LaTeXML (https://dlmf.nist.gov/LaTeXML/).</p>
      <p>where k, b, and δ are constants (following common practice, chosen to be 1.2,
0.75, and 1, respectively); tf_{w,d} is the number of occurrences of term w in
document d; |d| is the total number of terms in document d; Δ = (Σ_{d∈D} |d|) / |D|
is the average document length; and |D_w| is the number of documents in D
containing term w. This formula is easily applied to a bag of query terms: if a
term is repeated in the query, the corresponding score for that term is simply
accumulated multiple times. To allow math tuples to be given a weight that differs
from natural language terms, we assign weights to query terms as follows:</p>
      <p>BM25w+(q_t ∪ q_m, d) = BM25+(q_t, d) + α · BM25+(q_m, d)   (2)
where q_t is the set of keywords in a query, q_m is the set of math feature tuples in
that query, and α is a parameter to adjust the relative weight applied to math
tuples.</p>
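      <p>Equations 1 and 2 can be sketched as follows; the in-memory term lists and the collection-statistics dictionary are simplified stand-ins for what Lucene derives from its postings lists.</p>

```python
import math

def bm25_plus(query_terms, doc_terms, stats, k=1.2, b=0.75, delta=1.0):
    """Equation 1: BM25+ score of one document for a bag of query terms.

    stats holds hypothetical collection statistics: "N" documents,
    average document length "avgdl", and document frequencies "df".
    """
    score = 0.0
    dl = len(doc_terms)
    for w in query_terms:
        tf = doc_terms.count(w)
        if tf == 0:
            continue  # terms absent from the document contribute nothing
        idf = math.log((stats["N"] + 1) / stats["df"][w])
        norm = k * (1.0 - b + b * dl / stats["avgdl"]) + tf
        score += ((k + 1) * tf / norm + delta) * idf
    return score

def bm25_weighted(text_terms, math_terms, doc_terms, stats, alpha=0.5):
    """Equation 2: weight math tuples relative to keywords by alpha."""
    return (bm25_plus(text_terms, doc_terms, stats)
            + alpha * bm25_plus(math_terms, doc_terms, stats))
```

      <p>Repeated query terms simply accumulate their per-term score again, matching the bag-of-terms treatment described above.</p>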
      <p>In the NTCIR-12 arXiv Main task benchmark, where queries are composed of
formulae and keywords, the Tangent-L system gives a comparable performance
to other MathIR systems [9]. We are interested in determining whether and how
Tangent-L could be adapted to address the ARQMath CQA task.</p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>After indexing the corpus, we adopt a three-phase architecture to return answer
matches for a mathematical question:
1. Conversion: Transform the input (a mathematical question posed on MSE)
into a well-formulated query consisting of a “bag” of formulae and keywords.
2. Searching: Use Tangent-L to execute the formal query to find the best
matches against the indexed corpus (MSE question-answer pairs).
3. Re-ranking: Re-order the best matches by considering additional metadata
(such as votes and tags) associated with question-answer pairs.
</p>
      <p>Conversion: Extracting Formulae and Keywords from Questions.
For the CQA task, participants are given 98 real-world mathematical questions
selected from MSE posts in 2019.6 Each mathematical question is a topic that
contains: (1) the topic-ID, (2) the title for this topic, (3) the question (body text)
for this topic, and (4) the tags for this topic. The title and body text are free-text
fields describing a question in mathematical natural language (as opposed to
formal logic, for example), and the tags indicate the question’s academic areas.</p>
      <p>We adopt the following automated mechanism to extract a list of topic
formulae and keywords:
Topic formulae. Formulae within the topic’s title and body text are extracted.
All formulae within the title are selected as topic formulae. Formulae within the
body text are selected only if they are neither single variables (e.g., n or i) nor
isolated numbers (e.g., 1 or 25).</p>
      <p>Topic keywords. Keyword selection is summarized in Algorithm 1. Each of the
topic’s tags is selected as one of the topic keywords. For the topic’s title and
body text, we first tokenize the text (Algorithm 2) to obtain a list of potential
word tokens. A token is then selected as a keyword if it is not a stopword
(Algorithm 4) and either it contains a hyphen (such as “Euler-Totient” or
“Cesàro-Stolz”) (Algorithm 3) or its stem appears on a pre-constructed list of
mathematical stems.</p>
      <p>The mathematical stem list is created in a pre-processing step by
automatically extracting terms from two sources: (1) tags from the indexed MSE dataset
and (2) titles from English Wikipedia articles comprising the corpus for the
NTCIR-12 MathIR Wikipedia task [29]. For the former source, each tag present
in the ARQMath MSE corpus is tokenized, stemmed, and added to the stem list.
For example, two stems (“totient” and “function”) are added to the list for the
tag “totient-function.” For the latter, we first collect the HTML filenames of all
articles used in the NTCIR-12 MathIR Wikipedia corpus, where each filename
reflects the corresponding Wikipedia article’s title. The filenames are then
transformed by removing file extensions and replacing underscores and parentheses
with spaces. Each hyphenated term is expanded to include all components as
well as the hyphenated term itself. The resulting cleaned text strings are
tokenized and stemmed, with punctuation removed, and the resulting stems are added
to the stem list. Thus, for example, four stems (“exponenti,” “logarithm,”
“distribut,” and “exponential-logarithm”) are added for the filename
Exponential-logarithmic_distribution.html. The resulting mathematical stem list comprises
about 1200 stems from MSE tags and about 21,000 stems from Wikipedia article
titles.
6 In addition, three extra questions are provided as samples with annotated answers.</p>
      <sec id="sec-3-1">
        <title>Algorithm 1: ExtractKeywords(topic)</title>
        <p>for tags in topic do
  split input by ',' to obtain a list of tags;
  foreach tag do
    add tag to keyword list;
    if hyphen in tag then
      further split by '-' to obtain a list of parts;
      foreach part do
        add part to keyword list;
for title, body text in topic do
  foreach token in Tokenize(input) do
    if not IsStopword(token) and (IsPreserved(token) or OnMathList(Stem(token))) then
      append token to keyword list;</p>
      </sec>
      <sec id="sec-3-2">
        <title>Algorithm 2: Tokenize(input)</title>
        <p>split the input by space into substrings;
foreach substring do
  if hyphen in substring then
    add substring to the token list;
replace '-' with space throughout input;
foreach token in Treebank-Tokenization(input) do
  add token to the token list;</p>
      </sec>
      <sec id="sec-3-3">
        <title>Algorithm 3: IsPreserved(token)</title>
        <p>return ('-' in token) or ('–' in token)</p>
      </sec>
      <sec id="sec-3-4">
        <title>Algorithm 4: IsStopword(token)</title>
        <p>return (token in stopword set provided by the NLTK library) or (token
contains only a single character) or (token is a numeric string)</p>
        <p>When selecting keywords with this procedure, we use the Treebank tokenizer,
the Porter stemmer, and a list of English stop-words provided by the Python
NLTK library.7 Using this approach, on average 8 topic formulae and 38 topic
keywords were extracted for the CQA task (Table 1). The complete list of topic
formulae and keywords can be found in the Appendix.
We use the Tangent-L system to retrieve answers that match the extracted
keywords and formulae.</p>
        <p>Indexing. We build the indexed corpus with question-answer pairs. Each
indexed unit includes an MSE answer along with the content of its associated
question.8 For each question-answer pair, we extract the following content from
the corpus XML files:</p>
        <p>From the answer: the body text and the number of votes (from Posts.xml);
From the associated question: the title, body text, and tags of the question
(from Posts.xml), plus comments associated with the question (from
Comments.xml). Additionally, we include the titles of all related and duplicate
posts for this question (from PostLinks.xml), having first converted all one-way
links between posts into two-way links.</p>
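        <p>A minimal sketch of pairing answers with their questions from a Posts.xml file; the attribute names follow Stack Exchange data-dump conventions and are assumptions, and link and comment handling is omitted.</p>

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def load_qa_pairs(posts_xml_path):
    """Pair each answer with its question from a Stack Exchange Posts.xml dump.

    Assumes the dump convention: PostTypeId "1" marks a question row and
    "2" an answer row linked to its question by ParentId.
    """
    questions, answers = {}, defaultdict(list)
    for _, row in ET.iterparse(posts_xml_path, events=("end",)):
        if row.tag != "row":
            continue
        if row.get("PostTypeId") == "1":
            questions[row.get("Id")] = {
                "title": row.get("Title", ""),
                "body": row.get("Body", ""),
                "tags": row.get("Tags", ""),
            }
        elif row.get("PostTypeId") == "2":
            answers[row.get("ParentId")].append(
                {"body": row.get("Body", ""), "votes": int(row.get("Score", "0"))}
            )
        row.clear()  # free memory while streaming the large dump
    return [(q, a) for qid, q in questions.items() for a in answers.get(qid, [])]
```

        <p>Each returned pair corresponds to one indexing unit: an answer together with the content of its associated question.</p>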
        <p>All formulae within the text are replaced by their Presentation MathML form
using the formula mapping in the TSV files provided with the corpus data.
An HTML file containing the final extracted content is then assembled as an
indexing unit (Figure 4). The indexable version of the MSE corpus includes a
total of 1,445,488 documents, indexed by Tangent-L in preparation for search.
Searching. Searching the corpus is a straightforward application of Tangent-L
for the converted topics. The list of keywords and formulae (in MathML)
is passed to the search engine. Tangent-L then converts the formulae to math
tuples and uses corresponding postings lists in the index to compute BM25+
scores (Equation 2), weighted by a value α that depends on the experimental
setup for the run.
7 https://www.nltk.org/
8 As many of us were admonished in school: “You should always include the question
as part of your answer.”
&lt;body&gt;
&lt;div class="row" id="question header"&gt;</p>
        <p>&lt;h1&gt; The cow in the field problem (intersecting circular areas) &lt;/h1&gt;
&lt;div class="question"&gt;
&lt;div class="question" data questionid="#QID#" id="question"&gt;
&lt;div class="post text"&gt;
&lt;p&gt;What length of rope should be used to tie a cow to an &lt;strong&gt;exterior fence
post&lt;/strong&gt; of a &lt;em&gt;circular&lt;/em&gt; field so that the cow ....</p>
        <p>&lt;/div&gt;
&lt;/div&gt;
&lt;div class="post taglist"&gt;</p>
        <p>&lt;span class="post tag"&gt; geometry &lt;/span&gt; &lt;hr&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="row" id="answers"&gt;
&lt;div class="answer" data answerid="62"&gt; &lt;div class="post text"&gt; &lt;p&gt;So, the area
of the field is &lt;span class="math container" id=795 fid=795&gt;&lt;?xml version="1.0"
encoding="UTF 8"?&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML" alttext=
"\pi␣r^{2}" display="block"&gt; &lt;mrow&gt; &lt;mi&gt;$\pi$&lt;/mi&gt; &lt;mo&gt;~&lt;/mo&gt; &lt;msup&gt; &lt;mi&gt;r&lt;/mi&gt;
&lt;mn&gt;2&lt;/mn&gt; &lt;/msup&gt; &lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt; and you want the cow to be able to
graze an area equal to half of that.&lt;/p&gt; &lt;p&gt;All you need to do is set up the equation ....
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="row" id="question comments"&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt; I'm guessing that most fence posts are at the edge of a field, which
makes this a far more interesting problem. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt; ....</p>
        <p>&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div class="row" id="duplicated"&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt; A confusing word problem related to geometry (circles) &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;
&lt;td&gt; Is it possible to express the area of the intersection of 2 circles as a
closed form expression? &lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div class="row" id="related"&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt; Find the area where dog can roam &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt; A goat tied to a
corner of a rectangle &lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;/body&gt;</p>
        <p>The CQA task is made more challenging by the potential to use semantic
information relating candidate answers to topic questions. For instance, an answer
should be ranked higher in position if it answers the topic question correctly.
However, a correct answer does not necessarily have the highest score based
merely on matching keywords and formulae by similarity.</p>
        <p>As such, we are motivated to explore a re-ranking algorithm applied to the
results returned from the search engine. To this end, we wish to use information
from the complete MSE corpus to build a model that reflects how valuable a
potential answer might be. We hypothesize, with the law of simplicity in mind,
that a linear function of the following four variables might serve well:
1. similarity: The similarity of keywords and formulae between the target
answer (including its associated question) and the topic query is clearly an
important component. This is captured by the search score returned from
the Tangent-L system.
2. tags: The number of overlapping tags between the question associated with
the answer and the topic query reflects how well the answer matches the
query’s academic area(s).
3. votes: The fraction of votes received by the answer when posted with its
associated question reflects the community’s belief in the answer’s value.
4. reputation: The reputation of the author who wrote an answer implies
the trustworthiness of that answer. Intuitively, a good answer comes from
an author with a good reputation. This can be computed from the user
reputation score, the number of user up-votes, and the number of user
downvotes.</p>
        <p>The remaining problem is to determine what coefficient values to use when
linearly combining these inputs. For this we need a training set that includes
relevance assessments, which are not available as part of the corpus.
Mock Relevance Assessments. As a substitute for assessed relevance, we
can build a training set of queries from questions already in the corpus: we
hypothesize that relevant answers include those that were actually provided for
those questions as well as those provided for duplicate and related questions.</p>
        <p>We use tags, related posts, and duplicate posts of a question, and the number
of votes for each answer to calculate mock relevance assessments for answers to
the training topics, based on the following two observations:
1. Considering the question associated with a target answer, the more
“on-topic” that question is to the query, the more relevant the target answer is
likely to be for that query. A corpus question and a query are related if they
have overlapping tags, but are more related if one is marked as a related
post of the other, and still more related if it is marked as a duplicate post
of the other (or is, in fact, the same question).
2. If two potential answers are associated with the same question, the one with
more votes should be preferred.</p>
        <p>With these assumptions, an assessment value can be computed as follows:
– Integral assessment value.</p>
        <p>1. An answer gets an assessment value of 2 if the question associated with
that answer is a duplicate post of the query question, or if the answer
comes from the same question.
2. An answer gets an assessment value of 1 if the question associated with
that answer is a related post of the query question.
3. Otherwise, the answer gets an integral assessment value of 0.
– Fractional assessment value.</p>
        <p>1. Assuming that all votes for answers are non-negative, an answer gets a
normalized fractional assessment value between 0 and 1, calculated as the
fraction of its votes to the sum of votes of all answers for the associated
question, provided that the associated question is the same as, a duplicate of,
or a related post to the query question, or that it has any overlapping tags
with the query question.</p>
        <p>2. Otherwise, the answer gets a fractional assessment value of 0.</p>
        <p>The final mock relevance assessment is the sum of the integral and fractional
assessment values and, when all answers have non-negative votes, ranges from 0
to 3 (matching the assessment range expected in the CQA task):</p>
        <sec id="sec-3-4-1">
          <title>Score and Meaning</title>
          <p>0: irrelevant; 0 .. 1: tags overlap; 1 .. 2: posts are related;
2 .. 3: posts are identical or duplicates.</p>
        </sec>
        <sec id="sec-3-4-2">
          <title>Negative Votes</title>
          <p>In fact, some answers in the corpus have negative votes (down-votes). In
this case, the fractional assessment value is adjusted as follows: for a thread
containing any answers with negative votes, the fraction’s denominator is the
absolute value of the total negative votes plus the sum of the positive votes
within the thread. If an answer has positive votes, the fraction’s numerator is
the absolute value of the total negative votes plus the number of votes for the
answer; otherwise, the fraction’s numerator is the (negative) value of the votes.
For example, suppose a thread contains three answers whose votes are -2, 2, and
6, respectively. The fractional assessment values for these answers are -0.2, 0.4,
and 0.8, respectively. Thus the mock relevance assessment penalizes answers with
negative votes.</p>
          <p>Linear Regression Formula. To determine the coefficients for the linear
regression model, we first generate 1300 training topics from the MSE collection.
We use the Tangent-L system with α = 1.0 (in Equation 2) to retrieve the top
10,000 answers for each valid topic, resulting in a total of 12,970,000 answers.
Next, we generate the mock relevance assessments for these answers and
associate these assessments with the values for similarity, tags, votes, and reputation
for those answers, as discussed above. These 12,970,000 tuples then serve as
training data for the linear regression model. Example items of the training data
are shown in Table 2 (where * denotes the original answer vote and † the
denominator of the fractional assessment value), and the trained coefficients are
shown in Table 3.</p>
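          <p>The assessment computation can be sketched as follows. The integral mapping follows the rules above directly; for the fractional value, the prose is ambiguous about the denominator, so the sketch adopts one self-consistent reading chosen to reproduce the worked example (a thread whose votes sum to zero is left undefined, as in the paper).</p>

```python
def integral_assessment(relation):
    """relation between the answer's question and the query question:
    "same", "duplicate", "related", or "none"."""
    return {"same": 2, "duplicate": 2, "related": 1}.get(relation, 0)

def fractional_assessments(votes):
    """Fractional assessment values for all answers in one thread.

    Reading assumed: denominator = |total negative votes| + sum of positive
    votes; a positive answer's numerator is its votes plus |total negative
    votes|, a negative answer's numerator is its (negative) votes.
    """
    neg_total = -sum(min(v, 0) for v in votes)   # |sum of negative votes|
    pos_total = sum(max(v, 0) for v in votes)
    denom = neg_total + pos_total
    return [(v + neg_total if v > 0 else v) / denom for v in votes]
```

          <p>With votes of -2, 2, and 6, this yields -0.2, 0.4, and 0.8, matching the worked example; with all-non-negative votes it reduces to each answer's share of the thread's votes.</p>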
          <p>We validate the trained linear regression model by first applying the model
to predict a relevance score for the top 1000 retrieved answers of a separate set of
another 100 topics, and then re-ranking the answers according to the predicted
mock relevance score. Finally, we adopt the normalized Discounted Cumulative
Gain (nDCG), which measures the gain of a document based on its position
in the result list and its graded relevance. We compare the nDCG value of the
results for those 100 topics according to the mock relevance assessment before
and after re-ranking. This simple model gives a slight improvement in nDCG
after the re-ranking (from 0.3192 to 0.3318).</p>
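          <p>The fit-then-re-rank step can be sketched as follows. The paper trains with the scikit-learn library; this sketch uses numpy least squares as a dependency-light stand-in, and the feature rows (similarity, tag overlap, vote fraction, reputation) and mock relevance targets are fabricated for illustration.</p>

```python
import numpy as np

# Hypothetical training rows: (similarity, tag_overlap, vote_fraction, reputation)
X = np.array([
    [12.0, 2, 0.8, 0.9],
    [ 8.5, 1, 0.4, 0.5],
    [ 3.0, 0, 0.0, 0.2],
    [10.0, 2, 0.6, 0.7],
])
y = np.array([2.8, 1.4, 0.0, 2.1])  # hypothetical mock relevance assessments

# Fit a linear model with intercept via least squares
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(features):
    """Predicted mock relevance for one (similarity, tags, votes, reputation) row."""
    return float(np.dot(np.append(features, 1.0), coef))

# Re-rank retrieved answers by predicted mock relevance, descending
answers = [("ans1", [12.0, 2, 0.8, 0.9]), ("ans2", [3.0, 0, 0.0, 0.2])]
reranked = sorted(answers, key=lambda a: predict(a[1]), reverse=True)
```

          <p>In the real pipeline, the features come from the 12,970,000 training tuples described above, and re-ranking is applied to the top 1000 retrieved answers per topic.</p>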
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>Description of Runs
The MathDowsers submitted a primary run and four alternate runs, each run
returning at most 1000 answer matches for all 98 topics. The primary run is
designed as a combination of our hypotheses for the best configuration for all
research objectives. The alternate runs are designed to test these hypotheses: the
setup for each of them is the same as the primary run, except for a single aspect
that is associated with a testing hypothesis, as described here and summarized
in Table 4.</p>
      <sec id="sec-4-1">
        <title>RQ1 Topic Conversion</title>
        <p>In the primary run, we use the auto-extracted topic keywords and formulae
described in Section 3.1 and listed in the Appendix as the query input to the
search engine. To compare the effectiveness of our extraction algorithm with
human understanding of the questions, an alternate run takes as input lists
of up to five topic keywords and up to two formulae per topic, all manually
chosen by the lab organizers after reading each question and made available
to all participants. This alternate run, which includes “translated” in its label,
is therefore a manual run, in contrast to all other runs that are “automatic.”</p>
      </sec>
      <sec id="sec-4-2">
        <title>RQ2 Formula Weight</title>
        <p>We test the weight of math formulae in a query by tuning Tangent-L’s
parameter α (Equation 2). In the primary run, α = 0.5, which means math
terms are given half the weight of keywords in a query. The reasoning behind
this choice is that each formula generates many terms, one for each feature
extracted, and previous experiments showed some benefit in reducing the
weight accordingly [9]. Two alternate runs operate with α = 0.2 (even less
weight on formulae) and α = 1.0 (Tangent-L’s default setting), respectively.</p>
      </sec>
      <sec id="sec-4-3">
        <title>RQ3 Re-ranking</title>
        <p>In the primary run, we re-rank the results from the search engine, using the
model described in Section 3.3. To evaluate the effectiveness of the model,
an alternate run has no re-ranking.
Indexing. Indexing is done on an Ubuntu 16.04.6 LTS server, with two Intel
Xeon E5-2699 V4 processors (22 cores / 44 threads, 2.20 GHz each), 1024 GB
RAM, and 8 TB of disk space (on a USB3 external hard disk). The size of the
document corpus is 24.3 GB.</p>
        <p>Tangent-L requires 5.0GB of storage on the hard drive and approximately 6
hours to index all documents with parallel processing.</p>
        <p>Searching and Re-ranking. Training and testing the model for re-ranking is
done on a Linux Mint 19.1 machine, with an Intel Core i5-8250U Processor (4
cores 8 threads, up to 3.40 GHz), 24GB RAM and 512GB disk space.9</p>
        <p>Model training using the Python scikit-learn library10 takes less than 30
seconds, and re-ranking for all 98 topics requires around 3 seconds per run.</p>
        <p>Searching is executed on this same Mint machine, and retrieval time statistics
for Tangent-L are reported in Table 5.
alpha05 † 13.3 0.669 (A.67) / 0.775 (A.94) 63.4 (A.76) / 48.7 (A.28)
alpha02 13.3 0.661 (A.67) / 0.850 (A.94) 59.1 (A.76) / 49.5 (A.28)
alpha10 13.1 0.616 (A.67) / 0.784 (A.83) 54.0 (A.76) / 48.7 (A.28)
alpha05-translated 5.3 0.247 (A.99) / 0.291 (A.94) 32.8 (A.11) / 25.0 (A.67)
alpha05-noReRank 13.3 0.669 (A.67) / 0.775 (A.94) 63.4 (A.76) / 48.7 (A.28)
† The run alpha05 does not, in fact, take any additional retrieval time, since it
merely re-ranks the retrieved items from the alpha05-noReRank run.
The primary measure for the task is the Normalized Discounted Cumulative
Gain (nDCG) with unjudged documents removed (thus nDCG′). Following the
organizers’ practice [31], we also measure Mean Average Precision (MAP) with
unjudged documents removed (MAP′) and Precision at top-10 matches (P@10).
Additionally, we calculate the bpref measure, which other researchers have found
useful, and the count for Unjudged Answers within the top-k retrieved answers
for each topic.
9 A NVIDIA GeForce MX150 graphics card with 2GB on-card RAM is available on
the machine, but it was not used for the experiments.
10 https://scikit-learn.org</p>
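        <p>The condensed-list evaluation idea can be sketched in Python. The function below is a simplified illustration of nDCG′ (unjudged documents dropped before scoring), not the lab's official evaluation script; the ideal gain is truncated to the condensed list length as a simplifying assumption.</p>

```python
import math

def ndcg_prime(ranking, qrels):
    """nDCG': drop unjudged documents from the ranking, then compute
    standard nDCG over the remaining judged documents.
    `qrels` maps doc id -> graded relevance; unjudged docs are absent."""
    judged = [doc for doc in ranking if doc in qrels]
    gains = [qrels[doc] for doc in judged]
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    ideal = sorted(qrels.values(), reverse=True)[: len(judged)]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```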
        <p>The lab organizers obtained relevance assessments for 77 of the 98 topics.
With respect to the primary measure, MathDowsers placed in the top three
positions among all 23 submissions, including the five baseline runs [31]. The retrieval
performance for the MathDowsers’ submissions is summarized in Table 6, along
with the performance of baseline systems provided by the lab organizers. One
of the five baselines is an unrealizable model and the other four are traditional
text or math-aware query systems adapted for the task. They are described by
the lab organizers as follows [31]:
Linked MSE posts: an unrealizable model “built from duplicate post links
from 2019 in the MSE collection (which were not available to participants).
This baseline returns all answer posts from 2018 or earlier that were in
threads from 2019 or earlier that MSE moderators had marked as duplicating
the question post in a topic. The posts are sorted in descending order by their
vote scores.”
Approach0: “the ECIR 2020 version of the Approach0 text + math search
engine [32], using queries manually created by the third and fourth authors.
This baseline was not available in time to contribute to the judgement pools
and thus was scored post hoc.” Formulae are represented by operator trees
(OPT) where internal nodes are operators and leaves are operands, and
formulae are matched efficiently by finding structurally identical sub-trees
with dynamic pruning.</p>
        <p>TF-IDF + Tangent-S: “a linear combination of TF-IDF and Tangent-S
results [see below]. To create this combination, first the relevance scores from
both systems were normalized between 0 and 1 using min-max
normalization, and then the two normalized scores were combined using an unweighted
average.”
TF-IDF: “a TF-IDF (term frequency–inverse document frequency) model from
the Terrier platform [21]... Formulae are represented using their LaTeX string...</p>
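        <p>The score combination described for the TF-IDF + Tangent-S baseline above can be sketched as follows; the document IDs and raw scores are illustrative, and documents missing from one list are assumed to contribute 0 there.</p>

```python
def min_max(scores):
    """Scale a {doc: score} dict to [0, 1] via min-max normalization."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def combine(a, b):
    """Unweighted average of two min-max-normalized score lists."""
    na, nb = min_max(a), min_max(b)
    docs = set(na) | set(nb)
    return {d: (na.get(d, 0.0) + nb.get(d, 0.0)) / 2 for d in docs}
```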
        <p>The TF-IDF baseline used default parameters in Terrier.”
Tangent-S: like Tangent-L, an extension from the Tangent-3 math search
system [30]: “a formula search engine using SLT and OPT formula
representations [6]. One formula was selected from each Task 1 question title if possible;
if there was no formula in the title, then one formula was instead chosen from
the question’s body. If there were multiple formulae in the selected field, the
formula with the largest number of nodes in its SLT representation was
chosen... (Tangent-S) retrieves formulae independently for each
representation, and then linearly combines SLT and OPT scoring vectors for retrieved
formulae [6]. For ARQMath, we used the average weight vector from cross
validation results obtained on the NTCIR-12 formula retrieval task.”
From the top half of Table 6, the first observation is that the primary run,
with our presumed best configuration, performs better than only one of our
alternate runs. The second observation is that lowering the weight placed on math
terms (α = 0.2) improves the performance, and using the default weight (α =
1.0) hurts the performance. Thirdly, the alternate run using manually extracted
formulae and keywords performed better than the primary run. Finally, the
alternate run without re-ranking achieves the best performance in all evaluation
measures. This best submission also performs well with respect to all other
evaluation measures, with the best (but unrealizable) baseline system, Linked MSE
posts, being the only submission to perform better with respect to those other
measures.</p>
        <p>In retrospect, it appears as if all aspects of our primary run leave room for
improvement. In order to explore these observations more closely, we execute
several additional runs designed after the conclusion of the formal experiment
(post-experiment). As detailed in Table 7, these runs examine the performance
of our system without re-ranking for additional values of α and with automatic
or manual choice of keywords and formulae. We can now summarize our insights
from all of Table 6 with respect to our research objectives.</p>
        <p>The effect of re-ranking. Comparing the results from the submissions and
the post-experiment runs, we see that our re-ranking design was detrimental
to the performance (RQ3). A consistent drop of at least 0.06 in nDCG′ can
be observed for runs after re-ranking (e.g., alpha02 vs. alpha02-noReRank). A
similar deterioration can be observed in other evaluation measures as well.
The effect of α. Considering runs without re-ranking, we confirm that the
performance gradually improves as the weight for math formulae in a query
decreases. When we decrease α from 1.0 to 0.1, a gain of 0.06 in nDCG′ is
achieved for runs with auto-extracted topic conversion (alpha10-noReRank vs.
alpha01-noReRank). Similarly, a gain of 0.03 in nDCG′ is achieved for runs with
manual topic conversion (alpha10-trans-noR vs. alpha01-trans-noR). Gains are
also observed using other evaluation measures.</p>
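        <p>The weighting idea behind α can be illustrated with a small sketch. The function below is an assumption about how a weighted keyword/math combination might look, not Tangent-L's internal scoring.</p>

```python
def query_score(keyword_scores, math_scores, alpha):
    """Combine per-document keyword and math-term scores, down-weighting
    math terms by alpha (alpha = 0.1 gives math terms one-tenth the
    weight of keywords). Simplified illustration only."""
    docs = set(keyword_scores) | set(math_scores)
    return {d: keyword_scores.get(d, 0.0)
               + alpha * math_scores.get(d, 0.0)
            for d in docs}
```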
        <p>Unfortunately, when α is set to a much smaller value, namely 0.01, the overall
performance is questionable since the results of different evaluation measures are
contradictory. We observe that for the run with auto-extracted topic conversion
(alpha001-noReRank), all measures are at their lowest compared to runs with
other choices of α. However, for the run with manual topic conversion
(alpha001-trans-noR), nDCG′ and P@10 are still at their lowest, but MAP′ is the
second-best, and bpref is at its highest among all other choices of α.</p>
        <p>Table 8 shows that with α = 0.01, unjudged answer coverage at top-k for
alpha001-noReRank is extraordinarily high (at least 80%). Similarly, among the
manual runs, alpha001-trans-noR also has by far the highest unjudged answer
coverage for various values of k across all choices of α. The large percentage
of unjudged answers implies that the evaluation of these two runs might not
actually be informative.</p>
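        <p>The unjudged-answer coverage used in this analysis can be computed straightforwardly; a minimal sketch (with illustrative inputs) is:</p>

```python
def unjudged_at_k(ranking, judged_docs, k):
    """Fraction of the top-k retrieved answers that received no
    relevance judgement for this topic."""
    top = ranking[:k]
    if not top:
        return 0.0
    return sum(1 for doc in top if doc not in judged_docs) / len(top)
```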
        <p>We conclude that keywords should be assigned heavier weights in a query
(RQ2). Good performance is achieved when math terms are given one-tenth of
the weight of keywords in a query (α = 0.1). We are unable to determine the
effect when the weight of math formulae is set even smaller.</p>
        <p>The effect of topic conversion. Table 6 shows that with high α values (0.5
and 1.0), runs with manual topic conversion consistently perform better than
runs with auto-extracted topic conversion in all evaluation measures. However,
with lower α values (0.1 and 0.2), runs with auto-extracted topic conversion
perform better for P@10. When α = 0.1, which we have just concluded to be
the best setting for α, the performance of the two runs with respect to the nDCG′
and bpref measures becomes essentially indistinguishable. We conclude that the
proposed method to convert a topic into a query composed of keywords and
formulae is competitive with human ability to select search terms (RQ1). We
discuss this observation further in Section 5.</p>
        <p>
To gain a deeper insight into the behaviour of our best-performing submission
run (alpha05-noReRank ), we examine its performance within each topic category,
as determined by the lab organizers over three aspects. Dependency shows to
what extent a topic depends on the text, formula, or both. Topic Type gives a
broad categorization of whether a topic asks about a computation, a concept,
or a proof. Difficulty approximates how hard a topic is to answer, using three
levels of difficulty: easy, medium, and hard. The breakdown counts of the 77 test
topics by category are shown in Table 9.</p>
        <p>Table 10 tabulates the performance by category for our best-performing
submission run vs. the best-performing (but unrealizable) baseline. It shows that our
system has weaker performance than the ideal with respect to the MAP′, P@10,
and bpref measures in all categories. However, with respect to the nDCG′
measure, its better overall performance results from its performance in some
particular categories.</p>
        <p>In terms of dependency, our system has strong performance for topics that
rely heavily on formulae. We attribute this strength to Tangent-L, the
underlying MathIR system that is competitive with other state-of-the-art systems for
formula retrieval [9]. We further conclude that, in spite of our earlier
conclusion that performance improves for lower values of α, setting α = 0—assigning
no weight to math terms in a query—would be a poor design decision for our
system.</p>
        <p>In terms of topic type, we observe that our system is strong at Computation-type
and Proof-type topics, but is particularly weak at Concept-type topics
when compared with the other two types of topics. With further inspection, we
see that none of the Concept-type topics have a Formula-dependency, as shown
in Table 11, which might be the reason why our system does not perform as well
in that category.</p>
        <p>Our system excels at all three levels of difficulty: Easy, Medium, and Hard. We
attribute this performance to the even distribution of a significant number of
topics relying on formulae (with either a Formula-dependency or Both-dependency)
among the three levels, as observed in Table 11.</p>
        <p>Similar conclusions can also be made when comparing the nDCG′ of our
best-performing submission with that of other baselines, as shown in Table 12.</p>
        <p>Finally, we observe that our system might be tuned to boost performance to
accommodate its weakness if the type of category were known in advance. For
instance, in Table 13 we observe that the overall nDCG′ value can be slightly
improved if our trained re-ranking model is applied to Concept-type topics only.
This boosting effect is only clearly distinguishable for the setting α = 1.0
and not effective for our recommended setting of α = 0.1.11 This leads us to
speculate that some advantage might be gained by building a system that adapts
to the type of query being posed, instead of seeking a one-size-fits-all solution.
11 This effect might be partially attributable to the fact that the re-ranking model is
trained over answers retrieved by Tangent-L under the setting α = 1.0, as described
in Section 3.3.
(Runs in Table 13: alpha10 (α = 1.0), alpha05 (α = 0.5), alpha02 (α = 0.2),
alpha01 (α = 0.1).)</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>We conclude that a traditional math-aware query system remains a viable option
for addressing a CQA task specialized in the mathematical domain. In
particular, Tangent-L is a math-aware search engine well-suited to retrieve answers to
many computational and proof-like questions in the presence of math formulae.
We hypothesize that part of the success for all five MathDowsers runs results
from the search engine having indexed question-answer pairs instead of answers
only, thereby providing a context for evaluating the suitability of each answer in
serving as an answer to newly posed topic questions.</p>
      <p>Nevertheless, several of our initial experimental design decisions turn out
to be somewhat disappointing. In the remainder of this section, we share our
thoughts regarding room for improvement.</p>
      <p>Improvements for re-ranking. The first improvement is with respect to the
design and training of the re-ranking model described in Section 3.3. Several
avenues are open:
1. We realize in retrospect that we should normalize the scores returned by
Tangent-L within each topic: as for many search engines, the scores for one
query are not comparable to the scores for another query. However, we have
since found that even with proper normalization, the linear regression model
does not produce an overall valuable re-ranking.
2. The mock relevance scores used for training the model might not be
indicative of assessed relevance, therefore giving a poor model for re-ranking. For
future work, now that there are actual assessments from the ARQMath Lab,
we could use those for training and avoid the need for mock scores. This
approach could be tested by conducting some cross-validation studies.
3. Perhaps using linear regression is not appropriate for this application. An
alternate approach is to first transform (mock or actual) assessments into a set
of discrete scores, and then treat these scores as categories to be predicted by
a classification model, such as a Support Vector Machine. Other approaches
to model CQA-type features should also be investigated, including those that
have been shown to be successful in the SemEval CQA Challenge series [17–
19].</p>
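      <p>A minimal sketch of the classification alternative in item 3 follows, assuming hypothetical per-answer features (the feature choice and score discretization are illustrative assumptions, not a tested design).</p>

```python
# Hedged sketch: discretize (mock or assessed) relevance into integer
# classes 0-3, then predict the class of each candidate answer.
from sklearn.svm import SVC

# Illustrative features: [engine score, vote count, answerer reputation].
X_train = [
    [12.1, 5, 300],
    [10.4, 2, 120],
    [8.7, 0, 15],
    [11.0, 7, 950],
    [6.0, 1, 40],
]
y_train = [2, 1, 0, 3, 0]  # discrete relevance categories

clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
labels = clf.predict([[9.0, 4, 250], [5.5, 0, 10]])
```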
      <p>Improvements for topic conversion. Another area for improvement is with
respect to the design of the auto extraction algorithm described in Section 3.1.
Although the use of our automatic extraction algorithm performs comparably
to manual topic conversion, we have not considered constraining the number of
keywords and formulae in the algorithm. With a widely fluctuating formula-to-
keyword ratio across topics, the performance of our extraction algorithm might
be hindered by using any fixed value of α for all queries. Preliminary investigations
have been unable to establish a correlation between the best value for α and the
ratio of the number of keyword terms to the number of terms extracted from
math formulae [8], but this new benchmark might provide additional insights
into how to choose a value for α “on the fly” that depends on the number of
keyword terms and math terms.</p>
      <p>Alternatively, constraining the maximum number of keywords and formulae
and the sizes of formulae extracted from a topic description might be a better
approach, since it would also constrain the number of keyword and math terms.
However, this poses a new question of determining the best maximum limit. We
believe that research related to Automatic Term Extraction (ATE) in technical
domains, or in mathematical domains, might provide valuable insights into our
problem.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>We gratefully acknowledge financial support from the Waterloo-Huawei Joint
Innovation Lab and from NSERC, the Natural Science and Engineering Research
Council of Canada in support of our project entitled “Searching Documents with
Text and Mathematical Content Using a Pen Based Interface.” Prof. Gordon
Cormack kindly let us use one of his research machines for indexing the corpus.
We also thank the ARQMath Lab organizers (including, notably, Behrooz
Mansouri) for the time and effort expended on developing the idea for the Lab and
submitting the proposal to CLEF, as well as preparing the corpus, the
questions, the manual translation of the topic questions into formulae and keywords,
and the relevance assessments. Andrew Kane and anonymous reviewers from
other participating lab teams made several valuable suggestions for improving
our presentation after reading a draft version of this paper.</p>
    </sec>
    <sec id="sec-7">
      <title>Appendix</title>
      <p>Below is the list of formulas and keywords converted from the given topics using
the method described in Section 3.1. Among the 98 topics submitted for each run,
77 were (partially) assessed for the CQA task. The topics with no assessments
are A.2, A.6, A.22, A.25, A.34, A.46, A.57, A.64, A.70, A.71, A.73, A.76, A.81,
A.82, A.84, A.91, A.92, A.94, A.95, A.97, and A.100, marked with a dagger (†)
below.</p>
      <p>Topic List of formulas and keywords
A.1
[c, f (x) = xx22++2xx++cc , [ 1; 13 ], f (x) = xx22++2xx++cc , f (x), [ 1; 13 ], Finding,
value, range, rational, function, does, contain, calculate, range, rational,
function, reverse, across, this, find, value, range, does, contain, functions]
[f 0(x) = f (x + 1), ddfx = f (x + 1), Solving, differential, equations,
form, solve, differential, equations, form, ordinary, differential,
equations, ordinary-differential-equations]
[p5, 10 10, p5, 10 10, p5, f (x), [a; b], f (a), f (b), Approximation,
correct, resolve, problem, Find, approximation, correct, using, bisection,
has, placed, function, sure, go, function, Mathematica, given,
calculations, function, interval, opposite, signs, tolerance, number, iterations,
numerical, methods, algorithms, bisection, numerical-methods]
[Pn</p>
      <p>
        k=0 nk k, n2n 1, compute, this, combinatoric, sum, sum, know,
result, know, does, one, even, sum, like, this, has, binomial, coefficients,
combinatorics, number, theory, summation, proof, explanation,
number-theory, proof-explanation]
[P ((
        <xref ref-type="bibr" rid="ref2">2</xref>
        )j(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )) = P ((
        <xref ref-type="bibr" rid="ref2">2</xref>
        )\(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )) = P ((
        <xref ref-type="bibr" rid="ref2">2</xref>
        ))P ((
        <xref ref-type="bibr" rid="ref1">1</xref>
        )) = P ((
        <xref ref-type="bibr" rid="ref2">2</xref>
        )), family, has, two, Given,
      </p>
      <p>
        P ((
        <xref ref-type="bibr" rid="ref1">1</xref>
        )) P ((
        <xref ref-type="bibr" rid="ref1">1</xref>
        ))
one, boy, probability, boys, family, has, two, Given, one, boy,
probability, boys, was, this, question, using, conditional, probability, event, first,
boy, event, second, probability, second, boy, given, first, boys, formula,
since, second, boy, does, depend, first, detailed, solution, correct,
probability, proof, verification, conditional, probability, proof-verification,
conditional-probability]
[5133 mod 8:, 5n mod 8 = 5, 5133 mod 8 = 5, calculate, mod, number,
big, exponent, find, noticed, lead, say, know, prove, prove, this, case,
find, solution, algebra, precalculus, arithmetic, algebra-precalculus]
[ 1111000 1 , 1110 1, 1110 1 = x (mod 100), 112 21 (mod 100),
(112)2 (
        <xref ref-type="bibr" rid="ref21">21</xref>
        )2 (mod 100), 114 441 (mod 100), 114 41
(mod 100), (114)2 (41)2 (mod 100), 118 1681 (mod 100), 118
81 (mod 100), 118 112 (81 21) (mod 100), 1110 1701
(mod 100) =) 1110 1 (mod 100), 1110 1 (
        <xref ref-type="bibr" rid="ref11">1 1</xref>
        ) (mod 100) =)
1110 1 0 (mod 100), x = 0, 1110 1, Finding, remainder,
using, modulus, divided, solve, term, mod, tried, mod, mod, mod, mod,
mod, mod, mod, mod, mod, mod, mod, mod, value, divisible, this,
approach, long, time, competitive, exam, math, contest, without, using,
process, determine, remainder, above, problem, very, helpful, advance,
elementary, number, theory, modular, arithmetic, divisibility,
alternative, proof, elementary-number-theory, modular-arithmetic,
alternative-proof]
[limn!1 qn (27()3nn()n!!)3 , limn!1 qn (27()3nn()n!!)3 , finding, value, Finding,
value, try, solve, help, limits]
[PnN=0 nxn, Pin=0 i2 = (n2+n)6(2n+1) , this, series, need, write, series,
form, does, involve, summation, notation, example, Does, anyone, idea,
this, multiple, ways, including, using, generating, functions, sequences,
series, sequences-and-series]
[R01 sixnax , R01 sixnax , Find, values, improper, integral, sin, converges, Find,
values, improper, integral, sin, converges, expand, using, series,
expansion, improper, integrals, improper-integrals]
A.12
A.13
A.14
A.15
A.16
A.17
[R3, u = (a; b; c), v = (d; e; f ), R2, R3, u = (a; b), v = (d; e), R2, R3,
(u; v) = ( f (u); g(v) ), R2, R3, R2, (u; v) = (2u cos v; u sin v),
cross, product, dimensions, math, book, using, states, cross, product,
two, vectors, defined, direction, resultant, determined, curling, fingers,
vector, pointing, direction, cross, product, cross, product, defined, Is,
degenerate, case, cross, product, like, this, type, determinant, instance,
parameterization, needed, calculate, examples, book, calculating,
determinate, cos, sin, multivariable, calculus, vectors, multivariable-calculus]
[(1 + ip3)1=2, Finding, roots, complex, number, was, solving, practice,
problems, this, question, It, find, roots, sketch, linear, algebra, complex,
numbers, polar, coordinates, linear-algebra, complex-numbers,
polar-coordinates]
[Rab f (x)dx + Rff((ab)) f 1(x)dx ?, Rab f (x)dx + Rff((ab)) f 1(x)dx ?, bf (b)
af (a), expression, expression, answer, answer, calculus]
[y = xy0 + 12 (y0)2, 12 y0(2x + y0) = y, x2 + y = t, Help, solving, first, order,
differential, equation, first-order, first, order, differential, equation, this,
find, way, solve, use, derivate, idea, first-order, ordinary, differential,
equations, ordinary-differential-equations]
[Pn
      </p>
      <p>i=1 ixi 1, 1+2x+3x2 +4x3 +5x4 +:::+nxn 1 +:::, x 6= 1; jxj &lt; 1, Sn,
S2 = 1 + x + x2 + x3 + x4 + :::, d(S2) = S1, jxj &lt; 1, S2, 11 xxn = 1 1 x , S1 =
dx
d(S2) = d( 1 1x ) = (1 1x)2 , Derive, sum, series, need, find, partial, sums,
dx dx
finally, sum, tried, series, source, series, sum, geometric, progression,
this, answer, sequences, series, convergence, summation, power, series,
sequences-and-series, power-series]
[R01 ln(1+x1)+lxn(1 x) dx, R01 ln(1+x1)+lxn(1 x) dx, I(a; b) =
R01 ln(1 ax1+)lxn(1+bx) dx, d2dIa(dab;b) , Finding, Calculate, My, try, Let,
compute, happy, see, ideas, order, kill, this, integral, integration,
sequences, series, definite, integrals, closed, form, sequences-and-series,
definite-integrals, closed-form]
[Rx1=0 sinx(x) , eziz , Rx1=0 sinx(x) , f (z) = eziz , = 1 + R + 2 + ,
1(t) = t; t 2 [i ; iR], R(t) = Reit; t 2 [ 2 ; 2 ], 2(t) = t; t 2 [ iR; i ],
(t) = eit; t 2 [ 2 ; 2 ], sinx(x) , R f = i , ! 0, R R f = 0, R ! 1,
Calculate, sin, function, calculate, sin, function, using, closed, path, use,
sin, even, function, has, anti, derivative, integral, closed, path,
managed, show, show, Help, complex, analysis, improper, integrals,
complex-analysis, improper-integrals]
A.18
[limn!1
, limn!1
,
9</p>
      <p>
        n
log 2 + n log , 2e , 4e , 2=e, Evaluate, Evaluate, using, Stolz, know,
n + 1
many, question, like, this, solve, using, Stolz, method, log, applied, Stolz,
log, log, answer, answer, help, Edit, On, log, log, log, log, log, log, log,
log, log, log, Cesáro-Stolz, Cesáro-Stolz, Cesáro-Stolz,, sequences, series,
sequences-and-series]
[p4 1, p4 1, 74 1, 24 3 5 2, 114 1, 24 3 5 61, 24 3 5,
p4 1, (p2 + 1)(p 1)(p + 1), 24, 16n + x, Greatest, common, factor, was,
find, greatest, common, factor, primes, First, value, has, divisors, has,
divisors, has, prove, divisible, even, integers, know, prove, divisibility,
since, check, numbers, prove, divisibility, assigning, divisibility, greatest,
common, divisor, greatest-common-divisor]
[n 2 N n f41g, (n) = 40, n 2 N, (n) = 40, , n = 41, n0s, Calculate,
looking, Euler, Totient, Function, found, one, namely, calculate,
EulerTotient, totient, function, totient-function]
[999:::9 , 999:::9 , 999:::9 x(mod100), 0 x 100, 999:::9 (nine9s) =
9a, 9a(mod100), a(mod (100)), (100) = 40, a = b(mod40),
99:::9
(eight9s) = 9b, 9b(mod40), b(mod (40)),
(40) = 16, b =
c(mod16), 999:::9 (seven9s) = 9c, 9c(mod16), c(modphi(
        <xref ref-type="bibr" rid="ref16">16</xref>
        )), (
        <xref ref-type="bibr" rid="ref16">16</xref>
        ) = 8,
c(mod8), 9 = 1(mod8), c = 1(mod8), Finding, last, two, digits, nine,
continuing, learning, modular, arithmetic, confused, this, question, Find,
last, two, digits, nine, phi, function, used, this, problem, far, this,
In, order, know, need, know, In, order, know, need, know, In, order,
know, need, know, need, find, like, made, along, way, lot, back, order,
value, last, two, anyone, help, this, number, theory, modular, arithmetic,
number-theory, modular-arithmetic]
A.22† [d; d + 1; d + 2::: = N , N , d; d + 1; d + 2:::, N = 30, Ans = 3, d1 =
4; d2 = 6; d3 = 8, d1 : 4 + 5 + 6 + 7 + 8 = 30, (P(d + n) P(d 1)),
Find, number, satisfies, challenge, cat, trip, walk, way, sum, given, many,
ways, Example, Edit, way, see, many, subsets, way, sequences, series,
number, theory, sequences-and-series, number-theory]
A.23 [27 38 52 711, 23 34 5, x = 23 34 5x, find, product, two, integers,
greatest, common, divisor, least, common, multiple, this, question, help,
solve, tried, assuming, Gcd, product, factors, prime, numbers,
greatest, common, divisor, least, common, multiple, prime-numbers,
greatest-common-divisor, least-common-multiple]
A.24 [p2i 1?, p2i 1?, 2i 1 = (a+bi)2, a2 +2abi b2, a2 b2 = 1, 2ab =
2, a2 = b 2, b 2 b2 = 1, b4 + 1 = 1, b4 = 2, b = p42, p2i 1, Is,
this, only, way, evaluate, work, solve, way, algebra, precalculus,
algebra-precalculus]
A.29
A.30
A.31
A.25† [P (x2 + 1) = (P (x))2 + 1, P (0), P (x2 + 1) = (P (x))2 + 1, P (x),
P (0) = 0, P (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) = 1, P (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) = 2, P (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) = 5, P (
        <xref ref-type="bibr" rid="ref26">26</xref>
        ) = 26, P (677) = 677,
P (x) = x, y = P (x), y = x, P (0) = 2, P (x) = (x2 + 1)2 + 1,
P (0) = 3, limx!1 logx P (x), P (x), P (0), P (x), polynomial,
polynomial, Let, points, log, does, converge, So, values, makes, polynomial,
polynomials]
A.26 [R01 sinx x dx:, R0x sixnx dx, solve, indefinite, integral, using, Taylor, series,
trying, show, integral, convergent, sin, My, first, taylor, series, sin, next,
step, real, analysis, calculus, integration, taylor, expansion, riemann,
integration, real-analysis, taylor-expansion, riemann-integration]
A.27 [e3i =2, e i = 1, e3 i=2, (e i)3=2, (p 1)3 = i3 = i, p(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )3 = p 1 =
i, value, solving, value, know, confused, right, answer, evaluate, two,
possible, answers, this, one, correct, answer, going, complex, numbers,
exponentiation, complex-numbers]
A.28 [sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) = a+cpb , a + b + c, sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) = a+cpb , a + b + c, sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ), xz ,
x = a + pb; z = c, y = qc2 (a + pb)2, cos(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) = yz = pc2 (ca+pb)2 ,
b = (c sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) a)2 = c2 sin2(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) 2ac sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) + a2, sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) = 1+4 p5 ,
A; B; C, A sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        )2 + B sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) + C = 0, sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ), Ax2 + Bx + C, a =
      </p>
      <p>
        B; b = B2 4AC; c = 2A, sin(
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) = ( 1 + p5)=4, sin, sin, form,
sin, right, triangle, sides, front, corner, angle, degrees, hypotenuse, find,
cos, found, sin, sin, sin, solution, says, sin, intuition, find, sin, sin, sin,
root, Totally, This, question, prove, part, solution, algebra, precalculus,
trigonometry, euclidean, geometry, contest, math, algebra-precalculus,
euclidean-geometry, contest-math]
[3 + 2i, 4i, i = p 1, 15i = 0, 15 p11 = 0 0 = 0;, 51x = 0,
5 x = 0 0 = 0:, Dividing, Complex, Numbers, Infinity, My,
PreCalcu1 1
lus, properties, limits, before, test, stated, real, number, divided, infinity,
equals, This, whether, complex, number, divided, infinity, equal, This,
completely, was, theoretical, calculation, knowing, calculated, complex,
number, since, using, properties, utilized, real, numbers, state, since,
Is, this, theoretical, calculation, correct, concept, this, algebra,
precalculus, limits, complex, numbers, infinity, algebra-precalculus,
complex-numbers]
[a3 +b3 +c3 3abc, a3 +b3 +c3 3abc, (a) 1, (b) 0, (c) 1, (d) 2, a; b, ex,
Find, binomial, theorem, Find, help, solve, this, added, It, expansion,
know, use, binomial, theorem, binomial-theorem]
      </p>
      <p>
A.32 [Empty(x) () 6 9y(y 2 x), Empty(x) () 6 9y(y 2 x), definitions,
axioms, very, elementary, definition, first, order, logical, example, say,
Define, Is, definition, axiom, call, definitional, this, one, place, predicate,
symbol, Empty, new, among, listed, primitives, say, Zermelo, has, only,
identity, membership, primitive, So, stating, definitions, effect, stating,
axioms, characterizing, primitive, definitional, axioms, complete,
reference, specified, set, symbols, Is, correct, case, why, call, axiom, state,
mean, why, say, example, Definitional, axiom, terminology, definition,
first, order, logic, axioms, first-order-logic]
A.33 [f (t; x), @@3t3f , @@3xf3 , Physical, meaning, significance, third, derivative,
function, Given, physical, quantity, represented, function, meaning, third,
derivative, physics]
A.34† [a; b &gt; 0, a " b, a ### b, a b, a #n b, n 3, a ## b = a + 1, b + 1,
a "n b, n 0, a ## b = b + 1, a # b = a + b 1, a &gt; b, a # b, a ### b,
a &gt; b, a ## b = b + 1 + ab ;, a ### b, Extending, Knuth, non,
positive, values, up-arrow/hyperoperations, non-positive, So, idea, extend,
Knuth, arrow, notation, included, zero, negative, It, normally, defined,
basically, hyperoperation, sequence, only, try, go, backwards, trivial,
extension, letting, arrows, represent, negative, arrows, why, heck,
coming, expression, does, Alternatively, way, defining, zero, arrows, does,
So, question, Is, extension, Knuth, arrow, notation, exists, Edit, this,
question, initially, was, correct, So, example, extension, modified,
question, An, extension, define, satisfies, recursive, definition, Edit, turns,
correct, this, imply, example, close, need, exception, case, does, show,
evaluating, need, defined, try, extend, instance, abuse, case, allowed, let,
finding, intractable, result, up-arrow, up-arrow, hyperoperation,
ackermann, function, ackermann-function]
A.35 [R ex2 dx, R e2xdx, does, function, antiderivative, know, this, question,
sound, why, ca, write, does, antiderivative, In, light, this, question,
sufficient, conditions, function, need, careful, examination, function, say,
does, antiderivative, way, see, function, right, say, does, antiderivative,
integration]
A.36 [:P , true, :P ! A1 ! ::: ! An ! P , :P ! P () :(:P )_P ()
P , :P , :P , :P , Proof, contradiction, status, initial, assumption, proof,
complete, First, like, say, looked, answers, specific, question, found,
existing, question, Say, need, prove, statement, method, Assuming, holds,
using, list, statements, proven, hold, derived, proof, arrive, list, proven,
statements, was, proofs, contain, lines, contradiction, proves, initial,
assumption, was, holds, initial, assumption, proven, FALSE, why, sure,
derived, holds, particular, holds, On, hand, derived, assumption, explain,
why, this, type, argument, used, logic, proof, writing, proof-writing]
A.38
A.39
A.40
A.41
[f g = g f , f 1 6= g, f 6= id 6= g, f g = g f ?, f (x) = 2x, g(x) = 3x,
f : R r f1g ! R, f (x) = 2x=(1 x), g(x) = f 1 g f = 2 +g(g1(21x2xxx) ) ;, g0,
gn+1 7! f 1 gn f , g0, Non, trivial, examples, real, valued, functions,
inverses, identity, linear, Examples, trivial, this, given, function, one,
go, example, function, commutes, this, similar, fixed, point, iteration,
defined, need, function, function, finding, fixed, point, possible, strategy,
ca, find, real-valued, real, analysis, functional, analysis, functions,
real-analysis, functional-analysis]
[$a, b$, $q, r : a = bq + r$, Choose, $q : qb \le a$, Uses, Axiom, Choice,
first, year, maths, student, drift, years, ZFC, axioms, first, time, college,
stuff, far, nowhere, near, ZFC, terms, happened, use, axiom, choice,
time, module, even, know, name, example, proof, non, negative, integers,
exist, integers, known, restrictions, proof, like, this, largest, integer, Is,
axiom, choice, allows, this, simple, important, step, couple, questions,
name, simple, proofs, theorems, results, etc, axiom, choice, essential,
read, has, long, topic, dispute, mathematicians, even, people, accept,
alternative, axiomatic, systems, work, equally, well, without, needing,
first-year, non-negative, set, theory, set-theory]
[$2018^{2019}$, $2019^{2018}$, $2019 \log(2018)$, $2018 \log(2019)$, $\log 2019 &gt; \log 2018$,
$2019^{2018}$, know, value, logs, sides, log, log, know, log, log, does,
this, mean, one, algebra, precalculus, logarithms, algebra-precalculus]
[$a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n =$, $f(x+y) = f(x) + f(y)$, $f(cx) = cf(x)$,
meaning, term, linear, called, linear, equation, represents, equation, line,
dimensional, So, linear, comes, word, line, higher, power, graph,
function, straight, called, linear, differential, equation, derivatives, power,
equal, similar, above, definition, linear, function, called, linear, In, this,
definition, linearity, function, does, word, linear, means, does, relate,
straight, line, Finally, does, term, linear, means, case, linear, vector,
spaces, reference, straight, line, So, whether, linear, word, used,
different, contexts, Does, different, meaning, different, situation, Or, linearity,
refers, relation, straight, line, At, Least, explain, linearity, function,
linear, vector, space, relate, equation, line, linear, algebra, linear-algebra]
[$n \ge m$, $\sum_{r=1}^{n} (-1)^{n-r} \binom{n}{r} r^m$, Confusion, find, number, onto,
functions, two, sets, given, In, book, given, two, finite, sets, containing,
elements, number, onto, functions, well, ca, combinations, combinatorics,
functions, combinations]
[$\{0, 1, 2, 3, \dots\}$, $f = 0$, $\int_0^1 f^2(x)\,dx = 0$, $&gt; 0$, $M = \sup_{x \in [0,1]} |f(x)|$, $[0, 1]$,
$\int_0^1 x^k f(x)\,dx = 0$, $k \in \{0, 1, 2, 3, \dots\}$, $f = 0$, $[0, 1]$, $\int_0^1 x^k f(x)\,dx = 0$,
$k \in \{0, 1, 2, 3, \dots\}$, $f = 0$, $f = 0$ a.e., Lebesgue, integrable, function,
satisfies, prove, indefinitely, differentiable, real, valued, function, satisfies,
prove, My, prove, this, assertion, prove, approximation, theorem, find,
polynomial, sup, It, need, condition, indefinitely, differentiable,
continuous, hold, My, question, Lebesgue, integrable, real, valued, function,
satisfies, prove, except, set, measure, TRUE, Riemann, integrable, real,
valued, function, satisfies, prove, except, set, measure, EDIT, prove,
prove, fourier, coefficient, equal, cos, cos, proof, complete, real, analysis,
real-analysis]
A.48
A.49
A.50
A.51
[$p$, $0 &lt; r &lt; p-1$, $q$, $rq \equiv 1 \bmod p$, $0 &lt; r &lt; p-1$, $rq \equiv 1 \bmod p$,
$rq - 1 = kp$, $qr - kp = 1$, $\sum_{r=1}^{p-2} r = \frac{(p-2)(p-1)}{2} = \frac{p^2 - 3p}{2} + 1 \equiv 1 \bmod p$,
$(p-1)! \equiv -1 \bmod p$, $(p-1)! + 2 \equiv 1 \bmod p$, $(p+1)$, $(p+1)(p-1)! \equiv -(p+1) \bmod p$,
$(p+1)(p-1)! \equiv -1 \bmod p$, Prove, given, prime,
exists, mod, Prove, given, prime, exists, mod, only, taken, one, number,
theory, course, years, this, popped, computer, science, class, was,
assuming, this, proof, elementary, since, current, class, algorithm, cours,
basic, tried, look, couple, approaches, reverse, engineer, arrive,
conclusion, need, little, manipulation, looks, ca, see, sum, mod, looks, good,
know, final, Wilson, Lagrange, vaguely, this, theorem, was, looking, old,
book, was, see, arrived, prime, mod, multiplier, built, factorial,
expression, was, adding, side, mod, dead, end, sure, was, multiplying, mod,
results, mod, multiple, sure, valid, elementary, number, theory, proof,
verification, elementary-number-theory, proof-verification]
[$x, y \ge 0$, $(x+y)^k \ge x^k + y^k$, $k \ge 1$, $x, y \ge 0$, $(x+y)^k \ge x^k + y^k$, $k \in \mathbb{R}_{\ge 1}$,
$x \le x+y$, $x^k \le (x+y)^k$, $k \ge 1$, $y^k \le (x+y)^k$, $x^k + y^k \le 2(x+y)^k$, $x, y \ge 0$,
$f(k) = (x+y)^k$, $g(k) = x^k + y^k$, $(x+y)^r \ge x^r + y^r$, $r \in \mathbb{Z}^+$, $k = 1$,
$(x+y)^r &gt; x^r + y^r$, $r \ge 2$, $m \notin \mathbb{Z}$, $f(m) = g(m)$, $k = 1$, $(x+y)^r &gt; x^r + y^r$,
$r \ge 2$, $k = 1$, showing, trying, show, So, far, tried, things, since, sides,
inequality, positive, inequality, hold, Adding, inequalities, this, close,
course, Alternatively, was, fix, let, let, show, using, binomial, theorem,
show, intersect, only, better, proving, statement, since, positive, integers,
real, number, property, exist, since, violate, only, continuity, positive,
integers, course, this, dependent, showing, intersect, only, running, low,
real, analysis, functions, inequality, real-analysis]
[$\binom{2n}{n} = \sum_{k=0}^{n} \binom{n}{k}^2$, $(1+x)^{2n} = [(1+x)^n]^2$, $x^n$, $\binom{2n}{n}$, $\sum_{k=0}^{n} \binom{n}{k}^2$,
Is, simple, combinatoric, interpretation, this, identity, across, exercise,
prove, identity, exercise, It, use, prove, identity, expressions, identity,
coefficients, expansions, expressions, course, number, was, whether,
equivalent, counting, interpretation, It, clear, number, ways, half, elements,
set, this, possible, interpret, equivalently, equivalent-counting,
combinatorics, binomial, coefficients, binomial-coefficients]
[$\sum \frac{1}{n(2+\cos n)}$, $\sum \frac{1}{n(2+\cos n)}$, $\epsilon$, $A$, $\forall n \in A, |2 + \cos n| \le 1 + \epsilon$, $(n \in A \iff -1 \le \cos n \le -1 + \epsilon)$, $A$, Divergent, series, cos, Show, cos, divergent, My,
main, problem, infinitely, small, positive, real, number, define, set, cos,
cos, divergence, come, sum, idea, handle, this, real, analysis, integration,
sequences, series, analysis, real-analysis, sequences-and-series]
[$\sum_{r=0}^{n} \binom{n+r}{r} \frac{1}{2^r} = 2^n$, $x^n$, Sum, series, binomial, coefficients, Prove, try,
coefficient, solve, binomial, coefficients, binomial-coefficients]
A.52 [$\forall n \in \mathbb{N}$, $\exists m \in \mathbb{N}$, $m &gt; n$, $m$, $n_1, n_2, \dots, n_k \in \mathbb{N}$, $n = n_1 n_2 \cdots n_k + 1$,
$n_1, n_2, \dots, n_k$, $a, b \in \mathbb{N}$, $\mathbb{Z}^+$, $a = qb + r$, $0 \le r &lt; q$, $k \in \mathbb{N}$, $n_1, n_2, \dots, n_k \ge 1$,
$\forall i$, $n_i \nmid n = n_1 n_2 \cdots n_k + 1$, $\exists n \in \mathbb{N}$, $\forall m \in \mathbb{N}$, $m \le n$, Prove, prime, two,
parts, Prove, least, divisible, numbers, Prove, truth, negation, leads, Use,
theorem, exist, unique, quotient, remainder, part, given, show, set, sure,
go, part, know, negation, prime, sure, proof, theorem, proof, writing,
prime, numbers, proof-writing, prime-numbers]
A.53 [$g \in G$, $gh = 1$, $hg = 1$, $AB = 1 \Rightarrow BA = 1$?, $AB = 1 \Rightarrow BA = 1$,
Show, one, sided, inverse, square, matrix, TRUE, inverse, one-sided,
know, group, element, does, mean, In, case, matrices, linear, maps,
vector, spaces, TRUE, This, happens, square, matrices, case, even, form,
group, multiplication, restrict, square, matrices, simple, proof, this,
avoids, chasing, entries, makes, use, simply, vector, space, structure,
linear, transformations, In, prove, this, this, imply, group, one, sided,
two, sided, inverses, has, infinite, since, finite, group, finite, dimensional,
representation, one-sided(but, two-sided), linear, algebra, group, theory,
linear-algebra, group-theory]
A.54 [$P(\mathbb{N}) = \{S \mid S \subseteq \mathbb{N}\}$, $P(\mathbb{N}) = \{S \mid S \subseteq \mathbb{N}\}$, using, diagonal,
argument, show, uncountable, tips, solutions, this, one, using, diagonal,
argument, show, uncountable, discrete, mathematics, elementary, set,
theory, discrete-mathematics, elementary-set-theory]
A.55 [$\frac{1}{\sqrt{-1}} = \sqrt{-1}$, $\sqrt{-1}$, $(\sqrt{-1})^2 = -1$, $\frac{1}{\sqrt{-1}} = \sqrt{\frac{1}{-1}} = \sqrt{-1}$, calculation,
set, new, number, property, write, know, this, correct, result, missing,
complex, numbers, definition, complex-numbers]
A.56 [$\exists p\,(p \text{ is prime}) \to \forall x\,(x \text{ is prime})$, curious, logical, formula, involving,
prime, numbers, Let, set, natural, Is, formula, TRUE, FALSE, know,
answer, this, question, shortest, way, arrive, conclusion, using,
deduction, system, logic, first, order, logic, first-order-logic]
A.57y [$\mathbb{R}^n$, $f : B \to \mathbb{R}^m$, $f^{-1}$, $f(B)$, $f(X)$, $\mathbb{R}^n$, $f : X \to \mathbb{R}^m$, $f^{-1} : f(X) \mapsto X$,
$f(X)$, Preimage, continuous, one, one, function, connected, domain,
continuous, one-to-one, know, given, compact, subset, continuous, injective,
one, one, function, continuous, This, TRUE, know, image, connected,
subset, connected, continuous, let, connected, non, compact, subset,
continuous, injective, one, one, trying, counterexample, mapping,
continuous, parametrized, example, advance, (one-to-one), (non-compact),
(one-to-one), real, analysis, general, topology, analysis, continuity,
metric, spaces, real-analysis, general-topology, metric-spaces]
A.58 [$3\arcsin\frac{1}{4} + \arccos\frac{11}{16} = \frac{\pi}{2}$, $3\arcsin\frac{1}{4} + \arccos\frac{11}{16} = \frac{\pi}{2}$, Prove, arcsin,
help, this, exercise, know, prove, answer, fully, Exercise, Prove, arcsin,
trigonometry]
A.59 [$\sum_{d \mid n} \varphi(d) = n$, $\varphi(n)$, $\sum_{d \mid n} \varphi(d) = n$, $n = \prod_{k=1}^{m} p_k^{\alpha_k}$, $d = \prod_{k=1}^{m} p_k^{\beta_k}$,
$0 \le \beta_k \le \alpha_k$, Multiple, proofs, looking, multiple, proofs, statement, denotes,
Euler, totient, one, unique, factorisation, theorem, align, group,
theory, number, theory, alternative, proof, big, list, group-theory,
number-theory, alternative-proof, big-list]
A.60 [$a_n = \left(1 - \frac{1}{\sqrt{2}}\right) \cdots \left(1 - \frac{1}{\sqrt{n+1}}\right)$, $n \ge 1$, $\lim_{n \to \infty} a_n$, p1, $a_n$, $(0.293)(0.423)(0.5)(0.553)(0.622)(0.647)(0.667)\cdots$, (D), $a_n$, Limiting,
value, sequence, tends, infinity, Let, equals, does, exist, equals, equals,
My, Approach, particular, direction, procedure, find, value, tends, So,
tried, like, this, simple, way, substitute, values, trying, find, limiting,
value, So, value, tending, option, tried, like, this, find, value, converges,
tends, infinity, help, procedure, solve, this, question, calculus, sequences,
series, limits, products, sequences-and-series]
A.61 [$i, j \in \mathbb{N}$, $n = 3i + 5j$, $n \ge 8$, $i, j \in \mathbb{N}$, $n = 3i + 5j$, $n \ge 8$,
$n = 8 \implies 8 = 3 \cdot 1 + 5 \cdot 1$, $n = 9 \implies 9 = 3 \cdot 3 + 5 \cdot 0$, $n = 10 \implies 10 = 3 \cdot 0 + 5 \cdot 2$, $n = h$,
$n = h + 1$, $k + 1 = 3i + 5j$, exists, Prove, exists, hard, time, this, exercise,
trying, prove, induction, Basis, step, Induction, step, TRUE, TRUE,
So, know, proving, elementary, number, theory, discrete, mathematics,
induction, diophantine, equations, elementary-number-theory,
discrete-mathematics, diophantine-equations]
A.62 [$\mathbb{Q}$, $\mathbb{Z}$, $|\mathbb{Q}| = |\mathbb{Z}|$, $\mathbb{Q}$, $\mathbb{Z} \times \mathbb{Z}$, $|\mathbb{Z} \times \mathbb{Z}| = |\mathbb{Z}|$, Prove, cardinality, set,
rational, numbers, set, integers, equal, learned, cardinality, discrete, class,
days, this, This, confusing, entirely, sure, even, full, question, Let,
denote, set, rational, numbers, denote, set, Prove, saying, element,
element, know, prove, bijection, even, prove, help, discrete, mathematics,
discrete-mathematics]
A.63 [gcd, lcm, 2, $n_1, n_2$, $\mathrm{lcm}(n_1, n_2) = \frac{n_1 n_2}{\gcd(n_1, n_2)}$, $n_1, n_2, n_3, \dots, n_r$, gcd,
positive, integers, two, positive, integers, relationship, greatest,
common, divisor, least, common, multiple, given, gcd, set, positive,
integers, does, relationship, hold, Is, TRUE, like, this, prove, handle, proof,
explanation, greatest, common, divisor, least, common, multiple,
proof-explanation, greatest-common-divisor, least-common-multiple]
A.64y [$f : [a, b] \to \mathbb{R}$, $f([a, b]) \subseteq [a, b]$, $c \in [a, b]$, $f(c) = c$, $f(a) = a$, $f(b) = b$,
$f(a) = a$, $f(b) = b$, $[a, b]$, $f(a) &gt; a$, $f(b) &lt; b$, $f(a) &gt; a$, $f(b) &lt; b$,
$x, y \in [a, b]$, $f(a) = x$, $f(b) = y$, $f[a, b] = [x, y]$, $[x, y] \subseteq [a, b]$, $[x, y]$, $c \in [a, b]$,
$f(c) = c$, continuous, Prove, exists, point, satisfying, continuous,
Prove, exists, point, satisfying, left, show, well, assume, Since, values,
this, assuming, So, far, Assume, Let, means, Notice, Since, continuous,
exists, equal, equal, This, assume, Intermediate, Value, Theorem, real,
analysis, continuity, proof, explanation, real-analysis, proof-explanation]
A.65 [e 2 t 2 e21t2, $t \ge 0$, e 2 t 2 e21t2 1, $t \ge 0$, ln, (1), t et 12,
$x \le e^x - 1$, $x \ge 0$, show, show, Applying, sides, yields, equivalent,
So, show, this, calculus, inequality, exponential, function,
exponential-function]
A.66 [$x, h \in \mathbb{R}^d$, $A \in \mathbb{R}^{d \times d}$, $(x^T A h)^T = h^T A^T x$, $x, h \in \mathbb{R}^d$, $A \in \mathbb{R}^{d \times d}$,
$(x^T A h)^T = h^T A^T x$, possible, possible, linear, algebra, transpose,
linear-algebra]
A.67 [$k \times k$, $k \times l$, $l \times l$, Combination, matrixes, matrix, matrix, matrix, prove,
matrix, elements, equal, know, rules, calculating, determinants, know,
this, question, calculus, determinant]
A.68 [$a^n + 1$, $a + 1$, $n$, $a^n + 1$, $a + 1$, $1$, $n \in \mathbb{N}$, $2k + 1$,
$a^{2(k+1)+1} + 1 = a^{2k+3} + 1 = a^3 \cdot a^{2k} + 1 = (a^3 + 1)a^{2k} - a^{2k} + 1$, $a^{2k}$, $a^n + 1$, $a + 1$,
Prove, divisible, odd, Prove, divisible, odd, know, Since, odd, rewrite,
assume, holds, prove, holds, next, sure, Since, means, exponential, term,
even, cant, use, divisible, odd, polynomials, induction, divisibility]
A.69 [$\binom{s}{s} + \binom{s+1}{s} + \cdots + \binom{n}{s} = \binom{n+1}{s+1}$, $n \ge s$, $0 \le s \le n$, Induction, two,
variable, parameters, So, was, assigned, this, problem, tried, professor,
explanations, My, professor, saying, statement, need, prove, formula,
correct, need, use, induction, variables, sure, help, combinatorics]
A.70y [$\sum_{j=0}^{N-1} \cos\frac{(2j+1)\pi}{2N} = 0$, $l \in \mathbb{Z}$, $N \in \mathbb{N}$, $\sum_{j=0}^{N-1} \cos\frac{l(2j+1)\pi}{2N} = 0$,
Proving, cos, Let, need, prove, cos, tried, use, Euler, formula, sum, first,
terms, geometric, serie, ideas, trigonometry, summation]
A.71y [$1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$, $1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$,
$LHS = 1^2$, $RHS = \frac{(1+1)(2+1)}{6} = \frac{2 \cdot 3}{6} = 1$, $LHS_p = 1^2 + 2^2 + \cdots + p^2$,
$RHS_p = \frac{p(p+1)(2p+1)}{6}$, $LHS_{p+1} = 1^2 + 2^2 + \cdots + p^2 + (p+1)^2$,
$RHS_{p+1} = \frac{(p+1)((p+1)+1)(2(p+1)+1)}{6}$, $RHS_{p+1} = RHS_p + (p+1)^2$,
$RHS_{p+1} = \frac{p(p+1)(2p+1)}{6} + (p+1)^2$, $RHS_{p+1} = \frac{(p+1)((p+1)+1)(2(p+1)+1)}{6} = \frac{p(p+1)(2p+1)}{6} + (p+1)^2$,
Show, induction, Show, induction, My, Case,
Case, Case, show, this, induction, need, show, So, need, rewrite, equal,
Anyone, see, Or, solution, induction]
A.72 [$X = Y$, $X \in Y$, $X = Y$, $X \in Y$, Is, possible, Is, possible, set, equal, set,
set, element, set, set, theory, axioms, set-theory]
A.73y [$\binom{n}{0}^2 + \binom{n}{1}^2 + \cdots + \binom{n}{n}^2 = \binom{2n}{n}$, $\binom{n}{0}^2 + \binom{n}{1}^2 + \cdots + \binom{n}{n}^2 = \binom{2n}{n}$,
$(1+x)^n(1+x)^n = (1+x)^{2n}$, $x^n$, $\binom{2n}{n}$, $\binom{n}{0} + \binom{n}{1}x + \cdots + \binom{n}{r}x^r + \cdots + \binom{n}{n}x^n$,
$x^n$, $x^i$, $x^{n-i}$, $x^n$, $\binom{n}{n-r} = \binom{n}{r}$, $\binom{n}{0}^2 + \binom{n}{1}^2 + \cdots + \binom{n}{n}^2 = \binom{2n}{n}$,
$x^n$, $x^n$, $x^n$, Help, proof, proof, required,
made, binomial, expose, was, forward, questions, exposing, see,
question, marks, like, this, one, points, quite, numeration, This, Prove, use,
equality, call, result, proved, finding, coefficient, terms, this, equality,
binomial, theorem, left, hand, side, this, equation, product, two, factors,
equal, factors, multiply, term, term, first, factor, has, term, second,
factor, has, coefficients, Since, summation, equal, So, left, hand, side,
equation, coefficient, expand, right, hand, side, equation, find, coefficient,
left, hand, side, equation, prove, equal, In, This, was, My, one, heck,
does, this, equation, come, know, equation, come, requested, prove,
different, equality, two, why, solution, first, equation, finding, coefficients,
one, made, see, one, three, why, coefficient, left, hand, side, equation,
made, coefficient, right, hand, side, this, equation, well, prove, original,
equation, one, was, prove, first, place, know, many, guys, help, long,
post, long, (?-n), (?-1), (?-2), left-hand, right-hand, (?-3), left-hand,
(?-1), (?-2), (?-3), right-hand, discrete, mathematics, binomial,
coefficients, binomial, theorem, discrete-mathematics, binomial-coefficients,
binomial-theorem]
A.79
      </p>
      <p>
A.74 [$f : (0, \infty) \to \mathbb{R}$, $f(x) = x + \frac{1}{x}$, $[2, \infty)$, $f : (0, \infty) \to \mathbb{R}$, $f(x) = x + \frac{1}{x}$,
$[2, \infty)$, $x = 1$, $f(1) = 2$, $[2, \infty)$, Show, image, function, interval,
Show, image, function, interval, So, show, function, interval, functions,
elementary, set, theory, elementary-set-theory]
A.75 [$m$, $\lim_{u \to \infty} \frac{u^m}{e^u} = 0$, $\lim_{u \to \infty} \frac{u^m}{e^u} = 0$, $e^u &gt; \frac{u^{m+1}}{(m+1)!}$, Prove, integer,
show, integer, Looking, solutions, sure, this, logical, step, real, analysis,
calculus, limits, real-analysis]
A.76y [$\mathbb{Z}$, $a + b\mathbb{Z} = \{z \in \mathbb{Z} \mid z = a + bk \text{ for some } k \in \mathbb{Z}\}$, $a \in \mathbb{Z}$, $b \in \mathbb{Z} \setminus \{0\}$,
$\{a_i + b_i\mathbb{Z} \mid i \in \mathbb{N}\}$, $\bigcup_{i \in \mathbb{N}} (a_i + b_i\mathbb{Z}) = \mathbb{Z}$, $I \subseteq \mathbb{N}$, $\bigcup_{i \in I} (a_i + b_i\mathbb{Z}) = \mathbb{Z}$,
$\{p_k\} = \{2, 3, 5, \dots\}$, $\ell_1, \ell_2$, $\ell_1 = \ell_2 = 5$, $\{0 + p_{k_j}\mathbb{Z}\}$, $j = 1, \dots, n$,
$p &gt; \max\{p_{k_1}, \dots, p_{k_n}\}$, $p \notin \bigcup_{1 \le j \le n} (0 + p_{k_j}\mathbb{Z})$, $p \notin (-1 + 5\mathbb{Z}) \cup (1 + 5\mathbb{Z})$,
$5 \nmid p - 1$, $5 \nmid p + 1$, Covering, arithmetic, progressions, solving, problems,
old, exam, topology, translated, problem, algebraic, terms, problem, Let,
collection, sets, satisfying, Show, whether, always, possible, extract,
finite, lot, elementary, algebra, Let, set, construct, non, negative, integers,
instance, pick, finite, sub, collection, assume, prime, max, run, This,
course, possible, prime, last, this, approach, means, prove, infinitely,
many, primes, ending, like, thing, prove, simple, problem, like, Surely,
way, solving, this, EDIT, interested, solution, topology, whether,
solution, like, solution, works, non-negative, sub-collection, general,
topology, elementary, number, theory, general-topology,
elementary-number-theory]
A.77 [$(-1)(-1) = 1$, $(-1)(-1) = 1$, $1 \cdot 1 = 1$, $1 \cdot x = x$, Show, relation,
consequence, distributive, law, Show, relation, consequence, distributive,
This, question, first, problem, Theory, Andre, point, tried, using, help,
work, elementary, number, theory, elementary-number-theory]
A.78 training data
      </p>
      <p>A.80 [$\left|\frac{e^{-ixu} - 1}{u}\right| \le |x|$, $u \neq 0$, Inequality, complex, exponential, Rudin,
Real, Complex, Analysis, uses, this, proof, near, chapter, real, Why, this,
TRUE, Edit, real, inequality, exponential, function, fourier, transform,
exponential-function, fourier-transform]
[$\mathbb{N}$, $\emptyset, \{1\}, \{2\}, \{1, 2\}, \{3\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}, \{4\}, \dots$, $\{1\}$, $\{2\}$,
$[n] = \{1, 2, \dots, n\}$, $[n]$, $2^n$, Why, does, this, proof, set, finite, subsets,
countable, set, work, set, subsets, found, this, proof, thread, found, simple,
answers, sort, formula, like, trying, way, see, set, finite, subsets,
countable, probably, list, elements, set, one, coming, first, next, shows, set,
pattern, see, least, In, words, first, comes, comes, time, new, integer, list,
subsets, contain, ones, contain, showed, subsets, show, first, elements,
this, applies, finite, subsets, cant, why, apply, set, subsets, continue, this,
scheme, assume, way, infinity, quite, help, analysis, elementary, set,
theory, proof, explanation, elementary-set-theory, proof-explanation]
A.81y [$n \in \mathbb{N}$, $\{1, \dots, n\}$, $M \neq \emptyset$, $x \in M$, $f(1)$, $f(1) := x$, $\{1\}$, $\{1, \dots, n\}$,
$M \setminus \{f(1), \dots, f(n)\}$, $x \in M \setminus \{f(1), \dots, f(n)\}$, $g(1), \dots, g(n+1)$,
$g(1) := f(1), \dots, g(n) := f(n)$, $g(n+1) := x$, $\{1, \dots, n+1\}$, $n_1$, $h(n_1)$,
$\{1, \dots, n_1\}$, $g(n_1)$, $h(n_1)$, $(1, g(1)), \dots, (n_1, g(n_1))$, $h(n_2)$, $n_2 \le n_1$,
$g(n_2)$, $g(n_2)$, $h(n_2)$, $h(n_3)$, $n_3 &gt; n_1$, $(n_1 + 1, g(n_1 + 1)), \dots, (n_3, g(n_3))$,
$g(n_3)$, $h(n_3)$, $h(n)$, $n \in \mathbb{N}$, $h : \mathbb{N} \to M$, $h : \mathbb{N} \to M$, $h : \mathbb{N} \to M$,
infinite, set, contains, countable, Why, proof, Axiom, Choice, Let, infinite,
Proposition, exists, injection, Since, exists, Define, injection, exists,
injection, Since, infinite, set, empty, So, exists, Define, injection, Let,
arbitrary, natural, calculate, injection, Proposition, return, value, pairs,
calculate, search, database, value, database, return, value, calculate, pairs,
database, Proposition, return, value, calculate, above, injection, Why,
proof, man, know, injection, man, know, injection, elementary, set,
theory, axiom, choice, elementary-set-theory, axiom-of-choice]
      </p>
      <p>
A.82y [$f(x)$, $[0, 2\pi]$, $f(x)$, $[0, 2\pi]$, $A = \int_0^{2\pi} f(x)\,dx$, $g(x)$, $f(x) = g(x)\cos(x)$,
$f(x)$, $A = \int_0^{2\pi} g(x)\cos(x)\,dx$, $t = \sin(x)$, $dt = \cos(x)\,dx$, $x = \arcsin(t)$,
$t = \sin(x)$, $\sin(0) = 0$, $2\pi$, $\sin(2\pi) = 0$, $A = \int_0^0 g(\arcsin(t))\,dt$, $A = 0$,
definite, integrals, evaluate, using, periodic, functions, know, reasoning,
know, went, discuss, this, Maths, even, find, Let, assuming, function,
continuous, has, antiderivative, interval, Let, area, curve, interval,
exist, function, cos, Substituting, value, cos, Using, substitution, Let, sin,
cos, arcsin, Changing, limits, sin, sin, sin, Substituting, definite,
integral, arcsin, Definite, Integral, lower, upper, bounds, So, help, calculus,
integration, trigonometry, definite, integrals, inverse, function,
definite-integrals, inverse-function]
A.83 [$a_n \le M$, Is, sequence, sums, inverse, natural, numbers, bounded,
reading, Infinite, Sequences, right, trivial, matter, determine,
boundedness, mind, sequence, before, learn, sequence, know, sequence,
bounded, above, number, calculus, sequences, series, harmonic,
numbers, sequences-and-series, harmonic-numbers]
A.84y [$4, x$, $\mathbb{Z}[x]$, $I = \langle p, x \rangle$, $\mathbb{Z}[x]$, $I = \langle p, x \rangle$, $\mathbb{Z}[x]$, $4, x$, $\mathbb{Z}[x]$, Is, ideal,
generated, principal, ideal, principal, ideal, My, question, Is, principal,
ideal, prime, ideal, generated, principal, ideal, abstract, algebra, ring,
theory, ideals, principal, ideal, domains, abstract-algebra, ring-theory,
principal-ideal-domains]
[$N$, $(+1)$, $1/2$, $1/2$, $p_n = \frac{1}{2} p_{n-1}$, $p_n$, $p_N = 1$, $p_{N-1} = 2$, $p_0 = 2^N$,
Expected, number, steps, bug, reach, position, bug, time, position, At,
step, bug, moves, right, step, probability, returns, origin, probability,
expected, number, steps, this, bug, reach, position, tried, first, find,
possibility, this, bug, reaches, number, steps, recurrence, equation, find,
possibility, bug, position, reach, boundary, condition, see, does, make, sense,
sort, value, probability, first, number, expected, steps, sure, recurrence,
equation, markov, chains, random, walk, markov-chains, random-walk]
[$\sum_{k=0}^{n} k \binom{n}{k} = O(2^n \log^3 n)$?, Is, TRUE, log, Problem, Is, TRUE, log,
My, solution, this, upper, bound, way, large, ca, find, solution,
combinatorics, elementary, number, theory, discrete, mathematics,
elementary-number-theory, discrete-mathematics]
A.87 [$\forall n \in \mathbb{N} : \left(\sum_{i=1}^{n} a_i\right)\left(\sum_{i=1}^{n} \frac{1}{a_i}\right) \ge n^2$, $a_i$, $\forall i \in \mathbb{N} : a_i \in \mathbb{R}^+$, $n = 1$,
$n = 2$, $n = 3$, $a, b \in \mathbb{R}^+$, $ab = 1$, $a + b \ge 2$, $n = 3$, $a, b, c \in \mathbb{R}^+$,
$\frac{a}{b} + \frac{b}{a} \ge 2$, $\frac{a}{c} + \frac{c}{a} \ge 2$, $\frac{c}{b} + \frac{b}{c} \ge 2$, $P(1)$, $P(n)$, Is, TRUE, positive,
TRUE, prove, this, holds, using, lemma, Lemma, Let, example, case,
proven, like, this, Let, lemma, sure, generalized, version, natural, ca,
come, counterexample, try, prove, induction, Let, Base, case, Inductive,
hypothesis, assume, Inductive, step, know, Is, this, inequality, TRUE,
prove, anyone, show, counterexample, algebra, precalculus, inequality,
algebra-precalculus]
A.88 [$x^4 + 10x^2 + 1$, $\mathbb{Z}[x]$, $x^4 + 10x^2 + 1$, $\mathbb{Z}[x]$, Is, polynomial, reducible,
Is, polynomial, reducible, abstract, algebra, ring, theory, field,
theory, irreducible, polynomials, abstract-algebra, ring-theory, field-theory,
irreducible-polynomials]
A.89 [$A^2 + B^2 = C^2 + D^2$, $A, B, C, D$, Parametrization, pythagorean, like,
equation, pythagorean-like, Is, known, complete, parametrization,
Diophantine, equation, positive, rational, numbers, equivalently, integers,
number, theory, diophantine, equations, number-theory,
diophantine-equations]
A.90 [$n \times n$, $n \times n$, $A^{-1}$, $A^{-1}A = I_n \wedge AA^{-1} = I_n$ (1), $I_n$, $n \times n$, $A^{-1}A = I_n$,
$AA^{-1} \neq I_n$, Question, definition, Inverse, matrix, definition, matrix,
inverse, matrix, property, identity, cases, way, around, making, FALSE,
statement, linear, algebra, matrices, linear-algebra]
A.91y [$\mathbb{R} \to \mathbb{R}$, $x^2$, Continuous, function, reaches, value, range, times, Is,
continuous, function, reaches, possible, values, value, range, times, example,
perfect, was, question, almost, certain, functions, knows, bunch, real,
analysis, calculus, real-analysis]
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abacha</surname>
            ,
            <given-names>A.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agichtein</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinter</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Overview of the medical question answering task at TREC 2017 LiveQA</article-title>
          . In:
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ellis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the 26th Text REtrieval Conference</source>
          , TREC
          <year>2017</year>
          .
          <source>NIST Special Publication</source>
          , vol.
          <volume>500-324</volume>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aizawa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kohlhase</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>NTCIR-10 Math Pilot task overview</article-title>
          .
          <source>In: Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-10)</source>
          . pp.
          <fpage>654</fpage>
          -
          <lpage>661</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Aizawa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kohlhase</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schubotz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          : NTCIR-11 Math-
          <article-title>2 task overview</article-title>
          .
          <source>In: Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-11)</source>
          . pp.
          <fpage>88</fpage>
          -
          <lpage>98</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Białecki</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muir</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ingersoll</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Apache Lucene 4</article-title>
          . In: SIGIR 2012 Workshop on Open Source Information Retrieval. pp.
          <fpage>17</fpage>
          -
          <lpage>24</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Carlisle</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ion</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miner</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Mathematical markup language (MathML) version 3.0 2nd edition</article-title>
          .
          <source>W3C recommendation (Apr</source>
          <year>2014</year>
          ), http://www.w3.org/TR/2014/REC-MathML3-20140410/
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Davila</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Layout and Semantics: Combining Representations for Mathematical Formula Search</article-title>
          . ACM Reference (
          <year>2017</year>
          ). https://doi.org/10.1145/3077136.3080748
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Davila</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kane</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tompa</surname>
            ,
            <given-names>F.W.</given-names>
          </string-name>
          :
          <article-title>Tangent-3 at the NTCIR-12 MathIR task</article-title>
          .
          <source>In: Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-12)</source>
          . pp.
          <fpage>338</fpage>
          -
          <lpage>345</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fraser</surname>
          </string-name>
          , D.J.:
          <article-title>Math Information Retrieval using a Text Search Engine</article-title>
          .
          <source>Master's thesis</source>
          , Cheriton School of Computer Science, University of Waterloo (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Fraser</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kane</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tompa</surname>
            ,
            <given-names>F.W.</given-names>
          </string-name>
          :
          <article-title>Choosing math features for BM25 ranking with Tangent-L</article-title>
          .
          <source>In: Proceedings of the 18th ACM Symposium on Document Engineering (DocEng</source>
          <year>2018</year>
          ). pp.
          <volume>17</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          :
          <fpage>10</fpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yuan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>The math retrieval system of ICST for NTCIR-12 mathir task</article-title>
          .
          <source>In: Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-12)</source>
          . pp.
          <fpage>318</fpage>
          -
          <lpage>322</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Guidi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sacerdoti Coen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>A survey on retrieval of mathematical knowledge</article-title>
          .
          <source>Mathematics in Computer Science</source>
          <volume>10</volume>
          (
          <issue>4</issue>
          ),
          <fpage>409</fpage>
          -
          <lpage>427</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hopkins</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le Bras</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petrescu-Prahova</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stanovsky</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hajishirzi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koncel-Kedziorski</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>SemEval-2019 task 10: Math question answering</article-title>
          .
          <source>In: Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval2019)</source>
          . pp.
          <fpage>893</fpage>
          -
          <lpage>899</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kristianto</surname>
            ,
            <given-names>G.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Topić</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aizawa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Combining effectively math expressions and textual keywords in Math IR</article-title>
          .
          <source>In: Proceedings of the 3rd International Workshop on Digitization and E-Inclusion in Mathematics and Science 2016 (DEIMS2016)</source>
          . pp.
          <fpage>25</fpage>
          -
          <lpage>32</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kristianto</surname>
            ,
            <given-names>G.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Topic</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aizawa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>MCAT math retrieval system for NTCIR-12 mathir task</article-title>
          .
          <source>In: Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-12)</source>
          . pp.
          <fpage>323</fpage>
          -
          <lpage>330</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lv</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhai</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Lower-bounding term frequency normalization</article-title>
          .
          <source>In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM'11)</source>
          . pp.
          <fpage>7</fpage>
          -
          <lpage>16</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Mansouri</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oard</surname>
            ,
            <given-names>D.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Finding old answers to new math questions: The ARQMath lab at CLEF 2020</article-title>
          . In: Advances in Information Retrieval,
          <source>Proceedings of the 42nd European Conference on IR Research (ECIR 2020). Lecture Notes in Computer Science</source>
          , vol.
          <volume>12036</volume>
          , pp.
          <fpage>564</fpage>
          -
          <lpage>571</lpage>
          . Springer (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoogeveen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Màrquez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moschitti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mubarak</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldwin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verspoor</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>SemEval-2017 task 3: Community question answering</article-title>
          .
          <source>In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)</source>
          . pp.
          <fpage>27</fpage>
          -
          <lpage>48</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Màrquez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Magdy</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moschitti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glass</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Randeree</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>SemEval-2015 task 3: Answer selection in community question answering</article-title>
          .
          <source>In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval2015)</source>
          . pp.
          <fpage>269</fpage>
          -
          <lpage>281</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Màrquez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moschitti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Magdy</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mubarak</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Freihat</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glass</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Randeree</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>SemEval-2016 task 3: Community question answering</article-title>
          .
          <source>In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval2016)</source>
          . pp.
          <fpage>525</fpage>
          -
          <lpage>545</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Olvera-Lobo</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutiérrez-Artacho</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Question answering track evaluation in TREC, CLEF and NTCIR</article-title>
          .
          <source>In: Advances in Intelligent Systems and Computing</source>
          . vol.
          <volume>353</volume>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>22</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amati</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plachouras</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
          </string-name>
          , D.:
          <article-title>Terrier information retrieval platform</article-title>
          .
          <source>In: Advances in Information Retrieval, Proceedings of the 27th European Conference on IR Research (ECIR 2005). Lecture Notes in Computer Science</source>
          , vol.
          <volume>3408</volume>
          , pp.
          <fpage>517</fpage>
          -
          <lpage>519</lpage>
          . Springer (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Pineau</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          :
          <article-title>Math-aware search engines: Physics applications and overview</article-title>
          .
          <source>CoRR abs/1609.03457</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Růžička</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sojka</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Líška</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Math indexer and searcher under the hood: Fine-tuning query expansion and unification strategies</article-title>
          .
          <source>In: Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-12)</source>
          . pp.
          <fpage>331</fpage>
          -
          <lpage>337</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Schubotz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meuschke</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leich</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gipp</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Exploring the one-brain barrier: A manual contribution to the NTCIR-12 mathir task</article-title>
          .
          <source>In: Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-12)</source>
          . pp.
          <fpage>309</fpage>
          -
          <lpage>317</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Schubotz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Youssef</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markl</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohl</surname>
            ,
            <given-names>H.S.</given-names>
          </string-name>
          :
          <article-title>Challenges of mathematical information retrieval in the NTCIR-12 math Wikipedia task</article-title>
          .
          <source>In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2015)</source>
          . pp.
          <fpage>951</fpage>
          -
          <lpage>954</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Sojka</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Líška</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The art of mathematics retrieval</article-title>
          .
          <source>In: Proceedings of the 11th ACM Symposium on Document Engineering (DocEng 2011)</source>
          . pp.
          <fpage>57</fpage>
          -
          <lpage>60</lpage>
          . ACM Press, New York, New York, USA (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Sojka</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Novotný</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ayetiran</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lupták</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Štefánik</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Quo Vadis, math information retrieval</article-title>
          .
          <source>Tech. rep.</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Thanda</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singla</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prakash</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A document retrieval system for math queries</article-title>
          .
          <source>In: Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-12)</source>
          . pp.
          <fpage>346</fpage>
          -
          <lpage>353</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aizawa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kohlhase</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Topić</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davila</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>NTCIR-12 MathIR task overview</article-title>
          .
          <source>In: Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-12)</source>
          . pp.
          <fpage>299</fpage>
          -
          <lpage>308</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davila</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kane</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tompa</surname>
            ,
            <given-names>F.W.</given-names>
          </string-name>
          :
          <article-title>Multi-stage math formula search: Using appearance-based similarity metrics at scale</article-title>
          .
          <source>In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016)</source>
          . pp.
          <fpage>145</fpage>
          -
          <lpage>154</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oard</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mansouri</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of ARQMath 2020: CLEF Lab on answer retrieval for questions on math</article-title>
          .
          <source>In: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Zhong</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rohatgi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giles</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Accelerating substructure similarity search for formula retrieval</article-title>
          .
          <source>In: Advances in Information Retrieval, Proceedings of the 42nd European Conference on IR Research (ECIR 2020)</source>
          . pp.
          <fpage>714</fpage>
          -
          <lpage>727</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>