<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Search Query Extension Semantics</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<affiliation key="aff0">
								<address>
<addrLine>Olga Ataeva [0000-0003-0367-5575], Vladimir Serebryakov [0000-0003-1423-621X], Natalia Tuchkova [0000-0001-6518-5817]</addrLine>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Dorodnicyn Computing Center FRC CSC of RAS</orgName>
								<address>
									<addrLine>Vavilov str., 40</addrLine>
<postCode>119333</postCode>
									<settlement>Moscow</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Search Query Extension Semantics</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">CA1E87DBCF255FB762D3E1DF93262C41</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T07:43+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Search Model</term>
					<term>Word2vec</term>
					<term>Synonyms</term>
					<term>Query</term>
					<term>Query Extension</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The problem of extracting the most complete information from a semantic library by taking related documents into account is considered. Expert knowledge embedded in a subject area can be made available to the user through additional information from linked documents. A feature of the approach is the use of a shallow neural network algorithm to expand search queries in mathematical subject areas, where expert knowledge is available and users typically have a significant scientific background. The problem is addressed by means of semantic analysis of the knowledge space using machine learning algorithms. The paper investigates the construction of a paragraph-based vector representation of documents applied to the data array of the digital semantic library LibMeta. Each piece of text is labelled; both the whole document and its separate parts can be labelled. Since the task was to enrich user queries with synonyms, the search model was built in conjunction with the word2vec algorithms using an "indexing first, then training" approach, so as to cover more information and give more accurate results.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Research into expanding queries for the most complete coverage of information has a long history <ref type="bibr" target="#b0">[1]</ref><ref type="bibr" target="#b1">[2]</ref><ref type="bibr" target="#b2">[3]</ref><ref type="bibr" target="#b3">[4]</ref><ref type="bibr" target="#b4">[5]</ref><ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref>. The problem itself is directly related to understanding the subject of the search, that is, to the level of competence of the user and to the ability of the information retrieval system to use expert knowledge. Ideally, query expansion and refinement functionality assumes the presence of an up-to-date knowledge base and the ability to reformulate the original query in order to improve the search result. Many approaches in this area have been developed with the advent of artificial intelligence algorithms and the corresponding programming tools <ref type="bibr" target="#b8">[9]</ref>. The first expert system to use a query refinement technique, Dendral <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>, was developed in 1965 for the analysis of chemical compounds. Another system, based on medical expertise, was MYCIN <ref type="bibr" target="#b11">[12]</ref>, presented to the scientific community in 1972. During a dialogue, MYCIN offered options for the diagnosis and further examination of the patient. Using about 500 inference rules, MYCIN performed at roughly the level of competence of blood infection specialists and better than general practitioners.</p><p>The next stage in introducing artificial intelligence into knowledge systems is due to the use of neural network algorithms <ref type="bibr" target="#b12">[13]</ref>. 
Although ideas for mathematical models based on the functioning of biological neural networks have been developing since 1943 <ref type="bibr" target="#b13">[14]</ref>, their practical implementation gained popularity only with the accumulation of digitized data, that is, in the 21st century. Some researchers regard this as a new era for artificial intelligence, which had been "partially forgotten" for a time. Search algorithms began to learn <ref type="bibr" target="#b14">[15]</ref> from accumulated queries, storing the most frequent of them together with the corresponding answers. All this increased the reaction speed of search services and supported the development of targeted suggestions and user hints.</p><p>Scientific libraries embed more complex links and structures, which is dictated by the logic of the subject areas and requires more careful processing of links to provide users with advanced query capabilities <ref type="bibr" target="#b15">[16]</ref>. One such subject area is mathematics. It is of interest to study and replenish the mathematical encyclopedia and to identify unaccounted-for semantic relationships between concepts and formulas.</p><p>This work is devoted to the use of shallow neural network algorithms <ref type="bibr" target="#b16">[17]</ref> to expand search queries in mathematical subject areas based on the LibMeta <ref type="bibr" target="#b17">[18]</ref> library, presented in the form of an ontology, and continues the authors' research in this direction <ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref><ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref>. The description of the subject area is terminologically limited to the terms of the mathematical encyclopedia <ref type="bibr" target="#b24">[25]</ref>. 
The corpus of texts consists of mathematical articles, which are partially supplied with codes of the thematic classifiers MSC (https://msc2020.org/) and UDC (https://teacode.com/online/udc/) and follow a certain structure.</p><p>The LibMeta resources include a thesaurus on ordinary differential equations (ODE) and dictionaries of special functions of the equations of mathematical physics. All dictionaries are semantically linked to the mathematical encyclopedia <ref type="bibr" target="#b24">[25]</ref>. These resources are used to analyze semantic relationships.</p><p>This paper presents a search model (part 2); outlines a technique based on algorithms for the vector representation of texts <ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref><ref type="bibr" target="#b28">[29]</ref> (part 3); shows how the search model is applied to add synonyms to a search query (part 4); and gives examples of search query extension (part 5), which demonstrate how the model improves search results, provide estimates of the completeness and accuracy of the algorithm, and show the process of ranking documents.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Search Model</head><p>The construction of the search model in LibMeta rests on three key points: (i) converting documents to a searchable format; (ii) presenting queries in a format that expresses the user's information need; (iii) assessing how well a document matches the query.</p><p>In our case, document preparation involved preprocessing the full texts to remove the publisher's markup and extract the main parts of the text. A full-text document index was then created, which makes it possible to load and store data efficiently and to access it quickly. Queries are written in natural language and can be enriched with synonyms by the system. The assessment of how well a document matches a query is subjective and depends on the method used.</p><p>One of the most commonly used models for representing documents and queries is the vector space model <ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref><ref type="bibr" target="#b28">[29]</ref>. In this model, both the query and the document are represented by vectors, and the distance between them is measured as an estimate of the degree of closeness of the document and the query.</p><p>In the vector notation, each word is associated with a weight, which can be calculated in different ways. One of the most commonly used algorithms is TF-IDF <ref type="bibr" target="#b29">[30]</ref>, whose main idea is that the more often a word appears in a single document, the more important it is there, while the more widespread a word is across the corpus of documents, the less important it is. Another common model is the probabilistic model, which is based on an estimate of the likelihood that a document is relevant to a particular query. 
One of the popular scoring algorithms in this model is Okapi BM25 <ref type="bibr" target="#b29">[30,</ref><ref type="bibr" target="#b30">31]</ref>.</p><p>The main problem of any search model is to provide results relevant to the user's information need, from query analysis to the ranking of search results. This work is devoted to ways of addressing this problem. One modern approach is to use neural networks for text processing, since text is an example of data that can be parsed into smaller structures such as paragraphs, sentences, words, etc. This approach to text processing captures the semantics of the text, since closely related words or text fragments occur in the same contexts and lie close together in the vector space. The search model used in this work is based on the vector representation of words and documents built using the word2vec <ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref><ref type="bibr" target="#b28">[29]</ref> neural network algorithm <ref type="bibr" target="#b16">[17]</ref>.</p><p>A neural network and an index can be integrated in the following ways: (i) first train on the corpus of texts, then index the texts and use both in search; (ii) first index, then train on the indexed data and use both in search; (iii) first train, then have the trained network extract or create useful resources, and then index all resources, both new and original.</p><p>Since we were solving the problem of enriching user queries with synonyms, in the LibMeta system we used the "indexing first, then training" approach, which yields more results and more accurate results based on extended queries. In addition, using the extended version of word2vec in conjunction with the LibMeta search engine makes it possible to give users smarter recommendations based on the documents found. 
This combined use of the index, the search engine, and the neural network allows for relevant models and ranking functions that adapt well to the underlying data. The version of the model built on the LibMeta search index using word2vec algorithms will hereinafter be abbreviated wsgMath.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> schematically illustrates the operation of a neural-network-based search, which receives a query string as input and returns synonyms for the query using the model built by word2vec. Alternatively, the vector representation of a document can be supplied as input; using the constructed model, the system then returns recommendations in the form of a list of similar documents.</p></div>
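The TF-IDF weighting described above can be sketched in a few lines. This is a generic, minimal illustration in Python with a toy three-document corpus, not the LibMeta implementation:

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Compute TF-IDF weights for a list of tokenized documents.

    TF is the raw count of a term in one document; IDF down-weights
    terms that occur in many documents of the corpus.
    """
    n_docs = len(corpus)
    # document frequency: in how many documents each term occurs
    df = Counter(term for doc in corpus for term in set(doc))
    weights = []
    for doc in corpus:
        tf = Counter(doc)
        weights.append({
            term: count * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

corpus = [
    ["cauchy", "problem", "equation"],
    ["boundary", "value", "problem"],
    ["cauchy", "equation"],
]
w = tf_idf(corpus)
# "problem" occurs in 2 of the 3 documents, so its IDF factor is log(3/2);
# "boundary" occurs in only 1, so its IDF factor is the larger log(3).
```

A production system would typically normalize TF and smooth IDF; the sketch keeps the bare idea stated in the text.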
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Vector Representation of Documents</head><p>Studies show <ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref><ref type="bibr" target="#b28">[29]</ref> that vector representations of text are well suited to capturing the semantics of words, but the meaning and deeper semantics of text documents depend not only on the meanings of individual words. For this purpose, the semantics of phrases and longer text fragments must also be studied.</p><p>For convenience, we will use the term "paragraph" to denote not only a paragraph as such, but also fragments of a paragraph or several phrases from the text. In our field, given the specific structure of a mathematical text, these can also be theorems, lemmas, etc.</p><p>Note that the term "important" fragment will also be used. In scientific texts this means an abstract, introduction, conclusion, theorem, etc. The term is introduced because these elements of a scientific publication will be used as defining ones for documents belonging to a certain subject area.</p><p>The research content comprises the resources of the LibMeta digital library <ref type="bibr" target="#b31">[32]</ref>, where, along with the accumulated original thesauri and dictionaries (for special functions, ordinary differential equations, and mixed-type equations of mathematical physics), a mathematical library is integrated <ref type="bibr" target="#b24">[25]</ref>.</p><p>Therefore, to construct wsgMath taking into account the context of paragraphs, we used a version of the word2vec algorithm, doc2vec <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b32">33]</ref>, which is a generalization (extension) of the original algorithm. For this, one more component is added to the vector during training. 
Thus, when training the vectors of the words w, the document vector d is also trained, and upon completion of training we obtain a vector representation of the document. Processing the original content yielded a representation of documents as a set of "related contents". "Related content" means semantically similar articles linked to articles from the mathematical encyclopedia and the thesauri.</p><p>The procedure for extracting such content is used to offer the user semantically related documents. Importantly, without the algorithm for extracting related content such documents would not appear in the search results for a query, since they may not contain keywords from the query, or may be related to the subject area only through other terms.</p><p>The peculiarity of common search models, such as the vector space model with TF-IDF, is that they take into account only individual terms. This approach does not always lead to optimal results, because contextual information is discarded. The context of a word is understood as the N words in the text before the word for which the vector is constructed and the N words after it. In contrast to the TF-IDF model, the individual elements of the vector are not interpretable; instead, the distance between vectors is studied and interpreted as the semantic proximity of words.</p><p>The proximity of texts is measured on the basis of their vector representations. Using the search index and the vector representation of documents together leverages the ability of these representations to capture the semantics of text when building search models that are well adapted to the data.</p><p>The main metrics for measuring the proximity of texts are cosine distance and Euclidean distance, which are used to capture semantically similar words, sentences, paragraphs, etc.</p></div>
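The two proximity metrics just named can be computed directly. A minimal pure-Python sketch with toy three-dimensional "document vectors" (the values are illustrative, not from the wsgMath model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# toy "document vectors": nearly parallel vectors score close to 1.0
d1 = [0.9, 0.1, 0.0]
d2 = [0.8, 0.2, 0.1]
sim = cosine_similarity(d1, d2)
```

Real document vectors have hundreds of dimensions, but the metric computation is identical.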
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Revealing Synonyms</head><p>The analysis of mathematical texts conventionally comprises the analysis of the mathematical text as a whole, the analysis of formulas as a "separate language" for representing mathematical knowledge, and the establishment of semantic links between text and formulas. In what follows, only the analysis of the mathematical text as a whole is considered.</p><p>To extract synonyms for query terms from the constructed model, lexical-grammatical templates were used; these are one of the recognized methods for extracting relations from text <ref type="bibr" target="#b33">[34]</ref><ref type="bibr" target="#b34">[35]</ref><ref type="bibr" target="#b35">[36]</ref><ref type="bibr" target="#b36">[37]</ref><ref type="bibr" target="#b37">[38]</ref>. Based on the idea of using such templates, we investigated the task of extracting synonyms of concepts and of extracting or constructing simple templates from them to identify relations.</p><p>The implementation of the model consists in applying an iterative research algorithm, hereinafter called iraWsgMath. Its main stages are listed below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Allocation of synonyms of terms</head><p>As an example, consider the query "Cauchy problem" (for the reader's convenience the examples have been translated into English, but the work was done on Russian texts; the Russian term is "задача Коши"). It consists of two words, "problem" and "Cauchy", each of which has its own synonyms, presented in Table <ref type="table" target="#tab_0">1</ref>. The third column presents the query context as a single unit, "Cauchy problem" (задача Коши). Extracting its synonyms shows that the list includes the adjective "boundary", which also occurs among the synonyms of the individual words in the first two columns.</p><p>The term "Cauchy problem" itself has the synonyms "Cauchy equation" and "Cauchy inequality", which were determined on the basis of high proximity estimates for pairs of synonyms; for example, for the pair (problem, equation) the proximity estimate is 0.84.</p><p>Note that when constructing synonymous terms, synonyms of the word "Cauchy" were not used, since it was identified as the named entity Cauchy based on a dictionary listing the persons mentioned in the mathematical encyclopedia. At the same time, we note that Riemann appeared among the synonyms of Cauchy.</p></div>
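The stage above can be sketched as follows. The synonym lists, the proximity scores other than the (problem, equation) = 0.84 pair from the text, and the threshold value are assumptions for illustration; the named-entity check mirrors the treatment of "Cauchy" described above:

```python
# Word-level synonyms with proximity scores; 0.84 for (problem, equation)
# is the estimate quoted in the text, the other scores are made up.
SYNONYMS = {
    "problem": [("equation", 0.84), ("inequality", 0.71), ("boundary", 0.35)],
}
# Persons from the encyclopedia dictionary are treated as named entities.
NAMED_ENTITIES = {"cauchy"}

def term_synonyms(term, threshold=0.6):
    """Build synonyms of a multi-word term by substituting high-proximity
    synonyms for each word that is not a named entity."""
    words = term.lower().split()
    results = []
    for i, w in enumerate(words):
        if w in NAMED_ENTITIES:
            continue  # named entities such as "Cauchy" are not substituted
        for syn, score in SYNONYMS.get(w, []):
            if score >= threshold:
                results.append(" ".join(words[:i] + [syn] + words[i + 1:]))
    return results

print(term_synonyms("Cauchy problem"))
# yields the two term synonyms named in the text: "cauchy equation",
# "cauchy inequality"; "boundary" is filtered out by its low score
```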
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Determination of classes of synonyms by parts of speech</head><p>Consider a relation-extraction template based on the simple pattern adjective &lt;term&gt;, which most often indicates generic relations: the original term is a generic concept, and the combination matching the template is a specific concept <ref type="bibr" target="#b35">[36,</ref><ref type="bibr" target="#b36">37,</ref><ref type="bibr" target="#b37">38]</ref>.</p><p>Each word was considered separately; its synonyms were filtered by part of speech, and possible synonym candidates of the term were formed from them. A phrase was then formed for the term and its synonyms based on the selected templates.</p><p>From these synonyms, the following phrases were obtained in accordance with the template "adjective &lt;query term&gt; &lt;synonyms of the query term&gt;": [Cauchy boundary value problem, Cauchy boundary equation, Cauchy boundary inequality] (in Russian the word "краевой" was used). When compiling an extended query, it was also proposed to use the adjective "boundary" as a synonym, so these phrases were issued as additional queries.</p></div>
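The template instantiation in this stage can be sketched mechanically. The word order differs in Russian (where the work was done); this sketch simply prefixes the adjective, and the function name and inputs are illustrative:

```python
def apply_adjective_pattern(adjective, term, synonyms_of_term):
    """Instantiate the pattern 'adjective <term>' for a term and each of
    its synonyms, yielding candidate specific concepts."""
    return [f"{adjective} {t}" for t in [term] + synonyms_of_term]

# "boundary" plus the term synonyms found for "Cauchy problem"
candidates = apply_adjective_pattern(
    "boundary", "Cauchy problem", ["Cauchy equation", "Cauchy inequality"]
)
```

Each candidate phrase would then be issued as an additional query, as described in the text.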
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Selection of link "capture" patterns</head><p>The pattern &lt;term&gt; verb was considered in order to analyze and construct more complex relation patterns of the form &lt;term&gt; verb &lt;term&gt; in the thesaurus. Using this pattern to populate relations requires a separate analysis and is beyond the scope of the word2vec-based algorithm considered in this article.</p><p>In the process of training the model, several verbs were identified for the patterns using the considered algorithm. When analyzing context-sensitive synonyms, the list of verbs that emerged was rather limited, which is not surprising given the specifics of the subject area. The list is restricted to verbs such as apply, use, base, prove, consider, define, depend, be, embody (several distinct Russian verbs translate identically). Verbal nouns formed from these verbs are also frequently used: application, use, basis, definition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Improving the "quality of terms" and checking them</head><p>To improve the matching of multi-word domain terms in the thesaurus against the templates, possible alternative spellings of terms were considered; for example, for "ordinary differential equation" the possible variants are "ODE", "ordinary DE", etc. All possible spellings were explored as separate terms. Since there are few such terms in the studied subset, they did not have a significant effect on the results.</p><p>Validation of the model and the relations it retrieves was performed against the ODE thesaurus.</p><p>The problem of synonyms and their extraction using word2vec together with a search index is covered in more detail in <ref type="bibr" target="#b23">[24]</ref>.</p></div>
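The spelling-variant handling above amounts to a small lookup table. A minimal sketch, where the variant table and function name are assumptions mirroring the "ODE" example in the text:

```python
# Illustrative variant table; each variant is indexed and searched as a
# separate term, then treated as equivalent to the canonical spelling.
VARIANTS = {
    "ordinary differential equation": ["ODE", "ordinary DE"],
}

def expand_with_variants(term):
    """Return a term together with its known spelling variants."""
    return [term] + VARIANTS.get(term, [])

forms = expand_with_variants("ordinary differential equation")
```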
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Examples</head><p>The combined use of the full-text index <ref type="bibr" target="#b39">[40,</ref><ref type="bibr" target="#b40">41]</ref> and the wsgMath search model makes it possible to extend the original query with synonyms. Extending queries with synonyms without wsgMath requires pre-compiled synonym dictionaries. Resources such as WordNet (https://wordnet.princeton.edu/) or RuWordNet (https://ruwordnet.ru/ru) can be used, but the main problem is that synonyms from pre-compiled dictionaries are not tied to the data being indexed, so their use does not improve the results.</p><p>Figure <ref type="figure">2</ref> shows the main steps of forming the wsgMath model for generating query synonyms on LibMeta content. The query string coming from the full-text search interface passes through the Analyzer, the functional part of the model where the basic operations of interacting with the wsgMath model are performed. All operations described in the stages of the previous section belong to its main functionality.</p><p>The Analyzer splits a string into words, then analyzes and transforms them. Synonyms for the words are extracted from the wsgMath model and filtered, and an extended query is formed, with which the corresponding documents are retrieved from the full-text index. Fig. <ref type="figure">2</ref>. Joint use of a search engine and a neural network model built on the basis of an index using word2vec algorithms to generate an extended query with synonyms. A user's information need is defined as a chain of queries that leads the user to the information needed. Each subsequent query in this chain is a refinement of the previous one.</p><p>A real information request, as a rule, consists of an initial query and refinements. 
Let us consider an example in which the primary query leads to excessive information noise while the refinement yields a more pertinent answer, and compare the search results with and without the wsgMath model. For comparison, a statistical characteristic (denoted score) is calculated using the TF-IDF algorithm.</p><p>The example below demonstrates three lists (Lists 1-3) with different scores depending on how the query is expanded. When searching for the test query "Cauchy problem", the user enters the refining query "Cauchy boundary value problem" and finds the information of interest. The search index contains 3654 scientific articles, of which 637 mention the "Cauchy problem". Of these, 59 documents were selected for the user, since the query words were found in significant parts of the document (title and abstract). With this approach, the document of interest to the user is in 18th place. Part of the list is shown below; the score value shows how well a document matches the query and is calculated from statistical characteristics such as TF-IDF.</p><p>List 1:</p><p>1. The Cauchy problem for the system of equations of the theory of elasticity and thermoelasticity in space score = 0.65376675 2. The Cauchy problem for the system of thermoelasticity equations in space score = 0.64415324 ……………………………………………………….</p>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="18.">On the Well-Posedness of a Boundary Value Problem on the Line for Three Analytic Functions score = 0.5233538</head><p>With the refining query "Cauchy boundary value problem", the list of results looks as shown below, and the document of interest moves to the fifth position; the number of documents satisfying the query text is reduced to 338, of which only 20 are recommended to the user. Now consider the situation in which the query "Cauchy problem" is extended with synonyms and, using the wsgMath model, transformed into the form "boundary", "problem or equation or inequality Cauchy" (in Russian: "краевая или граничная", "задача или уравнение или неравенство Коши"). In the extended query, the synonyms listed in parentheses are connected by the logical operation OR; the presence of at least one of them is required. This approach slightly increases the completeness of the answer while the accuracy also increases, so the degree of satisfaction of the user's need grows. The list of results obtained is shown below, and the sought document is in second place. The number of documents matching the query is 395, the system returns 65 of them, and the user obtains the desired answer already in the first positions. This example illustrates the effect of the approach already at the level of extending queries with synonyms based on indexed documents. 
With this approach, all suggested synonyms are present in the search engine index, so the query extension is guaranteed to offer answers to the user's queries.</p><p>The use of the extended version of word2vec (called doc2vec or paragraph2vec in different sources) <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b32">33]</ref> makes it possible to introduce an additional element, such as a label for a text fragment or for the entire document, and, based on the vectors of these labels, to select similar documents not only by exact matches of keywords or terms but also by the context of individual fragments or of the whole document. As an illustration, Fig. <ref type="figure" target="#fig_3">3</ref> shows the main steps of this approach. This feature is used to return documents that are close in meaning, do not appear in the search results, but may be of interest to the user. Let us take a closer look at the process of ranking documents based on the wsgMath model when searching for similar documents. When a document enters the system, its current vector representation is retrieved, a search is performed, and the labels of the nearest documents are returned, namely those whose cosine distance exceeds a certain threshold, determined experimentally as 0.6. Below is the result for the document that was the sought one in the previous example. Nine documents whose cosine similarity exceeded 0.6 were found as the closest to it.</p></div>
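The construction of the extended query described above, with synonyms joined by OR, can be sketched as follows; the function name is illustrative and the syntax is generic boolean query syntax, not a specific search engine's dialect:

```python
def build_extended_query(word, synonyms):
    """Combine a query word with its synonyms into an OR-group: at least
    one of the alternatives must be present in a matching document."""
    return "(" + " OR ".join([word] + synonyms) + ")"

# the "Cauchy problem" query extended with the synonyms found by the model
q = build_extended_query("problem", ["equation", "inequality"]) + " Cauchy"
# q is "(problem OR equation OR inequality) Cauchy"
```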
<div xmlns="http://www.tei-c.org/ns/1.0"><head>List 4:</head><p>1. Some classes of singular integral equations solvable in closed form cosineSimilarity = 0.8136491179466248 2. Riemann's boundary value problem for a half-plane with a coefficient exponentially decreasing at infinity cosineSimilarity = 0.8028532266616821 3. Algorithm for constructing a quasiregular asymptotic representation of the solution of singularly perturbed linear multipoint boundary value problems with fast and slow variables cosineSimilarity = 0.7246567010879517 4. Solution in closed form of an integral equation of convolution type in the hyperelliptic case cosineSimilarity = 0.6468908786773682 5. On biorthogonal systems generated by some involutive operators cosineSimilarity = 0.6454607248306274 6. On linear periodic systems in the plane having matrices of the required form cosineSimilarity = 0.6165973544120789 7. On integral equations for the Riemann function cosineSimilarity = 0.6134763956069946 8. Gakhov's equation for an exterior mixed inverse boundary value problem with respect to a parameter ... cosineSimilarity = 0.6059825420379639 9. On a nonlinear integral equation of the first kind cosineSimilarity = 0.6017340421676636 Fig. <ref type="figure">4</ref>. Joint use of a search engine and a neural network model built on the basis of an index using word2vec algorithms, with attribute search.</p><p>Fig. <ref type="figure">4</ref> adds steps that include attribute search and shows how it interacts with the previously described search components. Attribute-based search delineates the boundaries within which documents are searched (by author, by year, etc.); a transition to full-text search can then be performed on them, and/or its results can be refined based on the similarity of documents.</p></div>
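The thresholding and ranking that produce a list like List 4 can be sketched directly; the document labels and similarity values below are toy data, and 0.6 is the experimentally chosen threshold from the text:

```python
def similar_documents(scored, threshold=0.6):
    """Keep (label, similarity) pairs whose cosine similarity exceeds the
    threshold and rank them best first."""
    return sorted(
        (d for d in scored if d[1] > threshold),
        key=lambda d: d[1],
        reverse=True,
    )

# toy labels and similarities (illustrative values)
scored = [("doc A", 0.81), ("doc B", 0.55), ("doc C", 0.72)]
top = similar_documents(scored)
# "doc B" falls below the 0.6 threshold and is dropped
```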
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion</head><p>A vector representation of documents is proposed to expand the search query and increase the coverage of information on demand. It is shown that the quality of the answer to a query improves when semantically close text fragments are taken into account. The model proposed in this work was tested on primary data, namely arrays of articles not systematized by subject matter. Note that the technology of processing and thematically classifying primary data using machine learning methods has been tested. This technology can be used for the subject classification of the texts of scientific articles in Russian and for comparing the selected subjects with the English-language classification by matching the MSC and UDC classifiers.</p><p>Integration of a neural network and search indexes makes it possible to give users smarter results based on the identified relations among documents. The considered search model can also be used for the thematic processing both of primary texts of scientific articles and of texts already systematized and supplied with keywords and links to classifiers. In the second case, this can help to identify interdisciplinary research, as well as erroneous assignments of the subject area, since not only secondary documents but also the texts of the articles (primary documents) are taken as the basis for thematic analysis.</p><p>Acknowledgement. The work is presented in the framework of the implementation of the theme of the state assignment "Mathematical methods of data analysis and forecasting" of FRC CSC of RAS and was partially supported by grant #20-07-00324 of the Russian Foundation for Basic Research.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. 
Joint use of a search engine and a neural network model built on the basis of an index using word2vec algorithms.</figDesc><graphic coords="4,202.82,147.40,200.95,119.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>List 2:</head><label>2</label><figDesc>1. Projection procedures for non-local improvement of linearly controlled processes, score = 0.8902895
2. On one method of constructing parametric synthesis for a linear-quadratic optimal control problem, score = 0.8708762
…
5. On the Well-Posedness of a Boundary Value Problem on the Line for Three Analytic Functions, score = 0.85024154</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>List 3:</head><label>3</label><figDesc>1. On a positive radially symmetric solution of the Dirichlet problem for one nonlinear equation and a numerical method for obtaining it, score = 0.9809638
2. On the Well-Posedness of a Boundary Value Problem on the Line for Three Analytic Functions, score = 0.9587569
3. On a positive radially symmetric solution of the Dirichlet problem for one nonlinear equation and a numerical method for obtaining it, score = 0.9512307</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Joint use of a search engine and a neural network model built on the basis of an index using word2vec algorithms to generate an extended query with synonyms and refine search results based on a selection of similar documents.</figDesc><graphic coords="10,202.82,395.40,200.95,184.55" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="12,183.32,147.40,239.40,253.48" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1.</head><label>1</label><figDesc>Synonyms for each word of the query "Cauchy problem" (задача Коши)</figDesc><table><row><cell>problem (задача)</cell><cell>Cauchy (Коши)</cell><cell>Cauchy problem (задача Коши)</cell></row><row><cell>equation (уравнение)</cell><cell>Riemann (Риман)</cell><cell>to define (определять)</cell></row><row><cell>inequality (неравенство)</cell><cell>boundary (краевой)</cell><cell>boundary (краевой)</cell></row><row><cell>boundary (краевой)</cell><cell></cell><cell></cell></row></table></figure>
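The per-word synonym lists of Table 1 are the nearest neighbors of each query term in the word2vec vector space. A minimal sketch of that lookup is given below; the tiny hand-written embedding table is hypothetical and stands in for a model trained on the corpus (in practice one would query the trained word2vec model directly).

```python
import numpy as np

# Hypothetical embeddings for a few terms from Table 1; a real system
# would take these vectors from the word2vec model trained on the corpus.
embeddings = {
    "задача":      np.array([0.9, 0.1, 0.0]),   # problem
    "уравнение":   np.array([0.8, 0.2, 0.1]),   # equation
    "неравенство": np.array([0.7, 0.3, 0.1]),   # inequality
    "Коши":        np.array([0.1, 0.9, 0.2]),   # Cauchy
    "Риман":       np.array([0.2, 0.8, 0.3]),   # Riemann
    "краевой":     np.array([0.5, 0.5, 0.2]),   # boundary
}

def expand_term(term: str, k: int = 2) -> list:
    """Return the k vocabulary words nearest to `term` by cosine
    similarity, excluding the term itself; these serve as the
    synonyms used to expand the search query."""
    v = embeddings[term]

    def cos(u, w):
        return float(np.dot(u, w) / (np.linalg.norm(u) * np.linalg.norm(w)))

    scored = [(w, cos(v, u)) for w, u in embeddings.items() if w != term]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [w for w, _ in scored[:k]]
```

With these illustrative vectors, expanding "задача" yields "уравнение" and "неравенство", and the nearest neighbor of "Коши" is "Риман", mirroring the columns of Table 1.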
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The vocabulary problem in human-system communication</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">W</forename><surname>Furnas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Landauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">M</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Dumais</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="964" to="971" />
			<date type="published" when="1987">1987</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A knowledge-based approach to online document retrieval system design</title>
		<author>
			<persName><forename type="first">G</forename><surname>Biswas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bezdek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">L</forename><surname>Oakman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ACM SIGART Int. Symp. on Methodologies for Intelligent Systems</title>
				<meeting>ACM SIGART Int. Symp. on Methodologies for Intelligent Systems</meeting>
		<imprint>
			<date type="published" when="1986">1986</date>
			<biblScope unit="page" from="112" to="120" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Query expansion using lexical-semantic relations</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Voorhees</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">17th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retr</title>
				<meeting><address><addrLine>Dublin, Ireland</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1994">1994</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Automatic query expansion using SMART: TREC 3</title>
		<author>
			<persName><forename type="first">C</forename><surname>Buckley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Salton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Allan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Singhal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">3rd Text Retrieval Conf. (TREC-3)</title>
				<imprint>
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Query expansion</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">N</forename><surname>Efthimiadis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Annu. Rev. Inf. Sci. Technol</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="121" to="187" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">OntoSeek: Content-Based Access to the Web</title>
		<author>
			<persName><forename type="first">N</forename><surname>Guarino</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Intelligent Systems</title>
		<imprint>
			<biblScope unit="page" from="70" to="80" />
			<date type="published" when="1999-06">May-June. 1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A review of ontology based query expansion</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bhogal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Macfarlane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Smith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Process. Manage</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="866" to="886" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Concept based query expansion</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Frei</surname></persName>
		</author>
		<idno type="DOI">10.1145/160688.160713</idno>
	</analytic>
	<monogr>
		<title level="m">SIGIR &apos;93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval</title>
				<meeting><address><addrLine>Pittsburgh, Pennsylvania, USA; New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="1993-07-01">June 27 -July 01, 1993. 1993</date>
			<biblScope unit="page" from="160" to="169" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">LISP: the Language of Artificial Intelligence</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Berk</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1985">1985</date>
			<publisher>Van Nostrand Reinhold Company</publisher>
			<biblScope unit="page" from="1" to="25" />
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">DENDRAL: A Case Study of the First Expert System for Scientific Hypothesis Formation</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K</forename><surname>Lindsay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">G</forename><surname>Buchanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Feigenbaum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lederberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">61</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="209" to="261" />
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">An Instrumentation Crisis in Biology</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lederberg</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1963">1963</date>
		</imprint>
		<respStmt>
			<orgName>Stanford University Medical School. Palo Alto</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">MYCIN</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">J</forename><surname>Copeland</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Encyclopedia Britannica</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">An Introduction to Neural Networks</title>
		<author>
			<persName><forename type="first">K</forename><surname>Gurney</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1997">1997</date>
			<publisher>CRC Press</publisher>
			<pubPlace>London and New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A logical calculus of the ideas immanent in nervous activity</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">S</forename><surname>McCulloch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Pitts</surname></persName>
		</author>
		<idno type="DOI">10.1007/BF02478259</idno>
		<ptr target="https://doi.org/10.1007/BF02478259" />
	</analytic>
	<monogr>
		<title level="j">Bulletin of Mathematical Biophysics</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="115" to="133" />
			<date type="published" when="1943">1943</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">MachineLearning.ru</title>
		<ptr target="http://www.machinelearning.ru/" />
		<imprint>
			<date type="published" when="2021-07-27">2021/07/27</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Bazy znanij intellektualnyh sistem</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">A</forename><surname>Gavrilova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">F</forename><surname>Horoshevskij</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2000">2000</date>
			<publisher>SPb. Piter</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Machine Learning with Shallow Neural Networks</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Aggarwal</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-94463-0_2</idno>
		<ptr target="https://doi.org/10.1007/978-3-319-94463-0_2" />
	</analytic>
	<monogr>
		<title level="m">Neural Networks and Deep Learning</title>
				<meeting><address><addrLine>Cham.</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Ontology based approach to modeling of the subject domain &quot;Mathematics&quot; in the digital library</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Serebryakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">M</forename><surname>Ataeva</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Lobachevskii Journal of Mathematics</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1920" to="1934" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Ontological Approach: Knowledge Representation and Knowledge Extraction</title>
		<author>
			<persName><forename type="first">O</forename><surname>Ataeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Serebryakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tuchkova</surname></persName>
		</author>
		<idno type="DOI">10.1134/S1995080220100030</idno>
		<ptr target="https://doi.org/10.1134/S1995080220100030" />
	</analytic>
	<monogr>
		<title level="j">Lobachevskii Journal of Mathematics</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page" from="1938" to="1948" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Mathematical Physics Branches: Identifying Mixed Type Equations</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">M</forename><surname>Ataeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Serebryakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">P</forename><surname>Tuchkova</surname></persName>
		</author>
		<idno type="DOI">10.1134/S1995080219070047</idno>
		<ptr target="https://doi.org/10.1134/S1995080219070047" />
	</analytic>
	<monogr>
		<title level="j">Lobachevskii Journal of Mathematics</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page" from="876" to="886" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Mathematical Physics Problems: Thesaurus and Ontology</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">M</forename><surname>Ataeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Serebryakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">P</forename><surname>Tuchkova</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-2523/paper16.pdf" />
	</analytic>
	<monogr>
		<title level="m">Selected Papers of the XXI International Conference on Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2019)</title>
				<meeting><address><addrLine>Kazan, Russia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">October 15-18. 2019</date>
			<biblScope unit="volume">2523</biblScope>
			<biblScope unit="page" from="158" to="168" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Predstavlenie matematicheskih ponyatij v ontologii nauchnyh znanij</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Muromskij</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">P</forename><surname>Tuchkova</surname></persName>
		</author>
		<idno type="DOI">10.18287/2223-9537-2019-9-1-50-69</idno>
		<ptr target="https://doi.org/10.18287/2223-9537-2019-9-1-50-69" />
	</analytic>
	<monogr>
		<title level="j">Ontologiya proektirovaniy</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="50" to="69" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Query Expansion Method Application for Searching in Mathematical Subject Domains</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">M</forename><surname>Ataeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Serebryakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">P</forename><surname>Tuchkova</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-2543/rpaper04.pdf" />
		<imprint>
			<date type="published" when="2020">2020. 2021/04/27</date>
			<biblScope unit="page" from="38" to="48" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Using Applied Ontology to Saturate Semantic Relations</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">M</forename><surname>Ataeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Serebryakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">P</forename><surname>Tuchkova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Lobachevskii Journal of Mathematics</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1776" to="1785" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">M</forename><surname>Vinogradov</surname></persName>
		</author>
		<title level="m">Mathematical Encyclopedia</title>
				<meeting><address><addrLine>Moscow</addrLine></address></meeting>
		<imprint>
			<publisher>Soviet Encyclopedia</publisher>
			<date type="published" when="1982">1982</date>
			<biblScope unit="volume">1</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">LRD: Latent Relation Discovery for Vector Space Expansion and Information Retrieval</title>
		<author>
			<persName><forename type="first">A</forename><surname>Gonçalves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Uren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pacheco</surname></persName>
		</author>
		<idno type="DOI">10.1007/11775300_11</idno>
	</analytic>
	<monogr>
		<title level="m">Advances in Web-Age Information Management, 7th International Conference, WAIM 2006, Proceedings</title>
				<meeting><address><addrLine>Hong Kong, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2006">June 17-19, 2006. 2006</date>
		</imprint>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Efficient Estimation of Word Representations in Vector Space</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of Workshop at ICLR</title>
				<meeting>Workshop at ICLR</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Linguistic Regularities in Continuous Space Word Representations</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">T</forename><surname>Yih</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zweig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL HLT</title>
				<meeting>NAACL HLT</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Distributed Representations of Sentences and Documents</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1188" to="1196" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">Introduction to Information Retrieval</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Raghavan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schütze</surname></persName>
		</author>
		<imprint>
			<publisher>Cambridge Univ. Press</publisher>
			<date type="published" when="2008">2008</date>
			<pubPlace>Cambridge</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">The Probabilistic Relevance Framework: BM25 and Beyond</title>
		<author>
			<persName><forename type="first">S</forename><surname>Robertson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zaragoza</surname></persName>
		</author>
		<idno type="DOI">10.1561/1500000019</idno>
	</analytic>
	<monogr>
		<title level="j">Foundations and Trends in Information Retrieval</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="333" to="389" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Ontologiya cifrovoj semanticheskoj biblioteki LibMeta</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">M</forename><surname>Ataeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Serebryakov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Informatics and Applications</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="2" to="10" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">MLPV: Text Representation of Scientific Papers Based on Structural Information and Doc2vec</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.11648/j.ajist.20190303.12</idno>
		<ptr target="https://doi.org/10.11648/j.ajist.20190303.12" />
	</analytic>
	<monogr>
		<title level="j">American Journal of Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="62" to="71" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Extracting Semantic Representations from Word Co-occurrence Statistics: A Computational Study</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Bullinaria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Levy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Behavior Research Methods</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="510" to="526" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Lexico-syntactic patterns for automatic ontology building</title>
		<author>
			<persName><forename type="first">C</forename><surname>Klaussner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhekova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Student Research Workshop associated with RANLP</title>
				<meeting>the Second Student Research Workshop associated with RANLP</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="109" to="114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">A Taxonomy and Survey of Semantic Approaches for Query Expansion</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Raza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mokhtar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ahmad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pasha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Pasha</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2019.2894679</idno>
		<ptr target="https://doi.org/10.1109/ACCESS.2019.2894679" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="17823" to="17833" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhou</surname></persName>
		</author>
		<ptr target="https://arxiv.org/abs/1506.00528" />
		<title level="m">Medical synonym extraction with concept space models</title>
				<imprint>
			<date type="published" when="2015">2015. last accessed 2021/07/27</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<title level="m" type="main">Vector-based Models of Semantic Composition</title>
		<author>
			<persName><forename type="first">J</forename><surname>Mitchell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lapata</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Applying word2vec technology to shifter extraction task</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">K</forename><surname>Polozov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">A</forename><surname>Volkova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Research Journal</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">1</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<analytic>
		<title level="a" type="main">Compact suffix array -a space-efficient full-text index</title>
		<author>
			<persName><forename type="first">V</forename><surname>Mäkinen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Fundamenta Informaticae</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="191" to="210" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">Compressed full-text indexes</title>
		<author>
			<persName><forename type="first">V</forename><surname>Mäkinen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Navarro</surname></persName>
		</author>
		<idno type="DOI">10.1145/1216370.1216372</idno>
		<ptr target="https://doi.org/10.1145/1216370.1216372" />
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="79" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
