<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Employing Active Learning for Training a DL-Model for Citation Identification in Patent Text ⋆</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Farag</forename><surname>Saad</surname></persName>
							<email>farag.saad@fiz-karlsruhe.de</email>
							<affiliation key="aff0">
								<orgName type="institution">FIZ Karlsruhe - Leibniz Institute for Information Infrastructure</orgName>
								<address>
									<addrLine>Hermann-von-Helmholtz-Platz 1</addrLine>
									<postCode>76344</postCode>
									<settlement>Eggenstein-Leopoldshafen</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Hidir</forename><surname>Aras</surname></persName>
							<email>hidir.aras@fiz-karlsruhe.de</email>
							<affiliation key="aff0">
								<orgName type="institution">FIZ Karlsruhe - Leibniz Institute for Information Infrastructure</orgName>
								<address>
									<addrLine>Hermann-von-Helmholtz-Platz 1</addrLine>
									<postCode>76344</postCode>
									<settlement>Eggenstein-Leopoldshafen</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mark</forename><surname>Prince</surname></persName>
							<email>mprince@cas.org</email>
							<affiliation key="aff1">
								<orgName type="department">CAS - Chemical Abstracts Service</orgName>
								<address>
									<addrLine>2540 Olentangy River Rd</addrLine>
									<postCode>43202</postCode>
									<settlement>Columbus</settlement>
									<region>OH</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="laboratory">Workshop on Patent Text Mining and Semantic Technologies</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Employing Active Learning for Training a DL-Model for Citation Identification in Patent Text ⋆</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">32C418FF5CEAF9D493CE7B1EF74DAEB5</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:10+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Patent Citations</term>
					<term>Named Entity Recognition</term>
					<term>Deep learning</term>
					<term>Active learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Citations play an important role in patent analytics. Because the existing citation lists in patent documents are incomplete, automatically detecting citations in the patent text and enhancing these lists has long been a user need in patent information retrieval. In this paper, we describe an approach for identifying citations in patent text using Deep Learning (DL) models. We apply active learning for training and improving a DL-based named entity recognition (NER) model for this task. The evaluation showed a high accuracy for the focused citation type, i.e., the p-c-p (patent cites patent) case.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Citations play an important role in patent retrieval and analytics. Since the existing citation lists in patent documents are incomplete, it has long been a user wish to automatically detect citations in the patent text and complete these lists. Furthermore, citations within the full text of a patent do not follow well-defined patterns or rules, so identifying them with high accuracy is a challenging task. In addition, researching and implementing a suitable approach for utilizing citations from patent text depends on the use case (e.g., search for prior art, linking to corresponding online sources, etc.). Successfully creating such citation lists automatically from patent text will enable users to extend their discovery more efficiently to stated background or adjacent prior art.</p><p>Patent citations typically come in two types: a patent cites another patent (herein referred to as p-c-p), or a patent cites literature (herein referred to as p-c-l), also known as NPL (non-patent literature) citations. In this paper the focus is on the p-c-p use case, for which we have designed, implemented and evaluated a citation identification approach.</p><p>There are two citation pattern types in the p-c-p use case: the standard and the non-standard citation pattern. In the standard pattern, patent applicants use a simple form for referencing other patent publications, e.g., "US20050114951A1, WO 2006122188", while in the non-standard pattern they use more complex forms for citing other patents, e.g., "U.S. Pat. Nos. 
<ref type="bibr">6,808,085; 6,736,293; 6,732,955; 6,708,846; 6,626,379; 6,626,330; 6,626,328; 6,454,185</ref>" etc.</p><p>The remainder of this paper is organized as follows: Section 2 briefly reviews related work, Section 3 presents the proposed approach, Section 4 presents and discusses an empirical evaluation, Section 5 outlines future work directions, and Section 6 concludes the paper.</p></div>
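The standard citation pattern above is regular enough to be matched by simple rules. A minimal, hypothetical regex sketch follows; the authority list and digit lengths are assumptions, and the paper's actual approach is a trained NER model precisely because the non-standard patterns resist such rules:

```python
import re

# Illustrative (hypothetical) matcher for the *standard* p-c-p citation
# pattern only, e.g. "US20050114951A1" or "WO 2006122188".  The authority
# codes and digit ranges below are assumptions for demonstration.
STANDARD_CITATION = re.compile(
    r"\b(?P<authority>US|WO|EP|JP|DE)\s?"  # patent authority code
    r"(?P<number>\d{6,11})"                # publication number
    r"\s?(?P<kind>[A-C]\d)?\b"             # optional kind code, e.g. A1
)

def find_standard_citations(text: str) -> list[str]:
    """Return the standard-pattern citation strings found in a paragraph."""
    return [m.group(0).strip() for m in STANDARD_CITATION.finditer(text)]
```

Note that a non-standard enumeration such as "U.S. Pat. Nos. 6,808,085" yields no match here, which illustrates why a learned model is needed for that pattern type.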
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Several machine learning approaches have been applied to the problem of extracting data from free text (NER), e.g., citation extraction, among them Support Vector Machines (SVM), e.g., <ref type="bibr" target="#b0">[1]</ref>, Hidden Markov Models (HMM), e.g., <ref type="bibr" target="#b1">[2]</ref>, and Conditional Random Fields (CRF), e.g., <ref type="bibr" target="#b2">[3]</ref>. In the past few years, however, Deep Learning (DL) approaches for the NER task (mainly LSTM, Long Short-Term Memory, and CNN, Convolutional Neural Network) have become dominant, as they significantly outperformed the previous state-of-the-art approaches <ref type="bibr" target="#b3">[4]</ref>. In contrast to classical machine learning approaches, where features are designed and prepared through human effort, deep learning is able to automatically discover hidden features from unlabelled data. The first application of a neural network (NN) to NER was proposed in <ref type="bibr" target="#b4">[5]</ref>. The authors considered six standard NLP tasks, among them NER, where atomic elements in a sentence are labelled with categories such as "PERSON", "COMPANY", or "LOCATION". They used feature vectors generated from all words in an unlabelled corpus. A separate orthographic feature was included, based on the assumption that a capital letter at the beginning of a word is a strong indication that the word is a named entity. The proposed handcrafted features were later replaced with word embeddings <ref type="bibr" target="#b5">[6]</ref> <ref type="bibr" target="#b6">[7]</ref>. Word embeddings, which represent word meanings in 𝑛-dimensional space, were learned from unlabelled data. A major strength of these approaches is that they allow the design of training algorithms that avoid task-specific engineering and instead rely on large amounts of unlabelled data to discover internal word representations that are useful for the NER task.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">P-C-P Approach based on Deep Learning</head><p>To find suitable training data for the p-c-p NER model, we first investigated whether there is a publicly available training dataset we could rely on to build the NER precursor model, which is then used to enlarge the training data at hand. A precursor model is a temporary model that is trained on a small set of training data and is later re-trained on larger training data.</p><p>In the following, we give some insight into the publicly available training data as well as the training data generated using the active learning framework.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Public Training Dataset</head><p>We have identified two freely available datasets: the GROBID<ref type="foot" target="#foot_0">1</ref> dataset and the manually expert-created dataset at FIZ Karlsruhe. The GROBID project specializes in literature citation extraction, but has recently also done some work related to patent citation extraction. After pre-processing steps (e.g., removing non-English patents, corrupted documents, etc.), the total number of annotated documents obtained was 130, belonging to three patent authorities: EPO<ref type="foot" target="#foot_1">2</ref>, US<ref type="foot" target="#foot_2">3</ref> and PCT (WIPO)<ref type="foot" target="#foot_3">4</ref>. The citation coverage per patent document varies: some documents contain only two citations, while others contain up to 375. The domain of focus is life science. The total number of extracted training paragraphs holding citations was 487. The FIZ dataset contains 41 annotated patent documents, with between 2 and 66 citations per patent. The domains of focus are life science and technology. The total number of extracted training paragraphs was 230. The two datasets come in different formats, neither equivalent to the state-of-the-art NER format, e.g., the IOB (inside-outside-beginning) tagging format (see <ref type="bibr" target="#b3">[4]</ref>). We have unified both datasets into the standard IOB format and used the resulting training data (717 training paragraphs) to train the precursor p-c-p NER model.</p></div>
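The unification into IOB tags can be sketched as follows. This is a minimal illustration assuming, hypothetically, that each source annotation is given as character offsets; the two source datasets actually use their own formats:

```python
# Minimal sketch: convert a span-annotated paragraph into IOB tags.
# tokens: list of (text, start_offset, end_offset)
# spans:  list of (start_offset, end_offset) citation annotations
def to_iob(tokens, spans, label="CITATION"):
    tags = []
    for text, start, end in tokens:
        tag = "O"
        for s, e in spans:
            if start == s:                # token opens an annotated span
                tag = "B-" + label
            elif s < start and end <= e:  # token continues inside a span
                tag = "I-" + label
        tags.append(tag)
    return tags
```

For example, a single-token citation receives a lone B- tag, while a citation split over several tokens receives B- followed by I- tags, which is exactly the information an IOB-trained NER model consumes.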
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Generated P-C-P Training Data</head><p>The freely acquired training data used for the precursor p-c-p NER model is not sufficient and needed to be improved in quality and quantity. Therefore, we have generated our own training data. To achieve this, patent paragraphs that hold many citations are first pre-annotated by the p-c-p precursor model. Then, patent subject matter experts (SMEs) used the visual annotation user interface of the Prodigy<ref type="foot" target="#foot_4">5</ref> annotation tool to review and enhance the annotated p-c-p citations. Prodigy is an annotation tool with an easy-to-use interactive interface that supports active learning. It is a scriptable tool that allows users to create the annotations themselves, enabling rapid iterations.</p><p>From the utilized patent full-text databases, PCT (WIPO) and US, we have prepared 500 citation-rich paragraphs each (1000 in total), belonging to an equally distributed set of patent documents based on their IPC/CPC<ref type="foot" target="#foot_5">6</ref> classes. In order to use citation-rich paragraphs for training, we have kept only those parts of the detailed description (DETD) that hold at least 8 cited patents identified by the precursor p-c-p model. Furthermore, we made sure to have sufficient training instance candidates (in the selected paragraphs) representing both types of citation patterns, the standard as well as the non-standard one. The training instance candidates are then reviewed by the SMEs, who accept or correct each instance using the Prodigy annotation tool (See Figure <ref type="figure">1</ref>). The process starts by training a precursor model based on the public dataset, which is then utilized to enlarge the training data iteratively (See Figure <ref type="figure" target="#fig_1">2</ref> (1)). 
This precursor model was used to filter the acquired US and PCT raw data, where we kept only those parts of the detailed description (DETD) text that hold at least 8 patent citations (See Figure <ref type="figure" target="#fig_1">2</ref> (2)). We then pre-processed and prepared the raw data to ensure that it holds sufficient citations, and loaded part of it into the Prodigy tool, along with the integrated precursor model, to start the training process for the final NER model in several iterations. The input training instance candidates (in the selected paragraphs), which were annotated by the precursor model, are loaded into the Prodigy tool and presented to the SMEs for review. The SMEs interacted with the pre-annotations and either approved, corrected or added new annotations (See Figure <ref type="figure" target="#fig_1">2</ref> (3)) in the presented paragraphs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Model Design and Training</head><p>In the first iteration, the SMEs reviewed 250 pre-annotated paragraphs for each of the US and PCT databases. We used these reviewed pre-annotated paragraphs to re-train the NER model (See Figure <ref type="figure" target="#fig_1">2 (4</ref>)), where the model reached an F-score of 85%, hence another iteration was required. To enhance the model performance, we picked up more raw data and pre-annotated it again (250 paragraphs for each of the US and PCT databases) using the enhanced p-c-p NER model (See Figure <ref type="figure" target="#fig_1">2</ref> (2)). The newly prepared citation-rich paragraphs were reviewed by the SMEs and used to re-train the NER model. If needed, this process is repeated iteratively and ends when we reach a certain degree of confidence that the final NER model is sufficiently trained to be applied to the p-c-p citation identification task (See Figure <ref type="figure" target="#fig_1">2 (4)</ref>).</p></div>
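The iterative re-training loop with its stopping criterion can be sketched as follows. The callables `train`, `evaluate` and `next_batch` are placeholders for the actual spaCy/Prodigy steps, and the target F-score of 0.92 reflects the value at which the paper stopped after two iterations:

```python
# Hypothetical sketch of the re-training loop in Section 3.3: fetch a
# new SME-reviewed batch, re-train, evaluate, and stop once the model
# is judged sufficiently trained (here: F1 reaches a target value).
def active_learning_loop(train, evaluate, next_batch,
                         target_f1=0.92, max_iterations=10):
    history = []
    for _ in range(max_iterations):
        batch = next_batch()   # SME-reviewed pre-annotated paragraphs
        model = train(batch)   # re-train the NER model on the batch
        f1 = evaluate(model)   # F1 on a held-out evaluation corpus
        history.append(f1)
        if f1 >= target_f1:
            break
    return history
```

With the scores reported in the paper, the loop would record 0.85 after the first iteration and terminate at 0.92 after the second.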
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Evaluation</head><p>Once the p-c-p model was sufficiently trained, the citation identification approach was evaluated by processing an evaluation corpus of 245 patents (prepared by SMEs), representing a random collection of US (128) and PCT (117) patents. To start the p-c-p model evaluation process, all required materials, e.g., the identified citations of the SMEs' evaluation corpus, were handed over to the SMEs.</p><p>We have evaluated the p-c-p model based on the 𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 (𝑃 ) (See equation 1), 𝑅𝑒𝑐𝑎𝑙𝑙 (𝑅) (See equation 2), and 𝐹 1-𝑆𝑐𝑜𝑟𝑒 (See equation 3) measures. In order to compute the scores, the False Positive (𝐹 𝑃 ), True Positive (𝑇 𝑃 ) and False Negative (𝐹 𝑁 ) counts are determined first. The 𝐹 𝑃 refers to the number of wrongly identified citations by the p-c-p model, the 𝑇 𝑃 to the number of correctly identified citations, and the 𝐹 𝑁 to the number of citations the model fails to identify.</p><formula xml:id="formula_0">𝑅𝑒𝑐𝑎𝑙𝑙 = 𝑇 𝑃 / (𝑇 𝑃 + 𝐹 𝑁 )<label>(2)</label></formula><formula xml:id="formula_1">𝐹 1-𝑆𝑐𝑜𝑟𝑒 = 2 · 𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 · 𝑅𝑒𝑐𝑎𝑙𝑙 / (𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 + 𝑅𝑒𝑐𝑎𝑙𝑙)<label>(3)</label></formula><p>As shown in Table <ref type="table" target="#tab_1">1</ref>, the p-c-p model identified 978 citations in total, of which 951 were correct (TP) and 28 wrong (FP), while 70 citations were missed (FN). One explanation for the missed citations is that in rare cases patent applicants do not write citations in the standard way; for example, consider the citation "This is a continuation-in-part of my copending application Ser. No. 43,784 for Catalytic Reformer Process". Here the patent authority marker (US, WO, JP, etc.) is missing, so the model has no clue which patent authority is meant; therefore, to minimize the error rate, the model was trained to neglect such incomplete citations.</p><p>Even though the p-c-p model also obtained high evaluation scores for other patent authorities, e.g., Finnish, Japanese, etc. 
that it was not trained on, it failed to identify some citations that appear in special contexts. An effective solution for these failures is to increase the training data to cover more patent authorities, which can be done efficiently using the framework described in Section 3.2.</p><p>Overall, the p-c-p model performed very well in most cases, achieving a high precision of 96% and a high recall of 89%. In addition, we computed the F1-score, which takes both precision and recall into account, to measure the overall accuracy of the model. Despite the fact that the p-c-p model was trained on a rather small dataset of 1717 training paragraphs, it reaches an F1-score of 92% on the evaluation corpus prepared by the SMEs.</p></div>
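Equations (1)-(3) written out as plain functions, with a small worked example (the counts below are illustrative, not the paper's):

```python
# Precision, recall and F1 from TP/FP/FN counts, as in equations (1)-(3).
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Illustrative counts: 8 correct, 2 wrong, 2 missed identifications.
p, r, f1 = precision(8, 2), recall(8, 2), f1_score(8, 2, 2)
```

F1 is the harmonic mean of precision and recall, so it only rewards models that keep both values high at the same time.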
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Future Work Directions</head><p>The DL-based p-c-p NER model has been trained with a small set of training data (1717 training paragraphs) related to two patent authorities, US and PCT. Even though the model also obtained high evaluation scores on data from other patent authorities, it needs further training and testing to cover more citations belonging to different patent authorities, e.g., to cover citation patterns which might be specific to a particular patent authority. Based on our experience so far, a few thousand training paragraphs for each patent authority should be sufficient. To speed up this process, the visual active learning approach developed in this paper can be utilized.</p><p>To utilize the extracted citations for further tasks or applications, such as search or linking patents with a literature knowledge base through citations, the extracted citations need to be post-processed. This is important because significant portions of the identified citations were presented in the non-standard citation form, e.g., U.S. Pat. Nos. <ref type="bibr">5,188,960; 5,689,052; 5,880,275; 5,986,177; 7,105,332; 7,208,474.</ref> Hence, after extraction, the individual citations should be normalized accordingly: US5188960, US5689052, US5880275, etc. Another example of normalization is splitting an identified citation string, e.g., "EP 0 716 884 A2", into meaningful segments: the patent authority "EP", the patent number "0716884", the patent kind code "A2", and, finally, the normalized patent string "EP0716884A2".</p><p>To consider detailed patent citation types, such as the filing number of a patent application or the publication of a patent application, it is essential to integrate a patent-citation-specific scheme into the developed approach. 
For example, if we consider US patent citations, we noticed that the filing number of a US patent application has a specific format (e.g., No. 16/769,261), the publication of a US patent application has a specific format (e.g., US 2005/0114951 A1, starting with a year number), and a US patent has a specific format (e.g., US 6,808,085). Encoding such features into the developed approach will certainly lead to a significant improvement.</p></div>
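The two normalization steps described above can be sketched as follows. The exact post-processing used in the paper is not specified, so both functions are illustrative and cover only the example forms quoted in the text:

```python
import re

def expand_us_pat_nos(text: str) -> list[str]:
    """Expand a non-standard enumeration like
    'U.S. Pat. Nos. 5,188,960; 5,689,052' into ['US5188960', 'US5689052']."""
    numbers = re.findall(r"\d(?:[\d,]*\d)?", text.split("Nos.")[-1])
    return ["US" + n.replace(",", "") for n in numbers]

def split_citation(citation: str) -> dict:
    """Split a citation like 'EP 0 716 884 A2' into authority,
    number, kind code and the normalized string."""
    m = re.fullmatch(r"([A-Z]{2})\s*((?:\d\s*)+)([A-C]\d)?", citation)
    authority, number = m.group(1), m.group(2).replace(" ", "")
    kind = m.group(3) or ""
    return {"authority": authority, "number": number, "kind": kind,
            "normalized": authority + number + kind}
```

Applied to the quoted examples, the first function yields US5188960, US5689052, etc., and the second yields the segments EP / 0716884 / A2 with the normalized string EP0716884A2.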
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this paper, we have developed a DL-based p-c-p NER model to identify citations in patent full text. To this end, we have designed, implemented and evaluated an active learning framework for patent citation identification that can be used by patent SMEs to iteratively and significantly improve the model performance with less manual effort, and thus to train a robust p-c-p citation identification model with high accuracy. In the first iteration, the reviewed pre-annotated paragraphs (250 for each of the US and PCT databases) led to a significant model improvement. However, another iteration involving another 250 pre-annotated paragraphs per database was required in order to achieve the desired F1-score of 92% and to stop the re-training process.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 2 shows the workflow how the final NER model (based on Convolutional Neural Networks (CNNs)) is built for the p-c-p task within the active learning framework. To train the NER model we have utilized the open source framework spaCy<ref type="foot" target="#foot_6">7</ref>. For rapid implementation, we have used the spaCy implementation which is provided by the Prodigy framework. As spaCy offers no pre-</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Building the p-c-p NER model and improving it through experts' interaction</figDesc><graphic coords="4,89.29,84.19,416.69,226.77" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="3,89.29,84.19,416.70,255.13" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>, United States Provisional Application No.61/914,561, Japanese Unexamined Patent Publication No. 4-187748, US provisional application Serial No 61/640,128" etc. Based on that, we have developed and trained a suitable p-c-p NER DL model for identifying and extracting those types of citation patterns automatically (see Section 3).</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>The p-c-p NER model overall scores for precision, recall and F1-score. TP is the number of correctly identified citations by the p-c-p model, and FN is the number of citations the model fails to identify.</figDesc><table><row><cell>DATABASE</cell><cell>Identified citations</cell><cell>FP</cell><cell>TP</cell><cell>FN</cell><cell>Precision</cell><cell>Recall</cell><cell>F1-Score</cell></row><row><cell>US</cell><cell>727</cell><cell>17</cell><cell>710</cell><cell>23</cell><cell>0.97</cell><cell>0.96</cell><cell>0.96</cell></row><row><cell>PCT</cell><cell>251</cell><cell>11</cell><cell>241</cell><cell>47</cell><cell>0.95</cell><cell>0.83</cell><cell>0.88</cell></row><row><cell>Summation</cell><cell>978</cell><cell>28</cell><cell>951</cell><cell>70</cell><cell>0.96</cell><cell>0.89</cell><cell>0.92</cell></row></table><note>𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = 𝑇 𝑃 / (𝑇 𝑃 + 𝐹 𝑃 ) (1)</note></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://github.com/kermitt2/grobid/tree/master/grobid-trainer/resources/dataset</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://www.epo.org/en</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://www.cas.org/support/training/stnanavist/uspatfull-anavist</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.cas.org/support/training/stnanavist/pctfull-anavist</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://prodi.gy/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://www.wipo.int/classifications/ipc/en/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">https://spacy.io/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A structural svm approach for reference parsing</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">X</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">R</forename><surname>Thoma</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICMLA.2010.77</idno>
	</analytic>
	<monogr>
		<title level="m">Ninth International Conference on Machine Learning and Applications</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="479" to="484" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A trigram hidden markov model for metadata extraction from heterogeneous references</title>
		<author>
			<persName><forename type="first">B</forename><surname>Ojokoh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.ins.2011.01.014</idno>
		<ptr target="https://doi.org/10.1016/j.ins.2011.01.014" />
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">181</biblScope>
			<biblScope unit="page" from="1538" to="1551" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Grobid: Combining automatic bibliographic data recognition and term extraction for scholarship publications</title>
		<author>
			<persName><forename type="first">P</forename><surname>Lopez</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:27383212" />
	</analytic>
	<monogr>
		<title level="m">European Conference on Research and Advanced Technology for Digital Libraries</title>
				<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Improving named entity recognition for biomedical and patent data using bi-lstm deep neural network models</title>
		<author>
			<persName><forename type="first">F</forename><surname>Saad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Aras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hackl-Sommer</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-51310-8_3</idno>
		<ptr target="https://doi.org/10.1007/978-3-030-51310-8_3" />
	</analytic>
	<monogr>
		<title level="m">Natural Language Processing and Information Systems -25th International Conference on Applications of Natural Language to Information Systems, NLDB 2020</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">E</forename><surname>Métais</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Meziane</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Horacek</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</editor>
		<meeting><address><addrLine>Saarbrücken, Germany</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">June 24-26, 2020</date>
			<biblScope unit="volume">12089</biblScope>
			<biblScope unit="page" from="25" to="36" />
		</imprint>
	</monogr>
	<note>Proceedings</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A unified architecture for natural language processing: Deep neural networks with multitask learning</title>
		<author>
			<persName><forename type="first">R</forename><surname>Collobert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th international conference on Machine learning</title>
				<meeting>the 25th international conference on Machine learning</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="160" to="167" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Natural language processing (almost) from scratch</title>
		<author>
			<persName><forename type="first">R</forename><surname>Collobert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Karlen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kavukcuoglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">P</forename><surname>Kuksa</surname></persName>
		</author>
		<idno>CoRR abs/1103.0398</idno>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
	<note type="report_type">Computing Research Repository</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Improving feature extraction using a hybrid of cnn and lstm for entity identification</title>
		<author>
			<persName><forename type="first">E</forename><surname>Parsaeimehr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fartash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Torkestani</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11063-022-11122-y</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Process Lett</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="page" from="5979" to="5994" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
