<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AIResume: Automated generation of Resume Work History</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Janani Balaji</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Madhav Sigdel</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Phuong Hoang</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mengshu Liu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mengshu Liu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammed Korayem</string-name>
        </contrib>
        <aff>CareerBuilder, Georgia</aff>
      </contrib-group>
      <abstract>
        <p>Automatic text generation has redefined content creation in fields such as article/news summarization, chatbots, and virtual assistants. For a person on the job market, the resume is an important document that strongly influences their chances of landing a job. In this paper, we introduce AIResume (AIR), a tool that utilizes a vast knowledge base of resume entries and job descriptions to generate the work history portion of a person's resume with minimal input from the user. The system starts by suggesting personalized job titles based on the employer and then goes on to mine and present relevant work activities for the selected employer and job title.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        The advances in computing capability and connectivity have
made smartphones mainstream computing devices.
However, the smaller footprint of a mobile device limits its usability
as a primary text input/output device. Research has been conducted
on optimizing the data-entry experience by automating the text
generation and recommendation process [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ][
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Tools like
voice-based input and automatic text suggestions improve the user
experience of interacting with handheld devices.
      </p>
      <p>In the recruitment space, smartphones are increasingly being
used as the key devices for searching and applying for jobs. A
successful job search starts with a well-written, concise resume
that highlights the candidate's key credentials: education, work history
with job functions, skill set, certifications, etc. Nevertheless, at
CareerBuilder, we have observed that more than 30% of job
seekers do not have a resume to start with. This, combined with the
increased popularity of smartphones as job search tools, prompted
us to put Natural Language Generation (NLG) and Information
Extraction (IE) technology, together with the huge number of resumes and
job descriptions we have collected over the years, to use in helping
users construct their resumes.</p>
      <p>In this paper, we introduce AI-Resume (AIR), our resume
generation tool that helps construct the work history section of a resume with
minimal input from the user. The system comprises two parts:
personalized job title suggestion and personalized work activity
suggestion. The user starts the process by entering the name of
the company s/he wishes to enter in the resume. The system, in
response, recommends job titles that are personalized
to that company. These personalized job titles are generated by
collecting all the job titles associated with the chosen employer and
applying extensive cleaning and grouping strategies. Once the user
selects a job title, the user is presented with a set of work activities
that are common for the chosen company and job title. The work
activities are mined from several million user resumes and job
postings using language parsing and ranking techniques. A
screenshot of the tool is shown in Figure 1. The key contributions of this
paper can be summarized as:
• Present AIR, an automated tool that helps build the work
history section of a user resume.
• Define the challenges in recommending personalized titles
per employer and explain the methodology used to extract
relevant titles.
• Provide a framework to mine relevant work activities for a
given employer and job title from the database of resumes
and job postings.</p>
      <p>The rest of the paper is organized as follows. Section 2 gives an
overview of our system, Section 3 explains our
personalized job title generation strategy, and Section 4 elaborates on
our model to extract relevant work activities for a given company
and job title. Section 5 discusses the evaluation results, Section 6
reviews related research, and Section 7 concludes the paper
with directions for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>SYSTEM OVERVIEW</title>
      <p>Figure 2 provides an overview of the AIR system, which consists of three
main components: data sources, titles and activities extraction,
and user interaction.</p>
      <p>Data sources: Having been at the forefront of the recruitment
domain, at CareerBuilder we have hundreds of millions of resumes
and job postings collected over the years. We use both the
resume and the job posting data to mine for personalized job titles
and personalized work activities. We collected about 50 million
work histories and 10 million job descriptions to create
AIR.</p>
      <p>
        Job titles and activities extraction: The resumes and job
postings are often whole blobs of text, and a fair bit of pre-processing
is required to make the data usable. We use an in-house parsing
service [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] to process the resumes and job postings and extract only
the portions related to the work experience or job requirements.
Likewise, we use our in-house job title classification service, Carotene
[
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], and our employer name normalization [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] service to map
job titles and employer names to normalized encodings. Once
these pre-processing steps are done, we apply extensive cleaning,
extraction and clustering mechanisms to generate the job titles and
work activities.
      </p>
      <p>User interaction: The user interacts with the system by
providing an employer name, and the system suggests personalized
job titles. Once the user selects a job title, the system
suggests personalized work activities for the selected company and
job title. The user can select and edit the suggested work activities.
As such, the user can easily create a resume with a detailed work
experience section.</p>
    </sec>
    <sec id="sec-3">
      <title>PERSONALIZED JOB TITLES</title>
      <p>The goal of providing job titles personalized to an employer is to
mimic, as closely as possible, the process of creating a resume. Though
there are common job titles such as "Software Engineer" and
"Customer Service Representative", we have observed that
companies often prefer a personalized job title that reflects the
domain and the specific nature of the work. For example, though
commonly referred to as a "Cook" or a "Deli Worker", some companies
in the restaurant business choose to call their food prep workers
"Sandwich Artist", thereby giving a personalized touch to the
profession. It is also common to include specific departments within
job titles to differentiate the same position across different teams:
"Nurse - ICU" and "Nurse - Oncology" give proper context about
the nature of the duties performed, and such titles make more sense
on a resume. We help users identify the closest job title
they held in their employment by mining our database for job
titles associated with the employer and then performing targeted
cleaning and clustering to produce a list of job titles personalized to
an employer. This section explains the challenges and methodology
behind mining personalized job titles.</p>
    </sec>
    <sec id="sec-4">
      <title>Challenges</title>
      <p>The major challenge we faced in generating personalized job titles
was performing a targeted cleaning of the job titles extracted from
resumes and job postings. We have observed three different kinds of noise:</p>
      <p>Human Errors: Misspellings get introduced since human input
is involved. Common job titles are often expressed using
abbreviations, e.g., Mgr for Manager, SVP for Senior Vice President,
Legal Asst. for Legal Assistant. We want to resolve all variations to
the complete job title, as that is the form most widely used in professional
documents.</p>
      <p>Extraneous Information: Job titles in job descriptions are
more prone to having additional information appended to them,
such as the company name, the location, specific timing information
like the shifts involved, and whether the job
is part time or full time.</p>
      <p>Parsing Errors: These are obvious errors where sections of
the job posting other than the job title get passed in as the job title.
We have sometimes even seen the entire job posting
labeled as the job title.</p>
      <p>These are typical kinds of noise that are routinely dealt with in text
processing applications. What makes the cleaning process more
difficult in our scenario is that what counts as noise varies with
the kind of job title. For example, consider the job title
"Communications Officer - Part-time, No Benefits". In the context of this job title,
"Communications Officer" is the true job title, whereas the rest of
the text, "Part-time, No Benefits", is noise. In routine text
cleaning jobs, a list of stop words can be constructed to filter out
irrelevant information. However, in our case, it is not
straightforward, as the noise varies with context. Now consider the job title
"Aflac Benefits Professional". In this case, "Benefits" is a legitimate
word that needs to be included in the title, thus preventing the use
of a global list of stop words. Due to the wide range of job titles
available, constructing a comprehensive list of stop words for each
category was not feasible. Hence we devised a combination
of preliminary cleaning and clustering techniques to clean up the
job titles.</p>
    </sec>
    <sec id="sec-5">
      <title>Problem Definition</title>
      <p>Given a company C and a set of job titles J = {Ji}, i = 1, ..., m, the aim
of generating the personalized job titles is to find the set of titles
T ⊆ {Ti | Ti = σ(Ji)}, i = 1, ..., m, where σ represents a cleaning function that
eliminates extraneous noise in the job titles.</p>
    </sec>
    <sec id="sec-6">
      <title>Cleaning Methodology</title>
      <p>The process of mining personalized job titles follows a two-step
approach. First, a preliminary round of cleaning is performed to
remove the obvious errors. We then use a clustering technique to
group related job titles into clusters and choose a representative from
each cluster to denote the respective clean title.</p>
      <p>
        Preliminary Cleaning: For the preliminary cleaning, we
extract the raw job title Tr, the normalized employer id E, the normalized
Carotene code C and the normalized O*Net code O. We start off the
process by creating a set of general stop words S applicable across all
occupations. These contain common stop words like is, are, that,
was, and also job-title-specific ones like part time, seasonal, full
time. In order to handle the misspellings, abbreviations and short
forms that could be present, we trained a Word2Vec [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] model
and used the vector similarity to create a substitution dictionary from
common misspellings to their proper forms. We then tokenized the
job titles in our Carotene taxonomy into unigrams and bigrams to
create dictionaries of tokens in each Carotene code, denoted d1C and
d2C, respectively. Finally, we tokenized the entire job and resume
corpus into unigrams and bigrams and calculated their tf-idf scores
at the Carotene level, denoted tf-idf1C and tf-idf2C, respectively.
      </p>
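      <p>As an illustration of the substitution-dictionary step, the sketch below pairs surface-form similarity with corpus frequency to map rarer variants to their common form. The function name and threshold are ours, and difflib stands in for a trained Word2Vec model's neighbour lookup; this is not the production implementation.</p>

```python
from difflib import SequenceMatcher

def build_substitutions(neighbors, counts, threshold=0.8):
    """Map likely misspellings to their proper forms.

    `neighbors` maps a token to candidate variants (in the paper these
    come from Word2Vec vector similarity); `counts` gives corpus
    frequencies. A candidate is accepted when its surface form is close
    to the word and it is the rarer of the two.
    """
    subs = {}
    for word, cands in neighbors.items():
        for cand in cands:
            close = SequenceMatcher(None, word, cand).ratio() >= threshold
            if close and counts.get(cand, 0) < counts.get(word, 0):
                subs[cand] = word  # rewrite the rare variant to the common form
    return subs
```

      <p>With a trained gensim model, the neighbours could come from model.wv.most_similar(word); short abbreviations like "Mgr" fail the surface-similarity check and would need a separate expansion list.</p>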
      <p>We approached the preliminary cleaning as finding the best
substring of the given job title that is most relevant to the Carotene
category identified. Accordingly, we tokenized the raw job title
Tr into a set of tokens {Tri} and formed all possible contiguous
subsets of length ranging from 1 to n − 1, where n = ∥Tri∥. We then
scored each subset Ts and extracted the top k subsets for each title
as possible candidates. The scoring method is given below:</p>
      <p>S(Tsi) = (S1(Tsi) + S2(Tsi)) / 2n (1)</p>
      <p>S1(Tsi) = Σ_{j=1..∥Tsi∥} tf-idf1C(Tsij) · wsj (2)</p>
      <p>S2(Tsi) = Σ_{j=1..∥Tsi∥} tf-idf2C(Tsij) · wsj (3)</p>
      <p>wsj = 2.0 if Tsij ⊆ d1C ∪ d2C; −1.0 if Tsij ⊆ S; 1.0 otherwise (4)</p>
      <p>An example of the cleaning methodology is given in Figure 3. Once
the job titles undergo the preliminary cleaning, the top k subsets,
sorted in descending order by their scores, are selected for each title
to be subjected to the clustering process that follows.</p>
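      <p>The subset generation and scoring steps above can be sketched as follows. This is a simplified, unigram-only version of the scoring scheme, and the dictionaries passed in (tf-idf scores, Carotene title tokens, stop words) are assumed inputs.</p>

```python
def contiguous_subsets(tokens):
    """All contiguous token subsequences of length 1 to n-1."""
    n = len(tokens)
    return [tuple(tokens[s:s + length])
            for length in range(1, n)
            for s in range(n - length + 1)]

def score_subset(subset, tfidf, title_dict, stop_words):
    """Weighted tf-idf score of one candidate subset (unigrams only):
    weight 2.0 for tokens in the Carotene title dictionary, -1.0 for
    stop words, 1.0 otherwise, normalized by 2n as in Eq. (1)."""
    def weight(tok):
        if tok in title_dict:
            return 2.0
        if tok in stop_words:
            return -1.0
        return 1.0
    total = sum(tfidf.get(t, 0.0) * weight(t) for t in subset)
    return total / (2 * len(subset))
```

      <p>For a four-token title this yields nine candidate subsets, and noisy suffixes such as "part time" score negatively, pushing clean substrings to the top.</p>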
      <p>Title Clusters: The preliminary cleaning process removes most
of the added noise. However, given the wide range of job titles
present and their uneven distribution, we observed that it was
necessary to augment the preliminary cleaning with a clustering
strategy focused on each employer. The cleaning process described
above is employer agnostic. Though it takes into account the job
title classification, it does not capture all the
employer-specific vagaries. The clustering process that follows attempts to
form clusters of job titles from each company and choose a cluster
representative for each cluster to be presented as the set of
personalized job titles for the company. We started off by creating a
graph with pair-wise distances between raw titles within each
company. The distances were calculated as:</p>
      <p>
        d(Ti, Tj) = ∥Ti* ∩ Tj*∥ / ∥Ti* ∪ Tj*∥ if m* &gt; 1; lev(Ti, Tj) otherwise, (5)
where lev(Ti, Tj) denotes the Levenshtein [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] distance between Ti
and Tj, and Ti* and Tj* represent the tokens of Ti and Tj, respectively,
      </p>
      <p>and the minimum token length, m*, is defined as
m* = min(∥Ti*∥, ∥Tj*∥). (6)</p>
      <p>[Figure 3: examples of raw job titles (e.g., "$1000 Signon Bonus!** Seasonal Local
Truck Driver", "Truck Driver (no CDL required)", "Garbage Truck Driver - Class B
CDL required"), their clean subsets (e.g., "truck driver, local", "truck driver",
"garbage truck driver, truck driver"), and the resulting clean titles
"Truck Driver" and "Garbage Truck Driver".]</p>
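      <p>The pairwise measure of Eqs. (5) and (6) can be sketched directly, with a plain dynamic-programming Levenshtein distance; the function names are illustrative.</p>

```python
def levenshtein(a, b):
    """Standard edit distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def title_distance(t1, t2):
    """Eq. (5): token-overlap ratio when both titles have more than
    one token (m* > 1), character-level Levenshtein otherwise."""
    a, b = set(t1.split()), set(t2.split())
    if min(len(a), len(b)) > 1:
        return len(a & b) / len(a | b)
    return levenshtein(t1, t2)
```

      <p>Note that the token-overlap branch grows with similarity, so the clustering step must interpret the two branches on their own scales.</p>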
      <p>
        Once the graph was formed, we used Correlation Clustering [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
to form smaller clusters, each representing the same job title with minor
variations in representation. After the clusters are identified, we
choose a cluster representative for each cluster to denote the cleaned
job title of the cluster. For the cluster representative, we chose the
maximum common contiguous sub-sequence among all the subsets
that are part of the cluster. At the end of the clustering process,
we were able to generate a list of personalized job titles for each
company. An example of the clustering process and the cluster
representative selection is given in Figure 3.
      </p>
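      <p>Choosing the cluster representative as the maximum common contiguous token sub-sequence can be sketched as below; the helper names are ours, and ties or empty intersections would need extra handling in practice.</p>

```python
def contains_run(tokens, run):
    """True if `run` appears as a contiguous slice of `tokens`."""
    k = len(run)
    return any(tokens[i:i + k] == run for i in range(len(tokens) - k + 1))

def cluster_representative(titles):
    """Longest contiguous token run shared by every title in the cluster."""
    base = titles[0].split()
    for length in range(len(base), 0, -1):          # try longest runs first
        for start in range(len(base) - length + 1):
            run = base[start:start + length]
            if all(contains_run(t.split(), run) for t in titles[1:]):
                return " ".join(run)
    return titles[0]  # fallback: no run is shared by all members
```

      <p>On the Figure 3 example, a cluster of truck-driver variants collapses to "truck driver".</p>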
    </sec>
    <sec id="sec-7">
      <title>PERSONALIZED WORK ACTIVITIES</title>
      <p>Once the user selects a company and a job title from the
Personalized Job Title list (Section 3), the AIR tool generates a list of work
activities that are suitable for the selected Company-Job Title
combination. O*Net publishes a set of work activities that are common
for each occupational code 1. However, these are generic work
activities collected at the occupational code level, and they might be
too general for our use case. For example, the occupational code
15-1131 denotes "Computer Programmers" in general. However, several
specialized occupations like ".Net Programmer", "Computer Game
Programmer" and "Database Engineer" fall under the "Computer
Programmers" category. Since we are building a resume, our aim is
to provide a highly personalized set of job duties that are relevant
to the job title. Furthermore, we wanted to drill down the activities
to be company specific instead of only job title specific.</p>
    </sec>
    <sec id="sec-8">
      <title>Challenges</title>
      <p>We use an extraction-based approach to form the work activities
rather than a generative approach. Our strategy was to collect all
the resumes and job posting data for each company, group them
by Carotene (job title), extract relevant activities and finally rank
the activities for the selected Company and Carotene. As such, the
majority of our challenges were in extracting the work activities
from the available job descriptions.</p>
      <p>Parsing errors: To extract the work activities, we would ideally
want the source text to contain only the job duties section of a job posting
or the work experience section of a resume. The source data of job postings and
resumes contains various irrelevant sections such as the company
description, qualifications, and benefits. In addition, the input text may
contain personal information such as phone numbers, email
addresses, and URLs. Therefore, we not only need to extract the relevant
section, but also extract the phrases matching work activities and
eliminate the irrelevant phrases.
1 https://www.onetcenter.org/content.html/4.D</p>
      <p>Variability in the text: Since the job and resume data come
from millions of customers and users, there is a lot of variability
in the format, structure and writing of the text. Some characters
are lost due to character encoding mismatches. The text might be
in the form of bullet points with missing bullet point characters
and missing sentence boundaries, which need to be fixed for the
activity retrieval process to work well.</p>
    </sec>
    <sec id="sec-9">
      <title>Problem Definition</title>
      <p>Let D = {Di}, i = 1, ..., n, represent all the available job descriptions/resumes
for a given company C and normalized job title T. Likewise, let Si
represent the set of likely job duty phrases from job description Di.
The complete set S = ∪_{i=1..n} Si contains all the likely work duties for
the job description set D. Finally, the phrases in S are clustered and
ranked to obtain the most relevant and unique set of work activities
for the selected company and job title. This process is applied for every
company and job-title combination.</p>
      <p>[Figure 4: work activities extraction pipeline. Job postings and resumes are
parsed to extract job requirements (job parser) and work experience (CV parser),
followed by preprocessing and sentence segmentation, dependency parsing and
matching of activity POS patterns, and deduplication and ranking, producing the
personalized work activities.]</p>
    </sec>
    <sec id="sec-10">
      <title>Extraction Methodology</title>
      <p>
        The first step in generating the work activities for a given
Company-Job Title combination involves collecting the set of job
descriptions/resumes belonging to the Company and Job Title in question.
For each of the job descriptions/resumes thus collected, we employ
dependency parsing techniques and POS pattern matching to extract
relevant segments of the text that represent 'activities'. Finally, we
score and rank the extracted work activity phrases by their
relevance to the job title and select the top k work activities to be
presented to the user. We used the spaCy natural language
processing library [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for the dependency parsing and the part-of-speech
(POS) tagging because of its efficiency and reliability.
      </p>
      <p>Figure 4 shows the complete pipeline of our work activities
extraction process. We initially run the input text through our job/resume parser
service to extract the job requirement or work experience section.
We then apply some pre-processing steps and
extract phrase patterns matching work activities. This is followed by
a de-duplication and ranking step to recommend the most relevant
and unique set of work activities to the user.
4.3.1 Activities Extraction. Here we describe the steps for
extracting the work activities.</p>
      <p>Pre-processing and Sentence Segmentation: The source
resume or job text could be in HTML format. We use regular
expressions to remove the HTML tags. Similarly, if the source text contains
identifiers such as email addresses, phone numbers and URLs, we apply
corresponding regular expressions to the text and replace these with
their respective identifier tags. Later in the processing, we look for
these tags and eliminate the phrases containing such tags.</p>
      <p>Job content often comes in the form of bullet points, with no
defined sentence boundary markers. Oftentimes, the bullet point
characters are lost due to encoding mismatches. It is important to
identify and segment such lengthy text into different sentences
because the reliability of an NLP system is highly dependent on the
structure of the input data. We apply a heuristic approach to define
boundaries as follows:
• Add a sentence break after the preceding word if there is a
verb starting with an uppercase letter, e.g., customer satisfaction
Establish daily becomes customer satisfaction. Establish daily.
• Add a sentence break within words where lowercase is followed
by an uppercase verb, e.g., Coaching of employeesAnalyze call
volume becomes Coaching of employees. Analyze call volume.
• Identify bullet characters and replace them in the text with
full stops. We used part-of-speech (POS) tagging to get the
non-alphabetic characters preceding verbs and identified
bullets as the character group repeated the most.</p>
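      <p>The first two heuristics can be sketched with regular expressions as below; the small verb list stands in for the POS tagger used in practice and is purely illustrative.</p>

```python
import re

# Stand-in for POS tagging: a few verbs that commonly open resume bullets.
BULLET_VERBS = r"(?:Establish|Analyze|Manage|Coordinate|Develop|Prepare)"

def segment(text):
    """Insert sentence breaks before uppercase bullet verbs, then split."""
    # lowercase letter glued directly to an uppercase verb: "employeesAnalyze"
    text = re.sub(r"(?<=[a-z])(?=" + BULLET_VERBS + r"\b)", ". ", text)
    # lowercase word, a space, then an uppercase verb: "satisfaction Establish"
    text = re.sub(r"(?<=[a-z]) (?=" + BULLET_VERBS + r"\b)", ". ", text)
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
```

      <p>A tagger-driven version would derive the verb set per sentence instead of relying on a fixed list.</p>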
      <p>Extracting activity phrases: Here, we apply the previous steps
and get the list of sentences. We maintain a stop-verbs dictionary to
filter non-activity verbs such as modal verbs (have, must) and
continuity verbs (including, following). For each sentence, we look for
the first candidate activity verb not present in the filter dictionary.
We then get the dependency tree starting from this candidate verb
and extract the phrase starting from this verb and ending in a noun
or a pronoun. The final extracted phrase includes the tokens in the
tree starting from the candidate verb until a noun or pronoun, within
minimum and maximum token thresholds. We apply minimum and
maximum word count thresholds of 3 and 30, respectively. Setting
these thresholds allows us to get informative activities as well as to
reduce noise.</p>
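      <p>The verb-to-noun extraction step can be illustrated on POS-tagged input. The paper walks spaCy's dependency tree; this sketch walks the tagged sentence linearly, so treat it only as an approximation of the real traversal, with an illustrative stop-verb list.</p>

```python
# Illustrative stop-verb list; the production dictionary is larger.
STOP_VERBS = {"have", "must", "is", "are", "be", "including", "following"}

def extract_activity(tagged, min_len=3, max_len=30):
    """From [(token, POS), ...], return the phrase spanning the first
    non-stop verb through the last NOUN/PRON, if it fits the 3-30
    token window; otherwise None."""
    start = next((i for i, (tok, pos) in enumerate(tagged)
                  if pos == "VERB" and tok.lower() not in STOP_VERBS), None)
    if start is None:
        return None
    end = max((i for i, (_, pos) in enumerate(tagged[start:], start)
               if pos in ("NOUN", "PRON")), default=None)
    if end is None:
        return None
    phrase = [tok for tok, _ in tagged[start:end + 1]]
    return " ".join(phrase) if min_len <= len(phrase) <= max_len else None
```

      <p>With spaCy, `tagged` would come from a parsed Doc and the span from the candidate verb's subtree rather than a flat slice.</p>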
      <p>Post-processing: From the list of likely activity phrases, we
remove any phrases containing the identifier tags (email, phone,
URL) introduced in the pre-processing step, or tokens such as
qualifications, benefits and requirements, which commonly appear as section
headers in a job posting or a resume.</p>
      <p>Figure 5 provides an example input text and the output along
with the intermediate steps. In this example, the first likely activity
VERB is the word preparing. The dependency tree starting from
this word and ending in the noun/pronoun gives the text
"Preparing/dispensing medications" as the final output.</p>
      <p>[Figure 5: example extraction. Original text: "Significant part of this job is
preparing/dispensing medications &lt;p&gt;"; after pre-processing: "Significant part
of this job is preparing/dispensing medications ."; candidate activity verb:
preparing; extracted activity: "Preparing/dispensing medications".]</p>
      <p>4.3.2 Scoring, De-duplication and Ranking. Depending on the
number of records, the length of each text, and the matching of POS patterns,
we might end up with hundreds of likely work activities for each
company-job title combination. Hence, we used a scoring scheme
based on tf-idf to score each extracted segment and rank the segments
based on the score. In particular, we tokenized the segment into
the set of tokens {SegTok1, SegTok2, ..., SegTokn} and computed the
weighted tf-idf score as follows:</p>
      <p>SegScore = (1/n) Σ_{i=1..n} tf-idf(SegToki), (7)
where n is the number of tokens in the segment.</p>
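      <p>Eq. (7) is a plain average of per-token tf-idf weights; a minimal version, with the tf-idf table passed in as a dictionary:</p>

```python
def seg_score(tokens, tfidf):
    """Eq. (7): mean tf-idf over the segment's tokens (0.0 for unseen)."""
    return sum(tfidf.get(t, 0.0) for t in tokens) / len(tokens)
```
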
      <p>Since the activities are extracted from multiple descriptions,
there may be several activities with overlapping content. To present
the user with the most relevant and unique set of activities, we
perform de-duplication based on overlap similarity and connected
component labeling. For every pair of activities extracted for a
company and job description, we calculate the overlap similarity.
Let S1 and S2 be the sets of tokens from a pair of activities; we
calculate the overlap similarity between S1 and S2 as defined in (8):
s = ∥S1 ∩ S2∥ / min(∥S1∥, ∥S2∥). (8)</p>
      <p>We use a threshold (τ = 0.7) on the overlap similarity to perform
connected component labeling on the activity pairs. From each
connected component, we select the activity with the highest tf-idf
score. The selected activities are ranked in decreasing order of
their weights, and the top k are extracted to be presented to the user.</p>
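      <p>The de-duplication step can be sketched with the overlap measure of Eq. (8) and a small union-find standing in for a full connected-component labeling routine; the threshold mirrors the text, but the implementation details are our own.</p>

```python
def overlap(a, b):
    """Eq. (8): shared tokens normalized by the smaller token set."""
    s1, s2 = set(a.split()), set(b.split())
    return len(s1 & s2) / min(len(s1), len(s2))

def dedupe(activities, scores, tau=0.7):
    """Union activities whose overlap exceeds tau; keep the highest
    scoring member of each connected component."""
    parent = list(range(len(activities)))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for i in range(len(activities)):
        for j in range(i + 1, len(activities)):
            if overlap(activities[i], activities[j]) > tau:
                parent[find(i)] = find(j)
    best = {}
    for i in range(len(activities)):
        r = find(i)
        if r not in best or scores[i] > scores[best[r]]:
            best[r] = i
    return [activities[i] for i in sorted(best.values())]
```

      <p>Near-duplicates collapse to their highest-scoring member, after which the survivors are ranked and the top k shown to the user.</p>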
    </sec>
    <sec id="sec-11">
      <title>5 EXPERIMENTAL RESULTS</title>
    </sec>
    <sec id="sec-12">
      <title>5.1 Personalized Titles</title>
      <p>We evaluated the title personalization system by randomly sampling
raw titles and their corresponding clean titles across job categories.
We selected 405 job titles from resumes and 405 job titles from job
postings for our evaluation and manually evaluated the
corresponding clean titles. For each raw title, we extracted the top 5 clean
titles returned from the cluster containing the raw title. We
evaluated whether the top clean title returned for the cluster was an acceptable
clean representation, and whether an acceptable clean representation was
available in the top 5 results returned for the cluster.</p>
      <p>We then calculated the accuracy as acctopN = ∥Na∥/∥Nt∥,
where Na and Nt are the number of raw titles that had an acceptable
clean title version in their top N results and the total number of raw
titles evaluated, respectively. The results are given in Table 1.</p>
    </sec>
    <sec id="sec-13">
      <title>5.2 Personalized Work Activities</title>
      <p>To evaluate the effectiveness of our method, we randomly selected
900 descriptions (500 jobs and 400 resume profiles) corresponding
to the 30 most frequent Carotene categories. For each Carotene category, 2-3 companies
were selected, with 3-4 descriptions per company and Carotene. We
evaluated our method in two ways: programmatically and
manually.</p>
      <p>Programmatic evaluation: To evaluate the quality of the extracted
activities, we first wanted to determine the relevancy of the activities extracted
from each job description or resume input. The
objective here is to determine the relevancy of activities prior to
de-duplication and ranking. For this, we manually extracted all
relevant job activities from the 900 job and resume profiles. We then
applied the pre-processing, activity extraction and post-processing
stages of our model and obtained the list of likely work activities for
each input. We evaluated the string-to-string matching of system
activities and gold standard activities. Let Sg and Sa represent a pair
of work activities from the gold standard and system extraction lists,
and ng and na represent their respective word counts. We consider
the pair a match if either string is a sub-string of the other and
min(ng, na)/max(ng, na) &gt; 0.5. With this programmatic
evaluation, we achieve 75% precision and 55% recall. Note that this step
considers all system and manual activities without ranking. There
are hundreds of activities for each company-job title combination.
Since we are interested in listing only the top k (say, 20) activities, 55%
recall is an acceptable result. The number of false positives also decreases after
applying ranking, de-duplication, and top k activity selection.</p>
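      <p>The matching rule and the resulting precision/recall can be expressed directly; the helper names are ours.</p>

```python
def is_match(gold, system):
    """Pair matches when one string contains the other and the word
    count ratio min/max exceeds 0.5."""
    if gold not in system and system not in gold:
        return False
    ng, na = len(gold.split()), len(system.split())
    return min(ng, na) / max(ng, na) > 0.5

def precision_recall(gold, system):
    """Precision over system activities, recall over gold activities."""
    tp_sys = sum(any(is_match(g, s) for g in gold) for s in system)
    tp_gold = sum(any(is_match(g, s) for s in system) for g in gold)
    p = tp_sys / len(system) if system else 0.0
    r = tp_gold / len(gold) if gold else 0.0
    return p, r
```

      <p>The length-ratio condition prevents a short fragment from matching a much longer gold activity it is merely contained in.</p>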
      <p>Evaluating activities per company-Carotene: To measure
the accuracy of our extracted job activities from the end-user's
perspective, we ran our complete model of phrase extraction,
de-duplication and ranking on the above manually annotated 900
descriptions, comprising 206 company-Carotene combinations. The extracted
activities were then rated as Good, OK or Bad. Table 2 provides the
summary of our evaluation. Good indicates that all of the activities
were relevant and had good sentence constructs. OK represents
the group with mostly useful activities and a few irrelevant or badly
constructed phrases; users would find the recommended activities
useful even if they prefer to edit the few irrelevant ones. Lastly,
the Bad category is the group with a majority of irrelevant or badly
structured phrases. More than half of the results fall in the
Good category, and the Good and OK categories together account for 90%.</p>
    </sec>
    <sec id="sec-14">
      <title>RELATED RESEARCH</title>
      <p>AIR is a unique tool and, to the best of our knowledge, the first of its
kind. The Personalized Titles suggestion is at its core an
information extraction (IE) problem, where the goal is to extract the actual
job title from a noisy text. The Personalized Work Activities
recommendation can be linked to document summarization/information
extraction, though the precise nature of the output poses additional
challenges. This section briefly reviews past research in the
areas of information extraction, document summarization, authoring
assistant systems and the online recruitment space, highlighting the
differences and unique needs in our work.</p>
      <p>
        Information extraction deals with identifying specific data of
interest from natural-language. Classifying entities to pre-defined
categories of objects such as names, places, organizations is called
Named Entity Recognition (NER). Several works have dealt with
this in diferent domains [
        <xref ref-type="bibr" rid="ref21 ref23 ref9">9, 21, 23</xref>
        ]. Most of the NER studies use the
language model derived from complete sentences to identify entities
[
        <xref ref-type="bibr" rid="ref18 ref3">3, 18</xref>
        ]. Rule based systems like [
        <xref ref-type="bibr" rid="ref26 ref31">26, 31</xref>
        ] use a set of lexical rules to
extract entities. However, personalized titles suggestion involves
working with phrases and partial sentences, thereby making the
traditional NER methods unsuitable. The need for a well annotated
KB also makes it dificult to apply NER principles.
      </p>
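To make the contrast with rule-based extraction concrete, a toy set of lexical cleanup rules for recovering a job title from a noisy phrase might look like the sketch below. The patterns are purely illustrative assumptions and are not the rules AIResume uses.

```python
import re

# Hypothetical lexical rules for stripping noise from a raw title string.
# Each pattern removes one common kind of non-title text.
NOISE_PATTERNS = [
    r"\(.*?\)",                                   # parentheticals, e.g. "(contract)"
    r"\b(?:full|part)[- ]time\b",                 # employment-type qualifiers
    r"\b(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\w*\.?\s+\d{4}\b",  # dates
]

def clean_title(raw: str) -> str:
    """Apply the noise rules, then normalize whitespace and capitalization."""
    title = raw.lower()
    for pattern in NOISE_PATTERNS:
        title = re.sub(pattern, " ", title)
    title = re.sub(r"[,/|-]+", " ", title)        # drop leftover separators
    return " ".join(title.split()).title()
```

For example, `clean_title("Senior Software Engineer (Contract), Jan 2015")` yields `"Senior Software Engineer"`. Real resume text is far messier, which is exactly why such fixed rule sets fall short on phrases and partial sentences.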
      <p>
        Text summarization refers to the method of producing a
summary with the most important information present in the source
documents. Summarization can be single document vs multi-document
[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Single document summarization uses single large text source
input while multi-document summarization consists of several
documents with possibly overlapping content as the input. For example,
generating a new job summary using several relevant job
descriptions is a multi-document summarization problem. Summarization
systems can also be categorized as extractive vs abstractive
summarization [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The majority of prior work on summarization methods
relies on extracting sections from the text and ranking them for
potential inclusion in the summary. Various methods such as tf-idf and
position weighting [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], a recursive neural network approach [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
graph-based ranking [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and clustering and optimization methods
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have been proposed to determine sentence importance weights.
There are also studies on abstractive summarization using deep
learning based methods [
        <xref ref-type="bibr" rid="ref14 ref15 ref30">14, 15, 30</xref>
        ]. Our
Personalized Work Activities feature follows an extractive summarization approach.
      </p>
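As an illustration of extractive scoring in the spirit of the tf-idf and position-weighting approach of [27], the sketch below ranks candidate sentences by their summed tf-idf weight. It is a simplified stand-in under stated assumptions, not the ranking AIResume uses.

```python
import math
import re
from collections import Counter

def tfidf_sentence_scores(sentences):
    """Score each sentence by the sum of tf-idf weights of its words,
    treating each sentence as one 'document' for the idf statistic."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    n = len(tokenized)
    # document frequency: in how many sentences each word appears
    df = Counter(w for toks in tokenized for w in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        scores.append(sum(tf[w] * math.log(n / df[w]) for w in tf))
    return scores

def extract_summary(sentences, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    scores = tfidf_sentence_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Words shared by every sentence (e.g. "the") get idf of zero, so sentences with distinctive content words float to the top; a production ranker would add position weights and redundancy control on top of this.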
      <p>
        Several works in the literature describe authoring
assistant systems. The goal of such systems is to assist users in
writing content by providing relevant suggestions. In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the authors
describe a system to help review writers by suggesting review
topics and relevant details on those topics. Similarly, GhostWriter [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
uses case-based reasoning on existing product reviews to provide
content suggestions to review writers. The AIResume system has a
similar goal of suggesting the job titles and job duties to be filled in
a resume. As discussed in this paper, it utilizes information retrieval,
summarization, and ranking techniques to make personalized
recommendations. To our knowledge, there has been no prior work
on such a system.
      </p>
      <p>
        In the online recruitment space, machine learning techniques have
been successfully applied to job search, job recommendation, and
talent matching in various settings [
        <xref ref-type="bibr" rid="ref12 ref13 ref25">12, 13, 25</xref>
        ]. There have been a few
studies on automatically parsing different sections of a job posting or a
resume [
        <xref ref-type="bibr" rid="ref28 ref8">8, 28</xref>
        ]. These methods are very helpful for several downstream
automated applications. Having an intelligent system that suggests
activities based on the user's employer and job title allows
users to quickly build a detailed resume. This is what we aimed to
achieve with AIResume.
      </p>
    </sec>
    <sec id="sec-15">
      <title>CONCLUSION</title>
      <p>In this paper, we presented our novel AI-based resume builder
that helps job seekers build their resumes quickly and efficiently
on their mobile devices. AIResume leverages a large dataset of job
postings and resume profiles collected over several years and applies
natural language extraction, extensive filtering, and aggregation
techniques to generate recommended job titles and work activities
personalized to each company. Users find these job titles and work
activities helpful for the work experience section of their resumes.
This feature has been released in the CareerBuilder App Store and has
attracted great interest from users.</p>
      <p>While our current methods for AIResume provide a good
baseline system, there are several areas we would like to improve
to make the system more robust and useful. Currently, AIResume
only supports building the work history section of the resume.
Incorporating the other integral parts of the resume, such as
educational qualifications and skills, is still a work in progress. Some
non-activity phrases remain, stemming from irrelevant text and
noisy constructs. To this end, we plan to explore a classification
model to distinguish whether a phrase is a work activity. Likewise, we
would like to explore sentence revision techniques on the extracted
phrases to make the activities concise and informative.</p>
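One minimal starting point for such a phrase classification model is a token-level Naive Bayes scorer, sketched below under simplifying assumptions. This is an illustrative baseline only, not our planned design; the training phrases and labels are hypothetical.

```python
import math
from collections import Counter, defaultdict

class PhraseNB:
    """Tiny multinomial Naive Bayes for labeling phrases (e.g. activity vs. noise)."""

    def fit(self, phrases, labels):
        self.word_counts = defaultdict(Counter)   # label -> word -> count
        self.label_counts = Counter(labels)
        for phrase, label in zip(phrases, labels):
            self.word_counts[label].update(phrase.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, phrase):
        words = phrase.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_lp = None, float("-inf")
        for label, n in self.label_counts.items():
            lp = math.log(n / total)              # class prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                # Laplace smoothing handles words unseen for this label
                lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best_label, best_lp = label, lp
        return best_label
```

A real classifier for this task would use richer features (part-of-speech patterns, phrase length, verb leading the phrase), but even a bag-of-words baseline separates action-style phrases from contact lines and boilerplate surprisingly often.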
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Rasim M</given-names>
            <surname>Alguliyev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ramiz M</given-names>
            <surname>Aliguliyev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Nijat R</given-names>
            <surname>Isazade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Asad</given-names>
            <surname>Abdi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Norisma</given-names>
            <surname>Idris</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>COSUM: Text summarization based on clustering and optimization</article-title>
          .
          <source>Expert Systems</source>
          <volume>36</volume>
          ,
          <issue>1</issue>
          (
          <year>2019</year>
          ),
          <fpage>e12340</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Mehdi</given-names>
            <surname>Allahyari</surname>
          </string-name>
          , Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei,
          <string-name>
            <given-names>Elizabeth D</given-names>
            <surname>Trippe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Juan B</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Krys</given-names>
            <surname>Kochut</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A brief survey of text mining: Classification, clustering and extraction techniques</article-title>
          .
          <source>arXiv preprint arXiv:1707.02919</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Isabelle</given-names>
            <surname>Augenstein</surname>
          </string-name>
          , Leon Derczynski, and
          <string-name>
            <given-names>Kalina</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Generalisation in Named Entity Recognition</article-title>
          .
          <source>Comput. Speech Lang.</source>
          <volume>44</volume>
          ,
          <issue>C</issue>
          (July
          <year>2017</year>
          ),
          <fpage>61</fpage>
          -
          <lpage>83</lpage>
          . https://doi.org/10.1016/j.csl.2017.01.012
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Nikhil</given-names>
            <surname>Bansal</surname>
          </string-name>
          , Avrim Blum, and
          <string-name>
            <given-names>Shuchi</given-names>
            <surname>Chawla</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Correlation Clustering</article-title>
          .
          <source>Machine Learning</source>
          <volume>56</volume>
          ,
          <issue>1</issue>
          (
          <year>Jul 2004</year>
          ),
          <fpage>89</fpage>
          -
          <lpage>113</lpage>
          . https://doi.org/10.1023/B:MACH.0000033116.57574.95
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Derek</given-names>
            <surname>Bridge</surname>
          </string-name>
          and
          <string-name>
            <given-names>Paul</given-names>
            <surname>Healy</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>The GhostWriter-2.0 Case-Based Reasoning system for making content suggestions to the authors of product reviews</article-title>
          .
          <source>Knowledge-Based Systems</source>
          <volume>29</volume>
          (
          <year>2012</year>
          ),
          <fpage>93</fpage>
          -
          <lpage>103</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Ziqiang</given-names>
            <surname>Cao</surname>
          </string-name>
          , Furu Wei, Li Dong,
          <string-name>
            <given-names>Sujian</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ming</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Ranking with recursive neural networks and its application to multi-document summarization</article-title>
          .
          <source>In Twenty-Ninth AAAI Conference on Artificial Intelligence</source>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Ruihai</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kevin</given-names>
            <surname>McCarthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>O'Mahony</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Markus</given-names>
            <surname>Schaal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Barry</given-names>
            <surname>Smyth</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Towards an intelligent reviewer's assistant: recommending topics to help users to write better product reviews</article-title>
          .
          <source>In Proceedings of the 2012 ACM international conference on Intelligent User Interfaces. ACM</source>
          ,
          <fpage>159</fpage>
          -
          <lpage>168</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Shweta</given-names>
            <surname>Garg</surname>
          </string-name>
          , Sudhanshu S. Singh,
          <string-name>
            <given-names>Abhijit</given-names>
            <surname>Mishra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kuntal</given-names>
            <surname>Dey</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>CVBed: Structuring CVs usingWord Embeddings</article-title>
          .
          <source>In Proceedings of the Eighth International Joint Conference on Natural Language Processing</source>
          (Volume 2: Short Papers).
          <source>Asian Federation of Natural Language Processing</source>
          , Taipei, Taiwan,
          <fpage>349</fpage>
          -
          <lpage>354</lpage>
          . https://www.aclweb.org/anthology/I17-2059
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Phuong</given-names>
            <surname>Hoang</surname>
          </string-name>
          , Thomas Mahoney, Faizan Javed, and
          <string-name>
            <surname>Matt McNair</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>LargeScale Occupational Skills Normalization for Online Recruitment</article-title>
          .
          <source>AI Magazine</source>
          <volume>39</volume>
          ,
          <issue>1</issue>
          (Mar.
          <year>2018</year>
          ),
          <fpage>5</fpage>
          -
          <lpage>14</lpage>
          . https://doi.org/10.1609/aimag.v39i1.2775
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Honnibal</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ines</given-names>
            <surname>Montani</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing</article-title>
          . To appear (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Christina L.</given-names>
            <surname>James</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kelly M.</given-names>
            <surname>Reischel</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Text Input for Mobile Devices: Comparing Model Prediction to Actual Performance</article-title>
          .
          <source>In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '01)</source>
          .
          <fpage>365</fpage>
          -
          <lpage>371</lpage>
          . https://doi.org/10.1145/365024.365300
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Faizan</given-names>
            <surname>Javed</surname>
          </string-name>
          , Qinlong Luo,
          <string-name>
            <given-names>Matt</given-names>
            <surname>McNair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ferosh</given-names>
            <surname>Jacob</surname>
          </string-name>
          , Meng Zhao, and Tae Seung Kang
          .
          <year>2015</year>
          .
          <article-title>Carotene: A job title classification system for the online recruitment domain</article-title>
          .
          <source>In 2015 IEEE First International Conference on Big Data Computing Service and Applications</source>
          . IEEE,
          <fpage>286</fpage>
          -
          <lpage>293</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Krishnaram</given-names>
            <surname>Kenthapadi</surname>
          </string-name>
          , Benjamin Le, and
          <string-name>
            <given-names>Ganesh</given-names>
            <surname>Venkataraman</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Personalized job recommendation system at linkedin: Practical challenges and lessons learned</article-title>
          .
          <source>In Proceedings of the Eleventh ACM Conference on Recommender Systems. ACM</source>
          ,
          <fpage>346</fpage>
          -
          <lpage>347</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Piji</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Wai</given-names>
            <surname>Lam</surname>
          </string-name>
          , Lidong Bing, and
          <string-name>
            <given-names>Zihao</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Deep recurrent generative decoder for abstractive text summarization</article-title>
          .
          <source>arXiv preprint arXiv:1708.00625</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Linqing</given-names>
            <surname>Liu</surname>
          </string-name>
          , Yao Lu, Min Yang, Qiang Qu, Jia Zhu, and
          <string-name>
            <given-names>Hongyan</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Generative adversarial network for abstractive text summarization</article-title>
          .
          <source>In Thirty-Second AAAI Conference on Artificial Intelligence</source>.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Qiaoling</given-names>
            <surname>Liu</surname>
          </string-name>
          , Faizan Javed, and
          <string-name>
            <given-names>Matt</given-names>
            <surname>McNair</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>CompanyDepot: Employer name normalization in the online recruitment industry</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM</source>
          ,
          <fpage>521</fpage>
          -
          <lpage>530</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Soukoref</surname>
            <given-names>R.</given-names>
          </string-name>
          ,W. MacKenzie,
          <string-name>
            <surname>I. S.</surname>
          </string-name>
          <year>2002</year>
          .
          <article-title>Text entry for mobile computing: Models and methods, theory and practice</article-title>
          .
          <source>In Human-Computer Interaction</source>
          .
          <fpage>147</fpage>
          -
          <lpage>198</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Muhammad Kamran</given-names>
            <surname>Malik</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Urdu Named Entity Recognition and Classification System Using Artificial Neural Network</article-title>
          .
          <source>ACM Trans. Asian Low-Resour. Lang. Inf. Process</source>
          .
          <volume>17</volume>
          ,
          <issue>1</issue>
          ,
          Article 2 (Sept.
          <year>2017</year>
          ), 13
          pages. https://doi.org/10.1145/3129290
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Rada</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Graph-based ranking algorithms for sentence extraction, applied to text summarization</article-title>
          .
          <source>In Proceedings of the ACL Interactive Poster and Demonstration Sessions.</source>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Tomas</surname>
            <given-names>Mikolov</given-names>
          </string-name>
          , Kai Chen, Gregory S. Corrado, and
          <string-name>
            <given-names>Jefrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Eficient Estimation of Word Representations in Vector Space</article-title>
          .
          <source>CoRR abs/1301</source>
          .3781 (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>David</given-names>
            <surname>Miller</surname>
          </string-name>
          , Sean Boisen, Richard Schwartz, Rebecca Stone, and
          <string-name>
            <given-names>Ralph</given-names>
            <surname>Weischedel</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Named Entity Extraction from Noisy Input: Speech and OCR</article-title>
          .
          <source>In Proceedings of the Sixth Conference on Applied Natural Language Processing (ANLC '00)</source>
          .
          <article-title>Association for Computational Linguistics</article-title>
          , Stroudsburg, PA, USA,
          <fpage>316</fpage>
          -
          <lpage>324</lpage>
          . https://doi.org/10.3115/974147.974191
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Frederic P.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Agnes F.</given-names>
            <surname>Vandome</surname>
          </string-name>
          , and
          <string-name>
            <given-names>John</given-names>
            <surname>McBrewster</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <source>Levenshtein Distance: Information Theory, Computer Science, String (Computer Science), String Metric, Damerau-Levenshtein Distance, Spell Checker, Hamming Distance</source>
          . Alpha Press.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Raymond J.</given-names>
            <surname>Mooney</surname>
          </string-name>
          and
          <string-name>
            <given-names>Razvan</given-names>
            <surname>Bunescu</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Mining Knowledge from Text Using Information Extraction</article-title>
          .
          <source>SIGKDD Explor. Newsl.</source>
          <volume>7</volume>
          ,
          <issue>1</issue>
          (
          <year>June 2005</year>
          ),
          <fpage>3</fpage>
          -
          <lpage>10</lpage>
          . https://doi.org/10.1145/1089815.1089817
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Ani</given-names>
            <surname>Nenkova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kathleen</given-names>
            <surname>McKeown</surname>
          </string-name>
          , et al.
          <year>2011</year>
          .
          <article-title>Automatic summarization</article-title>
          .
          <source>Foundations and Trends® in Information Retrieval</source>
          <volume>5</volume>
          ,
          <issue>2-3</issue>
          (
          <year>2011</year>
          ),
          <fpage>103</fpage>
          -
          <lpage>233</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Chuan</given-names>
            <surname>Qin</surname>
          </string-name>
          , Hengshu Zhu, Tong Xu, Chen Zhu, Liang Jiang, Enhong Chen, and
          <string-name>
            <given-names>Hui</given-names>
            <surname>Xiong</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Enhancing Person-Job Fit for Talent Recruitment: An Ability-aware Neural Network Approach</article-title>
          .
          <source>In The 41st International ACM SIGIR Conference on Research &amp; Development in Information Retrieval (SIGIR '18)</source>
          . ACM, New York, NY, USA,
          <fpage>25</fpage>
          -
          <lpage>34</lpage>
          . https://doi.org/10.1145/3209978.3210025
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Ralf</given-names>
            <surname>Steinberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bruno</given-names>
            <surname>Pouliquen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Camelia</given-names>
            <surname>Ignat</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Using Language-independent Rules to Achieve High Multilinguality in Text Mining</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Yohei</given-names>
            <surname>Seki</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Sentence Extraction by tf/idf and position weighting from Newspaper Articles</article-title>
          . (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Swapnil</given-names>
            <surname>Sonar</surname>
          </string-name>
          and
          <string-name>
            <given-names>Bhagwan</given-names>
            <surname>Bankar</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Resume parsing with named entity clustering algorithm</article-title>
          . SVPM College of Engineering, Baramati, Maharashtra, India (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Melanie</given-names>
            <surname>Tosik</surname>
          </string-name>
          , Carsten Lygteskov Hansen, Gerard Goossen, and
          <string-name>
            <given-names>Mihai</given-names>
            <surname>Rotaru</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Word Embeddings vs Word Types for Sequence Labeling: the Curious Case of CV Parsing</article-title>
          .
          <source>In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. Association for Computational Linguistics</source>
          , Denver, Colorado,
          <fpage>123</fpage>
          -
          <lpage>128</lpage>
          . https://doi.org/10.3115/v1/W15-1517
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Mahmood</given-names>
            <surname>Yousefi-Azar</surname>
          </string-name>
          and
          <string-name>
            <given-names>Len</given-names>
            <surname>Hamey</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Text summarization using unsupervised deep learning</article-title>
          .
          <source>Expert Systems with Applications</source>
          <volume>68</volume>
          (
          <year>2017</year>
          ),
          <fpage>93</fpage>
          -
          <lpage>105</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Wajdi</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>RENAR: A Rule-Based Arabic Named Entity Recognition System</article-title>
          .
          <volume>11</volume>
          ,
          <issue>1</issue>
          , Article 2 (March
          <year>2012</year>
          ), 13 pages. https://doi.org/10.1145/2090176.2090178
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Yun</given-names>
            <surname>Zhu</surname>
          </string-name>
          , Faizan Javed, and
          <string-name>
            <given-names>Ozgur</given-names>
            <surname>Ozturk</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Document Embedding Strategies for Job Title Classification</article-title>
          .
          <source>In The Thirtieth International FLAIRS Conference</source>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>