<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>November</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Method of Detecting Cybersecurity Objects Based on OSINT Technology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmytro Lande</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olexander Puchkov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ihor Subach</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Information Recording of the National Academy of Sciences of Ukraine</institution>
          ,
          <addr-line>2, Mykoly Shpaka Str., Kyiv, 03013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”</institution>
          ,
          <addr-line>37, Prosp. Peremohy, Kyiv, 03056</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>16</volume>
      <issue>2022</issue>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>The information resources of the Internet contain a lot of hidden knowledge. This knowledge is contributed by users forming a kind of expert environment. In this regard, the main task of open source intelligence (OSINT) technologies is to identify and extract hidden expert knowledge, generalize it, and process it analytically. To achieve this, Text Mining methods, linguistic and statistical methods, and cluster analysis are used. The paper proposes a method of extracting concepts from the texts of messages of network sources related to the subject area of cybersecurity. These concepts are filtered according to statistical characteristics and ranking. A network of their relationships is created, clustered and visualized. The software implementation of the proposed approach uses the Perl programming language in a Linux environment, together with Gephi, a software tool for graph modeling, analysis, and visualization.</p>
      </abstract>
      <kwd-group>
        <kwd>OSINT</kwd>
        <kwd>cybersecurity objects</kwd>
        <kwd>time series</kwd>
        <kwd>concept extraction</kwd>
        <kwd>terms network</kwd>
        <kwd>web resources</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction </title>
      <p>Specialists working in a specific subject area usually know its main concepts and objects.
However, with the passage of time, new concepts and new objects emerge. In the field of
cybersecurity, various types of cyberattacks, hacker groups, destructive software, analytical groups,
etc., can become such objects. New meaningful connections between such objects may appear,
previous ones may disappear, which also requires additional analysis. In certain groups of objects, for
example, criminal hacker groups, the centers and objects of special attention of cybersecurity
specialists may shift. Thus, there is a task of constant information monitoring within the defined
subject area.</p>
      <p>
        Such information is widely available in social networks, on forums, the Internet (particularly,
documents posted on websites), to the content of which OSINT can be applied [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. OSINT is a branch of
intelligence concerned with searching for and collecting information from open sources,
analyzing it, and producing reports on
the object of surveillance [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The role of OSINT in ensuring cybersecurity is determined by a number
of aspects, including availability and efficiency, volume, quality, reliability, ease of further use, cost
of obtaining, etc.
      </p>
      <p>OSINT planning and preparation depend on effective information support: information about
the objects of information attacks and cyberattacks is obtained from open sources. The availability,
depth and scope of publicly available information make it possible to find the
necessary information without specialized intelligence means or additional
technical and human intelligence methods. The possibility of large-scale monitoring of
open sources in order to find targeted content, people and events makes it necessary
to use Big Data technologies, which are developing rapidly. In addition, access time
is sharply reduced. As experience shows, competently collected pieces of
information from open sources can, taken together, be equivalent to or even more significant than professional
intelligence reports.</p>
      <p>The objective of this work is to create and test a method for determining the main cybersecurity
objects and the connections between them based on the analysis of the content of the
web space, as well as the formation, clustering and analytical processing of the resulting networks of
cybersecurity objects and the analysis of object dynamics in the subject area. To achieve this goal, a
number of tasks are solved, in particular: targeted collection of information, its processing, extraction
of the necessary entities, establishing connections between them (that is, forming a network),
cluster analysis of the object network, and identification of the centers of the clusters.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Method description  </title>
      <p>A feature of this technique is the simplicity of its implementation when using a typical information
retrieval system and a system for analyzing and visualizing graph structures.</p>
      <p>The essence of the proposed method is as follows. Experts create queries, corresponding to the
subject area, to existing information retrieval systems. Processing these queries
yields large arrays of relevant documents. Named
entities (objects) belonging to different periods of time are extracted from the selected arrays.
Subsequently, the interconnections of the objects are studied through network analysis, and individual
clusters are determined.</p>
      <p>Fig. 1 shows the main stages (chains) of the method: 1) obtaining information; 2)
extraction of concepts (cybersecurity objects); 3) filtering concepts with the involvement of experts
(or artificial intelligence tools); 4) formation of a network of cybersecurity objects; 5) analysis
(including clustering) and visualization of this network; 6) visualization of the dynamics of the
appearance of concepts in time.</p>
      <p>The method proposed in this paper for selecting named entities corresponding to cybersecurity
objects, identifying connections between them, and studying the dynamics of the identified named entities in
information flows involves a number of stages.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1. Information collection</title>
      <p>
        At the first stage of the method, an information array of documents relevant
to the topic is formed. For this purpose, existing information retrieval systems should be used, both public ones and
corporate content monitoring systems such as the Cyber Aggregator system [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], should be used.
      </p>
      <p>
        The Cyber Aggregator system collects news from 12 social networks and
gives users access to it in search mode. Relevant information can also be downloaded in
RSS format [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>Like most similar systems for aggregating information from social networks, the Cyber Aggregator
system consists of three main parts: a server for collecting and primary processing of information, an
information retrieval server (search engine), and an interface server that provides the service
to users and other systems through an API.</p>
      <p>Aggregation of information from social networks includes the following steps:
1) searching social networks for messages related to a common broad topic, that is, forming an
information flow of thematic messages;
2) determining the language of individual messages downloaded from social networks;
3) extracting concepts such as keywords, persons, companies and
geographical names from the messages;
4) sentiment analysis of individual messages;
5) data formatting and conversion to standard formats (XML, JSON);
6) loading the resulting stream into full-text databases.</p>
      <p>The Cyber Aggregator system provides the user with a web interface through which the
functions of information search and analysis can be accessed.</p>
      <p>On request, the user receives documents from the retrospective database (Search) and
from the current information flow (Current), and can also run data analysis (Analysis).</p>
      <p>As a result of a query search (Fig. 2), the user is provided with a list of relevant message titles with
links to the full texts of these messages in the system, as well as to these messages in social networks.</p>
      <sec id="sec-3-1">
        <title>Figure 2: A fragment of the user interface in search mode</title>
        <p>To obtain an information array of publications on cybersecurity, a processing period must be
chosen and a thematic query submitted to such a system. For example, the following query was used for
information selection:</p>
        <p>"cybersecurity | cyberattack"</p>
        <p>To simplify the extraction of named entities, requests in Cyrillic are used within the framework of
the suggested method.</p>
        <p>If a user finds documents that are relevant to their search query, they can save the query for future
use by selecting the 'Add Request' command.</p>
        <p>You can later display the found messages in RSS format (with subsequent loading of these results
into the so-called RSS aggregators on an ongoing basis), as well as display search results with details
on a geographical map, scalable both automatically and through settings.</p>
        <p>Processing such a query yields an array of relevant documents, which is
subjected to further processing.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.2. Information extraction</title>
      <p>At the second stage, on the basis of linguistic and statistical analysis, concepts from the subject
area contained in the documents of the information array obtained at the first step are extracted. The
main idea of recognizing named entities – cybersecurity objects is that nowadays most new concepts
in Cyrillic messages are denoted by Latin letters (non-Cyrillic short phrases in the information array
are taken into account), or by Cyrillic letters but in quotation marks.</p>
      <p>The peculiarity of the given method is the simplicity of its implementation when using a typical
information search system and a system of graph structures analysis and visualization.</p>
      <p>
        Usually [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], the detection of named entities (Named-entity recognition, NER) is carried out with
the help of special software libraries (spaCy, Flair, FastText), the common disadvantages of which are
the low speed of extracting concepts and the need for a complex stage of system training (it is known
that names of cybersecurity objects are not always typical company or brand names).
      </p>
      <p>Using network information written in non-Latin scripts (Ukrainian, Russian, Chinese, or other
languages) greatly simplifies the task of extracting cybersecurity objects, such as hacker groups and
analytical centers, whose names are mostly written in Latin script.</p>
      <p>
        In particular, the spaCy library is interesting in that several pre-trained models are available in
about 20 languages [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This means that in many cases it is not necessary to train your model to
extract entities. The spaCy library is considered a "production class" framework because it is very
fast, reliable, and comes with comprehensive documentation. Another popular Python entity detection
framework is the Flair library [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which is based on the PyTorch deep learning framework. It is
gaining a lot of popularity as it achieves higher precision in many languages compared to spaCy.
However, the increase in accuracy comes at the cost of speed reduction.
      </p>
      <p>Within the framework of this work, a new heuristic approach is proposed. The
main idea of recognizing named entities (cybersecurity objects) is that, at present, most
names of cybersecurity objects, such as hacker groups and analytical centers,
mentioned in social network messages written in non-Latin scripts (Ukrainian, Russian, Chinese,
etc.) are given in Latin script (non-Cyrillic short phrases in the information array are
taken into account) or in Cyrillic letters, but in quotation marks. This greatly simplifies the extraction
task: in these cases, it is sufficient to detect short words or phrases in Latin script or in quotation
marks. The technical solution of such a problem does not require large resource and time
costs (compared with, for example, spaCy) and does not require dedicated machine learning.</p>
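      <p>The heuristic described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation (which, according to the abstract, is written in Perl): the regular expressions, the limit of three Latin tokens, the 40-character limit for quoted phrases, the function name extract_candidates and the sample message are all hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Assumption: a "short phrase" is one to three Latin-script tokens.
LATIN_PHRASE = re.compile(
    r"[A-Za-z][A-Za-z0-9_-]*(?: [A-Za-z0-9][A-Za-z0-9_-]*){0,2}"
)
# Assumption: a quoted Cyrillic name is at most 40 characters long.
QUOTED_CYRILLIC = re.compile(
    r'[«"“]([\u0400-\u04FF][\u0400-\u04FF0-9 -]{0,39})[»"”]'
)


def extract_candidates(text):
    """Return candidate object names found in a Cyrillic message."""
    # Latin-script words or short phrases embedded in the Cyrillic text.
    candidates = [m.group(0) for m in LATIN_PHRASE.finditer(text)]
    # Cyrillic names that the author of the message put in quotation marks.
    candidates += [m.group(1) for m in QUOTED_CYRILLIC.finditer(text)]
    return candidates


message = "Хакери з угруповання Fancy Bear та групи «Армагеддон» атакували сервери."
print(extract_candidates(message))
```
]]></preformat>
      <p>The candidates produced this way still include ordinary Latin-script words (product names, abbreviations), which is why the frequency sorting and expert filtering of the next stage remain necessary.</p>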
      <p>At the same time, a dictionary of known named entities of cybersecurity objects is used:
its entries are searched for in the information array to extract the entities that are already known.</p>
    </sec>
    <sec id="sec-5">
      <title>2.3. Information filtering</title>
      <p>At the third stage, the selected concepts are sorted by frequency and filtered by an expert.
Usually, the number of candidate cybersecurity objects detected by this method does not
exceed several thousand, so this operation does not take much time.</p>
    </sec>
    <sec id="sec-6">
      <title>2.4. Objects network formation</title>
      <p>
        At the fourth stage, a network of selected concepts is formed [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. For this, undirected connections
between concepts are defined. Connections can be established in different ways. In
particular, two concepts can be considered connected if they occur in the same segment of a
document (sentence, paragraph, window of N words, or the entire document) from the selected
information array. Alternatively, connections can be calculated as cross-correlations between the time series of
daily frequencies of occurrence of individual named entities.
      </p>
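      <p>The first option, linking two concepts that occur in the same segment, can be sketched as follows. This is an illustrative sketch only: the sentence is used as the segment, sentences are split naively on punctuation, entities are matched as plain substrings, and the function name cooccurrence_edges and the sample documents are assumptions rather than part of the described system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents, entities):
    """Count, for every pair of entities, the sentences that mention both."""
    edges = Counter()
    for doc in documents:
        # Naive sentence segmentation on '.', '!' and '?' (an assumption).
        for sentence in re.split(r"[.!?]+", doc):
            # Plain substring matching; sorting makes each pair order-independent.
            present = sorted(e for e in entities if e in sentence)
            for pair in combinations(present, 2):
                edges[pair] += 1
    return edges


docs = [
    "Sandworm is linked to BlackEnergy. Telebots reused the same tools.",
    "Researchers tie Telebots to Sandworm and BlackEnergy.",
]
names = ["Sandworm", "BlackEnergy", "Telebots"]
for (a, b), weight in sorted(cooccurrence_edges(docs, names).items()):
    print(a, b, weight)
```
]]></preformat>
      <p>Because each pair is stored in sorted order, the resulting edge list is undirected; the counts can serve as link weights.</p>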
      <p>
        The proposed method associates each named entity (a concept from the subject area of
cybersecurity) with a dynamics vector corresponding to the distribution of documents by date (day).
More specifically, each day is assigned the number of occurrences of the concept in
publications covered by the content monitoring system. The dimension of this vector equals
the number of days in the time interval over which the array of network publications
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] was analyzed. Examples of time series corresponding to entities (names of criminal
cybergroups) are shown in Fig. 3a and 3b.
      </p>
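      <p>Building such a dynamics vector amounts to counting mentions per day. A minimal sketch, assuming the publication dates of the messages that mention one entity are already available as a list; the function name dynamics_vector and the sample dates are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from datetime import date, timedelta


def dynamics_vector(mention_dates, start, n_days):
    """Return the number of mentions on each of n_days consecutive days."""
    counts = Counter(mention_dates)
    # Days without mentions get 0, so every vector has the same dimension.
    return [counts[start + timedelta(days=k)] for k in range(n_days)]


# Hypothetical publication dates of messages that mention one entity.
mentions = [date(2022, 3, 1), date(2022, 3, 1), date(2022, 3, 3)]
print(dynamics_vector(mentions, date(2022, 3, 1), 4))  # [2, 0, 1, 0]
```
]]></preformat>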
      <p>Figure 3a: Graph of the number of messages per day. Query: Cozy Bear</p>
      <p>Figure 3b: Graph of the number of messages per day. Query: Fancy Bear</p>
      <p>To form a correlation network, the following steps are performed: 1) for each entity, a
query to the content monitoring service (in our case, the Cyber Aggregator system) is generated.
The analysis period, which sets the dimension of the corresponding time series (dynamics
vectors), is also determined; 2) as a result of query execution, a set of dynamics vectors corresponding to the given
named entities (concepts) is obtained; 3) the set of maximum cross-correlations between the
obtained vectors is calculated, and the corresponding correlation matrix is formed with elements:
        <disp-formula>
          <label>(1)</label>
          <tex-math>a_{ij} = \max_{m} \sum_{k=1}^{n-m} w_{k+m}^{i} \, w_{k}^{j}.</tex-math>
        </disp-formula>
      </p>
      <p>Each entity <inline-formula><tex-math>s_k</tex-math></inline-formula> from the set <inline-formula><tex-math>S = \{s_k\}_{k=1}^{|S|}</tex-math></inline-formula> is assigned a vector parameter value <inline-formula><tex-math>w^{k} = (w_{1}^{k}, w_{2}^{k}, \ldots, w_{n}^{k})</tex-math></inline-formula>,
where <inline-formula><tex-math>n = |G|</tex-math></inline-formula> is the number of elements in the parameter set. The max function is used because
processes that are similar in nature can behave similarly over time, but possibly with
a time shift; 4) the adjacency matrix is formed in accordance with formula (1) and saved
to a file in CSV format. Because the adjacency matrix contains links between all pairs of nodes,
links whose value is less than a selected threshold are ignored. The choice of this
threshold depends entirely on the experience of the analysts.</p>
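      <p>Steps 3 and 4 can be illustrated with the short sketch below. It computes the lagged sum of formula (1) for shifts m from 0 to max_shift, discards links below the threshold, and writes the rest as semicolon-separated CSV. Several points are assumptions not stated in the text: the range of shifts, taking the larger of the two directions so that the link is undirected, the column names of the CSV file, and the sample vectors; the function names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import csv
import io
from itertools import combinations


def max_cross_correlation(w_i, w_j, max_shift):
    """Formula (1): largest lagged dot product of two equal-length vectors."""
    n = len(w_i)
    return max(
        sum(w_i[k + m] * w_j[k] for k in range(n - m))
        for m in range(max_shift + 1)
    )


def link_weight(w_i, w_j, max_shift):
    """Assumption: use the larger of both directions to get an undirected link."""
    return max(
        max_cross_correlation(w_i, w_j, max_shift),
        max_cross_correlation(w_j, w_i, max_shift),
    )


def thresholded_links(vectors, max_shift, threshold):
    """Return (source, target, weight) for links that reach the threshold."""
    links = []
    for i, j in combinations(sorted(vectors), 2):
        weight = link_weight(vectors[i], vectors[j], max_shift)
        if weight >= threshold:
            links.append((i, j, weight))
    return links


# Hypothetical daily mention counts; B repeats the burst of A one day later.
vectors = {"A": [0, 5, 1, 0, 0], "B": [0, 0, 5, 1, 0], "C": [1, 0, 0, 0, 1]}
out = io.StringIO()
writer = csv.writer(out, delimiter=";")
writer.writerow(["Source", "Target", "Weight"])  # assumed column names
writer.writerows(thresholded_links(vectors, max_shift=2, threshold=10))
print(out.getvalue())
```
]]></preformat>
      <p>In this example only the pair A, B survives the threshold, because the one-day shift aligns their bursts; without the maximum over shifts their weight would be much lower.</p>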
      <p>Compared to existing approaches, the method proposed in this paper has several advantages. First,
it uses intuitive rules to determine the weight of nodes and links, which closely reflect real-world
dynamics. Second, it has a reliable mathematical basis for correlation analysis. Third, it takes into
account previously unused parameters, such as time series of the dynamics of publications, to group
entities according to their development trends over time. Finally, the method is objective and
relatively easy to implement.</p>
    </sec>
    <sec id="sec-7">
      <title>2.5. Network Clustering, Visualization</title>
      <p>
        At the fifth stage, the resulting network is clustered with the modularity algorithm, the
objects that act as cluster centers are identified, and the network is visualized using
the Gephi system [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] (Fig. 4).
      </p>
      <p>In Fig. 4, entities related to military units 21165, 71330 and 74455 of the Main Directorate of the General Staff of the Armed Forces of the
Russian Federation (GRU) are marked in blue
and green; cybersecurity subjects related to the Federal Security Service of
the Russian Federation (FSB) are marked in cornflower blue and orange; cybersecurity entities related to the Foreign Intelligence
Service of the Russian Federation (SVR) are marked in purple.</p>
      <p>
        Gephi [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ] is a widely used program for visualization and analysis of networks
and graphs ("network graphs"). Gephi provides fast layout, efficient filtering, and interactive data
exploration, and it is well suited to visualizing large-scale networks. The main
way to import graph data from an external file is to load the initial network data in CSV format,
in which the elements are separated by semicolons. For arranging the nodes of large and dense networks,
Gephi supplies efficient force-directed layout modules such as Yifan Hu.
In particular, the Yifan Hu algorithm works well when applied after
other, faster and coarser algorithms. Most of the layout methods offered by Gephi run
within a reasonable time; a combination of, for example, OpenOrd and Yifan Hu gives
high-quality visuals.
      </p>
      <sec id="sec-7-1">
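        <table-wrap>
          <table>
            <thead>
              <tr><th>Object</th><th>Node degree</th></tr>
            </thead>
            <tbody>
              <tr><td>REvil</td><td>7</td></tr>
              <tr><td>Sodinokibi</td><td>3</td></tr>
              <tr><td>DarkSide</td><td>2</td></tr>
              <tr><td>Armageddon</td><td>26</td></tr>
              <tr><td>UAC-0010</td><td>16</td></tr>
              <tr><td>Gamaredon</td><td>13</td></tr>
              <tr><td>Cozy_Bear</td><td>12</td></tr>
              <tr><td>Nobelium</td><td>10</td></tr>
              <tr><td>Buhtrap</td><td>5</td></tr>
              <tr><td>Shuckworm</td><td>2</td></tr>
              <tr><td>APT_29</td><td>1</td></tr>
              <tr><td>Sandworm</td><td>25</td></tr>
              <tr><td>Conti</td><td>12</td></tr>
              <tr><td>UAC-0082</td><td>11</td></tr>
              <tr><td>BlackEnergy</td><td>10</td></tr>
              <tr><td>Telebots</td><td>9</td></tr>
              <tr><td>Voodoo_Bear</td><td>9</td></tr>
              <tr><td>Iron_Viking</td><td>8</td></tr>
              <tr><td>Sandworm_Team</td><td>7</td></tr>
              <tr><td>UAC-0113</td><td>2</td></tr>
              <tr><td>Fancy_Bear</td><td>23</td></tr>
              <tr><td>Strontium</td><td>21</td></tr>
              <tr><td>Pawn_Storm</td><td>11</td></tr>
              <tr><td>Primitive_Bear</td><td>10</td></tr>
              <tr><td>Sednit</td><td>9</td></tr>
              <tr><td>Actinium</td><td>9</td></tr>
              <tr><td>Bromine</td><td>8</td></tr>
              <tr><td>Turla</td><td>7</td></tr>
              <tr><td>Energy_Bear</td><td>7</td></tr>
            </tbody>
          </table>
        </table-wrap>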
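        <p>The adjacency matrix <inline-formula><tex-math>A</tex-math></inline-formula> consists of elements <inline-formula><tex-math>A_{vw}</tex-math></inline-formula> whose values are equal to 0 if node v is not
connected to node w, and to the weight of the connection between v and w if these nodes are
connected to each other.</p>
        <p>The modularity of the network can be expressed by the formula:
          <disp-formula>
            <label>(2)</label>
            <tex-math>Q = \frac{1}{2m} \sum_{v,w} \left[ A_{vw} - \frac{k_v k_w}{2m} \right] \delta(c_v, c_w),</tex-math>
          </disp-formula>
where <inline-formula><tex-math>A_{vw}</tex-math></inline-formula> is the element of the adjacency matrix <inline-formula><tex-math>A</tex-math></inline-formula>, <inline-formula><tex-math>m</tex-math></inline-formula> is the number of edges in the graph, <inline-formula><tex-math>k_v</tex-math></inline-formula> and <inline-formula><tex-math>k_w</tex-math></inline-formula> are
the degrees of nodes v and w respectively, and <inline-formula><tex-math>\delta</tex-math></inline-formula> is Kronecker's delta, which indicates whether nodes v and w
are in the same module (<inline-formula><tex-math>c_v = c_w</tex-math></inline-formula>).</p>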
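        <p>Formula (2) can be checked on a small example. The sketch below only evaluates Q for a partition that is already given; it does not search for the best partition, which in this work is done by Gephi. The function name modularity and the test graph (two triangles joined by one edge) are illustrative assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def modularity(adjacency, communities):
    """Evaluate formula (2) for a symmetric adjacency matrix (list of lists)."""
    degrees = [sum(row) for row in adjacency]
    two_m = sum(degrees)  # equals 2m for an undirected graph
    q = 0.0
    for v, row in enumerate(adjacency):
        for w, a_vw in enumerate(row):
            if communities[v] == communities[w]:  # Kronecker delta
                q += a_vw - degrees[v] * degrees[w] / two_m
    return q / two_m


# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(round(modularity(adjacency, [0, 0, 0, 1, 1, 1]), 3))  # 0.357
```
]]></preformat>
        <p>Splitting the graph into its two triangles gives Q = 5/14, whereas putting all six nodes into one module gives Q = 0, which matches the intuition that modularity rewards partitions with dense modules and few links between them.</p>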
        <p>The results of the network analysis in the given example indicate the affiliation of the considered
hacker groups to the special services of the Russian Federation, namely: the GRU, the SVR and the
FSB. Currently, some of the most well known Russian-affiliated hacker groups include Fancy Bear,
Cozy Bear, Turla, Sandworm, and Berserk Bear.</p>
        <p>Based on the selected information arrays, named entities from the cybersecurity field related to
different time periods are extracted.</p>
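        <p>Consider a set of documents related to a certain topic:
          <disp-formula><tex-math>D = \{d_1, d_2, \ldots, d_n\},</tex-math></disp-formula>
where the indexes of the documents <inline-formula><tex-math>i = 1, \ldots, n</tex-math></inline-formula> correspond to time (in particular, days).</p>
        <p>Let us denote the set of named entities <inline-formula><tex-math>F = \{f_1, f_2, \ldots, f_m\}</tex-math></inline-formula>, where the entities are
indexed by <inline-formula><tex-math>j = 1, \ldots, m</tex-math></inline-formula>.</p>
        <p>We denote the entity extraction function for day i as:
          <disp-formula><tex-math>Ex(d_i) = \{f_{1i}, f_{2i}, \ldots, f_{mi}\},</tex-math></disp-formula>
where <inline-formula><tex-math>f_{ji}</tex-math></inline-formula> is the frequency of mention of entity j on day i.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>2.6. Dynamics Visualization</title>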
      <p>To visualize the dynamics of the appearance of a set of named entities <inline-formula><tex-math>\{f_{1i}, f_{2i}, \ldots, f_{mi}\}</tex-math></inline-formula>
over n days, a special form of visual display of entities by date, a phraseology diagram
(Ph-Di), is proposed. The cells of this diagram are filled with the numerical values <inline-formula><tex-math>f_{ji}</tex-math></inline-formula>,
the frequencies of appearance of named entities on the corresponding dates. That is, the
columns of this table correspond to dates, while the rows correspond to named entities, which can be
used as a kind of meaningful information flow filter.</p>
      <p>In fact, the diagram is a two-dimensional projection of a set of time series of the dynamics of the
relevant information flows, similar to those shown in Fig. 3.</p>
      <p>The proposed Ph-Di diagram is presented as a table, with cells colored in varying shades according
to the number of publications on the selected object per day. In this chart, a lighter shade corresponds
to a higher value. For a relatively small number of rows (named entities), such diagrams make it possible
to visually distinguish groups of similar objects by date and intensity of publication without
additional processing.</p>
      <p>When constructing a diagram, rows may be rearranged (regrouping of named entities). For further
clustering, it is suggested to form a relations network of named entities (connections based on the
correlation of time series) and to highlight groups that are most interconnected and distant from each
other (identify cliques).</p>
      <p>
        Later on, the dynamics of mentions of these objects is studied; a form of visual representation of
the information flow in the section of objects and dates is offered, which is a rectangular table, the
cells of which are filled with numerical values corresponding to the frequency of appearance of the
objects names in the information flow in the dates section [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The considered approach can be used
to solve the problems of analysis and visualization of the objects distribution for any selected
information arrays in terms of issues that are of interest to the researchers and have a significant time
frame.
      </p>
      <p>Fig. 5 shows a Ph-Di diagram for concepts relevant to the subject area of cybersecurity. In the
diagram, the vertical dimension corresponds to the subjects of cybersecurity, and the horizontal
dimension corresponds to the dates of publications about them. The color of each cell (dot)
corresponds to the number of messages per day about the corresponding cybersecurity
subject: light shades correspond to larger values, dark shades to smaller ones. Horizontal
light streaks in the diagram correspond to the periods of activity of the corresponding subject in social
networks.</p>
      <p>Figure 5: A diagram corresponding to the activity of cybersecurity objects </p>
      <p>In practice, the form is implemented as an HTML file using the language. In this diagram, bright
horizontal lines (high frequency of individual named entities during a certain period) can clearly tell
the user about trends and high activity of individual objects in the information field of the Internet.</p>
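      <p>A Ph-Di style table can be generated with a few lines of code. The following is a minimal sketch, not the authors' software: it assumes the daily frequencies are already available as a mapping from entity name to a list of counts, uses grey levels so that larger counts give lighter cells, and the function name phdi_html and the sample data are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from html import escape


def phdi_html(frequencies, dates):
    """Render entity-by-date counts as an HTML table; larger counts are lighter."""
    peak = max(max(row) for row in frequencies.values()) or 1
    header = "".join(f"<th>{escape(d)}</th>" for d in dates)
    lines = ["<table>", f"<tr><th></th>{header}</tr>"]
    for name, row in frequencies.items():
        cells = []
        for count in row:
            level = round(255 * count / peak)  # 0 = black, 255 = white
            cells.append(
                f'<td style="background: rgb({level},{level},{level})" '
                f'title="{count}"></td>'
            )
        lines.append(f"<tr><th>{escape(name)}</th>{''.join(cells)}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical counts for two entities over two days.
sample = {"Cozy_Bear": [0, 4], "Turla": [2, 0]}
print(phdi_html(sample, ["2022-03-01", "2022-03-02"]))
```
]]></preformat>
      <p>Each row corresponds to one named entity and each column to one date, so a run of light cells in a row shows a period of high activity for that entity.</p>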
    </sec>
    <sec id="sec-9">
      <title>3. Conclusions</title>
      <p>To sum up, a method is proposed for identifying the named entities of cybersecurity objects
in documents and for analyzing the relationships and dynamics of objects in the subject area.
This method takes into account the hidden knowledge contributed by the expert network
environment. It relies on Internet documents written in non-Latin scripts,
in which most names of cybersecurity objects, such as hacker groups and analytical centers,
are nevertheless written in Latin script. Taking this fact into account significantly simplifies the
task of extracting named entities and speeds up the solution of the problem.</p>
      <p>To implement the proposed method: 1) a set of initial queries to existing
information retrieval systems was created; 2) software for extracting the
necessary fragments from the selected documents was developed; 3) object extraction software based on a heuristic
model was developed; 4) software for forming correlation networks of interconnected
objects, visualizing them, and performing cluster analysis was adapted; 5) Ph-Di visualization software was
developed.</p>
      <p>The results of content monitoring of Internet resources and the cluster analysis
indicate the affiliation of the considered hacker groups to the special services of the Russian
Federation, namely the GRU, the SVR and the FSB. Currently, the best-known criminal cyber
groups of Russian origin are Fancy Bear, Cozy Bear, Turla, Sandworm, and Berserk Bear. Cluster
analysis and visualization of the resulting network of cybersecurity objects and the use of Ph-Di
diagrams make it possible to visually observe the state and dynamics of the conceptual base of
the cybersecurity subject area.</p>
      <p>The research showed that the Ph-Di visualization tool makes it possible to
decompose the original time series by the composition and features of objects, to identify the
publication activity corresponding to certain concepts, and to determine the links between objects and
the details of the emergence of new objects in information streams. This methodology
can be applied to data obtained from content monitoring systems, which are commonly used for
various analytical purposes. The goal is to identify and group entities based on their relationships and
dynamics, even if explicit links between them are not present.</p>
      <p>The considered approach can be used to analyze and visualize the distribution of objects for any
selected arrays of information over a significant period of time based on the interests of the study.</p>
    </sec>
    <sec id="sec-10">
      <title>4. References </title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><surname>Tabatabaei</surname>, <given-names>F.</given-names></string-name>,
          <string-name><surname>Wells</surname>, <given-names>D.</given-names></string-name>
          (<year>2016</year>).
          <article-title>OSINT in the Context of Cyber-Security</article-title>.
          In: Akhgar, B., Bayerl, P., Sampson, F. (eds)
          <source>Open Source Intelligence Investigation</source>.
          Advanced Sciences and Technologies for Security Applications. Springer, Cham.
          DOI: 10.1007/978-3-319-47671-1_14
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] ATP 2-22.9. Army Techniques Publication No. 2-22.9 (FMI 2-22.9).
          <source>Open-Source Intelligence</source>.
          Headquarters Department of the Army Washington, DC, 10 July
          <year>2012</year>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] Yong-Woon Hwang, Im-Yeong Lee, Hwankuk Kim, Hyejung Lee, and Donghyun Kim.
          <article-title>Current Status and Security Trend of OSINT</article-title>.
          <source>Wireless Communications and Mobile Computing</source>,
          vol. 2022, Article ID 1290129, 14 pages,
          <year>2022</year>.
          https://doi.org/10.1155/2022/1290129
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Komil B. Vora, Avani R. Vasant, Saurabh Shah.
          (<year>2022</year>).
          <article-title>Custom Named Entity Recognition for Gujrati Text Using Spacy</article-title>.
          <source>Mathematical Statistician and Engineering Applications</source>,
          <volume>71</volume>(<issue>3</issue>),
          <fpage>1483</fpage>-<lpage>1495</lpage>.
          DOI: 10.17762/msea.v71i3.502
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amrita</surname>
          </string-name>
          ,
          <string-name>
            <surname>Chakraborty</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>Named Entity Recognition in Natural Language Processing: A Systematic Review</article-title>
          . In:
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khanna</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kansal</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fortino</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassanien</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          (eds)
          <source>Proceedings of Second Doctoral Symposium on Computational Intelligence. Advances in Intelligent Systems and Computing</source>
          , vol
          <volume>1374</volume>
          . Springer, Singapore. https://doi.org/10.1007/978-981-16-3346-1_66.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] spaCy, URL: https://spacy.io/models</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] Hugging Face, URL: https://huggingface.co/models?library=flair
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P. Ramesh</given-names>
            <surname>Babu</surname>
          </string-name>
          .
          <article-title>Measuring Research in RSS Feed Literature: A Scientometric Study</article-title>
          . In:
          <source>Measuring and Implementing Altmetrics in Library and Information Science Research</source>
          . Alliance Broadcast Pvt. Ltd, India. 13 p.
          <year>2020</year>
          . DOI: 10.4018/978-1-7998-1309-5.ch008.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Lande</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dmytrenko</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <article-title>Creating Directed Weighted Network of Terms Based on Analysis of Text Corpora</article-title>
          .
          <source>2020 IEEE 2nd International Conference on System Analysis and Intelligent Computing, SAIC 2020</source>
          ,
          <year>2020</year>
          , 9239182. DOI: 10.1109/SAIC51296.2020.9239182
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Aalbers</surname>
            ,
            <given-names>George</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McNally</surname>
            ,
            <given-names>Richard J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heeren</surname>
            ,
            <given-names>Alexandre</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Wit</surname>
            ,
            <given-names>Sanne</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fried</surname>
            ,
            <given-names>Eiko I.</given-names>
          </string-name>
          <article-title>Social media and depression symptoms: A network perspective</article-title>
          .
          <source>Journal of Experimental Psychology: General</source>
          , Vol
          <volume>148</volume>
          (
          <issue>8</issue>
          ), Aug
          <year>2019</year>
          ,
          <fpage>1454</fpage>
          -
          <lpage>1462</lpage>
          . DOI: 10.1037/xge0000528
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Cherven</surname>
            <given-names>K.</given-names>
          </string-name>
          <source>Mastering Gephi Network Visualization</source>
          . - Packt Publishing,
          <year>2015</year>
          . - 378 p. ISBN 978-1-78398-734-4.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12] Gephi, URL: https://gephi.org/
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
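          [13]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Zgurovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dmitry</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kostiantyn</given-names>
            <surname>Yefremov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Oleh</given-names>
            <surname>Dmytrenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andriy</given-names>
            <surname>Boldak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Artem</given-names>
            <surname>Soboliev</surname>
          </string-name>
          .
          <article-title>Extracting and Identifying Relationships of Key Phrases in Information Flows</article-title>
          . Published in:
          <source>2022 IEEE 3rd International Conference on System Analysis &amp; Intelligent Computing (SAIC)</source>
          , 04-07 October
          <year>2022</year>
          . DOI: 10.1109/SAIC57818.2022.9923019.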
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>