<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Review of Data Collection and Analysis Methods in Intelligent Information Processing Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Myroslav Ryabyy</string-name>
          <email>m.o.ryabyy@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anton Spiridonov</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nataliia Korshun</string-name>
          <email>n.korshun@kubg.edu.ua</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roman Kyrychok</string-name>
          <email>r.kyrychok@kubg.edu.ua</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>State Scientific and Research Institute of Cybersecurity Technologies and Information Protection</institution>
          ,
          <addr-line>3/6 M. Zaliznyaka str., 03142 Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>611</fpage>
      <lpage>619</lpage>
      <abstract>
        <p>This paper examines modern methods of data collection and analysis in intelligent information processing (IIP) systems, emphasizing the growing importance of integrating non-traditional data sources into analytical frameworks. A comprehensive review of available services is conducted, highlighting their features, advantages, and limitations across various domains, including marketing, jurisprudence, medicine, and military applications. The study particularly focuses on the integration of data from messengers and public communication channels, given their increasing role in disseminating real-time information. The proposed approach combines traditional data collection services with alternative sources such as Telegram, Facebook, and YouTube, offering a more holistic and representative information environment. This integration facilitates the automated detection of patterns, trends, and anomalies, thereby enhancing decision-making processes in dynamic and data-intensive sectors. The results of experimental research confirm the viability of utilizing even unconventional information sources in analytical systems, demonstrating their effectiveness in detecting disinformation, forecasting potential threats, and automating routine analytical tasks. Furthermore, the paper introduces a conceptual model that integrates hyper-automation technologies with IIP to optimize data collection, preprocessing, and analysis. The model leverages robotic process automation (RPA) and artificial intelligence (AI)-driven classification techniques to enhance efficiency and scalability. Experimental validation of the proposed model demonstrates its potential for real-world implementation, particularly in scenarios requiring rapid adaptation to evolving information landscapes. The findings underscore the significance of hyperautomated systems in addressing contemporary challenges in data intelligence. 
By improving the accuracy, speed, and adaptability of information processing, such systems hold substantial promise for applications in cybersecurity, regulatory compliance, business intelligence, and public sector decision-making. The study concludes with insights into the future development of hyper-automated data processing frameworks and their role in shaping next-generation analytical capabilities.</p>
      </abstract>
      <kwd-group>
        <kwd>intelligent information processing</kwd>
        <kwd>hyper-automation</kwd>
        <kwd>data analysis</kwd>
        <kwd>monitoring systems</kwd>
        <kwd>integration</kwd>
        <kwd>automation</kwd>
        <kwd>disinformation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Main content</title>
      <p>Today, numerous services for data collection and processing can be successfully used in various
fields, including marketing, law, medicine, and others. Considering the current geopolitical
situation, data processing technologies have significant potential for military applications. For
example, they can be utilized in systems for detecting and analyzing disinformation and
propaganda, which are critical aspects of national security.</p>
      <p>Military organizations use modern data processing technologies to ensure national security.
Information analysis systems can identify potential threats and disinformation in vast data sets and
assist in strategic military decision-making [1–3].</p>
      <p>In modern marketing, data analysis solutions enable businesses to understand customer
behavior, identify market trends, and develop effective advertising and product promotion
strategies [9].</p>
      <p>In law, information technologies can be used for the rapid and efficient analysis of legal
information, legal case studies, and the preparation of legal documentation.</p>
      <p>In medicine, data processing systems assist doctors in analyzing large medical datasets,
identifying disease patterns, and developing individualized treatment approaches [6].</p>
      <p>Thus, data processing technologies play a crucial role in many areas of life, from business to
national security, and continue to evolve, making a significant contribution to social progress and
the efficiency of various organizations.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Existing solutions</title>
      <p>A review of available solutions (Table 1) was conducted to evaluate the effectiveness of data
collection services from various sources. An experiment was carried out to compare data retrieval
capabilities from different registries and services (Fig. 1).</p>
      <sec id="sec-3-5">
        <title>Comparison of services</title>
        <p>Table 1 lists the reviewed services by Name, Sources, Paid/Free, Availability, and Description. Of the cell values, only the Paid/Free entries (Paid, Free, Free, Free) and the Availability entries (SaaS for eight services) are recoverable.</p>
        <sec id="sec-3-5-12">
          <title>Experiment</title>
          <p>Mustafa Dzhemilev: a political and public figure of Ukraine of Crimean Tatar descent, Hero
of Ukraine.</p>
          <p>Anton Spiridonov: a private person.</p>
          <p>The research findings in the legal and marketing sectors are presented in Table 2 and Table 3,
respectively.</p>
          <p>Selected services for the experiment: YouControl, State Register of Court Decisions,
Clarity-project, Semantic Force, and YouScan.</p>
        </sec>
        <sec id="sec-3-5-13">
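          <title>Experiment results</title>
          <table-wrap>
            <caption>
              <p>Data returned for enterprises, by service and by requested full name</p>
            </caption>
            <table>
              <thead>
                <tr>
                  <th rowspan="2">Type of data: Enterprises</th>
                  <th colspan="2">YouControl</th>
                  <th colspan="2">State Register of Court Decisions</th>
                  <th colspan="2">Clarity-project</th>
                </tr>
                <tr>
                  <th>Spiridonov</th>
                  <th>Dzhemilev</th>
                  <th>Spiridonov</th>
                  <th>Dzhemilev</th>
                  <th>Spiridonov</th>
                  <th>Dzhemilev</th>
                </tr>
              </thead>
              <tbody>
                <tr><td>Name</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td></tr>
                <tr><td>KVED</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td></tr>
                <tr><td>Date of registration</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td></tr>
                <tr><td>VAT code</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td></tr>
                <tr><td>Court decisions</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td><td>+</td></tr>
                <tr><td>Participation in public procurement</td><td>–</td><td>–</td><td>–</td><td>–</td><td>+</td><td>+</td></tr>
              </tbody>
            </table>
          </table-wrap>
          <p>A second group of data types follows in the table (Registration, Phone, Real estate, Vehicle,
Court decisions); its cell values are not recoverable.</p>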
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Key findings and proposed model</title>
      <p>The study showed that most intelligent information processing (IIP) tools are offered as
Software as a Service (SaaS). While SaaS solutions provide convenience and additional processing
power, they also have drawbacks [3]. For example, data owners may lose control over their
information once it is transferred to third parties, which raises confidentiality and security
concerns.</p>
      <p>The analysis of technologies shows that many services specialize in specific domains but do not
interact efficiently with each other [4]. At the same time, the growing use of messengers and
public channels to transmit information underlines the need to integrate data from these
sources.</p>
      <p>A conceptual model has been proposed that adds further data sources, follows hyper-automation
principles, and includes intermediate data cleansing and processing blocks [5] (Fig. 3).</p>
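      <p>The flow from several data sources through an intermediate cleansing block to a processing block can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names Record, cleanse, run_pipeline, the two collector functions, and the sample texts are all hypothetical, and the processing step is a trivial stand-in for real classification.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One collected item; the field names are assumptions for illustration."""
    source: str
    text: str


def cleanse(records):
    """Cleansing block: normalize whitespace, drop empty and duplicate texts."""
    seen = set()
    cleaned = []
    for rec in records:
        text = " ".join(rec.text.split())
        key = text.casefold()
        if not text or key in seen:
            continue
        seen.add(key)
        cleaned.append(Record(rec.source, text))
    return cleaned


def run_pipeline(collectors, process):
    """Collect from every source, cleanse, then apply the processing block."""
    raw = [rec for collect in collectors for rec in collect()]
    return [process(rec) for rec in cleanse(raw)]


def registry_collector():
    # Hypothetical stand-in for a registry service call.
    return [Record("registry", "  INTERSPORT   LLC ")]


def messenger_collector():
    # Hypothetical stand-in for a messenger channel reader.
    return [
        Record("messenger", "Bradley vs. T-80"),
        Record("messenger", "bradley vs. t-80"),
        Record("messenger", "   "),
    ]


def describe(rec):
    # Placeholder processing step; a real system would classify the text here.
    return {"source": rec.source, "text": rec.text, "length": len(rec.text)}


results = run_pipeline([registry_collector, messenger_collector], describe)
for row in results:
    print(row)
```
      </preformat>
      <p>Keeping collectors as plain callables means a new source can be added without touching the cleansing or processing steps, which is the property the model relies on when it adds further data sources.</p>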
      <p>To validate the capabilities of the conceptual model, a prototype was developed for data
collection from multiple services, including YouControl and the Telegram news channel (TSN
News). The prototype was implemented in Python with the UiPath RPA platform [2].</p>
      <p>Example request:</p>
      <preformat>{
  "$schema": "http://json-schema.org/draft-07/schema",
  "$id": "/v1/usr/{contractorCode}",
  "type": "object",
  "title": "Unified State Register of Legal Entities and Individual Entrepreneurs. Individual Entrepreneur",
  ...
},
...
"nameInEnglish": null,
"code": "30045171",
"legalPersonName": "\"INTERSPORT\"",
"legalForm": "LIMITED LIABILITY COMPANY",
"registrationViaReformation": null,
"branches": null,
"economicActivities": [
  {
    "code": "68.20",
    "description": "Renting and operating of own or leased real estate (main activity)"
  },
  {
    "code": "77.39",
    "description": "Rental of other machinery, equipment, and tangible goods not elsewhere classified"
  },
  {
    "code": "93.11",
    "description": "Operation of sports facilities"
  },
  {
    "code": "93.29",
    "description": "Other amusement and recreation activities"
  },
  {
    "code": "46.90",
    "description": "Non-specialized wholesale trade"
  },
  {
    "code": "56.10",
    "description": "Restaurant activities and mobile food service activities"
  }
],
"authorityInfo": "Obolon District State Administration in Kyiv",
"managingGovernmentAuthority": null,
"founders": [
  ...</preformat>
      <p>When processing a Telegram query, the system returns messages containing keywords in JSON
format. For example, a query was made using the keyword “Bradley”.</p>
      <p>Example response:</p>
      <preformat>{
  "name": "TSN News / TSN.ua",
  "type": "public_channel",
  "id": 1305722586,
  ...
  [
    { "type": "bold", "text": "BMP Bradley vs. Russian tank T-80 — who will win?" },
    { "type": "plain", "text": "\n\nOf course, the fighters of the 47th Mechanized Brigade won, who " },
    { "type": "bold", "text": "destroyed enemy equipment" },
    { "type": "plain", "text": " using a TOW anti-tank missile.\n\n" },
    { "type": "text_link", "text": "Website", "href": "https://tsn.ua/" },
    { "type": "text_link", "text": "Facebook", "href": "https://www.facebook.com/tsn.ua" },
    { "type": "plain", "text": " | " },
    { "type": "plain", "text": "" },
    { "type": "text_link", "text": "YouTube", "href": "https://www.youtube.com/tsn" }
  ]
  ...
]</preformat>
      <p>The response presented above displays the results of queries by keywords, confirming the
feasibility of collecting and processing data from messengers. The analysis of results indicates that
even diverse information sources, such as public channels in messengers, can be integrated into a
system to construct a more comprehensive and representative information model. This enables the
identification of trends, patterns, and anomalies in data streams, which is crucial for making
effective management decisions in rapidly changing markets, legal environments, or national
security contexts. Thus, broader data coverage can be achieved for analysis and for building
an extended internal database for an information processing system.</p>
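      <p>The keyword query over channel messages described above can be sketched in a few lines of Python. The sketch assumes that the channel data is a dictionary with a "messages" list and that each message carries its fragments under a "text_entities" key; both key names, the message ids, and the second sample message are assumptions for illustration, not details confirmed by the prototype.</p>
      <preformat>
```python
def message_text(message):
    """Join the text fragments of one message into a single plain string."""
    fragments = message.get("text_entities", [])
    return "".join(fragment.get("text", "") for fragment in fragments)


def find_messages(channel, keyword):
    """Return the messages whose joined text contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [
        message
        for message in channel.get("messages", [])
        if needle in message_text(message).casefold()
    ]


# Sample data modelled on the response excerpt; ids are hypothetical.
channel = {
    "name": "TSN News / TSN.ua",
    "type": "public_channel",
    "id": 1305722586,
    "messages": [
        {
            "id": 1,
            "text_entities": [
                {"type": "bold", "text": "BMP Bradley vs. Russian tank T-80"},
                {"type": "plain", "text": " using a TOW anti-tank missile."},
            ],
        },
        {
            "id": 2,
            "text_entities": [
                {"type": "plain", "text": "Weather forecast for Kyiv"},
            ],
        },
    ],
}

hits = find_messages(channel, "bradley")
print([message["id"] for message in hits])
```
      </preformat>
      <p>Joining the fragments before matching matters because a keyword may sit in a bold or link fragment rather than in the plain text.</p>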
      <p>Expanding data sources and applying hyper-automation not only accelerates the process of data
collection and processing but also enhances analytical quality. The integration of robotic process
automation (RPA) technologies with intelligent information processing (IIP) enables the automatic
detection of relationships between data that were previously unattainable using traditional
analytical approaches. This allows for the automation of complex tasks such as detecting
disinformation or predicting potential threats to national security, significantly improving
readiness and response to new challenges [7–11].</p>
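      <p>One simple form of relationship detection between sources is linking registry entities to messenger messages that mention them. The Python sketch below illustrates the idea under stated assumptions: the function names, the flat "text" field on messages, and the sample messages are hypothetical, and plain substring matching stands in for the AI-driven classification a real system would use.</p>
      <preformat>
```python
def normalize_name(name):
    """Strip surrounding quotes and whitespace, then fold case for matching."""
    return name.strip().strip('"').casefold()


def link_mentions(companies, messages):
    """Map each company code to the ids of messages that mention its name."""
    links = {}
    for company in companies:
        key = normalize_name(company["legalPersonName"])
        links[company["code"]] = [
            message["id"]
            for message in messages
            if key and key in message["text"].casefold()
        ]
    return links


# Registry fields follow the request example; the messages are invented.
companies = [{"code": "30045171", "legalPersonName": "\"INTERSPORT\""}]
messages = [
    {"id": 1, "text": "New Intersport store opens in Kyiv"},
    {"id": 2, "text": "Weather forecast for Kyiv"},
]

links = link_mentions(companies, messages)
print(links)
```
      </preformat>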
      <p>Thus, the experiment confirms the potential of research into information technologies that
combine IIP and hyper-automation, which holds great promise for creating innovative solutions.
These solutions can significantly improve commerce, law, healthcare, and government
administration by ensuring more accurate and timely information processing and
decision-making [12–14].</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>The research confirms the relevance of developing information technologies that integrate IIP with
hyper-automation, which opens significant potential for innovative solutions. These solutions
can improve fields such as business, law, healthcare, and public administration by enabling
more accurate and timely data processing and decision-making [8].</p>
      <p>For successful implementation, the following technologies should be explored:</p>
      <list list-type="bullet">
        <list-item>
          <p>Hyper-automation Systems (RPA): automate routine tasks and processes, freeing human
resources for more complex activities. Useful for automating data processing operations
and integration.</p>
        </list-item>
        <list-item>
          <p>Interconnected Databases: facilitate fast data exchange across systems and applications,
forming the foundation for integrating diverse services into a unified global solution.</p>
        </list-item>
        <list-item>
          <p>Intelligent Information Processing (IIP): enables automated analysis, classification, and
interpretation of large data volumes, ensuring efficient collection and processing of
information from messengers and other sources.</p>
        </list-item>
      </list>
      <p>A comprehensive approach leveraging these technologies will allow the creation of a robust and
efficient system capable of collecting, processing, and analyzing information from diverse sources
while automating routine operations to enhance productivity and decision-making quality.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>While preparing this work, the authors used the AI programs Grammarly Pro to correct text
grammar and Strike Plagiarism to search for possible plagiarism. After using these tools, the authors
reviewed and edited the content as needed and took full responsibility for the publication’s content.</p>
      <ref-list>
        <title>References</title>
        <ref id="ref-1"><label>[1]</label><mixed-citation>S. Lenkov, et al., Conceptual scheme of an intelligent data processing system, Collection of scientific works of the Military Institute of the Taras Shevchenko National University of Kyiv, 46, 2014, 181–190.</mixed-citation></ref>
        <ref id="ref-2"><label>[2]</label><mixed-citation>G. Council, Injecting (artificial) intelligence into robotic process automation. URL: http://www.datacenterjournal.com/injecting-artificialintelligence-robotic-process-automation/</mixed-citation></ref>
        <ref id="ref-3"><label>[3]</label><mixed-citation>V. Sytnyk, M. Krasnyuk, Intelligent data analysis (data mining): Textbook, Kyiv, 2007.</mixed-citation></ref>
        <ref id="ref-4"><label>[4]</label><mixed-citation>M. Bazzell, Open source intelligence techniques: Resources for searching and analyzing online information, 2012.</mixed-citation></ref>
        <ref id="ref-5"><label>[5]</label><mixed-citation>P. Bornet, I. Barkin, J. Wirtz, Intelligent automation: learn how to harness artificial intelligence to boost business &amp; make our world more human, 1st ed., 2020.</mixed-citation></ref>
        <ref id="ref-6"><label>[6]</label><mixed-citation>V. Mayer-Schönberger, K. Cukier, Big Data: A revolution that will transform how we live, work, and think, 1st ed., New York: Houghton Mifflin Harcourt, 2013.</mixed-citation></ref>
        <ref id="ref-7"><label>[7]</label><mixed-citation>O. Solomentsev, et al., Efficiency of operational data processing for radio electronic equipment, Aviation 23(3) (2020) 71–77. doi:10.3846/aviation.2019.11849</mixed-citation></ref>
        <ref id="ref-8"><label>[8]</label><mixed-citation>A. Shyian, et al., Development of a software module for modeling the process of semiautomatic recognition of objects on aerial images, in: Proceedings of the Institute of Applied Mathematics and Mechanics, 2023.</mixed-citation></ref>
        <ref id="ref-9"><label>[9]</label><mixed-citation>O. Zolotukhina, V. Titarenko, Machine learning and forecasting in cyber-physical systems, in: Control, Optimisation and Analytical Processing of Social Networks, vol. 2392, 2019, 121–130.</mixed-citation></ref>
        <ref id="ref-10"><label>[10]</label><mixed-citation>S. Gnatyuk, Critical aviation information systems cybersecurity, meeting security challenges through data analytics and decision support, NATO Science for Peace and Security Series, D: Information and Communication Security, IOS Press Ebooks, vol. 47(3), 2016, 308–316.</mixed-citation></ref>
        <ref id="ref-11"><label>[11]</label><mixed-citation>M. Zaliskyi, et al., Method of traffic monitoring for DDoS attacks detection in e-health systems and networks, in: Informatics &amp; Data-Driven Medicine, vol. 2255, 2018, 193–204.</mixed-citation></ref>
        <ref id="ref-12"><label>[12]</label><mixed-citation>Y. Deville, L. T. Duarte, A. Deville, A taxonomy of relationships between information processing, machine learning and quantum physics: Quantum-inspired, quantum-assisted, quantum-targeted and related approaches, in: IEEE Mediterranean and Middle-East Geoscience and Remote Sensing Symposium (M2GARSS), 2024, 361–365. doi:10.1109/M2GARSS57310.2024.10537558</mixed-citation></ref>
        <ref id="ref-13"><label>[13]</label><mixed-citation>V. Ivanov, et al., The removal of phosphorus from reject water in a municipal wastewater treatment plant using iron ore, J. Chem. Technol. Biotechnol. 84(1) (2009) 78–82. doi:10.1002/jctb.2009</mixed-citation></ref>
        <ref id="ref-14"><label>[14]</label><mixed-citation>R. Zhang, et al., Exploration of the methods of artificial intelligence computer processing information, in: 3rd Int. Conf. for Innovation in Technology (INOCON), 2024, 1–4. doi:10.1109/INOCON60754.2024.10511463</mixed-citation></ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>