<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>International Journal of Circuit Theory and Applications</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1002/cta.2784</article-id>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sergii Khlamov</string-name>
          <email>sergii.khlamov@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vadym Savanevych</string-name>
          <email>vadym.savanevych1@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olexander Briukhovetskyi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iryna Tabakova</string-name>
          <email>iryna.tabakova@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>Nauki avenue 14, Kharkiv, 61000</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Mukachevo</institution>
          ,
          <addr-line>89600</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Western Radio Technical Surveillance Center, State Space Agency of Ukraine</institution>
          ,
          <addr-line>Kosmonavtiv Street</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>48</volume>
      <issue>7</issue>
      <fpage>243</fpage>
      <lpage>263</lpage>
      <abstract>
        <p>Rapid technological progress generates large volumes of information that arrive in many different forms, and many scientific fields now analyze high-dimensional data sets. In this paper we present several aspects of the "knowledge discovery in databases" (KDD) process related to the data mining stage in astronomy, and we analyze and review data mining approaches. We give examples of astronomical sources of big data, instruments, information types, and processing algorithms that can be used for data mining in astronomy. The paper applies the CoLiTec (Collection Light Technology) software to the online processing of different types of astronomical information using the data mining approach. This is achieved with the developed OnLine Data Analysis System (OLDAS), which helps to solve data mining tasks such as clustering, classification, and identification.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Keywords</title>
      <p>Data mining, big data, knowledge discovery in databases, recognition patterns, image processing, classification, datasets, series of images</p>
      <p>COLINS-2022: 6th International Conference on Computational Linguistics and Intelligent Systems, May 12-13, 2022, Gliwice, Poland</p>
      <p>ORCID: 0000-0001-9434-1081 (S. Khlamov); 0000-0001-8840-8278 (V. Savanevych); 0000-0002-4550-5606 (O. Briukhovetskyi); 0000-0001-6629-4927 (I. Tabakova); 0000-0003-2689-2679 (T. Trunova)</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        The 21st century is marked by rapid technological progress. This progress produces large amounts
of varied data that arrive online or offline as data streams, predefined sets, series, video,
etc. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The volume of such data (files, streams, memory) grows continuously and requires a lot of
storage space in data centers, servers, archives [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Virtual Observatories [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], etc.
      </p>
      <p>
        This growth in data outpaces the computing capacity of existing
computers, machines, and servers. It is therefore important to optimize the processing of data streams
and data sets by using only the required input information, so that computers, machines, and servers
work more productively.
      </p>
      <p>
        Data mining and knowledge discovery approaches are becoming increasingly popular and relevant
in research and experiments as a way to improve the productivity and efficiency of processing
algorithms in different fields of interest. Astronomy as a research field is no exception [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>So, what is the data mining approach? It is the process of obtaining information from large
data sets by extracting or discovering patterns, using methods at the intersection
of disciplines such as computer science, statistics, machine learning, and database systems.</p>
      <p>2022 Copyright for this paper by its authors.</p>
      <p>
        Data mining extracts information from a data set with intelligent methods and transforms it
into an understandable structure for further use. Data mining is the analysis
step of the "knowledge discovery in databases" (KDD) process [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The full flow with all intermediate
stages of the "knowledge discovery in databases" process is presented in Figure 1.
      </p>
      <p>
        The main goal of data mining is to extract potentially useful information and
knowledge from large input data sets, streams, or video using appropriate associations,
relationships, or recognition patterns [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The extracted data are then transformed into subsets with a
known, required structure. These subsets are used for subsequent effective analysis and
use.
      </p>
      <p>
        In this paper we present several aspects of the "knowledge discovery in databases" process
related to the data mining stage in astronomy, and we analyze and review data mining approaches.
Examples of astronomical sources of big data, instruments, kinds of information, and processing algorithms
are provided. The goal of the current research is to apply the developed CoLiTec (Collection Light
Technology) software [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to data mining of astronomical images.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Works</title>
    </sec>
    <sec id="sec-4">
      <title>2.1. Data mining in astronomy</title>
      <p>
        Data mining and knowledge discovery are areas of growing significance. This growth
is driven by the increasing need for KDD techniques in different fields, such as databases,
knowledge gathering, machine learning, statistics, data visualization, and high-performance
computing. Data mining and knowledge discovery are also highly useful for artificial
intelligence techniques in many areas, such as industry, commerce, government, education, astronomy,
and so on [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Data mining in astronomy is a very powerful approach with great potential for fully
exploiting the exponentially increasing amount of data, and it promises excellent scientific progress.
Used wrongly, however, it can be little more than a black-box application of complex computing
algorithms that gives very questionable results. Used properly, data mining can be a much more powerful
tool, well adapted to astronomical tasks, than the careful selection or
continual modification of an appropriate processing algorithm.</p>
      <p>
        Nowadays, in the big data era, several fields of astronomy are vital for dealing
with big data and data mining issues, among them astroinformatics, astrostatistics, and astrochemistry.
Progressive astronomers, researchers, and scientists are ready to face the technological challenges and
opportunities provided by the massive data volume, and they open exciting perspectives for new
astronomical discoveries by applying advanced data mining approaches. The diversity of
scientific tasks and the complexity of astronomical big data drive the development of innovative
processing algorithms and methods as well as the intensive use of Information and Communications
Technologies (ICT) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>2.2. Astronomical big data sources</title>
      <p>
        Many scientific programs, projects, databases, virtual observatories [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and
services solve different research tasks using data mining and "knowledge discovery in
databases" approaches. The DAME (DAta Mining &amp; Exploration) program includes a set of
web-based services that perform scientific investigation and analysis of astronomical big data sets [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
Its engineering design and requirements are built on the new paradigm of web-based
resources that realize an efficient data mining framework in the data-centric era [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The data mining problems of data analysis and visualization [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] for huge stellar catalogues
that contain billions of objects become more difficult with the appearance of massive data sets, such as
2MASS (Two Micron All Sky Survey) [13], WISE (Wide-field Infrared Survey Explorer) [14], and the ESA
Euclid space mission [15]. Such astronomical big data are collected by modern robotic telescopes,
such as Pan-STARRS (Panoramic Survey Telescope and Rapid Response System) [16], the ESA GAIA
(Global Astrometric Interferometer for Astrophysics) space mission [17], and the Thirty Meter
Telescope (TMT) [18].
      </p>
      <p>Special attention should be paid to the SDSS (Sloan Digital Sky Survey) [19] as the most
successful sky survey in the history of astronomy. The SDSS project has produced the most detailed
three-dimensional maps of the Universe, with deep multi-color images of 1/3 of the sky and spectra for
more than three million astronomical objects [20].</p>
      <p>Another interesting and even larger wide-field survey with a reflecting telescope is under
construction: the Large Synoptic Survey Telescope (LSST) [21]. It has a primary mirror
8.4 meters in diameter and includes three mirrors with a very wide field of view (FOV) of 3.5 degrees,
as presented in Figure 2.</p>
      <p>The LSST science database is focused on the following goals:
• scalability (at petabyte scales) of existing machine learning and data mining algorithms;
• development of grid-enabled parallel data mining algorithms;
• design of a robust system for brokering classifications from the event pipeline;
• indexing of multi-attribute and multi-dimensional petascale astronomical databases for
rapid querying;
• multi-resolution methods (object classification, outlier identification, anomaly detection).</p>
    </sec>
    <sec id="sec-6">
      <title>3. Methods</title>
    </sec>
    <sec id="sec-7">
      <title>3.1. Mathematical processing methods</title>
      <p>
        The data mining purposes of astronomical image processing include, but are not
limited to, the following tasks: brightness equalization [22], background alignment [23], detection of
object images [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], detection of moving objects [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], astrometry of objects (estimation of the positional
coordinates of an object in the image, which are recalculated into the sky position) [24], photometry of
objects (estimation of an object’s brightness in magnitudes) [25], determination of the parameters of the
object’s image and apparent motion [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], reference object cataloging [26], object recognition [27],
wavelet coherence analysis [28], and others.
      </p>
      <p>Data mining of astronomical images covers the following major application areas of image
and signal processing [27].</p>
      <p>
        • Filtering. Raw astronomical signals are very rarely free of noise, so
noise removal is necessary for useful data interpretation later. In general, data
cleaning is required to remove the artifacts of instrumental measurements without changing
the complexity of the data.
• Deconvolution. Signal "deblurring" is used for reasons very similar to those for
filtering, as a preliminary step to data interpretation. Deblurring of object motion in images
is very important in astronomy, as is removing the effects of atmospheric blurring to
improve the quality of seeing.
• Compression. Several facts show the importance of effective and efficient
compression technology: the long-term storage of astronomical data, the development of detectors with
ever-larger image sizes, and the geographically distributed nature of astronomical research.
• Mathematical morphology. Combinations of the erosion and dilation operators give
the opening and closing operations. In greyscale and boolean images they provide
an immediately practical framework for processing. The median function plays
a similar role for the order and rank functions. Multiple-scale mathematical
morphology is an immediate generalization for astronomical image processing [29].
• Edge detection. Gradient information is not widely used in astronomical
image analysis because of its boolean nature. Instead, object edges are more often
identified as curves in the image along brightness changes or discontinuities [30].
• Corner detection. A group of algorithms used within computer vision systems to
extract certain kinds of features and infer the contents of an image [31]. It is often used for
object recognition, image registration, object motion detection [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], video tracking,
and 3D reconstruction.
• Blob (point) detection. Mathematical methods for detecting regions in the image
that differ in brightness or color from the neighboring
regions. A blob is a region in which some properties of the points are constant or approximately
constant, so all points in the blob are similar to each other [32].
• Ridge detection. Mathematical methods for localizing ridges in the image, which are
defined as curves whose points are local maxima of the function, like geographical
ridges [33].
• Segmentation and pattern recognition. In astronomy, segmentation and pattern
recognition are used for object detection, while the term feature selection [26] is more
popular in areas outside astronomy. In general, they are used to assign
object images to a proper class by highlighting the significant features that characterize
this class [34].
• Hough and Radon transforms. The detection of curves is required for many
classes of segmentation and feature analysis. Whether the signal is faint or strong,
noise is usually the most critical factor. The Ridgelet and Curvelet transforms provide
powerful generalizations for resolving such problems [27].</p>
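      <p>Two of the listed operations, filtering and blob detection, can be illustrated with a minimal sketch. It assumes a small image stored as a list of rows, a square median window, and a simple threshold of the median plus a multiple of the standard deviation; the function names and the threshold rule are illustrative assumptions and are not taken from the CoLiTec software.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from statistics import median, pstdev


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # The window is clipped at the image borders.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = median(window)
    return out


def detect_blobs(image, k_sigma=3.0):
    """Return 4-connected groups of pixels brighter than median + k_sigma * std."""
    flat = [v for row in image for v in row]
    threshold = median(flat) + k_sigma * pstdev(flat)
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill from this seed pixel to collect one blob.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            blobs.append(pixels)
    return blobs
```
]]></preformat>
      <p>Applying the median filter first suppresses isolated hot pixels, so that the blob search afterwards reports only regions that stay bright over several neighboring pixels.</p>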
      <p>The mathematical image processing methods described above differ from one another, but all of them can be
used as a pre-processing stage of the data mining of astronomical images in the processing pipeline,
before the main image processing algorithm (object image recognition, object detection, estimation of
object parameters, trajectory detection, estimation of trajectory parameters) is applied.</p>
    </sec>
    <sec id="sec-8">
      <title>3.2. Astrophysical processing methods</title>
      <p>Object classification is an important initial step in the scientific data mining process because it
provides algorithms and methods for organizing scientific information in a way that can be used
to form appropriate hypotheses and to compare with existing models.</p>
    </sec>
    <sec id="sec-9">
      <title>3.2.1. Star-Galaxy separation</title>
      <p>Because the physical size of stars is small compared with their distance from the observer,
almost all stars are unresolved in photometric datasets and thus appear as point objects in the
CCD image [35]. Galaxies generally subtend a larger angle, even though they are farther
away, and so appear as extended objects in the CCD image. However, other astrophysical objects such as
quasars and supernovae also appear as point objects. The separation of photometric catalogs into
stars, galaxies, and other objects is therefore an important and difficult task.</p>
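      <p>The distinction between point and extended objects can be illustrated with a minimal sketch. It assumes a background-subtracted cutout around one source and a known size of the point spread function (PSF), and it compares the intensity-weighted RMS radius of the source with that size. The function names and the tolerance factor are illustrative assumptions; this simple size test does not tell stars apart from quasars or supernovae, which is why the methods cited in this section use many more parameters.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def rms_radius(cutout):
    """Intensity-weighted RMS radius (pixels) of a background-subtracted cutout."""
    total = sum(sum(row) for row in cutout)
    if total <= 0:
        raise ValueError("cutout has no positive flux")
    # Flux-weighted centroid of the source.
    cy = sum(y * v for y, row in enumerate(cutout) for v in row) / total
    cx = sum(x * v for row in cutout for x, v in enumerate(row)) / total
    # Second moment of the flux distribution about the centroid.
    second = sum(
        ((y - cy) ** 2 + (x - cx) ** 2) * v
        for y, row in enumerate(cutout)
        for x, v in enumerate(row)
    )
    return math.sqrt(second / total)


def classify_source(cutout, psf_radius, tolerance=1.2):
    """Return 'point' if the source is no wider than tolerance * PSF size."""
    if rms_radius(cutout) <= tolerance * psf_radius:
        return "point"
    return "extended"
```
]]></preformat>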
      <p>The huge number of galaxies and stars in typical surveys requires morphological separation to be
automated or semi-automated. This task is well studied, and several automated
approaches for big data analysis have been implemented, for example for the digitization of scanned photographic
plates by machines such as the Automatic Plate Measuring (APM) machine [36] and for the Digitized Palomar
Observatory Sky Survey (DPOSS) [37].</p>
      <p>Several data mining methods have also been developed and implemented using
Artificial Neural Networks (ANNs) [38] and mixture modeling [39], with most methods achieving over
95% efficiency.</p>
      <p>In general, such methods classify astrophysical objects using a set of
measured morphological parameters derived from the survey photometry, together with shape, structure,
texture, inclination, arm pitch, color, resolution, exposure, spectra, and other astrophysical
information.</p>
      <p>The main advantage of these data mining methods is that all such information about each
astrophysical object is easily extended and encapsulated into massive datasets [40].</p>
    </sec>
    <sec id="sec-10">
      <title>3.2.2. Galaxy morphology</title>
      <p>Galaxies show a variety of morphologies across a wide range of sizes and shapes.
The most popular system for the morphological classification of galaxies is the Hubble sequence
of spiral, barred spiral, elliptical, and irregular galaxies, along with galaxies from the different subclasses [41]. This
system correlates with many important physical properties in the formation and evolution of
galaxies [42].</p>
      <p>Galaxy morphology is a very complex phenomenon that is correlated with the underlying
physics but is not unique to any one given process. Nevertheless, the Hubble sequence is still in use,
even though it is rather subjective and based on visible-light morphology, which was originally
derived from blue-biased photographic plates.</p>
      <p>The Hubble sequence has been extended in different ways using the data mining approach. ANNs
were applied [38] to predict galaxy types at low redshift, with accuracy equal to that of human
experts. ANNs were also applied to higher-redshift data to distinguish between normal and peculiar
galaxies. In addition, the fundamentally topological and unsupervised self-organizing map (SOM) ANN was used to classify
galaxies from CCD images obtained by the Hubble Space Telescope [43], where the
initial distribution of classes is not known. ANNs were also used to determine
morphological types from galaxy spectra [44].</p>
      <p>Galaxy morphology research has even used Fourier decomposition of galaxy images
combined with ANNs to detect bars and assign types [45].</p>
    </sec>
    <sec id="sec-11">
      <title>4. The CoLiTec software</title>
      <p>
        Different data mining approaches for astronomical image processing are provided by the
CoLiTec software [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which processes input data in near real time (online mode). It is a
very complex system for processing astronomical data sets, which includes different
features, user-friendly tools for processing management, reviewing of results [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], integration with
online catalogs, and many computational components based on the developed
methods [
        <xref ref-type="bibr" rid="ref8">8, 24, 26</xref>
        ]. The processing results are also available and can be visualized.
      </p>
      <p>The high-level processing pipeline of the CoLiTec software, with its developed modules and
implemented methods, is presented in Figure 3.</p>
      <p>The processing steps of the CoLiTec software pipeline, according to the data mining
approaches, are described below.</p>
    </sec>
    <sec id="sec-12">
      <title>4.1. Pre-processing</title>
      <p>The pre-processing step of the CoLiTec software in OnLine Data Analysis System (OLDAS)
mode handles input data sets as soon as they are successfully received from different
sources. The raw data are moderated before the computational process starts: unsupported and
corrupted frames are rejected at this step, so that only the useful information from the input data set is used
during the computational process.</p>
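      <p>A rejection step of this kind can be sketched as a cheap structural check. The sketch assumes that the incoming frames are FITS files held in memory as bytes and relies only on two properties of the FITS format: files are organized in 2880-byte blocks, and the first 80-character header card is the SIMPLE keyword with the logical value T in column 30. The function names are hypothetical, and the actual moderation rules of OLDAS are not described here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
FITS_BLOCK = 2880  # FITS files are organized in 2880-byte blocks


def is_plausible_fits(data: bytes) -> bool:
    """Cheap structural check of a primary FITS header; not a full validator."""
    if len(data) == 0 or len(data) % FITS_BLOCK != 0:
        return False  # empty or truncated file
    first_card = data[:80]
    # The first card must be 'SIMPLE  = ' with the logical value T in column 30.
    return first_card[:10] == b"SIMPLE  = " and first_card[29:30] == b"T"


def moderate(frames):
    """Split a {name: bytes} mapping into accepted and rejected frame names."""
    accepted = [name for name, data in frames.items() if is_plausible_fits(data)]
    rejected = [name for name in frames if name not in accepted]
    return accepted, rejected
```
]]></preformat>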
    </sec>
    <sec id="sec-13">
      <title>4.2. Clustering</title>
      <p>The selected useful information from the input data set is categorized into clusters using
specified attributes. The CoLiTec software uses different attributes, such as equatorial
coordinates, filter type, telescope, investigated object, and others. Based on these attributes, the
necessary information from the input data set is separated into subsets with similar data and stored on
different distributed servers, clusters, or even networks.</p>
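      <p>Grouping by attributes can be sketched as follows. The sketch assumes that each frame is described by a dictionary with the hypothetical keys file, telescope, filter, ra, and dec, and that equatorial coordinates are binned into square sky tiles of a chosen size; both the record layout and the tiling rule are illustrative assumptions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import defaultdict


def sky_tile(ra_deg, dec_deg, tile_deg=1.0):
    """Index of the square sky tile that contains the frame centre."""
    return (math.floor(ra_deg / tile_deg), math.floor(dec_deg / tile_deg))


def cluster_frames(frames, tile_deg=1.0):
    """Group frame records that share telescope, filter, and sky tile."""
    clusters = defaultdict(list)
    for frame in frames:
        key = (
            frame["telescope"],
            frame["filter"],
            sky_tile(frame["ra"], frame["dec"], tile_deg),
        )
        clusters[key].append(frame["file"])
    return dict(clusters)
```
]]></preformat>
      <p>Each resulting key identifies one subset of similar frames, which could then be stored or dispatched separately.</p>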
    </sec>
    <sec id="sec-14">
      <title>4.3. Classification</title>
      <p>After the clustering process, the created subsets of data are classified by applying the known
structure of raw astronomical data that is specified in the Flexible Image Transport System (FITS)
standard by NASA [46]. FITS is the most widely used digital file format in astronomy. The format
is specially designed to carry image metadata, which include various scientific data, such as
astrometric, photometric, and calibration information. After classification, the FITS files are
sent to the processing pipeline.</p>
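      <p>Reading the FITS metadata can be sketched with a minimal header parser. FITS headers consist of 80-character cards with the keyword in columns 1-8 and a value indicator in columns 9-10, terminated by an END card. The sketch below relies only on that layout and simplifies string handling (escaped quotes inside strings are not supported). The IMAGETYP keyword used to label frames is a common observatory convention rather than a mandatory FITS keyword, and its use here is an assumption, not a description of how CoLiTec classifies files.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def parse_header(block: bytes) -> dict:
    """Parse 80-character FITS header cards into a {keyword: value} dict."""
    header = {}
    for start in range(0, len(block), 80):
        card = block[start:start + 80].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY, or blank card without a value
        raw = card[10:].lstrip()
        if raw.startswith("'"):
            # Quoted string: keep the text up to the closing quote.
            value = raw[1:].split("'", 1)[0].rstrip()
        else:
            # Other values: drop the optional inline comment after '/'.
            value = raw.split("/", 1)[0].strip()
        header[keyword] = value
    return header


def frame_kind(header: dict) -> str:
    """Label a frame from IMAGETYP (assumed convention); default is 'light'."""
    kind = header.get("IMAGETYP", "").lower().replace(" ", "")
    for name in ("darkflat", "bias", "dark", "flat"):
        if name in kind:
            return name
    return "light"
```
]]></preformat>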
    </sec>
    <sec id="sec-15">
      <title>4.4. Identification</title>
      <p>In the processing pipeline, all classified FITS files pass through the identification step.
At this step, all FITS files related to service master frames (e.g.,
bias, dark, darkflat, flat) are used for frame calibration. Otherwise, if the file is a raw light frame, the processing pipeline starts
the computing process.</p>
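      <p>How master frames enter the calibration can be illustrated with the standard CCD reduction formula, in which the master dark is subtracted from the light frame and the result is divided by the master flat normalized to unit mean. This is a minimal sketch of that textbook formula under the assumption that all frames have the same shape and that the master dark already contains the bias level; it is not claimed to be the calibration procedure implemented in CoLiTec.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def calibrate(light, master_dark, master_flat):
    """Subtract the master dark and divide by the normalized master flat."""
    flat_values = [v for row in master_flat for v in row]
    flat_mean = sum(flat_values) / len(flat_values)
    # Pixels of the flat are assumed to be non-zero.
    return [
        [(raw - dark) / (flat / flat_mean) for raw, dark, flat in zip(lrow, drow, frow)]
        for lrow, drow, frow in zip(light, master_dark, master_flat)
    ]
```
]]></preformat>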
    </sec>
    <sec id="sec-16">
      <title>4.5. Processing</title>
      <p>
        The computing process in the processing pipeline is managed by OLDAS and includes two
stages: intraframe and interframe processing. The intraframe stage includes various processes for
image filtering and object detection. The major goal of object detection in a series of
images is to recognize the object and its borders and to determine the parameters of its image [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Different recognition patterns, or types, of astronomical objects can be detected in the image:
point objects, long objects, blurred objects, and objects with flare or intersecting with other
objects. Such objects can be a galaxy, star, robot [47], drone [48], rocket,
satellite [49], or even a comet [50] or asteroid [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>The features of the CoLiTec software related to the intraframe stage are the following:
• processing of a very wide field of view (FOV), up to 10 square degrees;
• automated calibration process;
• cosmetic correction process;
• FrameSmooth software for background alignment and brightness equalization [22];
• automated rejection of the worst observations;
• fully automated robust algorithm of astrometric reduction;
• semi-automated algorithm of photometric reduction;
• automated rejection of objects with bad or unclear measurements.</p>
      <p>The object image detection process during intraframe processing is presented in Figure 4.</p>
      <p>
        The interframe stage includes various processes for detecting object motion. The major
goal of moving object detection in a series of images is to recognize the object’s trajectory and
determine its parameters [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>The features of the CoLiTec software related to the interframe stage are the following:
• automated detection of faint moving objects with a signal-to-noise ratio (SNR) of more than 2.5;
• automated detection of very slow objects with near-zero apparent motion from 0.7 pix./frame;
• automated detection of very fast objects with apparent motion up to 40.0 pix./frame.</p>
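      <p>The limits listed above can be illustrated with a minimal sketch that fits a straight-line motion to the positions of a candidate object and checks the result against the quoted SNR and apparent-motion limits. The least-squares fit and the function names are illustrative assumptions; the detection method actually used by CoLiTec is described in [8] and is not reproduced here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def fit_velocity(frames, coords):
    """Least-squares slope of a coordinate versus frame index (pix./frame)."""
    n = len(frames)
    mean_t = sum(frames) / n
    mean_c = sum(coords) / n
    denom = sum((t - mean_t) ** 2 for t in frames)
    numer = sum((t - mean_t) * (c - mean_c) for t, c in zip(frames, coords))
    return numer / denom


def apparent_motion(track):
    """track: list of (frame_index, x, y). Returns the speed in pix./frame."""
    frames = [t for t, _, _ in track]
    vx = fit_velocity(frames, [x for _, x, _ in track])
    vy = fit_velocity(frames, [y for _, _, y in track])
    return math.hypot(vx, vy)


def is_candidate(track, snr, min_snr=2.5, min_speed=0.7, max_speed=40.0):
    """Apply the SNR and apparent-motion limits quoted in the text."""
    return snr > min_snr and min_speed <= apparent_motion(track) <= max_speed
```
]]></preformat>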
      <p>The object motion detection process during interframe processing is presented in Figure 5.</p>
      <p>The data mining analysis in the CoLiTec software is performed using the following technological
features:
• multi-threaded processing;
• use of multi-core systems with management of the individual processing tasks;
• a decision system, which allows the user settings to be adapted for the processing;
• a notification system, which informs the user about the correctness of the results at each processing stage;
• data control management during processing using the subject mediator.</p>
      <p>After pipeline processing and data mining analysis, the CoLiTec software produces various
forms of result representation, including visualization of results and generation of reports for different
services. The LookSky viewer with a user-friendly GUI is used to summarize the results and
analyze them visually (see Figure 6).</p>
    </sec>
    <sec id="sec-17">
      <title>5. Results</title>
      <p>As a result, about 700,000 observations were made using the CoLiTec software with the approach for
data mining of astronomical images. Based on these observations, the following discoveries
were made:
• more than 1,600 asteroids;
• 5 Near Earth Objects (NEOs);
• 21 Trojan asteroids of Jupiter;
• 1 Centaur (2013 UL10);
• 5 comets (C/2010 X1 (Elenin), P/2011 NO1 (Elenin), C/2012 S1 (ISON) [50], P/2013 V3
(Nevski), C/2017 T3 (ATLAS) [51]).</p>
      <p>All the observations and discoveries mentioned above have been approved and confirmed by the Minor
Planet Center (MPC), the official organization for observing and reporting on minor planets
under the auspices of the International Astronomical Union (IAU).</p>
      <p>
        Several examples of processing series of images after the data mining
stage in the processing pipeline of the CoLiTec software are presented below [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This data mining stage was performed
using astronomical information from different astronomical archives [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and on CCD images received in real time
directly from the telescopes during observations.
      </p>
      <p>Several processing results of the automated calibration, cosmetic correction, background
alignment, and brightness equalization processes are presented in Figure 7.</p>
      <p>Several processing results of the object measurement mining, object image detection, and
object apparent motion detection processes are presented in Figure 8.</p>
      <p>Many apparent motion parameters are determined for all objects detected by the CoLiTec
software. Among them are the object’s velocity V in the equatorial system (right ascension RA and
declination DE) and in the Cartesian system along the two axes x and y, and the distance S
covered by the object from the first positional measurement to the last one in the observed series
of images.</p>
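      <p>These quantities can be illustrated with a minimal sketch that takes only the first and the last positional measurement of an object. In the Cartesian system it returns the mean velocity components and the straight-line distance S in pixels; for equatorial coordinates it returns the great-circle separation computed with the haversine formula. The function names and the choice of formulas are illustrative assumptions, since the exact definitions used by CoLiTec are not given in this section.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def cartesian_motion(first, last, n_frames):
    """first/last: (x, y) in pixels. Returns (Vx, Vy, S).

    Vx and Vy are mean velocities in pix./frame; S is the distance in pixels.
    """
    dx, dy = last[0] - first[0], last[1] - first[1]
    steps = n_frames - 1  # number of frame-to-frame intervals
    return dx / steps, dy / steps, math.hypot(dx, dy)


def angular_distance_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))
```
]]></preformat>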
      <p>Several processing results with the determined apparent motion parameters of real Solar
System objects (SSOs) are presented in Table 1.</p>
    </sec>
    <sec id="sec-18">
      <title>6. Conclusions</title>
      <p>The values of the distance S covered by the different objects from the first positional
measurement to the last one in the observed series of images show that such objects have near-zero
apparent motion, and that the CoLiTec software successfully detected these objects and estimated their
positional and motion parameters.</p>
      <p>Rapid technological progress, networks of automated ground- and space-based observation
systems, and new scientific programs, surveys, and projects lead to the fast growth of astronomical data.</p>
      <p>
        We presented the developed CoLiTec software [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which is used for the online processing of
different types of astronomical information using data mining approaches. As described in the
paper, knowledge discovery in databases with its data mining analysis step is applicable and very
practical for optimizing data stream processing and obtaining only useful information. It
allows only the necessary input data to be used, which improves the computing performance of machines.
      </p>
      <p>The CoLiTec software implements different data mining principles and stages of processing, such as
anomaly detection, pre-processing, clustering, classification, identification, processing (edge
detection, segmentation, recognition patterns, object [52] and motion detection), and summarization.</p>
      <p>
        These data mining principles are implemented in the CoLiTec software by specially developed
mathematical methods and components for intraframe and interframe processing, astronomical
data mining from different online services, archives [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Virtual Observatories [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and
visualization [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] under the CoLiTec project [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>The scientific novelty of the current research is that the CoLiTec software is the first astronomical
software that fully implements all steps of the data mining approach and pipeline to
process different astronomical data.</p>
      <p>Using the data mining approaches described in the paper, the CoLiTec software has contributed to
numerous observations and discoveries of SSOs, which are confirmed by the MPC and the corresponding
Minor Planet Electronic Circulars (MPECs).</p>
    </sec>
    <sec id="sec-19">
      <title>7. Acknowledgements</title>
      <p>The authors thank all observatories, online services, and tools that provided the data used in the
current research to test the CoLiTec software and its implementation of the data mining
approaches.</p>
    </sec>
    <sec id="sec-20">
      <title>8. References</title>
      <p>[13] E. Ríos-López, et al., 2D surface brightness modelling of large 2MASS galaxies–I: photometry and
structural parameters, MNRAS, vol. 507, issue 4, pp. 5952-5973, 2021. doi:10.1093/mnras/stab2321.</p>
      <p>[14] B. Lyu, et al., WISE View of Changing-look Active Galactic Nuclei: Evidence for a Transitional
Stage of AGNs, ApJ, vol. 927, issue 2, 227, 2022. doi:10.3847/1538-4357/ac5256.</p>
      <p>[15] C. Baccigalupi, et al., Euclid preparation, A&amp;A, vol. 657, A90, 2022.
doi:10.1051/0004-6361/202141393.</p>
      <p>[16] K. Xiao and H. Yuan, Validation and Improvement of the Pan-STARRS Photometric
Calibration with the Stellar Color Regression Method, The Astronomical Journal, vol. 163,
issue 4, p. 185, 2022. doi:10.3847/1538-3881/ac540a.</p>
      <p>[17] J. Guiraud and W. Roux, New Questions Opened by the Big Data in the World of the Science
Data Processing Centre for Gaia Mission in CNES, Space Operations. Springer, Cham, pp.
309-324, 2022. doi:10.1007/978-3-030-94628-9_14.</p>
      <p>[18] P. Prabahar and N. Radhakrishnan, Thirty Meter Telescope (TMT) Research as Reflected in
Web of Science: A Study, Science &amp; Technology Libraries, pp. 1-13, 2022.
doi:10.1080/0194262X.2021.2018634.</p>
      <p>[19] Z. Yi, et al., Automatic detection of low surface brightness galaxies from SDSS images,
MNRAS, stac775, 2022. doi:10.1093/mnras/stac775.</p>
      <p>[20] Y. Zhang and Y. Zhao, Astronomy in the big data era, Data Science Journal, vol. 14, 2015.</p>
      <p>[21] D. J. Hoover, D. Z. Seligman, and M. J. Payne, The Population of Interstellar Objects Detectable
with the LSST and Accessible for In Situ Rendezvous with Various Mission Designs, The
Planetary Science Journal, vol. 3, issue 3, p. 71, 2022. doi:10.3847/psj/ac58fe.</p>
      <p>[22] P. A. Dubovský, et al., FrameSmooth software-new tool for the calibration of astronomical
images, 48th Conference on Variable Stars Research, 16 p., 2017.</p>
      <p>[23] Š. Parimucha, et al., CoLiTecVS – A new tool for an automated reduction of photometric
observations, Contributions of the Astronomical Observatory Skalnate Pleso, vol. 49, issue 2, pp.
151-153, 2019. doi:2019CoSka..49..151P.</p>
      <p>[24] V. Akhmetov, S. Khlamov, V. Khramtsov, and A. Dmytrenko, Astrometric reduction of the
wide-field images, Advances in Intelligent Systems and Computing, vol. 1080, pp. 896–909,
2020. doi:10.1007/978-3-030-33695-0_58.</p>
      <p>[25] I. Kudzej, et al., CoLiTecVS – A new tool for the automated reduction of photometric
observations, Astronomische Nachrichten, vol. 340, pp. 68–70, 2019.</p>
      <p>[26] V. Savanevych, V. Akhmetov, et al., Selection of the reference stars for astrometric reduction of
CCD-frames, Advances in Intelligent Systems and Computing, vol. 1080, pp. 881–895, 2020.
doi:10.1007/978-3-030-33695-0_57.</p>
      <p>[27] R. Gonzalez and R. Woods, Digital image processing, Fourth edition, New York, NY: Pearson,
2018.</p>
      <p>[28] M. Dadkhah, V. V. Lyashenko, Z. V. Deineko, et al., Methodology of wavelet analysis in
research of dynamics of phishing attacks, International Journal of Advanced Intelligence
Paradigms, vol. 12(3-4), pp. 220-238, 2019. doi:10.1504/IJAIP.2019.098561.</p>
      <p>[29] W. Burger and M. Burge, Principles of digital image processing: fundamental techniques, New
York, NY: Springer, 2009.</p>
      <p>[30] D.-W. Lee and S.-H. Lee, Edge Detection using Cost Minimization Method, Journal of Internet
of Things and Convergence, vol. 8, issue 1, pp. 59-64, 2022. doi:10.20465/KIOTS.2022.8.1.059.</p>
      <p>[31] Y. Wang, et al., A Target Corner Detection Algorithm Based on the Fusion of FAST and Harris,
Mathematical Problems in Engineering, vol. 2022, 4611508, 2022. doi:10.1155/2022/4611508.</p>
      <p>[32] T. Lindeberg, Scale selection properties of generalized scale-space interest point detectors,
Journal of Mathematical Imaging and Vision, vol. 46(2), pp. 177-210, 2013.
doi:10.1007/s10851-012-0378-3.</p>
      <p>[33] G. S. Shokouh, et al., Ridge detection by image filtering techniques: a review and an objective
analysis, Pattern Recognition and Image Analysis, vol. 31, issue 3, pp. 551-570, 2021.
doi:10.1134/S1054661821030226.</p>
      <p>[34] J.-L. Starck and F. Murtagh, Astronomical image and data analysis, Astronomy and Astrophysics
Library, Second edition, Springer, 2007.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>D.</given-names> <surname>Peralta</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>del Rio</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Ramirez-Gallego</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Triguero</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Benitez</surname></string-name>, and
          <string-name><given-names>F.</given-names> <surname>Herrera</surname></string-name>,
          <article-title>Evolutionary feature selection for big data classification: A map reduce approach</article-title>,
          <source>Mathematical Problems in Engineering</source>, vol.
          <volume>2015</volume>, 246139, pp. 11,
          <year>2015</year>. doi:10.1155/2015/246139.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>S.</given-names> <surname>Cavuoti</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Brescia</surname></string-name>, and
          <string-name><given-names>G.</given-names> <surname>Longo</surname></string-name>,
          <article-title>Data mining and knowledge discovery resources for astronomy in the Web 2.0 age</article-title>,
          <source>SPIE Astronomical Telescopes and Instrumentation, Software and Cyberinfrastructure for Astronomy II</source>, vol.
          <volume>8451</volume>,
          <year>2012</year>. doi:10.1117/12.925321.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>F.</given-names> <surname>Genova</surname></string-name>,
          <article-title>Data as a research infrastructure CDS, the Virtual Observatory, astronomy, and beyond</article-title>,
          <source>EPJ Web of Conferences</source>, vol.
          <volume>186</volume>, EDP Sciences, 01001,
          <year>2018</year>. doi:10.1051/epjconf/201818601001.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>F.</given-names> <surname>Pasian</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Brescia</surname></string-name>, and
          <string-name><given-names>G.</given-names> <surname>Longo</surname></string-name>,
          <article-title>Astronomical images and data mining in the international virtual observatory context</article-title>,
          <source>Science: Image In Action</source>, pp.
          <fpage>230</fpage>-<lpage>240</lpage>,
          <year>2012</year>. doi:10.1142/9789814383295_0019.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>C. J.</given-names> <surname>Fluke</surname></string-name>, and
          <string-name><given-names>C.</given-names> <surname>Jacobs</surname></string-name>,
          <article-title>Surveying the reach and maturity of machine learning and artificial intelligence in astronomy</article-title>,
          <source>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</source>, vol.
          <volume>10</volume>, issue
          <issue>2</issue>,
          <fpage>e1349</fpage>,
          <year>2020</year>. doi:10.1002/widm.1349.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>Ž.</given-names> <surname>Ivezić</surname></string-name>, et al.,
          <source>Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data</source>, Princeton University Press,
          <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>M.</given-names> <surname>Khalil</surname></string-name>, et al.,
          <article-title>Big data in astronomy: from evolution to revolution</article-title>,
          <source>International Journal of Advanced Astronomy</source>, vol.
          <volume>7</volume>, issue
          <issue>1</issue>,
          <year>2021</year>. doi:10.14419/ijaa.v7i1.18029.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>S.</given-names> <surname>Khlamov</surname></string-name>, and
          <string-name><given-names>V.</given-names> <surname>Savanevych</surname></string-name>,
          <article-title>Big astronomical datasets and discovery of new celestial bodies in the Solar System in automated mode by the CoLiTec software</article-title>,
          <source>Knowledge Discovery in Big Data from Astronomy and Earth Observation</source>, Part IV, Chapter 18, Astrogeoinformatics: Elsevier, pp.
          <fpage>331</fpage>-<lpage>345</lpage>,
          <year>2020</year>. doi:10.1016/B978-0-12-819154-5.00030-8.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><given-names>R.</given-names> <surname>Mor</surname></string-name>, et al.,
          <article-title>Expanding Big Data mining for Astronomy</article-title>,
          <source>XIV Scientific Meeting of the Spanish Astronomical Society</source>, p.
          <fpage>235</fpage>,
          <year>2020</year>. doi:2020sea..confE.235M.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><given-names>M.</given-names> <surname>Brescia</surname></string-name>, and
          <string-name><given-names>G.</given-names> <surname>Longo</surname></string-name>,
          <article-title>Astroinformatics, data mining and the future of astronomical research</article-title>,
          <source>Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment</source>, vol.
          <volume>720</volume>, pp.
          <fpage>92</fpage>-<lpage>94</lpage>,
          <year>2012</year>. doi:10.1016/j.nima.2012.12.027.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><given-names>O.</given-names> <surname>Vaduvescu</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Curelaru</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Popescu</surname></string-name>,
          <article-title>Mega-Archive and the EURONEAR tools for data mining world astronomical images</article-title>,
          <source>Astronomy and Computing</source>, vol.
          <volume>30</volume>, 100356,
          <year>2020</year>. doi:10.1016/j.ascom.2019.100356.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><given-names>V.</given-names> <surname>Akhmetov</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Khlamov</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Tabakova</surname></string-name>,
          <string-name><given-names>W.</given-names> <surname>Hernandez</surname></string-name>,
          <string-name><given-names>J. I. N.</given-names> <surname>Hipolito</surname></string-name>, and
          <string-name><given-names>P.</given-names> <surname>Fedorov</surname></string-name>,
          <article-title>New approach for pixelization of big astronomical data for machine vision purpose</article-title>,
          <source>IEEE International Symposium on Industrial Electronics</source>, pp.
          <fpage>1706</fpage>-<lpage>1710</lpage>,
          <year>2019</year>. doi:10.1109/ISIE.2019.8781270.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>