<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Methodology of wavelet analysis in research
of dynamics of phishing attacks. International Journal of Advanced Intelligence Paradigms</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1016/B978-0-12-819154-5.00015-1</article-id>
      <title-group>
        <article-title>Automated Data Mining of the Reference Stars From Astronomical CCD Frames</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sergii Khlamov</string-name>
          <email>sergii.khlamov@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vadym Savanevych</string-name>
          <email>vadym.savanevych1@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tetiana Trunova</string-name>
          <email>tetiana.trunova@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhanna Deineko</string-name>
          <email>zhanna.deineko@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oleksandr Vovk</string-name>
          <email>oleksandr.vovk@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roman Gerasimenko</string-name>
          <email>roman.herasymenko@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>Nauki avenue 14, Kharkiv, 61166</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>1080</volume>
      <issue>7</issue>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>In astronomical images obtained using telescopes and cameras, there are from 1 to 100 thousand or more stars depending on the resolution and exposure time. These objects are fixed against the background of the frame and have constant positions in the celestial sphere. To determine which part of the sky corresponds more accurately to a given frame, it is necessary to associate the frame with known astronomical astrometric and photometric catalogs. These catalogs contain millions of position values of various stars as static objects. Having such information in the form of big data, as well as a huge amount of classified and clustered data in the form of databases, computational methods for fast extraction of the necessary data from them need to be developed. For this purpose, classical methods of "knowledge discovery in databases" (KDD) and Data Mining exist. However, for their proper application, it is necessary to classify the input data set for subsequent analysis and rejection. The implementation of these methods is closely related to the developed mathematical computational methods for automatic selection of reference stars in astronomical images. The result is implemented in the Lemur software of the CoLiTec (Collection Light Technology) project for the astronomical data processing using the data mining methods.</p>
      </abstract>
      <kwd-group>
        <kwd>Reference stars</kwd>
        <kwd>data mining</kwd>
        <kwd>big data</kwd>
        <kwd>knowledge discovery in databases</kwd>
        <kwd>astronomical catalogues</kwd>
        <kwd>astrometry</kwd>
        <kwd>photometry</kwd>
        <kwd>pattern recognition</kwd>
        <kwd>image processing</kwd>
        <kwd>series of images</kwd>
        <kwd>CCD-frames</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Technological progress in the production of cameras based on charge-coupled devices (CCD) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and
telescopes demonstrates continuous acceleration, reflecting modern trends in scientific and
engineering developments. New materials and more efficient components, such as
photodetectors and optical systems, contribute to the creation of cameras and telescopes with
increased resolution and sensitivity.
      </p>
      <p>
        Today's digital cameras possess impressive characteristics, including resolutions exceeding
100 megapixels, significantly surpassing those of previous models. Telescopes are also keeping
pace: it is expected that the Large Synoptic Survey Telescope (LSST) (Figure 1) will have a
resolution of around 3.2 gigapixels [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The speed of data acquisition and processing has also increased noticeably. Modern cameras
are capable of recording and processing data orders of magnitude faster than their predecessors,
achieving shooting speeds of up to 20 frames per second while maintaining full resolution. The
development of parallel computational algorithms contributes to more efficient processing of
such big data [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] obtained from cameras and telescopes, opening up new possibilities for
scientific research and practical applications in various fields including astronomy [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        The resolution of cameras and telescopes has increased over time, allowing more detailed
images of the night sky to be registered and analyzed. This directly improves
the ability to detect and study various objects in space [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Additionally, wide-angle cameras allow
for an expanded field of view, significantly increasing the number of objects captured in a single
frame. This is particularly useful when studying areas with high stellar density or when searching
for distant galaxies and cosmic objects [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        However, even with the increase in resolution and field of view, the quality of astronomical
imaging of objects, especially those located at great distances and with low brightness, remains a
challenge for modern cameras and telescopes. One of the main factors affecting image quality is
the level of noise. When imaging faint objects, even a small amount of noise on the camera sensor
can significantly distort the image and hinder its interpretation. Despite significant
improvements in noise reduction in modern cameras, this aspect remains a problem when
dealing with very faint and distant objects [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Another important factor is atmospheric conditions. Atmospheric turbulence and atmospheric
distortions can significantly affect the quality of an image and the typical shape of the objects in it [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
especially when working with high magnifications. This limitation can be overcome by using
space telescopes, which are located beyond the Earth's atmosphere.
      </p>
      <p>It is also worth considering the technical limitations of cameras and telescopes, such as limited
sensitivity to certain wavelengths or limitations on dynamic range. Research and development
efforts continue with the aim of overcoming these limitations; however, this remains a relevant
area of research in astronomy and optics.</p>
      <p>
        With these improvements in cameras and telescopes comes an increase in the volume of data
in astronomical catalogs [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This is due to the increasing number of observations conducted, the
use of more advanced instruments, and the expansion of the coverage area of the celestial sphere.
Therefore, it is necessary to develop mathematical methods for the automated data mining [10] of
reference stars [11] from astronomical CCD-frames drawn from such a large volume of
astronomical data.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <sec id="sec-2-1">
        <title>2.1. Astronomical big data sources</title>
        <p>In recent decades, astronomy has undergone a data processing and analysis revolution due to
the implementation of large astronomical projects and the use of advanced observational tools.
These modern instruments collect and process vast amounts of data, extracting valuable
information about celestial objects and phenomena [12].</p>
        <p>One of the main sources of data is ground-based telescopes and observatories. These facilities,
located around the world, gather data on stars, galaxies, and other celestial objects using various
observation methods, including optical, infrared, and ultraviolet spectra. Thanks to ground-based
telescopes and Virtual Observatories [13], astronomers have access to a wide range of objects
and can study their properties and evolution [14].</p>
        <p>New telescopes are capable of generating huge volumes of high-resolution data. This includes
not only images but also spectroscopic data and information on temporal changes in the
brightness of objects. Modern big data processing techniques and machine learning allow for
efficient extraction of information from data arrays, identification of interesting objects, and
automatic compilation of astronomical catalogs [15]. Additionally, international projects and
collaborations are a contributing factor. Astronomers from around the world join forces to
conduct joint observations and create extensive catalogs, leading to more comprehensive
coverage of the celestial sphere and enrichment of data [16]. This growth of data in astronomy
provides unique opportunities for scientific research.</p>
        <p>Pattern recognition of both celestial objects and patterns themselves is an important tool for
analyzing data obtained from astronomical surveys and observations [17]. Pattern recognition
involves the process of analyzing and identifying astronomical objects in images obtained from
telescopes and space observatories. This process may include the detection and classification of
stars, galaxies, cosmic objects, and other astronomical phenomena based on their characteristics
such as position [18], brightness [19], shape, spectral features, and others. First, data are
collected and prepared; these may include observational information from telescopes, such as
images, spectra, and time series. Then the data are processed, which includes noise filtering,
background alignment [20], image quality enhancement, and identification of objects of interest.</p>
        <p>Space telescopes and missions are also important sources of astronomical data. Space
telescopes such as Hubble [21], Kepler, and the James Webb Space Telescope provide
high-quality data on distant galaxies, exoplanets, and other objects in the Universe. These
missions allow astronomers to explore cosmic objects without the influence of Earth's
atmosphere and expand our knowledge of the Universe [22].</p>
        <p>Radio astronomical observations also play a crucial role in astronomical research. Using radio
telescopes such as the Very Large Array (VLA) and ALMA, astronomers study radio sources and
processes occurring in the radio spectrum. These observations provide information on various
phenomena, including active galactic nuclei, radio pulsars, and cosmic microwave background
radiation [23].</p>
        <p>Recently, the detection of gravitational waves has become a new source of astronomical data.
Using interferometers such as LIGO and Virgo, astronomers detect events such as black hole
mergers and neutron star mergers. These observations provide new data on cosmic phenomena
that were previously inaccessible to observation.</p>
        <p>All astronomical data sources, even small ones, play a key role in scientific research, providing
astronomers with valuable information about cosmic objects and processes. Modern
observational tools and data processing technologies allow researchers to expand our knowledge
of the Universe and address important scientific questions.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Data mining in astronomy</title>
        <p>Data mining, in the context of astronomical research, is a method of data analysis that relies
on computational algorithms to discover patterns, trends, and structures in space data [24]. With
the increasing volume of astronomical data generated by advancements in observational
technologies and expansion of spatial coverage, data mining becomes a crucial tool for extracting
valuable information from these data arrays.</p>
        <p>Astronomical data is often characterized by high dimensionality, complex structure, and a
significant amount of noise. Data mining enables efficient processing and analysis of such data,
revealing hidden patterns that may not be apparent at first glance. Data mining methods
encompass various approaches such as clustering, classification, regression, association rules,
and others.</p>
        <p>Applied to astronomical data, data mining can serve various purposes, including [25]:
• discovery of new object classes: using clustering and classification methods, data mining
can unveil new classes of astronomical objects hidden within the dataset.
• identification of correlations and dependencies: by analyzing multiple parameters and
characteristics of astronomical objects, data mining can help identify correlations and
dependencies among them, leading to new scientific discoveries and understanding of
physical processes.
• prediction of temporal changes: using regression and time series methods, data mining
can be employed to predict temporal changes in luminosity or other characteristics of
astronomical objects.</p>
        <p>Thus, data mining represents a powerful tool for the analysis of astronomical data [26], playing
a significant role in unraveling the mysteries of the Universe and deepening scientific
understanding of space.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Knowledge discovery in databases in astronomy</title>
        <p>Pattern recognition of celestial objects [27] and patterns is closely related to concepts such as
data mining and knowledge discovery in databases (KDD) in the context of astronomical research.</p>
        <p>Data mining is the process of automated analysis of large volumes of data to identify
interesting and non-obvious patterns, templates, and trends. In astronomy, where data on
astronomical objects are extremely extensive and multiparametric, data mining plays a key role
in processing and analyzing this data. Pattern recognition of objects and stars is one of the stages
of this process, where machine learning algorithms [28] and statistical methods [29] are applied
for classification and clustering of objects in images or observational data.</p>
        <p>On the other hand, KDD encompasses a wide range of methods and techniques for identifying
and interpreting patterns and new knowledge from databases. In astronomy, where data often
have a complex structure and may contain noise, KDD helps researchers identify hidden
relationships between various parameters of astronomical objects, leading to the discovery of
new physical laws and understanding of the Universe [30]. Thus, for pattern and star recognition,
the use of techniques such as data mining as well as KDD in astronomy is an important aspect as
they help researchers gain valuable knowledge from cosmic astronomical data. Knowledge
discovery in databases in astronomy is a methodology for analyzing and interpreting extensive
datasets collected from various observational sources in space (Figure 2).</p>
        <p>This approach involves the application of various algorithms and models to identify important
patterns, trends, and structures within astronomical data, enabling astronomers to extract new
knowledge and draw scientific conclusions. Knowledge discovery in databases in astronomy
leads to a number of scientific outcomes, including:
• classification of stellar spectra: utilizing machine learning methods and data analysis,
astronomers can classify stars based on their spectral characteristics. For example, clustering
spectral data enables the identification of various types of stars and determining their
evolutionary stages.
• discovery of new classes of galaxies: analyzing astronomical catalogs using knowledge
discovery algorithms can lead to the discovery of new types of galaxies, such as galaxies with
unusual shapes or structures, which require further investigation and explanation.
• prediction of gamma-ray bursts: time series methods and statistical analysis can be
employed to forecast temporal changes in the activity of gamma-ray bursts, enabling
astronomers to prepare for observations and studies of such phenomena.
• identification of gravitational lenses: analyzing large astronomical databases using
knowledge discovery algorithms can assist in identifying and classifying gravitational lenses,
which is crucial for studying dark matter and furthering our understanding of cosmology.</p>
        <p>These examples illustrate the significance of knowledge discovery in databases in astronomy,
as it plays a crucial role in scientific research and helps expand our understanding of the Universe.</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Data mining of the reference stars</title>
        <p>
          The uniformity of the standard form of the image of objects [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] is an important factor influencing
the subsequent process of identification with the astronomical catalogue [31]. Therefore, it is
necessary to conduct an in-depth analysis of literature data to compare methods for preparing images
for the identification process itself. Such methods are expected to reduce the shift in the positional
coordinates of the frame center between the frames themselves in the series.
        </p>
        <p>For example, classical methods of computer vision [32] and object image recognition [16] are
not able to provide the required level of processing speed. These methods require the analysis of
all pixels of potential objects to determine their typical shape. However, when the standard form
is heterogeneous, objects are confused, which increases the processing and identification time.
Methods for estimating image parameters [33] are based on the analysis of only those pixels that
potentially belong to the object under study. Their disadvantage is the inability to accurately identify
specific pixels at the outset and reject those whose intensity exceeds a specified limit value.</p>
        <p>In the study [34], the authors use automatic selection of a reference point to select calibration
frames. However, this is not a requirement for the identification process itself, because if there
are artifacts in the image, these control points may be false, and the accuracy of identification
with real objects from the astronomical catalog decreases. The work [35] proposes a segmentation
method. However, it only works with single images of objects. That is, in the case of a variety of
standard shapes (stroke, extended, circular), this method will not provide the necessary accuracy
due to the ambiguity in the number of brightness peaks.</p>
        <p>This variety of typical shapes also affects various methods based on the wavelet transform [36] and
time series analysis [37]. The disadvantage of these methods is that they can only work with
“pure” measurements, so image heterogeneity will greatly degrade the overall result. Another
implementation is presented in the study [38] in the form of an additional calibration procedure
to avoid the internal coma of the telescope’s secondary mirror. To equalize brightness and
remove “highlights”, there is also a brightness equalization method with improved accuracy and quality
that uses an inverse median filter [39]. However, the disadvantage of these implementations is the
poor accuracy of positional coordinate estimates during the process of identification between
frames of the same series.</p>
        <p>The matched filtering procedure is also known [40], but it uses only an analytical image model.
The disadvantage of this procedure is the inaccuracy of identification when the typical image of
an object differs between frames of the series. The classical method of adding
frames [41, 42] to form a “super” frame is also ineffective when the image of a Solar System object (SSO)
does not have clear boundaries on all digital frames of the series. Therefore, it is necessary to
develop mathematical computational methods for automatic selection of reference stars in
astronomical images that take into account the peculiarities of digital frame formation.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Determining an estimate of the equatorial coordinates of astronomical objects in CCD-frame</title>
        <p>The preliminary identification procedure [43] allows us to obtain the linear plate constants
(<italic>a</italic><sub>pl1</sub>, <italic>b</italic><sub>pl1</sub>, <italic>c</italic><sub>pl1</sub>) and (<italic>a</italic><sub>pl2</sub>, <italic>b</italic><sub>pl2</sub>, <italic>c</italic><sub>pl2</sub>), which determine the relationship between the
coordinate system (CS) of the CCD-frame and the tangential (ideal) coordinate system of the
CCD-frame:
          <disp-formula><label>(1)</label><tex-math><![CDATA[\begin{aligned} \xi &= a_{pl1}\,x + b_{pl1}\,y + c_{pl1}; \\ \eta &= a_{pl2}\,x + b_{pl2}\,y + c_{pl2}, \end{aligned}]]></tex-math></disp-formula>
where <italic>ξ</italic> and <italic>η</italic> are the ideal (tangential) coordinates of reference stars;
<italic>x</italic>, <italic>y</italic> are the measured coordinates of reference stars in the coordinate system of the CCD-frame.</p>
        <p>The calculated linear constants of the plate allow us to obtain estimates of the equatorial
coordinates of objects in the frame using the following expression:
          <disp-formula><label>(2)</label><tex-math><![CDATA[\begin{aligned} \alpha &= \alpha_{00} + \operatorname{arctg}\left(\frac{-\xi}{\cos\delta_{00} - \eta\sin\delta_{00}}\right); \\ \delta &= \arcsin\left(\frac{\eta\cos\delta_{00} + \sin\delta_{00}}{\sqrt{1 + \xi^{2} + \eta^{2}}}\right), \end{aligned}]]></tex-math></disp-formula>
where <italic>α</italic><sub>00</sub>, <italic>δ</italic><sub>00</sub> are the equatorial coordinates of the optical center of the CCD-matrix.</p>
        <p>In the final conversion from CCD-frame coordinates to equatorial coordinates, a cubic model
of plate constants is used, which ensures reliable identification and measurement of positions
throughout the frame.</p>
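        <p>The conversion described in this subsection can be illustrated with a minimal Python sketch. It applies the linear plate model to obtain the tangential coordinates and then the inverse tangential projection, keeping the sign of the right-ascension term as printed in expression (2). The function name, the argument layout, the use of radians, and the replacement of arctg by a two-argument arctangent are assumptions made for illustration; the cubic model used in the final conversion is not shown.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def pixel_to_equatorial(x, y, plate1, plate2, alpha00, delta00):
    """Convert CCD-frame coordinates to (alpha, delta) in radians.

    plate1 = (a_pl1, b_pl1, c_pl1) and plate2 = (a_pl2, b_pl2, c_pl2) are
    the linear plate constants; alpha00 and delta00 are the equatorial
    coordinates of the optical center, in radians (assumed units).
    """
    a1, b1, c1 = plate1
    a2, b2, c2 = plate2
    # Linear plate model: tangential (ideal) coordinates in radians.
    xi = a1 * x + b1 * y + c1
    eta = a2 * x + b2 * y + c2
    # Inverse tangential projection. atan2 is used instead of the arctangent
    # of a ratio (an assumption) so the quadrant is kept near the poles.
    denom = math.cos(delta00) - eta * math.sin(delta00)
    alpha = alpha00 + math.atan2(-xi, denom)
    delta = math.asin(
        (eta * math.cos(delta00) + math.sin(delta00))
        / math.sqrt(1.0 + xi * xi + eta * eta)
    )
    # Wrap right ascension into the range [0, 2*pi).
    return alpha % (2.0 * math.pi), delta
```
]]></preformat>
        <p>With all plate constants equal to zero the sketch returns the coordinates of the optical center, which is a convenient sanity check before real constants are substituted.</p>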
      </sec>
      <sec id="sec-3-3">
        <title>3.2. Uniform distribution of candidates for the reference stars in astronomical CCD-frame</title>
        <p>Practice shows that the concentration of bright measurements in a certain area of the CCD-frame
(for example, in the center) can increase the identification accuracy in this area at the cost of reducing it in
other areas of the same CCD-frame (Figure 3, left).</p>
        <p>To ensure almost equal accuracy of object coordinate measurements throughout the entire
CCD-frame, it is advisable to distribute candidates for reference stars evenly throughout the frame.</p>
        <p>Thus, a uniform distribution of identified pairs throughout the entire CCD-frame will ensure the
necessary uniformity in the accuracy of determining the equatorial coordinates of objects
throughout the entire CCD-frame (Figure 3, right).</p>
        <p>Therefore, when selecting candidates for reference stars, the frame is divided into a square grid
of sections of equal area (the same number of sections along each side), so that identified pairs are
distributed uniformly over the CCD-frame. In each frame section, the same number of objects with a bright image (stars) is
selected.</p>
        <p>The number of measurements of the frame <italic>N</italic><sub>mea</sub> and the number of stars from the forms of the astronomical
catalog <italic>N</italic><sub>st</sub>, obtained during observations and intra-frame processing, are divided by the number of
frame sections.</p>
        <p>Next, in each section, the resulting number of the brightest measurements of the frame and
of the brightest stars in the catalog is selected.</p>
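        <p>The per-section selection can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Lemur implementation: the function name, the tuple layout (x, y, brightness), the convention that a larger brightness value means a brighter object, and the integer rounding of the per-section quota are all hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def select_uniform(points, width, height, n_side, n_total):
    """Pick the brightest candidates evenly over an n_side x n_side grid.

    points: iterable of (x, y, brightness); larger brightness is brighter.
    Returns at most n_total // n_side**2 points from every grid section.
    """
    per_cell = n_total // (n_side * n_side)
    cells = {}
    for x, y, brightness in points:
        # Clamp so points on the right or bottom edge fall in the last cell.
        col = min(int(x * n_side / width), n_side - 1)
        row = min(int(y * n_side / height), n_side - 1)
        cells.setdefault((row, col), []).append((x, y, brightness))
    selected = []
    for key in sorted(cells):
        # Keep only the brightest per_cell candidates of this section.
        brightest = sorted(cells[key], key=lambda p: p[2], reverse=True)
        selected.extend(brightest[:per_cell])
    return selected
```
]]></preformat>
        <p>The same function can be applied to catalog stars once their positions are projected into frame coordinates; for magnitudes, where smaller values are brighter, the sort order would need to be reversed.</p>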
      </sec>
      <sec id="sec-3-4">
        <title>3.3. Mathematical method for automated selection of the reference stars in astronomical CCD-frame</title>
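        <p>At each stage of selecting reference stars, measurements of nearby objects are excluded from the set
of candidates. Objects are considered nearby when the distance between them does not exceed the previously specified
value <italic>r</italic><sub>mea_group</sub>. That is, the <italic>i</italic>-th and <italic>m</italic>-th measurements of the CCD-frame are excluded from
candidates for reference stars if the following condition is met:
          <disp-formula><label>(3)</label><tex-math><![CDATA[\sqrt{\left(x_{\mathrm{mea}\,i\,\mathrm{nfr}} - x_{\mathrm{mea}\,m\,\mathrm{nfr}}\right)^{2} + \left(y_{\mathrm{mea}\,i\,\mathrm{nfr}} - y_{\mathrm{mea}\,m\,\mathrm{nfr}}\right)^{2}} \le r_{\mathrm{mea\_group}},]]></tex-math></disp-formula>
where <italic>x</italic><sub>mea <italic>m</italic> nfr</sub> and <italic>y</italic><sub>mea <italic>m</italic> nfr</sub> are the positional coordinates of the measurement of a nearby object
in the CS of the CCD-frame.</p>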
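        <p>A minimal Python sketch of this exclusion rule is given below. It drops both members of every pair of frame measurements whose separation does not exceed the group radius. The function name and the list-of-tuples input are assumptions, and the pairwise loop is quadratic in the number of measurements, so a spatial index would be preferable for large frames.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def drop_close_groups(measurements, r_group):
    """Return the measurements that have no neighbour within r_group.

    measurements: list of (x, y) in frame pixels. Both members of a close
    pair are rejected, as in the exclusion condition for nearby objects.
    """
    rejected = set()
    for i in range(len(measurements)):
        x_i, y_i = measurements[i]
        for m in range(i + 1, len(measurements)):
            x_m, y_m = measurements[m]
            # Euclidean distance between the i-th and m-th measurements.
            if math.hypot(x_i - x_m, y_i - y_m) <= r_group:
                rejected.update((i, m))
    return [p for k, p in enumerate(measurements) if k not in rejected]
```
]]></preformat>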
        <p>As in (3), measurements of nearby stars from clusters or compact groups of stars in the
astronomical catalog are excluded from consideration. This is the case when a nearby object has a
comparable or greater brightness. The criterion for such membership is the presence of a nearby
star at a distance less than a predetermined value <italic>r</italic><sub>star_group</sub>:</p>
        <p>
          <disp-formula><label>(4)</label><tex-math><![CDATA[\sqrt{\left(\alpha_{\mathrm{cat}\,j} - \alpha_{\mathrm{cat}}\right)^{2} + \left(\delta_{\mathrm{cat}\,j} - \delta_{\mathrm{cat}}\right)^{2}} < r_{\mathrm{star\_group}},]]></tex-math></disp-formula>
where <italic>α</italic><sub>cat</sub> and <italic>δ</italic><sub>cat</sub> are the positional coordinates on the celestial sphere in the astronomical catalog form.</p>
        <p>Another important criterion for rejecting candidates is the absence of a brightness peak in the
image of the object on the CCD-frame. A peak is considered absent when the brightness of the
potential peak <italic>A</italic><sub>peak</sub> is approximately equal to the brightness of the pixels <italic>A</italic><sub><italic>ik</italic></sub>
from a square region of pixels centered on the potential peak. Approximate
equality means that the brightness of the potential peak differs from the brightness
of the pixels of the region by no more than <italic>N</italic><sub>Apeak</sub> brightness units:</p>
        <p>
          <disp-formula><label>(5)</label><tex-math><![CDATA[\left(A_{\mathrm{peak}} - A_{ik}\right) \le N_{A\mathrm{peak}} \quad \text{for pixels } (i, k) \text{ of the region centered on the peak}.]]></tex-math></disp-formula>
        </p>
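        <p>The peak criterion can be sketched in Python as shown below. The sketch assumes that the approximate-equality condition must hold for every pixel of the square region for the peak to be declared absent; the function name, the nested-list image layout, and the clipping of the region at the image border are likewise assumptions made for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def has_brightness_peak(image, row, col, size, threshold):
    """Return False when the candidate peak is approximately flat.

    image: 2D list of pixel brightness; (row, col): candidate peak pixel;
    size: side of the square region in pixels (odd); threshold: largest
    difference, in brightness units, still treated as approximately equal.
    """
    half = size // 2
    a_peak = image[row][col]
    # Visit the region around the peak, clipped at the image border.
    for i in range(max(0, row - half), min(len(image), row + half + 1)):
        for k in range(max(0, col - half), min(len(image[i]), col + half + 1)):
            if a_peak - image[i][k] > threshold:
                return True  # some pixel is clearly fainter: a peak exists
    return False  # every pixel of the region is within the threshold
```
]]></preformat>
        <p>A candidate for which the function returns False would be rejected from the set of reference-star candidates.</p>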
        <p>In this way, sets of measurements are formed from the CCD-frame and from the
astronomical catalog, which at the subsequent stage take part in pair formation and testing of
hypotheses about the correspondence of the “measurement-formula” identification pair.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.4. Data mining process of the automated reference stars selection</title>
        <p>The architecture of data mining process of the automated reference stars selection includes the
following sequence of operations (Figure 4).
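1. Calculation of linear plate constants (<italic>a</italic><sub>pl1</sub>, <italic>b</italic><sub>pl1</sub>, <italic>c</italic><sub>pl1</sub>) and (<italic>a</italic><sub>pl2</sub>, <italic>b</italic><sub>pl2</sub>, <italic>c</italic><sub>pl2</sub>) (1).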
2. Obtaining an estimate of the equatorial coordinates of objects (2).</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment</title>
      <p>The objects of study are images of Solar System objects (SSO) (such as stars, asteroids, comets)
and other space objects (such as space robots [44], drones [45], satellites [46]) detected in a series
of CCD-frames. The initial series for the study were obtained from a variety of telescopes installed
at observatories in Ukraine and around the world, namely: the ISON-NM observatory,
SANTEL400AN telescope (New Mexico, USA); Vihorlat Observatory, VNT telescope (Humenne, Slovakia);
Odesa-Mayaky Observatory, OMT-800 telescope (Mayaki, Ukraine); Cerro Tololo observatory,
PROMPT-8 telescope (La Serena, Chile) [47]. All the observatories mentioned above were approved
and confirmed by the Minor Planet Center (MPC), the official organization for observing and
reporting on minor planets or SSOs under the auspices of the International Astronomical
Union (IAU) [48].</p>
      <p>To verify the developed mathematical computational methods for automatic selection of
reference stars in astronomical images, testing was carried out on a series of frames containing
27,352 measurements. All of these measurements were successfully identified with the
astronomical catalog.</p>
      <p>The USNO B1.0 catalog was used as the photometric catalog. It contains angular
positional coordinates and magnitudes of more than one billion objects, derived from over 3.6 billion
measurements.</p>
      <p>The following values of the parameters of the developed methods were
assumed in the study:
• the number of the brightest measurements of the CCD-frame was <italic>N</italic><sub>mea</sub> = 400;
• the number of the brightest measurements of the catalog forms for selecting candidates for
reference stars was <italic>N</italic><sub>st</sub> = 600;
• the number of fragments into which the CCD-frame is divided for uniform distribution was 4;
• the increment (∆) in the number of measurements of the CCD-frame was 300;
• the increment (∆) in the number of measurements of the catalog form with increasing iteration was 500;
• the criterion for the absence of a peak was a deviation of the brightness of the object image
pixels by no more than 4 brightness units in a region of 5 × 5 pixels centered at the
peak;
• the maximum permissible distance between measurements on a CCD-frame of close group
objects was <italic>r</italic><sub>mea_group</sub> = 20 pixels;
• the maximum permissible distance between measurements in the form of catalogs of nearby
group objects was <italic>r</italic><sub>star_group</sub> = 5 pixels;
• the coefficient of the rule for rejecting «measurement-formula» pairs from a set of reference
stars was taken to be 1.</p>
      <p>The parameters of the procedure listed above were obtained empirically.</p>
      <p>The following statistical indicators of the accuracy of reference star measurements were studied:
the mean deviations of the estimated equatorial coordinates (right ascension and declination) between the catalog and
measured values; the corresponding standard deviations (RMS) for both coordinates and for brightness; and the mean deviation
of the brightness estimate between the catalog and measured values.</p>
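      <p>For reference, the two kinds of indicator can be computed as in the short Python sketch below. It is an illustration only: the function name, the sign convention (catalog minus measured), and the use of the population standard deviation about the mean are assumptions, since these details are not stated here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def deviation_stats(catalog_values, measured_values):
    """Return (mean deviation, standard deviation) of catalog - measured."""
    deviations = [c - m for c, m in zip(catalog_values, measured_values)]
    n = len(deviations)
    mean = sum(deviations) / n
    # Population standard deviation about the mean (assumed convention).
    std = math.sqrt(sum((d - mean) ** 2 for d in deviations) / n)
    return mean, std
```
]]></preformat>
      <p>The same function applies to right ascension, declination, or brightness, provided both input sequences use the same units and ordering.</p>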
      <p>A histogram of the distribution of deviations of the equatorial coordinate (right ascension (RA))
of reference stars as a function of the brightness and coordinates of objects in the rectangular coordinate
system of the CCD-frame is presented in Figure 5.</p>
      <p>A histogram of the distribution of deviations of the equatorial coordinate (declination (DE)) of
reference stars as a function of the brightness and coordinates of objects in the rectangular coordinate system
of the CCD-frame is presented in Figure 6.</p>
      <p>The obtained dependence of the deviations of equatorial coordinates on the position of reference
stars in the frame is presented in Figure 7.</p>
      <p>The obtained dependence of the deviations of equatorial coordinates on the brightness estimate
of objects in the frame is presented in Figure 8.</p>
      <p>The indicators presented in Table 1 show the successful application of the developed methods. The
standard deviation of frame identification errors in this case is 5–7 times smaller than without
the developed methods.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>
        Existing methods for basic image processing [41] and computer vision [32] were analyzed.
However, the speed and accuracy of identification by such methods depend directly on the
characteristics of the formation of a series of digital frames. They also depend on the
completeness of the astronomical catalog and on the constancy of the typical image [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
of the object in all frames of the series. Therefore, methods for automated data
mining of the reference stars from astronomical CCD-frames were developed, and certain rules and criteria for
rejecting candidates at each iteration were proposed.
      </p>
      <p>The obtained research results, as well as the developed mathematical computational methods
for automatic selection of reference stars in astronomical images, were implemented in the C++
programming language. This code was integrated into the intra-frame processing stage of the
Lemur software package (Ukraine) [49] for the automated detection of new objects and tracking of
known objects within the CoLiTec project [50]. The developed mathematical computational
methods, implemented in the Lemur software (Ukraine), were used during the successful
identification of CCD-frames, which contained a total of more than 800,000 SSOs. Their
measurements were also successfully identified with known astronomical catalogs.</p>
      <p>The results in Table 1 are determined by the uniform distribution of the candidates for
reference stars, as well as by correctly selected conditions and rejection criteria. They indicate
that the assigned tasks have been completed successfully. The research showed that using
the developed methods reduces the errors of identification with cataloged (reference) objects by a
factor of 5–7. This significantly affects the quality and accuracy of several tasks involved in
detecting the trajectories of objects.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>In recent decades, astronomy has undergone a data processing and analysis revolution due to the
implementation of large astronomical projects and the use of advanced observational tools. These
modern instruments collect and process vast amounts of data, extracting valuable information
about celestial objects and phenomena. Astronomical images obtained using telescopes and
cameras contain from 1 thousand to 100 thousand or more stars, depending on the resolution and
exposure time. These objects are fixed against the background of the frame and have constant
positions on the celestial sphere. To determine accurately which part of the sky corresponds
to a given frame, the frame must be matched against known astrometric and
photometric catalogs.</p>
      <p>These catalogs contain millions of position values of various stars as static objects. Because such
information takes the form of big data, along with a huge amount of classified and clustered data
stored in databases, computational methods for fast extraction of the necessary data need to be
developed. Classical methods of "knowledge discovery in databases" (KDD) and data mining exist
for this purpose. However, to apply them properly, the input data set must first be classified
for subsequent analysis and rejection. The implementation of these methods is
closely related to the developed computational methods for automatic selection of
reference stars in astronomical images.</p>
      <p>We presented the Lemur software developed within the CoLiTec (Collection Light Technology)
project, which is implemented as a client-server application for processing astronomical
data using data mining and KDD methods. As described in this article, KDD with the data
mining step is useful for reducing the data so that only the relevant data on reference
stars are retained.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The research was performed with the help of all the observatories, tools, and observers that
provided astronomical data for testing the developed Lemur software with the implemented
data mining methods. The research was supported by the Ukrainian project of fundamental
scientific research “Development of computational methods for detecting objects with near-zero
and locally constant motion by optical-electronic devices” (No. 347, 2024–2026).</p>
      <p>[10] H. Yang, et al., Data mining techniques on astronomical spectra data–II. Classification
analysis, Monthly Notices of the Royal Astronomical Society, vol. 518, issue 4, pp. 5904-5928,
2023. doi: 10.1093/mnras/stac3292.
[11] V. Savanevych, et al., Selection of the reference stars for astrometric reduction of CCD-frames,
Advances in Intelligent Systems and Computing, vol. 1080, pp. 881–895, 2020. doi:
10.1007/978-3-030-33695-0_57.
[12] F. Genova, Data as a research infrastructure CDS, the Virtual Observatory, astronomy, and beyond,
EPJ Web of Conferences, vol. 186, EDP Sciences, 2018. doi: 10.1051/epjconf/201818601001.
[13] P. Hasan, and S. N. Hasan, Astronomy data, virtual observatory and education, Proceedings
of the International Astronomical Union, vol. 15, issue S367, pp. 151-154, 2019.
doi: 10.1017/S174392132100034X.
[14] D. Oszkiewicz, et al., Spin rates of V-type asteroids, Astronomy and Astrophysics, vol. 643,
A117, 2023. doi: 10.1051/0004-6361/202038062.
[15] V. Akhmetov, et al., New approach for pixelization of big astronomical data for machine vision
purpose, IEEE International Symposium on Industrial Electronics, pp. 1706–1710, 2019.
doi: 10.1109/ISIE.2019.8781270.
[16] S. Khlamov, I. Tabakova, T. Trunova, Recognition of the astronomical images using the Sobel
filter, Proceedings of the 29th IEEE IWSSIP 2022, Sofia, Bulgaria, June 1st – 3rd, 4 p., 2022.
doi: 10.1109/IWSSIP55020.2022.9854425.
[17] M. K. Cavanagh, K. Bekki, and B. A. Groves, Morphological classification of galaxies with deep
learning: comparing 3-way and 4-way CNNs, Monthly Notices of the Royal Astronomical
Society, vol. 506, issue 1, pp. 659-676, 2021. doi: 10.1093/mnras/stab1552.
[18] L. Mykhailova, et al., Method of maximum likelihood estimation of compact group objects
location on CCD-frame, Eastern-European Journal of Enterprise Technologies, vol. 5, issue 4,
pp. 16-22, 2014. doi: 10.15587/1729-4061.2014.28028.
[19] Š. Parimucha, et al., CoLiTecVS – A new tool for an automated reduction of photometric
observations, Contributions of the Astronomical Observatory Skalnate Pleso, vol. 49, issue 2,
pp. 151-153, 2019. Bibcode: 2019CoSka..49..151P.
[20] S. Khlamov, et al., Development of the matched filtration of a blurred digital image using its
typical form, Eastern-European Journal of Enterprise Technologies, vol. 1, issue 9-121, pp.
62–71, 2023. doi: 10.15587/1729-4061.2023.273674.
[21] H. E. Bond, et al., Hubble Space Telescope Imaging of Luminous Extragalactic Infrared
Transients and Variables from the Spitzer Infrared Intensive Transients Survey, The
Astrophysical Journal, vol. 928, issue 2, p. 158, 2022. doi: 10.3847/1538-4357/ac5832.
[22] J. Bennett, S. Shostak, N. Schneider, and M. MacGregor, Life in the Universe, Princeton
University Press, 2022.
[23] D. S. Madgwick, Correlating galaxy morphologies and spectra in the 2dF Galaxy Redshift
Survey, Monthly Notices of the Royal Astronomical Society, vol. 338, issue 1, pp. 197-207, 2003.
doi: 10.1046/j.1365-8711.2003.06033.x.
[24] Ž. Ivezić, et al., Statistics, Data Mining, and Machine Learning in Astronomy: A Practical
Python Guide for the Analysis of Survey Data, Princeton University Press, 2019.
[25] R. Mor, et al., Expanding Big Data mining for Astronomy, XIV Scientific Meeting of the Spanish
Astronomical Society, p. 235, 2020. Bibcode: 2020sea..confE.235M.
[26] Y. Zhang, and Y. Zhao, Astronomy in the big data era, Data Science Journal, vol. 14, 2015.
[27] S. Khlamov, V. Savanevych, I. Tabakova, and T. Trunova, The astronomical object recognition
and its near-zero motion detection in series of images by in situ modeling, Proceedings of the
29th IEEE IWSSIP 2022. doi: 10.1109/IWSSIP55020.2022.9854475.
[28] C. J. Fluke, and C. Jacobs, Surveying the reach and maturity of machine learning and artificial
intelligence in astronomy, Wiley Interdisciplinary Reviews: Data Mining and Knowledge
Discovery, vol. 10, issue 2: e1349, 2020. doi: 10.1002/widm.1349.
[29] V. Shvedun, et al., Statistical modelling for determination of perspective number of advertising
legislation violations, Actual Problems of Economics, vol. 184, issue 10, pp. 389-396, 2016.
[30] W. P. McCray, The biggest data of all: Making and sharing a digital universe, Osiris, vol. 32,
issue 1, pp. 243-263, 2017. doi: 10.1086/693912.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Chierchie</surname>
          </string-name>
          , et al.,
          <article-title>Detailed modeling of the video signal and optimal readout of charge-coupled devices</article-title>
          ,
          <source>International Journal of Circuit Theory and Applications</source>
          , vol.
          <volume>48</volume>
          , issue 7, pp.
          <fpage>1001</fpage>
          -
          <lpage>1016</lpage>
          ,
          <year>2020</year>
          . doi: 10.1002/cta.2784.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Hoover</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Z.</given-names>
            <surname>Seligman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Payne</surname>
          </string-name>
          ,
          <article-title>The Population of Interstellar Objects Detectable with the LSST and Accessible for In Situ Rendezvous with Various Mission Designs</article-title>
          ,
          <source>The Planetary Science Journal</source>
          , vol.
          <volume>3</volume>
          , issue 3, p.
          <fpage>71</fpage>
          ,
          <year>2022</year>
          . doi: 10.3847/psj/ac58fe.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Peralta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>del Rio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ramirez-Gallego</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Triguero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Benitez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Herrera</surname>
          </string-name>
          ,
          <article-title>Evolutionary feature selection for big data classification: A map reduce approach</article-title>
          ,
          <source>Mathematical Problems in Engineering</source>
          , vol.
          <volume>2015</volume>
          , 246139, 11 p.,
          <year>2015</year>
          . doi: 10.1155/2015/246139.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Khalil</surname>
          </string-name>
          , et al.,
          <article-title>Big data in astronomy: from evolution to revolution</article-title>
          ,
          <source>International Journal of Advanced Astronomy</source>
          , vol.
          <volume>7</volume>
          , issue 1,
          <year>2021</year>
          . doi: 10.14419/ijaa.v7i1.18029.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Oszkiewicz</surname>
          </string-name>
          , et al.,
          <article-title>Spins and shapes of basaltic asteroids and the missing mantle problem</article-title>
          ,
          <source>Icarus</source>
          , vol.
          <volume>397</volume>
          , 115520,
          <year>2023</year>
          . doi: 10.1016/j.icarus.2023.115520.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Troianskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kashuba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Bazyey</surname>
          </string-name>
          , et al.,
          <article-title>First reported observation of asteroids 2017 AB8, 2017 QX33, and 2017 RV12</article-title>
          ,
          <source>Contributions of the Astronomical Observatory Skalnaté Pleso</source>
          , vol.
          <volume>53</volume>
          , pp.
          <fpage>5</fpage>
          -
          <lpage>15</lpage>
          ,
          <year>2023</year>
          . doi: 10.31577/caosp.2023.53.2.5.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V.</given-names>
            <surname>Troianskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kankiewicz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Oszkiewicz</surname>
          </string-name>
          ,
          <article-title>Dynamical evolution of basaltic asteroids outside the Vesta family in the inner main belt</article-title>
          ,
          <source>Astronomy and Astrophysics</source>
          , vol.
          <volume>672</volume>
          , A97,
          <year>2023</year>
          . doi: 10.1051/0004-6361/202245678.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>V.</given-names>
            <surname>Savanevych</surname>
          </string-name>
          , et al.,
          <article-title>Formation of a typical form of an object image in a series of digital frames</article-title>
          ,
          <source>Eastern-European Journal of Enterprise Technologies</source>
          , vol.
          <volume>6</volume>
          , issue 2-120, pp.
          <fpage>51</fpage>
          -
          <lpage>59</lpage>
          ,
          <year>2022</year>
          . doi: 10.15587/1729-4061.2022.266988.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>V.</given-names>
            <surname>Akhmetov</surname>
          </string-name>
          , et al.,
          <article-title>Fast coordinate cross-match tool for large astronomical catalogue</article-title>
          ,
          <source>Advances in Intelligent Systems and Computing</source>
          , vol.
          <volume>871</volume>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>16</lpage>
          ,
          <year>2019</year>
          . doi: 10.1007/978-3-030-01069-0_1.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>