=Paper=
{{Paper
|id=Vol-2022/paper58
|storemode=property
|title=
Astrophysical Data Analytics based on Neural Gas Models, using the Classification of Globular Clusters as Playground
|pdfUrl=https://ceur-ws.org/Vol-2022/paper58.pdf
|volume=Vol-2022
|authors=Giuseppe Angora,Massimo Brescia,Giuseppe Riccio,Stefano Cavuoti,Maurizio Paolillo,Thomas H. Puzia
|dblpUrl=https://dblp.org/rec/conf/rcdl/AngoraBRCPP17
}}
==
Astrophysical Data Analytics based on Neural Gas Models, using the Classification of Globular Clusters as Playground
==
© Giuseppe Angora 1 © Massimo Brescia 2 © Giuseppe Riccio 2 © Stefano Cavuoti 3 © Maurizio Paolillo 3 © Thomas H. Puzia 4
1 Department of Physics “E. Pancini”, University Federico II, Via Cinthia 6, 80126 Napoli, Italy
2 INAF Astronomical Observatory of Capodimonte, Via Moiariello 16, 80131 Napoli, Italy
3 Department of Physics “E. Pancini”, University Federico II, Via Cinthia 6, 80126 Napoli, Italy
4 Institute of Astrophysics, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, Macul, Santiago, Chile
gius.angora@gmail.com
Abstract. In Astrophysics, the identification of candidate Globular Clusters through deep, wide-field, single band HST images is a typical data analytics problem, where methods based on Machine Learning have shown high efficiency and reliability, demonstrating the capability to improve the traditional approaches. Here we experimented with some variants of the known Neural Gas model, exploring both supervised and unsupervised paradigms of Machine Learning, on the classification of Globular Clusters extracted from the NGC1399 HST data. The main focus of this work was to use a well-tested playground to scientifically validate such kinds of models for further extended experiments in astrophysics, using other standard Machine Learning methods (for instance Random Forest and the Multi Layer Perceptron neural network) as a baseline for comparing performances in terms of purity and completeness.
Keywords: data analytics, astroinformatics, globular clusters, machine learning, neural gas.
1 Introduction
The current and incoming astronomical synoptic surveys require efficient and automatic data analytics solutions to cope with the explosion of the amount of scientific data to be processed and analyzed. This scenario, quite similar to other scientific and social contexts, has pushed all communities involved in data-driven disciplines to explore data mining techniques and methodologies, most of which are connected to the Machine Learning (hereafter ML) paradigms, i.e. supervised/unsupervised self-adaptive learning and parameter space optimization [3],[6],[7].
Following this premise, this paper focuses on the investigation of a particular kind of ML methods, known as Neural Gas (NG) models [21], to solve classification problems within the astrophysical context, characterized by a complex multi-dimensional parameter space. In order to scientifically validate such models, we decided to approach a typical astrophysical playground, already solved with ML methods [8],[11], and to use in parallel two other ML techniques, chosen among the most standard ones, respectively Random Forest [5] and the Multi Layer Perceptron Neural Network [23], as comparison baseline.
The astrophysical case is related to the identification of Globular Clusters (GCs) in the galaxy NGC1399, using single band photometric data obtained through observations with the Hubble Space Telescope (HST) [8],[25],[27].
The physical identification and characterization of a Globular Cluster (GC) in external galaxies is considered important for a variety of astrophysical problems, from the dynamical evolution of binary systems to the analysis of star clusters, galaxies and cosmological phenomena [27].
Here, the capability of ML methods to learn and recognize peculiar classes of objects in a complex and noisy parameter space, by learning the hidden correlations among object parameters, has been demonstrated to be particularly suitable for the problem of GC classification [8]. In fact, multi-band wide-field photometric data (colours and luminosities) are usually required to recognize GCs within external galaxies, due to the high risk of contamination by background galaxies, which appear indistinguishable from galaxies located a few Mpc away when observed by ground-based instruments. Furthermore, in order to minimize the contamination, high-resolution space-borne data are also required, since they are able to provide particular physical and structural features (such as concentration, core radius, etc.), thus improving the GC classification performance [25].
In [8] we demonstrated the capability of ML methods to classify GCs using only single band images from the Hubble Space Telescope, with a classification accuracy of 98.3%, a completeness of 97.8% and only 1.6% of residual contamination, thus confirming that ML methods may yield low contamination while minimizing the observing requirements and extending the investigation to the outskirts of nearby galaxies.
These results gave us an optimal playground where to train NG models and to validate their potential to solve classification problems characterized by complex data with a noisy parameter space.
The paper is structured as follows: in Sect. 2 we describe the data used to test the various methods. In Sect. 3 we provide a short methodological and technical description of the models. In Sect. 4 we describe the experiments and results about the parameter space analysis and the classification experiments, while in Sect. 5 we discuss the results and draw our conclusions.
2 The Astrophysical Playground
As introduced, the HST single band data used are very suitable to investigate the classification of GCs. They are, in fact, deep and complete in terms of wide-field coverage, i.e. able to sample the GC population and to ensure the high S/N ratio required to measure structural parameters [10]. Furthermore, they provide the possibility to study the overall properties of the GC populations, which usually may differ from those of the central region of a galaxy.
With such data we intend to verify that Neural Gas based models are able to identify GCs with low contamination even with single band photometric information. Through the confirmation of such behavior, we are confident that these models could solve other astrophysical problems, as well as problems in other data-driven contexts.
2.1 The data
The data used in the described experiment consist of wide field single band HST observations of the giant elliptical galaxy NGC1399, located in the core of the Fornax cluster [27]. Due to its distance (D = 20.13 Mpc, see [13]), it is considered an optimal case in which a large fraction of its GC system can be covered with a restricted number of observations. This dataset was used by [25] to study the GC-LMXB connection and the structural properties of the GC population. The optical data were taken with the HST Advanced Camera for Surveys, in the broad V band filter, with 2108 seconds of integration time for each field. The observations were arranged in a 3x3 ACS mosaic with a scale of 0.03 arcsec/pix, and combined into a single image using the MultiDrizzle routine [19]. The field of view of the ACS mosaic covers ~100 square arcmin (Figure 1), extending out to a projected galacto-centric distance of ~55 kpc.
The source catalog was generated using SExtractor [4],[2], by imposing a minimum area of 20 pixels: it contains 12915 sources and reaches 7σ detection at m_V = 27.5, i.e. 4 mag below the GC luminosity function, thus allowing to sample the entire GC population (see [8] for details).
Figure 1 The FoV covered by the HST/ACS mosaic in the broad V band
The source subsample used to build our Knowledge Base (KB) to train the ML models is composed by 2100 sources with 11 features (7 photometric and 4 morphological parameters).
Such parameter space includes three aperture magnitudes within 2, 6 and 20 pixels (mag_aper1, mag_aper2, mag_aper3), the isophotal magnitude (mag_iso), the Kron radius (kron_rad), the central surface brightness (mu0), the FWHM (fwhm_im), and the four structural parameters, respectively, the ellipticity and the King's tidal, effective and core radii (calr_t, calr_h, calr_c). The target values of the KB required as ground truth for training and validation, i.e. the binary column indicating each source as GC or not GC, are provided through the typical selection based on multi-band magnitude and colour cuts. The original 2100 sources having a target assigned have been randomly shuffled and split into a training (70%) and a blind test set (30%).
3 The Machine Learning Models
In our work we tested three different variants of the Neural Gas model, using two additional machine learning methods, respectively a feed-forward neural network and Random Forest, as comparison benchmarks. In the following the main features of these models are described.
3.1 Growing Neural Gas
Growing Neural Gas (GNG) is presented by [14] as a variant of the Neural Gas algorithm (introduced by [21]), which combines Competitive Hebbian Learning (CHL, [22]) with a vector quantization technique to achieve a learning that retains the topology of the dataset.
Vector quantization techniques [22] encode a data manifold, e.g. V ⊆ R^D, using a finite set of reference vectors w_i ∈ R^D, i = 1, …, N. Every data vector v ∈ V is described by the best matching reference vector w_s(v), for which the distortion error ‖v − w_s(v)‖ is minimal. This procedure divides the manifold into a number of subregions V_i = {v ∈ V : ‖v − w_i‖ ≤ ‖v − w_j‖ ∀j}, called Voronoi polyhedra [24], within which each data vector v is described by the corresponding reference vector w_i.
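As a concrete illustration of this vector quantization step, the following minimal NumPy sketch (our own, with illustrative array names such as data and codebook) assigns each data vector to the reference vector with minimal distortion error, i.e. it computes the Voronoi partition induced by the codebook.

```python
import numpy as np

def best_matching_units(data, codebook):
    """Return, for each data vector, the index of the reference vector
    with minimal distortion error (i.e. the Voronoi cell it falls into)."""
    # squared Euclidean distortion between every sample and every unit
    dists = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

# toy usage: 2100 sources with 11 features quantized by 20 reference vectors
rng = np.random.default_rng(0)
data = rng.normal(size=(2100, 11))
codebook = rng.normal(size=(20, 11))
bmu = best_matching_units(data, codebook)                        # shape (2100,)
mean_distortion = ((data - codebook[bmu]) ** 2).sum(axis=1).mean()
```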
The Neural Gas network is a vector quantization model characterized by N neural units, each one associated to a reference vector and connected to the others. When an input v is extracted, it induces a synaptic excitation detected by all the neurons in the graph and causes their adaptation. As shown in [21], the adaptation rule can be described as a “winner-takes-most” instead of a “winner-takes-all” rule:
Δw_i = ε · h_λ(k_i(v, w)) · (v − w_i).   (1)
The step size ε describes the overall extent of the adaptation, while h_λ(k_i(v, w)) is a function of k_i(v, w), the “neighborhood-ranking” of the reference vectors with respect to the input v. Simultaneously, the first and second Best Matching Units (BMUs) develop a connection between each other [21].
Each connection has an “age”; when the age of a connection exceeds a pre-specified lifetime T, it is removed [21]. Martinez's reasoning is interesting [22]: they demonstrate how the dynamics of the neural units can be compared to a gaseous system. Let us define the density of reference vectors at location u through ρ(u) = 1/V(u), where V(u) is the volume of the corresponding Voronoi polyhedron. Hence, ρ is a step function on each Voronoi polyhedron, but we can still imagine that their volumes change slowly from one polyhedron to the next, with ρ continuous. In this way, it is possible to derive an expression for the average change:
⟨Δw_i⟩ = ε ∫ h_λ(k_i(v, w)) (v − w_i) P(v) dv,   (2)
where P(v) is the data point distribution. The equation suggests the name Neural Gas: the average change of the reference vectors corresponds to a motion of particles in a potential determined by the data distribution. Superimposed on the gradient of this potential there is a force proportional to −∇ρ, which points toward the direction of the space where the particle density is low.
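A minimal sketch of the “winner-takes-most” rule (1), assuming the common exponential neighbourhood function h_λ(k) = exp(−k/λ) of [21]; the function and variable names are illustrative, not taken from any specific implementation.

```python
import numpy as np

def neural_gas_step(codebook, v, eps=0.05, lam=2.0):
    """One 'winner-takes-most' adaptation step of rule (1) for a single input v:
    every reference vector w_i is moved towards v by an amount that decays
    with its neighbourhood rank k_i (0 for the BMU, 1 for the runner-up, ...)."""
    dists = np.linalg.norm(codebook - v, axis=1)
    ranks = np.argsort(np.argsort(dists))      # k_i: neighbourhood-ranking of each unit
    h = np.exp(-ranks / lam)                   # assumed form h_lambda(k) = exp(-k / lambda)
    codebook += eps * h[:, None] * (v - codebook)   # Delta w_i = eps * h_lambda(k_i) * (v - w_i)
    return codebook
```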
The main idea behind the GNG network is to successively add new units to an initially small network, by evaluating local statistical measures collected during the previous adaptation steps [14]. Therefore, each neural unit in the graph has an associated local reconstruction error, updated for the BMU at each iteration (i.e. each time an input v is extracted): Δerror_BMU = ‖v − w_BMU‖².
Unlike the Neural Gas network, in the GNG the synaptic excitation is limited to the receptive fields related to the Best Matching Unit and its topological neighbors: Δw_BMU = ε_b (v − w_BMU) and Δw_n = ε_n (v − w_n) for every direct neighbor n of the BMU. It is no longer necessary to calculate the ranking for all neural units, but it is sufficient to determine the first and the second BMU.
The increment of the number of units is performed periodically: during the adaptation steps the error accumulation allows to identify the regions of the input space where the signal mapping causes major errors. Therefore, to reduce this error, new units are inserted in such regions [14].
An elimination mechanism is also provided: once the connections whose age is greater than a certain threshold have been removed, if their connected units remain isolated (i.e. without emanating edges), those units are removed [14].
3.2 GNG with Radial Basis Function
Fritzke describes an incremental Radial Basis Function (RBF) network suitable for classification and regression problems [14].
The network can be figured out as a standard RBF network [9], with a GNG algorithm as embedded clustering method, used to handle the hidden layer.
Each unit of this hybrid model (hereafter GNGRBF) is a single perceptron with an associated reference vector and a standard deviation. For a given input-output pair (v, t), the activation of the i-th unit is described by a Gaussian of the distance from its reference vector, D_i(v) = exp(−‖v − w_i‖² / σ_i²). Each single perceptron o computes a weighted sum of the activations: y_o = Σ_i w_oi D_i(v).
The adaptation rule applies both to the reference vectors forming the hidden layer and to the RBF weights. For the first, the adaptation rule is the same as the updating rule of the GNG network, while for the weights:
Δw_oi = η (t_o − y_o) D_i(v).   (3)
Similarly to the GNG network, new units are inserted where the prediction error is high, updating only the Best Matching Unit at each iteration: Δerror_BMU = ‖t − y‖².
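To illustrate how the GNG hidden layer feeds the output perceptrons and how the weights follow Eq. (3), here is a hedged NumPy sketch of a single GNGRBF forward pass and delta-rule update; the Gaussian activation and the learning rate eta are the standard RBF choices assumed here, not necessarily the exact settings of [14].

```python
import numpy as np

def gngrbf_forward(v, centers, sigmas, weights):
    """Forward pass of the GNGRBF output layer for one input v.
    centers: (n_units, n_features) GNG reference vectors (hidden layer)
    sigmas:  (n_units,) widths of the Gaussian units
    weights: (n_outputs, n_units) perceptron weights"""
    act = np.exp(-np.sum((centers - v) ** 2, axis=1) / sigmas ** 2)   # D_i(v)
    return act, weights @ act                                         # outputs y_o

def delta_rule_update(weights, act, target, output, eta=0.01):
    """Eq. (3): each weight moves along the prediction error times the unit activation."""
    return weights + eta * np.outer(target - output, act)
```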
3.3 Supervised Growing Neural Gas
The Supervised Growing Neural Gas (SGNG) algorithm is a modification of the GNG algorithm that uses the class labels of the data to guide the partitioning of the data into optimal clusters [15],[20]. Each of the initial neurons is labelled with a unique class label. To reduce the class impurity inside a cluster, the original learning rule (1) is reformulated by considering the case where the BMU does or does not belong to the same class as the current input. Depending on such situation, the SGNG learning rule is expressed alternatively as in (4), where w_c denotes the nearest neuron of the same class as the input and β is a function specifically introduced to maintain neurons sufficiently distant from one another. For the neuron w_c, which is topologically close to the BMU, the rule intends to increase the clustering accuracy [20].
The insertion mechanism has to reduce not only the intra-distances between data in a cluster, but also the impurity of the cluster. Each unit has two kinds of associated error: an aggregated error and a class error. A new neuron is inserted close to the neuron having the highest accumulated class error, while its label is the same as that of the neuron with the greater aggregated error.
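The exact form of rule (4) is not reproduced here; the following sketch only illustrates one plausible reading of the class-aware mechanism described above (attract the BMU when it carries the same label as the input, otherwise repel it and attract the nearest unit of the correct class), with illustrative names and rates, not the rule of [20] verbatim.

```python
import numpy as np

def sgng_step(codebook, labels, v, v_label, eps_b=0.05, eps_r=0.05):
    """Schematic class-aware update: attract the BMU when its label matches the
    input, otherwise repel it and attract the nearest unit of the correct class."""
    dists = np.linalg.norm(codebook - v, axis=1)
    bmu = dists.argmin()
    if labels[bmu] == v_label:
        codebook[bmu] += eps_b * (v - codebook[bmu])        # same class: attraction
    else:
        codebook[bmu] -= eps_r * (v - codebook[bmu])        # wrong class: repulsion
        same = np.where(labels == v_label)[0]
        if same.size:                                       # nearest correct-class neuron
            c = same[dists[same].argmin()]
            codebook[c] += eps_b * (v - codebook[c])
    return codebook
```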
3.4 Multi Layer Perceptron
The Multi Layer Perceptron (MLP) architecture is one of the most typical feed-forward neural networks [23]. The term feed-forward identifies the basic behavior of such neural models, in which the impulse is always propagated in the same direction, e.g. from the input layer towards the output layer, through one or more hidden layers (the network brain), by combining the weighted sums associated to all neurons.
The neurons are organized in layers, each with its own role. The input signal, simply propagated throughout the neurons of the input layer, is used to stimulate the next hidden and output neuron layers. The output of each neuron is obtained by means of an activation function, applied to the weighted sum of its inputs.
The weights adaptation is obtained by the Logistic Regression rule [17], by estimating the gradient of the cost function, the latter being equal to the logarithm of the likelihood function between the target and the prediction of the model. In this work, our implementation of the MLP is based on the public library Theano [1].
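As a concrete picture of the feed-forward propagation and of the logistic (log-likelihood) gradient mentioned above, here is a minimal NumPy sketch of a one-hidden-layer MLP for binary classification; it is a didactic stand-in, not the Theano-based implementation actually used in this work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(X, W1, b1, W2, b2):
    """Feed-forward propagation through one hidden layer to a sigmoid output."""
    H = np.tanh(X @ W1 + b1)        # hidden activations
    p = sigmoid(H @ W2 + b2)        # predicted probability of the GC class
    return H, p

def logistic_gradient_step(X, y, W1, b1, W2, b2, lr=0.01):
    """One gradient step on the negative log-likelihood (cross-entropy) cost."""
    H, p = mlp_forward(X, W1, b1, W2, b2)
    err = p - y                                 # gradient at the output pre-activation
    dW2 = H.T @ err / len(y)
    db2 = err.mean()
    dH = np.outer(err, W2) * (1.0 - H ** 2)     # back-propagation through tanh
    dW1 = X.T @ dH / len(y)
    db1 = dH.mean(axis=0)
    return W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2
```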
3.5 Random Forest
Random Forest (RF) is one of the most widely known machine learning ensemble methods [5], since it uses a random subset of candidate data features to build an ensemble of decision trees. Our implementation makes use of the public library scikit-learn [26]. This method has been chosen mainly because it provides, for each input feature, a score of importance (rank), measured in terms of its percentage informative contribution to the classification results. From the architectural point of view, a RF is a collection (forest) of tree-structured classifiers h(x, Θ_k), k = 1, …, where the Θ_k are independent, identically distributed random vectors and each tree casts a unit vote for the most popular class at input x. Moreover, a fundamental property of the RF is the intrinsic absence of training overfitting [5].
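The importance ranking exploited later in Sec. 4.2 can be obtained directly from scikit-learn [26]; a minimal sketch follows, where the hyper-parameters and the randomly generated placeholder data are illustrative, not the actual configuration and catalog used in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feature_names = ["mag_aper1", "mag_aper2", "mag_aper3", "mag_iso", "mu0",
                 "fwhm_im", "ellipticity", "kron_rad", "calr_t", "calr_h", "calr_c"]

# placeholders for the real KB: 2100 sources, 11 features, binary GC / not-GC target
rng = np.random.default_rng(0)
X = rng.normal(size=(2100, 11))
y = rng.integers(0, 2, size=2100)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X, y)

# importance ranking, highest informative contribution first (cf. Figure 3)
for i in np.argsort(rf.feature_importances_)[::-1]:
    print(f"{feature_names[i]:12s} {rf.feature_importances_[i]:.3f}")
```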
4 The experiments
The five models previously introduced have been applied to the dataset described in Sec. 2.1 and their performances have been compared, to verify the capability of NG models to solve particularly complex classification problems, like the astrophysical identification of GCs from single-band observed data.
4.1 The Classification Statistical Estimators
In order to evaluate the performances of the selected classifiers, we decided to use some of the classical and widely used statistical estimators, respectively the average efficiency, purity, completeness and F1-score, which can be directly derived from the confusion matrix [28], shown in Figure 2. The average efficiency (also known as accuracy, hereafter AE) is the ratio between the sum of correctly classified objects of both classes (true positives for both classes, hereafter tp) and the total amount of objects in the test set. The purity (also known as precision, hereafter pur) of a class measures the ratio between the correctly classified objects and the sum of all objects assigned to that class (i.e. tp / [tp + fp], where fp indicates the false positives), while the completeness (also known as recall, hereafter comp) of a class is the ratio tp / [tp + fn], where fn is the number of false negatives of that class. The quantity tp + fn corresponds to the total amount of objects belonging to that class. The F1-score is a statistical test that considers both the purity and the completeness to compute the score (i.e. 2 [pur · comp] / [pur + comp]).
By definition, the dual quantity of the purity is the contamination, another important measure, which indicates the amount of misclassified objects for each class.
Figure 2 The confusion matrix used to estimate the classification statistics. Columns indicate the class of the objects as predicted by the classifier, while rows refer to the true classes of the objects. Main diagonal terms contain the number of correctly classified objects for the two classes, while fp counts the false positives and fn the false negatives of the GC class
In statistical terms, the classical tradeoff between purity and completeness in any classification problem is well known, and it is particularly accentuated in astrophysical problems [12]. In the specific case of the GC identification, from the astrophysical point of view we were mostly interested in the purity, i.e. in ensuring the highest level of true GCs correctly identified by the classifiers [8]. However, within the comparison experiments described in this work, our main goal was to evaluate the performances of the classifiers mostly in relation to the best tradeoff between purity and completeness.
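The estimators defined above follow directly from the 2x2 confusion matrix; a minimal NumPy sketch (with our own label convention, GC = 1) is given below.

```python
import numpy as np

def classification_scores(y_true, y_pred):
    """AE, purity, completeness and F1-score for the GC class (label 1)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    ae = (tp + tn) / len(y_true)           # average efficiency over both classes
    pur = tp / (tp + fp)                   # purity (precision); contamination = 1 - pur
    comp = tp / (tp + fn)                  # completeness (recall)
    f1 = 2 * pur * comp / (pur + comp)     # harmonic mean of purity and completeness
    return ae, pur, comp, f1
```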
4.2 Analysis of the Data Parameter Space
Before performing the classification experiments, we preliminarily investigated the parameter space, defined by the 11 features described in Sec. 2.1 and characterizing each object within the KB dataset of 2100 objects. The main goal of this phase was to measure the importance of each feature, i.e. its relevance in terms of informative contribution to the solution of the problem. In the ML context, this analysis is usually called feature selection [16]. Its main role is to identify the most relevant features of the parameter space, trying to minimize the impact of the well known problem of the curse of dimensionality, i.e. the fact that ML models exhibit a decrease of performance accuracy when the number of features is significantly higher than the optimal one [18]. This problem mainly affects cases with a huge amount of data and dimensions. However, its effects may also impact contexts with a limited amount of data and a limited parameter space dimension.
The Random Forest model resulted particularly suitable for such analysis, since it is intrinsically able to provide a feature importance ranking during the training phase. The feature importance of the parameter space representing the dataset used in this work is shown in Figure 3.
Figure 3 The feature importance ranking obtained by the Random Forest on the 11-feature domain of the input dataset during training (see Sec. 2.1 for details). The blue vertical lines report the importance estimation error bars
From the astrophysical point of view, this ranking is in accordance with the physics of the problem. In fact, as expected, among the five most important features there are the four magnitudes, i.e. the photometric log-scale measures of the observed object's photonic flux through different apertures of the detector. Furthermore, almost all photometric features resulted among the most relevant. By looking at Figure 3, there is an interesting gap between the first six and the last five features, whose cumulative contribution is just ~11% of the total. Finally, a very weak joint contribution (~3%) is carried by the two worst features (kron_rad and calr_c), which can be considered as the most noisy/redundant features for the problem domain.
Based on such considerations, the analysis of the parameter space provides a list of the most interesting classification experiments to be performed with the selected five ML models. This list is reported in Table 1. The experiment E1 is useful to verify the efficiency obtained by considering the four magnitudes. The experiment E2 is based on the direct evaluation of the best group of features as derived from the importance results. The classification efficiency of the full photometric subset of features is evaluated through the experiment E3. Finally, the experiment E4 is performed to verify the results obtained by removing only the two worst features.
Table 1 List of selected experiments, based on the analysis of the parameter space. The third column reports the identifiers of the included features, according to the importance ranking (see legend in Figure 3)
EXP ID | # features | included features
E1     | 4          | 1, 2, 3, 5
E2     | 6          | 1, 2, 3, 4, 5, 6
E3     | 7          | 1, 2, 3, 4, 5, 6, 10
E4     | 9          | 1, 2, 3, 4, 5, 6, 7, 8, 9
4.3 The Classification Experiments
Following the results of the parameter space analysis, the original domain of features has been reduced by varying the number and types of included features. Therefore, the classification experiments have been performed on the dataset described in Sec. 2.1, composed by 2100 objects and represented by a parameter space with up to a maximum of 9 features (Table 1).
The dataset has been randomly shuffled and split into a training set of 1470 objects (70% of the whole KB) and a blind test set of 630 objects (the residual 30% of the KB). These datasets have been used to train and test the selected five ML classifiers. The analysis of the results, reported in Table 2, has been performed on the blind test set, in terms of the statistical estimators defined in Sec. 4.2.
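A minimal sketch of the shuffle-and-split protocol described above (70% training, 30% blind test), using scikit-learn on placeholder data; the random seed and variable names are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# placeholder for the reduced KB: 2100 objects, up to 9 selected features
rng = np.random.default_rng(0)
X = rng.normal(size=(2100, 9))
y = rng.integers(0, 2, size=2100)          # GC (1) / not GC (0) targets

# random shuffle followed by a 70% / 30% partition
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, shuffle=True, random_state=42)
print(len(X_train), len(X_test))           # 1470 training and 630 blind-test objects
```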
Table 2 Statistical analysis of the classification performances obtained by the five ML models on the blind test set for the four selected experiments. All quantities are expressed in percentage and refer to the average efficiency (AE), the purity for each class (purGC, purNotGC), the completeness for each class (compGC, compNotGC) and the F1-score for the GC class. The contamination is the dual value of the purity
ID | Estimator   | RF   | MLP  | SGNG | GNGRBF | GNG
E1 | AE          | 88.9 | 84.4 | 88.1 | 88.1   | 88.4
   | purGC       | 85.9 | 80.1 | 89.7 | 85.4   | 83.7
   | compGC      | 87.3 | 82.6 | 80.3 | 85.7   | 89.2
   | F1-scoreGC  | 86.6 | 81.3 | 84.7 | 85.5   | 86.4
   | purNotGC    | 91.0 | 87.6 | 87.2 | 90.0   | 92.1
   | compNotGC   | 89.7 | 85.6 | 93.0 | 89.6   | 88.1
E2 | AE          | 89.0 | 85.1 | 87.3 | 88.3   | 83.2
   | purGC       | 84.9 | 77.0 | 81.0 | 82.9   | 74.0
   | compGC      | 89.2 | 90.7 | 90.3 | 90.0   | 91.1
   | F1-scoreGC  | 87.0 | 83.3 | 85.4 | 86.3   | 81.7
   | purNotGC    | 92.2 | 92.6 | 92.7 | 92.6   | 92.6
   | compNotGC   | 89.0 | 85.6 | 85.7 | 87.4   | 80.0
E3 | AE          | 89.0 | 83.2 | 85.1 | 89.2   | 86.8
   | purGC       | 85.2 | 77.2 | 80.0 | 86.0   | 84.1
   | compGC      | 88.8 | 83.8 | 84.9 | 88.0   | 83.8
   | F1-scoreGC  | 87.0 | 80.4 | 82.4 | 87.0   | 83.9
   | purNotGC    | 91.9 | 88.0 | 89.0 | 91.5   | 88.7
   | compNotGC   | 89.9 | 83.2 | 85.1 | 89.8   | 88.4
E4 | AE          | 89.5 | 86.0 | 88.1 | 88.7   | 83.8
   | purGC       | 85.3 | 82.5 | 84.1 | 83.8   | 78.3
   | compGC      | 90.0 | 83.8 | 87.6 | 90.0   | 83.8
   | F1-scoreGC  | 87.6 | 83.1 | 85.8 | 86.8   | 81.0
   | purNotGC    | 92.7 | 88.6 | 91.1 | 92.6   | 88.1
   | compNotGC   | 89.1 | 87.5 | 88.1 | 88.2   | 84.1
5 Discussion and Conclusions
As already underlined, the main goal of this work is the validation of NG models as efficient classifiers in noisy and multi-dimensional problems, with performances at least comparable to other ML methods, considered “traditional” in terms of their use in such kind of problems.
By looking at Table 2 and focusing on the statistics for the three NG models, it is evident that they are able to identify GCs among the other background objects, reaching a satisfying tradeoff between purity and completeness in all experiments and for both classes. The occurrence of statistical fluctuations is mostly due to the different parameter spaces used in the four experiments. Nevertheless, none of the three NG models overcomes the others in terms of the measured statistics.
If we compare the NG models with the two additional ML methods (Random Forest and MLP neural network), their performances appear almost the same. This implies that NG methods show classification capabilities fully comparable to other ML methods.
Another interesting aspect is the analysis of the degree of coherence among the NG models in terms of commonalities within the classified objects. Table 3 reports the percentages of common predictions for the objects correctly classified, considering, respectively, both classes together and each single class. On average, the three NG models agree among themselves on about 80% of the correctly classified objects.
Table 3 Statistics for the three NG models related to the common predictions of the correctly classified objects. The second column refers to both classes, while the third and fourth columns report, respectively, the statistics for the single classes
EXP ID | GC+notGC % | GC %  | notGC %
E1     | 86.0       | 85.4  | 86.9
E2     | 79.8       | 79.8  | 79.8
E3     | 81.1       | 82.5  | 79.2
E4     | 77.8       | 77.4  | 78.4
This is also confirmed by looking at Figure 4, where the tabular results of Table 3 are shown through Venn diagrams, which also report more details about the classification commonalities.
Figure 4 The Venn diagrams related to the prediction of all (both GCs and not GCs) correctly classified objects performed by the three Neural Gas based models (GNG, GNGRBF and SGNG) for the experiments, respectively, E1 (a), E2 (b), E3 (c) and E4 (d). The intersection areas (dark grey in the middle) show the objects classified in the same way by the different models. Internal numbers indicate the amount of objects correctly classified for each sub-region
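The commonalities summarized in Table 3 and Figure 4 amount to counting how many correctly classified test objects are shared by the three NG models; a minimal sketch with Python sets follows, where the prediction arrays are random placeholders standing in for the actual model outputs.

```python
import numpy as np

def correct_ids(y_true, y_pred):
    """Indices of the blind-test objects correctly classified by one model."""
    return set(np.where(y_pred == y_true)[0])

# y_true and the three prediction arrays would come from the blind test set
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=630)
pred = {m: rng.integers(0, 2, size=630) for m in ("GNG", "GNGRBF", "SGNG")}

sets = {m: correct_ids(y_true, p) for m, p in pred.items()}
common = set.intersection(*sets.values())              # central region of the Venn diagram
frac = len(common) / len(set.union(*sets.values()))
print(f"correctly classified by all three NG models: {len(common)} ({frac:.1%})")
```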
Finally, from the computational efficiency point of view, the NG models theoretically have a higher complexity than Random Forest and neural networks. But, since they are based on a dynamic evolution of their internal structure, their complexity strongly depends on the nature of the problem and its parameter space. Nevertheless, all the presented ML models have a variable architectural attitude to be compliant with parallel computing paradigms.
Besides the embarrassingly parallel architecture of the Random Forest, the use of optimized libraries, like Theano [1], makes also models like MLP highly efficient. From this point of view, NG models have a high potential to be parallelized. By optimizing GNG, the GNGRBF would automatically benefit, since both share the same search space, except for the additional cost of the RBF training. In practice, the hidden layer of the supervised network behaves just like a GNG network whose neurons act as inputs for the RBF network. Consequently, with the same number of iterations, the GNGRBF network performs a larger number of operations.
On the other hand, the SGNG network is similar to the GNG network, although characterized by a neural insertion mechanism acting over a longer period, thus avoiding too rapid changes in the number of neurons and excessive oscillations of the reference vectors. Therefore, on average, the SGNG network computational costs are higher than those of the models based on the standard Neural Gas mechanism.
In conclusion, although a more intensive test campaign on these models is still ongoing, we can assert that Neural Gas based models are very promising as problem-solving methods, also in the presence of complex and multi-dimensional classification and clustering problems, especially if preceded by an accurate analysis and optimization of the parameter space within the problem domain.
Acknowledgements
MB acknowledges the PRIN-INAF 2014 “Glittering kaleidoscopes in the sky: the multifaceted nature and role of Galaxy Clusters”, and the PRIN-MIUR 2015 “Cosmology and Fundamental Physics: illuminating the Dark Universe with Euclid”.
MB, GL and MP acknowledge the H2020-MSCA-ITN-2016 SUNDIAL (SUrvey Network for Deep Imaging Analysis and Learning), financed within the Call H2020-EU.1.3.1.
References
[1] Al-Rfou, R., Alain, G., Almahairi, A. et al.: Theano: A Python Framework for Fast Computation of Mathematical Expressions. arXiv e-prints abs/1605.02688 (2016)
[2] Annunziatella, M., Mercurio, A., Brescia, M., Cavuoti, S., Longo, G.: Inside Catalogs: A Comparison of Source Extraction Software. PASP 125, 923 (2013). doi: 10.1086/669333
[3] Astroinformatics. In: Brescia, M., Djorgovski, S.G., Feigelson, E.D., Longo, G., Cavuoti, S. (eds.), International Astronomical Union Symposium, 325 (2017). ISBN: 9781107169951
[4] Bertin, E., Arnouts, S.: SExtractor: Software for Source Extraction. A&A Suppl. Series, 117, pp. 393-404 (1996). doi: 10.1051/aas:1996164
[5] Breiman, L.: Random Forests. Machine Learning, 45 (1), pp. 5-32 (2001)
[6] Brescia, M., Cavuoti, S., Longo, G., Nocella, A., Garofalo, M., et al.: DAMEWARE: A Web Cyberinfrastructure for Astrophysical Data Mining. PASP 126, 942 (2014). doi: 10.1086/677725
[7] Brescia, M., Longo, G.: Astroinformatics, Data Mining and the Future of Astronomical Research. Nuclear Instruments and Methods in Physics Research A, 720, pp. 92-94, Elsevier (2013). doi: 10.1016/j.nima.2012.12.027
[8] Brescia, M., Cavuoti, S., Paolillo, M., Longo, G., Puzia, T.: The Detection of Globular Clusters in Galaxies as a Data Mining Problem. MNRAS 421, 2, pp. 1155-1165 (2012). doi: 10.1111/j.1365-2966.2011.20375.x
[9] Broomhead, D.S., Lowe, D.: Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks. Technical report, RSRE 4148 (1988)
[10] Carlson, M.N., Holtzman, J.A.: Measuring Sizes of Marginally Resolved Young Globular Clusters with the Hubble Space Telescope. PASP 113, 790, pp. 1522-1540 (2001). doi: 10.1086/324417
[11] Cavuoti, S., Garofalo, M., Brescia, M., Paolillo, M., Pescapè, A., Longo, G., Ventre, G.: Astrophysical Data Mining with GPU. A Case Study: Genetic Classification of Globular Clusters. New Astronomy, 26, pp. 12-22 (2014). doi: 10.1016/j.newast.2013.04.004
[12] D'Isanto, A., Cavuoti, S., Brescia, M., Donalek, C., Longo, G., Riccio, G., Djorgovski, S.G.: An Analysis of Feature Relevance in the Classification of Astronomical Transients with Machine Learning Methods. MNRAS 457 (3), pp. 3119-3132 (2016). doi: 10.1093/mnras/stw157
[13] Dunn, L.P., Jerjen, H.: First Results from SAPAC: Toward a Three-dimensional Picture of the Fornax Cluster Core. AJ 132 (3), pp. 1384-1395 (2006). doi: 10.1086/506562
[14] Fritzke, B.: A Growing Neural Gas Network Learns Topologies. In: Advances in Neural Information Processing Systems, 7, G. Tesauro, D.S. Touretzky and T.K. Leen (eds.), MIT Press, Cambridge MA (1995)
[15] Fritzke, B.: Supervised Learning with Growing Cell Structures. In: Advances in Neural Information Processing Systems, 6, Cowan, J.D., Tesauro, G., and Alspector, J. (eds.), Morgan-Kaufmann, pp. 255-262 (1994)
[16] Guyon, I., Elisseeff, A.: An Introduction to Variable and Feature Selection. JMLR 3, pp. 1157-1182 (2003)
[17] Harrell, F.E.: Regression Modeling Strategies. Springer-Verlag (2001). ISBN 0-387-95232-2
[18] Hughes, G.F.: On the Mean Accuracy of Statistical Pattern Recognizers. IEEE Transactions on Information Theory, 14 (1), pp. 55-63 (1968). doi: 10.1109/TIT.1968.1054102
[19] Koekemoer, A.M., Fruchter, A.S., Hook, R.N., Hack, W.: MultiDrizzle: An Integrated Pyraf Script for Registering, Cleaning and Combining Images. In: The 2002 HST Calibration Workshop, Santiago Arribas, Anton Koekemoer, and Brad Whitmore (eds.), Baltimore, MD: Space Telescope Science Institute (2002)
[20] Jirayusakul, A., Aryuwattanamongkol, S.: A Supervised Growing Neural Gas Algorithm for Cluster Analysis. Springer-Verlag (2006)
[21] Martinez, T., Schulten, K.: A Neural-Gas Network Learns Topologies. In: Artificial Neural Networks, T. Kohonen, K. Makisara, O. Simula, and J. Kangas (eds.), Amsterdam, The Netherlands, Elsevier, pp. 397-402 (1991)
[22] Martinez, T., Berkovich, G., Schulten, K.J.: Neural Gas Network for Vector Quantization and its Application to Time-Series Prediction. IEEE Transactions on Neural Networks, 4 (4), pp. 558-569 (1993)
[23] McCulloch, W., Pitts, W.: A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics, 5 (4), pp. 115-133 (1943)
[24] Montoro, J.C.G., Abascal, J.L.F.: The Voronoi Polyhedra as Tools for Structure Determination in Simple Disordered Systems. J. Phys. Chem., 97 (16), pp. 4211-4215 (1993). doi: 10.1021/j100118a044
[25] Paolillo, M., Puzia, T., Goudfrooij, P. et al.: Probing the GC-LMXB Connection in NGC 1399: A Wide-field Study with the Hubble Space Telescope and Chandra. ApJ, 736 (2), p. 90 (2011). doi: 10.1088/0004-637X/736/2/90
[26] Pedregosa, F., Varoquaux, G., Gramfort, A. et al.: Scikit-learn: Machine Learning in Python. JMLR, 12, pp. 2825-2830 (2011)
[27] Puzia, T., Paolillo, M., Goudfrooij, P., Maccarone, T.J., Fabbiano, G., Angelini, L.: Wide-field Hubble Space Telescope Observations of the Globular Cluster System in NGC 1399. ApJ, 786 (2), p. 78 (2014). doi: 10.1088/0004-637X/786/2/78
[28] Stehman, S.V.: Selecting and Interpreting Measures of Thematic Classification Accuracy. Remote Sensing of Environment, 62 (1), pp. 77-89 (1997). doi: 10.1016/S0034-4257(97)00083-7