<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Optimization of Marketing Decisions Based on Machine Learning: Case for Telecommunications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Galyna Chornous</string-name>
          <email>Galyna.Chornous@knu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yana Fareniuk</string-name>
          <email>yfareniuk@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Taras Shevchenko National University of Kyiv</institution>
          ,
          <addr-line>90-A, Vasylkivska st., Kyiv, 03022</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>112</fpage>
      <lpage>124</lpage>
      <abstract>
        <p>Among the main marketing tasks are retaining clients and increasing their activity. These tasks translate into consumer segmentation and improved communication with consumers. Machine learning helps to analyze information about subscribers and their use of the company's services and to find hidden insights for optimizing marketing activities. The purposes of the research are to propose appropriate methods for solving the clustering problem for customer segmentation and for classifying clients who will respond positively to E-mail, in order to optimize advertising mailings. The modeling was implemented on data of a Ukrainian telecommunications company by constructing a Self Organizing Map (SOM) with the g-means algorithm and k-means clustering to develop subscriber profiles by identifying similar behavior in terms of frequency and duration of consumption, as well as expenses, and to determine the most profitable customer segments. Such information will create a basis for the development of marketing activities aimed at certain groups of customers (personalized communications) and for optimizing the costs of targeted SMS/E-mail mailing. In order to minimize costs for clients who will not respond to advertising, such classification methods as JRip, DecisionTable, IBk, SMO, NaiveBayes, J48 (C4.5), RandomForest, Logistic regression and others were applied to the Response to Mailing variable, taking into account the sampling imbalance, which was addressed by an oversampling algorithm. The RandomForest, J48 and IBk models have the highest quality and are recommended for implementation in order to optimize advertising costs. Thus, based on the applied methods, the company can tailor the mailing to those customers who are more likely to respond. The research confirms the feasibility of using clustering and classification models of consumers to optimize marketing activities.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Keywords</title>
      <p>machine learning, consumer segmentation, optimizing mailings, E-mail, telecommunication.</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Marketing specialists around the world are trying to find the best solution for their marketing
activities. Disruptive technologies such as data analytics and machine learning have changed the ways
businesses operate. Of all the revolutionary technologies, artificial intelligence is the technological
disruptor and has enormous potential for marketing transformation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Data analytics provides more
valuable insights to strengthen business success and make real-time business decisions by scrutinizing
and deeply analyzing these data to choose a customized decision with a high level of sophistication
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Effective marketing can be built on the basis of high-quality and comprehensive information
about the market, competition and consumers. Marketing research is necessary to explain the
behavior of the company's customers, identify development prospects and
increase customer satisfaction, which has a positive influence on business results.
Advanced analysis, mathematical tools and machine learning algorithms allow companies to build
intelligent models and Decision Support Systems capable of learning from this data. Intelligent data
analysis makes it possible to solve such tasks as customer segmentation, management of customer outflow and
determination of the best way to retain customers; forecasting the response to an offer; effective attraction
of new customers, etc. For businesses, the consumer is the subject of increased attention, because consumer
behavior significantly affects the effectiveness of their marketing activities. Behavioral segmentation
of consumers helps to increase sales, which affects consumer loyalty [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. As a result, the marketing
tasks of customer segmentation and the development of an effective customer relationship strategy are
very important for all companies on the market. Segmentation of the client base is a basis for
forming effective communication with different target groups through the development of personalized
advertising offers.
      </p>
      <p>2022 Copyright for this paper by its authors.</p>
      <p>
        Advertising is an important competitive tool. Media activity attracts consumers, increases brand
awareness, creates loyalty to the company, distinguishes it from competitors or changes the taste of
consumers. The company’s market share drops in the absence of advertising [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
Media is a means of communication that generates various marketing results among consumers. With
the advent of mobile communication and smartphones, consumer preferences can be
determined in advance, and therefore advertising can be delivered to consumers in a multimedia format at the
right time and place with the right message. As marketing communications spread, the capability to
target the right audience becomes increasingly important. Audience targeting practices in media tend
to highlight the demographics, behavior and other consumer characteristics as a basis for selecting the
right messages for each audience [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In the case of this new advertising opportunity, the development
of personalized mobile advertising to meet consumer needs is becoming an important challenge [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        The mobile advertising paradigm is shifting to personalized advertising services for each consumer
in this era of data. In the telecommunications market, the growing demand for smart devices and the
emergence of 4G mobile networks have increased the use of mobile services with increased
competition among category players. Lately, as the mobile ecosystem has become more complex,
marketing specialists are focusing on targeted marketing to clients to maximize ad impact [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and
increase revenue.
      </p>
      <p>The purposes of the research are to propose appropriate machine learning methods for solving the
clustering problem for customer segmentation and for classifying clients who will respond
positively to E-mail, in order to optimize advertising mailings. The results of this research can be
used as a basis for customer relationship management.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Literature review</title>
      <p>
        Well-done segmentation leads to a better understanding of the market and customer needs. The
research [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] attempts to develop a new methodological approach combining Recency, Frequency and
Monetary with the K-means clustering and provides a useful tool and valid methodology for
marketing specialists and decision-makers to accurately identify the most profitable consumer
segments.
      </p>
      <p>
        Arunachalam and Kumar [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] evaluate the effectiveness of different clustering approaches for
finding profitable consumer segments. The data are analyzed by hierarchical clustering, K-Medoids,
fuzzy clustering and Self Organizing Maps (SOMs). The effectiveness of different clustering methods
differs considerably in practice. The obtained results indicate that fuzzy clustering and SOM
are comparatively more effective than traditional techniques at detecting hidden structures in datasets.
Segments derived from SOM have more potential to provide interesting and useful insights for
data-based decision-making in business practice. Pukala R. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] applies artificial neural networks
(ANNs), namely the Kohonen network, to quantify the risks of an innovative company. Approaches to
market segmentation and consumer diagnostics based on multivariate statistical analysis and ANN are
considered in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] (market segmentation using the SOM and further refinement of the results by
the k-means algorithm, and consumer diagnosis by discriminant analysis and multilayer perceptron).
      </p>
      <p>
        Zethmayr and Makhija [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] apply k-means clustering to customers according to demographics.
Data analysis in the paper [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] was carried out by clustering according to Ward's methodological
approach, which identified groups with different socio-demographic characteristics and hierarchical
preferences. Ortiz et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] explore two-stage clustering to gain a deeper understanding
of consumers’ consumption behavior by profiling and identifying consumer segments
based on their habits and lifestyles. The purpose of the paper [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] is to determine the
motivational profile of consumers with a factor-cluster analysis using exploratory factor analysis with
an unweighted least squares method. The discriminant analysis established factor importance and
demographics.
      </p>
      <p>
        One of the goals of training in predictive analytics is to create a model from a set of data. The goal
of various strategies and experiments is to create a more accurate forecasting model. The aims of [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
are to take sequential steps to find an accurate model for data and save it for future implementation
with Python. An integrated intelligent system for monitoring, modeling and managing the life cycle of
the company's products has been developed in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. This system is presented in the form of a three
equations structure, which functions in conditions of instability. The system is based on the concepts,
principles, a set of nonlinear models and methods of decision-making and management.
      </p>
      <p>
        One of the biggest challenges to creating more successful marketing strategies in the telecom
market is understanding the diversity of consumer needs and the identification of consumer segments
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In consumer research, segmentation has been widely used to identify subsets of consumers based
on their preferences. Since the last decade, a more comprehensive assessment of product performance
has led to the consideration of a variety of information. Verain et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] determine consumer
segments and explore differences between them by consumers’ perceptions. To determine consumer
segments, a three-way cluster analysis around the latent variables approach is examined in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
This method groups consumers into clusters and evaluates for each cluster an associated product latent
variable, attribute weights and a set of consumer indicators that can help to determine product
characteristics for the cluster. In studies where, in addition to preference indicators, external
information about products as well as about consumers is available, the clustering by latent variables
(CLV) methodology can be used for customer segmentation. A direct approach, L-CLV, has
demonstrated its ability to detect consumer segments related to a large number of
sociological and behavioral parameters [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
      <p>
        Delley and Brunner [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] investigate consumer segmentation using hierarchical cluster analysis.
Consumer descriptions varied greatly between groups, indicating heterogeneity. These results indicate
the need to study segmentation during data analysis. Techniques of hierarchical segmentation were
used in [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] to determine different groups of consumers based on their values and lifestyles. The
results contribute to the theoretical and practical aspects of customer segmentation. The
recommendations and findings emphasize the importance of implementing different strategies for
each segment. Cha and Park [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] show that the results of clustering can be used to find the
appropriate strategy for each cluster.
      </p>
      <p>
        The research by [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] presents an optimal predictive segmentation algorithm to identify subgroups
that are homogeneous with regard to certain patterns in customer attributes and predictive of the
desired outcome. The authors create an intuitive segmentation with high interpretability and an optimal
targeting process for the company’s clients. In this setting, the business develops a small number of
messages that will be sent to appropriately selected customers who are most likely to respond to
different message types. The proposed method uses consumption, demographics, and participation
data to extract underlying predictive rules from the dataset using machine learning algorithms.
      </p>
      <p>
        Marketers use marketing logic to target ads to specific consumer segments. However, there is not
always a clear alignment between consumer segmentation and targeting, which can lead to a potential
reduction in effectiveness [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Predicting the probability that a user will respond to a particular ad
has been a common problem in advertising that has attracted much research attention. In recent years,
a growing number of new learning models have emerged to improve ad CTR prediction [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>
        Over the past decades, the rapid development of information and communication
technologies has led to the spread of the Internet among broad segments of the population. Thanks to
various technological advances, the Internet has made it possible for advertisers to reach their target
audience [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. Artificial intelligence technologies have numerous applications for online
advertising, particularly to optimize the coverage of target audiences. Choi and Lim [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] investigate
and categorize different techniques of machine learning used to improve targeted online advertising.
A neural network classifier is proposed by Abrahams et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to assign ads to groups that represent
different media channels. When classifying unviewed ads, the model outperforms
a random model by 100-300%. The authors
suggest using ANN for automated media planning and advertising targeting. Mobile advertising has
evolved into a technology that allows an advertiser to effectively and efficiently promote products or
services to target consumers.
      </p>
      <p>
        Direct marketing is a crucial instrument for company promotion, and direct mailing
is an important part of it. One approach to improving direct mailing targeting is response modeling, which is
predictive modeling that assigns the probability of future responses to customers based on their
history with a company. Coussement et al. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] present well-known statistical methods for data
classification and analysis (logistic regression, linear and quadratic discriminant analysis, naive
Bayes, neural networks, decision trees such as CHAID, CART and C4.5, and the kNN algorithm).
The results show that data mining algorithms (CHAID, CART and neural networks) perform
well, followed by simplified statistical classifiers such as logistic regression and linear
discriminant analysis. The research by Reynaldo [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] explores a social network user gender prediction
model using AdaBoost, XGBoost, Support Vector Machine and Naive Bayes Classifier combined
with grid search and K-Fold validation. Kaefer et al. [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] develop an alternative scoring approach for
classifying new clients as "good" or "bad" prospects for direct marketing. The research proves that the
approach of using only demographics to profile consumers can be enhanced by observing their
purchases. The authors establish multinomial logit and neural network models, which can help to
classify and target potential consumers.
      </p>
      <p>
        Perception of SMS advertising has a significant direct or indirect impact on consumer purchase
intention. However, there is a dearth of comprehensive research that suggests the predictors of SMS
advertising perception and the process by which it impacts purchase intention. The research [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]
focuses on developing a model based on the stimulus-organism-response framework with a two-stage
hybrid model using PLS-SEM and ANN. The research benefits marketing specialists by facilitating better
decision-making when developing effective advertising campaigns based on mobile SMS advertising.
      </p>
      <p>
        SMS helps companies to make direct interactions with their target consumers at any time and
location using their mobile phones. Using a modified technology acceptance model, the paper
[
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] explores the factors influencing consumer acceptance of SMS advertising.
Usefulness is important in establishing favorable consumer attitudes toward SMS advertising. The authors
show that consumers perceive SMS advertising differently. Email should be direct and personalized.
      </p>
      <p>
        Measuring the effectiveness of email marketing is difficult. To maintain competitiveness,
managers must maximize profits from mailings by deciding who should receive them. Before
achieving the main purpose of converting sales, the intermediate goal of email campaigns is to capture
interest and drive traffic to the website. The paper [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] examines the relevance of variables that
impact recipient interest in promotional emails and provides companies with actionable and useful
insights on how to plan and deploy email marketing strategies with higher efficiency. Paper [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]
presents a two-step approach that allows companies to consider the dynamic consequences of mailing
and make effective mailing decisions by maximizing the customer’s long-term value. The authors
suggest a heterogeneous hidden Markov model to capture the interactive dynamics between customers
and mailings and use the resulting parameters to develop optimal mailing decisions using a Partially
Observable Markov Decision Process. Both immediate and remote consequences of mailings are
taken into account. Although email marketing is one of the most cost-effective tools, it remains
problematic due to low email open rates and a high percentage of unsubscribed campaigns. The
structure and content of the topic are investigated in [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ] together with various machine learning
techniques (Random Forest, Decision Trees, ANN, Naive Bayes, Support Vector Machines and
Gradient Boosting). The results show that combining the data leads to more accurate classification.
      </p>
      <p>
        Nowadays, mobile advertising focuses on powerful algorithms for personalized recommendations.
Chen and Hsieh [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] propose the fuzzy Delphi method to determine the main personalized attributes in
a personalized mobile advertising message for various products.
      </p>
      <p>However, many questions about optimizing marketing decisions within
customer relationship management remain insufficiently studied, in particular customer base
segmentation in the telecommunication market using clustering methods, as well as optimization of
email marketing via classification of clients who will respond positively or negatively to mailings.
Solving such marketing tasks forms an information-analytical basis for optimizing marketing
activity and making effective decisions on further business development and marketing strategy.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Methodology and dataset description</title>
      <p>In such a high-tech field as telecommunications, as well as in the field of marketing in general,
machine learning methods and approaches have been widely used. Among the main problems that
need to be solved are, first of all, those related to loyalty programs and maintenance of the existing
client base, as well as the attraction of new consumers of services.</p>
      <p>Internal systems of telecommunications companies accumulate large volumes of data every day.
First of all, this is information about subscribers and statistics on their use of the company's services.
The analysis of such information without using the capabilities of information technologies is
ineffective, which creates significant opportunities for the use of approaches and methods of machine
learning to optimize marketing activities and increase their effectiveness.</p>
      <p>The research goals are to propose the relevant methods to solve the task of clustering (client
segmentation) and classification for optimization of advertising mailing for the consumer base. The
modeling was implemented based on a database of one of the Ukrainian telecommunications
companies, which provides mobile communication and the Internet. The management decided on the
need to segment the subscriber base considering the purpose of optimizing the marketing activities, in
particular for building profiles of subscribers by identifying their similar behavior in terms of
frequency, duration of service use, as well as the level of expenses; assessment and determination of
the most profitable customer segments. This forms a scientific hypothesis about factors determining
consumer segments. In the future, such information will create a basis for the development of
marketing activities aimed at certain groups of customers (personalized promotional
communications); development of new tariff plans; optimization of costs for addressed
SMS/Viber/Email distribution in relation to new services and tariffs; predicting and avoiding the outflow of
customers to other competing companies.</p>
      <p>The data, downloaded from the internal system, form a table with the following fields: age of the
client; average monthly expenses (average amount spent per subscriber on mobile
communication and mobile Internet); average duration of calls (average number of minutes of
outgoing calls by a subscriber per month); daytime/evening/night activity per month (number of
activities (calls, messages, Internet connections) per month in the morning and daytime, evening, and night,
respectively); activity with other cities/countries per month; the share of calls to landline phones
(city numbers); and the volume of Internet per month (number of Mb of Internet consumed). Only active
subscribers who have regularly used mobile communication and/or mobile Internet services
over the past few months were selected. The dataset for the experiment contains information about
4591 clients of this company and is used for customer segmentation via a Kohonen SOM
with the g-means algorithm in Deductor Studio and k-means clustering in Weka. These methods were
selected for the clustering task because of the specificity of the dataset, which contains
client-related data about consumption of the company’s services, and because they are
useful techniques for segmentation, as noted in the literature review.</p>
      <p>In addition,
the task of optimizing mailings to customers, with the goal of minimizing costs for those who do not
respond to advertising, often arises in marketing. Using the example of a database that
contains information about customers and their activity, the application of classification methods to
advertising response will be demonstrated as a way to increase the effectiveness of advertising activity.</p>
      <p>The dataset used for this experiment contains information on 13,500 customers,
including known responses to the advertising E-mail and information such as gender, age, number of
years the customer has been the company’s client, the total value and total number of all purchases,
calls to service support, etc. In total, 9 independent variables and 1 dependent variable are available for
analysis. The task is reduced to binary classification, where the variable "Response" is the
class indicator: "1" means that the customer responded, and "0" means no response.</p>
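      <p>The abstract notes that the class imbalance in the "Response" variable was handled by an oversampling algorithm, without naming it. The following is a minimal sketch of plain random oversampling, which may differ from the algorithm the authors used. The function name, the toy rows and the "age" field are hypothetical and serve only as an illustration.</p>
      <preformat><![CDATA[
```python
import random
from collections import Counter


def random_oversample(rows, label_key="Response", seed=0):
    """Duplicate randomly chosen rows of the smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 4 responders ("1") and 16 non-responders ("0").
toy_rows = [{"age": 20 + i, "Response": 1 if i % 5 == 0 else 0} for i in range(20)]
balanced_rows = random_oversample(toy_rows)
class_counts = Counter(row["Response"] for row in balanced_rows)
```
]]></preformat>
      <p>In practice, oversampling would normally be applied to the training part of the data only, so that duplicated rows do not leak into the evaluation set.</p>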
      <p>The scientific hypothesis is that the presented factors describe the probability of a positive or negative
response to an ad message and that machine learning methods can effectively predict this response. The task
is to classify consumers as clearly as possible according to the probability of responding to an
advertising message. The results were analyzed in order to formulate recommendations for
minimizing costs for new mailings to customers. In addition, the following information is also known
for this company: costs for one mailing CM = UAH 1, costs for retaining 1 client CR = UAH 9,
expected revenue from 1 client R = UAH 20, so the objective is assumed to be the maximization of income from
communication with the consumer through the mailing. Consider the possible classification results
(Table 1).</p>
      <p>The total revenue will be TP*(R-CM-CR)-FP*CM, where TP is the number of true positives (mailed
customers who respond) and FP is the number of false positives (mailed customers who do not respond).
To evaluate the predictive power of the
classification model, it is necessary to compare the expected revenue with that which can be obtained
under the condition of mass mailing to all participants. In order to choose the best classifier, the
ZeroR, PART, OneR, JRip, Decision Table, IBk, SMO, Naive Bayes, J48(C4.5), Random Forest,
Logistic regression, and AdaBoostM1 methods will be used. They were selected for solving the
classification task, considering the scientific achievements of researchers, which prove the high
performance of the mentioned approaches, and the specific task of cost optimization with the relevant
dataset. The Cost Sensitive option will also be applied to the mentioned methods. All of them
will be compared and the methods with the best performance will be selected for implementation
according to the goal of minimizing costs for ineffective mailings and improving revenue.</p>
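      <p>The revenue criterion can be sketched in a few lines of code. The values of CM, CR and R are taken from the text; the confusion-matrix counts are hypothetical and chosen only so that they sum to 13,500 customers. The break-even probability is not stated in the paper: it follows from the formula under the assumption that each mailed customer contributes independently to the total.</p>
      <preformat><![CDATA[
```python
CM = 1   # cost of one mailing, UAH (from the text)
CR = 9   # cost of retaining one client, UAH (from the text)
R = 20   # expected revenue from one client, UAH (from the text)


def mailing_revenue(tp, fp, r=R, cm=CM, cr=CR):
    """Total revenue TP*(R-CM-CR) - FP*CM over the customers selected for mailing."""
    return tp * (r - cm - cr) - fp * cm


def mass_mailing_revenue(responders, non_responders, r=R, cm=CM, cr=CR):
    """Baseline: everyone is mailed, so each responder is a TP and each non-responder an FP."""
    return mailing_revenue(responders, non_responders, r, cm, cr)


def break_even_probability(r=R, cm=CM, cr=CR):
    """Response probability p at which p*(R-CM-CR) - (1-p)*CM equals zero."""
    return cm / (r - cr)


# Hypothetical classifier outcome on 13,500 customers (not results from the paper).
tp, fp, fn, tn = 800, 1500, 200, 11000
model_revenue = mailing_revenue(tp, fp)
baseline_revenue = mass_mailing_revenue(tp + fn, fp + tn)
threshold = break_even_probability()
```
]]></preformat>
      <p>With these illustrative counts the selective mailing is profitable while the mass mailing is not, which is the comparison the criterion is meant to support. Under the stated assumption, mailing a customer pays off only when the predicted response probability exceeds CM/(R-CR), about 0.09 for the given costs.</p>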
      <p>The research was implemented through step-by-step analysis and modeling. The overall process
of optimizing marketing decisions for telecommunications companies through machine learning
technologies is shown in Figure 1.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Results</title>
      <p>At the initial stage of work with the clustering problem, the SOM algorithm was applied with
automatic selection of the number of clusters. The Deductor Studio software was used for
implementation, and 9 clusters (0-8) with different profiles were formed.</p>
      <p>On the basis of the obtained SOMs (Fig. 2), it is possible to analyze in detail the groups of
consumers based on various characteristics and the formed customer clusters. Thus, analyzing the
"Age" map, three age groups can be clearly distinguished: young people, middle-aged people, and
people over 45 years old. Looking at young people in more detail, we can see that this group is quite
heterogeneous and that several separate clusters can be distinguished within it. The first is located in the
upper right corner and consists of customers who actively use the company's services in
the evening and at night and use Internet services to a large extent. As a result, they spend more on
mobile communication and mobile Internet than other representatives of this age group. This segment
includes most of those who prefer activity at night. It can be assumed that these are students and
young people who often spend the evening outside the house, communicate with friends or watch
video content. A small group of young people is concentrated below; it does not stand out by
activity in using services during the day, in the evening, or even at night, and therefore
monthly expenses for communication and the Internet in this cluster are small.</p>
      <p>The rest of the people in this age group do not stand out in any particular way: they have moderate
expenses for communication and the Internet and are mostly active in the evening. It can be
assumed that most young people fall into this cluster. Thus, we clearly identified three clusters in the youth age
group. Continuing the interpretation of SOM, we will focus on people of mature and retirement age.
Let's pay attention to the pronounced cluster in the lower left part, in which high values are observed
for almost all indicators, except for the Internet, including activity with other cities and countries.
These are so-called "VIP" clients: businessmen, executives, top managers. The vast majority of them
are of mature age, they carry out a lot of activities during the day and in the evening (most likely, due
to their work) and use the mobile Internet the least. Monthly expenses for communication and Internet
in this category of subscribers are the highest among all the company's clients.</p>
      <p>One of the clusters on the left shows the opposite picture: people make almost no use of
mobile communication and Internet services. Most likely, these are pensioners who need mobile
communication and/or the Internet primarily to receive incoming calls, and their self-initiated
activity is minimal. Expenses in this group of clients are the lowest, which may be because a pension
is their only source of income. The rest of the people in the mature and retirement age group are
mainly active in the evening and do not use the Internet very actively. Most likely, these are working
pensioners, summer residents, and parents of adult children. The last cluster, middle-aged people,
includes working subscribers, among whom there is a group that is not very active in the evening
(perhaps employees with a nonstandard work schedule, such as night or evening shifts).</p>
      <p>However, automatic determination of the number of clusters with the G-means algorithm
produces 9 clusters in this case, which is too many for convenient practical use. It is therefore
recommended to reduce the number of clusters to 6 and to apply the k-means algorithm. Six clusters
(0…5) were formed in the Weka software, each containing from 4% to 31% of customers, which
indicates a sufficient number of cases for training and future application of the model. Each cluster
has its own centroid value for each of the 10 indicators; these values define the differences between
clusters and, as a result, the differences in the behavior of each group.</p>
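      <p>The k-means step described above can be illustrated with a minimal sketch. It is not the Weka implementation used in the study: the function names, the toy data and the choice of two features are assumptions made only for illustration. It shows the two alternating steps of the algorithm (assignment to the nearest centroid, then centroid update) and how the share of customers per cluster can be computed.</p>
      <preformat>
```python
import random
from collections import Counter


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initial centroids: k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def cluster_shares(labels):
    """Share of customers per cluster, as fractions that sum to 1."""
    counts = Counter(labels)
    return {c: n / len(labels) for c, n in sorted(counts.items())}


# Hypothetical toy data: (age, monthly expenses), already scaled to [0, 1].
customers = [(0.2, 0.3), (0.25, 0.35), (0.8, 0.1), (0.85, 0.15), (0.5, 0.9), (0.55, 0.95)]
centroids, labels = kmeans(customers, k=3)
shares = cluster_shares(labels)
```
      </preformat>
      <p>In the study itself the same idea is applied with k = 6 and 10 indicators; the result depends on the initial centroids, so in practice several random starts are usually compared.</p>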
      <p>Based on the results, we characterize the clusters shown in Figure 3. Cluster 5 includes mature
and older people who have the lowest expenses for communication and the Internet and minimal activity
in using all services; it can be assumed that this cluster is formed primarily by retirees and people
with low incomes. Cluster 4 includes young people up to mature age with moderate expenses for the
Internet and communication, whose activity increases especially in the evening and at night and who
have the highest share of mobile Internet use; it can be assumed that this group includes active young
people who spend their free time outside the home or actively use the Internet for communication or
entertainment. Cluster 3 includes young and mature people with moderate expenses and the highest
levels of phone conversations, with activity increasing especially in the evening. We assume that this
group includes people who actively use the company's services for interpersonal communication.
Cluster 2 includes people under 35 years of age with moderate expenses and average activity
indicators. This group actively uses the Internet and shows varying levels of activity in the evening.
Cluster 1 includes mature and older people with moderate spending and average activity indicators,
with minimal activity at night and on the Internet. Cluster 0 includes people of almost all age
categories who have the highest expenses for communication and the Internet, an average level of
phone conversations, the highest activity during the day and evening, and the highest activity with
other cities and countries; it can be assumed that this group is formed by "VIP clients", that is,
business representatives who use communication and the Internet for business purposes.</p>
      <p>[Figure panel labels: Age; Average expenses; Average calls duration; Daily activity; Evening activity; Night activity; Activity with other cities; Activity with other countries; Share of calls to landline phones; Activity on the Internet]</p>
      <p>Thus, the results of this clustering create a basis for improving the product offer and for the
company's transition to personalized communication with its subscribers in order to increase revenue
per user and to ensure loyalty to the company. These problems will be addressed later.</p>
      <p>It is important to investigate whether the division into clusters is preserved over different
periods of time and over what time horizon it is appropriate to apply these results. The analysis
shows that in the short term (1-3 years) customer behavior is relatively stable, because customers
consume mobile communication and mobile Internet services in almost unchanged volumes, have a
relatively unchanged standard of living and income, etc. In the long term (5-10 years and more),
however, the results of clustering may deviate significantly from reality, as customer behavior is
affected by a number of factors and general market trends. The results of clustering must therefore be
updated during annual strategic and tactical planning, so that they correspond to the actual behavior
of customers and marketing activities are effectively adapted to current conditions. Although the
results require regular verification, the potential and effectiveness of machine learning and Data
Science methods for business are constantly growing. The conclusions formed are part of
comprehensive research aimed at forming an effective marketing strategy and, as a result, at managing
marketing activities in general. Segmentation of customers based on machine learning increases the
quality of advertising planning, as it makes it possible to launch personalized communication with
digital placement tools and to ensure quality management of customer loyalty through product
improvement and relevant offers to interested segments.</p>
      <p>The next marketing task to be solved is the optimization of e-mail mailings to customers in
order to maximize response and minimize costs for those who do not respond to advertising activity.
Applying different classification methods helps to increase the efficiency of advertising activity by
developing effective recommendations for future advertising mailings. The results of various
classification approaches were analyzed to assess the quality of the classification.</p>
      <p>According to the classification results, OneR, JRip and Decision Table show the best accuracy
(compared to the baseline ZeroR classifier, accuracy increased from 85.5% to 92.0-92.2%). Among
them, the Decision Table approach showed the highest results. However, given the business goal of
minimizing costs, the more expensive error is the one where the model predicts a response but there
is none. This error is the smallest for the OneR method (no clients were incorrectly classified in
this way). In this regard, it is recommended to use a combination of Decision Table and OneR.</p>
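      <p>The relationship between the ZeroR baseline and the costly error type can be made concrete with a short sketch. The 14.5% response share and the 85.5% baseline accuracy are taken from the text; the sample size of 1000 and the confusion matrix of the second classifier are hypothetical values chosen only to illustrate the calculation.</p>
      <preformat>
```python
def accuracy(tp, fp, tn, fn):
    """Share of correctly classified customers."""
    return (tp + tn) / (tp + fp + tn + fn)


n = 1000          # hypothetical sample size
responders = 145  # 14.5% of n, the response share reported in the text

# ZeroR predicts the majority class (no response) for every customer,
# so it never sends a wasted mailing but also never finds a responder.
zero_r = dict(tp=0, fp=0, tn=n - responders, fn=responders)
baseline_accuracy = accuracy(**zero_r)

# Hypothetical classifier (illustrative numbers, not results from the study):
# fp counts mailings predicted to get a response that got none.
candidate = dict(tp=80, fp=15, tn=840, fn=65)
candidate_accuracy = accuracy(**candidate)
wasted_mailings = candidate["fp"]

print(f"ZeroR accuracy: {baseline_accuracy:.1%}")
print(f"candidate accuracy: {candidate_accuracy:.1%}, wasted mailings: {wasted_mailings}")
```
      </preformat>
      <p>The baseline accuracy equals the share of the majority class, which is why accuracy alone is a weak criterion here and the count of false positives is examined separately.</p>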
      <p>Among the methods based on decision trees, the J48 (C4.5) algorithm shows slightly worse results
(accuracy of 92.0%; 129 observations are classified as positive although there was no response, which
implies additional costs for the company). The decision tree with standard settings is the best option
for the algorithm (highest accuracy without overfitting). The decision tree is quite extensive, but it
provides a clear understanding of the factors that influence whether a customer will respond to an
advertising message. The tree translates into so-called "golden rules" for setting up mailings to
customers. However, the classification results indicate that it is advisable to solve the problem of
sample imbalance, since for most methods the results are close to random classification (response to
feedback is distributed almost 50%/50% between classes). At the moment, the share of customers
who respond to the E-mail is 14.5%, which creates a significant imbalance between those who will
respond and those who will not respond to the advertising message. We balance the sample with an
oversampling algorithm, that is, by increasing the share of a certain class: given the current share
of the response = 1 class, it is advisable to increase it by 5 times to obtain a balanced sample (46%
for class 1 and 54% for class 0). We then repeat the classification with the algorithms mentioned
above on the balanced sample and compare the results in Table 2.</p>
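      <p>The arithmetic behind the 46%/54% split can be checked with a short sketch. The 14.5% share and the factor of 5 come from the text; the function names, the 200-row toy sample and the use of simple random duplication are assumptions, since the exact oversampling filter is not specified here.</p>
      <preformat>
```python
import random


def oversample_shares(minority_share, factor):
    """Class shares after the minority class is made `factor` times larger."""
    minority = minority_share * factor
    majority = 1.0 - minority_share
    total = minority + majority
    return minority / total, majority / total


def random_oversample(rows, minority_label, factor, seed=0):
    """Add minority rows (sampled with replacement) until the class is `factor` times larger."""
    rng = random.Random(seed)
    minority = [row for row in rows if row[1] == minority_label]
    extra = rng.choices(minority, k=(factor - 1) * len(minority))
    return rows + extra


share_1, share_0 = oversample_shares(0.145, 5)
print(f"class 1: {share_1:.0%}, class 0: {share_0:.0%}")  # class 1: 46%, class 0: 54%

# Hypothetical sample of 200 customers (id, response), 29 of whom (14.5%) responded.
rows = [(i, 1 if i in range(29) else 0) for i in range(200)]
balanced = random_oversample(rows, minority_label=1, factor=5)
```
      </preformat>
      <p>Oversampling should be applied to the training data only; evaluating on duplicated rows would overstate accuracy.</p>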
      <p>According to the classification results in Table 2, IBk, J48 and Random Forest have the
best accuracy (92.9-97.2%). Among them, the Random Forest approach showed the highest results
both in terms of overall accuracy and in terms of minimizing the cost of inefficient mailings. The
Random Forest algorithm is recommended for use in marketing planning and marketing activities
to optimize advertising costs.</p>
      <p>The next step in improving the classification results is the application of the Cost-Sensitive
Classifier to all methods, taking into account the cost matrix presented in Table 1. The
classification results in Table 2 and Table 3 lead to the conclusion that IBk, J48 and Random Forest
maintain the best performance in terms of accuracy (83.2-95.8%) and potential revenue. Among them,
the Random Forest approach showed the highest results without the Cost-Sensitive classification
approach. The Random Forest algorithm in combination with J48 and IBk is
recommended for implementation in telecommunication companies in order to optimize costs for
advertising activity, which is one of the key areas of marketing activity. Thus, based on the applied
data classification methods, the company can tailor mailings to those customers who are more likely
to respond to the message. As a result, costs will be minimized, and revenues will increase.</p>
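      <p>The idea of cost-sensitive classification can be sketched as a minimum-expected-cost decision rule. The cost values below are hypothetical and do not reproduce Table 1; the function names are likewise illustrative. Given a model's estimated response probability for a customer, the rule picks the action (mail or do not mail) with the lower expected cost.</p>
      <preformat>
```python
def expected_costs(p_response, cost_matrix):
    """Expected cost of predicting class 0 and class 1.

    cost_matrix[actual][predicted], with 0 = no response, 1 = response.
    """
    probs = (1.0 - p_response, p_response)
    return [
        sum(probs[actual] * cost_matrix[actual][predicted] for actual in (0, 1))
        for predicted in (0, 1)
    ]


def decide(p_response, cost_matrix):
    """Return the predicted class with the lowest expected cost (ties go to class 0)."""
    costs = expected_costs(p_response, cost_matrix)
    return costs.index(min(costs))


# Hypothetical costs: a wasted mailing costs 1 unit, a missed responder costs 5 units.
costs = [
    [0, 1],  # actual 0: predicted 0 is free, predicted 1 wastes a mailing
    [5, 0],  # actual 1: predicted 0 loses a response, predicted 1 is free
]

low = decide(0.10, costs)   # expected costs 0.5 (skip) vs 0.9 (mail)
high = decide(0.30, costs)  # expected costs 1.5 (skip) vs 0.7 (mail)
```
      </preformat>
      <p>With these illustrative costs the rule mails a customer once the estimated response probability exceeds 1 / (1 + 5), which is about 0.17, rather than the default 0.5.</p>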
      <p>To maximize marketing efficiency, it is advised to integrate machine learning technologies into
the regular management of consumer behavior. An effective concept for such an implementation of
modeling is a cyclic process that comprises the following stages:</p>
      <list list-type="bullet">
        <list-item><p>Obtaining historical data on the influencing factors and the variables that describe them;</p></list-item>
        <list-item><p>Updating the models, evaluating the effectiveness of previous decisions and current results;</p></list-item>
        <list-item><p>Formation of recommendations for marketing activities and work with the consumer base;</p></list-item>
        <list-item><p>Implementation of recommended solutions.</p></list-item>
      </list>
      <p>When machine learning models are supported on a regular basis, the company's business tasks
can be defined for different time intervals (monthly and quarterly). The main tasks for weekly time
intervals are business monitoring, checking the efficiency of marketing solutions, and evaluating the
quality and accuracy of the constructed models. The main tasks on a quarterly basis are updating the
models, forming recommendations for future marketing decisions, and evaluating the effectiveness of
previous solutions.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusions</title>
      <p>Customer segmentation and marketing management are crucial tasks of the marketing system of a
telecommunications company that operates in an oversaturated market. Increasing the quality of
communication with clients becomes an area of potential optimization that will minimize unnecessary
costs and generate additional revenue growth. To ensure that companies function effectively on the
market, it is necessary to apply machine learning approaches, methods and models to customer data.</p>
      <p>The implementation of machine learning technologies provided a qualitative result of the research.
The results of constructing the SOM and applying k-means clustering for customer segmentation at one
of the leading Ukrainian telecommunication companies provide a basis for developing client profiles
from demographics and data on consumption of the company's services in terms of frequency, duration
of service use, and level of expenses. Such an approach helps to optimize marketing activities, in
particular by assessing and determining the most profitable customer segments and identifying the key
differences between target groups. These differences form a basis for future marketing activities
aimed at certain groups of customers (for example, personalized communications and promotional
activities), for the development of new tariff plans, for optimizing the costs of addressed SMS/E-mail
mailing, and for minimizing the outflow of customers to competitors.</p>
      <p>To choose the best classification model, the following methods were investigated: ZeroR, PART,
OneR, JRip, Decision Table, IBk, SMO, Naïve Bayes, J48 (C4.5), Random Forest, Logistic regression
and AdaBoostM1, with the additional use of the Cost-Sensitive option. As a result, the Random Forest
algorithm in combination with J48 and IBk showed the best performance and the best economic effect
and is recommended for implementation in telecommunication companies in order to optimize costs for
advertising activity, which is one of the key areas of marketing activity. Thanks to the applied
classification, the company can tailor mailings to those customers who have the highest probability
of responding positively to the advertising SMS/E-mail. These decisions will help to minimize
advertising costs and increase revenue by more than 137% compared with random mailing to the entire
consumer base. The estimated accuracy exceeded 80%, which indicates the possibility and feasibility of
using the models for further classification of customer responses to determine the most effective
consumer segments.</p>
      <p>To maximize marketing efficiency, it is advised to integrate machine learning technologies into
the regular management of consumer behavior and customer relationship management. An effective
concept for such an implementation of modeling is a cyclic process that keeps the constructed models
relevant to the current business environment and to consumer behavior and preferences.</p>
      <p>The results of the research, the constructed models and the proposed concept can be applied in
real business practice to optimize marketing activities for both Ukrainian and international
companies in the telecommunications market by making effective data-driven decisions, and to improve
the mathematical methodology of consumer segmentation and optimization of advertising mailings.
Marketing strategy optimization based on data-driven decisions and on finding hidden insights in data
has a significant influence on business efficiency due to the high quality and validity of the
decision-making process in very dynamic market conditions. As an area of future research, it is
relevant to focus on overcoming the limitations of the current research (in particular, by collecting
additional indicators of consumer characteristics and service preferences) and on the periodic support
of the constructed models in different market conditions, since consumer behavior may change. It is
necessary to identify new potential factors in a timely manner, which will improve marketing
decisions. Therefore, it is advisable to conduct such research on a regular basis so that it can be
effectively applied in future marketing activities.</p>
    </sec>
    <sec id="sec-7">
      <title>6. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>S.</given-names> <surname>Verma</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Sharma</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Deb</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Maitra</surname></string-name>,
          <article-title>Artificial intelligence in marketing: Systematic review and future research direction</article-title>,
          <source>International Journal of Information Management Data Insights</source>
          (<year>2021</year>), Vol. <volume>1</volume>, Issue <issue>1</issue>.
          DOI: https://doi.org/10.1016/j.jjimei.2020.100002.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>J.</given-names> <surname>Saidali</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Rahich</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Tabaa</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Medouri</surname></string-name>,
          <article-title>The combination between Big Data and Marketing Strategies to gain valuable Business Insights for better Production Success</article-title>,
          <source>Procedia Manufacturing</source>
          (<year>2019</year>), Vol. <volume>32</volume>, pp. <fpage>1017</fpage>-<lpage>1023</lpage>.
          DOI: https://doi.org/10.1016/j.promfg.2019.02.316.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>W. H.</given-names> <surname>Susilo</surname></string-name>,
          <article-title>An Impact of Behavioral Segmentation to Increase Consumer Loyalty: Empirical Study in Higher Education of Postgraduate Institutions at Jakarta</article-title>,
          <source>Procedia - Social and Behavioral Sciences</source>
          (<year>2016</year>), Vol. <volume>229</volume>, pp. <fpage>183</fpage>-<lpage>195</lpage>.
          DOI: https://doi.org/10.1016/j.sbspro.2016.07.128.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>R.</given-names> <surname>Amir</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Machowska</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Troege</surname></string-name>,
          <article-title>Advertising patterns in a dynamic oligopolistic growing market with decay</article-title>,
          <source>Journal of Economic Dynamics and Control</source>
          (<year>2021</year>), Vol. <volume>131</volume>.
          DOI: https://doi.org/10.1016/j.jedc.2021.104229.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>A. S.</given-names> <surname>Abrahams</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Coupey</surname></string-name>,
          <string-name><given-names>E. X.</given-names> <surname>Zhong</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Barkhi</surname></string-name>,
          <string-name><given-names>P. S.</given-names> <surname>Manasantivongs</surname></string-name>,
          <article-title>Audience targeting by B-to-B advertisement classification: A neural network approach</article-title>,
          <source>Expert Systems with Applications</source>
          (<year>2013</year>), Vol. <volume>40</volume>, Issue <issue>8</issue>, pp. <fpage>2777</fpage>-<lpage>2791</lpage>.
          DOI: https://doi.org/10.1016/j.eswa.2012.10.068.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>P.-T.</given-names> <surname>Chen</surname></string-name>,
          <string-name><given-names>H.-P.</given-names> <surname>Hsieh</surname></string-name>,
          <article-title>Personalized mobile advertising: Its key attributes, trends, and social impact</article-title>,
          <source>Technological Forecasting and Social Change</source>
          (<year>2012</year>), Vol. <volume>79</volume>, Issue <issue>3</issue>, pp. <fpage>543</fpage>-<lpage>557</lpage>.
          DOI: https://doi.org/10.1016/j.techfore.2011.08.011.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>K. Y.</given-names> <surname>Kim</surname></string-name>,
          <string-name><given-names>B. G.</given-names> <surname>Lee</surname></string-name>,
          <article-title>Marketing insights for mobile advertising and consumer segmentation in the cloud era: A Q-R hybrid methodology and practices</article-title>,
          <source>Technological Forecasting and Social Change</source>
          (<year>2015</year>), Vol. <volume>91</volume>, pp. <fpage>78</fpage>-<lpage>92</lpage>.
          DOI: https://doi.org/10.1016/j.techfore.2014.01.011.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>J.</given-names> <surname>Zhou</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Zhai</surname></string-name>,
          <string-name><given-names>A. A.</given-names> <surname>Pantelous</surname></string-name>,
          <article-title>Market segmentation using high-dimensional sparse consumers data</article-title>,
          <source>Expert Systems with Applications</source>
          (<year>2020</year>), Vol. <volume>145</volume>.
          DOI: https://doi.org/10.1016/j.eswa.2019.113136.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><given-names>D.</given-names> <surname>Arunachalam</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Kumar</surname></string-name>,
          <article-title>Benefit-based consumer segmentation and performance evaluation of clustering approaches: An evidence of data-driven decision-making</article-title>,
          <source>Expert Systems with Applications</source>
          (<year>2018</year>), Vol. <volume>111</volume>, pp. <fpage>11</fpage>-<lpage>34</lpage>.
          DOI: https://doi.org/10.1016/j.eswa.2018.03.007.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><given-names>R.</given-names> <surname>Pukała</surname></string-name>,
          <article-title>Use of Neural Networks in Risk Assessment and Optimization of Insurance Cover in Innovative Enterprises</article-title>,
          <source>Engineering Management in Production and Services</source>
          (<year>2016</year>), Vol. <volume>8</volume>, No. <issue>3</issue>, pp. <fpage>43</fpage>-<lpage>56</lpage>.
          DOI: https://doi.org/10.1515/emj-2016-0023.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><given-names>E.</given-names> <surname>Pesikov</surname></string-name>,
          <string-name><given-names>O.</given-names> <surname>Zaikin</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Kozlova</surname></string-name>,
          <article-title>Conducting market segmentation and diagnostics of the consumer printed products by using methods of multivariate statistical analysis and artificial intelligence</article-title>,
          <source>IFAC Proceedings Volumes</source>
          (<year>2013</year>), Vol. <volume>46</volume>, Issue <issue>9</issue>, pp. <fpage>2116</fpage>-<lpage>2121</lpage>.
          DOI: https://doi.org/10.3182/20130619-3-RU-3018.00642.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><given-names>J.</given-names> <surname>Zethmayr</surname></string-name>,
          <string-name><given-names>R. S.</given-names> <surname>Makhija</surname></string-name>,
          <article-title>Six unique load shapes: A segmentation analysis of Illinois residential electricity consumers</article-title>,
          <source>The Electricity Journal</source>
          (<year>2019</year>), Vol. <volume>32</volume>, Issue <issue>9</issue>.
          DOI: https://doi.org/10.1016/j.tej.2019.106643.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name><given-names>G.</given-names> <surname>Di Vita</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Zanchini</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Falcone</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>D'Amico</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Brun</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Gulisano</surname></string-name>,
          <article-title>Local, organic or protected? Detecting the role of different quality signals among Italian olive oil consumers through a hierarchical cluster analysis</article-title>,
          <source>Journal of Cleaner Production</source>
          (<year>2021</year>), Vol. <volume>290</volume>.
          DOI: https://doi.org/10.1016/j.jclepro.2021.125795.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name><given-names>A.</given-names> <surname>Ortiz</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Díaz-Caro</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Tejerina</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Escribano</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Crespo</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Gaspar</surname></string-name>,
          <article-title>Consumption of fresh Iberian pork: Two-stage cluster for the identification of segments of consumers according to their habits and lifestyles</article-title>,
          <source>Meat Science</source>
          (<year>2021</year>), Vol. <volume>173</volume>.
          DOI: https://doi.org/10.1016/j.meatsci.2020.108373.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name><given-names>A.</given-names> <surname>Higuchi</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Maehara</surname></string-name>,
          <article-title>A factor-cluster analysis profile of consumers</article-title>,
          <source>Journal of Business Research</source>
          (<year>2021</year>), Vol. <volume>123</volume>, pp. <fpage>70</fpage>-<lpage>78</lpage>.
          DOI: https://doi.org/10.1016/j.jbusres.2020.09.030.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Geldiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nenkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Petrova</surname>
          </string-name>
          ,
          <article-title>Exercise of Machine Learning Using Some Python Tools and Techniques</article-title>
          .
          <source>CBU International conference proceedings 2018: Innovations in Science and Education</source>
          , 21.-23.03.2018
          (
          <year>2018</year>
          ), pp.
          <fpage>1062</fpage>
          -
          <lpage>1070</lpage>
          . DOI: https://doi.org/10.12955/cbup.v6.1295.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ramazanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Babenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Honcharenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Moisieieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dykan</surname>
          </string-name>
          ,
          <article-title>Integrated Intelligent Information and Analytical System of Management of a Life Cycle of Products of Transport Companies</article-title>
          .
          <source>Journal of Information Technology Management</source>
          (
          <year>2020</year>
          ),
          <volume>12</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>26</fpage>
          -
          <lpage>33</lpage>
          . DOI: https://doi.org/10.22059/JITM.2020.76291.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Onwezen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Reinders</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. A.</given-names>
            <surname>van der Lans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Sijtsema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jasiulewicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Guardia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Guerrero</surname>
          </string-name>
          ,
          <article-title>A cross-national consumer segmentation based on food benefits: The link with consumption situations and food perceptions</article-title>
          ,
          <source>Food Quality and Preference</source>
          (
          <year>2012</year>
          ), Vol.
          <volume>24</volume>
          , Issue 2, pp.
          <fpage>276</fpage>
          -
          <lpage>286</lpage>
          . DOI: https://doi.org/10.1016/j.foodqual.2011.11.002.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
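          <string-name>
            <given-names>M. C. D.</given-names>
            <surname>Verain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Sijtsema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Antonides</surname>
          </string-name>
          ,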
          <article-title>Consumer segmentation based on food-category attribute importance: The relation with healthiness and sustainability perceptions</article-title>
          ,
          <source>Food Quality and Preference</source>
          (
          <year>2016</year>
          ), Vol.
          <volume>48</volume>
          , Part A, pp.
          <fpage>99</fpage>
          -
          <lpage>106</lpage>
          . DOI: 10.1016/j.foodqual.2015.08.012.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>V.</given-names>
            <surname>Cariou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. F.</given-names>
            <surname>Wilderjans</surname>
          </string-name>
          ,
          <article-title>Consumer segmentation in multi-attribute product evaluation by means of non-negatively constrained CLV3W</article-title>
          ,
          <source>Food Quality and Preference</source>
          (
          <year>2018</year>
          ), Vol.
          <volume>67</volume>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>26</lpage>
          . DOI: https://doi.org/10.1016/j.foodqual.2017.01.006.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>E.</given-names>
            <surname>Vigneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Charles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>External preference segmentation with additional information on consumers: A case study on apples</article-title>
          ,
          <source>Food Quality and Preference</source>
          (
          <year>2014</year>
          ), Vol.
          <volume>32</volume>
          , Part A, pp.
          <fpage>83</fpage>
          -
          <lpage>92</lpage>
          . DOI: https://doi.org/10.1016/j.foodqual.2013.05.007.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Delley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Brunner</surname>
          </string-name>
          ,
          <article-title>A segmentation of Swiss fluid milk consumers and suggestions for target product concepts</article-title>
          ,
          <source>Journal of Dairy Science</source>
          (
          <year>2020</year>
          ), Vol.
          <volume>103</volume>
          , Issue 4, pp.
          <fpage>3095</fpage>
          -
          <lpage>3106</lpage>
          . DOI: https://doi.org/10.3168/jds.2019-17325.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A.</given-names>
            <surname>Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gómez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Molina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <article-title>A segmentation study of cinema consumers based on values and lifestyle</article-title>
          ,
          <source>Journal of Retailing and Consumer Services</source>
          (
          <year>2018</year>
          ), Vol.
          <volume>41</volume>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>89</lpage>
          . DOI: https://doi.org/10.1016/j.jretconser.2017.12.001.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Y. U.</given-names>
            <surname>Cha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <article-title>Consumer preference and market segmentation strategy in the fast moving consumer goods industry: The case of women's disposable sanitary pads</article-title>
          ,
          <source>Sustainable Production and Consumption</source>
          (
          <year>2019</year>
          ), Vol.
          <volume>19</volume>
          , pp.
          <fpage>130</fpage>
          -
          <lpage>140</lpage>
          . DOI: https://doi.org/10.1016/j.spc.2019.04.002.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>A.</given-names>
            <surname>Albert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maasoumy</surname>
          </string-name>
          ,
          <article-title>Predictive segmentation of energy consumers</article-title>
          ,
          <source>Applied Energy</source>
          (
          <year>2016</year>
          ), Vol.
          <volume>177</volume>
          , pp.
          <fpage>435</fpage>
          -
          <lpage>448</lpage>
          . DOI: https://doi.org/10.1016/j.apenergy.2016.05.128.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Newlands</surname>
          </string-name>
          ,
          <article-title>Consumer segmentation within the sharing economy: The case of Airbnb</article-title>
          ,
          <source>Journal of Business Research</source>
          (
          <year>2018</year>
          ), Vol.
          <volume>88</volume>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>196</lpage>
          . DOI: 10.1016/j.jbusres.2018.03.019.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <article-title>Click-through rate prediction in online advertising: A literature review</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          (
          <year>2022</year>
          ), Vol.
          <volume>59</volume>
          , Issue 2. DOI: https://doi.org/10.1016/j.ipm.2021.102853.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>N.</given-names>
            <surname>Deshpande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khode</surname>
          </string-name>
          ,
          <article-title>Web based Targeted Advertising: A Study based on Patent Information</article-title>
          ,
          <source>Procedia Economics and Finance</source>
          (
          <year>2014</year>
          ), Vol.
          <volume>11</volume>
          , pp.
          <fpage>522</fpage>
          -
          <lpage>535</lpage>
          . DOI: https://doi.org/10.1016/S2212-5671(14)00218-4.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.-A.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <article-title>Identifying machine learning techniques for classification of target advertising</article-title>
          ,
          <source>ICT Express</source>
          (
          <year>2020</year>
          ), Vol.
          <volume>6</volume>
          , Issue 3, pp.
          <fpage>175</fpage>
          -
          <lpage>180</lpage>
          . DOI: https://doi.org/10.1016/j.icte.2020.04.012.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>K.</given-names>
            <surname>Coussement</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Harrigan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Benoit</surname>
          </string-name>
          ,
          <article-title>Improving direct mail targeting through customer response modeling</article-title>
          ,
          <source>Expert Systems with Applications</source>
          (
          <year>2015</year>
          ), Vol.
          <volume>42</volume>
          , Issue 22, pp.
          <fpage>8403</fpage>
          -
          <lpage>8412</lpage>
          . DOI: https://doi.org/10.1016/j.eswa.2015.06.054.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reynaldo</surname>
          </string-name>
          ,
          <string-name>
            <surname>Goenawan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chanrico</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Suhartono</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Purnomo</surname>
          </string-name>
          ,
          <article-title>Gender Demography Classification on Instagram based on User's Comments Section</article-title>
          ,
          <source>Procedia Computer Science</source>
          (
          <year>2019</year>
          ), Vol.
          <volume>157</volume>
          , pp.
          <fpage>64</fpage>
          -
          <lpage>71</lpage>
          . DOI: https://doi.org/10.1016/j.procs.2019.08.142.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>F.</given-names>
            <surname>Kaefer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Heilman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Ramenofsky</surname>
          </string-name>
          ,
          <article-title>A neural network application to consumer classification to improve the timing of direct marketing activities</article-title>
          ,
          <source>Computers &amp; Operations Research</source>
          (
          <year>2005</year>
          ), Vol.
          <volume>32</volume>
          , Issue 10, pp.
          <fpage>2595</fpage>
          -
          <lpage>2615</lpage>
          . DOI: 10.1016/j.cor.2004.06.021.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. K.</given-names>
            <surname>Dwivedi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Arya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Q.</given-names>
            <surname>Siddiqui</surname>
          </string-name>
          ,
          <article-title>Does SMS advertising still have relevance to increase consumer purchase intention? A hybrid PLS-SEM-neural network modelling approach</article-title>
          ,
          <source>Computers in Human Behavior</source>
          (
          <year>2021</year>
          ), Vol.
          <volume>124</volume>
          . DOI: 10.1016/j.chb.2021.106919
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>A.</given-names>
            <surname>Muk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <article-title>Applying the technology acceptance model in a two-country study of SMS advertising</article-title>
          ,
          <source>Journal of Business Research</source>
          (
          <year>2015</year>
          ), Vol.
          <volume>68</volume>
          , Issue 1, pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . DOI: https://doi.org/10.1016/j.jbusres.2014.06.001.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Á. J.</given-names>
            <surname>Lorente-Páramo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaparro-Peláez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Á.</given-names>
            <surname>Hernández-García</surname>
          </string-name>
          ,
          <article-title>How to improve e-mail clickthrough rates - A national culture approach</article-title>
          ,
          <source>Technological Forecasting and Social Change</source>
          (
          <year>2020</year>
          ), Vol.
          <volume>161</volume>
          . DOI: https://doi.org/10.1016/j.techfore.2020.120283.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A nonhomogeneous hidden Markov model of response dynamics and mailing optimization in direct marketing</article-title>
          ,
          <source>European Journal of Operational Research</source>
          (
          <year>2016</year>
          ), Vol.
          <volume>253</volume>
          , Issue 2, pp.
          <fpage>514</fpage>
          -
          <lpage>523</lpage>
          . DOI: https://doi.org/10.1016/j.ejor.2016.02.055.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>M.</given-names>
            <surname>Paulo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. L.</given-names>
            <surname>Miguéis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Pereira</surname>
          </string-name>
          ,
          <article-title>Leveraging email marketing: Using the subject line to anticipate the open rate</article-title>
          ,
          <source>Expert Systems with Applications</source>
          (
          <year>2022</year>
          ), Vol.
          <volume>207</volume>
          . DOI: https://doi.org/10.1016/j.eswa.2022.117974.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>