Optimization of Marketing Decisions Based on Machine
Learning: Case for Telecommunications
Galyna Chornous and Yana Fareniuk

Taras Shevchenko National University of Kyiv, 90-A, Vasylkivska st., Kyiv, 03022, Ukraine

                 Abstract
                 Retaining clients and increasing their activity are among the main marketing tasks.
                 Two supporting tasks follow from this: consumer segmentation and improving communication
                 with consumers. Machine learning helps to analyze information about subscribers and their use
                 of the company's services and to find hidden insights for optimizing marketing activities. The
                 purposes of the research are to propose appropriate methods for solving the problem of
                 clustering for customer segmentation and for classifying clients who will respond positively
                 to E-mail, in order to optimize advertising mailings. The modeling was implemented based on data of
                 the Ukrainian telecommunication company, and the article presents the results of
                 constructing Self Organizing Map (SOM) with the g-means algorithm and k-means clustering
                 to develop subscriber profiles by identifying their similar behavior in terms of frequency,
                 duration of consumption, as well as expenses; determination of the most profitable customer
                 segments. Such information will create a basis for the development of marketing activities
                 aimed at certain groups of customers (personalized communications) and optimization of
                 costs for targeted SMS/E-mail mailing. In order to minimize costs for clients who will not
                 respond to advertising, such classification methods as JRip, DecisionTable, IBk, SMO,
                 NaiveBayes, J48 (C4.5), RandomForest, Logistic regression and others were implemented for
                 the Response to Mailing variable, considering the sampling imbalance, which was solved by
                 an oversampling algorithm. A Cost-Sensitive Classifier has been demonstrated. The
                 RandomForest, J48 and IBk models have the highest quality and are recommended for
                 implementation in order to optimize advertising costs. Thus, based on the applied methods,
                 the company can tailor the mailing to those customers who are more likely to respond. So,
                 the research confirms the feasibility of using models in the clustering and classification of
                 consumers to optimize marketing activities.

                 Keywords
                 Advertising, machine learning, consumer segmentation, optimizing mailings, E-mail,
                 telecommunication.

1. Introduction
   Marketing specialists around the world are trying to find the best solution for their marketing
activities. Disruptive technologies such as data analytics and machine learning have changed the ways
businesses operate. Of all the revolutionary technologies, artificial intelligence is the technological
disruptor and has enormous potential for marketing transformation [1]. Data analytics provides more
valuable insights to strengthen business success and make real-time business decisions by scrutinizing
and deeply analyzing these data to choose a customized decision with a high level of sophistication
[2]. Effective marketing can be built on the basis of high-quality and comprehensive information
about the market, competition and consumers. Marketing research is necessary to explain
the behavior of the company's customers, identify development prospects and
increase customer satisfaction, which has a positive influence on business results.
Advanced analysis, mathematical tools and machine learning algorithms allow companies to build

Information Technology and Implementation (IT&I-2022), November 30 – December 02, 2022, Kyiv, Ukraine
EMAIL: Galyna.Chornous@knu.ua (G.Chornous); yfareniuk@gmail.com (Y. Fareniuk)
ORCID: 0000-0003-4889-1247 (G. Chornous); 0000-0001-6837-5042 (Y. Fareniuk)
            ©️ 2022 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)


intelligent models and Decision Support Systems capable of learning from this data. Intelligent data
analysis makes it possible to solve such tasks as customer segmentation; management of customer outflow and
determination of the best way to retain customers; forecasting the response to an offer; and effective attraction
of new customers. For businesses, the consumer is the subject of increased attention, because consumer
behavior significantly affects the effectiveness of their marketing activities. Behavioral segmentation
of consumers helps to increase sales, which affects consumer loyalty [3]. As a result, the marketing
tasks of customer segmentation and the development of an effective customer relationship strategy are
very important for all companies on the market. Segmentation of the client base is a basis for
forming effective communication with different target groups through the development of personalized
advertising propositions.
    Advertising is an important competitive tool. Media activity attracts consumers, increases brand
awareness, creates loyalty to the company, distinguishes it from competitors or changes the taste of
consumers. The company’s share of the market drops in the case of the absence of advertising [4].
Media is a means of communication that generates various marketing results among consumers. With
the emergence of mobile communication and smartphones, consumer preferences can be predetermined,
and therefore advertising can be delivered to consumers in a multimedia format at the
right time and place with the right message. As marketing communications spread, the capability to
target the right audience becomes increasingly important. Audience targeting practices in media tend
to highlight the demographics, behavior and other consumer characteristics as a basis for selecting the
right messages for each audience [5]. In the case of this new advertising opportunity, the development
of personalized mobile advertising to meet consumer needs is becoming an important challenge [6].
    The mobile advertising paradigm is shifting to personalized advertising services for each consumer
in this era of data. In the telecommunications market, the growing demand for smart devices and the
emergence of 4G mobile networks have increased the use of mobile services with increased
competition among category players. Lately, as the mobile ecosystem has become more complex,
marketing specialists are focusing on targeted marketing to clients to maximize ad impact [7] and
increase revenue.
    The purposes of the research are to propose appropriate machine learning methods for solving the
problem of clustering for the segmentation of customers and classification of clients who will respond
positively to E-mail for the optimization of advertising mailings. The results of this research can be
used as a basis for customer relationship management.

2. Literature review
   Well-executed segmentation leads to a better understanding of the market and customer needs. The
research [8] develops a methodological approach that combines Recency, Frequency and
Monetary (RFM) analysis with K-means clustering and provides a useful tool and valid methodology for
marketing specialists and decision-makers to accurately identify the most profitable consumer
segments.
   Arunachalam and Kumar [9] evaluate the effectiveness of different clustering approaches for
finding profitable consumer segments. The data are analyzed by hierarchical clustering, K-Medoids,
fuzzy clustering and Self Organizing Maps (SOMs). The effectiveness of different clustering methods
differs considerably in practice. The obtained results indicate that fuzzy clustering and SOM
are comparatively more effective than traditional techniques at detecting hidden structures in datasets.
Segments derived from SOM have more potential to provide interesting and useful insights for data-based
decision-making in business practice. Pukala R. [10] implements artificial neural networks
(ANNs), in the form of a Kohonen network, for quantifying the risks of an innovative company. Approaches to
market segmentation and consumer diagnostics based on multivariate statistical analysis and ANN are
considered in [11] (market segmentation using the SOM and further refinement of the results by
the k-means algorithm, and consumer diagnosis by discriminant analysis and multilayer perceptron).
   Zethmayr and Makhija [12] apply k-means clustering to customers according to demographics.
Data analysis in the paper [13] was carried out by clustering according to Ward's methodological
approach, which identified groups with different socio-demographic characteristics and hierarchical
preferences. Ortiz et al. [14] explore two-stage clustering to gain a deeper understanding
of consumers’ consumption behavior by profiling and identifying consumer segments
considering their habits and lifestyles. The purpose of the paper [15] is to determine the
motivational profile of consumers with a factor-cluster analysis using exploratory factor analysis with
an unweighted least squares method. The discriminant analysis established factor importance and
demographics.
   One of the goals of training in predictive analytics is to create a model from a set of data. The goal
of various strategies and experiments is to create a more accurate forecasting model. The aims of [16]
are to take sequential steps to find an accurate model for data and save it for future implementation
with Python. An integrated intelligent system for monitoring, modeling and managing the life cycle of
the company's products has been developed in [17]. This system is presented in the form of a three
equations structure, which functions in conditions of instability. The system is based on the concepts,
principles, a set of nonlinear models and methods of decision-making and management.
   One of the biggest challenges to creating more successful marketing strategies in the telecom
market is understanding the diversity of consumer needs and the identification of consumer segments
[18]. In consumer research, segmentation has been widely used to identify subsets of consumers based
on their preferences. Over the last decade, a more comprehensive assessment of product performance
has led to the consideration of a variety of information. Verain et al. [19] determine consumer
segments and explore differences between them in terms of consumers’ perceptions. To determine consumer
segments, a three-way cluster analysis around latent variables is examined in [20].
This method groups consumers into clusters and evaluates for each cluster an associated product latent
variable, attribute weights and a set of consumer indicators that can help to determine product
characteristics for the cluster. In studies where, in addition to preference indicators, external
information about products as well as about consumers is available, the clustering by latent variables
(CLV) methodology can be used for customer segmentation. A direct approach, L-CLV, has
demonstrated its ability to detect consumer segmentation related to a large number of
sociological and behavioral parameters [21].
   Delley and Brunner [22] investigate consumer segmentation using hierarchical cluster analysis.
Consumer descriptions varied greatly between groups, indicating heterogeneity. These results indicate
the need to study segmentation during data analysis. Techniques of hierarchical segmentation were
used in [23] to determine different groups of consumers based on their values and lifestyles. The
results contribute to the theoretical and practical aspects of customer segmentation. The
recommendations and findings emphasize the importance of implementing different strategies for
each segment. Cha and Park [24] show that the results of clustering can be used to find the
appropriate strategy for each cluster.
   The research by [25] presents an optimal predictive segmentation algorithm to identify subgroups
that are homogeneous with regard to certain patterns in customer attributes and predictive to the
desired result. The authors create an intuitive segmentation with high interpretability and an optimal
targeting process for the company’s clients. In this setting, the business develops a small number of
messages that will be sent to appropriately selected customers who are most likely to respond to
different message types. The proposed method uses consumption, demographics, and participation
data to extract underlying predictive rules from the dataset using machine learning algorithms.
   Marketers use marketing logic to target ads to specific consumer segments. However, there is not
always a clear alignment between consumer segmentation and targeting, which can lead to a potential
reduction in effectiveness [26]. Predicting the probability that a user will respond to a particular ad
has been a common problem in advertising that has attracted much research attention. In recent years,
a growing number of new learning models have emerged to improve ad CTR prediction [27].
   Over the past decades, the rapid development of information and communication
technologies has led to the adoption of the Internet by broad segments of the population. Thanks to
various technological advances, the Internet has made it possible for advertisers to reach their target
audience [28]. Artificial intelligence technologies have numerous applications for online
advertising, particularly to optimize the coverage of target audiences. Choi and Lim [29] investigate
and categorize different techniques of machine learning used to improve targeted online advertising.
A neural network classifier is proposed by Abrahams et al. [5] to assign ads to groups that represent
different media channels. When classifying previously unseen ads, the model outperforms
a random model by 100-300%. The authors
suggest using ANN for automated media planning and advertising targeting. Mobile advertising has
evolved into a technology that allows an advertiser to effectively and efficiently promote products or
services to target consumers.
    Direct marketing is a crucial instrument for a company’s promotion, and direct mailing
is one of its important forms. One approach to improving direct mailing targeting is response modeling,
a form of predictive modeling that assigns the probability of future responses to customers based on their
history with a company. Coussement et al. [30] present well-known statistical methods for data
classification and analysis (logistic regression, linear and quadratic discriminant analysis, naive
Bayes, neural networks, decision trees such as CHAID, CART and C4.5, and the kNN algorithm).
The results show that data mining algorithms (CHAID, CART and neural networks) perform
well, followed by simpler statistical classifiers such as logistic regression and linear
discriminant analysis. The research by Reynaldo [31] explores a social network user gender prediction
model using AdaBoost, XGBoost, Support Vector Machine and Naive Bayes Classifier combined
with grid search and K-Fold validation. Kaefer et al. [32] develop an alternative scoring approach for
classifying new clients as "good" or "bad" prospects for direct marketing. The research proves that the
approach of using only demographics to profile consumers can be enhanced by observing their
purchases. The authors establish multinomial logit and neural network models, which can help to
classify and target potential consumers.
    Perception of SMS advertising has a significant direct or indirect impact on consumer purchase
intention. However, there is a dearth of comprehensive research that suggests the predictors of SMS
advertising perception and the process by which it impacts purchase intention. The research [33]
focuses on developing a model based on the stimulus-organism-response framework with a two-stage
hybrid model using PLS-SEM and ANN. The research benefits marketing specialists by facilitating better
decision-making for developing effective advertising campaigns using mobile SMS advertising.
    SMS helps companies to interact directly with their target consumers at any time and
location using their mobile phones. Using a modified technology acceptance model, the paper
[34] explores the factors influencing consumers' acceptance of SMS advertising.
Usefulness is important in establishing favorable consumer attitudes toward SMS advertising. The authors
show that consumers perceive SMS advertising differently. Email should be direct and personalized.
    Measuring the effectiveness of email marketing is difficult. To maintain competitiveness,
managers must maximize profits from mailings by deciding who should receive them. Before
achieving the main purpose of converting sales, the intermediate goal of email campaigns is to capture
interest and drive traffic to the website. The paper [35] examines the relevance of variables that
impact recipient interest in promotional emails and provides companies with actionable and useful
insights on how to plan and deploy email marketing strategies with higher efficiency. Paper [36]
presents a two-step approach that allows companies to consider the dynamic consequences of mailing
and make effective mailing decisions by maximizing the customer’s long-term value. The authors
suggest a heterogeneous hidden Markov model to capture the interactive dynamics between customers
and mailings and use the resulting parameters to develop optimal mailing decisions using a Partially
Observable Markov Decision Process. Both immediate and remote consequences of mailings are
taken into account. Although email marketing is one of the most cost-effective tools, it remains
problematic due to low email open rates and high unsubscribe rates. The
structure and content of the email subject line are investigated in [37] together with various machine learning
techniques (Random Forest, Decision Trees, ANN, Naive Bayes, Support Vector Machines and
Gradient Boosting). The results show that combining the data leads to more accurate classification.
    Nowadays, mobile advertising focuses on powerful algorithms for personalized recommendations.
Chen and Hsieh [6] propose the fuzzy Delphi method to determine the main personalized attributes in
a personalized mobile advertising message for various products.
    However, many questions about optimizing marketing decisions in
customer relationship management remain insufficiently studied, in particular customer base
segmentation in the telecommunication market using clustering methods, as well as optimization of E-mail
marketing via classification of clients who will respond positively or negatively to mailings.
Solving such marketing tasks forms an information-analytical basis for marketing activity
optimization and making effective decisions for further business development and marketing strategy.


3. Methodology and dataset description
    In such a high-tech field as telecommunications, as well as in the field of marketing in general,
machine learning methods and approaches have been widely used. Among the main problems that
need to be solved are, first of all, those related to loyalty programs and maintenance of the existing
client base, as well as the attraction of new consumers of services.
    Internal systems of telecommunications companies accumulate large volumes of data every day.
First of all, this is information about subscribers and statistics on their use of the company's services.
The analysis of such information without using the capabilities of information technologies is
ineffective, which creates significant opportunities for the use of approaches and methods of machine
learning to optimize marketing activities and increase their effectiveness.
    The research goals are to propose the relevant methods to solve the task of clustering (client
segmentation) and classification for optimization of advertising mailing for the consumer base. The
modeling was implemented based on a database of one of the Ukrainian telecommunications
companies, which provides mobile communication and the Internet. The management decided to
segment the subscriber base in order to optimize marketing activities, in
particular to build profiles of subscribers by identifying their similar behavior in terms of
frequency, duration of service use, as well as the level of expenses, and to assess and determine
the most profitable customer segments. This forms a scientific hypothesis about factors determining
consumer segments. In the future, such information will create a basis for the development of
marketing activities aimed at certain groups of customers (personalized promotional
communications); development of new tariff plans; optimization of costs for targeted SMS/Viber/E-mail
distribution in relation to new services and tariffs; predicting and avoiding the outflow of
customers to competing companies.
    The data, downloaded from the internal system, form a table with the following fields: age of the
client, average monthly expenses (average amount of expenses per subscriber for mobile
communication and mobile Internet), the average duration of calls (average number of minutes for
outgoing calls by a subscriber per month), daytime/evening/night activity per month (number of
activities (calls, messages, Internet connections) per month in the morning and daytime/evening/night
time, respectively), activity with other cities/countries per month, the share of calls to landline phones
(city numbers), the volume of Internet per month (number of Mb of Internet consumed). Only active
subscribers of the company who have regularly used mobile communication and/or mobile Internet services
over the past few months were selected. The dataset for the experiment contains information about
4591 clients of this company and will be used to perform customer segmentation via Kohonen SOM
with a g-means algorithm in Deductor Studio and k-means clustering in Weka. These methods were
selected for the clustering task considering the specificity of the dataset, which contains
client-related data about consumption of the company’s services, and the evidence that they are
useful techniques for segmentation, as mentioned in the literature review. In addition,
the task of optimizing mailings to customers with the goal of minimizing costs for those who do not
respond to advertising activity often arises in marketing. Using the example of a database that
contains information about customers and their activity, the application of classification methods for
advertising response will be demonstrated to increase the effectiveness of advertising activity.
    The dataset used for this experiment contains information on 13,500 customers,
including known responses to the advertising E-mail and information such as gender, age, number of
years the customer has been the company’s client, the total value and total number of all purchases, the
facts of calls to the support service, etc. In total, 9 independent variables and 1 dependent variable are available for
analysis. The task is reduced to binary classification, where the variable "Response" was chosen as the
class indicator: "1" means feedback (a positive response), and "0" means no feedback.
    The scientific hypothesis is that the presented factors describe the probability of a positive or negative
response to an ad message and that machine learning methods can effectively predict this response. The task
is to classify consumers as clearly as possible according to the probability of responding to an
advertising message. The results were analyzed in order to formulate recommendations for
minimizing costs for new mailings to customers. In addition, the following information is also known
for this company: costs for one mailing CM = UAH 1, costs for retaining 1 client CR = UAH 9,
expected revenue from 1 client R = UAH 20, so the objective is assumed to be the maximization of income from
communication with the consumer through the mailing. Consider the possible classification results
(Table 1).
   The total revenue will be TP*(R-CM-CR)-FP*CM. To evaluate the predictive power of the
classification model, it is necessary to compare the expected revenue with that which can be obtained
under the condition of mass mailing to all participants. In order to choose the best classifier, the
ZeroR, PART, OneR, JRip, Decision Table, IBk, SMO, Naive Bayes, J48 (C4.5), Random Forest,
Logistic regression, and AdaBoostM1 methods will be used. They were selected for solving the
classification task considering the scientific achievements of researchers, which prove the high
performance of the mentioned approaches, and the specific task of cost optimization with the relevant
dataset. The Cost Sensitive option will also be applied to the mentioned methods. All of them
will be compared, and the methods with the best performance will be selected for implementation
according to the goal of minimizing costs for ineffective mailings and improving revenue.
Table 1
The possible results of consumer classification according to the variable “Response”
Forecast  Fact  Result  Economic essence                                                      Income
No        No    TN      No Contact, No Cost                                                   0
Yes       Yes   TP      Revenue excluding mailing costs                                       (R-CM-CR)=10
No        Yes   FN      Missing contact and expenses, but unearned revenues                   0
Yes       No    FP      Costs for a mailing that does not bring results, but a person can     CM*0.9=1*0.9=0.9
                        react later as the advertising has a delayed effect
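
    As an illustration of the comparison described above, the following minimal Python sketch evaluates the revenue formula TP*(R-CM-CR)-FP*CM for a classifier and for mass mailing to all customers. The values of CM, CR and R are taken from the text; the function names and the confusion-matrix counts in the usage note are hypothetical and serve only as an example. The sketch follows the formula in the text, so each false positive is charged the full mailing cost CM rather than the CM*0.9 value listed in Table 1.

```python
from dataclasses import dataclass

# Cost parameters stated in the text (UAH).
MAILING_COST = 1         # CM: cost of one mailing
RETENTION_COST = 9       # CR: cost of retaining one client
REVENUE_PER_CLIENT = 20  # R: expected revenue from one client

# Net income from one correctly targeted responder: R - CM - CR = 10.
NET_PER_RESPONDER = REVENUE_PER_CLIENT - MAILING_COST - RETENTION_COST


@dataclass
class ConfusionCounts:
    """Counts from a binary confusion matrix for the Response variable."""
    tp: int  # mailed, responded
    fp: int  # mailed, did not respond
    tn: int  # not mailed, would not have responded
    fn: int  # not mailed, would have responded


def campaign_revenue(c: ConfusionCounts) -> int:
    """Revenue when only predicted responders are mailed: TP*(R-CM-CR) - FP*CM."""
    return c.tp * NET_PER_RESPONDER - c.fp * MAILING_COST


def mass_mailing_revenue(c: ConfusionCounts) -> int:
    """Revenue when every customer is mailed, using the same formula."""
    responders = c.tp + c.fn
    non_responders = c.fp + c.tn
    return responders * NET_PER_RESPONDER - non_responders * MAILING_COST


def gain_over_mass_mailing(c: ConfusionCounts) -> int:
    """Positive values mean the classifier beats mailing everyone."""
    return campaign_revenue(c) - mass_mailing_revenue(c)
```

    For instance, with the hypothetical counts ConfusionCounts(tp=900, fp=2000, tn=10500, fn=100), the targeted campaign yields UAH 7000, while mass mailing yields UAH -2500. Under this formula the gain simplifies to TN*CM - FN*(R-CM-CR), so a classifier is preferable to mass mailing only when the mailing costs saved on true negatives outweigh the income lost on false negatives.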

   The research was implemented through step-by-step analysis and modeling. The overall process
of optimization of marketing decisions for telecommunications companies through machine learning
technologies is shown in Figure 1.


Figure 1: The proposed concept of the research

4. Results
   At the initial stage of work with the clustering problem, the SOM algorithm was applied with
automatic selection of the number of clusters. For implementation, the Deductor Studio software was
used, as a result of which 9 clusters (0-8) with different profiles were formed.
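
    The map in this study was built in Deductor Studio. The following plain-Python sketch only illustrates the general Kohonen SOM training loop: a search for the best-matching unit followed by a neighborhood update. It is not the Deductor Studio implementation and does not include the g-means step for automatic selection of the number of clusters. The grid size, learning rate, neighborhood schedule and all function names are assumptions chosen for illustration.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, sample):
    """Return (row, col) of the node whose weight vector is closest to sample."""
    best_pos, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = squared_distance(w, sample)
            if dist < best_dist:
                best_pos, best_dist = (r, c), dist
    return best_pos


def train_som(data, rows=3, cols=3, epochs=20, lr0=0.5, seed=0):
    """Train a rectangular SOM; data is a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    # Initialise every node with a copy of a randomly chosen sample.
    weights = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0  # initial neighbourhood radius (assumed schedule)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for sample in order:
            decay = 1.0 - step / total_steps   # falls linearly from 1 towards 0
            lr = lr0 * decay
            sigma = max(sigma0 * decay, 0.5)   # keep the radius from collapsing
            br, bc = best_matching_unit(weights, sample)
            for r in range(rows):
                for c in range(cols):
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    influence = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    # Pull the node towards the sample, scaled by its grid
                    # distance from the best-matching unit.
                    for i, value in enumerate(sample):
                        w[i] += lr * influence * (value - w[i])
            step += 1
    return weights
```

    After training, calling best_matching_unit(weights, subscriber) for each subscriber vector assigns it to a map node; nodes can then be grouped into clusters and inspected per indicator, as is done with the component maps discussed in this section. In practice the indicators would usually be brought to a comparable scale first, which is an additional assumption not stated in the text.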

    On the basis of the obtained SOMs (Fig. 2), it is possible to analyze in detail the groups of
consumers based on various characteristics and the formed customer clusters. Thus, analyzing the
"Age" map, three age groups can be clearly distinguished: young people, middle-aged people, and
people over 45 years old. Focusing on youth in more detail, we can understand that it is quite
heterogeneous and several separate clusters can be distinguished among it. The first is located in the
upper right corner and is characterized by those customers who actively use the company's services in
the evening and at night, use Internet services to a large extent. As a result, they spend more on
mobile communication and mobile Internet than other representatives of this age group. This segment
includes most of those who prefer activity at night. It can be assumed that these are students and
young people who often spend the evening outside the house, communicate with friends or watch
video content. A small group of young people is concentrated below; it is not distinguished by
active use of services during the day, in the evening, or even at night, and, as
a result, monthly expenses for communication and the Internet in this cluster are small.


Figure 2: The SOM for client segmentation constructed in Deductor Studio
    The rest of the people of this age group are not distinguished by anything special: moderate
expenses for communication and the Internet and, to a greater extent, activity in the evening. It can be
assumed that most young people fall into this cluster. Thus, we clearly identified three clusters in the youth age
group. Continuing the interpretation of SOM, we will focus on people of mature and retirement age.
Let's pay attention to the pronounced cluster in the lower left part, in which high values are observed
for almost all indicators, except for the Internet, including activity with other cities and countries.
These are so-called "VIP" clients: businessmen, executives, top managers. The vast majority of them
are of mature age, they carry out a lot of activities during the day and in the evening (most likely, due
to their work) and use the mobile Internet the least. Monthly expenses for communication and Internet
in this category of subscribers are the highest among all the company's clients.
    On the left, in one of the clusters, a completely opposite picture is observed: people practically do
not use mobile communication and Internet services. Most likely, these are pensioners who need
mobile communication and/or the Internet primarily to receive incoming calls, and their independent
activity is minimal. Costs for this group of clients are the lowest, which may be due to the fact that the
only source of income is a pension. The rest of the people in the mature and retirement age group are
united by the fact that they are mainly active in the evening and do not use the Internet very actively.
Most probably, these are working pensioners, summer residents, and parents of adult
children. The last cluster of middle-aged people includes working subscribers, but among them there
is a group of those who are not very active in the evening (perhaps these are employees with a non-standard
work schedule – night/evening shifts, etc.).
    However, the automatic determination of the number of clusters using the G-means algorithm
produces 9 clusters in this case, which can create difficulties in practical application due to their large
number. It is recommended to reduce the number of clusters to 6, and to apply the k-means algorithm.
6 clusters (0…5) were formed in the Weka software, each of which contains from 4% to 31% of
customers, which indicates a sufficient number of cases for training and future application of the
model. Each cluster is characterized by a unique centroid for each of the 10 indicators, which
determines the differences between them and, as a result, determines the differences in the behavior of
each group.
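
    The k-means step can be sketched in a few lines of plain Python. This is an illustrative implementation of the standard Lloyd iteration (assign each point to the nearest centroid, then recompute the centroids), not the Weka clusterer used in the study; the function names, the random initialization and the toy data in the usage note are assumptions. The helper cluster_shares mirrors the check described above that each cluster holds a sufficient share of customers.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Return (centroids, labels) for a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    # Start from k distinct data points picked at random (assumed initialization).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def cluster_shares(labels):
    """Fraction of all points that falls into each cluster."""
    counts = Counter(labels)
    return {cluster: count / len(labels) for cluster, count in sorted(counts.items())}
```

    With the settings reported in the text, the call would be kmeans(subscribers, k=6), where subscribers is a hypothetical list of 10-element indicator vectors, and cluster_shares(labels) would return the share of customers in each of the six clusters. Each returned centroid is the per-indicator mean of its cluster, which is what the cluster profiles are read from.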


    Based on the results, we will characterize the clusters shown in Figure 3. Cluster 5 includes mature
and older people who have the lowest costs for communication and the Internet and minimal activity
in using all services; that is, it can be assumed that this cluster is formed primarily by retirees and
people with low incomes. Cluster 4 includes people from young to mature age who have moderate
expenses for Internet and communication, whose activity especially increases in the evening and at
night, and who have the highest share of mobile Internet use; that is, it can be assumed that this group includes
active young people who spend their free time outside the home or actively use the Internet for
communication or entertainment. Cluster 3 includes young and mature people with moderate
expenses and the highest levels of phone conversations, with activity increasing especially in the
evening. We assume that the group includes people who actively use the company's services for
interpersonal communication. Cluster 2 includes people under 35 years of age with moderate
expenses and average indicators of activity. The group actively uses the Internet and shows varying levels
of activity in the evening. Cluster 1 includes mature and older people with moderate spending and
average activity indicators, with minimal activity at night and on the Internet. Cluster 0 includes
people of almost all age categories who have the highest costs for communication and the Internet, the
average level of phone conversations, the highest activity during the day and evening, and the highest
activity with other cities and countries; that is, it can be assumed that this group is formed by "VIP
clients", that is, business representatives who use communication and the Internet for business
purposes.
[Figure 3 panels: Age; Average expenses; Average calls duration; Daily activity; Evening activity;
Night activity; Activity with other cities; Activity with other countries; Share of calls to
landline phones; Activity on the Internet]
Figure 3: Characteristics of customer clusters of a telecommunications company
   Thus, the results of this clustering create a basis for improving the product offer and for the
company's transition to personalized communication with its subscribers, in order to increase revenue
per user and to ensure loyalty to the company. These problems will be solved later.
   It is important to investigate whether the division into clusters is preserved over different
periods of time and over what time horizon it is appropriate to apply these results. The analysis
indicates that in the short term (1-3 years) customer behavior is relatively stable, because
customers consume mobile communication and mobile Internet services in almost unchanged volumes and
have a relatively unchanged standard of living and income. In the long term (5-10 years and more),
however, the clustering results may deviate significantly from reality, as they are affected by a
number of factors and general market trends. The clustering results must therefore be updated during
annual strategic and tactical planning, so that they correspond to the actual behavior of customers
and marketing activities are adapted to current conditions. Although the results require regular
verification, the potential and effectiveness of machine learning and Data Science methods for
business are constantly growing. The conclusions formed are part of broader research into the
formation of an effective marketing strategy and, as a result, the management of marketing
activities in general. Customer segmentation based on machine learning improves the quality of
advertising planning, as it makes it possible to launch personalized communication with digital
placement tools and to manage customer loyalty through product improvement and relevant offers to
interested segments.


    The next marketing task to be solved is the optimization of e-mail mailings to customers in
order to maximize response and minimize costs for those who do not respond to advertising activity.
Applying different classification methods helps to increase the efficiency of advertising activity
through effective recommendations for future advertising mailings. The results of various
classification approaches were analyzed to assess the quality of the classification.
    According to the classification results, OneR, JRip and Decision Table show the best accuracy
(compared to the baseline ZeroR classifier, accuracy increased from 85.5% to 92.0-92.2%). Among
them, the Decision Table approach showed the highest results. However, given the business goal of
minimizing costs, the more costly error is a false positive: the model predicts a response, but the
customer does not respond. This error is the smallest for the OneR method (none of the clients were
incorrectly classified in this way). In this regard, it is recommended to use a combination of
Decision Table and OneR.
    Among the methods based on decision trees, the J48 (C4.5) algorithm shows slightly worse results
(accuracy of 92.0%; 129 observations are classified as positive although there was no response,
which implies additional costs for the company). The decision tree with standard settings is the
best option for this algorithm (highest accuracy without overfitting). The tree is quite extensive,
but it provides a clear understanding of the factors that influence whether a customer will respond
to an advertising message, and it can be turned into so-called "golden rules" for setting up
mailings to customers. However, the classification results indicate that the problem of sample
imbalance should be addressed, since for most methods the results are close to random classification
(the response is distributed almost 50%/50% between the classes). At the moment, the share of
customers who respond to the E-mail is 14.5%, which creates a significant imbalance between those
who respond and those who do not respond to the advertising message. We balance the sample with
oversampling, that is, by increasing the share of one class: given the current share of the
response = 1 class, it is advisable to increase it 5 times to obtain a balanced sample (46% for
class 1 and 54% for class 0). We then repeat the classification with the algorithms mentioned above
on the balanced sample and compare the results in Table 2.
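
    The oversampling arithmetic can be checked with a short sketch. The function names are
hypothetical, and simple duplication of class 1 records is an assumption about how the sample was
enlarged. The count of 1 952 responders is also an inference (9 760 / 5, from the class totals in
Table 2), not a figure stated in the text.

```python
def share_after_oversampling(minority_share, factor):
    """Share of the minority class after it is replicated `factor` times."""
    minority = minority_share * factor
    majority = 1.0 - minority_share
    return minority / (minority + majority)

def oversample(labels, minority=1, factor=5):
    """Duplicate every minority-class label `factor` times (simple replication)."""
    balanced = []
    for y in labels:
        balanced.extend([y] * (factor if y == minority else 1))
    return balanced

share = share_after_oversampling(0.145, 5)
print(f"class 1: {share:.0%}, class 0: {1 - share:.0%}")  # class 1: 46%, class 0: 54%

# Inferred original sample: 1 952 responders and 11 552 non-responders.
sample = [1] * 1952 + [0] * 11552
balanced = oversample(sample)
print(balanced.count(1), balanced.count(0))  # 9760 11552
```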
    According to the classification results, Table 2 shows that IBk, J48 and Random Forest have the
best accuracy (92.9-97.2%). Among them, the Random Forest approach showed the highest results
both in terms of overall accuracy and in terms of minimizing the cost of inefficient mailings. The
Random Forest algorithm is recommended for implementation in marketing planning and marketing
activities from the point of view of optimizing costs for advertising activity.
    The next step in improving the classification results is to apply the Cost-Sensitive Classifier
to all methods, taking into account the cost matrix presented in Table 1. From the classification
results in Table 2 and Table 3, we conclude that IBk, J48 and Random Forest maintain the best
performance in terms of accuracy (83.2-95.8%) and potential revenue. Among them, the Random Forest
approach showed the highest results when used without the Cost-Sensitive classification approach.
The Random Forest algorithm in combination with J48 and IBk is recommended for implementation in
telecommunication companies in order to optimize costs for advertising activity, which is one of the
key areas of marketing activity. Thus, based on the applied data classification methods, the company
can tailor mailings to those customers who are more likely to respond to the message. As a result,
costs will be minimized, and revenues will increase.
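
    The idea behind cost-sensitive targeting can be sketched as a decision rule on the predicted
response probability. The two monetary values below are assumptions: they are inferred from the
Return column of the result tables (for example, 2 × 9 760 − 904 = 18 616 for IBk) and are not
quoted from the cost matrix in Table 1. The function names are hypothetical, and the rule shown is
a generic expected-value decision, not necessarily the exact procedure applied in Weka.

```python
GAIN_PER_RESPONSE = 2.0     # assumed gain (UAH) when a mailed customer responds
LOSS_PER_WASTED_MAIL = 1.0  # assumed loss (UAH) when a mailed customer does not respond

def expected_return(p_response):
    """Expected return of mailing one customer with response probability p_response."""
    return p_response * GAIN_PER_RESPONSE - (1.0 - p_response) * LOSS_PER_WASTED_MAIL

def should_mail(p_response):
    """Mail only if the expected return is positive (not mailing returns zero)."""
    return expected_return(p_response) > 0

def break_even_probability():
    """Response probability at which mailing and not mailing are equally good."""
    return LOSS_PER_WASTED_MAIL / (GAIN_PER_RESPONSE + LOSS_PER_WASTED_MAIL)

for p in (0.20, 0.50):
    print(p, should_mail(p), round(expected_return(p), 2))
```

    Under these assumed values the break-even probability is 1/3, so a customer would be mailed
even when a response is less likely than not, which is why a cost-sensitive model can trade some
accuracy for fewer missed responders.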
    To maximize marketing efficiency, it is advised to integrate machine learning technologies into
the regular management of consumer behavior. An effective way to organize such modeling is a cyclic
process that comprises the following stages:
     • Obtaining historical data on the influencing factors and variables that describe the factors;
     • Updating the models, evaluating the effectiveness of previous decisions and current results;
     • Formation of recommendations for marketing activities and work with the consumer base;
     • Implementation of recommended solutions.
    With regular support of the machine learning models, the company can define business tasks for
different time intervals (monthly and quarterly). The main tasks for weekly time intervals are
business monitoring, checking the efficiency of marketing solutions, and evaluating the quality and
accuracy of the constructed models. The main tasks on a quarterly basis are model updates, formation
of recommendations for future marketing decisions, and evaluation of the effectiveness of previous
solutions.
Table 2
The results of classification by different methods for a balanced sample
(confusion matrix: rows are the actual classes a = 1 and b = 0, columns are the classes assigned by
the model)
                                     Actual a=1            Actual b=0
Method           Accuracy   AUC      as a      as b        as a      as b        Return, UAH
PART             84.6%      0.94     8991      769         2552      9030        15 430
OneR             76.2%      0.77     8200      1560        3513      8039        12 887
JRip             86.6%      0.91     8710      1050        1808      9744        15 612
Decision Table   82.3%      0.92     9097      663         3117      8435        15 077
IBk              95.8%      0.94     9760      0           904       10648       18 616
SMO              79.3%      0.79     7229      2531        1875      9677        12 583
Naïve Bayes      77.9%      0.88     5880      3880        829       10723       10 931
J48              92.9%      0.96     9592      168         1337      10215       17 847
Random Forest    97.2%      0.999    9760      0           602       10950       18 918
Logistic         79.1%      0.87     7063      2697        1753      9799        12 373
AdaBoostM1       79.5%      0.90     8579      1181        3197      8355        13 961
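
    The Accuracy and Return columns of Table 2 can be recomputed from the confusion matrices, as in
the sketch below. Here TP and FN are responders (a = 1) classified as a and as b, while FP and TN
are non-responders (b = 0) classified as a and as b. The return formula, 2 UAH gained per correctly
targeted responder minus 1 UAH lost per mailing without a response, is an assumption inferred from
the table figures rather than taken from the cost matrix in Table 1; the function names are
hypothetical.

```python
def accuracy(tp, fn, fp, tn):
    """Share of correctly classified customers."""
    return (tp + tn) / (tp + fn + fp + tn)

def mailing_return(tp, fp, gain=2, loss=1):
    """Assumed return in UAH: gain per true positive minus loss per false positive."""
    return gain * tp - loss * fp

# (TP, FN, FP, TN) for three rows of Table 2.
rows = {
    "IBk": (9760, 0, 904, 10648),
    "J48": (9592, 168, 1337, 10215),
    "Random Forest": (9760, 0, 602, 10950),
}

for name, (tp, fn, fp, tn) in rows.items():
    print(f"{name}: accuracy {accuracy(tp, fn, fp, tn):.1%}, "
          f"return {mailing_return(tp, fp)} UAH")
# IBk: accuracy 95.8%, return 18616 UAH
# J48: accuracy 92.9%, return 17847 UAH
# Random Forest: accuracy 97.2%, return 18918 UAH
```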


5. Conclusions
    Customer segmentation and marketing management are crucial tasks of the marketing system of a
telecommunications company that operates in an oversaturated market. Improving the quality of
communication with clients becomes an area of potential optimization, which will minimize
unnecessary costs and generate additional revenue growth. To ensure the effective functioning of
companies on the market, it is necessary to apply machine learning approaches, methods and models to
customer data.
    The implementation of machine learning technologies provided high-quality research results. The
results of constructing the SOM and k-means clustering for customer segmentation for one of the
leading Ukrainian telecommunication companies became a basis for the development of client profiles
based on their demographics and on data about their consumption of the company's services in terms
of frequency, duration of service use, and level of expenses. Such an approach helps to optimize
marketing activities, in particular by assessing and determining the most profitable customer
segments and identifying the key differences between target groups. These differences form a basis
for the future development of marketing activities aimed at certain groups of customers (for
example, personalized communications and promotional activities); development of new tariff plans;
optimization of costs for addressed SMS/E-mail mailing; and minimizing the outflow of customers to
competitors.

Table 3
The results of Cost-Sensitive classification by different methods for a balanced sample
(confusion matrix: rows are the actual classes a = 1 and b = 0, columns are the classes assigned by
the model)
                                     Actual a=1            Actual b=0
Method           Accuracy   AUC      as a      as b        as a      as b        Return, UAH
ZeroR            45.8%      0.50     9760      0           11552     0           7 968
PART             81.7%      0.83     9759      1           3891      7661        15 627
JRip             82.7%      0.83     8848      912         2771      8781        14 925
Decision Table   80.8%      0.82     9455      305         3788      7764        15 122
IBk              95.8%      0.96     9760      0           904       10648       18 616
Naïve Bayes      75.5%      0.77     8709      1051        4168      7384        13 520
J48              92.9%      0.93     9632      128         1381      10171       17 883
Random Forest    83.2%      0.85     9760      0           3571      7981        15 949
AdaBoostM1       81.2%      0.82     9363      397         3617      7935        15 109
    To choose the best classification model, the following methods were investigated: ZeroR, PART,
OneR, JRip, Decision Table, IBk, SMO, Naïve Bayes, J48 (C4.5), Random Forest, Logistic regression
and AdaBoostM1, with the additional use of the Cost-Sensitive option. As a result, the Random Forest
algorithm in combination with J48 and IBk showed the best performance and the best economic effect
and is recommended for implementation in telecommunication companies in order to optimize costs for
advertising activity, which is one of the key areas of marketing activity. Thanks to the applied
classification, the company can tailor mailings to those customers who have the highest probability
of responding positively to the advertising SMS / E-mail. These decisions will help to minimize
advertising costs and increase revenue by more than 137% compared with random mailing to the entire
consumer base. The estimated accuracy exceeded 80%, which indicates that it is possible and feasible
to use the models for further classification of customer responses and for determining the most
effective consumer segments.
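
    The 137% figure appears consistent with comparing the Random Forest return in Table 2
(18 918 UAH) with the ZeroR return in Table 3 (7 968 UAH), where every customer in the sample is
mailed. Treating the ZeroR row as the mailing-to-everyone baseline is our reading of the tables, not
something the text states explicitly; the function name is hypothetical.

```python
def relative_uplift(model_return, baseline_return):
    """Relative increase of the model return over the baseline return."""
    return (model_return - baseline_return) / baseline_return

uplift = relative_uplift(18918, 7968)
print(f"{uplift:.1%}")  # 137.4%
```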
    To maximize marketing efficiency, it is advised to integrate machine learning technologies into
the regular management of consumer behavior and customer relationship management. An effective way
to organize such modeling is a cyclic process that keeps the constructed models up to date with the
current business environment and with consumer behavior and preferences.
    The results of the research, the constructed models and the proposed concept can be applied in
real business practice to optimize marketing activities for both Ukrainian and international
companies in the telecommunications market by making effective data-driven decisions, and to improve
the mathematical methodology of consumer segmentation and optimization of advertising mailings.
Marketing strategy optimization based on data-driven decisions and on finding hidden insights in
data has a significant influence on business efficiency, owing to the high quality and validity of
the decision-making process in very dynamic market conditions. As an area of future research, it is
relevant to focus on overcoming the limitations of the current research (in particular, the
collection of different indicators about consumer characteristics and their service preferences) and
on the periodic support of the constructed models in different market conditions, given possible
changes in consumer behavior. It is necessary to identify new potential factors in a timely manner,
which will improve marketing decisions. Therefore, it is advisable to conduct research on a regular
basis, which can be effectively implemented in future marketing activities.

6. References
[1] S. Verma, R. Sharma, S. Deb, D. Maitra, Artificial intelligence in marketing: Systematic review and
     future research direction, International Journal of Information Management Data Insights (2021), Vol.
     1, Issue 1. DOI: https://doi.org/10.1016/j.jjimei.2020.100002.
[2] J. Saidali, H. Rahich, Y. Tabaa, A. Medouri, The combination between Big Data and Marketing
     Strategies to gain valuable Business Insights for better Production Success, Procedia Manufacturing
     (2019), Vol. 32, pp. 1017-1023. DOI: 10.1016/j.promfg.2019.02.316.
[3] W. H. Susilo, An Impact of Behavioral Segmentation to Increase Consumer Loyalty: Empirical Study
     in Higher Education of Postgraduate Institutions at Jakarta, Procedia - Social and Behavioral Sciences
     (2016), Vol. 229, pp. 183-195. DOI: https://doi.org/10.1016/j.sbspro.2016.07.128.
[4] R. Amir, D. Machowska, M. Troege, Advertising patterns in a dynamic oligopolistic growing market
     with decay, Journal of Economic Dynamics and Control (2021), Vol. 131. DOI:
     https://doi.org/10.1016/j.jedc.2021.104229.
[5] A. S. Abrahams, E. Coupey, E. X. Zhong, R. Barkhi, P. S. Manasantivongs, Audience targeting by B-
     to-B advertisement classification: A neural network approach, Expert Systems with Applications
     (2013), Vol. 40, Issue 8, pp. 2777-2791. DOI: 10.1016/j.eswa.2012.10.068.
[6] P.-T. Chen, H.-P. Hsieh, Personalized mobile advertising: Its key attributes, trends, and social impact,
     Technological Forecasting and Social Change (2012), Vol. 79, Issue 3, pp. 543-557. DOI:
     https://doi.org/10.1016/j.techfore.2011.08.011.
[7] K. Y. Kim, B. G. Lee, Marketing insights for mobile advertising and consumer segmentation in the
     cloud era: A Q–R hybrid methodology and practices, Technological Forecasting and Social Change
     (2015), Vol. 91, pp. 78-92. DOI: https://doi.org/10.1016/j.techfore.2014.01.011.
[8] J. Zhou, L. Zhai, A. A. Pantelous, Market segmentation using high-dimensional sparse consumers
     data, Expert Systems with Applications (2020), Vol. 145. DOI: 10.1016/j.eswa.2019.113136.
[9] D. Arunachalam, N. Kumar, Benefit-based consumer segmentation and performance evaluation of
     clustering approaches: An evidence of data-driven decision-making, Expert Systems with
     Applications (2018), Vol. 111, pp. 11-34. DOI: https://doi.org/10.1016/j.eswa.2018.03.007.
[10] R. Pukała, Use of Neural Networks in Risk Assessment and Optimization of Insurance Cover in
     Innovative Enterprises, Engineering Management in Production and Services (2016), Vol. 8, No. 3,
     pp. 43-56. DOI: https://doi.org/10.1515/emj-2016-0023.
[11] E. Pesikov, O. Zaikin, E. Kozlova, Conducting market segmentation and diagnostics of the consumer
     printed products by using methods of multivariate statistical analysis and artificial intelligence, IFAC
     Proceedings       Volumes       (2013),     Vol.     46,    Issue     9,    pp.    2116-2121.        DOI:
     https://doi.org/10.3182/20130619-3-RU-3018.00642.
[12] J. Zethmayr, R. S. Makhija, Six unique load shapes: A segmentation analysis of Illinois residential
     electricity consumers, The Electricity Journal (2019), Vol. 32, Issue 9. DOI:
     https://doi.org/10.1016/j.tej.2019.106643.
[13] G. Di Vita, R. Zanchini, G. Falcone, M. D’Amico, F. Brun, G. Gulisano, Local, organic or protected?
     Detecting the role of different quality signals among Italian olive oil consumers through a hierarchical
     cluster     analysis,    Journal     of    Cleaner      Production     (2021),    Vol.      290.     DOI:
     https://doi.org/10.1016/j.jclepro.2021.125795.
[14] A. Ortiz, C. Díaz-Caro, D. Tejerina, M. Escribano, E. Crespo, P. Gaspar, Consumption of fresh
     Iberian pork: Two-stage cluster for the identification of segments of consumers according to their
     habits and lifestyles, Meat Science (2021), Vol. 173. DOI: 10.1016/j.meatsci.2020.108373.
[15] A. Higuchi, R. Maehara, A factor-cluster analysis profile of consumers, Journal of Business Research
     (2021), Vol. 123, pp. 70-78. DOI: https://doi.org/10.1016/j.jbusres.2020.09.030.
[16] E. Geldiev, N. Nenkov, M. Petrova, Exercise of Machine Learning Using Some Python Tools and
     Techniques. CBU International conference proceedings 2018: Innovations in Science and Education,
     21.-23.03.2018 (2018), pp.1062-1070. DOI: https://doi.org/10.12955/cbup.v6.1295.
[17] S. Ramazanov, V. Babenko, O. Honcharenko, N. Moisieieva, V. Dykan, Integrated Intelligent
     Information and Analytical System of Management of a Life Cycle of Products of Transport
     Companies. Journal of Information Technology Management (2020), 12(3), pp. 26-33. DOI:
     https://doi.org/10.22059/JITM.2020.76291.


                                                                                                         123
[18] M. C. Onwezen, M. J. Reinders, I. A. van der Lans, S. J. Sijtsema, A. Jasiulewicz, M. D. Guardia,
     Luis Guerrero, A cross-national consumer segmentation based on food benefits: The link with
     consumption situations and food perceptions, Food Quality and Preference (2012), Vol. 24, Issue 2,
     pp. 276-286. DOI: https://doi.org/10.1016/j.foodqual.2011.11.002.
[19] M. C.D. Verain, S. J. Sijtsema, G. Antonides, Consumer segmentation based on food-category
     attribute importance: The relation with healthiness and sustainability perceptions, Food Quality and
     Preference (2016), Vol. 48, Part A, pp. 99-106. DOI: 10.1016/j.foodqual.2015.08.012.
[20] V. Cariou, T. F. Wilderjans, Consumer segmentation in multi-attribute product evaluation by means
     of non-negatively constrained CLV3W, Food Quality and Preference (2018), Vol. 67, pp. 18-26.
     DOI: https://doi.org/10.1016/j.foodqual.2017.01.006.
[21] E. Vigneau, M. Charles, M. Chen, External preference segmentation with additional information on
     consumers: A case study on apples, Food Quality and Preference (2014), Vol. 32, Part A, pp. 83-92,
     https://doi.org/10.1016/j.foodqual.2013.05.007.
[22] M. Delley, T. A. Brunner, A segmentation of Swiss fluid milk consumers and suggestions for target
     product concepts, Journal of Dairy Science (2020), Vol. 103, Issue 4, pp. 3095-3106. DOI:
     https://doi.org/10.3168/jds.2019-17325.
[23] A. Díaz, M. Gómez, A. Molina, J. Santos, A segmentation study of cinema consumers based on
     values and lifestyle, Journal of Retailing and Consumer Services (2018), Vol. 41, pp. 79-89. DOI:
     https://doi.org/10.1016/j.jretconser.2017.12.001.
[24] Y. U. Cha, M. J. Park, Consumer preference and market segmentation strategy in the fast moving
     consumer goods industry: The case of women’s disposable sanitary pads, Sustainable Production and
     Consumption (2019), Vol. 19, pp. 130-140. DOI: https://doi.org/10.1016/j.spc.2019.04.002.
[25] A. Albert, M. Maasoumy, Predictive segmentation of energy consumers, Applied Energy (2016),
     Vol. 177, pp. 435-448. DOI: https://doi.org/10.1016/j.apenergy.2016.05.128.
[26] C. Lutz, G. Newlands, Consumer segmentation within the sharing economy: The case of Airbnb,
     Journal of Business Research (2018), Vol. 88, pp. 187-196. DOI: 10.1016/j.jbusres.2018.03.019.
[27] Y. Yang, P. Zhai, Click-through rate prediction in online advertising: A literature review, Information
     Processing & Management (2022), Vol. 59, Issue 2. DOI: https://doi.org/10.1016/j.ipm.2021.102853.
[28] N. Deshpande, S. Ahmed, A. Khode, Web based Targeted Advertising: A Study based on Patent
     Information, Procedia Economics and Finance (2014), Vol. 11, pp. 522-535. DOI:
     https://doi.org/10.1016/S2212-5671(14)00218-4.
[29] J.-A. Choi, K. Lim, Identifying machine learning techniques for classification of target advertising,
     ICT Express (2020), Vol. 6, Issue 3, pp. 175-180. DOI: https://doi.org/10.1016/j.icte.2020.04.012.
[30] K. Coussement, P. Harrigan, D. F. Benoit, Improving direct mail targeting through customer response
     modeling, Expert Systems with Applications (2015), Vol. 42, Issue 22, pp. 8403-8412. DOI:
     https://doi.org/10.1016/j.eswa.2015.06.054.
[31] N. Reynaldo, Goenawan, W. Chanrico, D. Suhartono, F. Purnomo, Gender Demography
     Classification on Instagram based on User’s Comments Section, Procedia Computer Science (2019),
     Vol. 157, pp. 64-71. DOI: https://doi.org/10.1016/j.procs.2019.08.142.
[32] F. Kaefer, C. M. Heilman, S. D. Ramenofsky, A neural network application to consumer
     classification to improve the timing of direct marketing activities, Computers & Operations Research
     (2005), Vol. 32, Issue 10, pp. 2595-2615. DOI: 10.1016/j.cor.2004.06.021.
[33] A. Sharma, Y. K. Dwivedi, V. Arya, M. Q. Siddiqui, Does SMS advertising still have relevance to
     increase consumer purchase intention? A hybrid PLS-SEM-neural network modelling approach,
     Computers in Human Behavior (2021), Vol. 124. DOI: 10.1016/j.chb.2021.106919.
[34] A. Muk, C. Chung, Applying the technology acceptance model in a two-country study of SMS
     advertising, Journal of Business Research (2015), Vol. 68, Issue 1, pp. 1-6. DOI:
     https://doi.org/10.1016/j.jbusres.2014.06.001.
[35] Á. J. Lorente-Páramo, J. Chaparro-Peláez, Á, Hernández-García, How to improve e-mail click-
     through rates – A national culture approach, Technological Forecasting and Social Change (2020),
     Vol. 161. DOI: https://doi.org/10.1016/j.techfore.2020.120283.
[36] S. Ma, L. Hou, W. Yao, B. Lee, A nonhomogeneous hidden Markov model of response dynamics
     and mailing optimization in direct marketing, European Journal of Operational Research (2016), Vol.
     253, Issue 2, pp. 514-523. DOI: https://doi.org/10.1016/j.ejor.2016.02.055.
[37] M. Paulo, V. L. Miguéis, I. Pereira, Leveraging email marketing: Using the subject line to anticipate
     the open rate, Expert Systems with Applications (2022), Vol. 207. DOI:
     https://doi.org/10.1016/j.eswa.2022.117974.


                                                                                                       124