<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Hierarchical Information System for Management of the Targeted Advertising</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Karina Melnyk</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Natalia Borysova</string-name>
          <email>borysova.n.v@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viktoriia Melnyk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv general education school of I-III degrees No 145</institution>
          ,
          <addr-line>Amosova street, 24a, Kharkiv, 61171</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Technical University “Kharkiv Polytechnic Institute”</institution>
          ,
          <addr-line>Kirpichova street, 2, Kharkiv, 61002</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The problems of target audience identification and user segmentation for managing the delivery of targeted advertising to customers are considered. It is proposed to combine the solution of these two tasks within a single information system. An overview of the existing methods for identifying and segmenting the target audience is presented. In addition, existing applications and tools for solving the given problem are considered. The formalization of these tasks is presented. Functional models of the business processes corresponding to identifying and segmenting the target audience are developed. The architecture of the information system is proposed in the form of a deployment diagram, and a database model is developed. The results of numerical studies and an evaluation of the effectiveness of the developed information system are presented.</p>
      </abstract>
      <kwd-group>
        <kwd>Targeted advertising</kwd>
        <kwd>target audience</kwd>
        <kwd>buyer persona</kwd>
        <kwd>customer segmentation</kwd>
        <kwd>classification methods</kwd>
        <kwd>K-Means Clustering</kwd>
        <kwd>similarity measure</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction </title>
      <p>process can be used for promotion of a certain product to a group of buyers from the target audience.
Other advantages of using segmentation models are the following: drawing up personalized price
proposals, improving customer experience, searching for the ideal customer, reducing customer
churn, optimizing prices, searching for new market opportunities, etc.</p>
      <p>Therefore, the purpose of this research is to develop an integrated hierarchical information system
(HIS) that would combine the functions of systems for determining the target audience and a system
of user segmentation for the distribution of targeted advertising.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Formal problem statement </title>
      <p>The Hierarchical Information System is a system that solves tasks stage by stage. The first task is
to identify the target audience. It is a classification task, since it divides customers into several
predefined groups. The input information is a set of characteristics of potential buyers. To solve the
problem, it is necessary to:
• analyze the domain area in order to create a portrait of the buyer persona;
• review approaches and methods for determining the target audience;
• formalize a method for solving the task.</p>
      <p>The second level of the hierarchical management system is designed to solve the problem of
segmentation of the target audience. This is a clustering task and belongs to the class of
unsupervised learning tasks. Potential customers should be grouped into clusters in such a way that
objects from one cluster are closer to one another than to objects from other clusters according to a chosen criterion. To
solve the clustering problem, it is necessary to:
• analyze methods and applications for solving the clustering task;
• formalize the proposed method;
• perform a numerical study;
• evaluate the proposed approach.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Overview of existing models, methods and applications to solve the given issue </title>
      <p>
        This study proposes to solve the given issue in two stages. Therefore, it is necessary to
review the existing methods and applications separately for each stage: for identifying the target
audience and for customer segmentation. There are many generally accepted models in direct
marketing that can be used both for target audience identification and for customer segmentation. The
choice of model depends on the available user data (Table 1) [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1-4</xref>
        ].
      </p>
      <table-wrap>
        <label>Table 1</label>
        <caption>
          <title>Models for target audience identifying and customer segmentation</title>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Model name</th>
              <th>Characteristics</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Demographic based</td>
              <td>Age, gender, occupation, income, education, marital status, ethnicity, race, religion, profession or role in the company</td>
            </tr>
            <tr>
              <td>Geographic based</td>
              <td>Continent, country, region or state, city, district, postal code, timezone</td>
            </tr>
            <tr>
              <td>Psychographic based</td>
              <td>Social class, lifestyle, personality, values, presence in digital and/or social media space, personal convictions, beliefs, attitude, interests</td>
            </tr>
            <tr>
              <td>Technographic based</td>
              <td>Usage of devices, applications and software</td>
            </tr>
            <tr>
              <td>Behavioral</td>
              <td>Habits, spending, consumption, usage and desired benefits, loyalty, awareness, types of payment, demands, quality fanatics, price and/or brand sensitivity</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In addition, it is possible to use multiple models or mixed models.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1. Review of target audience identifying methods and applications </title>
      <p>
        Any classification method can be used to solve the task of identifying the target audience.
For example, Naive Bayes, logistic regression, Support Vector Machine, k-nearest neighbors, decision
trees, etc. allow dividing objects into different groups [
        <xref ref-type="bibr" rid="ref5 ref6 ref7 ref8 ref9">5-9</xref>
        ]. The set of buyers is divided into two
classes: included in the target audience and not included in the target audience. The classification
features are the characteristics of the portrait of an ideal buyer. The portrait is built before the
launch of an advertising campaign for a product or service and can be adjusted later. The characteristics
correspond to one or more of the models used or to a mixed model (Table 1). In the best case, the portrait is drawn up by a marketing
specialist who understands the intricacies of promoting the company’s product or service.
She/he can use various approaches, for example, Mark Sherrington’s “5W” approach. 5W
means answering the following questions: What? (the type of product or service we sell); Who? (our customer);
Why? (motivation for buying); When? (conditions for buying); Where? (place of buying). There are
alternative marketing approaches to the 5W method: the method “from the opposite”, the method
“from the product”, the method “from the market”, and the method “from the target” [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The result of
using any of the marketing methods is a set of values for the characteristics of an ideal buyer. This is
the most important and crucial stage for the further success of the advertising campaign, since it influences
the size of the company’s profit. An interesting tool from HubSpot named Make My Persona [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
can come in handy when drawing up a portrait of an ideal buyer. This tool allows building a
virtual image of a buyer persona in seven simple steps:
1. Creating the persona’s avatar and choosing the persona’s name.
2. Identifying the persona’s demographic characteristics, such as age and level of education.
3. Identifying the persona’s business, such as the industry and the size of the company.
4. Identifying the persona’s career, such as job title, how the job is measured, and the boss.
5. Identifying the persona’s job characteristics, such as goals or objectives, biggest
challenges, and job responsibilities.
6. Identifying the persona’s favorite tools for doing the job and for communicating with vendors and other
businesses.
7. Identifying the persona’s consumption habits, such as favorite social networks and
job training.
      </p>
      <p>After answering these questions, the user is taken to a page where she/he can see the Buyer
Persona Overview and can also expand it by adding custom fields. The Overview can be saved,
downloaded or shared on social networks after filling out a special pop-up form.</p>
      <p>Existing applications, services, tools and software intended for target audience
identification, such as Google Analytics, Facebook Insights, Twitter Followers Dashboard
and others, can only be used to analyze the company’s existing customer data: to create various
analytical reports, visualize analysis results, make forecasts, and track the activity of regular
customers, the inflow and outflow of customers, etc. However, all of these feature-rich applications,
services, tools and software have a “cold start” problem: they cannot be used for analysis in the
absence of data. For example, when a startup is launched and the portrait of the ideal buyer has
already been built, the manager has to create advertising, but the target audience is still unknown. Obviously, the
target audience identification task has not been completely solved, and in some cases it has not been
solved at all.</p>
      <p>
        Thus, in order to identify potential buyers, it is necessary to form a portrait of the buyer persona
and compare it with the considered objects. There are many ways of comparing objects. For instance,
the manager of a company can calculate similarities between objects. The input data of the buyer
persona and of potential customers are of mixed type. Therefore, to calculate the similarity
between them, it is advisable to use metrics designed for mixed data, such as the Voronin similarity measure,
the Zhuravlev metric, the Gower coefficient, etc. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>3.2. Review of customer segmentation methods and applications </title>
      <p>
        After identifying the target audience, the HIS can start customer segmentation. The manager can use
rule-based methods or cluster analysis methods for customer segmentation. The use of rule-based
methods implies the creation of rules or the selection of a priori thresholds for strict customer
segmentation. Such segmentation can lead to a situation where customers from one group differ
significantly. It is also quite difficult to perform segmentation in more than two dimensions.
In addition, the segmentation results tend to follow the initial assumptions of the marketer
and do not always reveal significant differences between customers. In this sense, cluster analysis
methods show higher efficiency in comparison with rule-based methods. Since they are unsupervised
machine learning methods, they do not need a training sample; they can work directly with the input
data without prior training. These methods are more practical: they divide the set of
customers into more homogeneous groups, within which the differences between customers are very small.
In addition, cluster analysis methods allow conducting dynamic clustering, which fully reflects
the state of the available data at a given time [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        There are special applications and software for customer segmentation on the market, for
example, Segmentor, the customer segmentation tool from Optimove [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], CleverTap [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], HubSpot
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], Experian [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], SproutSocial [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], Qualtrics [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], MailChimp [19], etc. In addition, there are
applications and software for customer segmentation that only use the behavioral model, for example,
Yieldify [20], Amplitude [21], Indicative [22], Mixpanel [23]. All of them certainly have their own
advantages and perform the function of customer segmentation using one or several models.
However, not all of them are free, which matters for novice entrepreneurs and startups. In
addition, the algorithms and methods they use are not described, so the user cannot
double-check the obtained results, which may lead to erroneous conclusions.
      </p>
    </sec>
    <sec id="sec-6">
      <title>3.3. Review of targeted advertising management applications and services </title>
      <p>After segmentation of the target audience, it is necessary to prepare special advertisements for
each group of customers and send them out. For this, it is advisable to use special targeted advertising
management applications and services. The article [24] provides a brief description of some of these
services. There is a huge number of such applications and services: free and paid
ones, with different functionality, and some even solve the problem of finding the target
audience in addition. Nevertheless, none of them solves all three problems together: target audience search, target
audience segmentation and targeted advertising delivery.</p>
    </sec>
    <sec id="sec-7">
      <title>4. Development of the hierarchical information system </title>
    </sec>
    <sec id="sec-8">
      <title>4.1. Formalization of the management process of the targeted advertising </title>
      <p>Let’s consider in more detail the design and use of the hierarchical information system for solving the management
task of targeted advertising. To solve the problem of determining the
target audience and segmenting potential users, it is necessary to develop a functional
model of the business process for managing targeted advertising delivery. For this, it is proposed to
use the IDEF0 methodology (Figure 1).</p>
      <p>The first step is marketing research. It allows finding, collecting and analyzing
information in order to reduce the uncertainty in making managerial decisions. For example, it is important
for a new startup to find a product or service that can be successfully implemented. For existing
companies, it is possible to assess the prospects of demand for a specific product or service based on
the study of consumer behavior. There are many methods of market research: observation,
surveys, focus groups, personal interviews, etc. Each method has its own advantages, disadvantages and
limitations and is capable of providing information of varying completeness and accuracy.</p>
      <p>The goal of the next stage is to highlight informative features that fully reflect the portrait of a
potential buyer. Next, a profile of the ideal buyer persona is created. Defining a buyer or audience
persona helps to create a product or service that better targets the ideal customer. Depending on whom the final
product is designed for, the appropriate templates are used (B2B or B2C Buyer Persona templates), as
well as the results of marketing research.</p>
      <p>[Figure 1: IDEF0 functional model of the business process for managing targeted advertising delivery. Diagram labels: activities A4 “Undertake identifying of target audience” and A5 “Conduct customer segmentation”; flows “Potential customers”, “Information from social nets”, “Non targeted audience” and “Groups of customers”; mechanisms “Classification methods” and “Clustering methods”.]</p>
      <p>The next step is to split buyers into two groups: the targeted audience and the non-targeted audience.
Various classification methods can be used for this. Based on the above analysis, the paper
proposes an approach based on calculating a measure of similarity between an ideal customer
and a potential customer. To improve the effectiveness of an advertising campaign, the target
audience can be segmented depending on the needs of potential customers. This requires solving the
clustering task according to the selected indicators. For example, when advertising sports clothes,
younger customers tend to choose bright colors, while middle-aged customers prefer convenience to
appearance. In general, the task of segmenting potential customers can be reduced to the task of
determining the target audience: the expert sets boundary values of the
similarity measure for each group. Thus, she/he can create several target groups: a core audience, several groups with
different values of the similarity measure, and non-targeted people. This approach can be applied when
finances are limited. However, the expert can miss significant differences between clients; therefore,
for a more efficient segmenting process, it is recommended to use automatic segmentation, namely
clustering methods, since these are methods of unsupervised learning.</p>
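      <p>The expert-driven grouping by boundary values described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the HIS: the function name <monospace>assign_tier</monospace> and the boundary values are hypothetical, and the similarity score is assumed to be already computed and to lie between 0 and 1.</p>
      <preformat>
```python
def assign_tier(similarity, boundaries):
    """Return the index of the first tier whose lower bound the score reaches.

    `boundaries` holds expert-chosen lower bounds in descending order
    (hypothetical values); tier 0 is the core audience, and a score that
    reaches no bound falls into the last, non-targeted tier.
    """
    for tier, bound in enumerate(boundaries):
        if similarity >= bound:
            return tier
    return len(boundaries)


# Hypothetical boundaries: core audience, two weaker groups, then non-targeted.
bounds = [0.8, 0.6, 0.4]
tiers = [assign_tier(s, bounds) for s in (0.9, 0.65, 0.1)]
```
      </preformat>
      <p>With the hypothetical bounds above, the three scores land in the core tier, the second tier and the non-targeted tier respectively. The bounds remain a manual choice, which is the weakness the clustering approach is meant to remove.</p>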
    </sec>
    <sec id="sec-9">
      <title>4.2. The model of resolving the identifying task of the target audience </title>
      <p>Consider the task of identifying potential customers. The model for solving the task is described
by the following activity diagram in Figure 2.</p>
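      <p>Let <inline-formula><tex-math>C = \{c_1, \dots, c_n\}</tex-math></inline-formula> be the set of clients selected by the manager to determine their value for the advertising
campaign. The input data for the task of identifying the target group is an ideal customer profile <inline-formula><tex-math>c_0</tex-math></inline-formula>. Let’s
specify <inline-formula><tex-math>P = \{p_1, \dots, p_m\}</tex-math></inline-formula> as a set of indicators of a potential client, and <inline-formula><tex-math>V_j</tex-math></inline-formula> as a set of values of the <inline-formula><tex-math>j</tex-math></inline-formula>-th
indicator. Then <inline-formula><tex-math>x_{0j} \in V_j</tex-math></inline-formula> and <inline-formula><tex-math>x_{ij} \in V_j</tex-math></inline-formula> are the values of the <inline-formula><tex-math>j</tex-math></inline-formula>-th indicator of the profile of the buyer
persona and of the <inline-formula><tex-math>i</tex-math></inline-formula>-th client respectively.</p>
      <p>
        Let’s designate <inline-formula><tex-math>S_i</tex-math></inline-formula> as a measure of similarity between the profile of an ideal client and the
<inline-formula><tex-math>i</tex-math></inline-formula>-th client; then <inline-formula><tex-math>s_{ij}</tex-math></inline-formula> is the similarity in the <inline-formula><tex-math>j</tex-math></inline-formula>-th indicator. Data about a potential customer can be
qualitative data, data in the form of categories, and data in the form of numbers. For the simultaneous
processing of such data, there are special proximity measures. Let us consider the calculation of the
similarity using the Gower coefficient as an example [
        <xref ref-type="bibr" rid="ref11">11, 25</xref>
        ]. To calculate the measure, it is
necessary to turn qualitative data into categorical data, and then use formula (1) for them, given here in the standard form of the Gower coefficient:
        <disp-formula><label>(1)</label><tex-math><![CDATA[s_{ij} = \begin{cases} 1, & x_{0j} = x_{ij}, \\ 0, & x_{0j} \neq x_{ij}. \end{cases}]]></tex-math></disp-formula>
      </p>
      <p>[Figure 2 flowchart: create the portrait of the buyer persona; calculate the similarity measure for mixed data between the buyer persona and all customers; set up the threshold value of the similarity measure; if the similarity measure between the buyer persona and a customer is bigger than the threshold value, the customer joins the targeted audience, otherwise the non-targeted audience.]</p>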
      <sec id="sec-9-1">
        <title>Figure 2: The model of the identifying process of the targeted audience </title>
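        <p>If the indicators of the portrait of the audience persona are quantitative, then formula (2) is applied, given here in the standard form of the Gower coefficient, where <inline-formula><tex-math>s_{ij}</tex-math></inline-formula> is the similarity in the <inline-formula><tex-math>j</tex-math></inline-formula>-th indicator, <inline-formula><tex-math>x_{0j}</tex-math></inline-formula> and <inline-formula><tex-math>x_{ij}</tex-math></inline-formula> are the values of this indicator for the buyer persona and for the <inline-formula><tex-math>i</tex-math></inline-formula>-th client, and <inline-formula><tex-math>R_j</tex-math></inline-formula> is the range of the indicator (the difference between its maximum and minimum values):
        <disp-formula><label>(2)</label><tex-math>s_{ij} = 1 - \frac{|x_{0j} - x_{ij}|}{R_j}.</tex-math></disp-formula></p>
        <p>To calculate the Gower coefficient <inline-formula><tex-math>S_i</tex-math></inline-formula> according to formula (3), it is necessary to determine <inline-formula><tex-math>w_{ij}</tex-math></inline-formula>,
the coefficient of the presence of the <inline-formula><tex-math>j</tex-math></inline-formula>-th indicator for the <inline-formula><tex-math>i</tex-math></inline-formula>-th client: <inline-formula><tex-math>w_{ij} = 0</tex-math></inline-formula> if information about the
indicator is absent in the profile of the ideal client or of the potential client, and <inline-formula><tex-math>w_{ij} = 1</tex-math></inline-formula> if all the
information is available. Here <inline-formula><tex-math>m</tex-math></inline-formula> is the number of indicators:
        <disp-formula><label>(3)</label><tex-math>S_i = \frac{\sum_{j=1}^{m} w_{ij} s_{ij}}{\sum_{j=1}^{m} w_{ij}}.</tex-math></disp-formula></p>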
        <p>Next, the manager determines the similarity values that are acceptable for the target group of
buyers and forms the corresponding sets. The obtained information allows formulating solutions for
the implementation of an advertising campaign.</p>
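        <p>The identification procedure above can be illustrated with a short Python sketch. It assumes the standard form of the Gower coefficient (exact match for categorical indicators, range-scaled absolute difference for quantitative ones, missing values skipped) and is not the authors' implementation; the function names, the <monospace>ranges</monospace> argument and the sample records are hypothetical.</p>
        <preformat>
```python
from typing import Optional, Sequence, Union

Value = Optional[Union[str, float]]  # None marks a missing value


def gower_similarity(persona: Sequence[Value], client: Sequence[Value],
                     ranges: Sequence[Optional[float]]) -> float:
    """Gower similarity between two mixed-type records.

    ranges[j] is the value range of quantitative indicator j,
    or None when indicator j is categorical.
    """
    total, weight = 0.0, 0
    for p, c, r in zip(persona, client, ranges):
        if p is None or c is None:
            continue  # presence coefficient is 0: the indicator is skipped
        if r is None:
            score = 1.0 if p == c else 0.0   # categorical: exact match
        elif r == 0:
            score = 1.0                      # constant indicator: always equal
        else:
            score = 1.0 - abs(p - c) / r     # quantitative: scaled closeness
        total += score
        weight += 1
    return total / weight if weight else 0.0


def split_audience(persona, clients, ranges, threshold):
    """Return (targeted, non_targeted) lists of client indices."""
    targeted, non_targeted = [], []
    for i, client in enumerate(clients):
        s = gower_similarity(persona, client, ranges)
        (targeted if s > threshold else non_targeted).append(i)
    return targeted, non_targeted


# Hypothetical data: income level, field of activity, age in years.
persona = ["average", "office worker", 30.0]
ranges = [None, None, 40.0]
clients = [
    ["average", "office worker", 34.0],
    ["low", "student", 20.0],
    ["average", None, 50.0],   # field of activity unknown
]
targeted, non_targeted = split_audience(persona, clients, ranges, 0.6)
```
        </preformat>
        <p>Raising or lowering the threshold passed to <monospace>split_audience</monospace> widens or narrows the target group, which mirrors the manager's choice of acceptable similarity values.</p>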
      </sec>
    </sec>
    <sec id="sec-10">
      <title>4.3. The model of resolving the segmenting task </title>
      <p>The review of related works on the segmenting task has shown the feasibility of
using clustering methods. Clustering is a machine learning technique that groups
objects according to chosen indicators. Many clustering methods are known: Affinity
Propagation, Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), K-Means Clustering, Mean
Shift Clustering, Agglomerative Hierarchical Clustering, Expectation–Maximization Clustering using
Gaussian Mixture Models, etc. [26, 27]. One of the simplest and most commonly used methods is
K-Means Clustering. Let’s consider a model for solving the task of segmenting a targeted
audience using this method (Figure 3).</p>
      <p>Suppose the manager has a hypothesis about the number of clusters or segments. It can be based on
theoretical considerations or on the results of marketing research. If the number is not obvious,
then a series of experiments with different numbers of clusters can be performed to find the optimal
partition.</p>
      <p>The K-Means algorithm starts with randomly selected cluster centers and then reassigns objects to clusters
to minimize intra-cluster variability and maximize inter-cluster variability. The main drawback of the
K-Means algorithm is that it only works with numeric values, whereas the input customer information is
mixed. Therefore, in this paper, it is proposed to use the Gower coefficient to calculate
the distances between the centers of the clusters and the objects, since it works with mixed data.</p>
      <p>[Figure 3 flowchart: define the number n of initial clusters or segments; define cluster centers; for mixed data calculate the Gower coefficient, for quantitative data calculate the Euclidean measure; assign every client to the appropriate cluster; calculate cluster centers; if the clusters’ centers are equal to the centers from the previous step, form segments of potential customers, otherwise repeat from the distance calculation.]</p>
      <sec id="sec-10-1">
        <title>Figure 3: The model of the segmenting process of the targeted audience </title>
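        <p>The K-Means algorithm is as follows. The center of each cluster is randomly determined.
Denote <inline-formula><tex-math>z_k</tex-math></inline-formula> as the center of the <inline-formula><tex-math>k</tex-math></inline-formula>-th cluster with the set of values of all indicators <inline-formula><tex-math>z_k = (z_{k1}, \dots, z_{km})</tex-math></inline-formula>.
Then <inline-formula><tex-math>|C_k|</tex-math></inline-formula> is the cardinality of the set <inline-formula><tex-math>C_k</tex-math></inline-formula> of objects of the <inline-formula><tex-math>k</tex-math></inline-formula>-th cluster. Then it is necessary to calculate
the distance <inline-formula><tex-math>d(z_k, c_i) = 1 - S(z_k, c_i)</tex-math></inline-formula> between the centers of the clusters and each object, with the similarity <inline-formula><tex-math>S</tex-math></inline-formula> computed according to formulas
(1)-(3), where <inline-formula><tex-math>C</tex-math></inline-formula> is the set of potential buyers or target audience obtained in the previous step, and <inline-formula><tex-math>c_i</tex-math></inline-formula> is a
specific buyer from this set. The object or client is assigned to the closest cluster:
        <disp-formula><label>(4)</label><tex-math><![CDATA[u_{ik} = \begin{cases} 1, & k = \arg\min_{l} d(z_l, c_i), \\ 0, & \text{otherwise}, \end{cases}]]></tex-math></disp-formula>
where <inline-formula><tex-math>u_{ik}</tex-math></inline-formula> designates the belonging of the <inline-formula><tex-math>i</tex-math></inline-formula>-th customer to the <inline-formula><tex-math>k</tex-math></inline-formula>-th cluster.</p>
        <p>After calculating the distances and assigning objects to the new cluster, it is necessary to find the
coordinates of the new center for each cluster. The new value of each indicator of the center of the
cluster is found as the arithmetic mean of all values <inline-formula><tex-math>x_{ij}</tex-math></inline-formula> of the <inline-formula><tex-math>j</tex-math></inline-formula>-th
indicator of the objects that belong to the current cluster:
        <disp-formula><label>(5)</label><tex-math>z_{kj} = \frac{1}{|C_k|} \sum_{c_i \in C_k} x_{ij}.</tex-math></disp-formula></p>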
        <p>Next, the algorithm again calculates the distance from each object to the cluster centers using
formulas (1)-(3) and assigns the objects to the nearest cluster using formula (4). The centers of gravity
of the clusters are calculated again according to (5). This process is repeated until the centers of
gravity stop “migrating” in space.</p>
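        <p>The iteration described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the distance is taken as one minus the standard Gower similarity, quantitative indicators of a center are updated by the arithmetic mean, and categorical indicators are updated by the most frequent value, which is a substitution made here because a mean of categories is undefined. The function names and the <monospace>initial</monospace> argument (indices of the starting centers) are hypothetical.</p>
        <preformat>
```python
from collections import Counter
from statistics import mean


def gower_distance(a, b, ranges):
    """1 - Gower similarity; ranges[j] is None for categorical indicators."""
    scores = []
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None:
            continue                      # missing value: indicator skipped
        if r is None:
            scores.append(1.0 if x == y else 0.0)
        else:
            scores.append(1.0 - abs(x - y) / r if r else 1.0)
    return 1.0 - (sum(scores) / len(scores) if scores else 0.0)


def update_center(members, ranges):
    """Mean for quantitative indicators, most frequent value otherwise."""
    center = []
    for j, r in enumerate(ranges):
        values = [m[j] for m in members if m[j] is not None]
        if not values:
            center.append(None)
        elif r is None:
            center.append(Counter(values).most_common(1)[0][0])
        else:
            center.append(mean(values))
    return tuple(center)


def segment(clients, ranges, initial, max_iter=100):
    """Cluster clients; `initial` lists the indices of the starting centers."""
    centers = [tuple(clients[i]) for i in initial]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each client goes to the nearest center.
        labels = [min(range(len(centers)),
                      key=lambda k: gower_distance(centers[k], c, ranges))
                  for c in clients]
        # Update step: recompute each center from its current members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [c for c, lab in zip(clients, labels) if lab == k]
            new_centers.append(update_center(members, ranges) if members else old)
        if new_centers == centers:
            break                         # centers stopped moving
        centers = new_centers
    return labels, centers
```
        </preformat>
        <p>Passing explicit starting indices instead of drawing them at random keeps the sketch reproducible and matches the practice of choosing particular clients as initial centers.</p>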
        <p>Thus, the model for target audience segmentation has been proposed.</p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>4.4. Architectural solution of the Information System </title>
      <p>The HIS design process starts with the development of the software requirement specification
(SRS). All requirements in the SRS are divided into functional and non-functional ones. Functional
requirements are subdivided into business requirements, user requirements, and system requirements.
Non-functional requirements describe how the HIS should work and which properties and quality
attributes it should have. The main requirements described in the SRS are presented in the form of a
Requirement Diagram (Figure 4).</p>
      <p>The list of functional requirements for the system is as follows:
• The HIS has to verify the correctness of the entered data.
• The HIS has to generate a list of advertising messages and advertising objects or report
an error in the process of creating advertising messages.
• The HIS has to create the advertising messages and customer groups after the customer
segmentation process is completed.
• The HIS has to provide the advertiser with access to all necessary data and documents (list of product
categories, information on discounts, rules of message formation, advertising budget,
segmentation rules, list of customer wishes, description of the segmentation method, list of
customer preferences) at the stage of formation of advertising messages.
• The HIS has to provide access to the databases (Products_DB, Customers_DB, etc.) at any
time.
• The HIS has to make it possible to work with files that have been created in other
systems.
• The HIS has to allow the advertiser to suspend his/her work and save it to a file, and then to
resume the work from the saved file.
• The HIS has to send advertising messages at the scheduled time or allow the system user
to do so manually.</p>
      <p>The result of turning the requirements into a structured solution that meets both technical and
business requirements is an architectural solution for the HIS. A Deployment Diagram is used to visualize
the topology of the physical components of the system. The proposed architectural solution for the
Hierarchical Information System for the Management of Targeted Advertising in the form of a
Deployment Diagram is shown in Figure 5.</p>
      <p>This research proposes to use the “client-server” architectural pattern for the development of the
HIS architecture. Such an architecture allows distributing the data processing function across several separate
servers. It separates the functions of storing, processing and presenting data for more efficient use.
The presentation component, or “client”, is responsible for the user interface. Application logic is
executed at the middle level of the architecture, namely the application server layer. It provides data
exchange between users and databases. The middle layer is split into two separate components to
improve the performance of the HIS:
• IIS Application Server is responsible for the application’s logic;
• IIS Mail Server is dedicated to processing advertising messages.</p>
      <p>The data layer is designed to store and manage the information processed by the HIS. It contains a
database that represents information about the domain area in the form
of formalized data in accordance with certain requirements (Figure 6).</p>
    </sec>
    <sec id="sec-12">
      <title>5. Experiments </title>
      <p>To check the performance of the developed HIS, three tasks have been solved to promote a new
sports club service: target audience identifying task, customer segmentation task and management
task of targeted advertising. The target audience has been determined based on the database of the
club’s clients and among users of the social networks Instagram and Facebook.</p>
      <p>The marketing research of the fitness services market, communication with clients and analysis
of their requests revealed the increased interest of the club’s clients in losing weight,
quick recovery after heavy physical exertion, developing muscle endurance and strength, increasing
muscle elasticity, strengthening the cardiovascular system, etc. In this regard, the management of the fitness
club decided to introduce a new type of service in this fitness center: aqua aerobics. This type
of fitness contributes to the normalization of weight, hardening of the body and strengthening its
immunity, smoothing out the manifestations of cellulite, increasing skin tone, relieving muscle and
emotional stress, neutralizing the negative effects of stress, strengthening the nervous system, and
normalizing sleep. The aforementioned marketing research made it possible to determine the set of indicators for
identifying the target audience:
• income of clients: low, average, high;
• field of activity: office worker, student, business manager,
housewife, creative activity;
• social media interests on Instagram and/or Facebook: presence of likes,
followers and subscriptions to pages with information about sports clubs, diet, weight loss,
children’s goods and communities for mothers; absence of such records; private account or
absence of an account in social networks;
• geographic location of a potential client: residents of the area where the fitness
club is located, residents of other areas;
• age: under 21 years, 21-35 years, 35-45 years, over
45 years.</p>
      <p>The management of the fitness club has created the profile of the target audience as a set of admissible values for each indicator: two values for income,
three for field of activity, two for social media interests, two for geographic location and three for age. Let’s consider several potential clients of the
aqua aerobics service. Information from the club’s databases, social networks or questionnaires is
presented as a set of indicators in Table 2. Analysis of the input data shows that some indicators of
the portrait of a potential consumer have several values. This, in turn, adds uncertainty when
deciding whether to include a particular potential client in the target group. The proposed
technology makes it possible to define the future client clearly by varying the threshold value,
which helps to enlarge or filter the customer base.</p>
      <p>Values of the input indicators  </p>
      <p>As can be seen from Table 2, customers 1-3 and 5 fall into the target group with the current threshold.
If the threshold value is increased to 0.61 to refine the set of potential customers, then only
customers 1-3 remain. Despite the fact that two of them live in a different area, their field of
employment and interests make it reasonable to offer them the considered service.</p>
      <p>The second stage of the proposed technology is solving the customer segmentation task for the potential
clients of the aqua aerobics service. Analysis of customer preferences and of the impact of aqua aerobics on
the human body shows that potential customers can be divided into 3 segments. The first segment is
focused on improving the muscles and achieving visible changes of the body, that is, this service
can be used as additional training. The aim of the second group of clients is psychological and
physical rehabilitation, that is, relieving muscle and emotional tension. The purpose of
the third segment of clients is the organization of leisure, that is, entertainment rather than strengthening
the body. The result in this case is the acquisition of new acquaintances, an interesting pastime and an
increase in self-esteem. To solve the segmentation task, the following indicators have been proposed:
• health status: poor health, average state, good health;
• social identification: business people, student, athlete, former
athlete, housewife;
• social media interests on Instagram and/or Facebook: sports clubs, diet,
weight loss, children’s goods, pools, private account or absence of an
account in social networks;
• free time: all day, weekend, weekday evenings;
• age: under 21 years, 21-35 years, 35-45 years, over
45 years;
• marital status: single, married, have kids, childless;
• regularity of training: 0-1 times per week, 2-3 times per week, more than 3
times per week;
• possible problems of the target audience: a weight problem, problems
with communication, problems with appearance, bad mood, narrow
social circle, low immunity, absence of complaints.</p>
      <p>A snippet of the input data for ten potential clients and the result of solving the segmentation
task are presented in Table 3. Clients 1-3 were chosen as the initial centers of the three clusters.
The first step of the K-means clustering method yields the values of the similarity measure between
clients 1-3 and clients 4-10, which serve as the basis for assigning each client to one of the
clusters. The next step is to calculate the new cluster centers. After the sixth iteration, the
cluster centers stabilized.</p>
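      <p>The iteration just described (assign each client to the nearest center, recompute the centers, stop when they no longer move) can be sketched as below. The sketch assumes that the indicators have already been encoded as numeric vectors and uses Euclidean distance; both the encoding and the ten sample points are hypothetical, and the distance is a stand-in for the similarity measure of the HIS. As in the text, the first three clients serve as the initial centers.</p>
      <preformat>
```python
import math


def kmeans(points, k, max_iter=100):
    """Plain K-means; the first k points serve as the initial centers."""
    centers = [list(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # the centers have stabilized
            break
        centers = new_centers
    return labels, centers


# Hypothetical two-feature encoding of ten clients (not the data of Table 3).
points = [
    (1.0, 1.0), (5.0, 5.0), (9.0, 1.0),   # clients 1-3: initial centers
    (1.5, 1.2), (0.8, 0.7), (5.2, 4.8),   # clients 4-6
    (4.7, 5.3), (9.1, 1.4), (8.6, 0.9),   # clients 7-9
    (5.1, 5.1),                           # client 10
]

labels, centers = kmeans(points, k=3)
print(labels)
```
      </preformat>
      <p>On real indicator data the number of iterations until stabilization depends on the encoding and on the choice of initial centers.</p>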
      <p>An analysis of the content of each segment shows that the first cluster includes middle-aged
housewives with weight problems and a former athlete. The values of the other indicators suggest
that this segment of clients is oriented towards rehabilitation, support and restoration of
health. The second cluster of potential customers includes people who are seriously into sports.
For this group, it is necessary to create special training programs. The third segment is a group of
young people, mostly unmarried, who are interested not only in sports, but also in a pleasant and
interesting pastime.</p>
      <sec id="sec-12-1">
        <title>Table 3 </title>
        <sec id="sec-12-1-1">
          <title>The result of the segmenting process of the targeted audience </title>
        </sec>
        <sec id="sec-12-1-2">
          <title>Values of the input indicators  </title>
          <p>Thus, the result obtained in practice shows the feasibility of using the technology proposed in
the article in real conditions to create a target audience.</p>
          <p>6. Discussion</p>
          <p>The efficiency of the HIS was estimated separately for each task: the target audience
identification task, the customer segmentation task and the targeted advertising management task. A
classification method based on calculating a similarity measure was used for the target audience
identification task. Therefore, the standard classification metrics Precision and Recall were chosen
to assess the effectiveness of the HIS. The confusion matrix with the corresponding indicators for
calculating these metrics is presented in Table 4.</p>
        </sec>
      </sec>
      <sec id="sec-12-2">
        <title>Table 4 </title>
        <table-wrap>
          <caption>
            <title>Confusion matrix for Precision and Recall metrics evaluation</title>
          </caption>
          <table>
            <thead>
              <tr>
                <th />
                <th>Actual Positive</th>
                <th>Actual Negative</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Predicted Positive</td>
                <td>TP = 1123</td>
                <td>FP = 41</td>
              </tr>
              <tr>
                <td>Predicted Negative</td>
                <td>FN = 77</td>
                <td>TN = 429</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <sec id="sec-12-2-7">
          <p>The data for the matrix result from comparing the output of the HIS with the opinion of an
expert. The classifier assigned 1164 of the 1670 clients to the target audience. It incorrectly
attributed 41 clients to the target audience and failed to attribute 77 clients who belong to the
target audience. The formulas for calculating the Precision and Recall metrics and their values are
presented below:</p>
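          <p>As a worked check, the standard definitions of Precision and Recall applied to the counts of Table 4 (TP = 1123, FP = 41, FN = 77) give the values below. The rounding to three decimal places is an assumption of this sketch.</p>
          <preformat>
```latex
\[
\mathrm{Precision} = \frac{TP}{TP + FP} = \frac{1123}{1123 + 41} = \frac{1123}{1164} \approx 0.965
\]
\[
\mathrm{Recall} = \frac{TP}{TP + FN} = \frac{1123}{1123 + 77} = \frac{1123}{1200} \approx 0.936
\]
```
          </preformat>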
          <p>The values of the Precision and Recall metrics are high and close to one, which
indicates effective classification.</p>
          <p>The K-means clustering algorithm was used to solve the customer segmentation task. The Rand
index (RI) was chosen to evaluate the quality of the clustering. The RI measures the agreement
between two partitions of the same data by comparing, for every pair of objects, whether the pair is
assigned to the same or to different clusters in the predicted and in the true partition. The input
data for clustering were the 1200 clients of the target audience. The contingency table with the
indicator values for calculating the RI is presented in Table 5.</p>
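          <p>The pairwise comparison behind the Rand index can be sketched as follows. The function counts the pairs on which the expert partition and the system partition agree; the two short label lists are hypothetical and only illustrate the computation, they are not the data behind Table 5.</p>
          <preformat>
```python
from itertools import combinations


def rand_index(true_labels, predicted_labels):
    """Share of object pairs on which two partitions agree."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have equal length")
    agree = 0
    total = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = predicted_labels[i] == predicted_labels[j]
        # A pair agrees when both partitions keep it together or both split it.
        if same_class == same_cluster:
            agree += 1
        total += 1
    if total == 0:
        raise ValueError("at least two objects are required")
    return agree / total


# Hypothetical labels for six clients.
expert = [0, 0, 0, 1, 1, 1]
system = [0, 0, 1, 1, 2, 2]

print(round(rand_index(expert, system), 4))
```
          </preformat>
          <p>The index depends only on which objects are grouped together, so renaming the clusters does not change its value.</p>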
        </sec>
      </sec>
      <sec id="sec-12-3">
        <title>Table 5 </title>
        <sec id="sec-12-3-1">
          <title>Contingency table for Rand index evaluation </title>
          <p> </p>
        </sec>
        <sec id="sec-12-3-2">
          <title>Same class </title>
        </sec>
        <sec id="sec-12-3-3">
          <title>Different classes </title>
          <p>An expert compared the clustering results of the HIS with the expert's own distribution of
clients into groups. The formula for calculating the Rand index and its value are presented below:
 </p>
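          <p>For reference, the Rand index in its standard form is given below. The symbols are assumed notation for this sketch: a is the number of pairs placed in the same class and the same cluster, d the number of pairs placed in different classes and different clusters, and b and c the two kinds of disagreeing pairs. For n = 1200 clients the denominator is the total number of pairs.</p>
          <preformat>
```latex
\[
RI = \frac{a + d}{a + b + c + d} = \frac{a + d}{\binom{n}{2}}, \qquad
\binom{1200}{2} = \frac{1200 \cdot 1199}{2} = 719\,400
\]
```
          </preformat>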
          <p>The obtained value of the RI is close to one. It indicates an almost complete coincidence of
clusters and classes and thus a high clustering efficiency.</p>
          <p>The conversion rate (CR) was used to evaluate the effectiveness of the targeted advertising
mailing. The target audience of 1200 clients was split into two clusters: the first included 418
clients and the second 782. Different advertisements about the new service of the sports club were
prepared for the clients of each cluster. The data set was compiled from the customers' actions
within a month from the date the advertisements were sent. The formula for calculating CR and the CR
values for the two clusters are presented below:
 </p>
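          <p>The conversion rate is commonly computed as the share of recipients who performed the target action, expressed in percent. The sketch below assumes this common definition. The cluster sizes are taken from the text, while the numbers of converted clients are hypothetical placeholders, not the results of the experiment.</p>
          <preformat>
```python
def conversion_rate(conversions, recipients):
    """Conversion rate in percent: converted clients per 100 recipients."""
    if not recipients > 0:
        raise ValueError("recipients must be positive")
    return 100.0 * conversions / recipients


cluster_sizes = {"cluster 1": 418, "cluster 2": 782}  # sizes from the text
conversions = {"cluster 1": 50, "cluster 2": 60}      # hypothetical counts

for name, size in cluster_sizes.items():
    print(name, round(conversion_rate(conversions[name], size), 2))
```
          </preformat>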
          <p>In SMM, an advertising campaign is considered successful if the CR reaches 3-5%.
In our case, the rather high CR values can be explained by the fact that the target audience included
both existing customers from the sports club's database and new customers found in social networks.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-13">
      <title>7. Conclusion </title>
      <p>In this study, an approach was proposed for dividing potential customers into a non-target and
a target audience, with additional segmentation of the latter for a more effective advertising
campaign. The methods and existing applications for solving this task were considered. A model for
identifying the target audience and a model of the target audience segmentation process were
developed. The architecture of the HIS was developed on the basis of the client-server
architectural pattern. The experiment and the performance assessment of the HIS have shown the
feasibility of using the developed HIS in real conditions by managers who conduct advertising
campaigns in order to attract new customers and improve the financial condition of the enterprise.
</p>
    </sec>
  </body>
</article>