=Paper= {{Paper |id=Vol-2833/Paper_7 |storemode=property |title=Modeling and Forecasting Dynamic Factors of Pricing in E-commerce |pdfUrl=https://ceur-ws.org/Vol-2833/Paper_7.pdf |volume=Vol-2833 |authors=Galyna Chornous,Yaroslava Horbunova |dblpUrl=https://dblp.org/rec/conf/iti2/ChornousH20 }} ==Modeling and Forecasting Dynamic Factors of Pricing in E-commerce== https://ceur-ws.org/Vol-2833/Paper_7.pdf
Modeling and Forecasting Dynamic Factors of Pricing in E-commerce
Galyna Chornous, Yaroslava Horbunova
Taras Shevchenko National University of Kyiv, Faculty of Economics, Department of Economic Cybernetics,
 Vasylkivska str. 90a, Kyiv, 03022, Ukraine


                 Abstract
                 The rapid development of information technologies drives the search for methods to
                 increase the profits of e-commerce companies. Big Data technologies are conducive to
                 developing varieties of personalized pricing strategies, in particular dynamic pricing,
                 a process of setting prices for goods and services in which prices change synchronously
                 for all consumers depending on demand under current market conditions. The
                 implementation of dynamic pricing is directly related to the use of software capable of
                 performing data mining. The important tasks are to study the successful experience of
                 powerful e-commerce platforms with the analytical tools they use, and to expand the
                 toolkit supporting personalized pricing strategies with methods that work with Big Data.
                 The purpose of this research is to study the experience of modeling dynamic pricing and
                 to propose PLS regression both to confirm the importance of dynamic factors and to
                 predict prices when this personalized pricing strategy is implemented in e-commerce. The
                 research studies approaches to pricing in e-commerce, identifies the features of
                 different types of personalized pricing (dynamic, customized, transactional), and
                 explores approaches to modeling dynamic pricing. PLS regression is proposed for
                 predicting the impact of dynamic factors on price. The study places special emphasis on
                 identifying the dynamic factors that have become increasingly important in recent years.
                 Modeling results confirm that linear regression cannot identify hidden predictors among
                 a large number of collinear factors (dynamic factors are often hidden and collinear),
                 whereas PLS regression can. The proposed approach is implemented using data from the
                 vacation rental e-platform Airbnb.

                 Keywords
                 dynamic pricing, e-commerce, dynamic pricing modeling, dynamic factors of pricing, PLS
                 regression, Big Data, information economy

1. Introduction

   The rapid development of technology, together with the growth of the information sector of the
economy, drives the search for methods to increase a company's profits or reduce its costs. A
promising area of the information economy is e-commerce. Interest in the e-commerce market increases
every year. There are objective reasons for this: the growing Internet audience, the growing share of
online sales in total sales, and the development of social and mobile networks. The COVID-19 pandemic
has also been a factor in the significant growth of this sector of the economy.
   One of the most widely used modern types of pricing in e-commerce is personalized or customized
pricing, meaning dynamic price adjustment for consumers depending on the value that these customers
attribute to the commodities. Personalized pricing is a technology-based pricing system in which
different prices are set for the same goods for different consumers. The development of Big Data
technologies has contributed to the spread of the personalized pricing strategy, as it has given
companies the opportunity to analyze many factors of the Internet environment in real time, such as
customer loyalty, purchase history on the site, consumer preferences, and more.

IT&I-2020 Information Technology and Interactions, December 02–03, 2020, KNU Taras Shevchenko, Kyiv, Ukraine
EMAIL: gach.2012@gmail.com (G. Chornous); yaroslava.gorbunova@gmail.com (Y. Horbunova)
ORCID: 0000-0003-4889-1247 (G. Chornous)
©️ 2020 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

    One of the types of personalized pricing is dynamic pricing. This type of pricing is based on a
process of setting prices for goods and services in which prices change synchronously for all
consumers depending on demand under current market conditions.
    The number of e-commerce retailers has grown over the past few years. With increasing
competition, they face the difficult task of maximizing profits while maintaining price
competitiveness. Dynamic pricing is a rational and effective solution to this problem because it takes
into account changes in supply and demand and recommends an effective pricing structure. If this
pricing strategy is implemented over a long period, it can significantly increase the overall revenue
and profitability of the business.
    The implementation of dynamic pricing approaches is directly related to the use of appropriate
software capable of performing data mining. In recent years, arrays of information on goods,
consumers, and market conditions have become increasingly large. Therefore, on the one hand, the
task is to study the successful experience of powerful e-commerce platforms with the analytical
tools they use, and, on the other hand, to expand the toolkit supporting personalized pricing
strategies with methods that work with Big Data.
    The purpose of this research is to study the experience of modeling dynamic pricing and to
propose the PLS model both to confirm the importance of dynamic factors and to predict prices when
this personalized pricing strategy is implemented in e-commerce.
    The outlined goal requires the solution of the following research tasks:
    - Investigate approaches to pricing in e-commerce and show the prospects of personalized
      pricing.
    - Identify the features of dynamic pricing and explore approaches to its modeling.
    - Propose prospective models for predicting the impact of dynamic factors on price.
    - Develop a dynamic pricing model to identify the significant factors that determine price
      dependency.
    The results of this study can be used by companies that apply or plan to apply a dynamic pricing
strategy. They can also help researchers who are interested in understanding the dynamic factors
that can significantly affect price changes.

2. Research review on pricing in e-commerce
    Many different approaches to pricing are used in e-commerce. In [1], Tanir identifies the
following types: cost-based pricing (the company determines the unit costs of each product and sets
a target margin for each of them); market pricing (the company follows competitors' prices and
offers prices slightly lower than theirs); dynamic pricing (the company sets the optimal price for a
specific request in a specific period of time, in real time); consumer or behavioral pricing (the
company uses real-time data and purchase history to segment its audience accurately); complete
pricing (the company combines products of the same nature as complementary goods, which increases
the average order value, while the bundle costs less than buying all units separately); and market
capture (the company sets the price of a new product below the market average and thus captures
this market segment).
    Among the first studies devoted to dynamic pricing was the work of Weatherford and Bodily [2].
In that paper, dynamic pricing was presented as a tool for determining the optimal price of a
product. Further theoretical principles, methods, and approaches used for implementation are
discussed in the works of Den Boer [3], Dolgui and Proth [4], Faruqui [5], Kannan and Kopalle [6],
Sahay [7], and Tong, Wang and Zhou [8].


    A broader term than dynamic pricing is ‘customized’ or ‘personalized’ pricing. The theoretical
foundations of personalized pricing are described in Obermiller, Arnesen, and Cohen [9]. This type of
pricing is defined as a strategy in which “identical products are delivered, regardless of time or
situation, to different consumers at different prices” [9, p. 14]. Elmachtoub, Gupta, and Hamilton [10]
define customized pricing as pricing that is based on personal information about the customer. In
this case, it is appropriate to use machine learning methods to determine personal qualities.
    The Office of Fair Trading [11] and the Organization for Economic Co-operation and
Development [12] consider that customized pricing is related to price discrimination.
    According to Iarmolenko and Chornous [13], transactional pricing, as a type of customized pricing
which focuses on direct and indirect transaction features, is used only in the e-commerce market and
can minimize price discrimination.
    The relationship between different types of pricing in the information economy is analyzed in [14].
This work presents the peculiar properties of the category ‘dynamic pricing’ in the papers of various
researchers. According to the authors, the concept of customized pricing is broader than dynamic or
transactional ones. It is not limited to a particular market and is defined by the widest variety of
factors that affect pricing. Dynamic pricing does not take into account the subjective characteristics of
customers. Transactional pricing narrows the category of customized pricing to the e-commerce
market.
    Bichler uses the term ‘flexible pricing’ in the research [15]. He distinguishes between
differentiated pricing (when different buyers are assigned different prices based on expected price
values) and dynamic pricing (prices are formed during trading between market participants). In the
author’s opinion, flexible pricing covers both of these categories.
    According to the research review, researchers have reached no consensus on the boundaries of the
definition of ‘dynamic pricing’. There is no clearly defined list of parameters whose variation will
maximize profit [14]. For instance, Srivastava [16] showed that dynamic pricing includes two
aspects: price variance and price discrimination. Price variance can be spatial or temporal. With
spatial variance, several sellers offer the same item at different prices. With temporal variance,
the store changes its price for a given product over time, based on the time of sale and the state
of supply and demand. The other aspect of dynamic pricing is differentiated pricing, or price
discrimination, in which different consumers are charged different prices for the same products.
    Belyh [17] identifies the following types of dynamic pricing, each suited to a specific market
problem.
    1. Segmented pricing: customers are divided into segments by certain factors (price, quality,
    etc.).
    2. Time-based pricing: inherent in companies that earn on fast service and work around the
    clock.
    3. Changing market conditions: frequent changes in market conditions require companies to
    adapt quickly.
    4. Peak pricing: most effective for peak periods in all industries.
    5. Penetration pricing.
    In this study, in defining ‘dynamic pricing’, the authors emphasize the synchronous change of
prices in order to separate the concept clearly from the category of ‘customized pricing’ and to
avoid associating it with price discrimination. Therefore, segmented pricing is not considered.
The authors also decided not to limit the goals of dynamic pricing to the financial goals of the
company but to include reputational goals as well. Maximizing profits, especially over a limited
period of time, is not an entirely sound goal: a wrong pricing strategy can have negative
consequences in future periods. One of the most difficult aspects of a dynamic pricing strategy is
determining consumers' willingness to pay. In addition, the cost of setting such prices can be very
high. To a large extent, this problem has been overcome through the use of the global network and
Big Data technologies.

3. Related work in modeling dynamic pricing
    The idea of dynamic prices is very simple: the company should adjust prices to demand. Its
implementation, however, is much more difficult. To realize this idea, the company uses different
pricing approaches. Dynamic pricing algorithms provide flexibility because e-commerce companies
can set prices based on different customer groups. The main idea is to offer the best price based on
market trends, fluctuations in demand, customer behavior, purchasing power, and many other factors.
    Among the classical pricing models, parameterization of the demand function stands out. The
essence of this model is that the form of the demand function is known, and it is only necessary to
find, from historical data, the parameters of this function that satisfy the conditions of the
problem. The difficulty is that there are about a dozen different possible types of demand function.
These functions are constantly influenced by macroeconomic and psychological factors, consumer
preferences, changes in fashion, and so on, and these non-price factors affect the form of the
function. Bertsimas and Perakis [18], Boer [19], Carvalho and Puterman [20], and Lobo and Boyd [21]
studied the parametric model.
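    The parametric approach can be illustrated with a minimal sketch. It assumes, purely for
illustration, a linear demand function d(p) = a - b·p; the observations are invented and are not
taken from the cited works. The parameters are estimated by ordinary least squares, and the price
that maximizes revenue p·d(p) is then a/(2b).

```python
def fit_linear_demand(prices, quantities):
    """Least-squares fit of the assumed demand form q = a - b * p; returns (a, b)."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_q = sum(quantities) / n
    s_pq = sum((p - mean_p) * (q - mean_q) for p, q in zip(prices, quantities))
    s_pp = sum((p - mean_p) ** 2 for p in prices)
    slope = s_pq / s_pp                 # negative when demand falls with price
    intercept = mean_q - slope * mean_p
    return intercept, -slope


def revenue_maximizing_price(a, b):
    """Maximizer of p * (a - b * p), obtained from the first-order condition a - 2bp = 0."""
    return a / (2 * b)


# Hypothetical historical observations (they lie exactly on q = 200 - 2p).
prices = [40.0, 50.0, 60.0, 70.0, 80.0]
quantities = [120.0, 100.0, 80.0, 60.0, 40.0]

a, b = fit_linear_demand(prices, quantities)
best_price = revenue_maximizing_price(a, b)
print(a, b, best_price)
```

    If the true demand has a different functional form, the same two steps apply, but both the
fitting procedure and the optimal-price formula change, which is exactly the weakness of the
parametric model noted above.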
    The nonparametric approach is the opposite of the parametric one. In this type of model, price
optimization occurs without constructing the demand function. Many methods are used for price
optimization, among them stochastic approximation, Bayesian inference, and models based on the
Poisson process. The application of nonparametric approaches is described in Parkhimenko's paper
[22].
    Non-parametric approaches are now very popular. Non-parametric models include economic and
mathematical models and data mining. Researchers have substantiated the effectiveness of
statistical and econometric approaches and have described different models for modeling and
forecasting dynamic and personalized pricing. Carta, Medda, Recupero, and Saia considered ARIMA
models [23]; Liu, Liu, and Chan used stochastic differential equations [24]; Tseng, Lin, Zhou,
Kurniajaya, and Li suggested signal processing methods [25]; and Tyralis and Papacharalampous
presented the random forest method [26].
    In dynamic pricing, in addition to complex approaches based on mathematical modeling, there are
other algorithms: the price changes depending on the offer of competitors based on price parsing
(automatic monitoring of prices for competitors' products).
    The model based on inventories was studied by Chan, Shen, Simchi-Levi, Swann [27] and by
Elmaghraby, Keskinocak [28]. In that approach, pricing decisions are based on the firm's inventory
levels. The problem of dynamic pricing can be defined as the problem of maximizing the company's
total revenue net of production costs and inventory storage costs. The auction model assumes that
the price is set by the auction participants based on the value of the good for each agent. This
model was studied by Bichler [15] and Elmaghraby [29].
    Game models can also be used for dynamic pricing. Such models are appropriate in a highly
competitive economy in which sellers compete for the same group of buyers. Various ways of computer
modeling of prices in such cases have been considered. Models based on cooperation between e-market
players are the most realistic and bring the most benefit to market agents. Federgruen, Bernstein [30]
and Cao, Shen, Milito, Wirth [31] studied game-theoretic models. An e-commerce company can
combine several models and use them for modeling and forecasting dynamic pricing.

4. Forecasting dynamic factors of pricing based on models of Big Data
   analysis
   The last decade has shown an unprecedented increase in data in all areas of economic activity,
including e-commerce. The amount of data that electronic platforms collect on the activities of
customers is constantly growing. Importantly, the set of factors influencing price formation has
expanded considerably. That is why it is crucial to know which factors can be included in models of
dynamic pricing. Involving more factors helps to increase the accuracy of the forecasts.
   The model based on the use of Big Data technology in the Internet environment is considered by
Yang, Zhao, Xing [32]. The model involves methods that analyze statistics or other parameters to
determine consumer preferences and optimal dynamic prices. The technologies used include data
mining for decision making (association rule mining, classification, cluster analysis, regression
analysis, detection and analysis of deviations, etc.), machine learning, simulation, statistical
analysis, time series analysis, etc. To implement these technologies, various software products are
used, including R, Python, MATLAB, NoSQL, Hadoop, etc.
    A special place among the models of dynamic pricing is occupied by machine learning models,
which analyze large arrays of information. The main parameters of the market environment are
usually subject to constant dynamic change, so it is impossible to predict every path along which
the market system may develop. Computer technology and Big Data technology have made it possible
to predict the main trends in market development and to design pricing models for them. Research
in this area was carried out by Gupta, Ravikumar, Kumar [33] and Zagaynova [34].
    The main purpose of dynamic pricing is to respond in real time to changes in the factors that
affect the price. In the approach described in this paper, these factors are divided into two
groups: dynamic factors (whose values change over time and depend on market conditions) and others
(characteristics of the product or service and characteristics of customers). Importantly, the
number of dynamic factors available for observation in the information economy, specifically in
e-commerce, is growing as rapidly as the set of behavioral characteristics of customers.
    A distinctive feature of dynamic factors is that they are often collinear, which makes it
difficult to determine which of them should be included in the model and which excluded. Dynamic
factors contain latent variables whose impact on the price is difficult to investigate using
classical econometric or statistical models.
    Promising models that can effectively predict changes in model factors are, for example, ridge
regression and neural networks, but neither accounts for the hidden factors that are most useful in
modeling. PLS regression (Partial Least Squares, or Projection to Latent Structures) does not have
this disadvantage. PLS regression decomposes the original predictors along the axes of the
principal components and also selects a subset of latent variables in which the relationship
between the dependent variable and the predictors reaches its maximum. The research [35] provides a
comprehensive overview of Partial Least Squares (PLS) methods with a discussion of the directions
of current research and perspectives.
    The general underlying model of multivariate PLS is
                                         $X = T P^{\top} + E$,                                      (1)
                                         $Y = U Q^{\top} + F$,                                      (2)
    where X is the n×m matrix of predictors; Y is the n×p matrix of responses; T and U are the n×l
matrices of projections (scores) of X and of Y, respectively; P is the m×l orthogonal loading
matrix; Q is the p×l orthogonal loading matrix; and the matrices E and F are the error terms,
assumed to be independent and identically distributed random normal variables. The decompositions
of X and Y are made so as to maximize the covariance between T and U.
    PLS regression is a useful method when the factors are many and highly collinear. The growing
number of dynamic characteristics of price formation in e-commerce means a growing number of model
factors that correlate with each other. There are many obvious factors. However, there may be only
a few key, or latent, factors that explain much of the variation in the response. The main idea of
PLS regression is to extract these latent factors. Thus, the PLS model should be used to determine
the most influential dynamic factors in the modeling of dynamic pricing.
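    The extraction of a latent factor can be sketched for the single-response case (p = 1) with the
NIPALS variant usually called PLS1. This is an illustrative sketch in Python, not the
implementation used in this study: the function names and the toy data are assumptions. Each
iteration produces one column of T (scores), one column of P (loadings), and one coefficient of Q
from (1)-(2), and then deflates X and the response.

```python
import math


def center(column):
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1(columns, y, n_components=1):
    """NIPALS PLS1; `columns` is a list of predictor columns, `y` a single response."""
    X = [center(c) for c in columns]
    r = center(y)
    components = []
    for _ in range(n_components):
        w = [dot(c, r) for c in X]          # direction of maximal covariance with the response
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                    # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(wj * c[i] for wj, c in zip(w, X)) for i in range(len(r))]  # scores
        tt = dot(t, t)
        p = [dot(c, t) / tt for c in X]     # loadings of X on the scores
        q = dot(r, t) / tt                  # regression of the response on the scores
        X = [[x - ti * pj for x, ti in zip(c, t)] for c, pj in zip(X, p)]   # deflate X
        r = [ri - q * ti for ri, ti in zip(r, t)]                           # deflate response
        components.append({"weights": w, "scores": t, "loadings": p, "q": q})
    return components, r


# Toy data: two perfectly collinear predictors driven by one hidden factor.
latent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_a = latent[:]
x_b = [2.0 * z for z in latent]
price = [3.0 * z for z in latent]

components, residual = pls1([x_a, x_b], price, n_components=1)
weights = components[0]["weights"]
residual_ss = dot(residual, residual)
print(weights, residual_ss)
```

    With these data the normal equations of ordinary least squares are singular, because the
second predictor is an exact multiple of the first, while a single PLS component recovers the
hidden factor and leaves a residual that is zero up to rounding.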
    Until the advent of Big Data, this approach was not used to establish the set of important
factors that affect dynamic pricing. The current volumes of information resources make the PLS
model both feasible and promising for forecasting in the field of e-commerce.
    In the next section, we demonstrate the implementation of this approach in comparison with
classical linear regression. We use the booking service Airbnb to study models of dynamic pricing.
A number of studies [36-37] have already considered this service as an object for developing a
flexible pricing system. The purpose of dynamic pricing there was to obtain the maximum profit for
each future booking date. The article [36] presents three pricing models. The reservation
probability model, a binary classification model, gives a probabilistic forecast of whether an
apartment will be reserved on each night. Based on this model and two indicators (price decrease
recall and booking regret), a pricing strategy model is built. The model is based on historical
data on booking prices for a particular night.

    Based on information from the booking probability model, pricing strategy model, and
quantification, customers were offered personalized prices that satisfied them as consumers and
allowed the company to maximize profits.
    In this study, we are going to develop applied approaches through the use of the PLS regression to
analyze the impact of dynamic factors on price change and to show the prospects for the widespread
introduction of this method for pricing and price forecasting in e-commerce.

5. Modeling of dynamic pricing in booking accommodation

   We examined data from the popular vacation rental e-platform Airbnb [38] to model dynamic
pricing and to determine the factors that most influence price formation.
   Airbnb is an American online vacation rental platform designed for viewing and booking
accommodation for rent. It is a marketplace where property owners and renters interact. The online
platform connects owners and travelers and facilitates the rental process. In addition, this
service develops the sharing economy, allowing property owners to rent out private apartments.
   The main task of the modeling is to find the factors that determine the rental price and their
significance, and to identify the dynamic factors that are most influential in dynamic pricing when
booking accommodation. Data analysis and modeling were performed in the RStudio software
environment.

    5.1.        Data preparation

   To build the model, we considered data on future bookings of vacation rentals in Madrid
(17.03.2020-17.03.2021), which contained the price of accommodation and information about
available housing (the ‘Calendar’ table) and a description of the housing (type of housing, host
rate, reviews, accommodation, and other characteristics; the ‘Listing’ table). The dataset was
formed by combining the two tables (Figure 1).



Figure 1: Data connection scheme

    The data needed to study pricing on the Airbnb platform contain a significant amount of
categorical data. Some of the data can be transformed into Logical or Numeric types (for example,
the presence or absence of certain rooms, facilities, and devices; proximity to the sights of the
city), while some important data (for example, guest feedback) can be included in the study only
after sentiment analysis. In addition, the data contained quite a few missing values. Because
imputing missing values can distort the results, we removed the rows and columns that contained a
large number of missing values. The appropriate values of the categorical factors were converted to
Logical (0/1) and Numeric types.
    In the next step, we formed several new factors. For example, among the non-dynamic factors,
the distance to the city center was calculated from the coordinates of the housing location.
Because an important task of the modeling is to determine the influence of dynamic factors, the set
of such factors was expanded. Variables were introduced for the number of days remaining before
check-in: x1 (less than 290 days) and x2 (more than 290 and less than 360 days). The threshold of
290 days is justified by a significant jump in price at this point, which corresponds to the
beginning of the new year relative to the date of the study (March 17, 2020). We also singled out
the day of the week when housing was ordered (0 for a weekday, 1 for a day off or holiday) and, on
the same principle, the day of the week for which housing was booked. A column was added with the
popularity of housing, showing the percentage of booked housing per month out of all housing
offered by a particular owner. After data processing, the dataset contained 23 columns (of the
original 71) and 639,999 rows. Table 1 describes the main factors of the processed dataset.
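
    The derived factors described above can be sketched as follows. The study itself was carried
out in RStudio; this Python sketch is only an illustration, and the function names, the
coordinates taken for the city center (Puerta del Sol), and the use of the haversine formula for
the distance are assumptions that the text does not specify.

```python
import math
from datetime import date

STUDY_DATE = date(2020, 3, 17)        # date of the study, as stated in the text
MADRID_CENTER = (40.4168, -3.7038)    # assumed city-center coordinates (Puerta del Sol)
EARTH_RADIUS_KM = 6371.0


def lead_time_flags(stay_date):
    """Return (x1, x2) for the number of days between the study date and the stay."""
    days = (stay_date - STUDY_DATE).days
    x1 = int(days < 290)
    x2 = int(290 < days < 360)        # mirrors the text: exactly 290 days sets neither flag
    return x1, x2


def is_day_off(day):
    """1 for Saturday or Sunday, 0 otherwise; public holidays are not handled here."""
    return int(day.weekday() >= 5)


def distance_km(lat, lon, center=MADRID_CENTER):
    """Great-circle (haversine) distance from a listing to the assumed city center."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat, lon, center[0], center[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


print(lead_time_flags(date(2020, 12, 31)))   # 289 days ahead
print(lead_time_flags(date(2021, 1, 2)))     # 291 days ahead
print(is_day_off(date(2020, 3, 21)))         # a Saturday
print(round(distance_km(40.5168, -3.7038), 2))
```

    January 1, 2021 falls exactly 290 days after the study date, which is consistent with the
threshold marking the beginning of the new year.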

    5.2.        Data analysis
    To gain an initial understanding of the data, we performed a visual analysis of the data on
booking accommodation in Madrid. For the analysis, we used the RStudio software environment and
such tools as bar and line charts, histograms, boxplots, scatter plots, a correlation matrix, a
map, and others. The analysis led to the following conclusions.
    The highest demand for housing was observed in the center of Madrid, with slightly lower demand
in the north and north-east. Housing in the western and southern areas is booked least. In all
areas of Madrid, apartments clearly dominate bookings (60% to 89%). The closer the area is to the
center, the higher the percentage of apartment bookings. Private houses take second place in terms
of bookings, both in the central and in the outlying districts. The highest prices are in the
central districts.

Table 1
Data types
 Data                      Type         Description
 listing_id                Numeric      Booking code
 Date                      Date         The date on which the accommodation is booked or
                                        available for booking
 Available                 Logical      Available or booked accommodation on a specific date
 host_acceptance_rate      Logical      The host acceptance rate
 is_location_exact         Logical      The correctness of the specified location
 accommodates              Numeric      The capacity of housing
 guests_included           Numeric      Number of guests
 number_of_reviews         Numeric      Number of reviews
 review_scores_rating      Numeric      Total rating by reviews
 instant_bookable          Logical      The time rate of booking
 TV                        Logical      Availability of TV
 Internet                  Logical      Availability of the Internet
 AirCondition              Logical      Availability of air conditioning
 Pets                      Logical      Opportunity to live with animals
 Kitchen                   Logical      Availability of kitchen
 Breakfast                 Logical      Breakfast included
 Weekday1                  Logical      The day of the week when the order is placed
 Weekday2                  Logical      The day of the week on which the booking is made
 distance                  Numeric      Distance to the center
 x1                        Logical      Less than 290 days before booking
 x2                        Logical      More than 290 and less than 360 days before booking
 Price                     Numeric      Price
 i.available               Numeric      Percentage of booked housing per month

    Analysis by the day of the week of the stay shows that demand is highest on Saturdays and
Sundays, which can be explained by the fact that people have days off. The lowest prices are on
Thursdays. Surprisingly, low prices are also observed on Fridays. The data also show a sharp spike
in prices at the end of June. This can be explained by the fact that the famous national holiday
of San Juan, which attracts tourists, occurs during this period in Spain. The rise in prices at
the end of the year is due to tourists coming to celebrate the New Year in Madrid.
    The trend is similar for the dependency of the price on the day on which the order is placed.
Ordering on weekends costs more, whereas low prices are observed when looking for housing on
Mondays.
    The price of accommodation rises with the number of bathrooms up to five, after which it
decreases. The number of rooms in an apartment shows a similar dependence, but prices increase
more slowly. As for housing capacity, the lowest prices are for housing that accommodates 1 person
or 9 people, and the highest prices are for housing that accommodates 6 or 12 people.
   The correlation analysis of the variables demonstrates a high correlation (0.7 and higher) for
almost half of the factors. Among the collinear factors, almost 70% are dynamic factors.
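
    Such a screening of collinear factors can be sketched as below. The 0.7 threshold comes from
the text; the column names and values are invented for illustration, and comparing the absolute
value of the Pearson coefficient with the threshold is an assumption.

```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def collinear_pairs(table, threshold=0.7):
    """List (name_a, name_b, r) for every pair of columns with |r| >= threshold."""
    pairs = []
    for name_a, name_b in combinations(table, 2):
        r = pearson(table[name_a], table[name_b])
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Hypothetical columns: a raw lead time, the flag derived from it, and a distance.
table = {
    "days_ahead": [10, 50, 100, 200, 300, 350],
    "x1": [1, 1, 1, 1, 0, 0],
    "distance": [2.0, 5.0, 1.0, 4.0, 3.0, 6.0],
}
pairs = collinear_pairs(table)
print(pairs)
```

    In this toy table only the lead time and the flag derived from it exceed the threshold, which
mirrors the observation that derived dynamic factors tend to be collinear.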

    5.3.         Modeling of dynamic pricing and interpretation of results

    The main dynamic factors included in the models are the day of the week when the order was
made, the booked day of the week, the number of days before booking, the number of reviews, the
total rating by reviews, the time rate of accommodation booking, and the percentage of
accommodation booked with the same owner during the month. The non-dynamic factors include the
distance to the center, the capacity of housing, the availability of a kitchen, the Internet, air
conditioning, and TV, the inclusion of breakfast, and so on.
    We determined the significance of the factors using linear regression and the PLS model. Their use
provides an answer to the question of whether dynamic factors significantly affect price formation.
    Linear regression is a standard and reliable tool for finding linear relationships between
factors, but it has many limitations and disadvantages, especially in a Big Data environment. The
PLS model should help in finding latent pricing factors. It is particularly suitable for
forecasting based on large information arrays when the factors are highly collinear.
    To construct a linear regression, we selected factors that showed a low level of correlation in
the previous analysis. This restriction substantially narrowed the set of factors, especially the
dynamic ones, because, for example, the days of the week and the number of days before booking had
a high level of collinearity. Several linear regression models were built based on the selected
factors. The best model is:
          Price  3.67  0.18accommodates  0.06 x1  0.06distance  0.15i.available                   (3)
where accommodates – capacity of housing, x1 – less than 290 days before booking, distance –
distance to the center, i.available – the percentage of accommodation booked with the same owner
during the month.
    To determine the quality of the model, we need to check several conditions: the absence of
multicollinearity, the absence of heteroscedasticity, and the absence of autocorrelation.
    In this case the modeling is based on panel data rather than time series, so it is sufficient
that the first two conditions are met.
    To check for multicollinearity, we ran two tests: the variance inflation factor (VIF) and the
condition index (CI). Both tests confirmed the absence of multicollinearity (CI = 11.14 < 30;
VIF = 1.05 < 10).
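The VIF check can be illustrated with a short sketch. It relies on the standard identity that the VIF of predictor j equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The function names and the toy columns are assumptions made for this example; the condition index is not covered here.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled]
            for ci in scaled]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    size = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def vif(columns):
    """VIF_j is the j-th diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical toy predictors: the first two are strongly correlated.
toy_vif = vif([[1, 2, 3, 4, 5, 6],
               [1, 1, 2, 2, 3, 3],
               [30, 5, 200, 90, 10, 150]])
for value in toy_vif:
    print(round(value, 2), "collinear" if value >= 10 else "ok")
```

The threshold of 10 mirrors the rule of thumb quoted in the text (VIF < 10).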
    To test for heteroscedasticity, we used the Goldfeld-Quandt and White tests. Both tests
detected heteroscedasticity. We corrected for it by using a heteroscedasticity-consistent
covariance matrix.
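The effect of a heteroscedasticity-consistent covariance matrix can be sketched for the simplest case of one predictor. The example below assumes White's HC0 estimator, for which the robust variance of the slope reduces to the sum of (x_i - mean(x))^2 * e_i^2 divided by Sxx^2. The paper does not state which variant was used, so this choice, the function name, and the toy data are assumptions.

```python
from math import sqrt


def ols_with_robust_se(x, y):
    """Simple OLS slope with classical and HC0 (White) standard errors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical standard error assumes a constant error variance.
    classical = sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # HC0 weights each squared residual by its own leverage term.
    robust = sqrt(sum((xi - mean_x) ** 2 * e * e
                      for xi, e in zip(x, resid))) / sxx
    return slope, classical, robust


# Hypothetical data whose spread grows with x.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.5, 7.5, 11.5, 10.5]
slope, se_classical, se_robust = ols_with_robust_se(x, y)
print(slope, se_classical, se_robust)
```

The coefficient estimate is unchanged by the correction; only its standard error, and therefore the significance test, is affected.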
    All the regression factors are significant, but R2 is 0.5059, which is too low to justify using
the model for forecasting. At the same time, the model confirms the presence of dynamic factors.
According to the modeling results, the price is affected by the housing capacity, the distance to
the center, booking less than 290 days in advance, and the percentage of accommodation booked with
the same owner during the month. The dynamic factors among these are the number of days before
booking and the percentage of booked accommodation for the exact date. Their presence suggests that
they affect the price and confirms that this online platform uses a dynamic pricing strategy.
    The next step was to fit the PLS regression. One advantage of this model over linear
regression is that there is no need to narrow down the set of factors, because the model works
effectively with multicollinear data and with a large number of predictors.
    The PLS model extracts the useful information contained in the independent variables and in the
relationship between the dependent and independent variables. The goal of PLS is to maximize the
covariance between X and Y in order to uncover latent factors. The main criterion of model quality
is the minimization of the standard error.
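The covariance-maximizing step can be illustrated with the first component of a single-response PLS (PLS1) model. This is a minimal sketch, not the authors' implementation: it extracts only one latent component, and the function names and toy data are assumptions. The toy predictors are perfectly collinear on purpose, to show that this does not prevent the fit.

```python
from math import sqrt


def center(col):
    """Return the mean-centered column and its mean."""
    mean = sum(col) / len(col)
    return [v - mean for v in col], mean


def pls1_first_component(columns, y):
    """First PLS1 component for predictors given as a list of columns."""
    xc = [center(col)[0] for col in columns]
    yc, y_mean = center(y)
    # Weights: the direction of X'y maximizes cov(Xw, y) subject to ||w|| = 1.
    raw = [sum(a * b for a, b in zip(col, yc)) for col in xc]
    norm = sqrt(sum(r * r for r in raw))
    w = [r / norm for r in raw]
    # Scores: projection of each observation onto the weight vector.
    t = [sum(wj * col[i] for wj, col in zip(w, xc)) for i in range(len(y))]
    # Regress the response on the single latent score.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    fitted = [y_mean + q * ti for ti in t]
    return w, t, fitted


# Hypothetical collinear predictors: the second column is twice the first.
weights, scores, fitted = pls1_first_component(
    [[1, 2, 3, 4], [2, 4, 6, 8]], [3, 6, 9, 12])
print(weights, fitted)
```

Further components would be obtained by deflating X and y with the extracted scores and repeating the same step.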


    Based on the results of PLS modeling, we see that one component can explain 17.7% of the
information, and 3 components can explain 51.2%.
    The model selected was the one with the lowest standard deviation and the highest R2. Because
R2 is not a standard quality measure for this model, it was calculated using the econometric
formula. The variable importance in projection (VIP) approach was used to rank the significance of
the factors. The results of comparing the importance of predictors are presented in Figure 2.
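The two quality measures reported for the models can be computed as sketched below. The sketch assumes that the "econometric formula" for R2 is the usual 1 - SSE/SST and that RMSE is the square root of the mean squared error; the paper does not spell either formula out, and the prices used here are made-up values.

```python
from math import sqrt


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SSE / SST."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean) ** 2 for a in actual)
    return 1 - sse / sst


def rmse(actual, predicted):
    """Root mean squared error of the predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(actual))


# Hypothetical observed and predicted prices.
observed = [70, 75, 80, 85]
forecast = [72, 74, 81, 83]
print(r_squared(observed, forecast), rmse(observed, forecast))
```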




Figure 2: Influence of importance of factors on pricing

    Estimating the magnitude of the impact of each factor on the price shows that the most
influential are dynamic factors: booking accommodation 290 days before the trip, the time rate of
booking, and booking more than 290 but less than 360 days ahead. Accommodation capacity and the
distance to the center have slightly less impact on price.
    The PLS regression demonstrates a strong dependence of the price on dynamic factors: their
importance in determining the price is 51.2%.
    Estimates of the quality of the main resulting models in the research, as well as the names of
significant dynamic factors, are summarized in Table 2.
    Thus, both models, which were developed based on the booking database of the Airbnb platform
in the city of Madrid, demonstrated the dependence of the rental price on dynamic factors. In
addition, dynamic factors show a high level of significance in both models.
    At the same time, the linear regression model gives only a rough estimate on a large data set
(low value of R2). The PLS regression captures the hidden relationships between factors more
finely in the context of Big Data and reveals a range of significant dynamic factors. This again
confirms the feasibility of using such an approach to modeling dynamic pricing in the presence of
huge arrays of observations.
    One of the main problems encountered during modeling was the significant effort needed to
prepare the data. This is due to both the need for data integration and the low quality of the
original data. A large amount of categorical data had to be transformed. Some problems concerned
the expansion of the range of dynamic factors.
    Other problems arose because the initial data were downloaded on a specific date, which makes
the dynamics of change difficult to observe. The dataset is based on the prices offered by the
service for home owners, and this affects the accuracy of forecasting (for example, prices were
presented in the format 70, 75, 80, 85, etc., rounded to integer multiples of 5). Prices adjusted
by hosts would have shown a stronger dynamic relationship.

Table 2
Estimation of model quality
 Indicators                         Linear regression                  PLS regression
 R2                                 0.5059                             0.6842
 RMSE                               -                                  27.4188
 Dynamic factors                    x1, i.available                    x1, instant_bookable, x2

   It should also be noted that the analysis of large amounts of data requires significant
computing capacity and more time for modeling.

6. Conclusion
    In today's competitive market environment, e-commerce has a number of advantages over
traditional trading. Among these advantages are the absence of maintenance costs, low market entry
costs, low ‘menu’ costs, flexibility, as well as the ability to use significant amounts of information on
the specifics of the sale or purchase transactions to form favorable prices. The development of the
information economy, especially the e-commerce market, facilitates faster adaptation to changes in
market conditions and consumer preferences. The new opportunities allow companies to set the
optimal price for a specific period of time for a specific consumer.
    More and more e-commerce platforms are using dynamic pricing as a type of personalized pricing.
The positive aspects of dynamic pricing include the growth of business profitability and the
absence of direct price discrimination, because it does not take into account information about
the behavioral characteristics of a particular customer, their solvency, and so on. These features
alleviate the ethical issues that typically arise from customized pricing. Prices are the same for
all consumers but depend on time factors.
    The features of dynamic pricing include the speed of response to changes in the market, flexibility,
control over pricing strategy, cost savings in the long term, implementation of specific software into
management processes that promotes informatization and digitalization of business.
    Using mathematical modeling, this study confirmed the use of dynamic pricing strategies on the
popular online service Airbnb. Significantly, this online platform is recognized as one of the most
successful businesses in e-commerce. It has many competitive advantages, as confirmed by the growth
of its popularity compared to the popularity of Booking and other similar e-services.
    This fact once again confirms the relevance of studying the company's successful experience to
increase competitiveness in e-commerce and prospects of using modern data mining technologies and
powerful methods to form strategies for pricing and determining factors affecting prices. This study
places special emphasis on identifying the dynamic factors of pricing that have become increasingly
important.
    When implementing new software to support personalized pricing, e-commerce companies can use
the approach proposed in this study, namely, the PLS regression. The advantages of this model are
the ability to identify hidden predictors among a large number of collinear factors (dynamic
factors are often hidden and collinear) and good performance in the processing of large information
arrays, even in cases where the number of observations is small and the number of factors is large.
It is also important to note that adding this algorithm to a toolkit of other methods and
approaches does not require significant funds, as it is present in most popular open-source
libraries.
    We can also recommend the use of PLS regression in the implementation of all other types of
personalized pricing, because under modern conditions the number of factors is constantly
increasing, in part due to the expansion of the range of behavioral characteristics of consumers.
    Thus, the use of the PLS model answers today's challenges, and this approach should take its
rightful place among the tools to support personalized pricing.

7. References
[1] B. Tanir. “E-Commerce pricing.” Prisync.com, 2018. URL: https://prisync.com/blog/ultimate-
    ecommerce-pricing-strategies/.
[2] L. R. Weatherford, S. E. Bodily. “A Taxonomy and Research Overview of Perishable-Asset
     Revenue Management: Yield Management.” Overbooking, and Pricing, Operations Research, Vol.
     40, Issue 5, September-October (1992): 831-844. DOI: 10.1287/opre.40.5.831
[3] A. V. Den Boer. “Dynamic Pricing and Learning: Historical Origins, Current Research, and New
     Directions.” Surveys in operations research and management science, Vol. 20, Issue 1, June (2015):
     1-18. DOI: 10.1016/j.sorms.2015.03.001.
[4] A. Dolgui, J.-M. Proth. “Supply Chain Engineering: Useful Methods and Techniques.” Springer,
     New York, NY, 2010.
[5] A. Faruqui. “The Ethics of Dynamic Pricing.” The Electricity Journal, Vol. 23, Issue 6, July
     (2010): 13-27. DOI: 10.1016/j.tej.2010.05.013.
[6] P. K. Kannan, P. K. Kopalle. “Marketing in the E-Channel.” International Journal of Electronic
     Commerce, Vol. 5, Issue 35, Dec. (2001): 63-83. DOI: 10.1080/10864415.2001.11044211.
[7] A. Sahay. “How to Reap Higher Profits with Dynamic Pricing.” MIT Sloan Management Review
     48, (2007): 53-60.
[8] Y. Tong, L. Wang, Z. Zhou, L. Chen, B. Du, J. Ye. “Dynamic Pricing in Spatial Crowdsourcing: A
     Matching-Based Approach.” International Conference on Management of Data, May (2018): 773-
     788. DOI: 10.1145/3183713.3196929
[9] C. Obermiller, D. Arnesen, M. Cohen. “Customized Pricing: Win-Win or End Run?” Drake
     Management Review 1 (2012): 12-28.
[10] A. N. Elmachtoub. V. Gupta, M. Hamilton. “The Value of Personalized Pricing.” ssrn.com, 2018.
     URL: https://ssrn.com/abstract=3127719.
[11] Office of Fair Trading, Personalized Pricing: Increasing Transparency to Improve Trust, 2013.
     URL: https://one.oecd.org/document/DAF/COMP(2018)13/en/pdf.
[12] Organization for Economic Co-operation and Development, Personalized Pricing in the Digital
     Era, 2018. URL: http://www.oft.gov.uk/shared_oft/markets-work/personalised-pricing/oft1489.pdf.
[13] I. Iarmolenko, G. Chornous. “The Model of a Second-Hand Goods Resale Exchange under
     Transactional Pricing Strategy.” Ekonomika, Vol. 99(1), May (2020): 69-78. DOI:
     10.15388/Ekon.2020.1.4.
[14] G. Chornous, I.Iarmolenko. “Spivvidnoshennia mizh riznovydamy tsinoutvorennia v
     informatsiinii ekonomitsi.” [Relationship between types of pricing in the information economy],
     in: V. S. Ponomarenko, T. S. Klebanova (Eds.), Tools for Modeling Systems in the Information
     Economy, VSHEM-KHNEU, Kharkiv, 2019, pp. 120-135. (in Ukrainian)
[15] M. Bichler, R. D. Lawrence, J. Kalagnanam, H. S. Lee, K. Katircioglu, G. Y. Lin, A. J. King, Y.
     Lu. “Applications of flexible pricing in business-to-business electronic commerce.” IBM Systems
     Journal, Vol. 41, Issue 2, (2002): 287–302. DOI: 10.1147/sj.412.0287.
[16] A. Srivastava, Dynamic pricing models: Opportunity for action, Cap Gemini Ernst & Young
     Center for Business Innovation, 2001.
[17] A. Belyh. Different Types of Dynamic Pricing, 2018. URL: https://www.cleverism.com/complete-
     guide-dynamic-pricing/.
[18] D. Bertsimas, G. Perakis, Dynamic Pricing: A Learning Approach, in: S. Lawphongpanich, D.W.
     Hearn, M.J. Smith (Eds.), Mathematical and Computational Models for Congestion Charging.
     Applied Optimization, volume 101, Springer, Boston, MA, 2006. DOI: 10.1007/0-387-29645-X_3.
     doi:10.1007/0-387-29645-X_3.
[19] A. V. Den Boer. “Dynamic Pricing with Multiple Products and Partially Specified Demand
     Distribution.” Mathematics of Operations Research, Vol. 39, Issue 3, August (2014): 597-948.
     DOI:10.1287/moor.2013.0636.
[20] A. X. Carvalho, M. L. Puterman. “Dynamic optimization and learning: how should a manager set
     prices when the demand functions is unknown?” Technical Report 1117, Instituto de Pesquisa
     Economica Aplicada, 2005.



                                                                                                    81
[21] M. S. Lobo, S. Boyd. “Pricing and learning with uncertain demand.” Technical report, Stanford
     University, 2003.
[22] V.A. Parkhimenko, “Zadacha dinamicheskogo opredeleniya optimal'noj ceny s ispol'zovaniem
     neparametricheskogo podhoda.” [The problem of dynamically determining the optimal price using
     a nonparametric approach]. Mezhdunardnaya nauchnaya konferenciya BGUIR, (2014), Minsk. (in
     Russian)
[23] S.     Carta,     A.    Medda,      A.     Pili,  R.    Recupero,      R.    Saia.    “Forecasting
     E-Commerce Products Prices by Combining an Autoregressive Integrated Moving Average
     (ARIMA) Model and Google Trends Data.” Future Internet, Vol. 11, Issue 1, December (2018).
     DOI: 10.3390/fi11010005.
[24] W. W. Liu, Y. Liu, N. H. Chan. “Modeling EBay Price Using Stochastic Differential
     Equations.” Journal of Forecasting, Vol. 38, Issue 1, January (2019): 63-72. DOI:
     10.1002/for.2551.
[25] K.-K. Tseng, R. F.-Y Lin, H. Zhou, K.J. Kurniajaya, Q. Li. “Price Prediction of E-Commerce
     Products Through Internet Sentiment Analysis.” Electronic Commerce Research, Vol. 18, Issue 1,
     March (2018): 65-88. DOI: 10.1007/s10660-017-9272-9.
[26] H. Tyralis, G. Papacharalampous. “Variable Selection in Time Series Forecasting Using Random
     Forests.” Algorithms, Vol. 10, Issue 4, October (2017): 114. DOI: 10.3390/a10040114.
[27] L. M. A. Chan, Z. J. M. Shen, D. Simchi-Levi, J. Swann, Coordination of pricing and inventory
     decisions: A survey and classification, in: Handbook on Supply Chain Analysis: Modeling in the E-
     Business Era, Series in Operations Research and Management Science, Kluwer Academic
     Publishers, 2005.
[28] W. Elmaghraby, P. Keskinocak. “Dynamic pricing. Research overview, current practices and future
     directions.” Management Science, Vol. 49, Issue 10, October (2003): 1287–1309. DOI:
     10.1287/mnsc.49.10.1287.17315
[29] W. Elmaghraby, Auctions and pricing in e-marketplaces, in: Handbook of Quantitative Supply
     Chain Analysis: Modelling in the E-Business Era, International Series in Operations Research and
     Management Science, Kluwer Academic Publishers, 2005.
[30] F. Bernstein, A. Federgruen. “Pricing and replenishment strategies in a distribution system with
     competing retailers.” European Journal of Operational Research, Vol. 51, Issue 3, June (2003):
     409–426. DOI: 10.1287/opre.51.3.409.14957
[31] X, Cao, H. Shen, R. Milito, P. Wirth. “Internet pricing with a game theoretic approach: concepts
     and examples.” ACM Transactions on Networking, Vol. 10, Issue 2, April (2002): 208–216. DOI:
     10.1109/90.993302
[32] J. Yang, C. Zhao, C. Xing, On this page Abstract Introduction Literature Review Conclusions Data
     Availability Conflicts of Interest Acknowledgments References Copyright Special Issue
     Applications of Machine Learning Methods in Complex Economics and Financial Networks View,
     2019.
[33] M. Gupta, K. Ravikumar, M. Kumar. “Adaptive strategies for price markdown in a multiunit
     descending price auction: A comparative study, in: Proceedings of the IEEE Conference on
     Systems, Man, and Cybernetics, 2002, pp. 373–378.
[34] E. V. Zagainova, “Model' dinamicheskogo cenoobrazovaniya na rynke pasazhirskih
     aviaperevozok.” [A Model of Dynamic Pricing in the Air Passenger Market]. ZHurnal
     ekonomicheskoj teorii, (2017): 177-182 (in Russian).
[35] V. Esposito, J. Henseler, H. Wang, W. W. Chin, Handbook of Partial Least Squares: Concepts,
     Methods and Applications, 1st. ed., Springer-Verlag Berlin Heidelberg, 2010.
[36] P.Ye, J. Qian, J. Chen, C. Wu, Y. Zhou, S. D. Mars, F. Yang, L. Zhang. Customized Regression
     Model for Airbnb Dynamic Pricing, in: Proceedings of the 24th ACM SIGKDD International
     Conference on Knowledge Discovery & Data Mining, Association for Computing Machinery, New
     York, NY, USA, 2018, pp. 932–940. DOI: 10.1145/3219819.3219830.
[37] J. L. Wang, D. Nicolau. “Price determinants of sharing economy based accommodation rental: a
     study of listings from 33 cities on Airbnb.com.” International Journal of Hospitality Management,
     Vol. 62, April (2017): 120-131. DOI: 10.1016/j.ijhm.2016.12.007.
[38] Inside Airbnb, InsideAirbnb.Com. URL: http://insideairbnb.com/get-the-data.html.


                                                                                                    82