Data Exchange Platform for Digital Economy Applications

Oleg Surnin
Data analysis department, IPSI SEC “Open Code”, Samara, Russia
surnin@o-code.ru

Pavel Sitnikov
Faculty of infocommunication, ITMO University, Saint-Petersburg, Russia
sitnikov@o-code.ru

Anastasia Khorina
Faculty of infocommunication, ITMO University, Saint-Petersburg, Russia
anastasiakhorina@mail.ru

Anton Ivaschenko
Computer science department, Samara State Technical University, Samara, Russia
anton.ivashenko@gmail.com

Anastasia Stolbova
Information systems and technologies department, Samara National Research University, Samara, Russia
anastasiya.stolbova@bk.ru

Nataly Ilyasova
Image Processing Systems Institute of RAS - Branch of the FSRC "Crystallography and Photonics" RAS, Samara, Russia
ilyasova.nata@gmail.com

VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020). Data Science.
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

Abstract—This paper describes experience with the practical use of the Data exchange software platform, which supports the modern trends of the Digital Economy. The platform was initially designed for the suppliers and customers of data sources and provides up-to-date big data processing technologies as an online service. The platform is also open to software developers, who can upload new algorithms and technologies in order to find new areas of application for them. We present the architecture and software implementation of an intermediary online platform capable of collecting, processing and analyzing various datasets. Modern companies that are members of the digital economy can use this platform to process their data and produce business analytics. They can become both suppliers and providers of data, as well as develop and upload new customized algorithms. The first results were achieved in the area of retail and social media analysis.

Keywords—data exchange, digital economy, big data

I. INTRODUCTION

A data exchange platform is a new concept for organizing efficient cooperation between the providers and customers of various data sets and algorithms, based on implementing the software solution of an open service provider [1 – 2]. Considering the results of the practical use of open service software providers [3], a new software architecture was developed and tested that is capable of solving current problems of business analytics in different spheres of the digital economy.

Such a solution is presented below with an illustration of the Data exchange platform application in network retail. The Data exchange platform implements modern technologies of big data analysis and is capable of providing business analytics and decision-making support in real time. The proposed solution architecture and case study are intended for processing network retail data, predicting processes, managing the placement of big data, and planning computing load balancing.

II. PROBLEM DOMAIN OVERVIEW

Retail is a popular way to organize distributive trades. The modern retail industry is becoming a promising area for the development of network organizations that provide distributed services for a large number of consumers and therefore process the associated flows of big data.

The logic of this information processing is mainly influenced by the processes of retail service. The following areas of retail are distinguished:

• street retail – organization of retail trade in the most visited places: on pedestrian streets, ground floors of buildings;
• non-food retail – organization of trade in non-food products, which in grocery stores, as a rule, are called related. Such products include clothing, cosmetics, household chemicals, stationery and other categories of goods;
• food retail – organization of food trade. This category of goods is the most demanded and relates to everyday goods;
• network retail – organization with several stores of the same chain united by one concept;
• electronic retail – organization of trade through the Internet;
• cellular retail – organization of trade between mobile operators.

The retail industry gains significant benefits from being adaptive to changing customer demands. Understanding trends and predicting changes helps reduce costs and increase competitiveness. Applying modern technologies for data collection and processing can solve this problem.

Retailers use the following mechanisms to successfully organize business processes; these mechanisms form the main requirements for a data processing toolset:

• calculation of the optimal locations for placement of outlets;
• use of modern commercial equipment;
• work with categories of clients;
• work with methods of attraction;
• development of self-service and the resulting reduction of personnel;
• optimization of logistics, work with wholesale suppliers;
• automation of all stages of trade.

Depending on the specifics of the retail problem domain and other factors, a retail development strategy is determined. The strategy operates with critical factors by solving the problems of determining the selling price of goods, managing the assortment of outlets, and determining their location.

A comparison of the most frequently used business intelligence systems in the retail sector against a number of the most important criteria is shown in Table I.

TABLE I. PLATFORMS COMPARATIVE ANALYSIS

Feature | QlikView | Klipfolio | Tableau | Power BI | Data exchange
SQL and software development skills | + | - | - | - | +
Own programming language | + | - | - | - | +
Connectivity to heterogeneous data sources | + | + | + | + | +
Data storage | + | + | + | + | -
Data aggregation | + - + (two of the five cell values are missing in the source; column alignment unknown)
A variety of dashboards for visualization | + | + | + | + | +
Graphical programming language | - | - | - | - | +
Intelligent algorithms and data processing | - | - | - | - | +
Specialization in retail analytics | - | - | - | - | +
Possibility of monetization of data, analytics and algorithms | - | - | - | - | +

Currently, electronic retail is popular and its popularity is growing in line with the main trends of digital economy development. Electronic retail includes trade in both food and non-food products, which leads to serious competition among retailers. Price is the main factor that allows companies to stand out among their competitors and attract buyers, so pricing strategies should be as flexible as possible.

There are a variety of approaches to pricing:

• personalized approach to customers with the formation of individual offers: this approach is based on the analysis of the consumer behavior of each specific buyer, the determination of his or her needs and financial capabilities and, ultimately, the formation of individual price offers for that buyer;
• promotion management: the approach is aimed at the formation of special price offers for goods or groups of goods according to certain criteria in order to attract customers. In applying this approach, it is important not to create a reputation as a discounter, which can negatively affect the perception of the retailer by customers, so promotions should be temporary;
• psychological techniques that affect the perception of price. This approach is common for the retail industry as a whole and is quite popular, due to the ease and low cost of its application. For example, to sell an expensive product, you need to put it next to an even more expensive product. Another studied effect is that a customer is unlikely to settle on a single price, so a small difference in price should be left between similar products. Psychological techniques include managing delivery prices. It is very often necessary to arrange delivery. Modern research shows that most shoppers abandon their basket. The offer of free delivery or promotional goods when arranging delivery positively affects the customer;
• predictive pricing is a fairly effective approach, the main task of which is to determine how price affects demand. The approach allows you to model changes in demand depending on the price, taking into account various factors, including other methods involved in pricing: psychological pricing, the influence of competitors, behavioral models and categories of current retailer customers, and current trends (in the fashion industry or global trends in lifestyle changes).

III. STATE OF THE ART

The results of [3] show that retailers are moving to more innovative strategies to offer modern consumer solutions based on technological advances. The high level of property rights, due to the enormous number of patents, forces retailers to invest more in the acquisition of patented technologies to achieve advantages over competitors or to introduce new management methods.

The need to analyze retail data in a highly competitive environment is shown in [4]. Advances in machine learning and big data lead to the use of data analytics management systems in many organizations and industries. As a rule, large organizations can devote more resources to this, and the software used is more suitable for large enterprises. However, growing business pressure is forcing small and medium-sized enterprises to implement data analytics, which is new to them and leads to a number of problems considered in [5].

Social networks have become a part of life for most people around the world. Retailers, among others, actively use them to share information about their products with customers. As a result of the growth of social networks, the need for monitoring, data mining and analysis is increasing. So, in determining the tasks and opportunities of network retail, the issues of integration with social networks, determining a development strategy and studying the life cycle of clients are relevant [6, 7]. The study of customer behavior models requires taking into account highly diverse data and improving their analysis.

In [8], the creation of a body of knowledge is considered, and the concept and key methods of data analysis in the field of network retail are studied. Related work covers the creation of data exchange methods [9], shopping basket analysis [10] and sales analysis in general using business intelligence tools.

Analysis of the factors of customer satisfaction and loyalty, the image of retailers and the relationships between them leads to the emergence of models for creating a satisfactory experience for consumers, which is a priority for retailers [11]. The influence of fluctuations in demand and purchasing power on macroeconomic regulation and state control is considered in [12].

Research on the effects of loyalty programs on the decision to purchase goods at different periods of time is relevant, since an important factor for multichannel grocery retailers is which promotion strategy to choose across all channels. The tendency of buyers to continue to visit an offline store after they start buying in the online store of the network means that promotions in one channel can have a significant negative impact on the behavior of customers in another channel, especially if promotions differ by channel [13].

A variety of data sources can be used to solve the above-mentioned problems of network retail. For example, it is possible to analyze data obtained in the following areas:

1. Acquiring data, which include information about the commission of the acquiring bank, categories of outlets and payment card options, and allow you to determine the dependence of the availability of sales outlets on the availability of POS terminals;

2. Retailer data, which include:

• information on loyalty programs and promotions to determine the dependence of demand on the availability of various special offers for goods;
• information about regular customers and users of loyalty programs to analyze the dependence of the purchased goods on the categories of customers, to form the most advantageous offers for both the buyer and the retailer. In addition, it is possible to evaluate the customers in terms of how much money they spend and how often they make purchases in retailers’ stores, which will determine the significance of each customer for the network and help formulate personal offers more accurately;
• date of purchase and time of purchase to analyze the dynamics of demand for certain products depending on seasonality and time of day. In addition, it is necessary to take into account various holidays and ongoing events in the analyzed region;
• the list of goods on the check allows you to analyze the goods most frequently purchased together;
• the price of the items listed in the check and the total purchase amount allow the analysis of the turnover of the outlet;
• the trajectory of buyers allows you to determine the order of purchases at retail outlets and to identify the optimal location of goods in the store;

3. Social media, which help identify fashion trends, purchase dependencies, and the most popular brands and products;

4. Geo data that can be used to determine the dependence of consumer activity, demand for goods and payment methods on the territorial location of the outlet and the nearby infrastructure, such as the type of area (residential, tourist, business centers and others) and the categories of users visiting or living in the area.

According to the commission of the Code of good practices, it is known that retailers do not want to share sales data of individual outlets with suppliers, which leads to an inaccurate assessment of their activities, loss of productivity, and, consequently, loss of income.

IV. SOLUTION ARCHITECTURE

Having considered the tasks and problems of network retail and the strategies for solving them, we can conclude that the creation of a data exchange platform is a necessary step for business development, and its intellectualization is an integral part of the process.

Intellectualization includes the introduction of the following technologies into the data exchange:

1. Face recognition. Biometric assessment systems can optimize salary costs, allow you to calculate the costs of non-staff employees and minimize the risks of dishonest actions and counteractions. The introduction of face recognition can also significantly increase understanding of the characteristics of the average buyer. Indeed, many algorithms allow you to evaluate gender, age and race, and thereby form a portrait of the buyer. To solve such problems, a camera with a face detection function is used, aimed close-up at incoming visitors. Additionally, such algorithms in combination with the display of advertising can give an understanding of the effectiveness of advertising on visitors.

2. Creating smart baskets. These combine built-in cameras that can recognize and scan products, sensors that detect when an object is placed in the basket, and scales that remove the need for additional weighing of fruits and vegetables. On the trolley screen, the buyer will be able to see all the products taken, as well as their total amount. The data that carts collect (on routes through stores, on the frequency of purchases of goods from certain shelves, where customers are, etc.) would help company partners optimize their stores.

3. Analysis of social networks. Analysis of data from social networks will allow you to analyze customer complaints in real time and track user requests and wishes. Location and shopping data will allow you to compile the most complete portrait of a specific retailer buyer.

4. Internet of things. When it comes to retail, the IoT infrastructure includes RFID tags, infrared traffic meters in stores, satellite and Wi-Fi tracking systems, digital signage, kiosks or even mobile devices of the customers themselves. The Internet of things makes it possible to implement predictive maintenance of equipment, transportation, keeping a warehouse on the basis of demand, tracking the activity of buyers, creating a smart store, behavioral analytics, personalized marketing and real-time advertising based on location and purchase history.

5. Adaptive Acquiring.

The sources of such a data exchange are intelligent systems. Data processing can give conflicting results in different contexts and in the absence of data, but the results obtained should provide decision-making support. Technologies without data will not give the desired result, and neither will data without the necessary technologies applied to them. So, the data exchange can be presented as a platform for interfacing digital intelligent services.

The proposed software solution for a data exchange platform is presented in Fig. 1. It was implemented using Java (IntelliJ IDEA 15.0.3 Integrated Software Development Environment (Community Edition)) and supports JavaScript, CoffeeScript, HTML / XHTML / XAML, CSS / SASS / LESS, XML / XSL / XPath, YAML, ActionScript / MXML, Python, Ruby, Haxe, Groovy, Scala, SQL, PHP, Kotlin, Clojure, C, C++.

Fig. 1. Software architecture.

Additional libraries and frameworks include Apache Hadoop, Apache Spark, and Django. The Hadoop Distributed File System (HDFS), part of Apache Hadoop, is a file system designed to store large files, distributed block by block between the nodes of a computing cluster. Apache Spark is an open source framework for distributed processing of unstructured and weakly structured data, which is part of the Hadoop project ecosystem.

The developed data exchange platform provides the functionality for pre-processing and analysis of network retail data. As a part of data pre-processing, the system implements the methods of data structuring, missing-value processing, data dimension reduction, trend highlighting, and correlation analysis.

To solve the problems of network retail analysis, the following methods can be used:

• Apriori and FPG (Frequent-Pattern Tree) methods help solve the problem of analysing a shopping cart;
• Linear regression methods are used to solve the forecasting problem;
• The Mean method is used to calculate average values;
• The K_means method is used for various kinds of clustering.

Besides, a wide range of methods such as ARIMA, FB Prophet and others are used to analyse time series.

V. IMPLEMENTATION AND TESTS

The following example illustrates the results of the Data exchange platform development. We create a project for personal offers on selected categories of goods, formed by a retailer for users of a social network with loyalty cards. Forming this request is useful if you need to increase sales for certain categories of goods available to the retailer.

The result of this query is a list of users who are interested in purchasing goods from the selected groups of goods, with an indication of the group. By analyzing the results obtained, the retailer can formulate personal offers on the chosen groups of goods for the most suitable customers.

In the considered example, users were searched for the following product groups: 3 (energy granola bars), 4 (instant foods), 5 (marinades meat preparation). As can be seen from Fig. 2, for the buyer with the number 141848112, you can create an offer for goods from group 3 (energy granola bars), and for the buyer with the number 78310286, an offer for goods from group 4 (instant foods).

Fig. 2. Project Results.

Another popular task in the field of network retail analysis is the sales forecast. The following fields were selected as the initial data of the problem to be solved: store identifier, purchase date, purchase amount.

To solve the problem, the following basic methods provided by the developed data exchange platform were combined:

• python_basic_filter – the method allows you to select sales data for one store from many;
• python_basic_resample – the method allows you to aggregate data by day, summing them up;
• python_timeseries_holt_winters – the method allows forecasting using the Holt-Winters additive model (triple exponential smoothing);
• python_plotly_forecast – a method for plotting.

The ability to combine methods makes it possible to choose the most suitable analysis algorithm, taking into account the specifics of the source data. The project drawn up in the query designer is shown in Fig. 3.

Fig. 3. Predictive analysis project.

The result of this project is the forecasted revenue values of the selected store, taking into account the seasonality period with a sales horizon.

The proposed approach opens new perspectives for the cooperation of online service providers and buyers in a common information space. According to the current trends of the digital transformation of the economy, most services migrate to web platforms, transporting all negotiations between the business parties to virtual reality.

Therefore social media become a substantial part of business relations. The resulting effect forms the economy of ultra-low expenditures. On the one hand, the members of such relations can swiftly change their mind by getting better options, producing contract agreements that are easy to enter and easy to leave. On the other hand, all the games are recorded, and the members agree to play openly.

Under these conditions a new class of intermediary open service platforms starts playing the main role. Using the existing IT infrastructure, they allow building a new virtual world powered by business analytics. In this sphere decision-making can no longer be done by humans alone. All the decisions need support from data analysis and machine learning, which makes access to them even more important than the economic resources available in real life.

VI. CONCLUSION

The developed solution based on the data exchange platform allows predicting the processes of network retail, managing the placement of big data and planning the computing load based on the infological model, which will increase the sales efficiency.

Next steps are related to extending the area of intermediary open service application, including the problem domains of social analysis and services automation. Positive results were also achieved in banking acquiring, which can be extended in the business sphere, making data processing an effective tool of the digital transformation of the economy.

Main research results include the following deliverables: big data management methodology (O. Surnin), and its implementation by Open code platform architecture (P. Sitnikov), analysis of its perspective for digital economy (A. Ivaschenko) based on implementation of data exchange model (A. Stolbova) and algorithms of semantic (A. Khorina) and statistics (N. Ilyasova) analysis.

ACKNOWLEDGMENT

This work was financially supported by the Russian Foundation for Basic Research under grant # 19-29-01135 and by the Ministry of Science and Higher Education within the State assignment to the FSRC “Crystallography and Photonics” RAS.

REFERENCES

[1] O.L. Surnin, P.V. Sitnikov, A.V. Ivaschenko, N.Yu. Ilyasova and S.B. Popov, “Big Data incorporation based on open services provider for distributed enterprises,” CEUR Workshop Proceedings, vol. 190, pp. 42-47, 2017.
[2] O.L. Surnin, P.V. Sitnikov, A.A. Khorina, A.V. Ivaschenko, A.A. Stolbova and N.Yu. Ilyasova, “Industrial application of big data services in digital economy,” CEUR Workshop Proceedings, vol. 2416, pp. 409-416, 2019.
[3] E. Pantano, “Does innovation-orientation lead to retail industry growth? Empirical evidence from patent analysis,” Journal of Retailing and Consumer Services, vol. 34, pp. 88-94, 2017.
[4] C. Kahraman, “Efficiency analysis in retail sector: implementation of data envelopment analysis in a local supermarket chain,” Cham: The International Symposium for Production Research, Springer, pp. 884-897, 2018.
[5] M. Mirzaei, “Investigating challenges to SME deployment of operational business intelligence: a case study in the New Zealand retail sector,” Proceedings of the 12th IEEE/ACM International Conference on Utility and Cloud Computing Companion, pp. 139-142, 2019.
[6] O.G. Ayodeji and V. Kumar, “Social media analytics: a tool for the success of online retail industry,” International Journal of Services Operations and Informatics, vol. 10, no. 1, pp. 79-95, 2019.
[7] I.A. Rycarev, D.V. Kirsh and A.V. Kupriyanov, “Clustering of media content from social networks using BigData technology,” Computer Optics, vol. 42, no. 5, pp. 921-927, 2018. DOI: 10.18287/2412-6179-2018-42-5-921-927.
[8] F. Castelo-Branco, “Business intelligence and data mining to support sales in retail,” Singapore: Marketing and Smart Technologies, Springer, pp. 406-419, 2020.
[9] Y. Chen, “FaDe: a blockchain-based fair data exchange scheme for Big Data sharing,” Future Internet, vol. 11, no. 11, pp. 225, 2019.
[10] M. Kaur and S. Kang, “Market basket analysis: identify the changing trends of market data using association rule mining,” Procedia computer science, vol. 85, pp. 78-85, 2016.
[11] C.M. Veloso, “The loyalty and satisfaction determinants: a factor analysis applied to the south and insular Portuguese traditional retail,” 11th Annual Conference of the EuroMed Academy of Business: Research Advancements in National and Global Business Theory and Practice, EuroMed Press, pp. 1380-1394, 2018.
[12] S. Shen, “An empirical analysis of the total retail sales of consumer goods by using time series model,” Journal of Mathematical Finance, vol. 9, no. 2, pp. 175, 2019.
[13] E. Breugelmans and K. Campo, “Cross-channel effects of price promotions: an empirical analysis of the multi-channel grocery retail sector,” Journal of Retailing, vol. 92, no. 3, pp. 333-351, 2016.
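
The sales forecast project described in Section V chains three steps: selecting the sales of one store, aggregating them by day, and forecasting with the Holt-Winters additive model. The following is a minimal pure-Python sketch of such a chain, given for illustration only. The function names filter_store, resample_daily and holt_winters_additive, the record layout, the initialization scheme, the smoothing constants and the synthetic data are all assumptions of this sketch; they are not the platform's python_basic_filter, python_basic_resample or python_timeseries_holt_winters methods, whose implementations the paper does not show.

```python
from collections import defaultdict
from datetime import date, timedelta


def filter_store(rows, store_id):
    """Keep only the sales records of one store (hypothetical helper)."""
    return [r for r in rows if r["store_id"] == store_id]


def resample_daily(rows):
    """Sum purchase amounts per calendar day (hypothetical helper).

    Days without purchases are filled with 0.0 so the series is evenly spaced.
    """
    totals = defaultdict(float)
    for r in rows:
        totals[r["date"]] += r["amount"]
    if not totals:
        return []
    day, last = min(totals), max(totals)
    series = []
    while day <= last:
        series.append(totals.get(day, 0.0))
        day += timedelta(days=1)
    return series


def holt_winters_additive(y, period, horizon, alpha=0.3, beta=0.1, gamma=0.2):
    """Forecast `horizon` steps with additive triple exponential smoothing."""
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    # Initial trend: average per-step change between the first two seasons.
    mean1 = sum(y[:period]) / period
    mean2 = sum(y[period:2 * period]) / period
    trend = (mean2 - mean1) / period
    # Initial level, extrapolated back to the step just before the series.
    centre = (period - 1) / 2
    level = mean1 - (centre + 1) * trend
    # Initial seasonal components: first-season deviations from the trend line.
    season = [y[i] - (mean1 + (i - centre) * trend) for i in range(period)]
    for t, value in enumerate(y):
        s = season[t % period]
        prev_level = level
        level = alpha * (value - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % period] = gamma * (value - level) + (1 - gamma) * s
    n = len(y)
    return [level + h * trend + season[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


# Synthetic demo data (not from the paper): two stores, 28 days,
# a linear trend plus a weekly pattern for store 1.
weekly = [-30.0, -20.0, -10.0, 0.0, 10.0, 20.0, 30.0]
start = date(2020, 1, 6)
rows = []
for t in range(28):
    day = start + timedelta(days=t)
    total = 100.0 + 2.0 * t + weekly[t % 7]
    # Two purchases per day for store 1, one unrelated purchase for store 2.
    rows.append({"store_id": 1, "date": day, "amount": total / 2})
    rows.append({"store_id": 1, "date": day, "amount": total / 2})
    rows.append({"store_id": 2, "date": day, "amount": 999.0})

daily = resample_daily(filter_store(rows, store_id=1))
forecast = holt_winters_additive(daily, period=7, horizon=7)
print([round(v, 2) for v in forecast])
```

Here the seasonality period (7 days) and the sales horizon (7 days) play the roles of the two project parameters mentioned in Section V. The synthetic series is noise-free on purpose, so the forecast simply continues the trend and the weekly pattern. On real sales data the smoothing constants would need to be tuned, and an established implementation such as statsmodels' ExponentialSmoothing may be preferable to this hand-written version.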