Demand Forecasting for Inventory Management using Limited Data Sets: A Case Study from the Oil Industry

Jorge Ivan Romero-Gelvez, Esteban Felipe Villamizar, Olmer Garcia-Bedoya and Jorge Aurelio Herrera-Cuartas
Universidad de Bogotá Jorge Tadeo Lozano, Bogotá, Colombia

Abstract
This document's main focus is to present a way to solve forecasting problems using open-source tools for time series analysis. First, we present an introduction to the hydrocarbon sector and to time series analysis; later, we focus on solution methods based on supervised learning (support vector regression) trained with bio-inspired algorithms (particle swarm optimization). We discuss some benefits of using support vector machines and open-source tools that model variables such as trend and seasonality. In this work, we chose the fb-prophet package and the support vector regressor from scikit-learn as the primary tools because they yield representative results when dealing with limited data sets, and particle swarm optimization as the training algorithm because of its speed and adaptability. Finally, we show the results and compare the models by their RMSE.

Keywords
Hydrocarbon, Forecasting, small time-series, support vector regressor, particle swarm optimization

1. Introduction
The hydrocarbon sector is a protagonist of the world economy's growth, seeking to expand as a critical piece of economic development through energy consumption and the exploration and production of oil; its primary producers are the USA, Russia, Saudi Arabia, Iraq, Canada, and Iran. In 2019, 83 million barrels per day were produced worldwide, and Colombia contributed 863 thousand barrels, ranking number 22 among producers. In Colombia, the hydrocarbon sector has contributed significantly to growth, standing out as an engine of the country's development.
The ACP estimates that investment in production in 2020 will be around USD 4,050 million, that is, 25% higher than in 2019 (USD 3,250 million). By region, 90% of the investment will be carried out in the Llanos Orientales, Valle Magdalena, and Caguán-Putumayo basins. By department, the following stand out: Meta USD 1,924 million, Santander USD 560 million, Casanare USD 496 million, Putumayo USD 239 million, and Arauca USD 101 million. Also, the ACP projects that in 2020 there will be an investment in exploration and production of oil and gas of USD 4,970 million, 23% higher than in 2019. Indeed, the largest company in the country and the leading oil company in Colombia belongs to the group of the 39 largest oil companies in the world and is one of the top five in Latin America. It has hydrocarbon extraction fields in the center, south, east, and north of Colombia, two refineries, ports for the export and import of fuel and crude oil on both coasts, and a transportation network of 8,500 kilometers of pipelines throughout the entire national geography, which interconnect production systems with large consumption centers and maritime terminals.

ICAIW 2020: Workshops at the Third International Conference on Applied Informatics 2020, October 29–31, 2020, Ota, Nigeria
jorgei.romerog@utadeo.edu.co (J.I. Romero-Gelvez); estebanf.villamizar@utadeo.edu.co (E.F. Villamizar); olmerg@gmail.com (O. Garcia-Bedoya); jorgea.herrerac@utadeo.edu.co (J.A. Herrera-Cuartas)
ORCID: 0000-0002-5335-0819 (J.I. Romero-Gelvez); 0000-0002-6964-3034 (O. Garcia-Bedoya); 0000-0003-0273-4043 (J.A. Herrera-Cuartas)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073
Given this panorama, industries must plan strategies to manage and control their inventories, since their importance lies in obtaining profits. Inventory management plays a vital role within the business chain: inventory is the buffer between two processes, supply and demand, either of which can be known or unknown, variable or constant. The sourcing process contributes goods to inventory while demand consumes that same inventory. The buffer is necessary due to differences in rates and times between supply and demand, and these differences can be attributed to internal or external factors. Endogenous factors are policy issues, while exogenous factors are uncontrollable. Internal factors include economies of scale, smoothing of operations, and customer service; the most important exogenous factor is uncertainty. Inventory control is therefore a critical aspect of effective management and administration, needed to guarantee the availability of equipment, spare parts, and materials to meet the needs of expenses and projects with the expected quality, cost, and opportunity. Likewise, the purpose of the materials management process is to ensure that the materials that register stock in the warehouse correspond to optimal inventory levels: such quantities must fully meet the needs of the company with the minimum investment.

2. Literature review
There are applications of forecasting using SVR, and they have become more common in recent years, since there is special interest in machine learning applications for making predictions on time series. Examples can be seen in [1] with financial forecasts, in [2, 3] with rainfall predictions, in electric load forecasting [4, 5, 6], and in forecasting carbon prices [7], among many others. Next, we present a brief introduction to time series, and later we deal with the comparison of the applied models.

2.1. Time series analysis
The analysis of time series and demand forecasts becomes the primary input for the MRP model.
For this reason, it is proposed to contrast different methods that allow considering seasonal periods and that may also include new demand observations to adjust the model in real time. According to [8], time-series methods refer to a set of observations of real phenomena (mathematical, biological, social, physical, economic, among others) given as part of a discrete set in time. The main idea is that past data can be utilized to generate future estimations. In time series analysis it is common to try to extract patterns from the data, such as trend, seasonality, cycles, and randomness, as inputs to modeling the phenomena.

• Trend: the data set exhibits a stable pattern of growth or decrease.
• Seasonality: a seasonal pattern is one that repeats at fixed intervals.
• Cycles: the variation of cycles is similar to seasonality, except that the duration and the magnitude of the cycle vary.
• Randomness: a random series is one with no recognizable pattern in the data. One can generate random series of data that have a specific structure; data that appear random may actually have a specific structure, while truly random data fluctuate around a fixed average.

2.2. Support vector machine and support vector regressor
According to [9], in machine learning the support vector machine proposed by Vapnik is one of the most popular approaches to supervised learning [10, 11]. This model resembles logistic regression in that a linear function 𝒘⊤𝒙 + 𝑏 drives both. The main difference is that while logistic regression produces probabilities, the support vector machine produces a class identity. The SVM predicts that the positive class is present when 𝒘⊤𝒙 + 𝑏 is positive; likewise, it predicts that the negative class is present when 𝒘⊤𝒙 + 𝑏 is negative. A notable feature of support vector machines is the kernel, considering that many algorithms can be written in the form of a dot product.
As an example, the rewritten SVM linear function is shown as:

𝒘⊤𝒙 + 𝑏 = 𝑏 + ∑_{𝑖=1}^{𝑚} 𝛼ᵢ 𝒙⊤𝒙⁽ⁱ⁾   (1)

where 𝒙⁽ⁱ⁾ is a training example and 𝛼 is a vector of coefficients. Rewriting the learning algorithm this way allows us to replace 𝒙 by the output of a given feature function 𝜙(𝒙) and the dot product with a function 𝑘(𝒙, 𝒙⁽ⁱ⁾) = 𝜙(𝒙) ⋅ 𝜙(𝒙⁽ⁱ⁾) called a kernel. The (⋅) operator represents an inner product analogous to 𝜙(𝒙)⊤𝜙(𝒙⁽ⁱ⁾). For some feature spaces, we may not use literally the vector inner product; in some infinite-dimensional spaces, we need to use other kinds of inner products, for example, inner products based on integration rather than summation. After replacing dot products with kernel evaluations, we can make predictions using the function

𝑓(𝒙) = 𝑏 + ∑ᵢ 𝛼ᵢ 𝑘(𝒙, 𝒙⁽ⁱ⁾)   (2)

This function is nonlinear with respect to 𝒙, but the relationship between 𝜙(𝒙) and 𝑓(𝒙) is linear. Also, the relationship between 𝜶 and 𝑓(𝒙) is linear. The kernel-based function is exactly equivalent to preprocessing the data by applying 𝜙(𝒙) to all inputs, then learning a linear model in the new transformed space. The kernel trick is powerful for two reasons. First, it allows us to learn models that are nonlinear as a function of 𝒙 using convex optimization techniques that are guaranteed to converge efficiently. This is possible because we consider 𝜙 fixed and optimize only 𝜶; i.e., the optimization algorithm can view the decision function as being linear in a different space. Second, the kernel function 𝑘 often admits an implementation that is significantly more computationally efficient than naively constructing the two vectors 𝜙(𝒙) and 𝜙(𝒙′) and explicitly taking their dot product. In some cases, 𝜙(𝒙) can even be infinite-dimensional, which would result in an infinite computational cost for the naive, explicit approach. In many cases, 𝑘(𝒙, 𝒙′) is a nonlinear, tractable function of 𝒙 even when 𝜙(𝒙) is intractable.
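As a quick numeric sketch of eq. (2), the kernel-based prediction function can be evaluated directly; the training points, coefficients, bias, and Gaussian kernel width below are illustrative values chosen for this example, not quantities from this work:

```python
import numpy as np

def rbf_kernel(x, x_i, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * |x - x_i|^2)."""
    return np.exp(-gamma * np.sum((x - x_i) ** 2))

def predict(x, train_x, alpha, b):
    """Eq. (2): f(x) = b + sum_i alpha_i * k(x, x^(i))."""
    return b + sum(a * rbf_kernel(x, x_i) for a, x_i in zip(alpha, train_x))

# Hypothetical training examples and coefficients, for illustration only.
train_x = [np.array([0.0]), np.array([1.0]), np.array([2.0])]
alpha = [0.5, -0.2, 0.8]
b = 0.1

y = predict(np.array([1.5]), train_x, alpha, b)
```

Note that 𝜙 never appears explicitly: the kernel evaluations stand in for the feature-space dot products, which is the point of the kernel trick.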
As an example of an infinite-dimensional feature space with a tractable kernel, we construct a feature mapping 𝜙(𝑥) over the non-negative integers 𝑥. Suppose that this mapping returns a vector containing 𝑥 ones followed by infinitely many zeros. We can write a kernel function 𝑘(𝑥, 𝑥⁽ⁱ⁾) = min(𝑥, 𝑥⁽ⁱ⁾) that is exactly equivalent to the corresponding infinite-dimensional dot product.

2.3. Support vector regressor
According to [12], the basic idea of SVR is to map the data 𝑥 into a high-dimensional feature space ℱ by a nonlinear mapping 𝜙 and to do linear regression in this space. Also, according to [13], we can consider a set of training data where each 𝑥ᵢ ∈ ℝⁿ denotes the input space of the sample and has a corresponding target value 𝑦ᵢ ∈ ℝ for 𝑖 = 1, …, 𝑙, where 𝑙 corresponds to the size of the training data. The idea of the regression problem is to determine a function that can approximate future values accurately. The generic form of SVR can be seen as follows:

𝑓(𝑥) = (𝑤 ⋅ 𝜙(𝑥)) + 𝑏   (3)

where 𝑤 ∈ ℝⁿ, 𝑏 ∈ ℝ, and 𝜙 denotes a nonlinear transformation to a high-dimensional space. Our goal is to find the values of 𝑤 and 𝑏 such that the value of 𝑓(𝑥) can be determined by minimizing the regression risk

𝑅reg(𝑓) = ∑_{𝑖=1}^{𝓁} 𝐶(𝑓(𝑥ᵢ) − 𝑦ᵢ) + 𝜆‖𝑤‖²   (4)

where 𝐶(⋅) is a cost function, 𝜆 is a constant, and the vector 𝑤 can be written in terms of data points as

𝑤 = ∑_{𝑖=1}^{𝓁} (𝛼ᵢ − 𝛼ᵢ*) 𝜙(𝑥ᵢ)   (5)

By substituting eq. 5 into eq. 3, the generic equation can be rewritten as

𝑓(𝑥) = ∑_{𝑖=1}^{𝓁} (𝛼ᵢ − 𝛼ᵢ*) (𝜙(𝑥ᵢ) ⋅ 𝜙(𝑥)) + 𝑏 = ∑_{𝑖=1}^{𝓁} (𝛼ᵢ − 𝛼ᵢ*) 𝑘(𝑥ᵢ, 𝑥) + 𝑏   (6)

In eq. 6, the dot product is replaced with a kernel function 𝑘(𝑥ᵢ, 𝑥) = (𝜙(𝑥ᵢ) ⋅ 𝜙(𝑥)). Kernel functions enable the dot product to be performed in high-dimensional feature space using low-dimensional space data input without knowing the transformation 𝜙. All kernel functions must satisfy Mercer's condition, i.e., they must correspond to the inner product of some feature space.
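The infinite-dimensional min-kernel example from Section 2.2 can be checked numerically by truncating the feature vector at an arbitrary finite length (the truncation length below is an assumption standing in for the infinite dimension):

```python
import numpy as np

def phi(x, dim=64):
    """Feature map over non-negative integers: x ones followed by zeros.
    dim truncates the infinite-dimensional vector for demonstration."""
    v = np.zeros(dim)
    v[:x] = 1.0
    return v

def k(x, x_prime):
    """Kernel claimed to equal the dot product of the feature maps above."""
    return min(x, x_prime)

# The explicit dot product in (truncated) feature space matches the kernel.
for x, xp in [(3, 7), (5, 5), (0, 9)]:
    assert phi(x) @ phi(xp) == k(x, xp)
```

The dot product counts positions where both vectors are one, which is exactly the smaller of the two counts of ones, hence min(𝑥, 𝑥′).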
The radial basis function (RBF) is commonly used as the kernel for regression:

𝑘(𝑥ᵢ, 𝑥) = exp(−𝛾 |𝑥 − 𝑥ᵢ|²)   (7)

2.4. Particle swarm optimization
Particle swarm optimization is a computational technique that optimizes a problem by iteratively attempting to improve a candidate solution with respect to a given measure of quality. It solves a problem by producing a population of candidate solutions called particles and moving them around the search space, each with a particular position and velocity. Each particle's movement is influenced by its local best-known position but is also guided toward the best-known positions in the search space, which are updated as other particles find better solutions. This is expected to move the swarm toward the best solutions. According to [14], the main inputs for the formulation of this algorithm are as follows:

• 𝐷 is the dimension of the search space.
• 𝐸 is the search space, a hyperparallelepiped defined as the Euclidean product of 𝐷 real intervals:

𝐸 = ⨂_{𝑑=1}^{𝐷} [min_𝑑, max_𝑑]   (8)

The standard form of the algorithm is composed of a set of particles (called a swarm), each made of a position in the search space, the fitness value at that position, a velocity for displacement, and a memory that contains the best position found so far and the fitness value of that previous best. The search is performed in two phases, initialization of the swarm and a cycle of iterations. The main steps are as follows:

• Initialization of the swarm: pick a random position in the search space and pick a random velocity.
• Iteration: compute the new velocity, move, and compute the new fitness.
• Stop when a termination criterion is met.

2.5. The Prophet forecasting model
Prophet is an open-source tool for forecasting time series observations based on an additive model in which nonlinear trends are fit together with seasonality. Its results are best with time series that have strong seasonal effects and a considerable amount of historical data.
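The PSO steps listed in Section 2.4 (initialization, iteration, stop) can be sketched as a minimal, generic implementation over a box-shaped search space 𝐸; the inertia and acceleration constants below are conventional defaults, not values reported in this work, and the toy objective stands in for the RMSE minimized later:

```python
import numpy as np

def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization: minimize objective over a box."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = len(lo)
    # Initialization: random positions and velocities in the search space E.
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = rng.uniform(-(hi - lo), hi - lo, size=(n_particles, dim))
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    # Iteration: pull each particle toward its own best and the swarm's best.
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)       # move, staying inside E
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val            # update each particle's memory
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    # Stop: fixed iteration budget reached.
    return gbest, pbest_val.min()

# Toy quadratic objective with minimum at (1, -2), for illustration only.
best, best_val = pso(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2,
                     (np.array([-5.0, -5.0]), np.array([5.0, 5.0])))
```

In this work the objective would be the forecast RMSE of the SVR as a function of its hyperparameters, rather than this toy quadratic.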
We use the Prophet open-source software in Python [15], based on a decomposable time series model [16] with three components: trend, seasonality, and holidays. They are combined as follows:

𝑦(𝑡) = 𝑔(𝑡) + 𝑠(𝑡) + ℎ(𝑡) + 𝜖𝑡   (9)

where 𝑔(𝑡) is the trend function, which models non-periodic changes in the value of the time series, 𝑠(𝑡) represents periodic changes, and ℎ(𝑡) represents the effects of holidays, which occur on potentially irregular schedules over one or more days. The error term 𝜖𝑡 represents any idiosyncratic changes which are not accommodated by the model; we make the parametric assumption that 𝜖𝑡 is normally distributed.

2.5.1. Prediction model
The development of an algorithm is necessary to obtain a data history in which the breakdown of the monthly inventory value is represented by the BSE material and transit corresponding to two years. In this analysis of the generated data, factors such as inventory value and material dependence are taken into account. These have a significant influence on the model's behavior, since they provide realism and specific variations of the trend line, which are of interest for optimizing its management. All these data have been obtained from the materials management system. Subsequently, these data are processed and analyzed in order to see the interaction between them, carrying out a parameterization that allows characterizing the logic of generation of the historical data.

3. Data description
The warehouse corresponds to the code assigned in the information system that represents an organizational unit or warehouse; it corresponds to the physical place where the materials are stored, which allows differentiation of material stocks. In this case, logistics center 2000 is taken. In detail, the types of warehouse are the following. Imported: corresponds only to material in transit for expenses or projects. Expenses: these warehouse types are associated with new material in good condition and repaired material in good condition.
These materials are characteristic of operation and maintenance, such as spare parts, consumables, and supplies for the operation, equipment, lubricants, consumption tools, and parts from manufacturers. Projects: this material is part of the business investment, with a physical location in yards and covered warehouses; it is acquired according to the project's requirements, and once the material is no longer required by the project, it is classified as not-required material or placed in a process of sale in which it is offered to other projects of the business group.

4. Solution method and results
In order to solve the problem, we contrast two methods, fb-prophet and PSO-SVR.

• Forecasting model selection: in order to use the method that generates the smallest error 𝜖𝑡, we first apply the support vector machine with particle swarm optimization as the global optimization algorithm. In addition, fb-prophet (a black-box method) is also used. We then select the method with the least error.
• IDE: IPython/Jupyter notebooks and Google Colab.

4.1. Results

Figure 1: Fb-prophet black box model implementation
Figure 2: Initialization of the PSO-SVM model

Table 1
RMSE error
Model       RMSE
fb-prophet  3471658053
pso-svm     0.022833

5. Conclusions
A forecasting system should allow analysts with a variety of backgrounds to make more forecasts than they can do manually. The first component of our forecasting system is the model that we have developed over many prediction iterations on a variety of data in FbProphet. We use a simple modular regression model that often works well with predetermined parameters and that allows selecting the components that are relevant to the forecast problem and quickly making adjustments as needed. The success of the PSO-SVR model lies in its ability to adjust the positions of all particles toward an area of the search space with satisfactory solutions, according to a given objective function to minimize, in this case the root mean square error.
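The root mean square error used to compare the two models in Table 1 follows the standard definition; a short sketch (with made-up demand and forecast values, not the paper's data) shows the computation:

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error: square root of the mean squared forecast error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# Hypothetical demand and two competing forecasts, for illustration only.
demand = [100, 120, 110, 130]
forecast_a = [98, 121, 112, 128]   # small errors
forecast_b = [80, 150, 90, 160]    # large errors, penalized quadratically

# The model with the smaller RMSE would be selected.
```

Because the errors are squared before averaging, periods with large deviations dominate the score, which is why forecast_b is penalized far more heavily than forecast_a.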
It measures the dispersion of the forecast error: this value is the squared difference between real demand and the forecast, penalizing those periods where the difference was higher compared to others. From this calculation, decisions about forecast models and their results are guided toward the best choice.

Figure 3: Results of PSO-SVR

Likewise, it is established that this type of problem can be solved by evolutionary algorithms. The importance of using algorithms such as the particle swarm lies in their high efficiency in generating predictions with better performance; the formulation reduces to characterizing the movement of the particles based on a velocity operator that must balance exploration and convergence by decomposing the velocity into three components in order to describe the search behavior.

References
[1] C.-J. Lu, T.-S. Lee, C.-C. Chiu, Financial time series forecasting using independent component analysis and support vector regression, Decision Support Systems 47 (2009) 115–125.
[2] A. D. Mehr, V. Nourani, V. K. Khosrowshahi, M. A. Ghorbani, A hybrid support vector regression–firefly model for monthly rainfall forecasting, International Journal of Environmental Science and Technology 16 (2019) 335–346.
[3] C. Balsa, C. V. Rodrigues, I. Lopes, J. Rufino, Using analog ensembles with alternative metrics for hindcasting with multistations, ParadigmPlus 1 (2020) 1–17.
[4] Z. Zhang, W.-C. Hong, Electric load forecasting by complete ensemble empirical mode decomposition adaptive noise and support vector regression with quantum-based dragonfly algorithm, Nonlinear Dynamics 98 (2019) 1107–1136.
[5] Y. Yang, J. Che, C. Deng, L. Li, Sequential grid approach based support vector regression for short-term electric load forecasting, Applied Energy 238 (2019) 1010–1021.
[6] Z. Zhang, W.-C. Hong, J.
Li, Electric load forecasting by hybrid self-recurrent support vector regression model with variational mode decomposition and improved cuckoo search algorithm, IEEE Access 8 (2020) 14642–14658.
[7] B. Zhu, D. Han, P. Wang, Z. Wu, T. Zhang, Y.-M. Wei, Forecasting carbon price using empirical mode decomposition and evolutionary least squares support vector regression, Applied Energy 191 (2017) 521–530.
[8] F. R. Jacobs, Manufacturing planning and control for supply chain management, McGraw-Hill, 2011.
[9] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016. http://www.deeplearningbook.org.
[10] B. E. Boser, I. M. Guyon, V. N. Vapnik, A training algorithm for optimal margin classifiers, in: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 1992, pp. 144–152.
[11] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (1995) 273–297.
[12] K.-R. Müller, A. J. Smola, G. Rätsch, B. Schölkopf, J. Kohlmorgen, V. Vapnik, Predicting time series with support vector machines, in: International Conference on Artificial Neural Networks, Springer, 1997, pp. 999–1004.
[13] C.-H. Wu, J.-M. Ho, D.-T. Lee, Travel-time prediction with support vector regression, IEEE Transactions on Intelligent Transportation Systems 5 (2004) 276–281.
[14] M. Clerc, Beyond standard particle swarm optimisation, in: Innovations and Developments of Swarm Intelligence Applications, IGI Global, 2012, pp. 1–19.
[15] S. J. Taylor, B. Letham, Forecasting at scale, The American Statistician 72 (2018) 37–45.
[16] A. C. Harvey, S. Peters, Estimation procedures for structural time series models, Journal of Forecasting 9 (1990) 89–108.