Data-driven Policy on Feasibility Determination for the Train Shunting Problem (Extended Abstract) ? Paulo Roberto de Oliveira da Costa1 , Jason Rhuggenaath1 ( ), Yingqian Zhang1 , Alp Akcay1 , Wan-Jui Lee2 , and Uzay Kaymak1 1 Eindhoven University of Technology 2 Dutch Railways Introduction. Parking, matching, scheduling, and routing are common problems in train maintenance. In particular, train units are commonly maintained and cleaned at dedicated shunting yards. The planning problem that results from such situations is referred to as the Train Unit Shunting Problem (TUSP). This problem involves matching arriving train units to service tasks and determining the schedule for departing trains. The TUSP is an important problem as it is used to determine the capacity of shunting yards and arises as a sub-problem of more general scheduling and planning problems. In this paper, we consider the case of the Dutch Railways (NS) TUSP. As the TUSP is complex, NS currently uses a local search (LS) heuristic to determine if an instance of the TUSP has a feasible solution. Given the number of shunting yards and the size of the planning problems, improving the evaluation speed of the LS brings significant computational gain. In this work, we use a machine learning approach that complements the LS and accelerates the search process. We use a Deep Graph Convolutional Neural Network (DGCNN) model to predict the feasibility of solutions obtained during the run of the LS heuristic. We use this model to decide whether to continue or abort the search process. In this way, the computation time is used more efficiently as it is spent on instances that are more likely to be feasible. Using simulations based on real-life instances of TUSP, we show how our approach improves upon the previous method on prediction accuracy and leads to computational gains for the decision-making process. Methodology. Our methodology consists of three main steps: (i) generation of starting solutions; (ii) a prediction model for feasibility of solutions and (iii) a policy for interation with the LS. A schematic view of our approach can be found in [1]. Each intermediate solution is represented as an activity graph of maintenance activities. We then use a Deep Graph Convolutional Neural Network (DGCNN) [2] as a feature extractor to train a model that predicts the future feasibility of each solution. Afterwards, we combine the predictions with the LS to derive a policy that decides to terminate the LS run based on the sequence of intermediate solutions seen so far. Main results and insights. We evaluate the classification performance of the graph classification models (Table 1). We consider three types of prediction models: (i) DGCNN-IS, (ii) DGCNN-IS-T and (iii) DGCNN-MS-T. DGCNN-IS ? Copyright c 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). The full paper can be found in [1]. uses only node labels and initial solutions during training. DGCNN-IS-T is the same as DGCNN-IS but with additional temporal features. DGCNN-MS-T uses additional node temporal features and the first 10% of the graphs in each instance of a LS run for training. All models consider as class labels the final feasibility status at the end of an LS run. We perform inference on the initial solutions on the test dataset. Method DGCNN-IS DGCNN-IS-T DGCNN-MS-T ACC(%) 54.10 ± 2.37 60.01 ± 1.67 59.96 ± 1.68 TPR(%) 52.93 ± 11.47 58.25 ± 9.10 82.08 ± 2.53 TNR(%) 55.28 ± 10.27 60.38 ± 7.37 37.86 ± 4.99 Table 1. Different models. Adding time features improves performance by 10%. The main insight is that temporal features add important information for the classification of feasible instances. These results motivate us to consider a variant DGCNN-MS-T-W of DGCNN-MS-T that can be used alongside the local search procedure. DGCNN-MS-T-W is similar to DGCNN-MS-T except that it predicts feasibility of solutions that are W iterations ahead. Given an instance of TUSP, we use the following threshold policy (see [1] for more details): 1. Start the LS with the initial solution and run it for K iterations. 2. For each iteration i where 0 ≤ i ≤ K, we pass the current solution Gi to DGCNN-MS-T-W to get a feasibility score φ(Gi ). 3. If the average of the feasibility scores φ(Gi ) seen up to iteration K falls below some threshold 0 < αIF < 1, we stop the LS and classify the instance as infeasible. Prediction Outcome 0.506 Infeasible 0: infeasible 1: feasible Correct Feasible Incorrect 0.504 0.502 68.4% Feasibility Score 543 250 0 0.500 33.9% 15.6% 31.6% Actual Value 0.498 0.496 284 523 64.8% 0.494 1 17.7% 32.7% 35.2% 0.492 Correct 65.6% 67.6% 66.6% 0 50 100 150 200 250 300 350 400 Number of Iterations Incorrect 34.4% 32.4% 33.4% Fig. 1. Moving average over the last 10 Table 2. Confusion matrix of iterations of the feasibility scores φ(Gi ) one of the folds for the final averaged over cross-validation runs of the classification of DGCNN-MS-T- LS procedure W, with W = 150, K = 200, αIF = 0.5. Our results (see also Fig. 1 and Table 2) indicate that we can achieve a 8% reduction in running time for a single instance, which can account for roughly 20 hours in total real time. References 1. de O. da Costa, P.R., Rhuggenaath, J., Zhang, Y., Akcay, A., Lee, W., Kaymak, U.: Data-driven policy on feasibility determination for the train shunting problem. In: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD (2019), to appear 2. Zhang, M., Cui, Z., Neumann, M., Chen, Y.: An end-to-end deep learning architecture for graph classification. In: AAAI-18. pp. 4438–4445 (2018)