TRAINING NEURAL NETWORKS USING REINFORCEMENT LEARNING FOR REACTIVE PATH PLANNING
Abstract
Mobile robots are devices experiencing rapid growth given the possibilities their applications offer, particularly autonomous robots that do not require an operator to perform their functions. Consolidating this autonomy requires a path planning system that produces a feasible and, as far as possible, optimal route. This study develops a reactive two-dimensional path planning method using neural networks trained with reinforcement learning. The complexity of the scenario between the start and goal points arises from warning zones and forbidden obstacle zones, and the experiments are carried out on different neural network architectures, each acting as the agent of the reinforcement learning algorithm, of the DQN and DDQN types. The best results are obtained with DDQN training, which reaches the goal in 89% of the validation episodes, although the DQN method proves to be 15.63% faster in its successful cases. This work was carried out within the DIGITI research group of the Universidad Distrital Francisco José de Caldas.
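Since the abstract contrasts DQN and DDQN agents, the following minimal Python sketch illustrates the single point where the two algorithms differ: how the learning target for a transition is computed. This is an illustrative sketch only, not the authors' implementation; the four-action grid world, the function names, and the sample Q-values are assumptions introduced for the example.

import numpy as np

def dqn_target(reward, gamma, done, q_target_next):
    """Standard DQN: the target network both selects and evaluates
    the next action, which tends to overestimate Q-values."""
    return reward + gamma * (1.0 - done) * np.max(q_target_next)

def ddqn_target(reward, gamma, done, q_online_next, q_target_next):
    """Double DQN: the online network selects the next action and the
    target network evaluates it, reducing the overestimation bias."""
    a_star = np.argmax(q_online_next)  # action chosen by the online network
    return reward + gamma * (1.0 - done) * q_target_next[a_star]

# Hypothetical Q-value estimates for four grid actions (up, down, left, right)
q_online_next = np.array([0.2, 0.9, 0.1, 0.4])  # online network, next state
q_target_next = np.array([0.3, 0.5, 0.2, 0.8])  # target network, next state
reward, gamma, done = -0.1, 0.99, 0.0           # step penalty, discount, non-terminal

print(dqn_target(reward, gamma, done, q_target_next))                  # max over target net
print(ddqn_target(reward, gamma, done, q_online_next, q_target_next))  # target net at online argmax

In DQN the same network selects and evaluates the next action, the overestimation bias that DDQN was introduced to mitigate by decoupling selection (online network) from evaluation (target network).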