Approaches to Creating Multiagent Systems and Deep Reinforcement Learning of Drones

O.A. Oursatyev; O.Ye. Volkov

doi:10.15407/intechsys.2025.03.030

Authors

O.A. Oursatyev Institute of Information Technologies and Systems of NAS of Ukraine https://orcid.org/0009-0009-8323-0525
O.Ye. Volkov https://orcid.org/0000-0002-5418-6723

DOI:

https://doi.org/10.15407/intechsys.2025.03.030

Keywords:

unmanned moving objects, UAVs, UAV swarm control, swarm of UAVs, deep reinforcement learning, DRL, world models, model-based approach to RL, training paradigms, execution scheme

Abstract

Introduction. Unmanned aerial vehicles (UAVs), commonly referred to as drones, are aircraft that fly without a human pilot on board. They are increasingly used for complex and diverse tasks in both civil and military domains. However, a number of problems in UAV development remain unsolved: flight path planning, navigation, and control. In complex systems, which certainly include UAVs, artificial intelligence (AI) is usually applied to solve these problems and to ensure that the system functions as required, most often in the form of deep reinforcement learning (DRL). Recent international experience with analytical platforms for controlling mobile objects, in particular UAVs, shows that deep neural networks can be used for these tasks.

The purpose of the paper is to introduce domain experts whose primary job function is outside machine learning to the challenges of applying AI to these problems, and to robust, complex deep neural networks and their training, which remains difficult and requires large amounts of data and practical experience. Such an introduction can serve as a form of citizen science and will contribute to the replication of research and the democratization of AI.

Results. Solutions to these problems based on deep reinforcement learning are analyzed, in particular control of a UAV swarm, and a taxonomy of model-free deep reinforcement learning algorithms applied to UAV tasks is given. Early solutions that use an environment model are also considered. Unfortunately, almost all of the surveyed works lack verification in real or near-real environmental conditions. The paper gives a brief overview of approaches to reinforcement learning problems, that is, interaction between agents and the environment during step-by-step decision making. The overview covers the application of this approach to moving objects in complex and partially observable environments; model-free and model-based learning; and the mathematical formalization of UAV problems under reinforcement learning, including training paradigms for agents in a multi-agent environment (multi-agent reinforcement learning, MARL). Problems specific to the multi-agent setting are discussed: non-stationarity of the environment from the point of view of a single agent, relative overgeneralization, and credit assignment. The formal concepts underlying multi-agent reinforcement learning are presented.
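
The step-by-step interaction between an agent and its environment, and the model-free learning mentioned above, can be illustrated with a minimal sketch. It is not taken from the paper: the one-dimensional corridor environment, the function names (`step`, `train`, `greedy_policy`), the reward values, and the hyperparameters are all hypothetical choices made for illustration, and tabular Q-learning stands in for the deep model-free algorithms the paper surveys.

```python
import random

N_STATES = 6          # assumed toy world: positions 0..5 on a line, 5 is the goal
ACTIONS = (-1, +1)    # move left or move right

def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small step cost, bonus at the goal
    return next_state, reward, done

def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning: learn action values from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy action selection (ties go to "right")
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # one-step temporal-difference target; no environment model is used
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Best action for every non-terminal state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]

if __name__ == "__main__":
    print(greedy_policy(train()))
```

Under these assumptions the learned greedy policy should move right in every state. The agent never sees the transition rule inside `step`; it only observes states and rewards, which is what distinguishes the model-free setting.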

Conclusions. An overview of methods for solving reinforcement learning problems is presented. On this basis, the authors conclude that further research should focus on model-based solutions and on the design of typical environments that meet the specific conditions of the task. Such models may be easier and faster to adapt to the real environment.
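
To make the model-based direction concrete, here is a minimal sketch under stated assumptions. It is not the authors' method: the corridor environment, the names `real_step`, `learn_model`, and `plan`, and all numeric settings are hypothetical. The agent first records transitions observed in the real environment, then plans by value iteration using only the recorded model.

```python
import random

N_STATES = 6          # assumed toy world: positions 0..5, 5 is the goal
ACTIONS = (-1, +1)    # move left or move right

def real_step(state, action):
    """Hypothetical real environment: returns (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else -0.01
    return next_state, reward

def learn_model(samples=200, seed=0):
    """Build a lookup-table model from randomly sampled real transitions."""
    rng = random.Random(seed)
    model = {}
    for _ in range(samples):
        s = rng.randrange(N_STATES - 1)    # any non-terminal state
        a = rng.choice(ACTIONS)
        model[(s, a)] = real_step(s, a)    # deterministic world: one sample suffices
    return model

def q_value(model, v, s, a, gamma):
    """One-step lookahead through the learned model."""
    next_state, reward = model[(s, a)]
    return reward + gamma * v[next_state]

def plan(model, gamma=0.95, sweeps=50):
    """Value iteration on the learned model; no further real interaction.

    Raises KeyError if some (state, action) pair was never sampled.
    """
    v = [0.0] * N_STATES                   # value of the terminal state stays 0
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            v[s] = max(q_value(model, v, s, a, gamma) for a in ACTIONS)
    return [max(ACTIONS, key=lambda a, s=s: q_value(model, v, s, a, gamma))
            for s in range(N_STATES - 1)]

if __name__ == "__main__":
    print(plan(learn_model()))
```

The design point is that once `learn_model` has run, `plan` can be repeated with a different discount factor or goal without touching the real environment again. A deterministic lookup table is a deliberate simplification of the learned world models the paper discusses.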


© Institute of Information Technologies and Systems of the NAS of Ukraine, 2025
© Publisher PH «Akademperiodyka» of the NAS of Ukraine, 2025