Enhanced Neural Network Learning In Imagination of Systems for Controlling Unmanned Movements of Objects

O.A. Oursatyev; O.Ye. Volkov

doi:10.15407/intechsys.2025.04.003

Authors

O.A. Oursatyev, Institute of Information Technologies and Systems of NAS of Ukraine, https://orcid.org/0009-0009-8323-0525
O.Ye. Volkov, Institute of Information Technologies and Systems of NAS of Ukraine, https://orcid.org/0000-0002-5418-6723

DOI:

https://doi.org/10.15407/intechsys.2025.04.003

Keywords:

unmanned moving objects, deep reinforcement learning, World Models, mental model of the world, neural networks, agent learning, multi-agent environment, recurrent state space model

Abstract

Introduction. Artificial intelligence based on recurrent neural networks (RNNAI) is intended to learn to think in a way that resembles the human brain. Large recurrent neural networks can learn spatial and temporal representations of data. Model-based deep reinforcement learning needs high-dimensional input data to obtain both an accurate representation of the environment and an optimal action strategy for the agent.

However, reinforcement learning (RL) runs into the familiar problem of large data: its algorithms rarely cope well with high-dimensional inputs. First, traditional RL algorithms have difficulty learning the weights of a large model because of the credit assignment problem, that is, determining which steps in a sequence should be rewarded or penalized for the final result; the difficulty is greatest when the outcome is known only at the end of a sequence of steps. Second, the performance of physical simulators, whose task is to predict changes in the environment, is low. In classic model-based RL, optimal actions are selected by a separate planning procedure that simulates many random actions and picks the best one. With high-dimensional inputs and long action sequences, however, the number of possible actions becomes too large to enumerate. The authors therefore explore an approach that uses a mental model of the world as the environment model. Neural networks in this approach are trained in a way that resembles learning in the human brain. The control goal is achieved by constructing a model of the world instead of conducting costly real-world testing.
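The planning step described above (simulate many random actions in a model and keep the best one) can be illustrated with a minimal random-shooting sketch. This is only a toy illustration under stated assumptions, not the method of the article: the one-dimensional `toy_model`, its reward, the discrete action set and all parameter values are hypothetical.

```python
import random

def toy_model(state, action):
    """Hypothetical 1-D environment model (an assumption for illustration):
    the action shifts the state; reward is higher closer to the origin."""
    next_state = state + action
    reward = -abs(next_state)
    return next_state, reward

def random_shooting(model, state, horizon=5, n_candidates=200, seed=0):
    """Sample random action sequences, simulate each one in the model,
    and return the first action of the best sequence with its return."""
    rng = random.Random(seed)
    best_return, best_first_action = float("-inf"), None
    for _ in range(n_candidates):
        actions = [rng.choice([-1, 0, 1]) for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            s, r = model(s, a)
            total += r
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action, best_return

action, ret = random_shooting(toy_model, state=3)
print(action, ret)
```

With three actions and a horizon of 5 there are only 243 sequences, but the count grows as 3^horizon, which is the enumeration problem the abstract refers to for long chains.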

Purpose. To analyze current international experience in developing and implementing analytical platforms for controlling moving objects in single- and multi-agent systems. Control is provided by artificial intelligence built on deep neural networks, with the model trained through reinforcement learning in unknown, partially observable environments.

Methods. An approach to controlling moving objects in single- and multi-agent systems using neural networks and a mental model of the world is considered.

Results. An approach to controlling moving objects using neural networks and a mental model of the world is investigated. This article analyzes international experience in the development and application of artificial intelligence tools, specifically deep reinforcement learning, to solve problems of moving object behavior in unknown, partially observable environments.

Conclusions. Based on this analysis, the authors propose applying a well-known approach based on deep reinforcement learning to the problem of controlling moving objects. This approach achieves the control goal by constructing a model representation of the world instead of conducting costly real-world testing.
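The idea of replacing costly real-world testing with a learned model can be sketched on a toy problem. The example below is an illustration under assumptions, not the architecture analyzed in the article: the linear plant in `real_step`, the least-squares model in `fit_model` and the proportional controller are all hypothetical stand-ins for the neural world model and the agent. A small batch of real transitions is used to fit the model, after which candidate controllers are compared only inside the model ("in imagination").

```python
import random

def real_step(x, u):
    """Hypothetical 'real' plant, unknown to the agent: x' = 0.9*x + 0.5*u."""
    return 0.9 * x + 0.5 * u

def fit_model(transitions):
    """Least-squares fit of x' ~ a*x + b*u from (x, u, x') samples,
    obtained by solving the 2x2 normal equations directly."""
    sxx = sum(x * x for x, u, y in transitions)
    sxu = sum(x * u for x, u, y in transitions)
    suu = sum(u * u for x, u, y in transitions)
    sxy = sum(x * y for x, u, y in transitions)
    suy = sum(u * y for x, u, y in transitions)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b

def imagined_cost(a, b, gain, x0=1.0, steps=20):
    """Roll out the controller u = -gain*x inside the learned model only."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -gain * x
        x = a * x + b * u
        cost += x * x
    return cost

# Collect a small batch of real experience with random actions.
rng = random.Random(0)
data = []
for _ in range(50):
    x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
    data.append((x, u, real_step(x, u)))

# Fit the model once, then compare controllers without touching the plant.
a_hat, b_hat = fit_model(data)
gains = [g / 10 for g in range(0, 31)]
best_gain = min(gains, key=lambda g: imagined_cost(a_hat, b_hat, g))
print(a_hat, b_hat, best_gain)
```

Only the 50 data-collection steps touch the "real" plant; all 31 controller evaluations run inside the fitted model.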

© Institute of Information Technologies and Systems of the NAS of Ukraine, 2025
© Publisher PH «Akademperiodyka» of the NAS of Ukraine, 2025