Enhanced Neural Network Learning In Imagination of Systems for Controlling Unmanned Movements of Objects
DOI:
https://doi.org/10.15407/intechsys.2025.04.003
Keywords:
unmanned moving objects, deep reinforcement learning, World Models, mental model of the world, neural networks, agent learning, multi-agent environment, recurrent state space model
Abstract
Introduction. Artificial intelligence based on recurrent neural networks (RNNAI) is intended to learn to think in a way that resembles the human brain. Large recurrent neural networks can learn spatial and temporal representations of data. Model-based deep reinforcement learning requires high-dimensional input data to arrive at an optimal agent policy and an accurate representation of the environment.
However, reinforcement learning (RL) runs into a familiar large-data problem: its algorithms rarely cope well with high-dimensional data. First, traditional RL algorithms have difficulty learning the weights of a large model because of the credit assignment problem, especially when the outcome is known only at the end of a long sequence of steps. Credit assignment is the problem of determining which steps should be rewarded or penalized for the final result. Second, the performance of physical simulators, whose task is to predict changes in the environment, is low. In classic model-based RL, optimal actions are selected by a separate planning procedure that simulates many random actions and picks the best one. However, for high-dimensional problems and long horizons, the number of possible action sequences becomes too large to enumerate. The author therefore explores an approach that uses a mental model of the world as the environment model. Neural networks in this approach are trained in a way loosely analogous to learning in the human brain. The approach achieves the control goal by constructing a model of the world instead of conducting costly real-world testing.
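The planning procedure described above (simulate many random action sequences in a model and keep the best one) can be illustrated with a minimal random-shooting sketch. Everything here is an assumption made for illustration and is not taken from the article: the one-dimensional `step` model, the three-valued action set, and the function names are hypothetical.

```python
import random

def step(state, action):
    """Hypothetical 1-D environment model: the action shifts the position."""
    next_state = state + action
    reward = -abs(next_state)  # higher reward the closer we are to the origin
    return next_state, reward

def rollout_return(state, actions):
    """Simulate a sequence of actions in the model and sum the rewards."""
    total = 0.0
    for a in actions:
        state, r = step(state, a)
        total += r
    return total

def random_shooting(state, horizon=5, n_candidates=200, rng=None):
    """Sample random action sequences; return the first action of the best one."""
    rng = rng or random.Random(0)
    best_return, best_actions = float("-inf"), None
    for _ in range(n_candidates):
        actions = [rng.choice([-1, 0, 1]) for _ in range(horizon)]
        ret = rollout_return(state, actions)
        if ret > best_return:
            best_return, best_actions = ret, actions
    return best_actions[0], best_return

def n_sequences(n_actions, horizon):
    """Number of distinct action sequences an exhaustive planner would face."""
    return n_actions ** horizon
```

With three actions, `n_sequences(3, 3)` is 27, while `n_sequences(3, 50)` already exceeds 10^23, which is why sampling or enumeration stops scaling as the horizon grows.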
Purpose. The purpose of this article is to analyze current international experience in developing and implementing analytical platforms for controlling moving objects in single- and multi-agent systems. Control is achieved using artificial intelligence built on deep neural networks, with the model trained through reinforcement learning in unknown, partially observable environments.
Methods. An approach to controlling moving objects in single- and multi-agent systems using neural networks and a mental model of the world is considered.
Results. An approach to controlling moving objects using neural networks and a mental model of the world is investigated. This article analyzes international experience in the development and application of artificial intelligence tools, specifically deep reinforcement learning, to solve problems of moving object behavior in unknown, partially observable environments.
Conclusions. Based on this analysis, the author proposes the application of a well-known approach based on deep reinforcement learning to the problem of controlling moving objects. This approach achieves the control goal by constructing a model representation of the world instead of conducting costly real-world testing.
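The idea of reaching the control goal inside a learned model rather than through repeated real-world trials can be sketched as follows. This is a deliberately simplified illustration under stated assumptions, not the method of the article or of the World Models paper: a two-parameter linear model fitted by gradient descent stands in for the neural world model, a search over a few candidate gains stands in for controller optimization, and `real_env_step` with its dynamics is hypothetical.

```python
import random

def real_env_step(s, a):
    """Stand-in for the costly real system (assumed dynamics: s' = 0.9*s + a)."""
    return 0.9 * s + a

def collect(n, rng):
    """Gather a small batch of real transitions using random states and actions."""
    data = []
    for _ in range(n):
        s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((s, a, real_env_step(s, a)))
    return data

def fit_linear_model(data, lr=0.5, epochs=500):
    """Fit s' ~ w_s*s + w_a*a by gradient descent on the mean squared error."""
    w_s, w_a = 0.0, 0.0
    for _ in range(epochs):
        g_s = g_a = 0.0
        for s, a, s_next in data:
            err = w_s * s + w_a * a - s_next
            g_s += err * s
            g_a += err * a
        w_s -= lr * g_s / len(data)
        w_a -= lr * g_a / len(data)
    return w_s, w_a

def imagined_cost(model, gain, s0=1.0, horizon=10):
    """Roll out the policy a = -gain*s inside the learned model only."""
    w_s, w_a = model
    s, cost = s0, 0.0
    for _ in range(horizon):
        a = -gain * s
        s = w_s * s + w_a * a
        cost += s * s  # penalize distance from the target state 0
    return cost

def train_in_imagination(model, gains):
    """Pick the candidate gain with the lowest imagined cost."""
    return min(gains, key=lambda g: imagined_cost(model, g))

rng = random.Random(0)
model = fit_linear_model(collect(50, rng))
best_gain = train_in_imagination(model, [0.0, 0.3, 0.6, 0.9, 1.2])
```

Only the 50 transitions in `collect` touch the real system; every policy evaluation afterwards runs in the fitted model, which is the cost saving the approach relies on.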
References
Oursatyev, O., & Volkov, O. Approaches to Creating Multiagent Systems and Deep Reinforcement Learning of Drones. Information Technologies and Systems, 3(3), 30–55. https://doi.org/10.15407/intechsys.2025.03.030
Ha D., Schmidhuber J. World Models. Can agents learn inside of their own dreams? NIPS 2018, March 27 2018, Oral Presentation. https://doi.org/10.5281/zenodo.1207631
Schmidhuber J. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models, 2015, 36 p. https://doi.org/10.48550/arXiv.1511.09249
Gronauer S., Diepold K. Multi-agent deep reinforcement learning: a survey. Artificial Intelligence Review, 2021, 1–49. URL: https://link.springer.com/article/10.1007/s10462-021-09996-w
Schmidhuber J. Deep Learning in Neural Networks: An Overview. Neural Networks, 2015, Vol. 61, 85–117. https://doi.org/10.1016/j.neunet.2014.09.003
Sutton R. S., Barto A. G. Reinforcement Learning: An Introduction. A Bradford Book, The MIT Press Cambridge, Massachusetts, London, England, 2015, 1–337. URL: https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf
Schmidhuber J. Making the World Differentiable: On Using Self-Supervised Fully Recurrent Neural Networks for Dynamic Reinforcement Learning and Planning in Non-Stationary Environments. IDSIA, 1990. URL: https://people.idsia.ch/~juergen/FKI-126-90_(revised)bw_ocr.pdf
Schmidhuber J. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments. IJCNN International Joint Conference on Neural Networks, 1990, Vol. 2, 253–258. https://doi.org/10.1109/IJCNN.1990.137723
Schmidhuber J. Reinforcement Learning in Markovian and Non-Markovian Environments. IDSIA, 1991. URL: https://sferics.idsia.ch/pub/juergen/nipsnonmarkov.pdf
Schmidhuber J. A Possibility for Implementing Curiosity and Boredom in Model-building Neural Controllers. The First International Conference on Simulation of Adaptive Behavior on From Animals to Animats, 1990, 222–227. MIT Press/Bradford Books, 1991. https://doi.org/10.7551/mitpress/3115.003.0030
Arulkumaran K. et al. Deep reinforcement learning: A brief survey, 2017. https://doi.org/10.1109/MSP.2017.2743240
Kingma D. P. and Welling M. Auto-Encoding Variational Bayes. Cornell University, 2013. URL: https://pure.uva.nl/ws/files/2511146/162970_1312.6114v10.pd.pdf
Hansen N. The CMA Evolution Strategy: A Tutorial. 2016, 1–39. URL: https://arxiv.org/abs/1604.00772v2
Kaiser L. et al. Model-Based Reinforcement Learning for Atari. ICLR 2020, 1–28. URL: https://arxiv.org/abs/1903.00374
Hessel M. et al. Rainbow: Combining Improvements in Deep Reinforcement Learning. AAAI 2018. https://doi.org/10.1609/aaai.v32i1.11796
Hafner D. et al. Mastering Atari with Discrete World Models. ICLR 2021, 1–26. URL: https://arxiv.org/abs/2010.02193
Mastering Atari with Discrete World Models. February 18, 2021, Posted by Hafner D., Google Research. URL: https://research.google/blog/mastering-atari-with-discrete-world-models/
Oursatyev O. Data Research in Industrial Data Mining Projects in the Big Data Generation Era. Control Systems and Computers, Issue 3, 33–54. [In Ukrainian: Урсатьєв О.А., Дослідження даних у промислових data-mining-проєктах в епоху генерації великих даних] https://doi.org/10.15407/csc.2023.03.033
Hafner D. et al. Learning Latent Dynamics for Planning from Pixels, 2018. URL: https://arxiv.org/abs/1811.04551
Introducing PlaNet: A Deep Planning Network for Reinforcement Learning, Febr. 2019, Posted by Danijar Hafner. URL: https://research.google/blog/introducing-planet-a-deep-planning-network-for-reinforcement-learning/
Introducing Dreamer: Scalable Reinforcement Learning Using World Models. March, 2020. Posted by Danijar Hafner. URL: https://research.google/blog/introducing-dreamer-scalable-reinforcement-learning-using-world-models/
Sutton R. S., Barto A. G. Reinforcement Learning: An Introduction. MIT Press, Cambridge, Massachusetts, London, England, 2018, 526 p. URL: https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
Egorov V., Shpilman A. Scalable Multi-Agent Model-Based Reinforcement Learning. The 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 381–390. https://dl.acm.org/doi/abs/10.5555/3535850.3535894
Egorov V., Shpilman A. Scalable Multi-Agent Model-Based Reinforcement Learning. ArXiv, 2022. URL: https://arxiv.org/abs/2205.15023v1
Sunehag P. et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning. ArXiv, 2017. URL: https://arxiv.org/abs/1706.05296
Vaswani A. et al. Attention Is All You Need. ArXiv, 2017. URL: https://arxiv.org/abs/1706.03762
Bahdanau D., Cho K., Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. ArXiv. URL: https://arxiv.org/abs/1409.0473
Cho K. et al. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. https://doi.org/10.3115/v1/D14-1179
License
Copyright (c) 2025 Information Technologies and Systems

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The paper is Open Access under the CC BY-NC-ND 4.0 license (Attribution-NonCommercial-NoDerivatives 4.0 International).