Determining the Scale and Rotation Angle for Long-Term Object Tracking in Video
DOI:
https://doi.org/10.15407/intechsys.2025.05.022
Keywords:
object tracking, BRISK key points, KCF tracking algorithm, HOG features, scale and rotation angle of an object in an image, object detection and recognition
Abstract
Introduction. The task of tracking is to determine the position of an arbitrary object (target) in a video after it has been detected in the initial frame. Tracking algorithms are divided into short-term and more complex long-term ones. A key problem in long-term tracking is recovering the target after a period of absence or a tracking failure, because during this time not only the coordinates but also the scale and rotation angle of the target can change significantly; knowing these parameters increases the accuracy and reliability of detection. As a result, after a long disappearance the target must be searched for not locally but across the entire image and over wide intervals of possible scale and rotation angle. The reliability of tracking in video therefore largely depends on the efficiency (accuracy and low computational complexity) of the algorithms used to determine the scale and rotation angle of the tracked target in the images. Known algorithms determine the scale and rotation angle from correspondences between key points (KPs) of the target without sufficiently considering background KPs, and they can support tracking only when the target is absent briefly, so that the scale and rotation angle change little.
Purpose. The purpose of the research is to develop an algorithm for determining the scale and rotation angle of an object that overcomes these shortcomings and yields more reliable object tracking in difficult conditions.
Methods. Methods for detecting key points and determining their correspondences between images were used.
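As an illustration of what determining KP correspondences can involve, the sketch below matches binary descriptors (BRISK descriptors are binary) by Hamming distance with a nearest-neighbour ratio test. It is a minimal sketch under stated assumptions, not the matching procedure of the paper: the function names, the `ratio` and `max_dist` thresholds, and the representation of descriptors as `bytes` are all hypothetical choices made for the example.

```python
from typing import List, Tuple


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptors(
    model: List[bytes],
    frame: List[bytes],
    ratio: float = 0.8,   # assumed ratio-test threshold (illustrative)
    max_dist: int = 64,   # assumed absolute distance limit (illustrative)
) -> List[Tuple[int, int]]:
    """Return (model_index, frame_index) pairs that pass both tests.

    Brute-force search: every model descriptor is compared with every
    frame descriptor, so the cost is O(len(model) * len(frame)).
    """
    matches = []
    for i, d in enumerate(model):
        dists = sorted((hamming(d, f), j) for j, f in enumerate(frame))
        if not dists:
            continue
        best_dist, best_j = dists[0]
        # Reject matches that are too far apart in descriptor space.
        if best_dist > max_dist:
            continue
        # Ratio test: the best match must be clearly better than the second best.
        if len(dists) > 1 and best_dist >= ratio * dists[1][0]:
            continue
        matches.append((i, best_j))
    return matches
```

The ratio test discards ambiguous matches, which matters when the model contains both object and background KPs that may look alike.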
Results. An algorithm is proposed for estimating the scale and rotation angle of the tracked object in images by finding, in each frame, the KPs that correspond to the KPs of the object model M. The algorithm is intended mainly for conditions in which changes in scale and rotation angle result chiefly from camera movement or operator actions. Such changes are largely correlated with changes in the background, which is typical of video surveillance from an aircraft, in particular a UAV. The advantages of the algorithm are that it is relatively more resistant to errors in determining corresponding KP pairs and that it can estimate the scale and rotation angle even during a prolonged absence of the object from the video.
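To make the idea of estimating scale and rotation from corresponding KPs concrete, the following sketch uses a common pairwise-median estimate: for every pair of matched points it compares the segment between them in the model with the same segment in the frame, then takes the median length ratio and the median angle difference. This is an assumption-laden illustration, not the algorithm proposed in the paper; the function name and the point representation are hypothetical, and the same routine could be applied to object KPs, background KPs, or both.

```python
import math
from itertools import combinations
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]


def estimate_scale_rotation(
    model_pts: List[Point], frame_pts: List[Point]
) -> Tuple[float, float]:
    """Estimate (scale, rotation in radians) from matched point pairs.

    model_pts[k] and frame_pts[k] are assumed to be corresponding KPs.
    The median makes the estimate tolerant of a minority of wrong pairs.
    """
    if len(model_pts) != len(frame_pts):
        raise ValueError("point lists must have the same length")
    ratios, angles = [], []
    for i, j in combinations(range(len(model_pts)), 2):
        mx = model_pts[j][0] - model_pts[i][0]
        my = model_pts[j][1] - model_pts[i][1]
        fx = frame_pts[j][0] - frame_pts[i][0]
        fy = frame_pts[j][1] - frame_pts[i][1]
        m_len = math.hypot(mx, my)
        f_len = math.hypot(fx, fy)
        if m_len < 1e-9 or f_len < 1e-9:
            continue  # skip coincident points, their direction is undefined
        ratios.append(f_len / m_len)
        d = math.atan2(fy, fx) - math.atan2(my, mx)
        # Wrap the difference into (-pi, pi]. A plain median of wrapped
        # angles is only reliable when the true rotation is not near +/-pi.
        angles.append(math.atan2(math.sin(d), math.cos(d)))
    if not ratios:
        raise ValueError("need at least two distinct matched points")
    return median(ratios), median(angles)
```

Translation cancels out because only differences between points are used, so the estimate depends on scale and rotation alone.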
Conclusions. The paper uses a model of the tracked object that consists of KPs on the object and in the background to search for KP correspondences. An algorithm is proposed for determining the scale and rotation angle of the object both when it is present in the images and when it is absent, in order to update M and to detect the object after it reappears with significantly changed parameters. Examples are given of using the algorithm for long-term tracking with the proposed criterion for the presence of the object, as well as two methods of updating M, for the cases when the object is present and when it is absent.
License
Copyright (c) 2025. The copyright holder is the publisher of the paper (the Institute of Information Technologies and Systems of the NAS of Ukraine) and/or the publisher of the paper (PH "Akademperiodika" of the NAS of Ukraine), to which the Institute of Information Technologies and Systems of the NAS of Ukraine, on the basis of a sublicense publishing agreement, granted the right to publish the work and the right to indicate the publisher after the copyright sign.

This work is Open Access and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).