References

vestifm

Proceedings of the National Academy of Sciences of Belarus. Physics and Mathematics Series

ISSN 1561-2430, 2524-2415

The Republican Unitary Enterprise Publishing House "Belaruskaya Navuka"

10.29235/1561-2430-2025-61-3-253-264

vestifm-853

Research Article

INFORMATICS

A remote sensing image object detection method ABS-YOLO based on improved YOLOv11

Wu Xianyi – Postgraduate Student

https://orcid.org/0009-0003-6976-5386

4, Nezavisimosti Ave., 220030, Minsk

tigerv5872@gmail.com

Sergey V. Ablameyko – Academician of the National Academy of Sciences of Belarus, Dr. Sc. (Engineering), Professor

https://orcid.org/0000-0001-9404-1206

6, Surganov Str., 220012, Minsk

ablameyko@yandex.by

Belarusian State University

United Institute of Informatics Problems of the National Academy of Sciences of Belarus

2025, vol. 61, no. 3, pp. 253–264

14.10.2025

2025

Xianyi W., Ablameyko S.V.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://vestifm.belnauka.by/jour/article/view/853

This paper studies object detection in Earth remote sensing images, a task that is important for agricultural monitoring, urban planning, early warning of natural disasters, and other applications. Because objects in such images vary widely in size, backgrounds are complex, and small objects are densely distributed, detectors often miss a high percentage of objects and locate the remaining ones with insufficient accuracy. To address this, ABS-YOLO, an improved method based on YOLOv11, is proposed. It significantly improves detection performance by integrating Averaged Convolution (AConv), a Bidirectional Weighted Feature Pyramid Network (BiFPN), and the Swin Transformer attention mechanism. Experimental results on the NWPU VHR-10 dataset show that, compared with YOLOv11, ABS-YOLO increases mAP50 by 3.9 % and mAP50-95 by 2.6 %, with significant improvements in precision and recall. The proposed improvements allow the method to balance efficiency and accuracy in remote sensing object detection.
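The weighted feature fusion that BiFPN relies on can be illustrated with a short sketch. The code below is not the authors' implementation; it assumes the "fast normalized fusion" rule from the EfficientDet paper, where each input feature gets a learnable non-negative weight and the weighted sum is divided by the sum of the weights. The function name `fast_normalized_fusion`, the `eps` value, and the toy feature values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors with normalized non-negative weights.

    Assumed rule (EfficientDet-style fast normalized fusion):
        out = sum_i(w_i * x_i) / (eps + sum_j(w_j)),  with w_i >= 0
    """
    if len(features) != len(raw_weights):
        raise ValueError("one weight per input feature is required")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same length")

    # ReLU keeps every learnable weight non-negative.
    weights = [max(0.0, w) for w in raw_weights]
    # eps avoids division by zero when all weights are clipped to 0.
    norm = eps + sum(weights)

    return [
        sum(w * f[k] for w, f in zip(weights, features)) / norm
        for k in range(length)
    ]


# Two toy feature maps flattened to vectors (hypothetical values).
p_high = [1.0, 2.0, 3.0]
p_low = [3.0, 2.0, 1.0]

fused_equal = fast_normalized_fusion([p_high, p_low], raw_weights=[1.0, 1.0])
fused_clipped = fast_normalized_fusion([p_high, p_low], raw_weights=[1.0, -5.0])
print(fused_equal)
print(fused_clipped)
```

With equal weights the two inputs contribute equally, while a negative raw weight is clipped to zero so that the corresponding input is ignored. In a real detector the features would be tensors of matching resolution and the weights would be trained together with the rest of the network.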

Keywords: YOLOv11, Swin Transformer, remote sensing image, object detection


The authors declare that they have no conflicts of interest.