
Proceedings of the National Academy of Sciences of Belarus. Physics and Mathematics Series


A remote sensing image object detection method ABS-YOLO based on improved YOLOv11

https://doi.org/10.29235/1561-2430-2025-61-3-253-264

Abstract

This paper studies object detection in Earth remote sensing images, a task that is important for agricultural monitoring, urban planning, and early warning of natural disasters. Because objects in such images vary in size, appear against complex backgrounds, and are often small and densely distributed, detectors frequently miss a high percentage of objects and localize the remaining ones with insufficient accuracy. To address this, ABS-YOLO, an improved version of YOLOv11, is proposed. It integrates Averaged Convolution (AConv), a Bidirectional Weighted Feature Pyramid (BiFPN), and a Swin Transformer attention mechanism. Experimental results on the NWPU VHR-10 dataset show that, compared with YOLOv11, ABS-YOLO achieves a 3.9 % increase in mAP50 and a 2.6 % increase in mAP50-95, together with significant improvements in precision and recall. The proposed improvements thus allow the method to balance efficiency and accuracy in remote sensing object detection.
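As an illustration of the weighted feature fusion that BiFPN is known for, the sketch below implements the "fast normalized fusion" rule from the EfficientDet paper in plain Python. The abstract does not state which fusion variant ABS-YOLO uses, so this rule, the function name `fast_normalized_fusion`, and the default `eps` are assumptions made for illustration only, not the authors' implementation.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Hypothetical helper for illustration: each output element is
    sum_i(w_i * f_i) / (eps + sum_i(w_i)), with every w_i clipped at zero.
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU-style clipping keeps each learnable weight non-negative.
    clipped = [max(0.0, w) for w in weights]
    # eps avoids division by zero when every weight is clipped to zero.
    total = sum(clipped) + eps

    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


# Two toy "feature maps" flattened to vectors, e.g. from two pyramid levels.
shallow = [1.0, 2.0]
deep = [3.0, 4.0]
fused = fast_normalized_fusion([shallow, deep], [1.0, 1.0], eps=0.0)
```

With equal weights the result is the element-wise mean of the inputs; in a trained network the weights would be learned, letting the model emphasize whichever scale is more informative.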

About the Authors

Wu Xianyi
Belarusian State University
Belarus

Wu Xianyi – Postgraduate Student

4, Nezavisimosti Ave., 220030, Minsk



S. V. Ablameyko
United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus

Sergey V. Ablameyko – Academician of the National Academy of Sciences of Belarus, Dr. Sc. (Engineering), Professor

6, Surganov Str., 220012, Minsk

 





Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1561-2430 (Print)
ISSN 2524-2415 (Online)