Title: Real-time human pose estimation and tracking on monocular videos: A systematic literature review
Authors: Chen Y; Feng Z; Paes D; Nilsson D; Lovreglio R
Record dates: 2025-11-05; 2025-11-05; 2025-11-28
Citation: Chen Y, Feng Z, Paes D, Nilsson D, Lovreglio R. (2025). Real-time human pose estimation and tracking on monocular videos: A systematic literature review. Neurocomputing. 655.
Type: Journal article
ISSN: 0925-2312
E-ISSN: 1872-8286
DOI: 10.1016/j.neucom.2025.131309
Article number: 131309
PII: S0925231225019812
URI: https://mro.massey.ac.nz/handle/10179/73759
License: CC BY-NC 4.0 (https://creativecommons.org/licenses/by-nc/4.0/)
Rights: (c) 2025 The Author/s
Keywords: Real-time; Human pose estimation; Human pose tracking; Deep learning; 2D and 3D pose; Monocular optical videos

Abstract: Real-time human pose estimation and tracking on monocular videos is a fundamental task in computer vision with a wide range of applications. Recently, benefiting from deep learning-based methods, it has received impressive progress in performance. Although some works have reviewed and summarised the advancements in this field, few have specifically focused on real-time performance and monocular video-based solutions. The goal of this review is to bridge this gap by providing a comprehensive understanding of real-time monocular video-based human pose estimation and tracking, encompassing both 2D and 3D domains, as well as single-person and multi-person scenarios. To achieve this objective, this paper systematically reviews 68 papers published between 2014 and 2024 to answer six research questions. This review brings new insights into computational efficiency measures and hardware configurations of existing methods. Additionally, this review provides a deep discussion on trade-off strategies for accuracy and efficiency in real-time systems. Finally, this review highlights promising directions for future research and provides practical solutions for real-world applications.