Transformer-based multiple instance learning network with 2D positional encoding for histopathology image classification

Yang B; Ding L; Li J; Li Y; Qu G; Wang J; Wang Q; Liu B

doi:10.1007/s40747-025-01779-y

dc.citation.issue	5
dc.citation.volume	11
dc.contributor.author	Yang B
dc.contributor.author	Ding L
dc.contributor.author	Li J
dc.contributor.author	Li Y
dc.contributor.author	Qu G
dc.contributor.author	Wang J
dc.contributor.author	Wang Q
dc.contributor.author	Liu B
dc.date.accessioned	2025-05-27T23:48:59Z
dc.date.available	2025-05-27T23:48:59Z
dc.date.issued	2025-05
dc.description.abstract	Digital medical imaging, particularly pathology imaging, is essential for cancer diagnosis, but the extremely high resolution of the images makes direct model training difficult. Although weakly supervised learning has reduced the need for manual annotations, many multiple instance learning (MIL) methods struggle to effectively capture crucial spatial relationships in histopathological images. Existing methods incorporating positional information often overlook nuanced spatial correlations or use positional encoding strategies that do not fully capture the unique spatial dynamics of pathology images. To address this issue, we propose a new framework named TMIL (Transformer-based Multiple Instance Learning Network with 2D positional encoding), which leverages multiple instance learning for weakly supervised classification of histopathological images. TMIL incorporates a Transformer-based 2D positional encoding module to model positional information and explore correlations between instances. Furthermore, TMIL divides histopathological images into pseudo-bags and trains patch-level feature vectors with deep metric learning to enhance classification performance. Finally, the proposed approach is evaluated on a public colorectal adenoma dataset. The experimental results show that TMIL outperforms existing MIL methods, achieving an AUC of 97.28% and an ACC of 95.19%. These findings suggest that TMIL’s integration of deep metric learning and positional encoding offers a promising approach for improving the efficiency and accuracy of pathology image analysis in cancer diagnosis.
dc.description.confidential	false
dc.edition.edition	May 2025
dc.identifier.citation	Yang B, Ding L, Li J, Li Y, Qu G, Wang J, Wang Q, Liu B. (2025). Transformer-based multiple instance learning network with 2D positional encoding for histopathology image classification. Complex and Intelligent Systems. 11(5). 218.
dc.identifier.doi	10.1007/s40747-025-01779-y
dc.identifier.eissn	2198-6053
dc.identifier.elements-type	journal-article
dc.identifier.issn	2199-4536
dc.identifier.number	218
dc.identifier.pii	s40747-025-01779-y
dc.identifier.uri	https://mro.massey.ac.nz/handle/10179/72953
dc.language	English
dc.publisher	Springer Nature Switzerland AG
dc.publisher.uri	https://link.springer.com/article/10.1007/s40747-025-01779-y
dc.relation.isPartOf	Complex and Intelligent Systems
dc.rights	(c) 2025 The Author/s
dc.rights	CC BY-NC-ND 4.0
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Weakly supervised training
dc.subject	Image classification
dc.subject	Multiple instance learning
dc.title	Transformer-based multiple instance learning network with 2D positional encoding for histopathology image classification
dc.type	Journal article
pubs.elements-id	500303
pubs.organisational-group	Other
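
The abstract mentions two mechanisms that a short example can make concrete: attaching a 2D position to each patch and splitting a slide into pseudo-bags. The sketch below is illustrative only and is not the authors' implementation. The abstract does not specify the encoding scheme, so the code assumes the common sinusoidal form, with half of the channels encoding the patch row and half the patch column. The random partition into pseudo-bags is likewise an assumption, and all function names are hypothetical.

```python
import math
import random


def sincos_1d(pos, dim):
    """Sinusoidal encoding of one integer position into `dim` values (dim must be even)."""
    out = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        out.append(math.sin(pos * freq))
        out.append(math.cos(pos * freq))
    return out


def positional_encoding_2d(row, col, dim):
    """Concatenate a row encoding and a column encoding, each of size dim // 2.

    Assumed scheme: the record's abstract does not state which 2D encoding TMIL uses.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    return sincos_1d(row, half) + sincos_1d(col, half)


def split_into_pseudo_bags(patches, n_bags, seed=0):
    """Randomly partition one slide's patches into n_bags disjoint pseudo-bags.

    Assumed strategy: a shuffled round-robin split; each pseudo-bag would
    inherit the slide-level label during weakly supervised training.
    """
    rng = random.Random(seed)
    shuffled = list(patches)
    rng.shuffle(shuffled)
    return [shuffled[i::n_bags] for i in range(n_bags)]


# Hypothetical slide: a 4 x 6 grid of patches identified by (row, col).
patches = [(r, c) for r in range(4) for c in range(6)]
bags = split_into_pseudo_bags(patches, n_bags=3)
encoding = positional_encoding_2d(row=2, col=5, dim=8)
```

In a Transformer-based MIL model, an encoding of this kind would typically be added to each patch's feature vector before self-attention, so that attention weights can depend on where patches sit in the slide as well as on their appearance.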

Files

Original bundle

Name: 500303 PDF.pdf
Size: 2.92 MB
Format: Adobe Portable Document Format
Description: Preprint version.pdf

Collections

Journal Articles