Local-enhanced representation for text-based person search

dc.citation.issue: May 2025
dc.citation.volume: 161
dc.contributor.author: Zhang G
dc.contributor.author: Chen Y
dc.contributor.author: Zheng Y
dc.contributor.author: Martin G
dc.contributor.author: Wang R
dc.date.accessioned: 2025-01-30T01:56:10Z
dc.date.available: 2025-01-30T01:56:10Z
dc.date.issued: 2024-12-12
dc.description.abstract: Text-based person search is a critical task in intelligent security, designed to locate a person of interest from text descriptions. The primary challenge in this task is to effectively bridge the significant gap between the text and image domains while simultaneously extracting the discriminative features that are crucial for the accurate identification of individuals. Existing methods have made some effective attempts by conducting cross-modal matching at the fine-grained representation level. However, these approaches frequently overlook two crucial factors: (i) the presence of noise in the local features during information fusion, and (ii) the lack of intra-modal matching when measuring feature similarity. To address these issues, we propose a novel local-enhanced representation framework in this paper. Specifically, to restrain noise in local features, we design a relation-based cross-modal local-enhanced fusion module, which filters out weakly related information through relation assessment. In addition, we explore an intra-cross modal projection strategy to overcome the limitations of existing cross-modal projection methods. This strategy jointly applies intra-modal and cross-modal matching constraints to the feature distribution. Finally, experiments on three mainstream datasets verify the performance superiority of our proposed method compared to existing state-of-the-art methods.
dc.description.confidential: false
dc.identifier.citation: Zhang G, Chen Y, Zheng Y, Martin G, Wang R. (2025). Local-enhanced representation for text-based person search. Pattern Recognition. 161. May 2025.
dc.identifier.doi: 10.1016/j.patcog.2024.111247
dc.identifier.eissn: 1873-5142
dc.identifier.elements-type: journal-article
dc.identifier.issn: 0031-3203
dc.identifier.number: 111247
dc.identifier.pii: S0031320324009981
dc.identifier.uri: https://mro.massey.ac.nz/handle/10179/72445
dc.language: English
dc.publisher: Elsevier B.V.
dc.publisher.uri: https://www.sciencedirect.com/science/article/pii/S0031320324009981
dc.relation.isPartOf: Pattern Recognition
dc.rights: (c) 2024 The Author/s
dc.rights: CC BY 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Person re-identification
dc.subject: Cross-modal retrieval
dc.subject: Local representation
dc.title: Local-enhanced representation for text-based person search
dc.type: Journal article
pubs.elements-id: 492837
pubs.organisational-group: Other
Files
Original bundle
Name: Zhang_G_et_al_2024_Published.pdf
Size: 4.55 MB
Format: Adobe Portable Document Format
Description: 492837 PDF.pdf
License bundle
Name: license.txt
Size: 9.22 KB
Format: Plain Text