Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing

dc.citation.volume: 346
dc.contributor.author: Wijegunarathna K
dc.contributor.author: Stock K
dc.contributor.author: Jones CB
dc.contributor.editor: Sila-Nowicka K
dc.contributor.editor: Moore A
dc.contributor.editor: O’Sullivan D
dc.contributor.editor: Adams B
dc.contributor.editor: Gahegan M
dc.coverage.spatial: Christchurch, New Zealand
dc.date.accessioned: 2025-09-21T23:09:58Z
dc.date.available: 2025-09-21T23:09:58Z
dc.date.finish-date: 2025-08-29
dc.date.issued: 2025-08-15
dc.date.start-date: 2025-08-26
dc.description.abstract: Millions of biological sample records collected in the last few centuries archived in natural history collections are un-georeferenced. Georeferencing complex locality descriptions associated with these collection samples is a highly labour-intensive task collection agencies struggle with. None of the existing automated methods exploit maps that are an essential tool for georeferencing complex relations. We present preliminary experiments and results of a novel method that exploits multimodal capabilities of recent Large Multi-Modal Models (LMM). This method enables the model to visually contextualize spatial relations it reads in the locality description. We use a grid-based approach to adapt these auto-regressive models for this task in a zero-shot setting. Our experiments conducted on a small manually annotated dataset show impressive results for our approach (∼1 km Average distance error) compared to uni-modal georeferencing with Large Language Models and existing georeferencing tools. The paper also discusses the findings of the experiments in light of an LMM's ability to comprehend fine-grained maps. Motivated by these results, a practical framework is proposed to integrate this method into a georeferencing workflow.
dc.description.confidential: false
dc.identifier.citation: Wijegunarathna K, Stock K, Jones CB. (2025). Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing. Sila-Nowicka K, Moore A, O’Sullivan D, Adams B, Gahegan M. Leibniz International Proceedings in Informatics Lipics. Schloss Dagstuhl – Leibniz-Zentrum für Informatik.
dc.identifier.doi: 10.4230/LIPIcs.GIScience.2025.12
dc.identifier.elements-type: c-conference-paper-in-proceedings
dc.identifier.isbn: 978-3-95977-378-2
dc.identifier.issn: 1868-8969
dc.identifier.uri: https://mro.massey.ac.nz/handle/10179/73585
dc.publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
dc.publisher.uri: http://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.12
dc.rights: (c) The author/s (language: en)
dc.rights.license: CC BY (language: en)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ (language: en)
dc.source.journal: Leibniz International Proceedings in Informatics Lipics
dc.source.name-of-conference: 13th International Conference on Geographic Information Science (GIScience)
dc.subject: Georeferencing
dc.subject: Large Language Models
dc.subject: Large Multi-Modal Models
dc.subject: LLM
dc.subject: Natural History collections
dc.title: Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing
dc.type: conference
pubs.elements-id: 502960
pubs.organisational-group: Other
Files
Original bundle
Name: 502960 PDF.pdf
Size: 3.2 MB
Format: Adobe Portable Document Format
Description: Published version.pdf
License bundle
Name: license.txt
Size: 9.22 KB
Format: Plain Text