Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing

Wijegunarathna K; Stock K; Jones CB

doi:10.4230/LIPIcs.GIScience.2025.12

dc.citation.volume	346
dc.contributor.author	Wijegunarathna K
dc.contributor.author	Stock K
dc.contributor.author	Jones CB
dc.contributor.editor	Sila-Nowicka K
dc.contributor.editor	Moore A
dc.contributor.editor	O’Sullivan D
dc.contributor.editor	Adams B
dc.contributor.editor	Gahegan M
dc.coverage.spatial	Christchurch, New Zealand
dc.date.accessioned	2025-09-21T23:09:58Z
dc.date.available	2025-09-21T23:09:58Z
dc.date.finish-date	2025-08-29
dc.date.issued	2025-08-15
dc.date.start-date	2025-08-26
dc.description.abstract	Millions of biological sample records collected over the last few centuries and archived in natural history collections are un-georeferenced. Georeferencing the complex locality descriptions associated with these samples is a highly labour-intensive task that collection agencies struggle with. None of the existing automated methods exploit maps, which are an essential tool for georeferencing complex spatial relations. We present preliminary experiments and results for a novel method that exploits the multimodal capabilities of recent Large Multi-Modal Models (LMMs). This method enables the model to visually contextualize the spatial relations it reads in the locality description. We use a grid-based approach to adapt these auto-regressive models to this task in a zero-shot setting. Our experiments, conducted on a small manually annotated dataset, show impressive results for our approach (∼1 km average distance error) compared to uni-modal georeferencing with Large Language Models and existing georeferencing tools. The paper also discusses the findings of the experiments in light of an LMM's ability to comprehend fine-grained maps. Motivated by these results, we propose a practical framework for integrating this method into a georeferencing workflow.
dc.description.confidential	false
dc.identifier.citation	Wijegunarathna K, Stock K, Jones CB. (2025). Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing. Sila-Nowicka K, Moore A, O’Sullivan D, Adams B, Gahegan M. Leibniz International Proceedings in Informatics (LIPIcs). Schloss Dagstuhl – Leibniz-Zentrum für Informatik.
dc.identifier.doi	10.4230/LIPIcs.GIScience.2025.12
dc.identifier.elements-type	c-conference-paper-in-proceedings
dc.identifier.isbn	978-3-95977-378-2
dc.identifier.issn	1868-8969
dc.identifier.uri	https://mro.massey.ac.nz/handle/10179/73585
dc.publisher	Schloss Dagstuhl – Leibniz-Zentrum für Informatik
dc.publisher.uri	http://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.12
dc.rights	(c) The author/s	en
dc.rights.license	CC BY	en
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en
dc.source.journal	Leibniz International Proceedings in Informatics (LIPIcs)
dc.source.name-of-conference	13th International Conference on Geographic Information Science (GIScience)
dc.subject	Georeferencing
dc.subject	Large Language Models
dc.subject	Large Multi-Modal Models
dc.subject	LLM
dc.subject	Natural History collections
dc.title	Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing
dc.type	conference
pubs.elements-id	502960
pubs.organisational-group	Other

Files

Original bundle

Name: 502960 PDF.pdf
Size: 3.2 MB
Format: Adobe Portable Document Format
Description: Published version.pdf

License bundle

Name: license.txt
Size: 9.22 KB
Format: Plain Text

Collections

Journal Articles