Title: Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing
Authors: Wijegunarathna K, Stock K, Jones CB
Editors: Sila-Nowicka K, Moore A, O'Sullivan D, Adams B, Gahegan M
Published in: Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025
Type: Conference paper in proceedings
ISBN: 978-3-95977-378-2
ISSN: 1868-8969
DOI: 10.4230/LIPIcs.GIScience.2025.12
URL: https://mro.massey.ac.nz/handle/10179/73585
Rights: © The author/s. Licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Keywords: Georeferencing; Large Language Models; Large Multi-Modal Models; LLM; Natural History collections

Abstract: Millions of biological sample records collected over the last few centuries and archived in natural history collections remain un-georeferenced. Georeferencing the complex locality descriptions associated with these samples is a highly labour-intensive task that collection agencies struggle with. None of the existing automated methods exploit maps, which are an essential tool for georeferencing complex spatial relations. We present preliminary experiments and results of a novel method that exploits the multimodal capabilities of recent Large Multi-Modal Models (LMMs). This method enables the model to visually contextualize the spatial relations it reads in the locality description. We use a grid-based approach to adapt these auto-regressive models to this task in a zero-shot setting. Our experiments, conducted on a small manually annotated dataset, show impressive results for our approach (∼1 km average distance error) compared to uni-modal georeferencing with Large Language Models and existing georeferencing tools. The paper also discusses the findings of the experiments in light of an LMM's ability to comprehend fine-grained maps. Motivated by these results, a practical framework is proposed to integrate this method into a georeferencing workflow.
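
To illustrate what a grid-based adaptation of this kind might look like in practice, the sketch below (not reproduced from the paper; the authors' exact grid scheme, prompts, and map source are not described here) divides a map extent into labelled cells, converts a cell label that a multi-modal model might return into coordinates, and computes a haversine distance error of the kind summarised in the abstract. The grid layout, cell labels, coordinates, and helper function names are all hypothetical.

```python
import math

# Hypothetical sketch of a grid-based adaptation for LMM georeferencing.
# A map extent is divided into labelled cells so that a multi-modal model,
# shown the gridded map and a locality description, can answer with a cell
# label (e.g. "D6") instead of raw coordinates.

def make_grid(min_lon, min_lat, max_lon, max_lat, n_cols, n_rows):
    """Return a dict mapping cell labels (e.g. 'B3') to cell-centre (lon, lat)."""
    cell_w = (max_lon - min_lon) / n_cols
    cell_h = (max_lat - min_lat) / n_rows
    grid = {}
    for r in range(n_rows):
        for c in range(n_cols):
            label = f"{chr(ord('A') + c)}{r + 1}"
            centre_lon = min_lon + (c + 0.5) * cell_w
            centre_lat = max_lat - (r + 0.5) * cell_h  # rows counted from the top of the map
            grid[label] = (centre_lon, centre_lat)
    return grid

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km, used here as the distance-error metric."""
    radius = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

# Example: a 10 x 10 grid over a 0.1-degree map tile; suppose the model answers "D6".
grid = make_grid(174.70, -36.90, 174.80, -36.80, 10, 10)
pred_lon, pred_lat = grid["D6"]
gold_lon, gold_lat = 174.735, -36.852   # hypothetical ground-truth point
print(f"distance error: {haversine_km(pred_lon, pred_lat, gold_lon, gold_lat):.2f} km")
```

Answering with a discrete cell label rather than raw coordinates keeps the model's output easy to parse and to map back onto the image; the actual prompting and evaluation protocol used in the paper may differ.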