Journal Articles

Permanent URI for this collection: https://mro.massey.ac.nz/handle/10179/7915

  • Item
    Importance of timely metadata curation to the global surveillance of genetic diversity
    (Wiley Periodicals LLC on behalf of Society for Conservation Biology, 2023-08) Crandall ED; Toczydlowski RH; Liggins L; Holmes AE; Ghoojaei M; Gaither MR; Wham BE; Pritt AL; Noble C; Anderson TJ; Barton RL; Berg JT; Beskid SG; Delgado A; Farrell E; Himmelsbach N; Queeno SR; Trinh T; Weyand C; Bentley A; Deck J; Riginos C; Bradburd GS; Toonen RJ
    Genetic diversity within species represents a fundamental yet underappreciated level of biodiversity. Because genetic diversity can indicate species resilience to changing climate, its measurement is relevant to many national and global conservation policy targets. Many studies produce large amounts of genome-scale genetic diversity data for wild populations, but most (87%) do not include the associated spatial and temporal metadata necessary for them to be reused in monitoring programs or for acknowledging the sovereignty of nations or Indigenous peoples. We undertook a distributed datathon to quantify the availability of these missing metadata and to test the hypothesis that their availability decays with time. We also worked to remediate missing metadata by extracting them from associated published papers, online repositories, and direct communication with authors. Starting with 848 candidate genomic data sets (reduced representation and whole genome) from the International Nucleotide Sequence Database Collaboration, we determined that 561 contained mostly samples from wild populations. We successfully restored spatiotemporal metadata for 78% of these 561 data sets (n = 440 data sets with data on 45,105 individuals from 762 species in 17 phyla). Examining papers and online repositories was much more fruitful than contacting 351 authors, who replied to our email requests 45% of the time. Overall, 23% of our email queries to authors unearthed useful metadata. The probability of retrieving spatiotemporal metadata declined significantly as age of the data set increased. There was a 13.5% yearly decrease in metadata associated with published papers or online repositories and up to a 22% yearly decrease in metadata that were only available from authors. 
This rapid decay in metadata availability, mirrored in studies of other types of biological data, should motivate swift updates to data-sharing policies and researcher practices to ensure that the valuable context provided by metadata is not lost to conservation science forever.
  • Item
    Poor data stewardship will hinder global genetic diversity surveillance.
    (2021-08-24) Toczydlowski RH; Liggins L; Gaither MR; Anderson TJ; Barton RL; Berg JT; Beskid SG; Davis B; Delgado A; Farrell E; Ghoojaei M; Himmelsbach N; Holmes AE; Queeno SR; Trinh T; Weyand CA; Bradburd GS; Riginos C; Toonen RJ; Crandall ED
    Genomic data are being produced and archived at a prodigious rate, and current studies could become historical baselines for future global genetic diversity analyses and monitoring programs. However, when we evaluated the potential utility of genomic data from wild and domesticated eukaryote species in the world's largest genomic data repository, we found that most archived genomic datasets (86%) lacked the spatiotemporal metadata necessary for genetic biodiversity surveillance. Labor-intensive scouring of a subset of published papers yielded geospatial coordinates and collection years for only 33% (39% if place names were considered) of these genomic datasets. Streamlined data input processes, updated metadata deposition policies, and enhanced scientific community awareness are urgently needed to preserve these irreplaceable records of today's genetic biodiversity and to plug the growing metadata gap.
  • Item
    Not the time or the place: the missing spatio-temporal link in publicly available genetic data.
    (Blackwell Publishing Ltd, 2015-08) Pope LC; Liggins L; Keyse J; Carvalho SB; Riginos C
    Genetic data are being generated at unprecedented rates. Policies of many journals, institutions and funding bodies aim to ensure that these data are publicly archived so that published results are reproducible. Additionally, publicly archived data can be 'repurposed' to address new questions in the future. In 2011, along with other leading journals in ecology and evolution, Molecular Ecology implemented mandatory public data archiving (the Joint Data Archiving Policy). To evaluate the effect of this policy, we assessed the genetic, spatial and temporal data archived for 419 data sets from 289 articles in Molecular Ecology from 2009 to 2013. We then determined whether archived data could be used to reproduce analyses as presented in the manuscript. We found that the journal's mandatory archiving policy has had a substantial positive impact, increasing genetic data archiving from 49% (pre-2011) to 98% (2011-present). However, 31% of publicly archived genetic data sets could not be recreated based on information supplied in either the manuscript or public archives, with incomplete data or inconsistent codes linking genetic data and metadata as the primary reasons. While the majority of articles did provide some geographic information, 40% did not provide this information as geographic coordinates. Furthermore, a large proportion of articles (40%) did not contain any information regarding date of sampling. Although the inclusion of spatio-temporal data does require an increase in effort, we argue that the enduring value of publicly accessible genetic data to the molecular ecology field is greatly compromised when such metadata are not archived alongside genetic data.
  • Item
    Global genetic diversity status and trends: towards a suite of Essential Biodiversity Variables (EBVs) for genetic composition.
    (2022-08) Hoban S; Archer FI; Bertola LD; Bragg JG; Breed MF; Bruford MW; Coleman MA; Ekblom R; Funk WC; Grueber CE; Hand BK; Jaffé R; Jensen E; Johnson JS; Kershaw F; Liggins L; MacDonald AJ; Mergeay J; Miller JM; Muller-Karger F; O'Brien D; Paz-Vinas I; Potter KM; Razgour O; Vernesi C; Hunter ME
    Biodiversity underlies ecosystem resilience, ecosystem function, sustainable economies, and human well-being. Understanding how biodiversity sustains ecosystems under anthropogenic stressors and global environmental change will require new ways of deriving and applying biodiversity data. A major challenge is that biodiversity data and knowledge are scattered, biased, collected with numerous methods, and stored in inconsistent ways. The Group on Earth Observations Biodiversity Observation Network (GEO BON) has developed the Essential Biodiversity Variables (EBVs) as fundamental metrics to help aggregate, harmonize, and interpret biodiversity observation data from diverse sources. Mapping and analyzing EBVs can help to evaluate how aspects of biodiversity are distributed geographically and how they change over time. EBVs are also intended to serve as inputs and validation to forecast the status and trends of biodiversity, and to support policy and decision making. Here, we assess the feasibility of implementing Genetic Composition EBVs (Genetic EBVs), which are metrics of within-species genetic variation. We review and bring together numerous areas of the field of genetics and evaluate how each contributes to global and regional genetic biodiversity monitoring with respect to theory, sampling logistics, metadata, archiving, data aggregation, modeling, and technological advances. We propose four Genetic EBVs: (i) Genetic Diversity; (ii) Genetic Differentiation; (iii) Inbreeding; and (iv) Effective Population Size (Ne). We rank Genetic EBVs according to their relevance, sensitivity to change, generalizability, scalability, feasibility and data availability. We outline the workflow for generating genetic data underlying the Genetic EBVs, and review advances and needs in archiving genetic composition data and metadata.
We discuss how Genetic EBVs can be operationalized by visualizing EBVs in space and time across species and by forecasting Genetic EBVs beyond current observations using various modeling approaches. Our review then explores challenges of aggregation, standardization, and costs of operationalizing the Genetic EBVs, as well as future directions and opportunities to maximize their uptake globally in research and policy. The collection, annotation, and availability of genetic data have made major advances in the past decade, each of which contributes to the practical and standardized framework for large-scale genetic observation reporting. Rapid advances in DNA sequencing technology present new opportunities, but also challenges for operationalizing Genetic EBVs for biodiversity monitoring regionally and globally. With these advances, genetic composition monitoring is starting to be integrated into global conservation policy, which can help support the foundation of all biodiversity and species' long-term persistence in the face of environmental change. We conclude with a summary of concrete steps for researchers and policy makers for advancing operationalization of Genetic EBVs. The technical and analytical foundations of Genetic EBVs are well developed, and conservation practitioners should anticipate their increasing application as efforts emerge to scale up genetic biodiversity monitoring regionally and globally.