Metabarcoding hyperdiverse kelp holdfast communities on temperate reefs: An experimental approach to inform future studies

Classical taxonomic approaches to quantifying biodiversity can be notoriously laborious and restrictive. Instead, molecular metabarcoding is emerging as a rapid, high-throughput, and cost-effective tool to catalog biodiversity. Despite the appeal of metabarcoding, methodological and procedural biases must be understood before robust biodiversity inferences can be made. Here, we use CO1 metabarcoding to characterize marine eukaryote communities associated with Ecklonia radiata , the dominant eco-engineering kelp of temperate Australasia. To establish a standardized and reproducible community metabarcoding protocol, we examined the influence of different sample preparation, laboratory, and bioinformatic steps on inferences of species richness and composition of communities associated with E . radiata holdfasts (the root-like structure anchoring the kelp to the substratum) sampled from northeastern New Zealand. Specifically, we examined the effect of sieving the community into different size fractions and the replicability of results across DNA extractions, polymerase chain reactions and sequencing. Overall, we found that sieving the community into two size fractions before DNA extraction enabled detection of a greater diversity of taxa than not sieving samples. When compared with traditional morphology-based inventories of kelp holdfast biodiversity, we found that although the taxonomic precision of our metabarcoding approach at the species and genus level was limited by the availability of reference sequences in public repositories, we recovered ~ 40% more taxa and a greater taxonomic breadth of organisms than morphological surveys (e.g., 18 phyla as compared with 14 phyla). On the basis of our findings, we provide methodological guidelines for the use of metabarcoding as a tool for surveying and monitoring the hyperdiverse species assemblages associated with kelp holdfasts.


| INTRODUC TI ON
DNA metabarcoding (Taberlet, Coissac, Hajibabaei, et al., 2012) has revolutionized the way we characterize biodiversity (Bush et al., 2019;Stat et al., 2017) as well as the assessment of ecosystem and environmental health (Aylagas et al., 2016).DNA metabarcoding methods are now used in empirical ecology (Harms-Tuohy et al., 2016), invasion biology (Thomas et al., 2020), biomonitoring and conservation management (Barnes & Turner, 2016).Where the study focus is a specific community, or when a bulk specimen mixture is taken from a focal environment, we refer to these approaches as community DNA metabarcoding (Creer et al., 2016).Community DNA studies use the same methods of high-throughput DNA extraction, polymerase chain reaction (PCR), and sequencing common to all metabarcoding approaches but aim to directly identify the taxa within the sampled community based on their DNA barcode.These community DNA studies are akin to traditional visual morphologybased surveys in aiming to characterize the taxonomic richness of a community or species assemblage and to infer differences in the taxonomic composition among sampled communities (Deiner et al., 2017;Taberlet, Coissac, Pompanon, et al., 2012).
DNA-based monitoring methodologies (Baird & Hajibabaei, 2012), such as community DNA, have provided comparable results to traditional biodiversity surveys in a range of ecosystems (Deiner et al., 2017).In contrast to morphology-based surveys, community DNA metabarcoding does not depend on expert taxonomic training (Bush et al., 2019), allows higher comparability across studies (Aylagas et al., 2016;Ji et al., 2013), and produces data (i.e., sequence reads) that can be easily reanalyzed and reinterpreted by a secondary user.As a result, community DNA approaches often discover greater numbers of taxa within a community than has been previously described (Bush et al., 2019;Siegenthaler et al., 2019;Valentini et al., 2016).In the last decade, community DNA approaches have been successfully used to describe past and present biodiversity in terrestrial (Brehm et al., 2016;Dopheide et al., 2019), freshwater (Andújar et al., 2018;Blackman et al., 2019;Elbrecht & Leese, 2017;Hajibabaei et al., 2019), estuarine (Lobo et al., 2017), and marine (Aylagas et al., 2016;Knowlton & Leray, 2015;Zhang et al., 2018) environments.There is growing recognition that community DNA can help characterize and monitor the biodiversity of a wide variety of important ecosystems and assist in making informed management decisions.
As an emerging tool however, before community DNA metabarcoding can be confidently applied, rigorous examination of the potential biases and artifacts of the approach must be conducted.For instance, several studies have addressed the influence of community sampling protocols (Dickie et al., 2018) and laboratory methods, such as DNA extraction procedures (Deiner et al., 2017;Lear et al., 2018), primer choice (van der Loos & Nijland, 2021), and amplification bias (Kelly et al., 2019), as well as the level of replication at each methodological step, in producing robust, consistent, and reproducible results (Ficetola et al., 2015;Nichols et al., 2020;Porter et al., 2019).Bioinformatic pipelines, which transform the sequence reads into community data, can strongly influence study results and are constantly revised and improved (Pauvert et al., 2019).Decisions within the bioinformatic pipelines regarding the filtering of reads, processing PCR replicates, and sequencing depth have also been demonstrated to influence biodiversity estimates (Alberdi et al., 2018;Bokulich et al., 2013;Flynn et al., 2015;Kunin et al., 2010).As a consequence, prior to using a community DNA approach in a new ecosystem or focal community, there is a recognized need for experimental examination of the potential drawbacks and biases that different steps in the overall approach might introduce (Aylagas et al., 2016;Bush et al., 2019;McGee et al., 2019).
One of the most important considerations when first applying a community DNA approach in a new system is establishing how to obtain a community DNA sample that is representative of the biodiversity present in a target community (Koziol et al., 2019).Although it is common for procedural replicates and controls to be considered in the laboratory steps of community DNA studies, there have been few systematic examinations of the community sampling procedures on the overall estimates of biodiversity (Alberdi et al., 2018;Porter et al., 2019).Replicate samples of a target community, whether it be in traditional morphology-based biodiversity assessments or community DNA approaches, only recover a subset of the community; that is, they are not a census, and as such, the magnitude of variation among replicates must be quantified (Vlek et al., 2006).Community DNA approaches generate millions of sequences and are therefore potentially able to reach the asymptote of the species discovery curve with fewer replicates than traditional surveys (Bush et al., 2019).However, this benefit of a community DNA approach will depend on how representative each replicate DNA sample is of the community, and results vary among studies (Ficetola et al., 2015).It has been established that in communities where species have varying biomass, sieving the community into different size fractions before DNA extraction can reduce misidentification or the omission of smaller organisms (Aylagas et al., 2016;Wangensteen, Cebrian, et al., 2018).However, there has been little examination of subsequent steps in a community DNA approach, for instance, at what laboratory or bioinformatic stage it is best to combine the different size fractions to recover representative estimates of biodiversity.Addressing the implications of these procedural decisions is important (Cowart et al., 2015), particularly when the communities being characterized are known to support diverse taxa of varying sizes.Kelp (Laminariales) are ecosystem engineers (Jones, 2014) responsible for supporting incredibly diverse, structurally complex, and highly productive ecosystems along temperate and polar coastlines worldwide (Steneck et al., 2002).On the Great Southern Reef of Australia, it has been estimated that between 700 and 4000 different species of algae, invertebrates, and fishes occupy these ecosystems, with high levels of endemism (between 20 and 60%; Bennett et al., 2016).Accordingly, when these kelps are lost, we observe dramatic declines in biodiversity and ecosystem productivity (Bennett et al., 2016;Filbee-Dexter & Scheibling, 2014;Krumhansl et al., 2016;Ling et al., 2009).
Importantly, kelps are sensitive to environmental changes, and the demographic responses of kelp populations to stressors ripple throughout the ecosystem (Smale et al., 2013;Teagle et al., 2017;Vergés et al., 2014;Wernberg et al., 2016).Monitoring biodiversity changes within kelp forests using traditional survey methods is time-intensive and is highly dependent on scientists having diverse taxonomic expertise.For these reasons, the monitoring of kelp forest-associated biodiversity has benefited from the use of metabarcoding approaches.Specifically, the analysis of environmental DNA in seawater samples to detect vertebrate taxa within kelp forest ecosystems has gained results comparable with visual surveys (Port et al., 2016), and community DNA has also been used to characterize the sessile invertebrate communities attached to cobbles beneath the kelp canopy (Shum et al., 2019).
The kelp holdfast is the structure that anchors the kelp to the substratum, and the complex web of haptera (root-like projections) which form the holdfast, provides a biogenically complex structure for a diversity of taxa and functional groups to colonize (Figure 1).Kelp holdfasts provide a logistically convenient, biologically defined sampling unit, which captures a broad diversity of marine eukaryote phyla (Anderson et al., 2005;Teagle et al., 2017).Despite the notable appeal of using kelp holdfast assemblages as a barometer for change in this ecosystem, the large number, taxonomic diversity, and predominance of soft-bodied organisms have precluded their use in morphological assessments of the biodiversity at the broad spatial scales necessary to monitor these ecosystems.Nonetheless, kelp holdfasts could provide the basis for an effective community DNA approach to characterize and monitor biodiversity in kelp forest ecosystems.
Here, we evaluate metabarcoding as a tool for assessing biodiversity of Ecklonia radiata holdfasts sampled from an established kelp forest in Matheson's Bay on the northeast coast of New Zealand.Our comprehensive experimental design examines the influence of methodological steps and decisions on biodiversity estimates in these communities.Specifically, given the disparate sizes of organisms found within holdfast communities, we assessed the effect of sieving the community into different size fractions.To assess the representativeness of each DNA extraction, we examined the similarity between replicate samples of the same community and then assessed the replicability between PCRs.Additionally, we analyzed the effect of bioinformatic decisions on final biodiversity estimates.We then compare our taxonomically assigned community DNA reads with a morphology-based, kelp holdfast inventory from the same location to evaluate the biases and opportunities of the DNA metabarcoding approach.Finally, we provide a series of guidelines for community sampling, sample preparation, laboratory procedures, and bioinformatic decisions for metabarcoding marine eukaryotes from kelp holdfast assemblages for biodiversity surveys and monitoring.

| Field sampling and processing
Eight mature E. radiata (mean height =35.1 cm; range =15.4-61.6 cm) were collected from Matheson's Bay, New Zealand in February 2019 by carefully sliding a knife between the base of the holdfast and the rocky reef.The individual kelp was quickly placed in polyethylene bags and sealed underwater to prevent organisms from escaping and the accidental transfer of organisms among samples.Upon returning to the surface, samples were placed in an insulated container and transported to the laboratory.Once in the laboratory, each E. radiata individual was placed in a separate tray, and the holdfast community was separated from the kelp holdfast using forceps.The community sample was then passed through a stacked filter unit containing a fine, 63µm Sefar Nytal ® filter on the bottom and a coarse 1000µm Sefar Nytal ® filter on the top.The community sample retained on the 63µm filter was considered the small size fraction containing meiobenthic organisms, and the material retained on the 1000µm filter was considered the large size fraction containing megabenthic and macrobenthic organisms (Rex & Etter, 2010;Wangensteen, Cebrian, et al., 2018).The two size fractions were transferred along with the mesh into individual labelled 50-ml falcon tubes.All sample processing equipment was sterilized prior to use on each holdfast by flaming the forceps (doused in ethanol); washing the filters, funnels, and trays with soap and bleach for at least 15 min in each solution; and further autoclaving the funnels.All samples were processed within 5-6 h of collection in the field and stored at −80°C for at least 24 h before lyophilization, more commonly known as freezedrying.In the freeze-drying procedure, the falcon tubes containing the community samples were quickly transferred into a Labconco FreeZone 6 bulk tray freeze-dryer with the lid off, and they were subjected to a 24-h cycle at a condensation temperature of −50°C and a vacuum pressure of 0.12 mbar.Once the freeze-drying was completed, a sterilized 15-ml falcon tube was used like a pestle to grind the sample inside the 50-ml falcon tube into a fine powder.
The powdered community was then weighed and stored at −80°C until DNA extraction.

| Experimental design
Our overall goal was to assess the potential shortcomings and biases introduced by different laboratory and bioinformatic procedures in the DNA metabarcoding of E. radiata holdfast-associated communities.
The impact of alternative decisions and procedures on measures of taxonomic richness and community composition (hereafter "biodiversity estimates") was examined at different stages of the community DNA workflow including sample preparation, DNA extraction, PCR, and bioinformatic manipulation (Figure 1).To do this, the eight kelp holdfast-associated communities were subsampled for use across several experimental treatments.Due to the limited volume of some community samples and poor sequencing of others, not all holdfast communities were represented in all treatments, resulting in an unbalanced experimental design (Table S1).
For sample preparation, we were interested in examining differences in biodiversity estimates between the large and small size fractions of the community sample.Our experimental design aimed F I G U R E 1 (a) Ecklonia radiata forest and associated community.The white square highlights the kelp holdfast, which was used as the focal sampling unit in our study.(b) Overview of the experimental design to investigate the impact of alternative decisions and procedures on biodiversity estimates for kelp holdfast assemblages using community DNA metabarcoding.Experimental treatments were considered during the sample preparation, DNA extraction, and polymerase chain reaction (PCR) steps of the community DNA approach.In bold are the treatments applied to the kelp holdfast communities, which we compared in a series of planned contrasts.For each DNA extraction and PCR, two replicates were performed.See Section 2.2 for additional details and explanation of acronyms used for the treatments and Table S1  to test the influence of manually pooling (i.e., physically mixing) the size fractions before DNA extraction (MPF COM ), pooling the DNA extractions of the large and small fractions before PCR (MPF EXT ), and pooling the PCR products of the large and small fraction extractions before sequencing (MPF PCR ), as well as bioinformatically averaging (BAF) and bioinformatically combining (BCF) the sequence data of the large and small fractions on our biodiversity estimates (Figure 1).Briefly, bioinformatically combining fractions involved concatenating the sequence data of each fraction before analysis, whereas bioinformatically averaging fractions involved calculating the average richness or community based on the large and the small size fractions as part of the analysis (for more details on the bioinformatic procedures, see Section ).The specific contrasts we were interested in were.
(2) Bioinformatically averaged fractions (BAF; n = 11) versus pooling PCR products before sequencing (MPF PCR ; n = 3): Do biodiversity estimates differ if DNA extractions of the large and small size fractions are amplified and sequenced separately and then bioinformatically averaged, or if the PCR products of the large and small size fractions are manually pooled before sequencing?Biodiversity estimates can be strongly affected by the methodology used to cluster sequences; thus, we also examined the influence of using amplicon sequence variants (ASVs) or operational taxonomic units (OTUs) on our results.Furthermore, to investigate the effects of different bioinformatic sample standardization decisions, we examined the influence of different filtering methods on our biodiversity estimates for the 10 contrasts described above (see Section 2.4 for more detailed information).

| Community DNA extraction and PCR
Genomic DNA was extracted from a 0.20 g of the powdered kelp holdfast community using the DNeasy PowerSoil Extraction Kit (QIAGEN) following the specified protocol, except using UltraPure DNase/RNase-Free Distilled Water (Thermo Fisher) in the final step rather than elution buffer.For each sample, we performed two replicate extractions (Extractions A and B).For each set of extractions, a negative extraction control was also included using UltraPure water.
To assess the quality and quantity of the extractions, the extracted DNA was run on a 1% agarose gel and visualized using GelRed ® Nucleic Acid Gel Stain (Biotium, Inc.), and gel-based estimates were confirmed using a Qubit™ fluorometer with the Qubit™ dsDNA BR Assay Kit.The DNA concentration of all extractions was normalized to ~15 ng/µl by diluting some DNA extractions with UltraPure DNase/RNase-Free Distilled Water (Thermo Fisher; the maximum required dilution was 1:10).
We followed the thermocycling regime used by van der Reis et al.
(2018) conducted in a SureCycler 8800 Thermal Cycler (Agilent Technologies).Specifically, thermocycling involved an initial denaturation at 95°C for 60 s; 20 "touch-up" cycles including a denaturation step at 95°C for 30 s, an annealing step starting at 45°C for 30 s (increasing 1°C with each cycle), and a final extension step at 72°C for 30 s; 20 "touch-down" cycles including a denaturation step at 95°C for 30 s, an annealing step starting at 65°C for 30 s (decreasing 1°C with each cycle), and a final extension step at 72°C for 30 s; 10 cycles including a denaturation step at 95°C for 15 s, an annealing step at 45°C for 20 s, and a final extension step at 72°C for 30 s; and a final extension cycle at 72°C for 60 s and then holding at 14°C.For each PCR, a negative control was also included using UltraPure DNase/RNase-Free Distilled Water (Thermo Fisher) instead of DNA template.Amplicons were purified by magnetic separation following the Mag-Bind ® Total Pure NGS protocol (Omega Bio-Tek).PCR products were pooled as required according to our experimental design, quantified (Qubit ® 2.0 Fluorometer, Invitrogen), and diluted to an equal concentration of between 5 and 15 ng/µl.Sequencing was performed at Massey Genome Services, Massey University (Palmerston North, New Zealand) where indexing occurred using the Nextera™ DNA Library Preparation Kit (Illumina) before sequencing on an Illumina MiSeq™ System (2 × 250 paired-end protocol).

| Bioinformatic analysis
Sequence reads were analyzed and filtered using a series of quality control steps available in the bioinformatics toolkit of QIIME 2 (Bolyen et al., 2019).First, the primers were removed without mismatch tolerated using cutadapt (Martin, 2011).We used DADA2 (Callahan et al., 2016) to perform the paired-end merging (trimleft-r 13, trim-left-f 13, and trunc-len 200), dereplication, chimera filtering (using the consensus method), and clustering of ASV.Using the decontam R package v1.4 (Davis et al., 2018) with the combined method, we filtered the ASV table for possible contaminants using the extraction and PCR negative controls.To account for the expected degree of tag switching in multiplexed metabarcoding analysis (Costello et al., 2018), we used the abundance renormalization approach in Wangensteen and Turon (2015).This approach removes the reads of the subsamples corresponding to a cumulative frequency of <3% for each particular ASV.ASVs passing the initial quality control steps were taxonomically assigned using the MARES_COI_NOBAR reference sequence database (Arranz et al., 2020).MARES is the most comprehensive COI reference database for marine eukaryotes available and provides users the ability to retain taxa that cannot be assigned at the species level but can be assigned at higher taxonomic levels-a desirable feature when working in communities of taxonomically diverse and potentially poorly characterized biodiversity (Arranz et al., 2020).
For taxonomic assignment, we first performed a BLASTn (Altschul et al., 1990) with an e-value of 1 −60 for high-quality matches using the default max_target_seqn, which recovers the first 500 possible matches (Shah et al., 2019).Then, we used MEGAN 6.18.3 (Huson et al., 2016) for taxonomic assignment within the NCBI taxonomy framework using the default Lowest Common Ancestor algorithm parameters.Lastly, the ASVs were further clustered into OTUs using VSEARCH v2.13.6 (Rognes et al., 2016) and a 97% similarity threshold.The resulting ASV and OTU tables were further filtered to remove sequencing artifacts or erroneous ASVs (or OTUs) based on different approaches.Five filtering methods were investigated: keeping all ASVs (or OTUs); removing ASVs (or OTUs) with a relative abundance lower than 0.003% (Elbrecht & Leese, 2017), 0.01% (Alberdi et al., 2018), and 0.05% of the total number of sequences; and using the LULU curation algorithm (Frøslev et al., 2017) to collapse ASVs (or OTUs) into their parent ASVs (or OTUs) depending on their similarity and co-occurrence patterns.
Three of the eight E. radiata holdfast communities had adequate biomass to be used in all the treatments used to construct the contrasts above (Figure 1), whereas the other five community holdfasts spanned as many treatments as was possible, given the biomass of the sample (Table S1).This experimental design resulted in a total of 72 subsamples for sequencing including three extraction negative controls and five PCR negative controls.We bioinformatically combined subsamples by combining the appropriate sample size fractions (3 BCF subsamples), extraction replicates (22 BCE subsamples), or PCR replicates (6 BCP subsamples) according to our experimental design, generating a total of 31 synthetic subsamples for comparison.We bioinformatically averaged subsamples by adjusting the coefficients of the contrast matrix to ±0.5 or ±0.25 (see Section 2.5 for details) for the appropriate fractions, extractions, and PCRs to define the contribution of each subsample to the average.A total of 103 subsamples were used for statistical analysis, examined at both the ASV and OTU levels and for all filtering thresholds as well as sample rarefaction levels.
We considered the trade-off of increasing the rarefaction threshold to retain a greater proportion of the sampled diversity at the expense of removing greater numbers of subsamples with low ASV and OTU richness or decreasing the rarefaction threshold and retaining more subsamples at the expense of removing a greater proportion of the sampled diversity.Given that our experimental design focuses on treatment (subsample) contrasts, we deemed it was most important to select a rarefaction threshold for each ASV and OTU table that retained the greatest number of subsamples (Figure S1).We assessed the effect of variability introduced by the rarefaction procedure by repeating each analysis on three different rarefied datasets each started from a different random seed as well as an analysis of the unrarefied data.Two subsamples (MTB11_Sml_Ext A _MPP PCR and MTB20_Lrg_Ext B _MPP PCR ) had low sequencing depth for unknown reasons and were excluded when the data were rarefied to even depth.Rarefaction was performed using the Phyloseq R package v1.28 (McMurdie & Holmes, 2013).

| Statistical analysis
We used linear models implemented in R v4.0.1 (R Core Team, 2020) to assess how our laboratory and bioinformatic decisions influenced biodiversity estimates for the holdfast communities.
First, we calculated the presence or absence of each ASV or OTU in each rarefaction, filtering and clustering category for each subsample.From these presence-absence matrices, we calculated two response variables: the observed taxonomic richness for each subsample and differences in community composition based on Jaccard's dissimilarity between all pairwise combinations of subsamples.Second, due to the unbalanced and partially nested nature of our experimental design to test the 10 planned comparisons described above, we set up a dummy variable, which assigned each subsample to its corresponding combination of levels for the sample preparation, DNA extraction, PCR, and bioinformatic steps.For example, a subsample where the large size fraction was extracted, then one of the two extraction replicates was amplified, and the single PCR sequenced would have been coded as Lrg _ EXT A_ PCR X .
This categorical dummy variable had a total of 25 levels for the 95 subsamples considered, excluding negative controls and including the bioinformatically combined synthetic subsamples.We then constructed a contrast matrix for the dummy variable, which contained our planned independent comparisons.Each column of the contrast matrix corresponded to a particular planned comparison (see Section 2.2), allowing us to test contrasts directly, without unnecessary subsetting and thereby multiple testing of the same data.
Subsamples involved in contrasts were assigned weights of either 1 or −1, depending on the levels being contrasted, except for bioinformatically averaged subsamples that were assigned a weight of ±0.5 or ±0.25 (Crawley, 2012).Not all 10 contrasts were orthogonal.Non-orthogonal contrasts are analogous to collinear predictors and can produce similar statistical issues (Quinn & Keough, 2002); hence, to mitigate these issues, we used the inverse of the transposed contrast matrix to calculate the fixed effects design matrix.
We also used the inverse of the transposed contrast matrix for the calculation of the estimated marginal means and standard errors using the emmeans R package v1.4.8 (Lenth et al., 2018).Lastly, because individual kelp holdfast communities contributed to multiple levels of the dummy variable (i.e., repeated measures), for our analysis of taxonomic richness, we specified the individual kelp identifier as a random effect in a linear mixed model fitted using the lmer function in the lme4 R package v1.1 (Bates et al., 2014).To test the statistical significance of the contrasts from the linear mixed models, we used the summary function from the lmerTest R package v3.1 (Kuznetsova et al., 2017).For the analysis of community composition, we conducted a permutational analysis of variance using the adonis2 function in the vegan R package v2.6 (Oksanen et al., 2007) and determined the significance of our contrasts using 999 permutations under a reduced model.While the adonis2 function does not allow fitting mixed models, to account for repeated measures we constrained permutations to only occur among subsamples with the same individual kelp identifier.We tested the homogeneity of dispersions assumption of this analysis and examined patterns of multivariate beta diversity between subsamples by calculating the mean distance to the group centroid using the betadisper function in the vegan R package v2.6.Differences in the group dispersions among levels of our experimental treatments were tested using the anova.betadisperfunction.To account for repeated measures in this analysis, we also constrained permutations (999 permutations) of the residuals to occur among subsamples with the same individual kelp identifier.
Using planned independent comparisons implemented by the manipulation of the contrast matrix allowed us to efficiently use the degrees of freedom within our experimental design to only test the comparisons of interest.Importantly, because all planned comparisons are considered in the model without the need for post hoc tests, corrections for multiple testing and false positives are not required.Nevertheless, because we wanted to explore the effects of different filtering thresholds (site occupancy vs. proportional read abundance), clustering methods (ASV vs. OTU), and levels of taxonomic precision (taxonomically assigned vs. taxonomy free), there were a large number of models considered.However, because we were interested in determining possible biases in estimates of biodiversity as a result of our experimental manipulations, we chose to retain a higher probability of false-positive results, making it more likely that we detect a bias rather than ignoring potential biases.Accordingly, by not correcting for multiple testing, we present the most conservative view of procedural biases in community DNA metabarcoding of kelp holdfast communities.
For data visualization, we calculated the mean taxonomic richness and standard error of each treatment for unassigned ASVs filtered by 0.003% minimum read abundance using the emmeans R package.To visualize the differences in community composition between the large and small fractions of the holdfast community samples, we projected the sample coordinates for the first two axes of the principal coordinate analysis (PCoA) of the Jaccard dissimilarity between subsamples for the presence/absence matrix of unassigned ASVs filtered by 0.003% minimum read abundance.

| Comparison with morphology-based surveys
To assess the performance of the community DNA approach as compared with a traditional morphological survey, we used the community DNA subsamples that were sieved, extracted, and amplified separately but pooled before sequencing (MPP PCR ).According to our results, this would be the preferred treatment of future samples as it retrieved similar taxonomic richness and community composition as bioinformatically combined PCR products of each sample while minimizing the sequencing costs (see Section 3).Here, we compared the taxonomically assigned OTUs in these community DNA subsamples from the eight E. radiata holdfasts against a morphology-based survey conducted for nine E. radiata holdfasts collected <750 m away in 2002 (Anderson et al., 2005).Prior to comparison, we assessed whether the assigned taxa identified by metabarcoding were of exclusively marine or brackish origin using the wormsbyname function in worms R package v0.2 (Holstein, 2018).Assigned OTUs, which were not exclusively marine, were identified using a custom R script and removed.
If an OTU or morphologically identified taxon could only be confidently assigned at a high taxonomic level (i.e., identifiable only to Class or Order), the OTU or morphologically identified taxon was labelled as undefined in lower taxonomic levels.We then compared the absolute and relative number of OTUs and morphologically identified taxa for these two studies from the same geographical area synonymizing taxonomies according to the World Register of Marine Species (Horton et al., 2017).

| RE SULTS
Across our experimental design, Illumina sequencing produced 4,310,106 paired-end reads of 72 sequenced subsamples (including negative controls).After quality filtering (primer removal, denoise, paired-end assembly, dereplication, and chimera removal), a total of 1,303,462 reads were retained with a modal sequence length of 313 bp and a mean sequence length of 319 bp.The two subsamples (excluding negative controls) that retrieved <5000 reads were removed for downstream analysis (Table S1).The final dataset after removing possible contaminants and after tag switching abundance renormalization (Wangensteen & Turon, 2015) consisted of 7230 ASVs, with an average of 14,862 reads per subsample (range: 5156-24,624).ASVs were clustered into OTUs at 97% similarity, producing 2623 OTUs.Further filtering of ASVs and OTUs was performed based on the LULU algorithm (Frøslev et al., 2017) and three thresholds of minimum read abundance (0.003%, 0.01%, and 0.05%; Table S2).Rarefaction curves indicated that most of the subsamples approached an asymptote in ASV and OTU richness, indicating that sampling effort was sufficient to produce a representative estimate of the biodiversity in the sampled community (Figure S1).

| Effect of sample preparation
For the eight holdfast communities collected from the field, the larger size fraction had overall lower taxonomic richness than the smaller size fraction for all ASVs and OTUs (Figures 2, 3, and S2).
However, the large size fraction had a similar taxonomic richness to the small size fraction when comparing only the taxonomically assigned ASVs and OTUs (Figures 2 and S2).Conversely, though not unexpectedly, the community composition of the two size fractions differed strongly for every rarefaction, filtering, clustering, and taxonomic assignment procedure used (Figures 2, 4, and S2).
Differences in the variability in assemblage composition (multivariate dispersions) between the large and small size fractions differed when considering all reads at the OTU level and considering all reads at the ASV level when stringent filtering was applied (0.01% or 0.05% minimum read abundance; Figure S2).Interestingly, the PCoA showed that the greatest variation in community composition was associated with the holdfast identities (PCoA1 in Figure 4) rather than differences between the large and small size fractions of each holdfast (PCoA2 in Figure 4).
The bioinformatically combined sequence reads from the separately sequenced size fractions (BCF) retrieved more taxonomic richness than pooling PCR products before sequencing (MPF PCR ) based on ASVs (Figures 2, 3, and S2).Interestingly, the taxonomic richness of the BCF and the pooled fractions before sequencing (MPF PCR ) was similar when we clustered the ASVs into OTUs at 97% similarity (Figures 2 and S2).The lowest taxonomic richness was found when the sequence reads of the two size fractions were bioinformatically averaged (BAF; Figures 2, 3, and S2).Nevertheless, although we detected differences in taxonomic richness between bioinformatically combining and bioinformatically averaging the two size fractions, we detected no significant differences in community composition among these two bioinformatic approaches (Figures 2 and S2).
Pooling the large and small size fractions before DNA extraction (MPF COM ) or before PCR (MPF EXT ) had little effect on estimates of taxonomic richness (Figures 2 and S2).Similarly, we found no significant differences in community composition between the pooled size fractions before DNA extraction (MPF COM ) and the pooled size fractions before PCR (MPF EXT ; Figures 2 and S2).The taxonomic richness of the pooled size fractions before PCR (MPF EXT ) was similar to the taxonomic richness retrieved by bioinformatically averaging the sequence data of the two size fractions (BAF; Figures 3 and S2).
Interestingly, all methods for pooling the size fractions after separately extracting the DNA of each fraction (i.e., pooling extractions of the size fractions before PCR, pooling PCR products of the size fractions before sequencing, and bioinformatically averaging and bioinformatically combining sequence data from each fraction) showed similar community composition (Figures 2 and S2).

| Effect of DNA extraction
The taxonomic richness of replicate extractions from the same subsample (Extractions A and B) did not differ significantly for most of the clustering and filtering options we examined, except for ASVs when only taxonomic assigned reads were considered and for OTUs considering all reads (Figures 2 and S2).Additionally, we found similar community composition among replicate extractions for all combinations of clustering and the level of filtering (Figures 2 and   S2).
Across the subsampled communities, the mean taxonomic richness of the bioinformatically averaged extraction replicates (BAE was lower than bioinformatically combined sequence reads of the extraction replicates (BCE; Figures 2, 3, and S2).This result suggests that ASVs and OTUs may differ among subsamples of the same holdfast community (i.e., extraction replicates).On the basis of this contrast alone, we cannot determine whether biases have been introduced by PCR and/or sequencing, rather than the subsampling of the community.However, the community composition of the bioinformatically averaged (BAE) and combined (BCE) replicate extractions was similar (Figures 2 and S2).

| Effect of PCR
Subsamples from the same extraction, amplified and sequenced separately (PCR X and PCR Y ), did not differ significantly for most of the clustering and filtering options we examined, except when considering all reads at the ASV level when stringent filtering was applied (0.01% or 0.05% minimum read abundance; Figures 2 and S2).However, the community composition of the PCR replicates did not differ significantly in any of the contrasts we examined (Figures 2 and S2).
We found no significant differences in taxonomic richness among the different strategies for pooling PCR replicates for taxonomically assigned ASVs and OTUs (Figure S2).However, the highest taxonomic richness was found when bioinformatically combining the sequence data of both size fractions (BCF), followed by pooling PCR replicates (MPP PCR ), and the lowest taxonomic richness was found for bioinformatically averaging PCR replicates (BAP; Figure 3).
The community composition of the PCR replicates did not differ significantly at either the ASV level or OTU level when only the taxonomically assigned or all reads were examined (Figures 2 and S2).

| Comparison with morphology-based surveys
The morphology-based survey of holdfast-associated biodiversity recorded 181 taxa belonging to 14 phyla, of which 109 taxa were identified to the species level (Anderson et al., 2005; Figure 5).Our F I G U R E 2 Summary results for the planned contrasts used to examine differences in taxonomic richness (upper table) and community composition (based on Jaccard's dissimilarity index; lower table) as a result of different decisions taken during the community DNA metabarcoding approach, including sample preparation, DNA extraction, and polymerase chain reaction (PCR) amplification.Differences were examined at the amplicon sequence variant (left table) and operational taxonomic unit level (right table), considering different filtering thresholds (NO: no filtering; LULU algorithm; and 0.003%, 0.01%, and 0.05% minimum read abundance across all samples) and including all reads and only reads taxonomically assigned to Eukaryota.Blue color intensity increases with increasing level of statistical significance (key, top left), and white denotes no statistically significant difference between levels of the experimental treatments.The results presented here are a conservative estimate based on the consensus across three random seeds of each rarefaction performed (see Figure S2

F I G U R E 4
Principal coordinate analysis (PCoA)of Jaccard's dissimilarity between samples for amplicon sequence variants filtered at 0.003% minimum read abundance and considering all reads.Each kelp holdfast sample is represented by a different colored symbol.The small size fractions for each sample are indicated by triangles and pink polygon, whereas the large fractions are indicated by circles and the blue polygon.The labels (Large and Small) indicate the group centroids.Although among sample differences were large, indicated by the clustering of symbols of the same color along the PCoA1, differences between size fractions were consistent and associated with PCoA2 metabarcoding-based approach identified a total of 305 OTUs, representing 18 phyla; however, only 70 OTUs were assigned to the species level with 64 unique species identified (Figure 5 and Table S3).The number of assigned OTUs (i.e., taxa) was higher at lower taxonomic ranks for morphology-based surveys and the number of undefined taxa lower (Table S3).However, at higher taxonomic ranks (class, order, and phylum), the total number of assigned OTUs with the metabarcoding approach exceeded that of the morphology-based surveys (Table S3).

F I G U R E 5
Venn diagrams showing the number of taxa detected by each methodological approach at different taxonomic levels.Results of community DNA metabarcoding are displayed on the left (blue) and traditional morphology-based surveys on the right (yellow) of each Venn diagram, with the number of taxa in common in the intersection (green) F I G U R E 6 Proportion of taxa identified for each phylum by each methodological approach.DNA metabarcoding on the left (blue) and traditional morphological-based surveys on the right (yellow).OTUs, operational taxonomic units The taxonomic overlap between the morphology-and metabarcoding-based surveys was minimal at low taxonomic ranks; only two species and six genera were found in both survey methodologies, barely 1%-3% of the total OTUs assigned to those levels (Figure 5 and Table S3).The taxonomic overlap increased at higher taxonomic ranks, reaching 34% and 33% of taxa identified at the Class and Phylum level, respectively (Figure 5).There were three Phyla found only in the visual surveys (Brachiopoda, Rhizopoda, and Sipuncula), though these represented a small percentage (a combined 2.2%) of the total taxa found by the morphology-based approach (Figure 6).Seven Phyla were exclusively identified using metabarcoding, including microeukaryotes and fungi (Myzozoa, Ascomycota, and Oomycota), Archaeplastida, and Stramenopiles (Rhodophyta, Chlorophyta, Bacillariophyta, and Ochrophyta), though some of these Phyla were excluded from the morphologybased surveys a priori (Figure 6).Importantly, these microeukaryotic and fungal phyla represented 43% of the taxa found using the metabarcoding approach (Figure 6).Arthropods, Annelids, Porifera, and Echinoderms were common in the morphology-and metabarcoding-based surveys (Figure 6).Interestingly, Molluscs and Bryozoans, which were common in the morphology-based survey, were scarce in the metabarcoding-based survey (Figure 6).There were 17 species present in the morphological survey that had representative reference species barcodes in the MARES reference database; however, only two of these species were recovered in our metabarcoding survey of holdfast-associated communities.

| DISCUSS ION
To routinely use DNA metabarcoding of kelp holdfast biodiversity as a kelp forest ecosystem monitoring tool, sampling and laboratory protocols must be optimized, validated, and standardized (Cowart et al., 2015;Elbrecht & Leese, 2017;Pawlowski et al., 2018).Here, we present the analysis of a robust experimental design that quantifies the impacts of various practical, laboratory, and bioinformatic decisions made during a community DNA approach to estimating biodiversity.Our overall aim was to highlight the opportunity for using community DNA to assess the taxonomic richness and community composition of assemblages living on and in the holdfasts of a dominant ecosystem engineering kelp and to identify any shortcomings and biases in such an approach.Our results highlight that sieving the community into similarly sized organisms enables detection of a wider range of taxa, and replicate DNA extractions of the community, as well as replicate PCRs help to capture the maximum taxonomic richness within a sample.When compared with traditional morphology-based approaches to quantifying biodiversity in kelp holdfast communities, a community DNA approach recovers higher levels of taxonomic richness and a greater breadth of phyla.
Nonetheless, as described in several other systems (Gauthier et al., 2020), incomplete reference sequence databases remain a key factor limiting the potential of community DNA approaches to biodiversity assessment in this ecosystem.Below, we outline our learnings and discuss their implications for quantifying biodiversity, providing methodological and procedural recommendations for community DNA studies of kelp holdfast biodiversity.
Partitioning the kelp holdfast community into two size fractions allowed the detection of a wider diversity than what can be achieved without size fractioning.Within kelp holdfasts, resident organisms vary considerably in their biomass (Anderson et al., 2005), from nematodes only micrometers in length to sponges or colonial ascidians that can dominate much of the available space within a holdfast.The taxonomic richness obtained when sieving the kelp holdfast community into large and small size fractions, regardless of the pooling strategy used (manually in the laboratory through physical mixing of the fractions or bioinformatically), was higher to that of the community characterized when both fractions were extracted together, simulating the unsieved sample of the same communities (Figure 2).
Higher numbers of DNA copies from larger organisms with greater biomass can hinder the detection of smaller organisms, and thus, in the absence of size fractioning, higher sequencing effort is required to detect small, low biomass organisms (Cowart et al., 2015).In our study, the smaller size fraction had greater taxonomic richness than the larger size fraction at the ASV and OTU levels, considering all reads.Similar to previous studies, our results suggest that without size fractioning, it may be difficult to recover the presence of small biomass organisms in taxonomically diverse communities (Elbrecht et al., 2021;Rex & Etter, 2010;Wangensteen, Cebrian, et al., 2018).
It is worth noting that each individual holdfast contained a very different assemblage; thus, although we were able to differentiate among subsamples of the large and small fraction confidently, differences in community composition among holdfasts were substantial (Figure 4).
To avoid missing taxa, previous studies have suggested that multiple extractions and amplifications of the same sample may be required (Ficetola et al., 2015).Across our study design, the mass of the community subsample was kept consistent, optimizing the ratio of subsample mass to reagent volume for DNA extraction (as determined in pilot studies) and enabling the use of each community sample across several experimental treatments.Despite our efforts to homogenize the holdfast community samples before subsampling, we found that combining extraction replicates detects the highest taxonomic richness (Figures 2 and 3).This result is consistent with previous studies on animal taxa that have also found high variability among extraction replicates (Hermans et al., 2018).In the case of kelp holdfast community samples, the size of any subsamples used for extraction may often be too small relative to the bulk community sample to recover the full taxonomic breadth of organisms that are present (Deiner et al., 2017).Future studies may wish to trial increasing the overall mass of subsamples used in DNA extraction, to potentially gain more representative subsamples of the entire community.Nonetheless, our results also highlight the value of having extraction replicates (Zhou et al., 2011).
In contrast, studies focused on single phyla have shown that replicate extractions are less important than PCR replicates in minimizing variability among samples (Ficetola et al., 2015;Porter et al., 2019).PCR replicates are recommended as a procedure to reduce the PCR stochasticity and maximize the detection of taxa (Leray & Knowlton, 2015).The downside of increasing the number of PCR replicates, however, is the increased cost and the risk of false positives by accumulating artifactual sequences (Alberdi et al., 2018;Ficetola et al., 2015).In our study, we found that although PCR replicates are presumed to have similar biases due to primer choice and the laboratory protocols we used, inherent stochasticity in each PCR replicate slightly influenced the taxonomic richness observed but not the community composition of replicate PCRs.
Despite our efforts to rarefy subsamples that were bioinformatically combined following sequencing so that they were comparable to subsamples that were pooled before sequencing, in most cases, this was not sufficient to make up for the impact of increased sequencing depth.For instance, bioinformatically combining the large and small fractions tended to produce higher taxonomic richness than if the large and small PCR products of the fractions were pooled before sequencing, and the lowest taxonomic richness was always observed when bioinformatically averaging the large and small fractions (Figures 2, 3, and S2).Similar results were observed for other pooling strategies used for the extraction replicates, where we observed significantly higher richness when bioinformatically combining the extraction replicates compared with bioinformatically averaging extraction replicates.In contrast, bioinformatically combining PCR replicates caused little or no increase in taxonomic richness.The high similarity in the community composition of PCR replicates (discussed above; Figures 2 and S2) may explain why there was no increase in taxonomic richness when PCR replicates were bioinformatically combined, supporting findings of previous studies that suggest ecological inferences are influenced most by sequencing depth rather than PCR stochasticity (Smith & Peay, 2014).
Across our experimental design, the greatest taxonomic richness was recovered through bioinformatically combining fractions, extractions, and PCR replicates.Procedures equivalent to our bioinformatically combined treatment have been shown to commonly recover the highest number of species (Leray & Knowlton, 2017).
Nonetheless, the risk of false positives is also increased by such additive strategies-through increased introduction and amplification of contaminants, as well as sequencing errors (Alberdi et al., 2018;Flynn et al., 2015).For these reasons, different strategies such as removing singletons, even doubletons and tripletons (Kunin et al., 2010), and stringent filtering can be used to remove artifactual sequences at the expense of removing low abundance true positives (Elbrecht & Leese, 2017;Flynn et al., 2015;Leray & Knowlton, 2017).
In our case, we used both a range of thresholds for the minimum read abundance filtering across subsamples and a site occupancy model to remove rare ASVs or OTUs that may be erroneous sequences or artifacts.By choosing a relative read abundance across subsamples, ASVs, which appear in low abundance in some subsamples but may be present in greater abundance in other subsamples, will be retained as they are possible true positives (Leray & Knowlton, 2017).However, using this approach, we lost a substantial number of taxa (~80% of the ASVs and OTUs), even with the least stringent filtering (0.003%), implying the loss of many true positives (Table S2).Of all the filtering approaches used, we retained the highest number of taxa using the site occupancy model implemented using LULU.This approach uses similarity among ASVs (or OTUs) and co-occurrence patterns to determine whether rare ASVs are a sequencing error from a more abundant ASV (Frøslev et al., 2017).As a result of our findings, our recommendation for filtering would be to use either the LULU algorithm or a low minimum read abundance threshold (0.003%) to minimize the risk of false positives but retain a greater diversity of taxa.
The chosen methodological approach for a community DNA study will differ depending on whether a study is focused on ASVs or OTUs and whether the overall richness or different measures of diversity or turnover among samples are of interest.Recently, the use of ASVs instead of OTUs has been promoted because it improves the reusability, reproducibility, and comprehensiveness of sampled biodiversity (Callahan et al., 2017).In our study, differences in taxonomic richness between the bioinformatically combined subsamples and other pooling strategies diminished when clustering ASVs into OTUs.At the ASV level, combining the size fractions, extractions, or PCRs bioinformatically after sequencing revealed higher taxonomic richness than any other strategies for pooling subsamples, either by bioinformatically averaging or pooling PCR products of the size fractions before sequencing.However, at the OTU level, subsamples pooled before sequencing showed similar richness as bioinformatically combined subsamples at a reduced sequencing cost.Because differences in taxonomic richness between pooled subsamples are no longer significant when similar sequences are collapsed into OTUs, nucleotide differences (<3%) among divergent lineages of the same species or cryptic species appear to drive differences observed at the ASV level.
The community composition inferred for our kelp holdfast communities remained similar regardless of the strategy used for pooling fractions, extractions, and PCRs and were consistent across ASVs and OTUs.Therefore, it appears that multivariate descriptions of community composition may be more robust to methodological or procedural biases than univariate biodiversity indices.If the main objective is to retrieve the greatest number of taxa possible, bioinformatically combining the fractions after sequencing separately would be recommended, especially at the ASV level, using the LULU algorithm or a moderate filtering by read abundance (e.g., 0.003%) to remove false positives.Nevertheless, if there are limits to resources or the primary interest of a study is focused at the OTU level, pooling the PCR replicates of the extractions for the large and small fractions before sequencing would recover a similar community composition.Such an approach would minimize sequencing costs, in favor of increasing the field sampling effort, and thereby potentially the overall richness captured by the study (Porter et al., 2019).
We chose to use COI as the barcode region for characterizing the kelp holdfast community because of its substantial representation in reference sequence repositories (Porter & Hajibabaei, 2018), its broad taxonomic coverage, and it has been shown to successfully discriminate among species (Andújar et al., 2018).However, our data and previous results suggest that reference databases are biased towards highly abundant macroorganisms (Wangensteen, Cebrian, et al., 2018) and lack reference sequences for small and cryptic species, which can also be a challenge to identify morphologically.In our case, only 15% of the species found in the morphology-based survey of kelp holdfast communities conducted by Anderson et al. (2005) have a reference sequence in MARES.Possibly as a consequence, the observed difference in taxonomic richness between the large and small fractions of the holdfast community was not evident when considering only the assigned ASVs or OTUs.Despite efforts to generate DNA barcodes for specific taxa and locations (Carew et al., 2017), DNA metabarcoding is still somewhat limited by incomplete reference databases (Curry et al., 2018;McGee et al., 2019;Morinière et al., 2019).Although reference databases continue to improve, taxonomy-free approaches (Apothéloz-Perret-Gentil et al., 2017;Mächler et al., 2020) enable some important biodiversity inferences, albeit without the tangible links to community function and resilience that require knowledge of species (or OTUs or ASVs) identity and ecology.
Using molecular tools for biomonitoring diverse assemblages is becoming more common, often detecting higher diversity than conventional morphology-based approaches (Deiner et al., 2017).Our metabarcoding approach retrieved almost two times the number of OTUs identified using conventional morphological surveys of kelp holdfasts and had broader taxonomic coverage (Anderson et al., 2005;Shum et al., 2019;Wernberg et al., 2019).An important reason is that the COI primer set used for the metabarcoding survey targeted a broader taxonomic range than the morphological survey.
Nonetheless, there were certain strengths unique to each approach.
For example, a higher diversity of taxa was found using community DNA for some groups, such as Porifera and Cnidaria, with morphological features that are not easily retained through common preservation techniques (e.g., freezing or fixation).Our community DNA approach also recovered Rhodophyta and Ochrophyta, Phyla that were not considered by the morphological survey despite being important components of kelp forest communities (Shum et al., 2019).
Considering the phyla commonly detected with both approaches, we found dissimilarities in the characterization of certain taxa.For example, Molluscs and Bryozoans were poorly represented in the metabarcoding survey relative to the morphology-based survey, potentially due to the strengths of the morphological taxonomists (Anderson et al., 2005), a true loss of diversity in those groups as the two studies were conducted 20 years apart, taxonomic biases in the extraction of DNA in mixed communities (Hermans et al., 2018), or introduced by primer choice (van der Loos & Nijland, 2021).In the latter cases, using alternative extraction techniques or using a combination of primers to target certain taxa may be helpful (Alberdi et al., 2018).
Overall, the metabarcoding approach captured a good representation of the known kelp holdfast diversity and proved more time-and cost-effective.For the groups considered by both survey methods, there were similar trends in the number of taxa recovered for the dominant phyla.For instance, Arthropoda, Annelida, Porifera, and Echinodermata-all abundant and important taxa in kelp forest ecosystems (Anderson et al., 2005;Wernberg et al., 2019)-were common in both surveys showing a high proportion of taxa.However, there was a higher proportion of undefined taxa using the metabarcoding approach, especially at lower taxonomic ranks.The limited ability to taxonomically assign the molecular OTUs, especially at lower taxonomic ranks, again, reflects gaps in reference sequence databases (Wangensteen, Cebrian, et al., 2018), particularly for marine species (Arranz et al., 2020;Leray & Knowlton, 2016).Ideally, both approaches would be used in tandem to further develop an understanding of their respective strengths and weaknesses and to provide a specimen reference collection corresponding to the sequence reference database to help increase the ability of community DNA studies to assign taxonomic identities to sequences.
The effectiveness of a biomonitoring strategy depends on the ability to detect diversity and change over time and space (Shum et al., 2019).Our study retrieved a high number of taxa within a relatively low number of kelp holdfast samples and demonstrated the ability to distinguish among holdfast community samples at small spatial scales (meters apart; Figure 4).The holdfast has been a key focus in ecological studies because it is convenient to sample, hosts a diversity of taxa (Anderson et al., 2005;Smith, 2000;Teagle & Smale, 2018), and because kelps are susceptible to environmental change, so too are their holdfasts (Smale et al., 2013;Teagle et al., 2017;Vergés et al., 2014;Wernberg et al., 2016).One of the drawbacks in using the holdfasts for monitoring, however, was the immense diversity they support, making morphological characterization of their associated biodiversity a highly intensive task.Our study reveals that community metabarcoding provides a means for the high-resolution characterization of biodiversity associated with holdfasts, thus making kelp holdfast assemblages an accessible barometer for monitoring biodiversity change in critically important and at-risk kelp forests.By carefully dissecting procedural sources of bias and determining cost-effective and reproducible methods, it shows promise that community DNA metabarcoding could provide a standardized and repeatable method for monitoring the biodiversity of these hyperdiverse marine communities.

ACK N OWLED G EM ENTS
We would like to thank Marie Wong, Negah Nikanjam, and Nuri Begum for their assistance in the use of the freeze-dryer machine at Massey University; Dave Wright and Emma Betty for help in the field collection, and Chandler Skinner-Horne, Maddie MacLean Stones, Jenny Ann Sweatman, Julia Kim, Hochang Yoo, Ella Lis, Beau Masters, Carrie Reyden, Georgia Hrstich, and Katie Littlewood for help with sample preparation.We are grateful to Ulla Von Ammon, Xavier Turón, Owen Wangensteen, Ramon Gallego, William Pearman, and Aimee Van der Reis for sharing their knowledge and providing advice; and Marti J. Anderson for providing advice and together with Wilma Blom for sharing the morphology-based kelp holdfast biodiversity data.Special thanks to Seacologynz.comand Allgenetic.eu for the images in Figure 1.Vanessa Arranz was supported by a Marsden Fund Fast-Start grant (14-MAU-077), managed by Royal Society Te Apārangi grant awarded to J. David Aguirre.

CO N FLI C T O F I NTE R E S T
We have no conflict of interest to disclose.

( 3 )
Bioinformatically combined fractions (BCF; n = 3) versus pooling PCR products before sequencing (MPF PCR ; n = 3): Do biodiversity estimates differ if DNA extractions of the large and small size fractions are amplified and sequenced separately and then bioinformatically combined, or if the large and small size fractions are manually pooled before sequencing?(4) Bioinformatically averaged fractions (n = 11) versus pooling DNA extractions before PCR (MPF EXT ; n = 6): Do biodiversity estimates differ if DNA extractions of the large and small size fractions are amplified and sequenced separately and then bioinformatically averaged, or if the large and small DNA extractions of the size fraction are manually pooled before PCR? (5) Pooling DNA extractions before PCR (MPF EXT ; n = 6) versus pooling the size fractions before DNA extraction (MPF COM ; n = 6): Do biodiversity estimates differ if the PCR products of the large and small size fractions are manually pooled before PCR, or if the large and small size fractions are manually pooled before DNA extraction?For DNA extraction, we examined differences between replicate extractions of the same community sample (Ext A and Ext B ) and the influence of bioinformatically averaging (BAE) and bioinformatically combining (BCE) sequence data for separately sequenced replicate extractions.Specifically, the contrasts we were interested in were.(6) Extraction A (Ext A ; n = 21) versus Extraction B (Ext B ; n = 21): Do biodiversity estimates of the two extraction replicates differ?(7) Bioinformatically averaged extraction replicates (BAE; n = 42) versus bioinformatically combined extraction replicates (BCE; n = 22): Do biodiversity estimates differ if the sequence data for the separately sequenced replicate extractions are bioinformatically averaged, or if the PCR products of the replicate extractions are bioinformatically combined?For PCR, we examined differences between replicate PCR products for the same extraction (PCR X and PCR Y ) and the influence of pooling replicate PCR products for the same extraction before sequencing (MPP PCR ), as well as bioinformatically averaging (BAP) and bioinformatically combining (BCP) sequence data for replicate PCR products following sequencing.Specifically, the contrasts we were interested in were (8) PCR replicate X (PCR X ; n = 6) versus PCR replicate Y (PCR Y ; n = 6): Do biodiversity estimates from the two PCR replicates differ?(9) Bioinformatically averaged PCR replicates (BAP; n = 12) versus pooling PCR replicates (MPP PCR ; n = 11): Do biodiversity estimates differ if the sequence data for the separately sequenced replicate PCR products are bioinformatically averaged, or if the PCR replicates are manually pooled before sequencing?(10) Bioinformatically combined PCR replicates (BCP; n = 6) versus pooling PCR replicates (MPP PCR ; n = 11): Do biodiversity estimates differ if the sequence data for the separately sequenced PCR replicates are bioinformatically combined, or if the PCR replicates are manually pooled before sequencing?
. conducted the laboratory, bioinformatic and biostatistical analysis, and wrote the first draft of the manuscript.J.D.A. secured project funding, collected the samples, designed and performed the biostatistical analysis, and edited the manuscript.L.L. advised the laboratory and bioinformatic protocols, designed the biostatistical analysis, and edited the manuscript.V.A, J.D.A, and L.L conceived the experimental design, analyzed the data, and designed and prepared figures and tables.
for details of how sampled holdfast communities were subsampled across treatments.BAE, bioinformatically averaged extraction replicates; BAF, bioinformatically averaged fractions; BAP, bioinformatically averaged PCR replicates; BCF, bioinformatically combined fractions; BCE, bioinformatically combined extraction replicates; BCP, bioinformatically combined PCR replicates; EXT A , Extraction replicate A; EXT B , Extraction replicate B; Lrg, large; Sml, small; MPF COM , pooling the size fractions before DNA extractions; MPF EXT , pooling DNA extractions before PCR; MPF PCR , pooling PCR products before sequencing; MPP PCR , pooling PCR replicates; PCR X ; PCR replicate X; PCR Y , PCR replicate Y for further results).BAE, bioinformatically averaged extraction replicates; BAF, bioinformatically averaged fractions; BAP, bioinformatically averaged PCR replicates; BCF, bioinformatically combined fractions; BCE, bioinformatically combined extraction replicates; BCP, bioinformatically combined PCR replicates; EXT A , Extraction replicate A; EXT B , Extraction replicate B; Lrg, large; Sml, small; MPF COM , pooling the size fractions before DNA extractions; MPF EXT , pooling DNA extractions before PCR; MPF PCR , pooling PCR products before sequencing; MPP PCR , pooling PCR replicates;PCR X ; PCR replicate X; PCR Y , PCR replicate Y Mean (±standard error) taxonomic richness of amplicon sequence variants filtered at 0.003% minimum read abundance and considering all reads for each of the 10 planned contrasts considered at the sample preparation, DNA extraction, and polymerase chain reaction (PCR) steps of community DNA metabarcoding approach.Dashed lines connect pairs of experimental treatments considered in each numbered contrast (see Section 2.2 for details).BAE, bioinformatically averaged extraction replicates; BAF, bioinformatically averaged fractions; BAP, bioinformatically averaged PCR replicates; BCF, bioinformatically combined fractions; BCE, bioinformatically combined extraction replicates; BCP, bioinformatically combined PCR replicates; EXT A , Extraction replicate A; EXT B , Extraction replicate B; Lrg, large; Sml, small; MPF COM , pooling the size fractions before DNA extractions; MPF EXT , pooling DNA extractions before PCR; MPF PCR , pooling PCR products before sequencing; MPP PCR , pooling PCR replicates; PCR X ; PCR replicate X; PCR Y