8 June 2016 Universal Multiplexable matK Primers for DNA Barcoding of Angiosperms

The rapidly evolving and highly variable gene maturase K (matK; Hilu and Liang, 1997) has been recommended as a locus for DNA barcoding by the Consortium for the Barcode of Life (CBOL) Plant Working Group (Hollingsworth et al., 2009). Amplification and sequencing of the matK barcoding region are difficult because of high sequence variability in the primer binding sites (Hollingsworth et al., 2011). Currently, three popular matK primer pairs are available to amplify approximately the same region of the gene: 390F and 1326R (Sun et al., 2001; Cuénoud et al., 2002), XF and 5R (Ford et al., 2009), and 1R_KIM and 3F_KIM (Hollingsworth et al., 2009; Jeanson et al., 2011). Kress et al. (2009) used these three primer pairs to amplify DNA barcodes from 296 shrub and tree species. These primer combinations showed amplification success in 85% and sequencing success in 69% of the species, demonstrating that reliable amplification is possible across a range of plants when several primer combinations are used. However, using more than one primer pair can be time-consuming and costly, and is often complex to manage in large-scale projects (e.g., Heckenhauer et al., unpublished data).

Here, we report a set of universal primers that can be multiplexed in one PCR to amplify matK successfully in angiosperms and expedite high-throughput, rapid, automated, and cost-effective species identification. We present methods that enable efficient PCR amplification and sequencing of the matK barcode region.


Sequences of the matK gene from 178 taxa belonging to 123 genera and 41 families were obtained from GenBank (Appendix S1) and aligned using the MAFFT plugin (Katoh and Standley, 2013) in Geneious (version 8.0.5; Kearse et al., 2012). Because primers were initially developed for a barcoding project dealing primarily with the tree flora of Southeast Asia, matK sequences of the most representative genera and families of dicots and monocots were used. The target DNA region was located between positions 383 and 1343 of the matK gene (with respect to Arabidopsis thaliana (L.) Heynh.) and includes the binding sites of the three commonly used matK primer pairs.

Primers were designed at the most conserved regions, resulting in a fragment between positions 383 and 1256 (positions 414–1226, excluding the primer sequences). Forward primers are at a similar position to the 390F and XF primers, whereas the reverse primers are located downstream from the above-cited reverse primers to avoid a region of up to 11 adenine bases (e.g., Sterculia tragacantha Lindl. AY321178, positions 1257–1267), which could cause PCR and sequencing problems. To minimize primer degeneracy, aligned sequences were clustered into seven groups according to their genetic similarity in the MAFFT alignment, in which sequences are sorted according to their pairwise distances. For each cluster, primers with no more than five degenerate nucleotide positions were then developed. Primers were developed manually, considering primer properties (annealing temperature, 3′ and 5′ end stability) and primer secondary structures (cross dimers, dimers, hairpins), with the use of NetPrimer (PREMIER Biosoft International, Palo Alto, California, USA). Forward and reverse primers were each designed at the same positions in the matK gene so that they could be multiplexed in a single PCR for each sample. Seven forward and seven reverse primers were developed.
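The design constraint of no more than five degenerate positions per primer can be checked mechanically. The following is a minimal sketch, assuming primers are written with standard IUPAC nucleotide codes; the function names and the example sequence are hypothetical and invented for illustration (the example is not one of the primers in Table 1), and this is not the authors' workflow, which was manual.

```python
from math import prod

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_positions(primer):
    """Return the 0-based indices of positions that encode more than one base."""
    return [i for i, code in enumerate(primer.upper()) if len(IUPAC[code]) > 1]


def fold_degeneracy(primer):
    """Return the number of distinct oligonucleotides the primer represents."""
    return prod(len(IUPAC[code]) for code in primer.upper())


def within_limit(primer, max_degenerate=5):
    """Check the 'no more than five degenerate positions' rule from the text."""
    return len(degenerate_positions(primer)) <= max_degenerate


# Hypothetical toy sequence for illustration only (NOT a primer from Table 1).
example = "ATGCRTTAYGGNCAWTCGAT"
positions = degenerate_positions(example)
print(positions, fold_degeneracy(example), within_limit(example))
```

For the toy sequence, the degenerate codes R, Y, N, and W give four degenerate positions and a 2 × 2 × 4 × 2 = 32-fold degeneracy, so it would pass the five-position limit.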
Because using more primer combinations in a multiplex PCR reduces the probability of the most appropriate primers binding to the target region, only five forward and five reverse primers for the most frequent sequences in our alignment were multiplexed (Table 1: C_MATK_F/C_MATK_R). Primers were mixed in different ratios depending on their level of degeneracy (Table 1). The remaining two forward and two reverse primers serve as spares for taxa that fail to amplify with the five-primer combination. Primers were compared against the National Center for Biotechnology Information (NCBI) GenBank nucleotide reference database using the Mega BLAST algorithm. Table 2 shows BLAST results with no mismatches in forward or reverse primers at the family level. Thus, in studies where the species are identified to family level, primers can be combined accordingly in a multiplex PCR.

To evaluate the universality of the primers, multiplex PCR was conducted on DNA of 53 species from 44 families, representing frequently occurring trees and palms (e.g., Arecaceae, Dipterocarpaceae, Euphorbiaceae) in Southeast Asia (Table 3), along with taxa from other parts of the world to improve the coverage of angiosperms (e.g., Leontodon [Asteraceae], Tillandsia [Bromeliaceae], Helianthemum [Cistaceae], Polystachya [Orchidaceae]).

Approximately 30 mg of silica gel–dried material (bark or leaves) was transferred into a 96-well plate, and genomic DNA was extracted using the DNeasy 96 Plant Kit (QIAGEN, Hilden, Germany). PCRs included 5 µL of 2× ReddyMix PCR Master Mix with 1.5 mM MgCl2 (#AB-0575/DC/LD/A; Thermo Fisher Scientific, Waltham, Massachusetts, USA), 0.1 µL each of forward and reverse primer cocktail at 50 µM (final concentration 0.5 µM), 1 µL of template DNA, and H2O up to a final volume of 10 µL.
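The reaction volumes above can be verified with the dilution relation C1 × V1 = C2 × V2. The following is a minimal sketch using only the volumes and concentrations stated in the text; the function and variable names are hypothetical.

```python
# Reaction components as stated in the text (volumes in µL).
TOTAL_UL = 10.0
COMPONENTS_UL = {
    "2x ReddyMix PCR Master Mix": 5.0,
    "forward primer cocktail (50 µM stock)": 0.1,
    "reverse primer cocktail (50 µM stock)": 0.1,
    "template DNA": 1.0,
}


def final_concentration(stock_um, volume_ul, total_ul):
    """Dilution relation C2 = C1 * V1 / V2 (concentrations in µM, volumes in µL)."""
    return stock_um * volume_ul / total_ul


def water_volume(components_ul, total_ul):
    """Volume of water needed to bring the reaction up to the total volume."""
    return total_ul - sum(components_ul.values())


primer_final_um = final_concentration(50.0, 0.1, TOTAL_UL)
water_ul = water_volume(COMPONENTS_UL, TOTAL_UL)
print(f"primer cocktail final concentration: {primer_final_um:.2f} µM")
print(f"water per reaction: {water_ul:.1f} µL")
```

The calculation reproduces the stated final primer cocktail concentration of 0.5 µM and shows that 3.8 µL of water completes each 10 µL reaction.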
Thermocycler conditions were as follows: 95°C for 2 min; five cycles of 95°C for 25 s, 46°C for 35 s, and 70°C for 1 min; 35 cycles of 95°C for 25 s, 48°C for 35 s, and 70°C for 1 min; and a final extension at 72°C for 5 min. For samples that did not amplify using the above-mentioned protocol, the 2× Phusion Green HS II Hi-Fi PCR Master Mix with 1.5 mM MgCl2 (#F-566S, Thermo Fisher Scientific) was used with the following thermocycler conditions: 98°C for 30 s; five cycles of 98°C for 10 s, 53°C for 30 s, and 72°C for 30 s; 35 cycles of 98°C for 10 s, 55°C for 30 s, and 72°C for 30 s; and a final extension at 72°C for 5 min. PCR products were visualized on a 1.5% TAE agarose gel using ethidium bromide staining.

After cleaning the PCR products with 1 µL of exonuclease I and FastAP thermosensitive alkaline phosphatase mixture (7 units Exo I, 0.7 units FastAP; Thermo Fisher Scientific) at 37°C for 45 min and 85°C for 15 min, barcodes were Sanger sequenced with the BigDye Terminator Kit version 3.1 (Thermo Fisher Scientific) according to the manufacturer's instructions. Sequencing was carried out using an ABI 3730xL DNA Analyzer (Applied Biosystems, Foster City, California, USA) at the Department of Botany and Biodiversity Research, University of Vienna. Bidirectional sequences were assembled and edited in Geneious.
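The two cycling programs can be written down as data and compared. The following is a minimal sketch that encodes the temperatures and hold times stated above and sums the programmed hold time; the data layout and names are hypothetical, and the totals exclude ramping between temperatures, which depends on the instrument.

```python
# Each program is a list of stages: (number of repeats, [(temperature in °C, seconds), ...]).
REDDYMIX_PROGRAM = [
    (1, [(95, 120)]),                      # initial denaturation
    (5, [(95, 25), (46, 35), (70, 60)]),   # five cycles at the lower annealing temperature
    (35, [(95, 25), (48, 35), (70, 60)]),  # 35 cycles at the higher annealing temperature
    (1, [(72, 300)]),                      # final extension
]

PHUSION_PROGRAM = [
    (1, [(98, 30)]),
    (5, [(98, 10), (53, 30), (72, 30)]),
    (35, [(98, 10), (55, 30), (72, 30)]),
    (1, [(72, 300)]),
]


def hold_seconds(program):
    """Total programmed hold time in seconds (ramp times are not included)."""
    return sum(repeats * sum(sec for _, sec in steps) for repeats, steps in program)


def total_cycles(program):
    """Number of amplification cycles, counting only multi-step stages."""
    return sum(repeats for repeats, steps in program if len(steps) > 1)


for name, program in [("ReddyMix", REDDYMIX_PROGRAM), ("Phusion", PHUSION_PROGRAM)]:
    print(f"{name}: {total_cycles(program)} cycles, "
          f"{hold_seconds(program) / 60:.1f} min of programmed holds")
```

Under these assumptions both programs run 40 cycles, with 87 min of programmed holds for the ReddyMix protocol and about 52 min for the Phusion protocol.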

Table 1.

Primers developed for multiplex PCR used to amplify the matK barcoding region. The forward (C_MATK_F) and reverse (C_MATK_R) primer cocktails, as well as the four additional primers, are given with their proportions in the primer cocktail.


Using 2× ReddyMix PCR Master Mix, all samples could be amplified except for one sample with low-quality DNA (Fig. 1, slot 30). This sample was successfully amplified in a PCR with 2× Phusion Green HS II Hi-Fi PCR Master Mix (Fig. 1, slot 31). Overall, the newly designed degenerate primer cocktails were very effective (100%) in amplifying the target matK region, with a product of 813 bp in length in Arabidopsis thaliana. By multiplexing the primers in a single PCR, barcodes were recovered from all samples.
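The 813-bp product length follows from the alignment coordinates given for the designed fragment (positions 383–1256 including primers, 414–1226 excluding them, relative to Arabidopsis thaliana). The following is a minimal sketch of that inclusive-coordinate arithmetic; the function and variable names are hypothetical, and the flanking lengths are simply what the stated coordinates imply.

```python
def span_length(start, end):
    """Length of a region given inclusive 1-based start and end positions."""
    return end - start + 1


# Coordinates stated in the text, relative to the Arabidopsis thaliana matK gene.
fragment_with_primers = span_length(383, 1256)
fragment_without_primers = span_length(414, 1226)

# Flanking regions occupied by the primer binding sites, implied by the coordinates.
forward_flank = span_length(383, 413)
reverse_flank = span_length(1227, 1256)

# Adenine-rich region avoided by the reverse primers (Sterculia tragacantha example).
poly_a_run = span_length(1257, 1267)

print(fragment_with_primers, fragment_without_primers, poly_a_run)
```

The fragment spans 874 bp with primers and 813 bp without them, which is consistent with bands of approximately 900 bp on the gel, and the avoided adenine run is 11 bases long.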


We developed 14 universal, partly degenerate primers suitable for DNA barcoding of angiosperms that may also be suitable for multiplexed amplicon sequencing approaches on next-generation sequencing platforms (e.g., fusion primers on the Illumina system, see Elbrecht and Leese, 2015). We confirmed the effectiveness of our multiplexed primers on 53 species from 44 different plant families. Amplification success for these multiplexed primers in the cross-transferability tests with plant families outside Southeast Asia extends their potential usefulness, especially for large-scale barcoding projects with a diverse composition of plant families. Furthermore, by improving the routine amplification of the matK barcode, the establishment of our multiplex PCR approach will reduce laboratory costs as well as potential laboratory errors.

Table 2.

Recommended use of primers for different families, based on BLAST matches with no mismatches.






Table 3.

Taxa used for primer testing.


Fig. 1.

Images of PCR amplicons for representatives of 53 angiosperm families using multiplex PCR with the newly developed degenerate primers (matK-413f-1 to matK-413f-5, matK-1227r-1 to matK-1227r-5). Bands are approximately 900 bp. Most of the samples were amplified using 2× ReddyMix. The low-quality DNA sample that failed PCR with ReddyMix (slot 30) could be amplified using 2× Phusion Green HS II Hi-Fi PCR Master Mix (slot 31). For detailed sample description, see Table 3. Ladder: GeneRuler 100 bp Plus DNA Ladder (#SM0321; Thermo Fisher Scientific, Waltham, Massachusetts, USA). N = negative control.



This research was funded by the Austrian Science Fund (Fonds zur Förderung der wissenschaftlichen Forschung [FWF]; AP26548-B22). The authors thank Anton Russell for language editing.



Cuénoud, P., V. Savolainen, L. W. Chatrou, M. Powell, R. J. Grayer, and M. W. Chase. 2002. Molecular phylogenetics of Caryophyllales based on nuclear 18S rDNA and plastid rbcL, atpB, and matK DNA sequences. American Journal of Botany 89: 132–144.


Elbrecht, V., and F. Leese. 2015. Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass–sequence relationships with an innovative metabarcoding protocol. PLoS ONE 10: e0130324.


Ford, C. S., K. L. Ayres, N. Toomey, N. Haider, J. Van Alphen Stahl, L. J. Kelly, N. Wikström, et al. 2009. Selection of candidate coding DNA barcoding regions for use on land plants. Botanical Journal of the Linnean Society 159: 1–11.


Hilu, K. W., and H. Liang. 1997. The matK gene: Sequence variation and application in plant systematics. American Journal of Botany 84: 830–839.


Hollingsworth, P. M., L. L. Forrest, J. L. Spouge, M. Hajibabaei, S. Ratnasingham, M. van der Bank, M. W. Chase, et al. 2009. A DNA barcode for land plants. Proceedings of the National Academy of Sciences, USA 106: 12794–12797.


Hollingsworth, P. M., S. W. Graham, and D. P. Little. 2011. Choosing and using a plant DNA barcode. PLoS ONE 6: e19254.


Jeanson, M. L., J. N. Labat, and D. P. Little. 2011. DNA barcoding: A new tool for palm taxonomists? Annals of Botany 108: 1445–1451.


Katoh, K., and D. M. Standley. 2013. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Molecular Biology and Evolution 30: 772–780.


Kearse, M., R. Moir, A. Wilson, S. Stones-Havas, M. Cheung, S. Sturrock, S. Buxton, et al. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28: 1647–1649.


Kress, W. J., D. L. Erickson, F. A. Jones, N. G. Swenson, R. Perez, O. Sanjur, and E. Bermingham. 2009. Plant DNA barcodes and a community phylogeny of a tropical forest dynamics plot in Panama. Proceedings of the National Academy of Sciences, USA 106: 18621–18626.


Sun, H., W. McLewin, and M. F. Fay. 2001. Molecular phylogeny of Helleborus (Ranunculaceae), with an emphasis on the East Asian-Mediterranean disjunction. Taxon 50: 1001–1018.
Jacqueline Heckenhauer, Michael H. J. Barfuss, and Rosabelle Samuel "Universal Multiplexable matK Primers for DNA Barcoding of Angiosperms," Applications in Plant Sciences 4(6), (8 June 2016).
Received: 7 December 2015; Accepted: 1 February 2016; Published: 8 June 2016
