Global occurrences of herbicide resistant weed populations have increased the demand for development of new herbicides targeting novel mechanisms of action. Metagenomic approaches to natural drug discovery offer potential for isolating weed suppressive compounds from microorganisms. In past research, traditional techniques entailed isolating compounds from living organisms, whereas metagenomic approaches involve extracting fragments of DNA from soil and exploring for compounds of interest produced by the transformed hosts. Several herbicidal compounds have been isolated from soil bacteria through culturing methods and have led to the development of popular herbicides, such as glufosinate. In this review, we discuss the emergence of metagenomic approaches for weed management in the context of natural product discovery using traditional culture-dependent isolation and the more recent culture-independent methods. The same techniques can be used to isolate herbicide resistance genes. Adoption of metagenomic approaches in pest management research can lead to novel control strategies in cropping and landscape systems.
Nomenclature: Glufosinate; bialaphos.
Since the mid-1990s, there has been a steep increase in the reported cases of herbicide resistant weed species. Effective control of weeds is compromised when plant populations develop resistance to common herbicides, such as glyphosate, dicamba, and 2,4-D. Glyphosate resistance has occurred in a wide range of weed species as cropping systems planted with glyphosate resistant crops have expanded across the agricultural landscape (Heap 2012). Plant populations that evolve to withstand glyphosate treatment require alternative herbicide inputs. However, the next generation of genetically modified (GM) crops designed to be resistant to auxinic herbicides may jeopardize even more control options relying on dicamba and 2,4-D (Mortensen et al. 2012). Additionally, overreliance on ALS, PPO, photosystem II, and glycine herbicides is likely to result in the rapid evolution and spread of multiple herbicide-resistant populations of weeds across the U.S. and globally.
The increased occurrences of herbicide-resistant weeds across all major modes of action threaten the productivity of cropping systems dependent on chemical weed control. A limited number of new herbicides have been developed in recent years, but no new modes of action have emerged in recent decades (Duke 2012). With fewer chemical compounds available for weed control, new strategies are needed to discover novel weed-suppressive chemistries. The development of metagenomic tools for natural product discovery shows great promise in confronting the increasing occurrence of herbicide resistance in weed populations. Historically, the development of new, herbicidal compounds isolated from microorganisms was plagued by time- and labor-intensive compound screening, isolation, and efficacy trials. New methods in molecular biology can accelerate the rate and scale of product development. For example, the discovery of antibiotics and enzymes produced by microorganisms using metagenomic techniques has transformed the biomedical and industrial fields in the past decade (Li and Vederas 2009). The adoption of metagenomic-based screening methods for new herbicidal compounds can make significant contributions to agriculture and pest management as well.
Metagenomics refers to the genomic assemblages of microorganisms isolated directly from their environment, without the need for prior culturing under laboratory conditions (Handelsman 2004). Thus the term “culture-independent” often accompanies descriptions of metagenomic techniques. In more recent times, metagenomics entails massively parallel sequencing of microbial metagenomes (Scholz et al. 2012), but functional analyses are still performed using clones. Functional screens can be set up to isolate clones containing genes coding for biosynthetic production of natural products (Brady et al. 2001).
Many of these natural products include antimicrobial and phytotoxic compounds. Pharmaceutical and chemical companies have used metagenomic techniques to screen for new antibiotics and enzymes (Li and Vederas 2009; Schloss and Handelsman 2003). Similar functional screens can be developed to isolate novel weed-suppressive compounds that target the spectrum of plant growth stages. An advantage to using a metagenomic-based functional screen is the increased potential to discover new biosynthetic gene clusters derived from soil bacteria. Soil actinomycetes, namely the Streptomyces spp., produce herbicidal compounds that have been commercially developed for weed control (Barazani and Friedman 2001). The few herbicidal compounds isolated from soil microorganisms to date have been based on culture-dependent methods of growing microorganisms in laboratory settings. With the development of metagenomics in the mid-1990s, we are now able to explore a greater pool (> 90%) of soil microorganisms producing compounds of use to agriculture, biomedicine, and industry.
Advances in molecular biology developed specifically for improved screening of microbially-derived herbicides can enhance the potential for weed control across a range of crop and noncrop systems. These tools can be useful in enhancing integrated pest management (IPM) programs involving rotations of herbicides and other natural products. The same set of techniques can be applied to isolate novel microbially-derived insecticides, antibiotics, and fungicides. In this paper, we first review the history of herbicide development using living microorganisms (bioherbicides) and herbicidal compounds produced by microorganisms (natural products). We then describe the methodology underlying metagenomics research and explain the array of screening methods used to isolate compounds of interest. We include examples of screening methods specific for herbicide discovery and isolating herbicide-resistance genes. We conclude the paper by discussing further advancements in metagenomics research using improved vector–host expression systems, advanced sequencing technology, and bioinformatics.
Development of Bioherbicides
In the past 20 yr, inundative biocontrol research has uncovered an abundance of microorganisms that suppress weeds across the full spectrum of plant growth stages. Inundative control refers to the use of massive numbers of biological agents to control weeds in the same year of application. In contrast, classic biocontrol takes 2 or more yr to suppress weeds. Most commercial bioherbicides are based on the concept of inundative control and are typically target-specific pathogens applied to weeds in a manner similar to using chemical herbicides. Much of the recent work on bioherbicide research has targeted major crops and turfgrass landscapes (Table 1).
Table 1.
List of bioherbicides developed commercially since the 1960s (adapted from Barton 2005).

Isolation of microbial strains involved in weed seed decay, germination inhibition, or germination arrestment provides insights into weed seedbank management (Banowetz et al. 2008; Einhorn and Brandau 2006; Kremer 1993). Numerous strains of bacterial pseudomonads isolated from the soil root zone of weedy plants have served effectively as PRE bioherbicides (Daigle et al. 2002; Kennedy and Stubbs 2007; Zdor et al. 2005). Many more microbial strains have been isolated and used for suppression of seedling and early root growth (Medd and Campbell 2005; Weissmann et al. 2003).
Despite the high potential of biological weed control, few commercial products are available for use by producers and the general public (Hallett 2005). A major obstacle is the extensive time involved with isolating microbial strains through an exhaustive screening process. In a typical experiment, over 1,000 microbial isolates are recovered using conventional agar plate growth conditions (Kennedy and Stubbs 2007). Screening each of the 1,000+ isolates for the potential to suppress weeds is a highly resource- and time-consuming process. Multiple screens narrow down the pool of candidates to a few microbial strains that are further tested under field conditions (Barazani and Friedman 2001). By the end of the screening process, less than 1% of the original pool of microbial isolates is suitable for further research as inoculants (Kennedy et al. 1991).
The screening process is followed by another round of laborious procedures that evaluate the bioherbicide's compatibility with industrial processing and marketing (Auld and Morin 1995). A bioherbicide is considered commercially viable if it is relatively inexpensive to manufacture, has a long shelf-life, and can be applied in the field with the existing farm equipment (e.g. mowing equipment, sprayers, and broadcast spreaders). Several of the bioherbicides studied were developed into commercial products for use in crop and noncrop systems and natural areas (Table 1). Consumer demand for biological weed control products also determines a bioherbicide's potential for commercial success. Recent regulatory bans on the use of pesticides on turfgrasses in Canada and the U.S. have created a demand for bioherbicides in North America (Arya 2005; Pralle 2006). This heightened interest in weed biocontrol will provide new impetus for the development of bioherbicides.
Many review papers provide comprehensive summaries of microbial strains isolated and developed for weed suppression in agricultural, forest, and rangeland settings (Boyetchko 1997; Culliney 2005; Kremer and Kennedy 1996; Sauerborn et al. 2007). Much of the groundbreaking work conducted on PRE bioherbicides focuses on pseudomonads. Emergence of green foxtail [Setaria viridis (L.) Beauv.] was suppressed by as much as 90% with the application of Pseudomonas fluorescens in a formulated “Pesta” product (Daigle et al. 2002) and in semolina flour (Zdor et al. 2005). Additionally, P. fluorescens reduced germination of jointed goatgrass (Aegilops cylindrica Host) by over 30%, while several more isolates were effective in inhibiting root elongation of the weed (Kennedy and Stubbs 2007). POST bioherbicides that reduce seed production by mature weeds help reduce the number of seeds in the seedbank and may also reduce their persistence. The biocontrol rust fungus, Puccinia carduorum, was effective in reducing musk thistle (Carduus nutans L.) seed production by as much as 57% (Baudoin et al. 1993). Additionally, seed production by downy brome (Bromus tectorum L.) was decreased by up to 64% with the application of bacterial pseudomonads (Kennedy et al. 1991).
Although bioherbicide research has continued in recent times, the costs and risks of using living organisms often impede progress in commercial development. Nontarget host use is a major concern when using living organisms to control weeds. Biocontrol agents could adapt to their host and environment, leading to a potential risk in developing self-replication, high dispersal (into nontarget habitats), and host-switching over time (Louda et al. 2003). Metagenomic approaches to herbicide discovery avoid the risks of nontarget adaptations and irreversible introductions by isolating the compounds produced by clones. There is no risk of microorganisms becoming pathogens on nontarget plants if the secreted and purified compounds are used instead of the living organism.
Herbicidal Compounds Isolated from Soil Microorganisms
Many compounds demonstrating herbicidal activity are produced intracellularly by microorganisms and secreted through cell membranes into the surrounding environment. These include the production of toxic compounds that inhibit seed germination. Cyanide was identified as a growth-inhibiting compound released by bacteria to suppress weeds (Kremer and Souissi 2001). Cyanide is a potential inhibitor of enzymes involved in various plant metabolic processes. Other herbicidal compounds prevent the germination of seeds through inhibition or arrestment. For example, germination-arrest factor (GAF) has been isolated from P. fluorescens and irreversibly blocks the germination process (Armstrong et al. 2009). Further studies of GAF could yield the potential for commercial production as a PRE herbicide.
Additional compounds produced by microorganisms that have been purified and used for POST weed suppression are listed in Table 2. Herbicidin is an antibiotic compound obtained from Streptomyces saganonensis that inhibits several monocots and dicots (Cutler 1988). Other antibiotics shown to have herbicidal activity include blasticidin and 5-hydroxymethyl-blasticidin S. These antibiotics are highly selective for dicots, inhibiting as much as 98% of the POST (one-leaf stage) plants examined (Scacchi et al. 1992). Moreover, nigericin, hydantocidin, and geldanamycin derived from Streptomyces hygroscopicus suppressed several annuals and perennials at the PRE and POST stages (Heisey and Putnam 1990; Nakajima et al. 1991). Methoxyhygromycin is an antibiotic produced by Streptomyces sp. 8E-12 that demonstrates weed-suppressive abilities (Lee et al. 2003). The compound works as a bleaching herbicide that shows selective control of several monocot species.
Table 2.
List of herbicidal natural products derived from microorganisms (adapted from Barazani and Friedman 2001).

Other natural products derived from microorganisms have been developed for commercial use (Duke et al. 2000). Isolation of the phosphonate antibiotic, bialaphos, from Streptomyces viridochromogenes and Streptomyces hygroscopicus led to the development of the popular herbicide, glufosinate (Hoerlein 1994). Bialaphos contains a glutamic acid analogue moiety called phosphinothricin, better known as glufosinate. The compound inhibits glutamine synthetase and triggers a toxic ammonium buildup derived from photorespiration in a wide range of monocot and dicot plants. Glufosinate is sold commercially as Basta®, Buster®, Finale®, Ignite®, and Liberty® for nonselective weed control in crop and noncrop landscapes. Glufosinate usage is expected to increase in the coming years as several GM crops have been engineered to be resistant to the herbicide. Additionally, thaxtomin A is an herbicidal compound that has been developed more recently for several crop species. The metabolite was isolated from Streptomyces acidiscabies and has shown high selectivity for common monocot and dicot weed control in rice crops and turfgrass landscapes (Strange 2007). As isolation techniques improve, many more discoveries of herbicidal compounds from microbial strains are expected. One such compound is tagetitoxin from Pseudomonas syringae pv. tagetis (Lydon et al. 2011).
A Case for Metagenomics in Weed Management
Advances in molecular biology in the past decade have allowed researchers to discover novel plant–microbial relationships relevant to weed control (Rector 2008). The expanding field of metagenomics allows both microbial identity and function to be determined by extracting DNA and RNA directly from the environment, which circumvents the need to isolate and culture microorganisms (Handelsman 2004). Direct isolation of DNA from the environment increases the diversity of microbial strains recovered. Metagenomic studies have shown that a single gram of soil holds from 2,000 to over 52,000 microbial species or genomes (Gans et al. 2005; Schloss and Handelsman 2006). This estimate far exceeds the number of microbial species found using traditional culturing techniques.
The massive diversity of bacterial and fungal genomes in soil provides a potentially vast pool of genes that code for the production of herbicidal compounds and herbicide resistance. As a consequence, two approaches to weed management can be developed using metagenomic techniques: (1) isolation of novel herbicides produced by vector–hosts expressing the biosynthesis genes, and (2) identification of herbicide resistance genes in vector–hosts exposed to high levels of an herbicide. The two approaches rely on different functional screening assays, as discussed in later sections, but the initial construction of the metagenomic libraries (clones containing fragments of the metagenomes) is identical.
Incorporating metagenomic approaches into a weed scientist's toolbox is feasible today as commercial kits for DNA extraction and cloning are widely available. For example, DNA extraction kits are available for different environmental samples, including soil, water, plant, and stool. Likewise, vector–host kits optimized for different expression systems and insert lengths are available from many companies. In the following sections, we discuss the details of creating a metagenomic clone library, followed by descriptions of functional screening techniques. It is important to note that researchers are likely to provide subpools of metagenomic clone libraries upon request. We recommend contacting the corresponding authors of the papers we cite in the section on “Constructing Metagenomic Clone Libraries”.
Constructing Metagenomic Clone Libraries
DNA extracted from environmental samples is commonly analyzed for phylogenetics or function. While metagenomics refers to the collection of genomes in an environmental sample, many researchers amplify the 16S and 18S rRNA, rpoB, and recA genes for examining microbial diversity and phylogenetic relationships (Kämpfer and Glaeser 2012; Sleator 2011). Microbial function is examined through a variety of methods that target genes or gene clusters typically coding for the production of compounds and proteins. Depending on the gene or activity of interest, the probability of finding a target can vary. For example, the frequency of antibiotic discovery from soil actinomycetes ranges from 10⁻¹ to 10⁻⁷, while rare and novel antibiotics occur at < 10⁻⁷ according to some estimates (Baltz 2006).
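To give a sense of what these frequencies imply for screening effort, the following Python sketch estimates how many isolates or clones would have to be screened to recover at least one positive with a chosen probability. It assumes that every clone is an independent trial with the same hit frequency, which is a simplification of ours and not a model taken from Baltz (2006); the function name and the 95% default are likewise illustrative.

```python
import math


def clones_needed(hit_frequency: float, confidence: float = 0.95) -> int:
    """Smallest N such that 1 - (1 - f)**N >= confidence.

    Assumes independent clones with a constant hit frequency f
    (an illustrative simplification, not a published model).
    """
    if not 0.0 < hit_frequency < 1.0:
        raise ValueError("hit_frequency must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # log1p(-f) keeps numerical precision when f is very small (e.g. 1e-7).
    n = math.log(1.0 - confidence) / math.log1p(-hit_frequency)
    return math.ceil(n)


if __name__ == "__main__":
    # The two ends of the frequency range quoted in the text.
    for f in (1e-1, 1e-7):
        print(f"frequency {f:g}: screen about {clones_needed(f):,} clones")
```

Under these assumptions, a frequency of 10⁻¹ calls for only a few dozen clones, whereas a frequency of 10⁻⁷ calls for on the order of 3 × 10⁷ clones, which is why high-throughput screens matter for rare activities.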
DNA Extraction
The initial step for constructing a metagenomic library is to extract DNA from environmental samples (Figure 1a). Commercial DNA extraction kits allow for rapid sample preparation, but the resulting DNA fragments are small at ≤ 15 kb in size. The biosynthetic genes for known natural products can be smaller than 15 kb, but many are organized into operons with promoters and regulatory sites yielding total lengths ≥ 100 kb. For example, the biosynthetic genes for phosphinothricin, i.e. glufosinate, total approximately 40 kb (accession number AY632421.1, GenBank). To increase the probability of obtaining DNA for expression of a full biosynthetic pathway, extraction methods that minimize DNA fragmentation are the best way to obtain high molecular weight DNA.
Figure 1.
Diagram illustrating the layout of metagenomics research. DNA is first extracted from the environment (A) and then analyzed using sequencing (B and E) or function-driven methods (F). Sequencing using next generation technologies (B) can reveal environmental samples enriched with genomes containing polyketide synthases and nonribosomal peptides (indicative of natural products). Restriction enzymes are used to cut fragments of DNA into large inserts for cloning vectors (C). Sanger-based sequencing of clones (E) has largely been replaced with next generation sequencing (B). Function-driven analysis is used to isolate natural products or resistance mechanisms in clones containing the large insert DNA (F).

Extraction methods for metagenomic DNA can be organized into two categories: direct lysis and cell separation. Direct lysis was pioneered by Ogram and coworkers (1987), who used mechanical and/or chemical forces to gently lyse cells and collect DNA. These methods include a combination of enzymes, high temperatures, freeze/thaw cycles, grinding, and detergents. These techniques lead to some shearing of the DNA and yield a mixture of eukaryotic and prokaryotic DNA, which decreases cloning efficiency into prokaryotic host organisms (Gabor et al. 2003). In comparison, cell separation involves separating the prokaryotic cells from the soil matrix prior to DNA extraction (Holben et al. 1988). While this technique yields mostly prokaryotic DNA of longer lengths, DNA yields are typically 10- to 100-fold lower than with direct lysis (Courtois et al. 2001). Each extraction technique has advantages and disadvantages; further research is necessary to overcome the drawbacks of each technique (Delmont et al. 2011).
Sequence-Based and Functional Approaches
Following DNA extraction, the sample can be sequenced following PCR (Figure 1b), or cloned into a host organism for sequencing or functional analysis (Figure 1e,f) (Suenaga 2012). Sequencing provides an analysis of the metagenomes in an environmental sample, whereas function-based cloning reveals the activity of a gene or gene cluster when expressed in a compatible host. Pioneering sequencing work required that environmental DNA fragments be ligated into vectors, transformed into Escherichia coli cells, and then sequenced using Sanger-based technology (Rondon et al. 2000) (Figure 1e). Next generation sequencing (NGS) technology, such as Roche 454, Illumina, or Life Technologies SOLiD, allows for sequencing without the need for cloning DNA fragments into E. coli (Pareek et al. 2011; Scholz et al. 2012) (Figure 1b). NGS technologies are more affordable and provide more reads, more depth of coverage, and lower error rates compared with Sanger-based methods, but the read lengths have been limited to short reads of 35 to 250 bp until recently (Pareek et al. 2011). NGS typically requires PCR with barcoded primers, yielding amplified products that can be separated and categorized following sequencing (Mamanova et al. 2010).
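The separation of barcoded reads after sequencing can be illustrated with a minimal Python sketch. It assumes that each read begins with its sample barcode and that barcodes match exactly; production pipelines usually tolerate sequencing errors and use dedicated tools, so this is only a conceptual outline. The barcode sequences, sample names, and function name below are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by the sample barcode found at the start of each read.

    reads    -- iterable of sequence strings
    barcodes -- dict mapping barcode sequence -> sample name (hypothetical)
    Returns a dict of sample name -> list of reads with the barcode trimmed.
    Reads without an exact barcode match go under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                # Trim the barcode so only the amplified sequence remains.
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    example_barcodes = {"ACGT": "soil_A", "TGCA": "soil_B"}  # made-up values
    example_reads = ["ACGTGGGCCC", "TGCAAATTT", "NNNNAAAA"]
    for sample, seqs in demultiplex(example_reads, example_barcodes).items():
        print(sample, seqs)
```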
Functional analysis involving expression of genes or gene clusters in microbial hosts involves two main steps: (1) insertion of DNA fragments into a cloning vector by ligation (Figure 1c) and (2) introduction of the vector into a suitable host organism through transformation (Figure 1d). Vectors can be chosen by their insert-carrying capabilities and include plasmids (< 15 kb), fosmids/cosmids (< 40 kb), and bacterial artificial chromosomes (BAC) (> 100 kb). A single vector can include a metagenomic insert containing an entire gene cluster coding for the production of a natural product (Handelsman et al. 1998). The vectors are circular pieces of self-replicating DNA with genes encoding antibiotic resistance, which allows for selection of transformed cells. Following ligation, the vector is incorporated into a host organism. A large variety of host organisms are available, including E. coli, Streptomyces lividans, Agrobacterium tumefaciens, Burkholderia graminis, Pseudomonas putida, and Saccharomyces cerevisiae.
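The relationship between gene cluster size and vector choice can be sketched as a simple lookup in Python. The capacities below are the approximate limits quoted above; the text gives only a lower bound for BACs (> 100 kb), so the 200 kb ceiling used here is an assumed placeholder, and the function name is hypothetical.

```python
from __future__ import annotations

# Approximate insert limits in kb, following the ranges quoted in the text.
VECTOR_CAPACITY_KB = {
    "plasmid": 15,
    "fosmid/cosmid": 40,
    "BAC": 200,  # assumption for illustration; the text states only "> 100 kb"
}


def vectors_for_cluster(cluster_kb: float) -> list[str]:
    """Return the vector classes whose insert limit exceeds the cluster size.

    The comparison is strict ("<") to mirror the "< 15 kb" and "< 40 kb"
    limits given in the text.
    """
    if cluster_kb <= 0:
        raise ValueError("cluster size must be positive")
    return [name for name, limit in VECTOR_CAPACITY_KB.items()
            if cluster_kb < limit]


if __name__ == "__main__":
    # 40 kb is the approximate size of the phosphinothricin cluster cited earlier.
    for size in (10, 40, 120):
        print(f"{size} kb cluster -> {vectors_for_cluster(size)}")
```

Under these assumed limits, a cluster of roughly 40 kb sits at the edge of fosmid/cosmid capacity, which suggests why such a cluster may need a larger-insert vector or may end up spread across several overlapping clones.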
Proper ligation and insertion of metagenomic DNA into a host does not guarantee expression (Craig et al. 2010; Ekkers et al. 2012). An organism's transcription and translation machinery is typically compatible only with genes from taxonomic relatives; differences arise in codon bias, post-translational modifications, and regulatory factors. Most metagenomic studies use E. coli as a vector–host, but genes coding for metabolite production may not be expressed in E. coli clones. In such cases, modified vectors can be shuttled into secondary hosts that enhance expression. Shuttle cosmid vectors are gaining wider attention because they allow E. coli metagenomic libraries to be screened in many different hosts, including S. lividans (McMahon et al. 2012). Using actinomycetes as vector–hosts can improve expression of metabolite-encoding genes since many antibiotics isolated from microorganisms are derived from Streptomyces spp. Once the vector is taken up by the host, each cloned cell contains a single fragment of metagenomic DNA. The cells can be plated and the clones propagated, so that a large quantity of a single environmental DNA fragment is collected. The different colonies can be scraped from plates, pooled in a single tube, and stored cryogenically for subsequent analysis.
Functional analysis can be conducted with a combination of targeted sequencing and host expression screening. Blodgett and coworkers (2005) completed the bialaphos biosynthetic operon by screening a fosmid library with primers, searching for known sections to complete the missing components of the operon. If the expressed compound is already known, activity or protein probes can be designed to find clones containing the biosynthetic genes of interest. For example, DNA from Pantoea agglomerans, a known producer of two antibiotics, was used to construct a cosmid library in E. coli. The clone library was screened for the expected antibiotic activity, and the positive clones were used to generate antibiotic-defective marker exchange mutants for further molecular biology studies (Wright et al. 2001). Sequence-based screening is effective for examining relevant genes, but functional screening based on expression in hosts is most suitable for the discovery of natural products, such as novel herbicides.
Types of Functional Screens for Natural Product Discovery
Function-based screening allows for the discovery of compounds with new modes of action, structure, or genetic sequence. Functional screening techniques can be divided into three groups: (1) direct detection, (2) indirect detection of activity, and (3) substrate-induced detection (Ekkers et al. 2012). Not all compounds are secreted by the host organism, and some research aims to circumvent this problem (Suenaga et al. 2007). Many natural compounds of interest are classified as secondary metabolites, which are produced during the stationary phase of growth and are not necessary for growth. As shown in Figure 2, these secondary metabolites, also known as small molecules, are indicated by phenotypic changes in clones. Pigmentation, colony morphology, and antibiosis are common examples (Brady 2007; Craig et al. 2010). Antibiosis, the ability of a metabolite to prevent the growth of or kill another organism, can be detected by overlaying clones with another microbial culture and observing zones of clearing around colonies (Figure 2). In addition, direct activity can be seen using a medium that contains a colorimetric substrate or is opaque, and looking for a change in color or zones of clearing, respectively. Substrates can be purchased that are covalently linked to colorimetric or fluorometric molecules to help with detection.
Figure 2.
Metagenomic library clones displaying phenotypes indicative of small-molecule production for natural product discovery. The common phenotypes include: zones of clearing from antibiotic production (A), pigmented clones (B and D), and changes in colony morphology (C). The boxes labeled “WT” refer to the wild type form of the host. The images are reprinted with permission from the American Society for Microbiology (ASM) from Craig et al. (2010).

Indirect detection of activity is conducted through heterologous complementation, reporter genes, and promoter-trap assays. Heterologous complementation is the use of a host organism constructed to lack a gene required for function. If the metagenomic DNA insert contains the necessary genes for production, the clone will grow where others cannot because of functional complementation (Wenzel and Müller 2005). For example, Simon and coworkers (2009) used E. coli polA mutants to screen for DNA polymerase-encoding genes from glacial ice. Reporter gene assays use colorimetric (e.g. the lacZ gene and β-galactosidase) or fluorometric (e.g. green fluorescent protein (GFP), biosensor) screens. When the reporter microbe is grown as an overlay on top of the metagenomic library clones, specific clones of interest will cause the reporter strain to fluoresce or change color. This technique has been successful for finding genes and proteins involved with quorum sensing (Hao et al. 2010) and quenching (Romero et al. 2011). Indications of bacterial quorum sensing are useful for identifying antibiotic production in metagenomic clones. Another indirect activity assay uses promoter-trap vectors, in which the vector contains the GFP gene but no promoter. GFP is therefore not expressed unless a promoter is present in the inserted DNA fragment (Dunn et al. 2003; Lee et al. 2011).
The third category of functional screens examines the regulation of an unknown gene's expression. Substrate induced gene-expression screening, i.e. SIGEX, is based on the assumption that catabolic genes are usually regulated by the corresponding substrate. The initial study using SIGEX involved benzoate and naphthalene as substrate inducers, allowing unknown genes for aromatic compound degradation to be found (Uchiyama et al. 2005). A similar screen is based on metabolite-regulated expression, i.e. METREX, using a reporter plasmid in the host organism which is directly regulated by compounds that induce quorum sensing (Figure 3). Once a threshold of the inducer is produced within the cell, GFP will be expressed (Guan et al. 2007). A third assay, PIGEX, looks at product rather than substrate induced gene expression (Uchiyama and Miyazaki 2010).
Figure 3.
The plate shows a biosensor screening assay developed for quorum sensing (METREX). The metagenomic clone of interest is streaked alongside control clones containing an empty vector (A). The pigmented clone, circled, induces quorum sensing of neighboring colonies (B). Quorum sensing is indicated by green fluorescence from green fluorescent protein (GFP) expressed by the biosensor plasmids. The images are reprinted with permission from the American Society for Microbiology (ASM) from Guan et al. (2007).

SIGEX, METREX, and PIGEX do not require dilution and plating of clones; rather, the fluorescent cells can be found and collected using fluorescence-activated cell sorting (FACS). Initial work with cell sorting involved E. coli mutant libraries and sorting by fluorescence or cell size (Link et al. 2007). Several types of functional screens are tied closely with FACS, especially SIGEX, PIGEX, and METREX. The methods allow for the separation of individual positive cells, which can then be pooled and studied further through sequencing or functional analysis (Uchiyama et al. 2005; Uchiyama and Miyazaki 2010). Microorganisms with fluorescence in vivo, such as cyanobacteria, can be sorted with FACS, followed by library building or proteomic research (Mary et al. 2010). Cell sorting can be combined with metabolite screening to assess gene expression. For example, promoter-trap vectors have been applied to search for promoters responding to naringenin, a compound released by plant roots. FACS was used following additions of naringenin to find genes induced by the compound (Lee et al. 2011). As cell sorting technology improves, phenotypes other than fluorescence could be added to the molecular toolbox.
Understanding Antibiotics Discovery as a Model for Herbicide Isolation
The history of natural product discovery, especially antibiotics, provides useful insight into the search for biocontrol agents and bioherbicides. Many of the herbicidal compounds isolated from culturable bacteria are classified as antibiotics (Table 2); therefore it is important to understand the screening methods used for isolating novel antibiotics having the potential to suppress weeds. Following the discovery of penicillin in 1928 by Alexander Fleming and streptomycin in 1943 by Selman Waksman and Albert Schatz, antibiotics have continued to be isolated, purified, and identified. The “golden age” of antibiotic discovery in the 1940s to 1950s focused on isolating pure cultures, screening whole cells or supernatants against microorganisms, and observing cell viability over time. As this process yielded fewer results, researchers turned to chemical engineering and derivatization of known compounds. Of 109 new clinical drugs developed between 1981 and 2006, natural products or semisynthetic derivatives comprised 68% (Newman and Cragg 2007; Overbye and Barrett 2005). In the last decade, attempts were made in the pharmaceutical industry to use gene-centric, not activity-based, screens for antibiotic discovery (Hugenholtz and Tyson 2008). For more than 5 yr, both GSK and Pfizer were unable to develop a single clinical antibiotic using gene-centric approaches (Miller and Miller 2011). Most recently, industry has turned to whole-cell activity screens, verifying antibacterial activity before investing in further analyses (Miller et al. 2009).
The variety of microbially derived antibiotics discovered from culture-dependent methods suggests that metagenomic approaches can yield greater discoveries of herbicidal compounds. A DNA fragment inserted into the microbial host may contain a gene cluster coding for the production of an antibiotic. If the gene cluster is expressed in the microbial host, the herbicidal antibiotic could be isolated and purified from the metagenomic clone. The use of metagenomic functional screens to probe for antibiosis has yielded several new antibiotics (Table 3). Two studies in 2000 pioneered the use of metagenomics for antibiotic discovery (Brady and Clardy 2000; Wang et al. 2000). Wang and coworkers (2000) discovered terragine A-E using S. lividans as a host. Only 1,020 clones were screened by analyzing individual clone fermentation products with mass spectrometry, with a 1.8% positive frequency. The following study demonstrated high-throughput screening of approximately 700-fold more clones when soil-derived DNA was used to create a cosmid library in E. coli. When screened against Bacillus subtilis, antibacterial activity was observed in 65 of 700,000 (0.009%) clones as a result of secretion of N-acyl tyrosines (Brady and Clardy 2000). Other antibiotic discoveries using metagenomic approaches followed, including the discovery of violacein and deoxyviolacein (Brady et al. 2001), fatty acid enol esters (Brady et al. 2002; Brady and Clardy 2003), palmitoylputrescine (Brady and Clardy 2004), isocyanide derivatives of tryptophan (Brady and Clardy 2005a,b), and fasamycins A and B (Feng et al. 2012). Most of the metagenomic studies on antibiotics were derived from bulk soil samples. To date, soil collected near plant roots (rhizosphere soil) has not been examined for metagenomic-based natural products. Many root-associated microorganisms evolve specificity with their host plant (Haichar et al. 2008); thus rhizospheric soil could generate metagenomic libraries useful for isolation of plant host-specific herbicides.
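Because positive clones are rare, it can help to estimate screening effort before building a library. The following is a minimal sketch in Python, not a method taken from the cited studies: it computes a positive frequency from counts such as the 65 of 700,000 clones reported above, and then estimates how many clones would need to be screened for a given chance of at least one hit. The function names are hypothetical, and the estimate assumes that every clone is an independent trial with the same hit frequency, which real libraries may not satisfy.

```python
import math


def hit_frequency(positives: int, screened: int) -> float:
    """Fraction of screened clones that scored positive."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    return positives / screened


def clones_for_confidence(frequency: float, confidence: float = 0.95) -> int:
    """Clones to screen so that P(at least one positive) >= confidence.

    Assumption (not from the source): clones are independent trials with
    an identical hit frequency, so P(no hit in n clones) = (1 - f) ** n.
    """
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


if __name__ == "__main__":
    freq = hit_frequency(65, 700_000)  # counts reported for the cosmid library
    print(f"positive frequency: {freq:.4%}")
    print(f"clones for a 95% chance of one hit: {clones_for_confidence(freq)}")
```

An estimate of this kind is only as good as the assumed frequency, which will vary with the soil source, the host, and the screening assay.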
Table 3.
List of antibiotics isolated from microorganisms using metagenomic techniques. “ns” refers to “not specified”. Many herbicidal secondary metabolites isolated from microorganisms are classified as antibiotics (see Table 2).

Metagenomic approaches have also been useful in isolating antibiotic resistance genes from microorganisms (Donato et al. 2010; Schmieder and Edwards 2012). Bacteria evolve quickly to resist antibiotics produced by other populations of bacteria, but the mechanisms underlying the arms race are not always clear. Functional screens of metagenomic libraries are used to identify resistance genes. Only clones able to express unknown proteins and mechanisms to avoid the antibiotic present in a medium are able to grow. The individual clone is then analyzed in depth for the genes, proteins, and mechanisms responsible for resistance. Such discoveries can enhance the effectiveness of existing antibiotics (Monier et al. 2011; Schmieder and Edwards 2012). A similar screening technique could assist in isolating herbicide resistance genes in soil microorganisms when metagenomic clones are exposed to a medium containing high levels of the herbicide.
The metagenomics of natural product discovery is not restricted to antibiotics, but includes any active compound with commercial potential that is isolated from microorganisms. Aside from antimicrobials, many studies have focused on finding hydrolytic enzymes useful to industry such as cellulases, lipases, proteases, chitinases, and esterases (Li and Vederas 2009). The screening assay consists of growth on a medium containing the substrate of interest, where degradation causes a zone of clearing or a colorimetric reaction. The ease of the screening technique allows for high-throughput screens of > 100,000 clones. Some screens are designed to search for common gene motifs including those common to secondary metabolite production, such as polyketide synthases and nonribosomal peptide synthetases (Donadio et al. 2007). These enzymes catalyze chain elongation from simple building blocks to create many types of natural products based on a core chemical structure. Other screens involve scouting unique environmental niches for novel enzymes with extreme temperature or pH tolerance, e.g. Antarctic desert soil (Heath et al. 2009), glacial ice (Simon et al. 2009), buffalo rumen (Duan et al. 2009; Singh et al. 2012), and even earthworm castings (Beloqui et al. 2010). It is estimated that only 1 to 2% of small molecule natural products produced by microorganisms have been found (Watve et al. 2001; Baltz 2006).
Examples of Using Metagenomic Approaches for Pesticides Research
Metagenomic approaches have been used to find biosynthetic genes or compounds of interest for biocontrol and pesticides in agricultural systems (Table 4). Morgan and coworkers (2001) collected genetic material from three strains of Xenorhabdus nematophilus, a pathogen of cabbage white butterflies (Pieris brassicae), and isolated the biosynthetic genes and proteins associated with the insecticidal activity. More than 500 individual E. coli clones were screened for oral insecticide activity, yielding two positives, one with < 49% genetic homology to a known insecticide (Morgan et al. 2001). A fosmid library from the genetic material of Penicillium coprobium PF1169 was screened using colony PCR on small pools of 100 clones to find the biosynthetic genes of an insecticide, pyripyropene. With the knowledge of other pyripyropenes, the researchers were able to design primers to target clones and then introduce the positive clone's vector into a model fungus, Aspergillus oryzae, for mechanistic studies (Hu et al. 2011). A third study focused on a bacterial pathogen, Serratia entomophila strain Mor4.1, of several soil pests from the Phyllophaga and Anomala genera. DNA was isolated and used to generate a fosmid library, and individual clones were injected into the insect larvae as a screening technique. Proteins in the cell membrane were found to be toxic and could provide a starting structure for biocontrol design efforts (Rodríguez-Segura et al. 2012). Additional studies on microbial-derived pest management strategies are reviewed elsewhere (Montesinos and Bardaji 2008; Tikhonovich and Provorov 2011).
Table 4.
Metagenomic studies relevant to the agricultural sciences. The targeted genes included in the list were isolated from large-insert clone libraries. “ns” refers to “not specified”.

Metagenomic studies have been developed to examine plant pathogens with the goal of finding the genes involved with pathogenicity. The earliest studies in 2000 were on a virulence and xylanase deficient mutant of Xanthomonas oryzae pv. oryzae and wild-type Xanthomonas campestris pv. vesicatoria. While the study on X. oryzae pv. oryzae focused on virulence genes (Ray et al. 2000), the other study focused on finding the genes conveying hypersensitivity responses in the tomato (Lycopersicon esculentum L.) (Astua-Monge et al. 2000). Additional studies have found and used avirulence genes in pathogens such as Phytophthora infestans (Whisson et al. 2001), X. oryzae pv. oryzae (Ochiai et al. 2001), and Ustilago hordei (Linning et al. 2004). Other researchers have sought genes encoding secretion proteins important to toxin release (Bell et al. 2002; van Dijk et al. 2002). Plant pathogen metagenomic studies have focused on specific, targeted analyses with isolated cultures. A similar, targeted approach with noxious weed species could prove beneficial for herbicide discovery. More expansive studies on compounds facilitating plant–microbe interactions have yet to be conducted.
Research into the cause of phytopathogen-suppressive soil has recently moved toward metagenomic techniques (Hjort et al. 2010; van Elsas et al. 2008a,b). Phytopathogen-suppressive soil does not contain a single activity of interest; rather, it comprises a range of activities that suppress the growth and activity of plant pathogens, including xenobiotic degradation, antibiosis, or antibiotic resistance. A large project funded by the European Union titled METACONTROL was a collaborative effort of seven European labs and yielded promising results with optimization of techniques and screening assays (van Elsas et al. 2008a). In addition to one control soil, four different soils known to suppress pathogens were the source of metagenomic DNA for clone libraries. Clones were screened against known phytopathogens and observed for antibiosis activity, resulting in < 0.05% positive clones across all libraries. Additional screening was done using primers specific for polyketide synthase genes, and positive hits increased to 0.22% (Ginolhac et al. 2004, 2005; van Elsas et al. 2008b). A more recent study focused on the diversity of chitinase genes in disease suppressive soil (Hjort et al. 2010). Past successful research in the agricultural fields using metagenomic approaches provides a basis for similar scenarios with herbicide research, including targeted weed-suppression studies.
Developing Functional Screens for Weed Management
Countless screening assays specific to weed management can be developed using metagenomic clone libraries. Traditional assays for determining phytotoxic activity can be adopted for screens with vector–host systems. Additionally, functional screening methods for discovering herbicide resistance genes are analogous to techniques used successfully in the search for antibiotic resistance genes from soil. Although not discussed here, sequence-based screens are available when prior knowledge of herbicide biosynthetic or resistance genes is available, including conserved but unique genetic motifs. We aim to highlight the simplicity of functional screens that can be adopted by any weed scientist in search of novel herbicide compounds and herbicide resistance genes.
Isolating Novel Herbicides
Although the process of developing metagenomic clone libraries remains a time-intensive procedure, screening the libraries can be simpler and more approachable for weed scientists. In order to screen individual clones, the first step in any functional screen is to dilute and plate the clones onto the relevant selective medium (Figure 4). In our work, a metagenomic clone library was constructed from rhizosphere soils of common ragweed (Ambrosia artemisiifolia L.) using the CopyControl Fosmid kit. The kit uses the pCC1FOS vector (Epicentre, Madison, WI) containing chloramphenicol resistance. To ensure that the fosmid containing the environmental DNA is maintained by the host, growth and screening of any clone must be maintained on chloramphenicol. By serial dilution and plating, individual colonies, each containing a unique fragment of environmental DNA, can be selected at random for functional screens or chosen by small molecule phenotype characteristics (Brady 2007; Craig et al. 2010). New plating and separation techniques developed for this part of the assay should aim to optimize a primary screen for clones with secreted compounds or other traits of interest to the researcher. Separated clones are grown in large, liquid volumes and can be screened as diluted cell suspensions, supernatants, or crude extracts. In all situations, background effects of the microbial cell or metabolites must be monitored via controls.
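The serial dilution step can be planned with simple arithmetic. The sketch below, written in Python, estimates how many dilution steps are needed before a plated aliquot gives well-separated colonies. All defaults are illustrative assumptions rather than values from our protocol: 10-fold dilution steps, a 0.1 mL plated volume, and a target of 30 to 300 colonies per plate (a common counting convention). The function names are hypothetical.

```python
from typing import Optional


def expected_colonies(titer_cfu_per_ml: float, dilution_steps: int,
                      fold: float = 10.0, plated_ml: float = 0.1) -> float:
    """Expected colonies on one plate after a number of serial dilution steps."""
    return titer_cfu_per_ml * plated_ml / fold ** dilution_steps


def steps_for_countable_plate(titer_cfu_per_ml: float, fold: float = 10.0,
                              plated_ml: float = 0.1, low: float = 30.0,
                              high: float = 300.0,
                              max_steps: int = 12) -> Optional[int]:
    """Smallest number of dilution steps giving low..high colonies per plate.

    Returns None when no step within max_steps lands in the target window,
    which can happen if the fold factor is larger than high / low.
    """
    for steps in range(max_steps + 1):
        colonies = expected_colonies(titer_cfu_per_ml, steps, fold, plated_ml)
        if low <= colonies <= high:
            return steps
    return None


if __name__ == "__main__":
    # Hypothetical library titer of 1e8 colony-forming units per mL.
    steps = steps_for_countable_plate(1e8)
    print(f"dilution steps needed: {steps}")
    print(f"expected colonies per plate: {expected_colonies(1e8, steps):.0f}")
```

In practice the titer of a clone library is itself an estimate, so plating two or three adjacent dilutions is a reasonable hedge.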
Figure 4.
Schematic showing preparation of clone libraries and the screening assays for herbicide detection. Prior to functional screens, the metagenomic clone library is diluted and plated on the appropriate medium. Individual clones will yield single, separated colonies which can be selected based on small-molecule phenotypes or chosen at random. Clones are then grown in large liquid cultures for testing as cell suspensions, supernatants, or crude extracts. Screens for herbicidal activity have previously been established including: seedling growth (A), germination in a 96-well plate (B), Lemna minor (C), and leaf spot assays (D). L. minor (C) and leaf spot assay (D) figures are reprinted courtesy of Jiang et al. 2012 and Win et al. 2003, respectively.

Plant toxicity protocols have been standardized by both American and international organizations, and are easily adapted to functional screens. Established protocols include standardization of experimental conditions, recommendations for plant species to use, and many possible response measurements (Figure 4). Data are presented as a percent effect difference from a control plant, linear or nonlinear regression analyses, or tabulated seedling/root lengths and dry mass. Percent germination at endpoint sampling is another relevant measurement, in addition to the lowest concentration with herbicidal effect and the highest concentration with no herbicidal effect. From these observations, the concentration of an unknown compound able to decrease growth of a plant by 50%, the EC50, can be determined (ASTM 2009; ISO 2005a; 2012a,b; OECD 1984). Recent research commonly uses this information to test the effect of compounds on model plant species such as rice, lettuce, and cucumber in addition to plants of interest to the researcher (Hillis et al. 2011; Liu et al. 2009).
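As an illustration of how an EC50 can be read from dose-response data, the following Python sketch interpolates between the two tested concentrations that bracket a 50% response, working on a log10 concentration scale. This is a deliberately simple stand-in, not the regression procedure specified in the cited standards, which generally fit a full dose-response model. The function name and the example data are hypothetical.

```python
import math
from typing import Optional, Sequence


def ec50_log_interpolation(concentrations: Sequence[float],
                           growth_pct_of_control: Sequence[float]) -> Optional[float]:
    """Estimate EC50 by linear interpolation on log10(concentration).

    growth_pct_of_control holds growth as a percentage of the untreated
    control (100 = no effect). Returns None if 50% is never bracketed.
    """
    if len(concentrations) != len(growth_pct_of_control):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive for a log scale")
    pairs = sorted(zip(concentrations, growth_pct_of_control))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo >= 50.0 >= r_hi and r_lo != r_hi:
            # Fraction of the way from c_lo to c_hi at which growth hits 50%.
            frac = (r_lo - 50.0) / (r_lo - r_hi)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ec50
    return None


if __name__ == "__main__":
    # Hypothetical seedling growth data (concentration units are arbitrary).
    doses = [1.0, 10.0, 100.0, 1000.0]
    growth = [95.0, 80.0, 20.0, 5.0]
    print(f"estimated EC50: {ec50_log_interpolation(doses, growth):.1f}")
```

Interpolation is sensitive to the spacing of the tested concentrations, so a fitted regression model is preferable once a candidate clone has been confirmed.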
Greenhouse studies are impractical as an initial screening assay; the frequency of candidate clones has been < 1% in other screens for natural products. To increase the likelihood of finding herbicidal compounds, a high-throughput screen will provide the most success. Germination or early seedling assays can be adapted to a 96-well plate, which allows for rapid dilution and application of clone cells, supernatants, or extracts in a replicable manner. Another method makes use of duckweed (Lemna minor L. or Lemna gibba L.) (Figure 4). Duckweed is a small, aquatic plant with small fronds (leaf-like structures) that can be grown in a 96-well format. Frond number, size, and color are assayed using a dissecting microscope or digital image analyzer (EPA 2012; ISO 2005b). Another bioassay adapted to the 96-well format is based on a cell suspension of algae. A variety of algal species have been used including Scenedesmus subspicatus, Pseudokirchneriella subcapitata, Chlamydomonas reinhardtii, and Chlorella pyrenoidosa. As higher amounts of a phytotoxic compound are added to an algal culture, more of the cells die, resulting in lower dry biomass (ISO 2012c; OECD 2011). Using algal bioassays, traditional toxicity measurements such as EC50 and regression equations have been calculated for a variety of known herbicides (Ma et al. 2002). Algal and duckweed bioassays each show unique sensitivities and have been commonly used for analyzing toxicity in water-based systems (Fairchild et al. 1997; Mohammad et al. 2005).
Identifying Herbicide Resistance Genes
The use of GM crops has become a prevalent feature in the U.S. agricultural landscape. Herbicide resistance genes isolated from soil microorganisms provide a valuable source for engineering new GM crops resistant to herbicides (Lucas 2011). Similar to the search for antibiotic resistance genes and mechanisms, E. coli clones can be grown on a medium containing the phytotoxic or herbicidal compound. Colony growth on the medium could indicate the presence of a gene or gene cluster that either confers resistance to the herbicide or enables the host to metabolize the herbicidal compound. Another screen can be developed using herbicidal compounds modified with colorimetric additions to indicate any excessive changes in the host's metabolism. Previous research has discovered glyphosate resistance and glyphosate-degrading abilities using microorganisms (Jin et al. 2007; Staub et al. 2012; Sun et al. 2005).
If the environmental gene is from a eukaryotic organism, the gene is unlikely to be expressed in E. coli. Therefore, choosing a vector that allows transfer between host organisms, or even screening the library in a eukaryotic host, e.g. Saccharomyces cerevisiae, will increase the chance of expression. Hosts that are filamentous fungi, in the genera Aspergillus or Trichoderma, are also available. Once herbicide resistance genes are found, transfer of the genes into Agrobacterium tumefaciens would allow for introduction into model plants and extended physiology studies on the mechanisms involved with resistance. Prior knowledge of the plant physiological responses to a given herbicide could assist with functional screening designs for discovering mechanisms of resistance. Each of these possible assays has limitations or biases, and the researcher should aim to minimize these drawbacks to increase the chances of success.
Further Improvements in Metagenomics Research
Metagenomics research continues to expand in applicability across many areas of study as vector–host expression systems, sequencing technology, and bioinformatics continue to improve. The use of metagenomics in natural product discovery has tremendous potential, but isolation of the compounds is often limited by expression in hosts (Ekkers et al. 2012). New vector–host systems provide opportunities to expand the spectrum of expression, as genetic material may show host specificity. Previous research has confirmed that genes of interest commonly expressed in one host remain undetected when tested in other hosts (Craig et al. 2009, 2010; Wang et al. 2006). Overcoming such limitations involves adapting the insert DNA prior to ligation and transformation or modifying the host's ability to accept and express foreign DNA. One study described a method to select environmental DNA based on the GC content and matched the DNA to a vector–host system best suited for the specific GC percentage (Holben 2011). Details of the DNA insert are unknown with most functional screening methods, which creates challenges for optimizing expression of genes of interest. In some screens, vectors have been designed for straightforward shuttling between a broad range of hosts, such as the Gateway cloning system (Katzen 2007). Although vector transfer between hosts has become simpler, screening for functional activity in multiple hosts requires massive screening experiments (Craig et al. 2010). Technology that increases the speed of screening will permit more researchers to attempt functional screens using multiple hosts.
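Where sequence data for an insert are available, the idea of matching DNA to a host by GC percentage can be sketched computationally. The Python example below is a loose illustration only and does not reproduce the cited method. It computes the GC fraction of a sequence and looks it up in a table of host ranges. The host labels and cutoffs in the table are hypothetical placeholders, and the function names are likewise invented for this example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical host categories and GC-fraction ranges, for illustration only.
HYPOTHETICAL_HOST_RANGES: Dict[str, Tuple[float, float]] = {
    "low-GC host": (0.00, 0.45),
    "mid-GC host": (0.45, 0.60),
    "high-GC host": (0.60, 1.00),
}


def gc_content(sequence: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


def suggest_host(sequence: str,
                 ranges: Dict[str, Tuple[float, float]] = HYPOTHETICAL_HOST_RANGES
                 ) -> Optional[str]:
    """Return the first host whose GC range contains the insert's GC fraction."""
    gc = gc_content(sequence)
    for host, (low, high) in ranges.items():
        if low <= gc <= high:
            return host
    return None


if __name__ == "__main__":
    insert = "GGCCGGCCAT"  # toy sequence, not real environmental DNA
    print(f"GC fraction: {gc_content(insert):.2f}")
    print(f"suggested host category: {suggest_host(insert)}")
```

GC content is only one of several factors affecting heterologous expression, so a lookup of this kind would at most narrow the choice of hosts to test.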
Previous estimates have suggested that finding novel, rare antibiotics in actinomycetes requires screening > 10⁷ clones (Baltz 2006). Advances in robotics, microfluidics, and screen designs will increase the rate at which clones are analyzed and potentially the discovery of novel products. Many functional screens are easily adapted to 96- or 384-well plate formats, which increases the speed of screening without the need for robotics. For example, clones expressing cellulases are found by dilution of the clone library and plating onto hundreds of petri dishes containing opaque medium with cellulose. Positive clones are indicated by a zone of clear medium around a colony, and they are individually selected and analyzed by the researcher. When the screening technique was developed using a soluble, colorimetric analog of cellulose, robotics were able to process greater quantities of samples for cellulase production (Mewis et al. 2011).
Complete environmental metagenomes will become easier to obtain as sequencing technology advances, yet assembly and analysis of sequence data sets are currently limited. Single-molecule–based sequencing is in development and could increase the flow of data (Wang et al. 2009). However, high error rates create challenges for widespread adoption of single-molecule sequencing technology in de novo sequencing applications (Metzker 2010). Current bottlenecks in metagenomic sequence analyses include the development of appropriate software programs, data storage capacities (computing server limitations), and lack of adequate reference organisms (sequences with no known homolog or close taxonomic relatives) (Scholz et al. 2012). Sequence analysis software for metagenomic datasets can be improved by focusing on extreme environments or ecosystems composed of low microbial diversity. Characterization of less complex metagenomic datasets using acid mine drainage biofilms provides promising opportunities for improving functional and taxonomic analyses (Denef et al. 2010). Improved screening techniques involving high-throughput sequencing of gene targets open the potential for prescreening using bioinformatics, followed by functional assays. For example, the sequencing data could provide researchers with an initial screen indicating that a particular environment is enriched in genomes containing polyketide synthases and nonribosomal peptide synthetases (indicative of natural products).
Greater potential for discovery of novel compounds lies in the combination of functional metagenomics with various ‘omic’ techniques to characterize a microbial community. For example, research utilizing a diverse suite of techniques has revealed novel metabolic networks in deep sea sediment aggregates comprised of unculturable microorganisms, known as anaerobic methane oxidizing archaea (ANME). Metagenomic analysis of aggregates known to contain ANME was used to design fluorescent oligonucleotide probes for identifying the spatial locations of the unculturable organisms. The same aggregates were treated with the stable isotope 15N and visualized with nanoscale secondary ion mass spectrometry (Nano-SIMS) to integrate data on fluorescence microscopy with elemental stable isotopes and sequence-based taxonomy (Pernthaler et al. 2008). Further studies with similar environmental samples have combined metagenomics with metaproteomics (Stokke et al. 2012) or mRNA expression analysis (Meyerdierks et al. 2010). Larger environmental niches, such as the open ocean (Shi et al. 2011) or human gastrointestinal tract (Serino et al. 2012) have also been analyzed with transcriptomics and genomics to reveal novel microbial community traits.
Integrating both cultivation-dependent and cultivation-independent molecular techniques can yield innovative strategies to isolate microbially derived compounds for weed control. With increasing concern over the risks of synthetic herbicides to humans and the environment, along with the development of herbicide resistance, novel control strategies for weed management are required and warrant further research. The metagenomic techniques described in this review have yielded new antibiotics, pesticides, and enzymes. Many of the antibiotics derived from Streptomyces spp. have been shown to exhibit weed suppressive abilities. Adoption of metagenomic techniques can be targeted for isolating weed suppressive compounds specific to highly noxious and herbicide-resistant weeds. Extending metagenomic approaches to integrated pest management (IPM) could enhance the portfolio of weed control strategies across a dynamic range of ecosystems and land uses.
Acknowledgments
We thank Jo Handelsman for providing initial comments on the manuscript. We thank an Associate Editor and two anonymous reviewers for providing suggestions that improved the revision of the manuscript. We apologize to the researchers whose papers were not cited in this review because of page limitations. The IPM metagenomic project is supported by USDA Hatch 145403.