Reconstructing the avian tree of life has become one of the major goals in ornithology. The use of genomic tools seemed a promising approach to reach this goal, but, instead, phylogenetic analyses of large numbers of genes uncovered high levels of incongruence between the resulting gene trees. This incongruence can be caused by several biological processes, such as recombination, hybridization, and rapid speciation (which can lead to incomplete lineage sorting). These processes directly or indirectly amount to deviations from tree-like patterns, thereby thwarting the use of phylogenetic trees. Phylogenetic networks provide an ideal tool to deal with these difficulties. We illustrate the usefulness of phylogenetic networks to capture the complexity and subtleties of diversification processes by discussing several recent genomic analyses of birds in general and the well-known radiation of Darwin's finches. With the increasing amount of genomic data in avian phylogenetic studies, capturing the evolutionary history of a set of taxa in a phylogenetic tree will become increasingly difficult. Moreover, given the widespread occurrence of hybridization and the numerous adaptive radiations in birds, phylogenetic networks provide a powerful tool to display and analyse the evolutionary history of many bird groups. The genomic era might thus result in a paradigm shift in avian phylogenetics from trees to bushes.
The most iconic drawing in evolutionary biology was scribbled by Charles Darwin in a notebook around July 1837. The drawing depicts a crude evolutionary tree with the words “I think” above it. In The Origin of Species, he further developed this idea, which was already circulating in scientific circles in pre-Darwinian times (Archibald 2009), into the metaphor of the tree of life (Darwin 1859):
The affinities of all the beings of the same class have sometimes been represented by a great tree. […] As buds give rise by growth to fresh buds, and these if vigorous, branch out and overtop on all sides many a feebler branch, so by generation I believe it has been with the great Tree of Life, which fills with its dead and broken branches the crust of the earth, and covers the surface with its ever branching and beautiful ramifications.
Reconstructing the tree of life has become one of the major goals in evolutionary biology, but is the tree of life still viable in a phylogenetic context with high levels of interspecific gene exchange? Should we abandon the tree of life metaphor and turn to a network approach?
Until the 1970s, evolutionary trees were largely based on the analysis of morphological characters. The use of molecular data in phylogenetics led to a revolution. The most influential methods were protein electrophoresis in the late 1960s and 1970s, restriction fragment length polymorphism (RFLP) analyses in the 1970s and 1980s, and PCR-mediated DNA sequencing in the 1990s (Avise 2004, Kraus and Wink 2015). At first, a few genes became reference markers. For instance, the gene that encodes the small subunit ribosomal RNA (SSU rRNA) was extensively used for phylogenetic analyses of microorganisms and led to the discovery of a third domain of life, the Archaea (Woese and Fox 1977). But as more genes were sequenced and analysed, it became clear that different genes often result in discordant gene trees (Pamilo and Nei 1988, Maddison 1997).
The advent of multilocus data showed that the occurrence of phylogenetic incongruence (i.e. analyses of different genes resulting in discordant gene trees) is a common and widespread phenomenon (Rokas et al. 2003). Such incongruence can be caused by analytical shortcomings (Rokas et al. 2003, Davalos et al. 2012) or can be the result of biological processes, such as horizontal gene transfer, hybridization, incomplete lineage sorting, and gene duplication (Pamilo and Nei 1988, Maddison 1997, Degnan and Rosenberg 2009). Several methods have been developed to estimate a species tree from a collection of discordant gene trees (Delsuc et al. 2005, Degnan and Rosenberg 2009, Liu et al. 2015). The construction of a species tree from several discordant gene trees is based on the assumption that the underlying evolutionary process is tree-like. But such phylogenetic trees are less suited to depict reticulate events, such as recombination, horizontal gene transfer, and hybridization. In addition, some evolutionary mechanisms, such as incomplete lineage sorting, gene duplication, and gene loss, result in incompatibilities that cannot be easily represented by a species tree. Phylogenetic networks provide an ideal tool to deal with these difficulties.
A phylogenetic network is defined as “any network in which taxa are represented by nodes and their evolutionary relationships are represented by edges” (Huson and Bryant 2006). Phylogenetic networks can be used in 2 main ways: either to represent incompatibilities within and between datasets (implicit or abstract networks), or to represent the occurrence of reticulate events in the evolutionary history of a group of taxa (explicit networks). These networks are also called split networks and reticulate networks, respectively (Huson et al. 2010).
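The notion of incompatibility that split networks display can be illustrated with a minimal sketch. It assumes the standard definition of a split as a bipartition of the taxa, and the standard criterion that two splits fit on the same tree only if at least one of the four intersections of their sides is empty. The taxon labels and the function names (`compatible`, `is_tree_like`) are hypothetical and chosen for illustration only; they are not taken from any of the cited software.

```python
from itertools import combinations


def compatible(side_a, side_b, taxa):
    """Return True if two splits of `taxa` can be displayed on one tree.

    Each split is given by one of its sides; the other side is the complement.
    The splits are compatible if at least one of the four pairwise
    intersections of their sides is empty.
    """
    a1 = frozenset(side_a)
    a2 = taxa - a1
    b1 = frozenset(side_b)
    b2 = taxa - b1
    return any(not (x & y) for x in (a1, a2) for y in (b1, b2))


def is_tree_like(splits, taxa):
    """Return True if every pair of splits is compatible."""
    taxa = frozenset(taxa)
    return all(compatible(s, t, taxa) for s, t in combinations(splits, 2))


# Hypothetical taxa and splits, for illustration only.
taxa = {"A", "B", "C", "D", "E"}
concordant = [{"A", "B"}, {"A", "B", "C"}]   # AB|CDE and ABC|DE
conflicting = [{"A", "B"}, {"B", "C"}]       # AB|CDE and BC|ADE

print(is_tree_like(concordant, taxa))
print(is_tree_like(conflicting, taxa))
```

A set of pairwise compatible splits can be drawn as a single tree, whereas conflicting splits, such as those arising from discordant gene trees, require the parallel edges of a split network.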
The tree of life houses several events of reticulate evolution. For example, the eukaryotic cell is probably the outcome of endosymbiosis between distantly related prokaryotes, also leading to the conversion of free-living bacteria into cell organelles, such as mitochondria and chloroplasts (Margulis 1993, Gupta and Golding 1996). Furthermore, in the prokaryotic realm, horizontal gene transfer (i.e. the transfer of genetic material between distantly related lineages) is a common phenomenon (Gogarten et al. 2002, Gogarten and Townsend 2005, Andam and Gogarten 2011a, 2011b). Similarly, in eukaryotes, interspecific gene transfer by means of introgressive hybridization has been documented in numerous taxa (Anderson 1949, Dowling and Secor 1997, Mallet 2005). Moreover, several plant (Rieseberg 1997, Hegarty and Hiscock 2005) and animal (Mallet 2007, Mavarez and Linares 2008) taxa are probably of hybrid origin. For example, the Italian Sparrow (Passer italiae) is probably a hybrid species between House Sparrow (P. domesticus) and Spanish Sparrow (P. hispaniolensis; Elgvin et al. 2011, Hermansen et al. 2011). These examples indicate that many complex and successful lifeforms, such as the eukaryotic cell, could not be possible without reticulate evolution.
Apart from reticulate events, incompatibilities between gene trees can also be caused by other processes, such as incomplete lineage sorting. Several studies using multilocus data reported high levels of incomplete lineage sorting, hampering the estimation of species trees (e.g., Pollard et al. 2006, Willis et al. 2007, Kutschera et al. 2014, Barker et al. 2015). So, along with reticulation, incomplete lineage sorting results in a deviation from a tree-like depiction of evolutionary histories in a species tree. The tree of life might thus be better represented as the “net of life” (Doolittle 1999, Martin 1999, Kunin et al. 2005). Furthermore, in combination with the analysis of retrotransposons, phylogenetic networks can be used to quantify the degree of incomplete lineage sorting and to estimate the duration of the speciation process (Hallstrom and Janke 2010, Suh et al. 2015).
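Why incomplete lineage sorting hampers species tree estimation can be made concrete with a small sketch. It relies on the classic three-taxon coalescent result, which is standard theory rather than an analysis from this paper: for a species tree with an internal branch of length t (in coalescent units, i.e. generations divided by twice the effective population size for diploids), a gene tree is discordant with the species tree with probability (2/3)e^(-t). The function name `discordance_probability` and the branch lengths below are hypothetical.

```python
import math


def discordance_probability(t):
    """Probability that a three-taxon gene tree conflicts with the species tree.

    `t` is the internal branch length in coalescent units. The formula assumes
    a neutral coalescent with no gene flow after divergence.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return (2.0 / 3.0) * math.exp(-t)


# Hypothetical branch lengths, from a rapid radiation to a long internal branch.
for t in (0.1, 1.0, 5.0):
    print(f"t = {t}: P(discordant gene tree) = {discordance_probability(t):.3f}")
```

As t approaches zero, as expected during rapid radiations, the probability approaches 2/3 and the three possible gene tree topologies become nearly equally frequent.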
From an analytical point of view, phylogenetic networks may also be an improvement on classical phylogenetic tree analyses. With the rapid growth of genomic data, sampling error (i.e. random error resulting from small sample sizes or short sequence reads) is becoming less of an issue, whereas systematic error (i.e. wrong assumptions in the underlying model of sequence evolution, leading to artefacts and biases in phylogenetic inference) is becoming increasingly important (Felsenstein 2004, Delsuc et al. 2005). Unlike sampling error, systematic error cannot be avoided by increasing sequence length. Phylogenetic tree-building methods attempt to fit a tree to the data, even if a significant gap exists between the resulting tree and the data, possibly leading to phylogenetic artefacts (Steel 2005).
Model-based split networks are able to deal with systematic error by adding extra parameters to the evolutionary model (Huson et al. 2010). Phylogenetic inference comprises 2 kinds of parameters: those describing the evolutionary model (e.g., substitution rates) and those describing the topology (e.g., branch lengths). Evolutionary models based on split networks contain extra topology-related parameters (allowing reticulation) that may lead to a better fit to the data. Several studies have shown that split networks fit the data better than phylogenetic trees and that network analyses can uncover phylogenetic signals missed by tree-based methods (Esser et al. 2004, Kolaczkowski and Thornton 2004).
Despite the usefulness of phylogenetic networks to depict reticulate events, to quantify incomplete lineage sorting, and to deal with systematic error during phylogenetic inference, one main issue currently remains: there is as yet no standard way to interpret a phylogenetic network. What caused reticulations in a particular phylogenetic network? Hybridization? Incomplete lineage sorting? Analytical issues? Disentangling these processes and quantifying the relative contribution of each is challenging and requires the development of new tools and algorithms (Huson et al. 2010). This situation is similar to the mismatch between the rapid progress of next generation sequencing techniques and the relatively slow development of software to analyse the increasing amount of genomic data. The algorithms for estimating phylogenetic networks have not yet reached the complexity of phylogenetic tree methods, but this field of research is growing rapidly (e.g., Cardona et al. 2015, Huber et al. 2016, Solis-Lemus and Ane 2016).
But what about birds? Are birds also entangled in this net of life? Based on the recent surge of avian genomic data (Joseph and Buchanan 2015, Kraus and Wink 2015), we argue that modern avian phylogenetics warrants a phylogenetic network approach to complement the classical (and still useful) concept of the phylogenetic tree. We illustrate this with 2 examples: the contrasting results from 2 recent phylogenomic studies (Jarvis et al. 2014, Prum et al. 2015), and the outcome of a genomic perspective on the radiation of Darwin's finches (Lamichhaney et al. 2015).
Joseph and Buchanan (2015) called the simultaneous publication of several papers (27 in 8 journals) based on a genomic dataset of 48 bird species “a quantum leap in avian biology.” One of these papers (Jarvis et al. 2014) presented a new and updated avian tree of life. A couple of months later, however, another avian tree (Prum et al. 2015) was published, with some contrasting results. For example, Jarvis et al. (2014) reported a well-supported clade consisting of the Hoatzin (Opisthocomus) as the sister group of plovers (Charadrius) and cranes (Grus), whereas Prum et al. (2015) identified the Hoatzin as a sister group of the core landbirds.
The contrasting results can be caused by analytical shortcomings (e.g., long branch attraction) or can be the result of biological processes, such as hybridization, incomplete lineage sorting, and gene duplication. The rapid diversification of modern birds after the mass extinction event ∼66 million years ago (i.e. the K-Pg boundary) could lead to short internal branches and high levels of incomplete lineage sorting (Degnan and Rosenberg 2009, Rosenberg 2013). Suh et al. (2015) quantified the amount of incomplete lineage sorting along the Neoaves phylogeny using presence/absence data for 2,118 retrotransposons. They uncovered discordant phylogenetic signals near the initial K-Pg radiation and at the base of 2 other radiations that gave rise to the core landbirds and the core waterbirds. They conclude that “as a consequence, their complex demographic history is more accurately represented as local networks within a species tree” (Figure 1). The complexity of this radiation was already apparent in a previous, although limited, analysis of retrotransposons (Hernandez-Lopez et al. 2013).
This example shows the possible effect of incomplete lineage sorting during the diversification of modern birds, but we cannot rule out the possibility of introgressive hybridization, which can result in similar patterns (Maddison 1997). More detailed analyses are necessary to disentangle the relative contributions of analytical issues, incomplete lineage sorting, and hybridization during this rapid radiation. Such analyses have been carried out for the more recent adaptive radiation of Darwin's finches on the Galapagos Islands (Almen et al. 2016). Using whole-genome resequencing data of 120 individuals, Lamichhaney et al. (2015) found evidence for extensive interspecific gene flow throughout the radiation. They constructed a phylogenetic network from autosomal genomic sequences to display the conflicting signals at the internal branches caused by incomplete lineage sorting and hybridization (Figure 2).
Other studies have also described complex evolutionary histories with high levels of gene flow and incomplete lineage sorting for several groups of closely related bird species (Carling et al. 2010, Hung et al. 2012, Lavretsky et al. 2014). The evolutionary histories of these bird groups have all been forced into a phylogenetic tree, whereas a phylogenetic network may have been a better option to capture the complexity and subtleties of the diversification processes. Traditionally, speciation has been viewed as the splitting of an ancestral population into 2 reproductively isolated species, a process that can easily be depicted as a bifurcation (Dobzhansky 1937, Mayr 1942). Recent genomic studies have shown, however, that speciation is a dynamic and complex process in which the incipient species often continue to exchange genes before they reach complete reproductive isolation (Nosil 2008, Mallet et al. 2016, Pinto et al. 2016). With the increasing amount of genomic data in avian phylogenetic studies, capturing the evolutionary history of a set of taxa in a phylogenetic tree will become increasingly difficult. Given the widespread occurrence of hybridization (Ottenburghs et al. 2015) and the numerous adaptive radiations (Jetz et al. 2012) in birds, phylogenetic networks will provide a powerful tool to display and analyse the evolutionary history of many bird groups. The genomic era might thus result in a paradigm shift in avian phylogenetics from trees to bushes.
We are grateful for the helpful suggestions of 2 anonymous reviewers.
Funding statement: This research was funded by Stichting de Eik.
Author contributions: J.O. conceived the idea and wrote the paper. P.v.H., S.v.W., R.C.Y. and H.H.T.P. commented on earlier versions of the manuscript.