Birds have become increasingly prominent in studies focusing on natural populations and their coevolved pathogens or examining populations under environmental stress from novel and emerging infectious diseases. For either type of study, new DNA-based diagnostic tests, using the polymerase chain reaction (PCR), present challenges in detecting the DNA of pathogens, which exist in low copy number compared with DNA of the host. One example comes from studies of avian malaria: conflicting claims are made by different laboratories about the accuracy of tests using various sets of primers and reagents, especially in relation to blood smears and immunological methods. There is little standardization of protocol or performance among laboratories conducting tests, in contrast to studies of human malaria. This review compares the problems of detecting avian malaria with those of detecting human malaria, and shows that the buffer used to store blood samples following collection is associated with the accuracy of the test. Lower accuracy is associated with use of a lysis buffer, which apparently degrades the DNA in the blood sample and contributes to inhibition of PCR reactions. DNA extraction and purification techniques, and optimization of the PCR reaction, do not appear to be alternative explanations for the effect of storage buffer. Nevertheless, the purest DNA in standard concentrations for PCR is required so that different primers, DNA polymerases, and diagnostic tests can be objectively compared.
Emerging infectious diseases will increase in importance in avian biology as pathogens and vectors expand their ranges and contact naive hosts in association with changes in habitat and climate (Daszak et al. 2000, Dobson and Foufopoulos 2001, Harvell et al. 2002). When bird species and communities become novel hosts, coevolutionary processes may lead to changes in virulence of the pathogen (Ewald 1994, Combes 2001, Frank 2002) and increasing tolerance and resistance in the host (Shehata et al. 2001, Woodworth et al. 2005). This arms race should provide molecular geneticists with opportunities to apply sensitive new assays with relevance to all areas of organismal evolution. Parasites and pathogens are important selective agents that shape the fitness of individuals in the context of both life history and mating strategies (Clayton and Moore 1997, Hamilton 1980, 1982, Hamilton and Zuk 1982, Loye and Zuk 1991, Møller 1997, Valkiūnas 2005).
Pathogens and parasites also have important implications for the viability of populations (Cooper 1989, McCallum and Dobson 1995, Newton 1998). Time is required for rare beneficial mutations that can confer tolerance or resistance to a disease to appear in new avian hosts. Insufficient time may result in population declines and extinctions. Hawaiian birds are a prime example of an isolated island fauna adversely affected by introduced malaria (Plasmodium relictum) and pox virus (Poxvirus avium) transmitted by an introduced mosquito vector (Culex quinquefasciatus; van Riper et al. 1986, 2002, Freed 1999, Freed et al. 2005). Currently, continental birds (and humans) are being exposed to West Nile virus (Peterson et al. 2004) and avian influenza virus H5N1, pathogens that have crossed the avian-human barrier with anthropogenic assistance. Mycoplasmal conjunctivitis is an avian disease that has acquired new species of hosts naturally (Farmer et al. 2002). In all cases, the polymerase chain reaction (PCR) has been used in genotyping both host and pathogen.
This review will address specifically the technical problems associated with detecting the presence of malarial pathogens in the blood of birds. It will emphasize the critical stages affecting test accuracy, beginning with transfer of a blood sample to a buffer storage solution after collection through to the extraction of DNA in the laboratory. However, the issues discussed will include general considerations of sample quality for any molecular analysis involving PCR (Erlich 1989, Arnheim et al. 1990, Saiki 1990, Wilson 1997), and pertain to any use of PCR for diagnosing infection status of birds for any pathogen (Persing et al. 2003). This review is intended to advise avian geneticists, and those interested in avian genetics, that even in the age of PCR, the quality of the starting material determines the results and limits conclusions that could or should be drawn about the accuracy of any disease diagnostic.
Issues Associated with Accuracy
Accurate detection of malaria in blood samples from host species is important in many ways. Estimates of prevalence in natural populations are parameters in mathematical models of infectious disease (Bailey 1975) and depend on reliable determination of the infection status of each individual sampled. Documenting the genetic diversity of parasites within and among host populations (Ricklefs and Fallon 2002, Fallon et al. 2003, Bensch et al. 2004) and the host specificity of parasite lineages (Bensch et al. 2000) requires the same reliable determination. Missed detections reduce the scientific value of any study.
PCR, with appropriately designed primers, has the potential to estimate prevalence of parasites from host tissue samples and identify coevolutionary trends. It can detect the presence, dead or alive, of any stage of Plasmodium in a blood sample, except in infections with the lowest parasitemia. Its inherent shortcomings are: 1) extreme sensitivity in generating false positives if standard precautions are not in place, 2) false negatives arising from early stage infections before parasites are circulating in the peripheral blood, and 3) inability to estimate intensity and stage of infection. Knowledge of stage requires microscopic inspection of blood smears, from which the cell types of the parasite can be identified within host cells and surrounding plasma (Valkiūnas 2005). Intensity can be quantified as number of parasites per standard number of red blood cells. There are many issues associated with the accuracy of PCR, as with any other diagnostic. A test is considered accurate if it results in the correct identification of both true positive individuals and true negative individuals, as a proportion of all test results. The sensitivity of a test is defined as the proportion of positive cases correctly determined out of the total positive and false negative cases determined by some other diagnostic standard.
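The definitions of accuracy and sensitivity given above reduce to simple ratios of counts. A minimal sketch in Python, assuming a hypothetical tally of PCR results scored against a reference diagnostic such as microscopy (the class name, function names, and counts are illustrative and are not data from any study cited here):

```python
from dataclasses import dataclass


@dataclass
class TestCounts:
    """Hypothetical tally of PCR calls against a reference diagnostic."""
    true_pos: int   # PCR positive, reference positive
    false_pos: int  # PCR positive, reference negative
    true_neg: int   # PCR negative, reference negative
    false_neg: int  # PCR negative, reference positive


def accuracy(c: TestCounts) -> float:
    """Correct calls (true positives + true negatives) over all test results."""
    total = c.true_pos + c.false_pos + c.true_neg + c.false_neg
    return (c.true_pos + c.true_neg) / total


def sensitivity(c: TestCounts) -> float:
    """True positives over all reference positives (true pos + false neg)."""
    return c.true_pos / (c.true_pos + c.false_neg)


# Illustrative numbers only: 50 smear-positive birds, 5 of them missed by PCR.
example = TestCounts(true_pos=45, false_pos=0, true_neg=50, false_neg=5)
example_accuracy = accuracy(example)
example_sensitivity = sensitivity(example)
```

In this made-up case the five missed infections lower sensitivity more than they lower accuracy, because accuracy is also credited with the 50 correctly identified negatives.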
The problems, false positives and false negatives, that can occur in malaria diagnostics fall into four categories (Table 1). We draw a distinction between genuine problems and apparent problems. Genuine problems are clearly due to inadequate primer design, poor laboratory procedure, variation in enzymatic activity among batches of DNA polymerases from the same manufacturer, and degraded samples. These problems will vary among laboratories according to experience, luck, and skill. Apparent problems are caused by the limits of technology and expertise in microscopy. These problems are shared by all laboratories.
Genuine false negatives can result from poor-quality template to which the primers cannot anneal (Freed and Cann 2003), primers that fail to recognize variation in the parasite template that influences annealing, contaminants that inhibit PCR, and suboptimal reaction conditions (Erlich 1989, Freed and Cann 2003). Blood contains PCR inhibitors such as heme and cytochromes (Higuchi 1989). Heparin, an anticoagulant used to coat the walls of the microhematocrit tubes frequently used to collect samples, is also a PCR inhibitor (Holodniy et al. 1991). If DNA extraction does not isolate the DNA from these compounds, the inhibitors will generate false negatives. Different investigators have reported that bacterial or soil contamination of field samples, different batches of reaction tubes from different suppliers, and even glove powder can efficiently inhibit PCR (Wilson 1997).
Genuine false positives can result from contamination in the thermal cycler, test reagents, or pipetmen, wherein products from one amplification, involving a positive sample, are carried over into the next cycle (Reynolds et al. 1991, Freed and Cann 2003). Damaged DNA (from an infected individual) can promote jumping artifacts, amplified as a result of recombination between two illegitimately paired strands with primer annealing sites (Paabo et al. 1989, Paabo 1990, Ruano et al. 1991), and these also can be carried over as a contaminant. We consider these genuine false positives because they are correctable in the short term by improved primer design and by field and laboratory techniques that minimize contamination and optimize reaction conditions. Good laboratory techniques include optimum extraction and purification protocols that yield pure, high-molecular-weight target DNA, followed by use of UV light, clean laminar flow hoods, dUTP, uracil-N-glycosylase, or bleach in setting up amplification reactions. Reaction conditions are optimized by the use of aerosol-resistant tips, hot-start conditions, and gradient thermocyclers to investigate assay performance over a range of temperatures, Mg2+ ion concentrations, and cycle times. An excellent example of a genuine false positive in avian PCR can be found in Jarvi et al. (2002) and was reviewed by Freed and Cann (2003). The assertion made by Atkinson et al. (2005) that PCR can generate false positives based on primers used refers to Jarvi et al. (2002), which is based on incorrect use rather than an inherent property of the primers involved.
Correcting apparent false negatives and false positives requires methodological breakthroughs or greater training in identifying the blood parasites present in the population. For example, improvements in enzymatic properties of Taq (Thermus aquaticus) DNA polymerase that influence template binding, fidelity of replication, stability, and movement along the template, or in features of thermocyclers that improve cycling performance and uniformity across the heating block, may facilitate detection of malaria in samples with extremely low parasite numbers. Internal positive and negative reaction controls should be at the limit of detection of the assay. A negative control that demonstrates successful amplification of a higher copy number target from a sample that yielded a negative test for the parasite may help identify the presence of inhibitors. A positive control that demonstrates successful amplification of a diluted sample that yielded a positive test for the parasite may help with troubleshooting and optimizing the reaction for low parasite numbers. Apparent false positives may be based on a second species of blood parasite that is detected by PCR, but is missed by microscopy. More detailed study of blood parasites from hosts in a given geographic region can identify and confirm a second blood parasite that is detected by PCR. Supposed false positives in human malaria PCR were shown to have been caused by detection of mixed infections (Snounou et al. 1993, Tirasophon et al. 1994). These positive results had thus been erroneously categorized as false.
We argue that, at a minimum, accuracy of a PCR implementation should be reflected by its sensitivity relative to smears made from the same blood samples. This is a restricted use of sensitivity, but the principle is straightforward: genuine false negatives for samples in which there is sufficient parasitemia to be seen under a microscope should alert investigators to additional false negatives and prompt a search for sources of inhibition that are reducing test sensitivity, including a thorough investigation of reaction conditions and DNA purification methods. This might also entail examining the thermal characteristics of test instruments, trying a new polymerase, or incorporating reaction additives. All human and avian PCR diagnostics can identify more positive samples than can traditional microscopy (human: reviewed by Weiss 1995, Makler et al. 1998, Hanscheid 1999; bird: Feldman et al. 1995, Jarvi et al. 2002, Richard et al. 2002, Fallon, Ricklefs et al. 2003, Waldenström et al. 2004). In addition, serial dilution experiments reveal that PCR tests have the potential to detect malaria at very low levels of parasitemia (Feldman et al. 1995, Fallon, Ricklefs et al. 2003, Waldenström et al. 2004). Thus, PCR tests should be extremely sensitive, and genuine false negatives relative to microscopy represent a real shortcoming of the laboratory implementation of PCR.
The sensitivity of a test has both a potential component (based on serial dilutions) and a realized component (based on acute infections detectable by microscopy). The accuracy of a test, based on genuine false negatives for acute infections, can be viewed as a link between potential and realized sensitivity. Given the complexity of PCR, failure to detect acute infections is almost certainly correlated with even greater failure to detect chronic infections, when the low copy number of target molecules must exist in a high background of nontarget or competing target DNAs.
Viewed in this light, PCR diagnostics for malaria in birds are quite problematical. Most studies of avian and human malaria identify genuine false negatives (Table 2, Fig. 1), but the problem is more common in bird malaria diagnostics (Fig. 1). Analysis of arcsine-transformed proportions of 18 human studies and 10 avian studies (Fig. 1) indicates that human malaria PCR is significantly more accurate than avian malaria PCR (t26 = 3.2, P = 0.004). It is important to understand why this difference exists.
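The comparison reported above (t with 26 df from 18 human and 10 avian studies) is consistent with a pooled-variance two-sample t-test on arcsine-transformed proportions. The sketch below shows that calculation in Python under that assumption. The accuracy values are invented placeholders, not the values plotted in Fig. 1, and the function names are illustrative.

```python
import math
from statistics import mean, variance


def arcsine_transform(p: float) -> float:
    """Variance-stabilizing transform for a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def pooled_t(group_a, group_b):
    """Student's t with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Placeholder accuracies (proportion of smear-positive samples detected by PCR).
human_accuracy = [1.00, 0.98, 0.95, 1.00]
bird_accuracy = [0.90, 0.80, 1.00]

t_stat, dof = pooled_t(
    [arcsine_transform(p) for p in human_accuracy],
    [arcsine_transform(p) for p in bird_accuracy],
)
```

With 18 and 10 studies the same function would return 26 degrees of freedom, matching the subscript in the reported statistic. The P value would then be read from the t distribution with that many degrees of freedom.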
PCR Diagnostics in Humans and Birds
For detecting human malaria, nested PCR is currently considered the gold standard (Ndao et al. 2004). Nested PCR involves two rounds of amplifications, using two sets of primers. The first set spans a larger target, and the second set binds to a smaller target within the larger target during a second amplification reaction. The virtually perfect accuracy of recent human PCR diagnostics reflects the importance of malaria as a global disease that affects millions of people (Anderson and May 1991), and thus the resources available for perfect diagnoses. Researchers have been able to evaluate and purchase different manufacturers' products, including thermocyclers, automated extraction and purification systems, and DNA polymerases isolated from diverse extremophiles.
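The two-round logic of nested PCR can be illustrated with a toy in-silico amplification on a text string. This is only a sketch: it requires exact primer matches, ignores annealing temperature and mismatches, and uses short invented sequences and hypothetical function names rather than any published Plasmodium primers.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplify(template: str, forward: str, reverse: str):
    """Return the exact-match product bounded by the two primers, or None."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the top
    # strand its binding site reads as the primer's reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Invented template: flank, outer site, inner site, core, inner site, outer site, flank.
TEMPLATE = "TTTT" "ACGTAC" "GG" "CATTGA" "CCCCAAAA" "GATTAC" "TT" "GGCATG" "AAAA"
OUTER_FWD, OUTER_REV = "ACGTAC", reverse_complement("GGCATG")
INNER_FWD, INNER_REV = "CATTGA", reverse_complement("GATTAC")

# Round 1 uses the outer primers on the sample; round 2 uses the inner
# primers on the round-1 product only.
outer_product = amplify(TEMPLATE, OUTER_FWD, OUTER_REV)
inner_product = amplify(outer_product, INNER_FWD, INNER_REV)
```

Because the second round starts from the first-round product, the inner primers work on a template already enriched for the target, which is the usual explanation for the added sensitivity of the nested design.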
In addition to the obvious disparity in economics, we believe there is a systemic reason why avian malaria PCR diagnostics are more experimentally difficult to perfect than PCR diagnostics for human malaria. The biggest problem for PCR diagnostics in birds compared with humans may be due to characteristics of the cell populations in the hematopoietic systems. Birds have nucleated red blood cells and thrombocytes, whereas humans and other mammals lack nuclei in red blood cells and platelets (Smith et al. 2000). In a human blood sample, only white blood cells provide host nuclear DNA, while all human and bird cells contribute mitochondrial (mt-) DNA. Table 3 shows the number of cells that contribute to total nuclear genomic DNA in a blood sample of 1 mm3. There are approximately 727 times more cells yielding amplifiable products in an avian blood sample than in a human sample of comparable size. Because the human genome is approximately twice the size of the avian genome (Singer and Berg 1991, Stevens 1996), this reduces the difference in the amount of nuclear DNA by half. Assuming the distribution of mtDNA is the same in blood cells of humans and birds, the difference in total cell number means that birds have 1.03 times the amount of mtDNA as humans in a blood sample of comparable size. These calculations reveal that birds have over two orders of magnitude (350×) more DNA in a peripheral blood sample than humans.
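The nuclear DNA part of this argument is a two-step calculation, sketched below in Python. Only the two ratios stated in the text are used. The cell counts in Table 3 and the mitochondrial contribution are not reproduced, so the result is an approximation to the rounded 350× figure rather than a re-derivation of it. The function names and the example parasite fraction are hypothetical.

```python
import math

CELL_RATIO = 727.0        # bird:human DNA-contributing cells, as stated in the text
GENOME_SIZE_RATIO = 2.0   # human:bird nuclear genome size, as stated in the text


def nuclear_dna_ratio(cell_ratio: float, genome_size_ratio: float) -> float:
    """Bird:human ratio of host nuclear DNA in equal blood volumes."""
    return cell_ratio / genome_size_ratio


def diluted_parasite_fraction(fraction_in_human: float, host_dna_ratio: float) -> float:
    """Parasite:host DNA ratio in a bird sample, for the same amount of
    parasite DNA against host_dna_ratio times more host DNA."""
    return fraction_in_human / host_dna_ratio


ratio = nuclear_dna_ratio(CELL_RATIO, GENOME_SIZE_RATIO)   # 727 / 2
orders_of_magnitude = math.log10(ratio)                     # between 2 and 3

# Hypothetical example: a parasite:host ratio of 1 in 1000 in a human sample.
bird_fraction = diluted_parasite_fraction(1e-3, ratio)
```

Halving 727 gives about 364, the same order as the 350× quoted, and its base-10 logarithm lies between 2 and 3, which is what "over two orders of magnitude" expresses.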
Given that a sample of bird blood has much more host template than a mammalian sample for a similar level of parasitemia, the avian sample dilutes the parasite:host ratio of DNA by more than two orders of magnitude. Initial conditions of the PCR reaction are thus biased against detecting malaria in birds compared with humans. This bias is especially important for understanding the limitations of test performance at the low levels of parasitemia associated with chronic infections. While the factors that can lead to false negatives and false positives are the same for human and avian malaria PCR diagnostics, the same factors will cause greater problems for the avian malaria PCR, because avian malaria PCR diagnostics must detect a template at a relatively lower copy number. In this respect, avian malaria diagnostics share some of the problems of detecting DNA in forensic and ancient samples (Sensabaugh 1994). Therefore, concentration of primers, Mg2+ optimization, cycling specifics, dilution of samples, and purity of DNA appear, on first principles, to be more important for studies of avian malaria than human malaria and the PCR protocols used for birds must be more exacting than those employed in studies of malaria in mammals.
Variation in Accuracy of PCR Diagnostics in Birds
There are several issues associated with comparing the accuracies of the different avian malaria studies in Table 2. A simple comparison of confidence intervals indicates that the accuracies are indistinguishable with the exception of the last two listed. But this approach does not address the possibility that the perfect accuracy achieved by two laboratories (a boundary condition), or higher accuracy than average, may reflect something in common that differs from other laboratories. Logistic regression can use factors such as primer types and buffer components to determine if the overall fit of the model to the data is significantly improved by the incorporation of these factors. A logistic regression used in this way is like a meta-analysis of different studies that independently attempt to achieve high accuracy.
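The text does not say how the confidence intervals were computed. As one possible illustration, the sketch below uses the Wilson score interval, which behaves sensibly at the boundary of perfect accuracy, and checks whether two intervals overlap. The counts are invented, and the choice of the Wilson method is an assumption made here, not the authors' stated procedure.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def intervals_overlap(a, b) -> bool:
    """True if two (low, high) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented examples: smear-positive samples detected by PCR / samples tested.
perfect = wilson_interval(20, 20)
moderate = wilson_interval(15, 20)
poor = wilson_interval(5, 20)
```

With these made-up counts, the interval for 20 of 20 still overlaps the interval for 15 of 20, which shows why small studies with perfect accuracy can be statistically indistinguishable from less accurate ones when intervals alone are compared.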
We approached the comparison of protocols here by examining the performance of various tests based on the different buffers used for sample storage. Placing the blood sample in buffer is the first step after collection relative to DNA extraction, purification, and the PCR protocol. The buffer components from studies in Table 2 are shown in Table 4, along with the highest accuracy a laboratory achieved with that buffer. Note that no two buffers are identical, but they share some common ingredients and features. All buffers contain Tris (Trizma Base) and EDTA (ethylenediaminetetraacetic acid), which serve to maintain appropriate ionic strength and a physiological pH of 8, and also inactivate nucleases in the blood and buffer solution. Some buffers, called lysing buffers (Seutin et al. 1991), also contain SDS (sodium dodecyl sulfate), which destroys various membranes within and around cells. This lysis releases nucleases (enzymes which will degrade the nucleic acid targets of PCR). In principle, the nucleases would be kept inactivated by Tris-EDTA (with EDTA effectively removing divalent cations such as magnesium from the solution and rendering the endonucleases inert). The primary advantage claimed for lysing buffer is that it eliminates the need to keep samples refrigerated (Seutin et al. 1991).
Nonlysis and lysis buffers represent alternative strategies for dealing with blood storage prior to arrival at the laboratory. The first keeps membranes intact until further steps in the DNA extraction protocol, but with EDTA to inactivate nucleases from any cells that might be incidentally damaged from a needle in the wing vein or from exiting the collection tube. The second dissolves membranes chemically before any DNA extraction begins.
To determine if storage buffer was associated with test accuracy, we performed logistic regressions of study accuracy in relation to buffer type. For the first analysis, we used the highest accuracy reported by each laboratory, regardless of the primers employed. The model included accuracy as the dependent variable and buffer type as the independent variable. Accuracy of tests using lysing buffer was significantly lower than that of tests using nonlysing buffer (P = 0.005). The only studies with perfect accuracy were the ones using Tris-EDTA without SDS. Each of these two studies used a different primer, one based on a nuclear template (Feldman et al. 1995), and the other on a mitochondrial template (Waldenström et al. 2004).
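For a logistic regression with buffer type as the only predictor, the fitted accuracy in each buffer group equals that group's pooled proportion, so the test of the buffer effect can be written as a likelihood-ratio (deviance) comparison without any fitting library. The following Python sketch does this for two buffer types. The study counts are invented for illustration and do not come from Table 2, and the function names are hypothetical.

```python
import math


def binom_loglik(k: int, n: int, p: float) -> float:
    """Binomial log-likelihood kernel, safe when p is exactly 0 or 1."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1 - p)
    return ll


def buffer_effect_lrt(groups: dict):
    """groups maps buffer type -> list of (correct, total) per study.
    Returns (drop in deviance, P value) for exactly two buffer types."""
    assert len(groups) == 2, "1-df chi-square shortcut assumes two groups"
    studies = [s for group in groups.values() for s in group]
    p_null = sum(k for k, _ in studies) / sum(n for _, n in studies)
    ll_null = sum(binom_loglik(k, n, p_null) for k, n in studies)
    ll_buffer = 0.0
    for group in groups.values():
        p_group = sum(k for k, _ in group) / sum(n for _, n in group)
        ll_buffer += sum(binom_loglik(k, n, p_group) for k, n in group)
    drop = max(0.0, 2 * (ll_buffer - ll_null))
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    return drop, math.erfc(math.sqrt(drop / 2))


# Invented counts of (smear-positive samples detected by PCR, samples tested).
hypothetical = {
    "nonlysis": [(20, 20), (15, 15)],
    "lysis": [(12, 20), (14, 18)],
}
deviance_drop, p_value = buffer_effect_lrt(hypothetical)
```

Adding a second factor such as primer type, and comparing the deviance explained under each order of entry, would require an iterative fit from a statistics package. That step is not shown here.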
For the second analysis, we restricted data to those studies in which the same primer pairs were used with different buffers. The model included primer type and buffer type as the independent variables. The order of entry of these variables was reversed in two applications of the model (Table 5). When primer type was entered first, both effects were significant. The nuclear primers were more accurate than the mitochondrial primers, but the buffer type effect accounted for much more of the deviance than primer type. Nonlysis buffers (those without SDS) were associated with higher accuracy, regardless of primer used. When buffer type was entered first, the analysis showed that there was a highly significant buffer effect, which accounted for most of the deviance, with only a marginally significant primer type effect. The accuracy of avian malaria studies thus appears to depend more on the buffer used than on the primers used.
Rapid lysis buffer was not designed to be used for sensitive diagnostics. Tris-EDTA may not be able to deal with the host of nucleases and other enzymes and compounds released during lysis that do not normally come into contact with DNA and may react with it. It is important to realize that inactivation is an equilibrium condition, and that Mg2+ will switch between being chelated to Tris-EDTA and being made available to the nucleases. The rate of switching will increase with warmer temperatures. In addition, not all nucleases require magnesium for activation. Studies that utilized mtDNA markers from blood samples stored in lysis buffer did not have an exacting need for biological integrity of sampled tissues, because the target sequences were such a high proportion of the total genomic DNA that small differences in stability and quality of starting material were simply not a problem for most laboratories. A certain percentage of degraded host DNA will still leave sufficient intact host DNA for numerous types of genetic analyses, including restriction, ligation, and amplification. However, a comparable percentage of degraded pathogen DNA in a sample from a bird with a chronic infection may result in too little intact pathogen DNA to effectively amplify in the PCR test. The relationship between accuracy and buffer from samples of birds with acute infections indicates that the same problem can occur in these cases as well, based on the low parasite to host ratio of DNA in birds with even acute infections.
We use our own experience and that of colleagues to further the comparison between non-SDS and SDS buffers. Non-SDS buffers can keep blood samples intact for over a month even in the warm temperatures of tropical Africa (J. Waldenström, University of Lund, pers. comm.), and can keep blood samples intact for several months at room temperature in Sweden, as long as they are not frozen (S. Bensch, University of Lund, pers. comm.). Freezing and then thawing samples in non-SDS buffer requires further storage in cryogenic conditions until DNA is extracted (S. Bensch, University of Lund, pers. comm.). This is because the thawing of the sample results in some lysing of the cell membranes. We formerly transported frozen blood samples in non-SDS buffer for 5 hr before storage under cryogenic conditions. These experiences indicate that non-SDS buffers can keep blood samples intact at warm temperatures for at least a month. The value of preservation claimed by SDS lysing buffer is not unique to that buffer for avian malaria diagnostics.
The degradation of DNA by lysis buffer is expected to vary with time spent in a noncryogenic state. For example, Jarvi et al. (2002) used SDS buffer but froze it shortly after collection because the aviary was close to the laboratory. The degradation in this case would have occurred mainly during the thawing of the sample. Samples from other studies (Africa and the Caribbean) were at ambient temperature for a longer time. Consistent with the temporal prediction, Jarvi et al. (2002) had highest accuracy among the laboratories using lysis buffer (Table 2).
Problems with buffers containing SDS have also been reported in contexts other than the study of avian malaria. Conrad et al. (2000) indicated they (and others) did not obtain sufficient quantity and quality of DNA for multilocus fingerprinting using lysis buffer for storing the blood samples of several species of tyrant flycatchers (Eastern Kingbird [Tyrannus tyrannus], Least Flycatcher [Empidonax minimus], and Acadian Flycatcher [E. virescens]). Several laboratories with different reagents attempted, without success, to extract DNA for conducting the genetic analyses. Conrad et al. (2000) suggested that the problem was the lysis buffer. Their recommendation that samples be analyzed as soon as possible after collection is consistent with our interpretation that the longer the sample is exposed to SDS, the more degradation of DNA will take place. This finding is also consistent with observations from the use of PCR in microbial genetics (Wilson 1997).
Finally, it is important to recognize that commercially available buffers with proprietary recipes represent biochemical unknowns. We had to access the material safety data sheet for the Puregene cell lysis buffer used by Fallon, Ricklefs et al. (2003) to obtain a list of ingredients and only approximate concentrations. Nothing on the manufacturer's web site indicated the extent of lysis expected using this product, its advisability for use in malaria diagnostics, or even stability for long-term storage. For sensitive PCR malaria diagnostics or, for that matter, any DNA-based use of a blood sample, biologists and laboratory personnel should be fully aware of what they mix with their samples.
Alternatives to Buffer in Contributing to Accuracy
Our logistic regression did not address the possibility that differences in accuracy explained by various buffers are actually associated with protocols used in the laboratory. Two alternative hypotheses for the variation in accuracy shown by Table 2 are: 1) methods of purification and extraction of the DNA from the blood samples in each buffer type, and 2) optimization of the PCR reaction, including experience of the investigator. Published methods specify less information about these techniques than about the buffer used, so a comparison needs to find systematic differences between the studies with perfect accuracy and those with lower accuracy.
To examine the first hypothesis, it is important to understand the difference between extraction, purification, and concentration. Extraction involves isolating the nucleic acid component of genomes from the rest of the buffered blood sample containing chromatin and associated cellular debris. Proteins, lipids, RNAs, and carbohydrates can be bound to the DNA and these compounds will be extracted along with the DNA as chromatin is stripped open. Purification is the phase that separates the extracted nucleic acids from the rest of the materials in the blood buffer solution. The concentration phase precipitates the pure DNA or reduces the liquid component of the buffer in which the DNA is to be stored. A combined purification and concentration phase involves precipitating the extracted nucleic acids with ethanol. However, this phase will also precipitate any materials that are still bound to the nucleic acids.
One of the essential parts of the extraction phase is incubating the thawed blood sample with SDS and Proteinase K to ensure complete lysis. The latter reagent is a heat-stable protease derived from a fungus. Proteinase K will strip proteins from the DNA and digest them, including nucleases and other enzymes, into small peptides and amino acids. If DNA is not stripped of these structural or binding proteins, physical or chemical incompatibilities may result in PCR inhibition. This protocol has often been used by molecular biologists to produce high molecular weight total genomic DNA to target unique genomic sequences (Kocher et al. 1989), and is used by several ancient DNA and forensic laboratories (Cooper 1994, Sensabaugh 1994). Proteinase K must itself be eliminated after the digestion step so that it does not destroy the Taq DNA polymerase enzyme during later amplification.
Laboratories varied in how they processed blood samples. The two groups with perfect accuracy, and non-SDS buffer, each incubated samples in SDS and Proteinase K before extraction of nucleic acids. One laboratory with less than perfect accuracy also incubated samples in SDS and Proteinase K before extraction, but used SDS buffer at the collection site (Richard et al. 2002). The Richard et al. (2002) and Fallon, Ricklefs et al. (2003) accuracies are similar, yet the latter did not use Proteinase K during the extraction process, based on the manufacturer's protocol. Jarvi et al. (2002) stated only that standard methods of phenol-chloroform purification were used, without a reference, so it is not known if Proteinase K was used. There is no obvious correlation between extraction protocols and buffer that could replace the significant effect of buffer type.
Extracted DNA can be purified by dialysis or multispin columns with size or affinity characteristics (Sambrook et al. 1989). Without dialysis or other purification techniques, any residual contaminants, such as salts and phenol, will be concentrated with DNA in an ethanol precipitation step. This can create problems in PCR applications. A superpurified product will undergo less degradation in storage and will not be as likely to complicate the PCR amplification. However, purification may expose laboratory personnel to toxic chemicals and results in the accumulation of hazardous waste products.
The purification stage is not necessary to achieve perfect accuracy in malaria diagnostics. Only one laboratory with perfect accuracy performed the purification stage (dialysis; Feldman et al. 1995); the other laboratory with perfect accuracy did not (Waldenström et al. 2004). Thus, extraction and purification cannot be as important as the buffer used in accounting for the variation in accuracy. However, as we will show below, purification of DNA is absolutely essential for troubleshooting and for comparing buffers, primers, or optimization protocols among laboratories.
The second alternative hypothesis is that variation in accuracy is associated with optimization of the PCR reaction, which might vary systematically between the laboratories with perfect accuracy and those with less than perfect accuracy. Unfortunately, the methods presented in the reported studies are too terse and incomplete to allow replication of any study in exact detail. There are many ways that PCR reactions can be optimized, and optimization requirements are specific to particular thermocyclers, Taq polymerases, and primers. Thus, optimization problems were likely different in each laboratory and it is probable that differences in optimization among laboratories contributed to variation in accuracy and will continue to do so until robotics become universal and affordable (Bustin 2002). However, for the buffer type effect to be identical to a potential optimization effect means that the two laboratories with perfect accuracy did many things differently and better than the other laboratories after extraction of the DNA. There is still residual deviance with 2 df in Table 5, which suggests that effects other than buffer and primer ( = optimization effects) may have contributed to variation in accuracy among the laboratories, either as a main effect or as an interaction with the buffer used. This means that the buffer type effect is not identical to optimization. Both a buffer effect and an optimization effect are based on biochemistry, but occur at different stages of the protocol. It is difficult to imagine how optimization could compensate for the effects of degraded DNA caused by the buffer effect.
Issues in Empirical Testing of Avian Malaria Diagnostics
The effect of buffer type and its expected influence on DNA quality provide constraints on empirical testing of alternative aspects of PCR protocols. First and foremost, the DNA must be as pure as possible for any comparison. If it is not and two primers are being compared, as in Richard et al. (2002) and Fallon, Ricklefs et al. (2003), authors cannot eliminate the alternative explanation that the difference in performance of the primers was based on an interaction among the primer, degraded DNA, and PCR inhibition. If two buffers are being compared without purification of the DNA, the comparison cannot eliminate the alternative explanation that the DNA required purification after storage in one buffer but simpler extraction was sufficient after storage in the other buffer. If a study is designed to test the ability of different laboratories to effectively use the same primer, and the laboratories are given the same blood samples in SDS lysis buffer, the comparison cannot eliminate the alternative explanation that differences in performance among laboratories were based on differences in template quantity and quality in relation to optimization techniques. Nonlysis Tris-EDTA buffer and DNA purification eliminate numerous confounding variables and interactions in all types of comparisons. DNA purification is thus essential for testing whether the accuracy of diagnostics depends on the buffer in which blood samples were stored.
Jarvi et al. (2002) and Waldenström et al. (2004) are the only avian studies of which we are aware in which two methodologies or approaches are compared. Jarvi et al. (2002) asserted that the prevalence of malaria in birds is underestimated using PCR compared with immunoblot. Their assertion is only weakly supported because SDS lysis buffer was used; thus, the authors cannot eliminate the alternative hypothesis that the lower prevalence detected by PCR was based on degraded DNA associated with the buffer, the extraction technique, or the optimization techniques.
In addition, the Jarvi et al. (2002) comparison says little about PCR in general because a different laboratory performing the PCR may have had more accurate results and higher sensitivity. This is a disease diagnosis problem analogous to pseudoreplication in ecology (Hurlbert 1984). For comparing different methodologies, different primers, or different buffers, a study group is a replicate, and a comparison within a single study group has the limitations of a single replicate. However, when two methodologies are associated with perfect accuracy in a single laboratory, as in Waldenström et al. (2004) for regular and nested PCR, greater confidence can be placed in the generality of the result.
A study in which multiple laboratories use the purest starting material obtainable is necessary to distinguish the effect of DNA quality on results from any other variable. For example, consider a design to test whether the variation in accuracy of avian malaria PCR tests is based on buffer used, as our analysis of existing studies strongly suggests. Blood samples could be split into both lysing and nonlysing buffers. Samples identified as positive by microscopy could be selected for further analysis. Next, the DNA should be extracted and purified the same way for samples from each buffer. The participating laboratories would receive aliquots of pure DNA. Any differences in accuracy between the buffers could then be attributed to the effects of the buffer on the sample. Any differences in accuracy among laboratories would indicate optimization problems in laboratories with lower accuracy. A fully blocked design incorporating both laboratory and an effect of either primer or buffer type is the only way to distinguish optimization issues from issues associated with the primer or buffer type. However, this design will only work if the DNA has been purified. Studies performed under this design, with a collegial approach, have the potential to improve the accuracy and sensitivity of avian malaria PCR diagnostics in participating laboratories.
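As an illustration only, the crossed layout described above can be sketched in Python. The sample, buffer, and laboratory labels and the function names blocked_design and accuracy_by are hypothetical, and the PCR outcomes are placeholders rather than results from any study. The sketch enumerates every sample-by-buffer-by-laboratory aliquot and tallies accuracy for each level of a chosen factor.

```python
from collections import defaultdict
from itertools import product


def blocked_design(sample_ids, buffers, labs):
    """Cross every smear-positive sample with every buffer and every laboratory."""
    return [
        {"sample": s, "buffer": b, "lab": lab}
        for s, b, lab in product(sample_ids, buffers, labs)
    ]


def accuracy_by(factor, aliquots, pcr_positive):
    """Proportion of smear-positive aliquots scored positive by PCR,
    computed separately for each level of `factor` ("buffer" or "lab").

    `pcr_positive` maps (sample, buffer, lab) -> True/False.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for a in aliquots:
        level = a[factor]
        totals[level] += 1
        hits[level] += bool(pcr_positive[(a["sample"], a["buffer"], a["lab"])])
    return {level: hits[level] / totals[level] for level in totals}


# Hypothetical labels, chosen only to show the shape of the design.
samples = ["bird01", "bird02", "bird03"]
buffers = ["nonlysis_TE", "SDS_lysis"]
labs = ["lab_A", "lab_B"]

design = blocked_design(samples, buffers, labs)

# Placeholder outcomes (NOT data): every aliquot scored positive.
placeholder = {(a["sample"], a["buffer"], a["lab"]): True for a in design}
print(accuracy_by("buffer", design, placeholder))
print(accuracy_by("lab", design, placeholder))
```

Because every laboratory receives every sample in every buffer, a difference between buffer levels cannot be produced by which laboratory happened to handle which samples, and a difference between laboratory levels cannot be produced by the buffer.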
In principle, nested PCR should increase accuracy and sensitivity for amplifying a target with low copy number. However, Jarvi et al. (2002) used nested PCR and the same primers as Feldman et al. (1995), who employed regular PCR, yet Jarvi and colleagues had lower accuracy than Feldman et al. (1995; Table 2). Waldenström et al. (2004) had perfect accuracy with both regular and nested PCR, and had greater sensitivity with nested PCR. Nested PCR in general cannot be a substitute for pure DNA and PCR reactions that are fully optimized (Holst-Jensen 1998). The amplicons produced by the first stage of nested PCR have high risk of contaminating the second stage unless extreme precautions are taken, including using separate rooms and separate instruments. Nested PCR, which has great potential to improve sensitivity, should only be undertaken when regular PCR has been sufficiently investigated and found not to yield the expected results, compared to standards published by other laboratories.
In conclusion, researchers investigating avian malaria should recognize that a protocol begins at the time the blood sample is taken, followed by the temporary storage of the sample at the study site, the shipping of the sample to the laboratory, and the storage of the sample in the laboratory before the extraction of DNA. Buffers that work well for mitochondrial host studies do not necessarily work well for diagnosing malaria. The rest of the protocol depends on the purity of DNA. Troubleshooting PCR reactions is the norm, even if pure DNA is used, but troubleshooting can be more effective when a degraded sample is not the primary issue. Lysis buffer is apparently a major problem for accuracy of avian malaria PCR diagnostics. If damage to DNA results from use of that buffer, then all the problems of ancient and forensic DNA become relevant to avian malaria PCR diagnostics, which are already challenging because of the relatively greater amount of host DNA. Increased attention needs to be directed toward the buffer used for blood storage, DNA purification, and, ultimately, experimental design for progress to be made in standardizing and improving avian malaria diagnostics.
We recognize the encouragement of many colleagues who were supportive of our effort to prepare this review. The cooperative sharing of information and unpublished data by S. Bensch and J. Waldenström is greatly appreciated. Discussions with G. Bodner, M. Medeiros, G. Ostrander, and S. Conant contributed to the review. Anonymous reviewer comments helped improve the manuscript. Financial support was provided by the National Center for Environmental Research (Science to Achieve Results, Environmental Protection Agency, R82-9093), and by the National Institutes of Health – Minority Access to Research Careers.
Table 1. Problems associated with false negatives and false positives in PCR diagnostics for malaria. Genuine and apparent false results are derived from different causes
Table 2. Accuracy of avian malaria PCR tests using different primers (molecular tests). All tests are based on samples with enough parasitemia to be detected through microscopy (smear tests, A). 18S rRNA and TRAP gene are nuclear primers. All others are mitochondrial primers. A unique primer is defined by name, region of the genome, and developer. Smear positive and molecular positive (B) represent accurate diagnosis by PCR. Smear positive and molecular negative represent genuine false negatives by PCR, as defined in Table 1. Accuracy is defined as the proportion of smear positive samples that are correctly diagnosed by PCR
Table 3. Comparison of numbers of different types of avian and human blood cells per mm³. Avian erythrocyte data are based on passerine birds from Nice et al. (1935); avian leukocyte and thrombocyte data are based on nonpasserine birds from Sturkie (1986); all human blood cell data are from Williams et al. (1983). Human platelets serve the same clotting function as avian thrombocytes
Table 4. Relationship of accuracy to buffer used. Accuracy is defined as the proportion of samples diagnosed positive by microscopy (smear) correctly diagnosed by PCR. The accuracies listed are the highest achieved by the five laboratories in Table 2, and are in the order Feldman et al. (1995), Waldenström et al. (2004), Jarvi et al. (2002), Richard et al. (2002), and Fallon, Ricklefs et al. (2003). Salt (NaCl) and Tris (Trizma base) keep the buffer solution at a constant pH. EDTA (ethylenediaminetetraacetic acid) chelates Mg²⁺ ions that would otherwise activate endonucleases released from damaged cell and nuclear membranes. SDS (sodium dodecyl sulfate) is a surfactant that lyses cell and nuclear membranes, thereby releasing endonucleases, other enzymes, and cellular debris
Table 5. Analysis of deviance tables for logistic regression of accuracy of PCR diagnostics on primer type and buffer type. Accuracy is defined as the proportion of blood samples that test positive by microscopy (smear) that also test positive by PCR. NULL specifies a model with minimal structure. Primer type and buffer type specify effects that add structure to the model, and the regression determines the significance of the improved goodness of fit. Primer type refers to the Feldman 18S rRNA nuclear primers and to the Bensch mt-HaemF-R Cyt B mitochondrial primers (Table 2). Buffer type refers to nonlysing buffer and lysing buffer with SDS (Table 4). The five laboratories included in the analysis represented collectively all primer type and buffer type combinations. Models (a) and (b) differ by the order in which primer type and buffer type were entered into the regression
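The first step of each ordering in an analysis of deviance of this kind can be sketched in Python as follows. This is a minimal illustration, not the analysis behind Table 5: the counts are invented for the example, the function names are hypothetical, and only the NULL model and the two single-factor models are fitted. Those models have closed-form fits (the pooled proportion within each level of the factor); the additive model with both factors needs iterative fitting and is not shown.

```python
import math


def binomial_deviance(groups, fitted):
    """Deviance of grouped binomial data given fitted probabilities.

    groups: list of (positives, n); fitted: one probability per group.
    Terms with a zero count contribute nothing (0 * log 0 is taken as 0).
    """
    dev = 0.0
    for (y, n), p in zip(groups, fitted):
        if y > 0:
            dev += y * math.log(y / (n * p))
        if y < n:
            dev += (n - y) * math.log((n - y) / (n * (1 - p)))
    return 2.0 * dev


def pooled_fit(rows, key):
    """Maximum-likelihood fit of a logistic model with a single categorical
    factor: each row gets the pooled proportion of its factor level."""
    pos, tot = {}, {}
    for r in rows:
        k = key(r)
        pos[k] = pos.get(k, 0) + r["pos"]
        tot[k] = tot.get(k, 0) + r["n"]
    return [pos[key(r)] / tot[key(r)] for r in rows]


# Invented counts for five hypothetical laboratories (NOT the data of Table 2).
rows = [
    {"primer": "18S", "buffer": "nonlysis", "pos": 19, "n": 20},
    {"primer": "cytb", "buffer": "nonlysis", "pos": 18, "n": 20},
    {"primer": "18S", "buffer": "lysis", "pos": 12, "n": 20},
    {"primer": "cytb", "buffer": "lysis", "pos": 14, "n": 20},
    {"primer": "cytb", "buffer": "lysis", "pos": 10, "n": 20},
]
groups = [(r["pos"], r["n"]) for r in rows]

d_null = binomial_deviance(groups, pooled_fit(rows, lambda r: "all"))
d_primer = binomial_deviance(groups, pooled_fit(rows, lambda r: r["primer"]))
d_buffer = binomial_deviance(groups, pooled_fit(rows, lambda r: r["buffer"]))

# Reduction in deviance when each factor is the first term entered.
print("primer entered first:", d_null - d_primer)
print("buffer entered first:", d_null - d_buffer)
```

Each printed value is the improvement in goodness of fit over the NULL model when that factor is entered first, which corresponds to the first row below NULL in an ordering such as (a) or (b).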