Calabrese and O'Connor (1) (hereafter denoted as C&O) offer commentary on the origin of the linear no-threshold (LNT) model for low-dose radiation risk in the context of the National Academy of Sciences' (NAS') Biological Effects of Ionizing Radiation VII (BEIR VII) report (2). We write to correct several mischaracterizations and inaccuracies in their commentary. Our comments are organized according to the sections in C&O's article.
LNT MODEL AND BEIR: HISTORICAL FOUNDATIONS
Calabrese and O'Connor assert that the use of the LNT model by NAS BEIR committees is the direct result of efforts by Hermann Muller and his radiation genetics colleagues to persuade national and international committees to drop their historical reliance on threshold dose-response models (p. 464). According to C&O, Muller finally achieved success by convincing the Biological Effects of Atomic Radiation (BEAR) committee (3) to adopt the LNT model. This characterization of Muller's actions has been disputed by Cicerone and Crowley (4); the first author is president of the National Academy of Sciences.
Scientific advances over the past seven decades have prompted a reexamination of the risks arising from exposure to low-dose, low-LET radiation by several BEIR committees (2, 5–7). The most recent BEIR report (2) includes a detailed review of the then-current scientific understanding of biological responses to ionizing radiation. Based on this analysis, the committee that authored the BEIR VII report (2) concluded that
“… the balance of scientific evidence at low doses tends to weigh in favor of a simple proportionate relationship between radiation dose and cancer risk” (p. 246).
The committee's reasoning to support this conclusion is laid out clearly on p. 245–246 of the BEIR VII report (2). C&O may disagree with this conclusion, but they should acknowledge that the BEIR committee undertook an independent analysis based on contemporary data rather than implying that the committee relied on 1950s era science.
The BEIR VII committee analyzed then-available radiobiology and experimental data to identify plausible cancer risk models. The committee judged that the primary mechanism driving cancer development was induction of chromosomal aberrations arising from double-strand breaks in DNA. Aberration induction was judged to be linear quadratic based on the most critical studies with sufficient statistical power (8). Then-available animal data also supported the linear-quadratic model (e.g., 9, 10). The BEIR VII committee understood that the linear-quadratic model could be approximated by a linear function in the low-dose (<100 mSv) region (see Fig. 1).
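The low-dose approximation described above can be illustrated numerically. The following is a minimal sketch, not the committee's calculation: the coefficients `ALPHA` and `BETA` are hypothetical values chosen only for illustration, and dose is expressed in Gy (numerically equal to Sv for low-LET radiation).

```python
# Illustrative linear-quadratic (LQ) dose response: excess risk = ALPHA*D + BETA*D**2.
# ALPHA and BETA are hypothetical values, not estimates from any study.
ALPHA = 0.5    # linear coefficient, per Gy (assumed)
BETA = 0.25    # quadratic coefficient, per Gy^2 (assumed)


def lq_excess(dose_gy):
    """Excess risk under the linear-quadratic model."""
    return ALPHA * dose_gy + BETA * dose_gy ** 2


def linear_excess(dose_gy):
    """Excess risk under the linear approximation (quadratic term dropped)."""
    return ALPHA * dose_gy


def quadratic_share(dose_gy):
    """Fraction of the LQ excess risk that comes from the quadratic term."""
    return (lq_excess(dose_gy) - linear_excess(dose_gy)) / lq_excess(dose_gy)


if __name__ == "__main__":
    for dose in (0.01, 0.05, 0.1, 1.0):
        print(f"{dose:5.2f} Gy: quadratic share of excess risk = {quadratic_share(dose):.3f}")
```

With these assumed coefficients the quadratic term contributes only a few percent of the excess risk at or below 0.1 Gy, but a substantial fraction at 1 Gy. The size of the gap depends on the ratio BETA/ALPHA, which is an assumption here.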
The BEIR VII committee had a clear-eyed view of the challenges for estimating radiation risks at low doses:
“It is abundantly clear that direct epidemiologic and animal approaches to low-dose cancer risk are intrinsically limited in their capacity to define possible curvilinearity or dose thresholds for risk in the range 0–100 mSv. For this reason, the present [BEIR VII] report has placed much emphasis on the mechanistic data that can underpin such judgments” (p. 245).
Future BEIR committees may reach different conclusions about the validity of the LNT model based on advances in science and medicine. Those future committees will not be bound by previous BEIR reports, and certainly not by seven-decade-old science, in reaching their conclusions.
THE CASE FOR AND AGAINST THE LNT MODEL
Calabrese and O'Connor note (p. 465) that “decisions made by the various BEIR Committees (with respect to models used to generate risk estimates) are often at odds with those from prior BEIR Committees and may well change with the next iteration of the BEIR process”. The authors also assert that the published scientific literature does not support the magnitude of changes in the risk estimates across these BEIR studies.
It is certainly true that risk models and risk estimates have changed across the BEIR studies as scientific understanding and analytical capabilities have improved. Low-dose cancer risk assessment is far from settled science: It is technically challenging to estimate cancer risks arising from low-dose ionizing radiation based on observational studies alone because of confounding influences that can vary among and within study cohorts. More generally, it is important to carefully consider the results of all available studies when developing risk estimates, paying due attention to statistical power and the potential for uncontrolled confounding.
SOURCES OF DATA OF (sic) STOCHASTIC EFFECTS OF IONIZING RADIATION
Calabrese and O'Connor discussed and critiqued the use of four classes of data in the BEIR VII report. Some of their critiques suffer from the selective use of study results and/or the failure to use the full range of available scientific information. Examples follow.
The studies cited in this section of Calabrese and O'Connor's commentary do not support any particular conclusions about the validity of the LNT model because of study design or statistical power limitations. In particular, ecologic studies have well-known limitations arising from the lack of individual dose estimates and information on possible confounding factors.
Calabrese and O'Connor cite a conclusion from a study performed by Thompson et al. (11) on residential radon exposure in Worcester County, Massachusetts, as “running contrary” to a statement in an NAS media release on the BEIR VII report that suggested ways to reduce radon exposures (and potential cancer risks arising from such exposures). This is a good example of C&O's selective reporting of research results to make broad and scientifically unsupported insinuations.
The Thompson et al. (11) study, which involved 200 cases and 397 controls, concluded that the “possibility of a hormetic effect on lung cancer at low radiation doses cannot be excluded” (C&O, p. 466; see also their Fig. 1). Failing to exclude an effect does not constitute evidence for its existence. Furthermore, C&O ignore much larger and more statistically powerful European (12), North American (13) and Chinese (14) residential radon studies that do not show evidence for hormesis. These studies collectively included over 12,000 lung cancer cases and 21,000 controls. All dose categories referenced in these studies had estimated adjusted odds ratios or relative risks above 1.0; moreover, the risk estimates tended to increase with dose [e.g., see Table 2 in ref. (12); Table 9 in ref. (13); and Table 3 in ref. (14)].
Calabrese and O'Connor also ignore the results from the long-term follow-up of the Russian Techa River cohort, which received low-dose, low-dose-rate exposures from environmental contamination. This cohort shows evidence of a radiation dose-response, similar to that seen in the atomic bomb survivors, for end points including solid cancer mortality (15) and leukemia (16).
Occupational Radiation Studies
Calabrese and O'Connor mischaracterize the BEIR VII committee's conclusions about the usefulness of nuclear worker studies for risk estimation, which appear in Chapter 8 of BEIR VII (2). C&O comment (p. 466) that:
“As the BEIR VII report indicated, in most of the nuclear industry worker studies, rates for all causes and all cancer mortality in the workers were substantially lower than the reference population. The BEIR VII Committee did not attempt to ascertain why, but speculated that it may be due to a ‘healthy worker effect and unknown differences between nuclear industry workers and the general public'. Consequently the BEIR VII Committee concluded that occupational studies were not suitable for the projection of population-based risks and eliminated them from further consideration in its risk estimates.”
The BEIR VII committee's conclusions were not based on “speculation,” but rather on an extensive analysis of then-available nuclear worker studies. The committee identified potentially useful studies (Table 8-1), summarized their characteristics (Table 8-2) and risk estimates (Tables 8-3 and 8-4), and discussed the challenges for using the studies for population-based risk estimation:
“Because of the uncertainty in occupational risk estimates and the fact that errors in doses have not formally been taken into account in these studies, the [BEIR VII] committee has concluded that the occupational studies are currently not suitable for the projection of population-based risks. These studies, however, provide a comparison to the risk estimates derived from atomic bomb survivors” (p. 206).
Medical Radiation Studies
Calabrese and O'Connor's Fig. 2 reproduces results from the Lundell et al. (17) study of breast cancer in subjects receiving radiotherapy for skin hemangioma. C&O conclude, based on a visual inspection of data in this figure, that there are no increases in breast cancer risk at doses below 100 mSv. C&O do not cite Lundell and colleagues' own conclusion that a linear model provided the best fit based on a more rigorous statistical analysis of the data.
Calabrese and O'Connor also did not cite the more recent study of the same cohort by Eidemüller et al. (18), which investigated risk-threshold models in two contexts: excess relative risk (ERR) and two-stage clonal expansion. In both contexts, the threshold models had a deviance (a measure of model fit) that was worse than the no-threshold model (Table 1 in Eidemüller et al.). Eidemüller et al. commented that:
Atomic Bomb Survivor Studies
Calabrese and O'Connor mischaracterize the source of risk estimates from the Life Span Study (LSS) cohort that are presented in the BEIR VII report. They comment (p. 467) that the BEIR VII committee relied on risk estimates “produced by researchers from the RERF,” implying that the committee was unable to design or carry out its own analyses. Although researchers from the Radiation Effects Research Foundation (RERF) provided assistance to the BEIR VII committee in developing risk estimates, the committee was responsible for determining how the risk assessments would be conducted and it performed the analyses.
Calabrese and O'Connor present data from the LSS cohort to show that “at doses up to ~100 mGy, no increase in the number of cancers is observed” (p. 467–468). This conclusion is based on the visual inspection of a semi-logarithmic plot (C&O's Fig. 4) that contains only two data points between 0–100 mGy. These data points represent the midpoints of two broad dose categories (0–5 and 5–100 mGy) taken from Table 4 of Preston et al. (19).
Calabrese and O'Connor's use of a semi-logarithmic plot dramatically stretches out the low-dose portion of the dose range and prevents readers from visually evaluating the linearity of the data. Indeed, even a simple linear regression of the points in C&O's plot would look flat at low doses because of the logarithmic scale. Figure 2 illustrates the differences in appearance between linear and semi-logarithmic plots of the data from Table 4 of Preston et al. (19). The slope of the fitted curve in the semi-logarithmic plot in Fig. 2 looks nearly flat below 0.1 Gy as observed by C&O. However, the slope of the fitted curve in the linear plot in Fig. 2 is clearly greater than zero. Of course, the appearance of these plots also depends on the placement of boundaries for the dose categories. Nevertheless, the results in individual dose categories are actually quite close to the linear regression fitted on the dose interval from 0–2 Gy, i.e., omitting the highest dose category due to the leveling-off at high doses (Fig. 1).
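The visual effect of the semi-logarithmic scale can be checked with simple arithmetic. The sketch below uses a strictly linear no-threshold response with a hypothetical slope (`SLOPE` is an assumption, not an estimate from the LSS) and compares the apparent rise over one decade of dose at low and high doses.

```python
import math

SLOPE = 0.5  # hypothetical excess relative risk per Gy, for illustration only


def relative_risk(dose_gy):
    """A strictly linear no-threshold dose response."""
    return 1.0 + SLOPE * dose_gy


def slope_on_linear_axis(d_lo, d_hi):
    """Rise over run when dose is plotted on a linear axis (per Gy)."""
    return (relative_risk(d_hi) - relative_risk(d_lo)) / (d_hi - d_lo)


def slope_on_log_axis(d_lo, d_hi):
    """Rise over run when dose is plotted on a log10 axis (per decade of dose)."""
    return (relative_risk(d_hi) - relative_risk(d_lo)) / (math.log10(d_hi) - math.log10(d_lo))


if __name__ == "__main__":
    # Same straight line, two one-decade windows of dose.
    print("linear axis:", slope_on_linear_axis(0.01, 0.1), slope_on_linear_axis(0.2, 2.0))
    print("log axis:   ", slope_on_log_axis(0.01, 0.1), slope_on_log_axis(0.2, 2.0))
```

On the linear axis the slope is the same in both windows. On the logarithmic axis the same line rises far less per unit of horizontal distance in the low-dose decade than in the high-dose decade, so it appears nearly flat at low doses even though the underlying response is exactly linear.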
There is a more fundamental problem with C&O's visual approach, which is used throughout their paper, for evaluating dose-response relationships: visual examination is a necessary and important first step for assessing data patterns, but it is insufficient for developing quantitatively supportable conclusions about data relationships. The conclusions of Ozasa et al. (20) about low-dose cancer risks are based on rigorous fitting and comparison of models that represent different hypotheses about the radiation induction of cancer risk [e.g., linear, linear quadratic, threshold (Fig. 1)]. This is the same approach that was used by the BEIR VII committee.
Calabrese and O'Connor suggest (p. 468) that the dose-threshold analysis for solid cancer mortality that was performed by Ozasa et al. (20) is inconsistent with an LNT model because the slope for the dose-response fit was higher below 0.1 Gy than above 0.1 Gy. C&O neglect to note that these slope differences are not statistically significant and have wide confidence intervals.
Ozasa et al. compared the LNT model, which has a zero threshold, to a linear-threshold (LT) model, which has an additional parameter for a nonzero threshold. They concluded that for no value of the threshold parameter did the LT model fit significantly better than the LNT model. Their analysis gave a best estimate of the threshold dose as 0 Gy (i.e., no threshold) and a 95% confidence interval of (0, 0.15) Gy. An analysis by Preston et al. (19) was also compatible with a 0 Gy dose threshold.
Calabrese and O'Connor state (p. 468) that “an analysis by Doss … using a more flexible model showed that the LSS data does not support a zero dose threshold and concluded that there was too much variability in the data to draw any conclusion as to the existence or absence of a threshold”. The first part of this statement is inconsistent with the second part in the usual context of hypothesis testing, which is the basis for the results obtained by Ozasa et al. In statistically testing whether or not there is a threshold, the usual default assumption (“null hypothesis”) is that there is no threshold, i.e., that the threshold is zero, as this is the simpler and more parsimonious model. However, when a term for a threshold is added to the model, if the best-fitting value for the threshold is >0 and significantly improves the fit to the data (i.e., if the confidence interval for the location of the threshold does not include zero) the default assumption of no threshold is rejected. If the default assumption is a zero-dose threshold, to say that the data “does not support a zero dose threshold” equates to saying that the data support a statistically significant parameter estimate for a nonzero threshold, which is inconsistent with the statement that “there was too much variability in the data to draw any conclusion as to the existence or absence of a threshold”.
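The logic of comparing a no-threshold model with a threshold model can be sketched as a profile over candidate thresholds. This is a simplified illustration only: it uses ordinary least squares on synthetic data (the `doses` and `excesses` values below are made up and are not from the LSS), whereas the published analyses used regression models and deviance comparisons appropriate to cohort data. The function names are hypothetical.

```python
def fit_linear_threshold(doses, excesses, threshold):
    """Least-squares fit of excess = slope * max(dose - threshold, 0).

    Returns (slope, residual_sum_of_squares). threshold = 0 is the LNT model.
    """
    xs = [max(d - threshold, 0.0) for d in doses]
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, excesses))
    slope = sxy / sxx if sxx > 0 else 0.0
    rss = sum((y - slope * x) ** 2 for x, y in zip(xs, excesses))
    return slope, rss


def profile_threshold(doses, excesses, thresholds):
    """Residual sum of squares for each candidate threshold."""
    return [fit_linear_threshold(doses, excesses, t)[1] for t in thresholds]


# Synthetic data: a linear no-threshold response plus small fixed perturbations.
doses = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0, 1.5, 2.0]
perturbations = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01, 0.01, -0.01]
excesses = [0.5 * d + p for d, p in zip(doses, perturbations)]

thresholds = [i * 0.01 for i in range(51)]  # candidate thresholds, 0 to 0.5 Gy
rss_profile = profile_threshold(doses, excesses, thresholds)
best_threshold = thresholds[rss_profile.index(min(rss_profile))]

if __name__ == "__main__":
    print("best-fitting threshold (Gy):", best_threshold)
    print("RSS at zero threshold:", rss_profile[0])
```

In a formal analysis, the no-threshold hypothesis is rejected only if some nonzero threshold improves the fit by more than a pre-specified critical amount; the confidence interval for the threshold is the set of values whose fit is not significantly worse than the best one. This sketch profiles the fit but does not compute that interval.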
The origin of the inconsistency is better understood by examining the analysis on which Doss based the cited statement. The criterion used by Doss (21) (p. 497 and Fig. 2) was based on visual inspection of the confidence intervals for risk associated with dose categories, as to whether they include zero risk, or in Doss's words, whether “lower bounds of the point-wise 95% CIs would have been below zero for low doses”. This is not a statistical test of the shape of the dose-response function, as may be appreciated from the simple observation that the widths on the risk axis of the confidence intervals depend on the widths of the dose categories on the dose axis and become arbitrarily wide as the dose categories are made narrower. Moreover, even if properly estimated confidence bands for the dose response had lower bounds below zero at low doses, this would not exclude a threshold at zero. Only upper confidence bounds at or below zero on a dose interval from zero to some dose T > 0 would exclude a threshold below T, because they would exclude risk >0 at dose <T.
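The dependence of category-wise confidence intervals on category width can be seen with a rough calculation. The sketch below assumes a normal approximation for a Poisson count and a hypothetical number of cases (`TOTAL_CASES`) spread evenly across subcategories; neither is taken from the LSS.

```python
import math


def approx_relative_half_width(cases):
    """Approximate 95% CI half-width of a Poisson-based rate, as a fraction of the rate.

    Uses the normal approximation: standard error of a count n is sqrt(n).
    """
    return 1.96 / math.sqrt(cases)


TOTAL_CASES = 400  # hypothetical number of cases in one broad dose category

if __name__ == "__main__":
    for n_bins in (1, 4, 16):
        cases_per_bin = TOTAL_CASES / n_bins
        width = approx_relative_half_width(cases_per_bin)
        print(f"{n_bins:2d} subcategories: relative half-width per subcategory = {width:.3f}")
```

Splitting a category into four subcategories roughly doubles the width of each interval, and further splitting widens them without limit. Whether a point-wise interval reaches zero therefore reflects the choice of categories as much as the shape of the dose response.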
More importantly, the article by Doss that is cited by C&O purports to give evidence not only of a threshold but also a hormetic effect. This claim is based on the assertion that the observed baseline cancer rates in the LSS are somehow 20% too low, based on a study by Hwang et al. (22) of a cohort in Taiwan that was exposed to protracted gamma radiation from contaminated concrete rebar in a residential building. The Hwang et al. study has far less statistical power than the LSS (7,271 vs. 93,746 subjects) and far less follow-up, among other concerns, starting with the fact that it represents an entirely different population with follow-up beginning in a different calendar period than the LSS. Doss's use of the Hwang et al. study to artificially adjust the baseline rates of the LSS in a risk regression is unsupportable.
Calabrese and O'Connor critique the models selected by the BEIR VII committee to generate low-dose cancer risk estimates. They note that the lung cancer studies cited in the BEIR VII report display a wide variability in estimated ERR versus average lung dose (see C&O's Fig. 5). C&O state (p. 469) that “Ideally all [ERR] estimates should be identical and should all lie within one or two standard deviations of each other”. This is not strictly correct: ERR uncertainty estimates for individual studies do not provide a statistical basis for drawing conclusions about the expected variability across studies without considerable additional assumptions and modeling. Indeed, the BEIR VII report (2) notes that “Although risk estimates from these studies vary, confidence intervals are very large and the estimates shown are therefore statistically compatible” (p. 175).
The uncertainty estimates in the individual studies cited in the BEIR VII report are typically based on sampling error (i.e., uncertainty due to the finite number of subjects in a study) and do not include additional uncertainty due to factors that may vary among studies, including uncontrolled confounders and attributes of the individual cohorts, among many others. Indeed, the BEIR VII report (2) notes that “Since the conditions of exposure, the characteristics of the study populations, and the extent and quality of the dosimetry and follow-up differ widely, the risk estimates derived for individual studies are not strictly comparable” (p. 174).
Calabrese and O'Connor note that the mean ERR estimate for lung cancer studies of medically exposed populations cited in the BEIR VII report (0.05/Gy) is 17 times smaller than the ERR estimate based on atomic bomb survivor data (0.86/Gy). C&O assert (p. 469) that the differences among the individual studies cited in the BEIR VII report illustrate “the tremendous uncertainties in estimating the risk factor for a single organ and the dangers in making any risk estimate based on this data”. However, the ERR estimates from medically exposed populations and atomic bomb survivors are not strictly comparable; the medically exposed populations may differ from atomic bomb survivors with respect to a variety of factors, for example, the mean age at exposure, fractionated vs. single exposures, and the presence of pre-existing medical risk factors in the medically exposed populations.
Calabrese and O'Connor critique the BEIR VII committee's use of ERR and excess absolute risk (EAR) models to estimate the Lifetime Attributable Risk (LAR) for various radiogenic cancer sites in U.S. populations. C&O observe that the BEIR VII LAR estimates based on the ERR and EAR models differed appreciably for some cancer sites; they assert (p. 469) that “Given that both models are essentially based on the RERF studies, one would expect reasonable agreement between the models for most cancers”. In fact, there is no reason to expect agreement between ERR and EAR models when risks are transferred from Japanese to Western populations and the background rates for a cancer site differ appreciably between the two populations. Indeed, if C&O's statement were correct the BEIR VII committee would have been far less concerned about model selection. Because information about radiogenic cancer risks for many cancer sites in Western populations is limited, the BEIR VII committee used a combination of the ERR and EAR models to develop LAR estimates and commented that
“Although it is likely that the correct transport model [of risk] varies by cancer site, for sites other than breast, thyroid and lung, the Committee judged that current knowledge was insufficient to allow the approach to vary by cancer site” (p. 276).
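Why ERR-based and EAR-based projections diverge when background rates differ can be shown with a small calculation. The sketch below is a simplification under stated assumptions: it ignores age, sex and time modifiers and life-table methods, all input values are hypothetical, and the weighted geometric mean with weight `weight_relative` is shown only as one possible way of combining the two projections, not as the committee's procedure.

```python
def transported_excess_rates(err_per_gy, baseline_source, baseline_target, dose_gy):
    """Excess cancer rate in a target population under two transport assumptions.

    relative: the ERR estimated in the source population multiplies the target baseline.
    absolute: the EAR implied in the source population is carried over unchanged.
    """
    ear_per_gy = err_per_gy * baseline_source          # EAR implied in the source population
    relative = err_per_gy * dose_gy * baseline_target  # ERR (multiplicative) transport
    absolute = ear_per_gy * dose_gy                    # EAR (additive) transport
    return relative, absolute


def weighted_geometric_mean(relative, absolute, weight_relative):
    """Combine the two projections on the log scale (illustrative weighting)."""
    return relative ** weight_relative * absolute ** (1.0 - weight_relative)


if __name__ == "__main__":
    # Hypothetical inputs: baseline rates per 100,000 person-years.
    rel, ab = transported_excess_rates(
        err_per_gy=0.5, baseline_source=10.0, baseline_target=40.0, dose_gy=0.1
    )
    print("relative transport:", rel)
    print("absolute transport:", ab)
    print("combined (weight 0.5):", weighted_geometric_mean(rel, ab, 0.5))
```

The two projections differ by exactly the ratio of the target to the source baseline rate, so they agree only when the baseline rates are equal. Both models can fit the source data equally well and still give different answers in the target population.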
Calabrese and O'Connor assert (p. 469) that the use of a dose and dose-rate effectiveness factor (DDREF) in the BEIR VII risk model “converts the LNT into a linear-quadratic or biphasic model, and provides a means of modifying the linear model without officially abandoning the LNT hypothesis”. In fact, as noted previously in this commentary, the BEIR VII committee recognized that the dose-response model is probably linear quadratic over the broader dose range (see Fig. 1), with the low-dose portion of the curve being essentially linear. Consequently, the use of the DDREF does not invalidate the linear model at low doses as C&O suggest.
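The relationship between a linear-quadratic response and a DDREF can be written out as a short derivation. This is a sketch under the assumption of a pure linear-quadratic excess risk; the symbols \alpha, \beta and D_H (the high dose at which a straight line is anchored) are introduced here for illustration and are not the committee's notation.

```latex
% Assumed linear-quadratic excess risk:
\[
  R(D) = \alpha D + \beta D^{2}
\]
% Slope of a straight line through the origin and the point (D_H, R(D_H)):
\[
  \frac{R(D_H)}{D_H} = \alpha + \beta D_H
\]
% Limiting slope at low dose:
\[
  \lim_{D \to 0} \frac{R(D)}{D} = \alpha
\]
% Ratio of the two slopes, i.e. the factor by which the high-dose slope overstates the low-dose slope:
\[
  \frac{\alpha + \beta D_H}{\alpha} = 1 + \frac{\beta}{\alpha}\, D_H
\]
```

Under these assumptions, dividing the high-dose slope by this factor recovers the low-dose linear coefficient \alpha. The adjusted low-dose model remains linear with no threshold; only its slope changes.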
REGULATORY ISSUES AND LNT
Calabrese and O'Connor scold the legislative, regulatory and scientific communities for ignoring what the authors refer to as “reality checks” (p. 471) that have challenged the LNT model for chemical carcinogens. They never explain how this discussion relates, if at all, to regulations for ionizing radiation.
More importantly, C&O's mixing of science and policy advocacy in their commentary is unhelpful for promoting the credibility of radiation science in the policy and public domains. Scientists can do grave damage to their credibility, and to the credibility of the broader scientific enterprise, when they use science to advocate for personal policy choices. Science is an important input to policy decisions on chemical and radiation regulations, but it is not the only input. Such decisions usually involve a number of nonscientific factors, for example regulatory costs and benefits as well as public preferences. The Administrative Procedure Act (Public Law 79-404) requires U.S. government agencies to provide for public participation in regulatory decision making.
The U.S. Environmental Protection Agency (EPA) has the primary responsibility for setting radiation protection standards in the United States. The BEIR reports are an important input to those standards, but the EPA uses other scientific information, including advice from its Science Advisory Board, as well as information from affected industries and interested members of the public, in its standards-setting process.
Comments by C&O such as “… the regulatory community has refused to confront the possibility that their [regulatory] decisions were grossly in error” (p. 471) are more appropriately published in OpEd articles or policy journals. In our view, they do not belong in scientific articles that could be used to inform regulatory standard-setting.
The Radiation Effects Research Foundation (RERF), Hiroshima and Nagasaki, Japan, is a public interest foundation funded by the Japanese Ministry of Health, Labour and Welfare (MHLW) and the U.S. Department of Energy (DOE). The RERF research reported in this article was funded in part through DOE award DE-HS0000031 to the National Academy of Sciences. This publication was supported by RERF Research Protocol 1-75. The views of the authors do not necessarily reflect those of the two governments. The authors are grateful to James Cleaver, John Cologne, Evan Douple and Dale Preston for their helpful comments on this paper.
 Calabrese and O'Connor state (p. 466) that their Fig. 1 contains adjusted odds ratios from Thompson et al. (2008). In fact, C&O's Fig. 1 presents the unadjusted odds ratios from Table 2 of Thompson et al. The adjusted odds ratios from Table 3 of Thompson et al. contain only one dose category (50–<75 Bq m−3) that is significantly less than 1.