Heat poses a major environmental risk to occupational safety, necessitating timely insights into associated risks to safeguard workers. In June 2022, the National Weather Service (NWS) initiated operational wet bulb globe temperature (WBGT) forecasts, offering valuable information for heat risk management. This study evaluates the effectiveness of NWS WBGT forecasts, aiming to identify areas that call for caution and improvement when they are applied to occupational safety management. To this end, the study examines 1.3 million hourly historical NWS WBGT forecast data points, comparing them with observed data from 252 weather stations across the US during the summer of 2023. The results offer key insights, revealing that: (1) the accuracy of NWS WBGT forecasts is influenced more by the times of interest than by the forecast horizons; (2) NWS WBGT forecast accuracy varies across different climates in the US, with air temperature bias being the most influential factor in this inaccuracy; and (3) while NWS WBGT forecasts accurately identify the lowest heat risks (i.e. no heat risk), their performance decreases at higher risk levels, emphasizing the importance of careful interpretation in safety management. These insights offer guidance for more cautious interpretations of NWS WBGT forecasts and lay the foundation for future studies on leveraging operational weather forecasting services in effective heat mitigation strategies.
Introduction
Timely evaluation of heat exposure is essential to mitigate one of the most critical environmental concerns affecting workers.1 Among the myriad of heat indices, such as the Universal Thermal Climate Index (UTCI)2 and Physiological Equivalent Temperature (PET),3 the Wet Bulb Globe Temperature (WBGT) index is widely recognized as an international reference for assessing environmental heat risks in occupational settings.4 Here, heat risk refers to environmental heat stress assessed by WBGT, indicating potential risks for heat-related illnesses among workers. This index incorporates weather variables such as natural wet-bulb temperature (Tnwb), air temperature (Ta), and globe temperature (Tg), forming the basis for safety guidelines.5 Occupational safety institutions like the National Institute for Occupational Safety and Health (NIOSH),6 the American Conference of Governmental Industrial Hygienists (ACGIH),7 and the International Organization for Standardization (ISO)8 provide WBGT-based heat exposure limits, such as ACGIH’s threshold limit values and NIOSH’s recommended alert limits and recommended exposure limits, along with other heat mitigation strategies, such as appropriate work/rest scheduling to accommodate highly heat-sensitive work areas and periods.6
Despite the crucial role of the WBGT index in occupational safety management, obtaining timely WBGT information in outdoor workplaces presents practical challenges. It often requires additional weather sensors, such as a black globe thermometer,9 which introduce on-site logistical and cost constraints along with the need for frequent calibrations.10 Predicting future occupational heat risks for strategic mitigation planning is even more complex. Such predictions require large-scale numerical simulations and expert adjustments to properly capture global and local weather patterns, alongside daily and seasonal fluctuations.11 To address these challenges, previous studies have explored the use of weather forecasting services to estimate predictive WBGT values as effective alternatives.12,13 One notable example is the HEAT-SHIELD project,14 which leverages forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF)15 to provide daily maximum WBGT forecasts for Europe. Despite these efforts, gaps remain in the application of such forecasts to occupational safety. Specifically, short-term forecast information, such as hourly updates, is crucial for effective safety planning, as maximum values alone do not account for diurnal variations in heat risks. Additionally, the accuracy of these forecasts has not been validated, raising concerns about their reliability for safety management purposes.
In June 2022, the US National Weather Service (NWS) initiated its operational WBGT forecasts, hereafter referred to as NWS WBGT forecasts.16 This service provides public forecasts on a 2.5 km grid for the contiguous United States, updated hourly for up to 36 hours, every 3 hours for up to 72 hours, and every 6 hours for up to 168 hours, also including Hawaii, Guam, and Puerto Rico at 3-hour intervals for up to 72 hours and 6-hour intervals for up to 168 hours.17 Recognizing the practical value of these forecasts, recent studies have assessed their accuracy against observed data, which are measurements collected directly from weather stations. For instance, Ahn et al18 evaluated the NWS WBGT data from 2018 to 2019 across the US and found average discrepancies ranging from −0.64°C to 1.46°C, while Clark et al19 examined forecast accuracy for North Carolina during the summers of 2019 to 2021, noting variations in accuracy depending on the level of heat risk and time of day.
Although these evaluations provide valuable insights, a comprehensive understanding of their role in supporting critical safety decisions requires more than just overall accuracy assessments. For example, if the forecasts perform well in non-heat-risk scenarios but falter under heat-sensitive conditions, their reliability for practical applications may be compromised. Similarly, evaluating forecast performance over different time horizons (e.g. 1 hour ahead versus 6 hours ahead) is crucial, as high fidelity in short-term forecasts but poor performance in longer-term scenarios would limit the service’s application in occupational safety management. Moreover, these evaluations were conducted before the service became operational in 2022. Since then, WBGT calculation methods have been refined in several areas, including direct and diffused solar radiation, convective heat transfer coefficients, adjustment of solar flux based on cloud cover, and the calculation of natural wet-bulb temperature.20 However, the accuracies of these updates have not yet been explored.
These gaps in knowledge motivate our study to revisit the evaluation of NWS WBGT forecasts by comparing them with observed weather data, aiming to better understand its role in occupational safety management. To this end, we have compiled an extensive dataset, resulting in 1.3 million hourly data points, including in-situ observations at 252 locations and historical NWS WBGT forecasts during the summer months of 2023 in the US. Our analysis addresses forecast horizons, times of interest, climate types, and the ability to detect occupational heat risk levels to uncover practical insights and identify areas for future research and improvement. The main contributions of this study are threefold. First, it explores hourly NWS WBGT forecast data, improving the granularity across times of interest and forecast horizons. The findings offer empirical insights into the effectiveness of NWS WBGT forecasts according to different heat risk levels. Second, the study evaluates weather biases affecting forecast accuracy across various climate types, providing insights on potential improvement plans to minimize climate-specific biases in future research endeavors and highlighting heat-risk-vulnerable climate types that require more vigilant occupational heat risk monitoring. Third, it provides empirical evidence on the importance of addressing multiple factors in assessing heat-health warning systems, beyond overall accuracy. This underscores the need to incorporate forecast horizons, times of interest, climate types, and the ability to detect occupational heat risk levels. Overall, this study contributes to improving the role of operational heat risk forecasting services in effectively managing heat exposures for occupational workers.
Materials and Methods
Our study evaluates the effectiveness of NWS WBGT forecasts by comparing them to observations from in-situ weather stations across the US. To accomplish this, we integrated data from multiple sources into a unified dataset, aligning observation locations with forecast times to enable a direct comparison, as detailed in the data collection and preprocessing section. This integration process involved collecting hourly data from June 1, 2023, to August 31, 2023, during the hours of 6:00 AM to 8:00 PM, from 252 weather stations in the US. Following data collection, WBGT values were calculated using the NWS WBGT calculation method to analyze the reliability of NWS WBGT forecasts in the context of occupational safety management. This section details the research methodology, outlining the procedures of (1) data collection and preparation, (2) WBGT calculations, and (3) an evaluation of the effectiveness of NWS WBGT forecasts, as illustrated in Figure 1.
Data collection and preprocessing
National digital forecast database (NDFD)
The NDFD integrates digital weather forecasts from the Weather Forecast Offices, the River Forecast Centers, and the National Centers for Environmental Prediction.11 Our study focused on collecting hourly data on weather variables pertinent to NWS WBGT forecasts from the NDFD, including Ta, relative humidity (RH), cloud cover, and air velocity at 10 m (Va10). The NDFD provides weather forecasts for these variables on an hourly basis for up to 36 hours and every 3 hours for up to 168 hours, with a 2.5 km gridded horizontal resolution.21 The frequency of updates can vary at the discretion of the weather forecast offices.22 The NDFD is publicly available through the National Oceanic and Atmospheric Administration (NOAA)’s data portals, accessible via Hypertext Transfer Protocol (HTTP) or File Transfer Protocol (FTP).23
Additional data for NWS WBGT computation
The NWS WBGT computation requires additional data not available in the NDFD.20 These include (1) surface air pressure, (2) surface albedo, and (3) surface roughness length. For surface air pressure, the NWS algorithm uses the 50th percentile mean sea level pressure from the National Blend of Models (NBM) and converts this to surface air pressure. This study, however, directly sourced surface air pressure from the High-Resolution Rapid Refresh (HRRR) model,24 streamlining the computation by avoiding the need for conversion. For surface albedo, the NWS algorithm employs data from the NOAA-17 Advanced Very High Resolution Radiometer (AVHRR/3), which are no longer publicly available. As an alternative, this study sourced the Visible Infrared Imaging Radiometer Suite (VIIRS) Level 3 daily gridded land surface albedo from the NOAA-20 satellite.25 Additionally, the land cover type data required for calculating surface roughness length were sourced from the Moderate Resolution Imaging Spectroradiometer (MODIS) Land Cover Type Product (MCD12Q1) version 6.1,26 then converted to surface roughness length according to Supplemental Table 1.
In-situ observation data
Observed weather data were collected from 5 weather station networks: Automated Weather Data Network27; Agriculture and meteorology28; California Irrigation Management Information System29; Soil Climate Analysis Network30; and WeatherSTEM.31 Consequently, approximately 1.8 million hourly data points for Ta, RH, Va, Global Horizontal Irradiance (GHI) and their geolocation data (latitude and longitude) were collected from 1508 stations. The data were further refined through 3 rounds of filtering. First, consistent with the NWS WBGT calculations, we eliminated hourly data recorded when the sun’s elevation was below 3° to exclude nighttime observations.20 Second, we applied quality control measures to remove statistically suspect observations. This involved filtering out hourly weather data outliers for each station’s monthly data by identifying observations with z-scores in the top or bottom 0.5% for each weather parameter over the study period. Finally, to avoid an overconcentration of data from specific regions, we randomly selected stations based on their latitudes and longitudes within a 100 km radius. Table 1 and Figure 2 show the number of stations and data points according to the Köppen climates based on climate trends from 1981 to 2000.32 The collected data indicate that Humid subtropical climate (Cfa), Cold semi-arid climate (BSk), Warm-summer humid continental climate (Dfb), and Hot-summer humid continental climate (Dfa) are the most frequent Köppen climates in the US. More details on how the Köppen climates are determined can be found in this reference.32
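The second filtering round can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `trim_zscore_tails` is hypothetical, and it assumes the filter is applied separately to each station-month series of one weather parameter, with the 0.5% tails located through standard-library quantiles of the z-scores.

```python
from statistics import mean, pstdev, quantiles


def trim_zscore_tails(values, tail_pct=0.5):
    """Drop observations whose z-score lies in the top or bottom tail_pct percent.

    `values` is assumed to hold one weather parameter for one station-month.
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant series has no outliers to remove.
        return list(values)
    z = [(v - mu) / sigma for v in values]
    # n=200 yields cut points every 0.5%; the first and last bound the tails.
    cuts = quantiles(z, n=round(100 / tail_pct))
    low, high = cuts[0], cuts[-1]
    return [v for v, zi in zip(values, z) if low <= zi <= high]


kept = trim_zscore_tails(list(range(1000)))
```

Because a z-score is a monotonic rescaling of the raw value, trimming on z-scores within one group removes the same records as trimming on the raw values; the z-score form simply mirrors the wording of the filtering step.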
Table 1.
Number of stations and data points for Köppen climates.
Matching dataset
The NDFD data, additional data for NWS WBGT computation, and in-situ observation data were aligned based on their forecast horizons and local standard times (LST). This alignment process involved matching the LST of the in-situ observations with the NDFD data across forecast horizons of 1, 6, 12, 18, and 24 hours. For example, an in-situ observation recorded on July 2nd at 12:00 in the Central Time Zone (UTC-5 hours), which corresponds to 17:00 UTC, was matched with the NDFD data by UTC issue time as follows: the 24-hour forecast data was collected from the NDFD data issued on July 1st at 17:00, the 18-hour forecast data from July 1st at 23:00, the 12-hour forecast data from July 2nd at 05:00, the 6-hour forecast data from July 2nd at 11:00, and the 1-hour forecast data from July 2nd at 16:00. This integration effort resulted in an extensive dataset of 1 300 737 hourly data points over the study period. These data points were used to calculate WBGT values, allowing for a direct comparison between the NWS WBGT forecasts and the WBGT values derived from in-situ observations across different local work hours and forecast horizons.
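The horizon matching in the example above can be sketched as follows. The function name `forecast_issue_times` is hypothetical, and the sketch assumes that the issue times quoted in the example are expressed in UTC, which is consistent with 12:00 at UTC-5 being 17:00 UTC.

```python
from datetime import datetime, timedelta, timezone

HORIZONS_H = (1, 6, 12, 18, 24)  # forecast horizons evaluated in the study


def forecast_issue_times(obs_local, utc_offset_hours, horizons=HORIZONS_H):
    """Map each horizon (hours) to the UTC issue time of the forecast
    that is valid at the local observation time."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    valid_utc = obs_local.replace(tzinfo=local_tz).astimezone(timezone.utc)
    return {h: valid_utc - timedelta(hours=h) for h in horizons}


# Observation on July 2nd at 12:00 local time, UTC-5.
issue = forecast_issue_times(datetime(2023, 7, 2, 12, 0), utc_offset_hours=-5)
for horizon, issued in sorted(issue.items(), reverse=True):
    print(f"{horizon:>2}-hour forecast issued {issued:%b %d %H:%M} UTC")
```

Keeping the valid time in UTC and subtracting the horizon avoids handling each time zone separately when stations span the country.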
WBGT calculations
The WBGT index is the most widely used thermo-physiological model for occupational heat stress assessments.33 Its outdoor calculations involve 3 variables: Tnwb, Tg, and Ta, as shown in equation (1).34
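For reference, the standard outdoor (solar load) WBGT weighting of the 3 variables named above, which equation (1) is presumed to follow, can be written as:

```latex
\mathrm{WBGT} = 0.7\,T_{nwb} + 0.2\,T_{g} + 0.1\,T_{a}
```

All temperatures are in °C. The natural wet-bulb term carries most of the weight, so errors in the inputs that drive Tnwb propagate strongly into WBGT.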
The NWS employs equation (2) for Tnwb calculations in its WBGT computation mechanism.20
Where Twb represents the thermodynamic wet bulb temperature in °C. To ensure consistency in comparing in-situ observations with NWS forecasts, this study uses a formula35 for Twb calculation, as shown in equation (3).
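As an illustration of how Twb can be obtained from Ta and RH alone, the sketch below uses the widely cited empirical fit by Stull (2011). Whether this is the formula cited as reference 35 is an assumption; the function name `wet_bulb_stull` is hypothetical.

```python
import math


def wet_bulb_stull(ta_c, rh_pct):
    """Thermodynamic wet-bulb temperature (deg C) from air temperature (deg C)
    and relative humidity (%), using the Stull (2011) empirical fit.

    The fit was derived for standard sea-level pressure and is intended for
    roughly 5%-99% RH and -20 to 50 deg C; it is an assumed stand-in for the
    study's equation (3).
    """
    return (
        ta_c * math.atan(0.151977 * math.sqrt(rh_pct + 8.313659))
        + math.atan(ta_c + rh_pct)
        - math.atan(rh_pct - 1.676331)
        + 0.00391838 * rh_pct ** 1.5 * math.atan(0.023101 * rh_pct)
        - 4.686035
    )


twb = wet_bulb_stull(20.0, 50.0)  # about 13.7 deg C
```

Applying the same Twb formula to both the forecast inputs and the station observations keeps the comparison focused on forecast error rather than on differences between wet-bulb approximations.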
For GHI calculation, in accordance with the NWS WBGT computation mechanism, the daily maximum of solar radiation flux (GHImax) is calculated based on the Environment Canada weather forecast model.20 To accommodate the impact of cloud cover on solar radiation flux, this value is subsequently refined using equation (4).
Where, n represents the cloud cover percentage, expressed as a fraction (0.0-1.0). Subsequently, the hourly solar radiation flux is determined using a Gaussian distribution, under the assumption that the maximum daily solar radiation flux occurs at noon.36
For Va calculation, given that the NDFD only provides Va10 (i.e. air velocity at 10 m), equation (5) is employed, where z0 represents the roughness length in meters, derived from the MODIS MCD12Q1 land cover data.
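A common way to bring a 10 m wind speed down to sensor height with a roughness length is the neutral logarithmic wind profile. The sketch below assumes that form and a 2 m target height; both are illustrative assumptions, and the exact expression used in equation (5) may differ. The function name `wind_at_height` is hypothetical.

```python
import math


def wind_at_height(va10, z0, z=2.0, z_ref=10.0):
    """Scale wind speed from z_ref (m) to z (m) with a neutral log profile.

    va10: wind speed at z_ref in m/s; z0: roughness length in m.
    The 2 m default target height is an assumption, not taken from the study.
    """
    if not 0.0 < z0 < min(z, z_ref):
        raise ValueError("roughness length must be positive and below both heights")
    return va10 * math.log(z / z0) / math.log(z_ref / z0)


va = wind_at_height(5.0, z0=0.1)  # 5 m/s at 10 m over terrain with z0 = 0.1 m
```

Rougher surfaces (larger z0) reduce the near-surface wind more strongly, which is why the land cover type matters for the estimated Va.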
Tg is derived from the equation (6).37
Once B and C are determined using equations (7) and (8), the remaining value to be determined from the quartic equation is Tg. This value is calculated under the assumption that the real solution to the quartic equation is the one nearest to Ta.37
Where fdif is the diffuse radiation constant, defined as the sky cover percentage, with a minimum value set at 0.25; fdb is the direct radiation constant, calculated as 1 − fdif; σ represents the Stefan-Boltzmann constant (i.e. 5.67 × 10−8 W/m2·K4); z is the zenith angle in degrees, determined using the Solar Position and Intensity (SOLPOS) algorithm38; and εa is the thermal emissivity, calculated using equation (9).
Where ea is the atmospheric vapor pressure, determined by equation (10).39
Where, P is the surface air pressure in millibars, sourced from the HRRR dataset,24 and Td is the dew point temperature in °C, calculated using equation (11).40
The WBGT heat risk levels are evaluated using the following reference criterion outlined in Table 2,41 originally proposed by the armed services, which targets acclimated average-sized people. This criterion is also used by the Occupational Safety and Health Administration (OSHA).42 This criterion was chosen for its detailed safety guidelines, including work/rest ratios and water intake recommendations, which offer actionable insights for managing occupational safety.6 This study investigates the performance of NWS WBGT forecasts, focusing on their ability to accurately assess heat risk levels using this criterion.
Table 2.
Work/rest ratios and water intake based on WBGT index and work intensities.6
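Mapping a WBGT value to a heat risk level can be sketched as a simple threshold lookup. Only the Level 0 bound (below 25.6°C), the Level 1 range (25.6°C-27.7°C), and the Level 5 bound (over 32.2°C) are stated in this article; the intermediate bounds of 27.8°C, 29.4°C, and 31.1°C are assumed here from the commonly published armed services heat categories (82°F, 85°F, and 88°F). The function name `heat_risk_level` is hypothetical.

```python
from bisect import bisect_right

# Lower bounds (deg C) of Levels 1 to 5.
# 25.6 and 32.2 come from the article; 27.8, 29.4, and 31.1 are assumed.
LEVEL_LOWER_BOUNDS_C = (25.6, 27.8, 29.4, 31.1, 32.2)


def heat_risk_level(wbgt_c):
    """Return the heat risk level (0 to 5) for a WBGT value in deg C.

    A value exactly on a bound is assigned to the higher level, which is a
    conservative tie-breaking assumption.
    """
    return bisect_right(LEVEL_LOWER_BOUNDS_C, wbgt_c)


levels = [heat_risk_level(w) for w in (20.0, 25.6, 27.7, 30.0, 33.0)]
```

Because adjacent levels are only about 1.1°C to 2.2°C apart under these bounds, a forecast error of a few degrees can shift the assigned level by one or more categories.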
Evaluating the effectiveness of NWS WBGT forecasts
The effectiveness of NWS WBGT forecasts is assessed through the comparison of matched WBGT values derived from observations and forecasts across 5 time horizons (1, 6, 12, 18, and 24 hours), focusing on (1) the numerical closeness between observed and forecasted WBGT values and (2) the detection performance of heat risk levels. Forecast accuracy at each hourly observation is determined using root-mean-square error (RMSE) and mean bias error (MBE), based on formulas,43 according to equations (12) and (13):
Where Fi represents the forecasted value for the ith data point, Oi is the observed value for the ith data point, and n is the total number of data points. The performance of NWS WBGT forecasts in identifying occupational heat risk levels is measured using a row-normalized confusion matrix, which identifies discrepancies between forecasted and observed WBGT heat risk levels. The confusion matrix is widely used to assess detection performance.44,45 Given the prevalence of data points at the lowest heat risk level, which varies across Köppen climates, this study adjusts the confusion matrix by normalizing each row. This process involves dividing the counts by the row totals and converting the results to percentage form to better reveal the performance of NWS WBGT forecasts across the different heat risk levels and Köppen climates.
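The two error metrics and the row normalization can be sketched as below. The sketch uses the standard definitions, RMSE as the square root of the mean squared forecast-minus-observation difference and MBE as the mean of that difference. Two points are assumptions rather than statements from the article: the sign convention (positive MBE means over-forecasting) and the layout of the matrix (rows are observed levels, columns are forecasted levels).

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between paired forecast and observed values."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def mbe(forecast, observed):
    """Mean bias error; positive when forecasts exceed observations (assumed sign)."""
    n = len(observed)
    return sum(f - o for f, o in zip(forecast, observed)) / n


def row_normalized_confusion(observed_levels, forecast_levels, n_levels=6):
    """Confusion matrix in percent, each row (observed level) summing to 100."""
    counts = [[0] * n_levels for _ in range(n_levels)]
    for o, f in zip(observed_levels, forecast_levels):
        counts[o][f] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observations are left as zeros instead of dividing by 0.
        matrix.append([100.0 * c / total if total else 0.0 for c in row])
    return matrix


cm = row_normalized_confusion([0, 0, 1, 1], [0, 1, 1, 1], n_levels=2)
```

With row normalization, the diagonal entry of each row reads directly as the share of observed cases at that level that the forecast placed at the same level, independent of how common the level is.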
Results
Occupational heat vulnerabilities based on Köppen climates
Figure 3 features a blue line representing the mean of hourly WBGT values, encompassed by shaded areas indicating the upper and lower quartiles. Furthermore, colored horizontal dotted lines highlight the WBGT levels ranging from Level 1 (25.6°C-27.7°C) to Level 5 (over 32.2°C). Generally, the WBGT trends form a bell-shaped curve, peaking between 12:00 and 14:00 across all climate types, reflecting the typical pattern of occupational heat exposure in the US. Notably, climate types “Csb,” “Dfb,” “Dsa,” “Dsb,” “Dwb,” and “ET” have upper quartile values that do not surpass Level 1, indicating relatively lower occupational environmental heat risks. Furthermore, in climate types “BSk,” “BWk,” “Cfb,” “Csa,” “Dfa,” and “Dwa,” the mean and lower quartile WBGT values mostly remain below Level 1, but the upper quartile may occasionally surpass Level 1. The most pronounced occupational heat risks are observed in climate types “Am,” “BSh,” “BWh,” and “Cfa,” with working hours often ranging between Level 1 and Level 5. Additionally, the data reveal variability in the interquartile ranges of WBGT across these climates, with “Am” showing a narrower interquartile range, indicating consistent occupational heat risk, while “BSh,” “BWh,” and “Cfa” display wider ranges, suggesting more daily variations in occupational heat risks. Figure 4 represents the occupational heat risk level distributions across Köppen climates, where each colored bar indicates the proportion of WBGT heat risk levels. In climates “Dsb” and “ET,” the majority of heat risk levels are at Level 0 (below 25.6°C), indicating lower occupational heat risks. In contrast, climates “Am,” “BSh,” “BWh,” “Cfa,” and “Csa” predominantly face heat risks above this threshold, emphasizing distinct occupational heat vulnerabilities. Notably, “Am,” “Cfa,” and “Csa” show relatively consistent variances in heat risk level distributions, while “BSh” and “BWh” exhibit increasing trends in percentages across the risk levels with Level 5 (over 32.2°C) being the most prevalent.
Accuracy of NWS WBGT forecasts compared to in-situ observations
Given that WBGT is a function of the weather variables, the accuracy of NWS WBGT forecasts is intrinsically associated with the accuracy of its input weather variables: Ta, RH, Va, and GHI. Table 3 outlines the MBE and RMSE for these forecasts over 5 time horizons (1, 6, 12, 18, and 24 hours) and 7 local standard times (6:00-20:00 at 2-hour intervals) for the 4 weather variables and WBGT forecasts. The MBE and RMSE for Ta vary from −0.04°C to −1.84°C and from 1.85°C to 3.13°C, respectively. For RH, these metrics range from −0.50% to 6.43% and from 9.51% to 15.35%, respectively; for Va, from 0.01 to −1.20 m/s and from 2.22 to 2.95 m/s. Ta and RH show the most pronounced errors at 20:00 and the highest accuracy at around noon, whereas Va shows the best performance at 6:00 and 8:00, deteriorating between 16:00 and 20:00. GHI discrepancies are notably higher, with MBE and RMSE ranging from 34.08 to 325.77 W/m2 and from 63.67 to 384.69 W/m2, respectively. Overall, the MBE and RMSE for WBGT range from 0.01°C to 1.77°C and from 1.91°C to 3.00°C, respectively, with the most significant inaccuracies observed at 8:00 and the smallest between 12:00 and 18:00.
Table 3.
Comparative analysis of MBE and RMSE metrics for Ta, RH, Va, GHI, and WBGT across 5 forecast horizons (1, 6, 12, 18, and 24 hours) at 7 local standard times (6:00-20:00 at 2-hour intervals).
Figure 5 illustrates how these biases in NWS WBGT forecasts vary across Köppen climates. While MBE remains around ±2.0°C and RMSE remains within 3.0°C in most Köppen climates, the climates “Cfb,” “Dsa,” and “Dsb” exhibit larger errors with a decreasing trend, peaking at 6:00, reaching their minimum around 12:00, and continuing to decrease until 20:00. This feature is more evident in RMSE trends, where “Cfb” and “Dsa” display an inverted bell curve, with the best performance around 12:00. In contrast, “Dsb” experiences the largest errors at 6:00, which subsequently decrease but remain relatively large compared to other climates. Next, this study explores how the identified biases of NWS WBGT forecasts across Köppen climates misrepresent occupational heat risks and identifies the underlying weather factors that require further improvements in accuracy.
Performance in detecting occupational heat risk levels
Figure 6 compares observed and forecasted WBGT risk levels across Köppen climates using normalized confusion matrices. Except for the “Am” climate, the NWS WBGT forecasts demonstrate high accuracy in detecting heat risks at Level 0 (below 25.6°C), with success rates of 86.4% for “BWh” and above 90% for other climates. However, for heat risk levels beyond Level 0, detection rates are generally below 50%. In the “BSh,” “BWh,” and “Cfa” climates, which are vulnerable to work-related heat risks as evidenced in Figure 4, the accuracy for predicting higher heat risk levels (equal to or exceeding Level 1) is not as reliable. Forecasts in “BWh” and “Cfa” tend to underestimate risks, whereas “BSh” forecasts are more likely to overestimate them.
We further investigate the causes of this low performance by analyzing the impact of discrepancies between in-situ observations and NWS weather forecasts (i.e. Ta_diff, RH_diff, Va_diff, and GHI_diff) on WBGT_diff (i.e. the difference between observed and forecasted WBGT values). These analyses utilize beta coefficients derived from regression analyses, with results presented in Figure 7. The symbol “*” denotes statistical significance with a P-value below .05, and the values in parentheses under each x-axis label represent the variance inflation factor (VIF) values. The R² values, ranging from 0.88 to 0.96, show strong linearity in explaining variances in WBGT forecast bias, and VIFs, not exceeding 3.6, suggest minimal multicollinearity among variables.46 The analysis identifies Ta_diff as the most influential factor, with the second most influential factor alternating between RH_diff and GHI_diff, whereas Va_diff has the least influence. Notably, in climates “Cfb,” “Dsa,” and “Dsb,” Ta_diff significantly influences WBGT_diff.
Discussion
This study highlights the importance of comprehensive assessments in occupational heat risk forecasts, specifically focusing on the performance of NWS WBGT forecasts with respect to local time, forecast horizons, and the ability to alarm occupational heat risks across Köppen climates. Utilizing in-situ observations from 252 weather stations and historical NWS WBGT forecasts from the summer of 2023 in the US, the study underscores the necessity for increased vigilance toward occupational heat threats in certain Köppen climates in the US, such as “Am,” “BSh,” “BWh,” “Cfa,” and “Csa,” where heat risk levels during working hours often exceed Level 0. These climates demand more regular monitoring due to frequent fluctuations in WBGT heat risk levels from Level 1 to Level 5, which can lead to misconceptions about actual risk levels without adequate monitoring.
The study reveals a stronger association between NWS WBGT forecast errors and time of day than between those errors and forecast horizons. Prior research found that the RMSE for NWS WBGT forecasts in North Carolina was 1.3°C from May to September during 2019 to 2021,19 and another study identified that the average discrepancies of NWS WBGT forecasts varied from −0.64°C to 1.46°C from April to October during 2018 to 2019 in the US.18 These errors likely stem from underlying inaccuracies in the component weather forecasts, highlighting the limitations of relying on a single metric for forecast evaluation. The analysis in this study further reveals temporal trends in these errors, with specific times of the day and forecast horizons showing increased accuracy or discrepancies for different component weather forecasts, as shown in Table 3. In most Köppen climates, MBE is approximately ±2.0°C and RMSE remains within 3.0°C, as represented in Figure 5. However, the climates “Cfb,” “Dsa,” and “Dsb” exhibit relatively larger errors, peaking at 6:00, reaching a minimum around 12:00, and continuing to decrease until 20:00. While such notable errors are observed in these climates, it is also noteworthy that the time frames with greater errors, such as 6:00 and after 16:00, are most likely to be under Level 0 heat risks (i.e. no occupational heat risk), as evidenced in Figure 3.
The study also evaluates the effectiveness of NWS WBGT forecasts in identifying occupational heat risk levels. Results reveal consistently low performance in detecting WBGT heat risk levels (equal to or exceeding Level 1) across Köppen climates, with forecast performance correlating more strongly with time of day than with forecast horizons within a 1-day period, as shown in Figure 6. While NWS WBGT forecasts effectively identify the lowest heat risks (i.e. no heat risk), their performance declines at higher risk levels, often below 50%, highlighting the need for cautious interpretation of NWS WBGT forecasts in safety management. For instance, adopting a more conservative interpretation (e.g. include the one level below) of forecasts in “BWh” and “Cfa” climates could be recommended to improve accuracy, potentially exceeding 60% in all instances. The “Am” climate demonstrates lower overall forecast accuracy, indicating a need for more careful forecast interpretation. While forecasts are reliable for detecting Level 0 heat risks, a careful approach is advised for higher heat risk levels, underlining the importance of considering regional climate characteristics in interpreting NWS WBGT forecasts. Without further adjustments, such as post-processing47 or a conservative interpretation strategy, the NWS WBGT forecasts may not provide a reliable basis for evaluating occupational heat risk levels for safety management. This also highlights the need for more careful attention to its performance during heat-sensitive periods (e.g. Level 1 to Level 5), where a few degrees of error can affect the implementation of appropriate safety standards, as exemplified in Table 2.
Along with the quantitative results, which can be directly used for practical decision support regarding NWS WBGT forecasts, this study offers several practical implications. For instance, it identifies empirical causes behind the low performance by examining the impact of discrepancies between in-situ observations and NWS weather forecasts, as shown in Figure 7. In all climates, Ta_diff significantly influences WBGT_diff, which is closely related to the associated weather biases and underlying NWS WBGT calculation mechanism, which assigns greater weights to Ta. Another notable finding is that the second most influential variable is either GHI_diff or RH_diff, depending on Köppen climates. For practical scenarios, these insights provide climate-specific recommendations for minimizing relevant weather forecast biases and thus effectively improving the reliability of NWS WBGT forecasts, such as using post-processing techniques widely employed to improve weather forecasts.47 Furthermore, since operational weather forecasts are a common feature in most countries,48 the methodologies developed in this study for evaluating public weather forecast-driven occupational heat risk forecasts can be readily adapted and applied internationally. For instance, the evaluation framework used in this study could be employed to assess the performance of heat risk forecasts provided by the HEAT-SHIELD project,14 which encompasses the entire European region. Extending this evaluation approach to different operational forecast systems around the world would yield new insights into the effectiveness of these systems in managing occupational safety. This extended application could help identify specific areas where forecast accuracy needs improvement and develop tailored strategies to mitigate occupational heat risks in diverse climatic conditions.
Despite its contributions, this study has several limitations that open avenues for future research. First, the study primarily uses the NWS’s WBGT calculation mechanism for direct comparisons between NWS WBGT forecasts and in-situ observation-based WBGT. Therefore, this analysis does not address the reliability of the NWS’s WBGT calculation validated by direct measurements using additional weather sensors, such as a black globe thermometer. Future research could also explore other WBGT estimation methods, such as those proposed by Lemke and Kjellstrom,49 Bernard,50 and Liljegren et al51 to evaluate their accuracy. Second, the ability to identify occupational heat risk levels is closely linked to the standards of occupational heat levels, which may incorporate different risk thresholds. Future studies could benefit from sensitivity analyses exploring different safety guidelines, providing practitioners with practical insights for interpreting NWS WBGT forecasts under various guidelines. For example, OSHA provides recommendations on exposure limits for light work activity at 30°C, moderate work activity at 27.8°C, and heavy work activity at 26.1°C.6 Revisiting the method of this study to evaluate the performance of weather forecasts under different guidelines would provide more tailored insights for their applications. Lastly, the data used in this study is based on observations from distributed weather stations for the hours of 6:00 AM to 8:00 PM during the summer months of 2023 in the US, resulting in small samples in certain climate types (e.g. BSh, Dsa). Additional data would help address this limitation, allowing for a more comprehensive analysis that incorporates other yearly trends and extends the observation period beyond the current timeframe. Furthermore, the effectiveness of heat risk forecasts is primarily focused on outdoor work environments. Different work environments (e.g. indoors) may need to consider other environment-related conditions (e.g. indoor air temperature), requiring additional types of data for their applications in safety management.
Conclusion
To properly mitigate the forthcoming heat risks in occupational settings, predictive information on environmental heat is fundamental for developing effective measures to protect the health and safety of workers. Leveraging the benefits of public weather forecasting has the potential to address this need. In this context, this study evaluates the performance of NWS WBGT forecasts. The findings identify areas where heightened caution and improvements could enhance this understanding for proactive occupational safety management. Furthermore, this study discusses its practical implications and extensive research benefits. Overall, this research contributes to enhancing the role of operational heat risk forecasting services, ensuring better management of heat exposures for occupational workers.
Acknowledgements
The authors would like to express their sincere gratitude to Timothy Boyer from the US National Weather Service (NWS) for providing valuable information and insights regarding the NWS Wet Bulb Globe Temperature (WBGT) calculation mechanisms.
© The Author(s) 2024 SAGE Publications Ltd unless otherwise noted. Manuscript content on this site is licensed under Creative Commons Licenses
This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License ( https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages ( https://us.sagepub.com/en-us/nam/open-access-at-sage).
Author Contributions
YK conceptualized and designed the study, collected and analyzed the data, and wrote the manuscript. YH provided advice, supervised the study, and reviewed and revised the manuscript.
Availability of Data and Materials
Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.
Supplemental Material
Supplemental material for this article is available online.