Danielson, J.J.; Poppenga, S.K.; Brock, J.C.; Evans, G.A.; Tyler, D.J.; Gesch, D.B.; Thatcher, C.A., and Barras, J.A., 2016. Topobathymetric elevation model development using a new methodology: Coastal National Elevation Database. In: Brock, J.C.; Gesch, D.B.; Parrish, C.E.; Rogers, J.N., and Wright, C.W. (eds.), Advances in Topobathymetric Mapping, Models, and Applications. Journal of Coastal Research, Special Issue, No. 76, pp. 75–89. Coconut Creek (Florida), ISSN 0749-0208.
During the coming decades, coastlines will respond to widely predicted sea-level rise, storm surge, and coastal inundation flooding from disastrous events. Because physical processes in coastal environments are controlled by the geomorphology of over-the-land topography and underwater bathymetry, many applications of geospatial data in coastal environments require detailed knowledge of the near-shore topography and bathymetry. In this paper, an updated methodology used by the U.S. Geological Survey Coastal National Elevation Database (CoNED) Applications Project is presented for developing coastal topobathymetric elevation models (TBDEMs) from multiple topographic data sources with adjacent intertidal topobathymetric and offshore bathymetric sources to generate seamlessly integrated TBDEMs. This repeatable, updatable, and logically consistent methodology assimilates topographic data (land elevation) and bathymetry (water depth) into a seamless coastal elevation model. Within the overarching framework, vertical datum transformations are standardized in a workflow that interweaves spatially consistent interpolation (gridding) techniques with a land/water boundary mask delineation approach. Output gridded raster TBDEMs are stacked into a file storage system of mosaic datasets within an Esri ArcGIS geodatabase for efficient updating while maintaining current and updated spatially referenced metadata. Topobathymetric data provide a required seamless elevation product for several science application studies, such as shoreline delineation, coastal inundation mapping, sediment-transport, sea-level rise, storm surge models, and tsunami impact assessment. These detailed coastal elevation data are critical to depict regions prone to climate change impacts and are essential to planners and managers responsible for mitigating the associated risks and costs to both human communities and ecosystems. The CoNED methodology approach has been used to construct integrated TBDEM models in Mobile Bay, the northern Gulf of Mexico, San Francisco Bay, the Hurricane Sandy region, and southern California.
The coastal land/water interface, or coastal zone, is a critical area because of its influence on shoreline mapping, geomorphic change analysis, and hydrodynamic modeling (Brock, Danielson, and Purkis, 2013; Buxton et al., 2013; Eakins et al., 2011; Gesch and Wilson, 2001). To better understand the coastal zone, consistently integrated topographic and bathymetric (topobathymetric) data are needed for many coastal applications (Danielson et al., 2013; Eakins and Grothe, 2014; Gesch and Wilson, 2001; Gesch, Gutierrez, and Gill, 2009; Leon et al., 2013; Poppenga et al., 2014; Stoker et al., 2009; Turnipseed et al., 2007; Tyler et al., 2007; Zhang et al., 2015). For example, near-shore topography and bathymetry reflect physical processes that are controlled by the geomorphology of both land elevation and water depth; therefore, onshore-offshore, cross-ecosystem characteristics are useful for understanding the likely impacts of global natural hazards (Gesch, Gutierrez, and Gill, 2009; Thatcher et al., 2016) and support scientific research assessing the impacts of various climate change scenarios on coastal regions (Buxton et al., 2013).
Although topobathymetric coastal digital elevation models (TBDEMs) are not new, there are unique challenges when attempting to integrate them (Davidson and Miglarese, 2003; Eakins and Grothe, 2014; Freeman, Bernstein, and Mitasova, 2004; Gesch and Wilson, 2001; Gesch, Gutierrez, and Gill, 2009; Medeiros et al., 2011; Mitasova et al., 2003). Because TBDEMs are three-dimensional (3D) renderings of disparate land and water depth data, the unique multi-temporal and spatially variable characteristics of each input data source can cause integration issues for aligning vertically and horizontally to common reference systems (Danielson et al., 2013; Gesch and Wilson, 2001; Medeiros et al., 2011). A consistent framework is needed to account for best practices in datum transformations, data buffering, and digital elevation model (DEM) accuracy and uncertainty (Amante and Eakins, 2016; Eakins and Grothe, 2014).
In this paper, an updated methodology used by the U.S. Geological Survey (USGS) Coastal National Elevation Database (CoNED) Applications Project (U.S. Geological Survey, 2015) is presented for developing seamlessly integrated coastal TBDEMs from multiple topographic data sources including adjacent intertidal topobathymetric and offshore bathymetric sources. This repeatable, updatable, and logically consistent methodology assimilates land elevation and water depth into seamless coastal TBDEMs. Within the overarching framework, vertical datum transformations are standardized in a workflow that interweaves spatially consistent interpolation (gridding) techniques with a land/water boundary mask delineation approach. Output gridded raster TBDEMs are stacked into a file storage system of mosaic datasets within an Esri ArcGIS geodatabase for efficient updating while maintaining current and updated spatially referenced metadata. TBDEMs are important for shoreline delineation, coastal inundation mapping, sediment-transport modeling, sea-level rise assessment, storm surge models, tsunami impact assessment, and analysis of the impact of various climate change scenarios on coastal regions (Buxton et al., 2013).
Topobathymetric Digital Elevation Model Integration
Coastal TBDEM integration seeks to improve data usability so that impacts of flooding, hurricanes, or other catastrophic events can be mapped more accurately and planning can be more effective (Eakins et al., 2011). However, coastal TBDEM integration is a complex issue because of the dynamic nature of water levels at the land/water interface (Gesch and Wilson, 2001) and the lack of consistent vertical surfaces between disparate data sources, which stems from varying orthometric and tidally referenced datums. An additional challenge is the acquisition of bathymetric data where surf conditions and shallow terrain features make the operation of bathymetric survey vessels hazardous (Hogrefe, Wright, and Hochberg, 2008). These issues are compounded by numerous technical challenges when merging disparate topographic (lidar, topographic maps) and bathymetric (acoustic sonar, hydrographic soundings, bathymetric charts) data collected with different types of survey instruments and at varying sampling densities over a single large region. Transposed positive and negative signs, data gaps at the coastal zone, or variations in horizontal and vertical units are also problematic (Eakins and Grothe, 2014; Quadros, Collier, and Fraser, 2008). The resampling and merging process can significantly degrade the vertical and horizontal accuracy of an integrated DEM (Yin et al., 2008) compared to the original source data, and can result in artifacts caused by misclassification of features types, reducing the usefulness of the DEM for cartographic and hydrologic applications.
Vertical Datums and Transformations
A major issue in TBDEM development is the differences in vertical datums (Davidson and Miglarese, 2003; Eakins and Grothe, 2014; Gesch and Wilson, 2001; Hogrefe, Wright, and Hochberg, 2008; Medeiros et al., 2011; Quadros, Collier, and Fraser, 2008; Stoker et al., 2009). Bathymetric data are typically referenced to tidally referenced datums, such as Mean Lower Low (MLLW). Topographic elevation data are typically referenced to an orthometric vertical datum, such as the North American Vertical Datum of 1988 (NAVD88). Prior to TBDEM integration, all tidally referenced heights first must be transformed into orthometric heights that are normally used for mapping elevation on land. In other words, coastal TBDEM development usually requires the conversion of source elevation data to a common vertical datum to avoid errors and discontinuities in the elevation data at the land/water interface (Davidson and Miglarese, 2003; Eakins et al., 2011; Eakins and Grothe, 2014; Gesch and Wilson, 2001; Hogrefe, Wright, and Hochberg, 2008; Medeiros et al., 2011; NOAA, 2007; Thatcher et al., 2016).
VDatum, a tool developed jointly by the National Oceanic and Atmospheric Administration's (NOAA's) National Geodetic Survey (NGS), Center for Operational Oceanographic Products and Services (CO-OPS), and Office of Coast Survey (OCS), performs transformations among tidal, orthometric, and (3D) ellipsoid-based datums and resolves vertical inconsistencies among diverse bathymetric, topographic, and shoreline data (Eakins et al., 2011; Gesch and Wilson, 2001; Milbert and Hess, 2001; Myers, 2005; NOAA, 2008; Parker et al., 2003). This transformation tool employs models, both grid-based and parametric, of the spatially-varying relationships between 36 different tidal, orthometric, and 3D datums (Eakins et al., 2011). The models that underpin the VDatum software tool are now available for the conterminous United States, providing an essential capability to seamlessly integrate land and water elevations that were originally referenced to various vertical datums (Thatcher et al., 2016).
There are several interpolation techniques for gridding elevation data, including spline, inverse distance weighting (IDW), natural neighbors, triangulation, and kriging, all of which create markedly different DEM surfaces when built from the same source data (Eakins and Grothe, 2014; Maune et al., 2007). Many of these interpolation techniques can introduce artifacts into coastal DEMs, such as edge effects or topographic creep (Eakins and Grothe, 2014), into unsurveyed marine areas. For example, in a random split-sample analysis of bathymetric data, Amante (2012) concluded that spline was more accurate over large interpolation distances, whereas IDW and triangulation interpolation techniques had effectively equivalent accuracies over short interpolation distances. Eakins and Grothe (2014) indicated that in coastal areas with a cliff or steep slope, a spline minimum-curvature interpolation may create a deep, false trough if bathymetric data are some distance offshore. To avoid this issue, they interpolated (without topographic data) between offshore bathymetry measurements and an accurate, detailed coastline, with the assumption that the land/water interface is relatively smooth with a constant slope.
Zhang et al. (2011) proposed an IDW interpolation method that considered the distance and uncertainty together. However, Zhang et al. (2015) indicated that the IDW method is only suitable for high-density multisource DEMs and does not account for the spatial continuity and spatial distribution for smaller data density. According to Johns (1998), the IDW method may present the “bull's-eye” effect, which describes obvious uplift or sunken regions.
Zhang et al. (2015) indicated that the ordinary kriging method, which uses covariance and a variogram to determine the relationship between the elevation variable and the distance, is used to interpolate (coastal) areas where data from different sources overlap. However, this interpolation method does not consider the accuracy differences while interpolating grid-based intertidal zones. Intertidal terrain changes present anisotropic characteristics (Chen et al., 2011), or elevation variation differences in orthogonal directions (Brown and Bara, 1994; Eakins and Grothe, 2014). Thus, Zhang et al. (2015) concluded that the data accuracy differences (uncertainty) and anisotropy (distance) of the intertidal terrain should be considered together. As such, they proposed a method that combines the optimal trend surface and kriging interpolation to account for anisotropy and to improve the overall accuracy of the model by using the optimal equivalent weight to reduce the influence of the accuracy difference of different data sources. The resulting residual surface defines the difference between the trend surface and the true terrain, thereby obtaining information of the local terrain by the kriging method (Zhang et al., 2015).
Because of the challenges associated in modeling elevation data derived from Real-Time Kinematic Global Positioning Systems (RTK-GPS) and single-beam sonar, Freeman, Bernstein, and Mitasova (2004) also applied a universal kriging method that incorporated an anisotropy factor and other parameters into the gridding procedure, which substantially improved the accuracy of the resulting DEM and the repeatability of the interpolation process.
To promote consistency among TBDEM development and to retain the accuracy of the original source data, techniques have been developed to spatially interpolate bathymetric single-beam, multi-beam, and hydrographic survey source data. Ship-mounted acoustic survey instruments such as single-beam and multi-beam sonar systems acquire data in a denser regular pattern than previous hydrographic or trackline surveys where the point pattern is more irregular, with tightly spaced points within the linear track and widely spaced points between tracklines. These techniques resolve the challenges that arise when interpolating point data acquired in a non-random pattern. A more effective approach uses an empirical Bayesian kriging algorithm, which is a geostatistical interpolation method that accounts for the error in estimating the underlying semivariogram through repeated simulations (Pilz and Spöck, 2007; Zhang et al., 2015).
Coastal Digital Terrain Models
In the scientific literature, methods have been employed to assimilate topographic and bathymetric data into a seamless surface model (Eakins and Grothe, 2014; Gesch and Wilson, 2001; Hogrefe, Wright, and Hochberg, 2008; Medeiros et al., 2011). For example, Hogrefe, Wright, and Hochberg (2008) derived coastal terrain models by mosaicking bathymetry acquired from 5-m spectral imagery with 10-m DEMs. Although the 10-m DEMs were previously merged with bathymetric data, there were shallow water data gaps that prevented a seamless surface. Therefore, a gap-fill expression was iteratively applied but some large voids remained from extensive areas of clouds necessitating estimating mean values of surrounding grid cells. NoData cells were resolved by assigning a mean value from a 6x6 average moving window analysis (Hogrefe, Wright, and Hochberg, 2008) that did not change the original data values.
Other types of coastal terrain methodologies employ digital terrain models (DTMs) to generate coastal TBDEMs (Kearns, 2005). As defined in the USGS Lidar Base Specifications version 1.2 (Heidemann, 2014), a DTM is a vector dataset composed of 3D breaklines and regularly spaced 3D mass points that characterize the shape of the bare-earth terrain. A DTM is not a surface model as its component elements are discrete and not continuous; thus, a triangulated irregular network (TIN) or DEM surface must be derived from the DTM. ArcGIS (Esri, Redlands, CA) can efficiently store large DTMs using a geodatabase structure framework called a terrain dataset, which comprises multipoint feature datasets within a file geodatabase. Because of the sheer volume of lidar and bathymetric point data, the terrain dataset is useful for organizing the data points into an efficient and logical order of varying resolutions and vertical tolerances, and can generate TINs on the fly. Kearns (2005) indicated that 3D raster-based elevation models can be generated from terrain dataset source data, rather than derived datasets.
Eakins and Grothe (2014) stated that the inclusion of terrain parameters, such as slope and aspect, or terrain features (breaklines representing rivers and valleys), creates a superior type of DEM. Maune et al. (2007) indicated that DTMs are technically superior to standard DEMs for many applications. One of the first integrated TBDEMs to be created was developed by Gesch and Wilson (2001) for Tampa Bay, Florida. A user-defined shoreline was used based upon an interface of zero/nonzero elevations from the National Elevation Dataset (NED) (Gesch et al., 2002; Gesch and Wilson, 2001). A surface interpolation was employed in a buffered overlap area by including points from near-shore bathymetry and near-shore topography.
Medeiros et al. (2011) generated DTMs of lidar and hydrographic survey data to produce a seamless integrated TBDEM in the vicinity of Tampa Bay, Florida. They noted that creating DTMs requires substantial manual processing of the data to ensure its applicability and accuracy. Because VDatum (NOAA, 2008) was not available for their entire study area, they used NOAA Tidal Benchmark Stations to vertically transform the input source data and the NOAA Medium Resolution Shoreline and study boundary to constrain the DTM model.
The process to separate land and water for TBDEM development is time-consuming and has been traditionally accomplished by heads-up digitizing using high-resolution aerial photography. In the topobathymetric model, the land/water interface is a critical area since this is where the topography and bathymetry data are united along the near-shore. From a data acquisition standpoint, boats cannot acquire data in shallow water (Eakins and Grothe, 2014) and water conditions are typically turbid with consistent white cap surf conditions. Before interpolation (gridding), the land and water points must be separated to remove overlapping data points and to avoid the introduction of topographic creep and other edge artifacts (Eakins and Grothe, 2014). Eakins and Grothe (2014) point out that many datasets contain zero values to represent the water surface. An effective way to remove the water surface from the data is to create a detailed bare earth coastline of the area that can be used to clip out the water values.
A new and improved automated method for deriving the land/water interface employs a Minimum Convex Hull (MCH) algorithm directly from the perimeter lidar points. The details for this method are discussed in the “Topobathymetric Methods” section of this paper, which provides a repeatable and systematic method for deriving the land/water boundary from topographic (lidar) data. High-resolution integrated coastal TBDEMs are being developed by the USGS Coastal National Elevation Database (CoNED) Applications Project as a collaborative effort between the USGS Coastal and Marine Geology Program, the USGS National Geospatial Program, and NOAA National Centers for Environmental Information. CoNED Applications Project TBDEM development is focused in select regions along the U.S. coast, such as in the northern Gulf of Mexico, the Hurricane Sandy region (Buxton et al., 2013), San Francisco Bay, the Pacific Northwest, and the North Slope of Alaska (Gibbs, Nolan, and Richmond, 2015). These CoNED TBDEMs are developed using a methodology that incorporates systematic gridding techniques for handling spatial data with varying point spacing and densities, a new technique for land/water masking, and a geodatabase storage solution for easy management, manipulation, and updating of overlapping and adjacent raster elevation data.
The principal methodology for developing CoNED integrated TBDEMs is organized into three main components. The “topography component” consists of the land-based elevation data, which is primarily composed of lidar data. Topography source data include lidar data from different sensors; the most common spectral wavelengths are near-infrared (NIR) (1,064 nm) for topographic and green (532 nm) for topobathymetric lidar data. The “bathymetry component” consists of hydrographic sounding (acoustic) data acquired using boats and bathymetry acquired from topobathymetric lidar. The most common forms of acoustic bathymetry used include multi-beam, single-beam, and swath. The final component, “integration,” encompasses the merging of the topographic and bathymetric data along the near-shore based on a predefined set of priorities. The land/water interface (−1.5 m – +1 m) is the most critical area, and the green laser lidar systems, such as the Experimental Advanced Airborne Research Lidar (EAARL-B) and the Coastal Zone Mapping and Imaging Lidar (CZMIL) that cross the near-shore interface, are valuable in developing a seamless elevation model transition. The end products from the topography and bathymetry components are rasters with associated spatial masks and metadata that can be passed to the integration component using a geodatabase (mosaic dataset) for final model incorporation.
For CoNED models, the vertical control in the model is referenced to NAVD88. Horizontal control is referenced to the North American Datum of 1983 (NAD83) horizontal datum and the Universal Transverse Mercator (UTM) projection. The topobathymetric methods framework was developed in Python 2.7.5 using Geospatial Data Abstraction Library and Esri ArcGIS data management, spatial, 3D, and geostatistical analyst extension libraries. The geospatial functions in the data management extension are used to create and manage geodatabases, such as the LAS Dataset and mosaic dataset. The spatial analyst tools are used in image processing operations, and the 3D extension tools are used to access and manipulate lidar point cloud data in ArcGIS. The geostatistical analyst tools, such as Cross-Validation and Empirical Bayesian Kriging (EBK), are used in the general manipulation, analysis, and interpolation of point-based data. VDatum and LP360 (GeoCue Group) are software packages also used in the framework workflow but are external to the centralized Python code.
There are eight primary steps in the topography processing component (Figure 1). The first step is a quality control check of the vertical and horizontal datum and projection information of the input lidar source to ensure the data are referenced to NAVD88 and NAD83 (UTM). All input source data are inspected for completeness, spurious anomalies, metadata, and accuracy. The accompanying vendor reports and metadata are important for assessing acquisition dates, nominal point spacing, dataset classification, accuracy statements, and processing steps. If the source data are not referenced to NAVD88, the input lidar data are transformed to the NAVD88 reference frame using current NGS geoid models. If required, the input source topographic data are projected to UTM. The second step is to verify that the data are classified with the minimally required classes of ground (class 2), water (class 9), low points and noise (class 7), and withheld points (class 11). If necessary, the point clouds are classified using GeoCue LP360-Classify or other equivalent lidar processing software. Step three applies breaklines where available to delineate and hydroflatten internal water bodies, such as lakes, ponds, and rivers.
Step four extracts the ground returns from the classified lidar data and takes a random spatial subset of 5% of the total points. The ArcGIS command Subset Features in the Geostatistical Analyst extension is used for this step. Ninety-five percent of the points are used to construct a terrain model along with associated breaklines and land/water masks to generate the topographic surface, while the random subset of points is used to compute the interpolation accuracy from the derived surface as Root Mean Square Error (RMSE). The RMSE, as described in Maune et al. (2007), is a commonly used metric to express vertical accuracy of elevation datasets (Equation 1). Step five in the topographic component is to create the land/water boundary mask using the MCH method to separate topographic points on land from bathymetric points in the water. The MCH algorithm is further highlighted and explained in the upcoming section.
Minimum Convex Hull (MCH) Algorithm – Land/Water Boundary Mask
To improve the efficiency of coastal TBDEM generation, a robust automated geoprocessing method that extracts the perimeter land/water boundary from the exterior classified ground lidar points for large regions based on the lidar source data is presented. This spatial mask is used to constrain the terrain model and remove extraneous artifacts outside of the extent of the ground lidar points (i.e., points incorrectly classified as ground over the water). The land/water boundary masks are needed to remove topographic lidar returns from the water surface and to create openings to permit actual bathymetry values to be passed into the TBDEM model during the integration phase. For example, bathymetry data exist for the lower Mississippi River and several navigation channels in coastal Louisiana. A detailed land/water mask is required to remove topographic lidar returns from the channels so that the submerged topography can be included in the assimilated TBDEM. The Geographic Information Systems (GIS) and image processing procedures to construct the MCH land/water mask are as follows:
Using a LAS Dataset geodatabase model in ArcGIS, the average point spacing value is computed from the input lidar point cloud tiles using the 3D Analyst function Point File Information. The tile-based point spacing values are summarized into an average point spacing value for the entire lidar project. Figure 2A displays an example lidar point cloud in a near-shore coastal area.
The average point spacing number is multiplied by 4 to grow and expand areas void of ground points. The empirical scale factor of 4 provides the optimal cell size for the upcoming density raster as it allows areas void of ground points to be further extended and enlarged. A scale factor of 4 was selected after empirical testing with different factor values using several lidar datasets with varying point densities in focus regions such as Mobile Bay, southern Louisiana, and the Hurricane Sandy region.
Using the computed average point spacing, as defined in step two, as the cell size, a density raster from the lidar point count attribute information is created. Areas in the lidar point cloud with sparse points or that are void of ground points are represented as NoDATA in the output raster. The resulting density raster essentially defines the extent of land areas from the lidar point cloud with voids remaining where areas exceed the buffered point spacing. Figure 2B displays the density raster output with higher density (point count) values shaded in white and lower density values (point count) shown in black. The ArcGIS command LAS Point Statistics As Raster in the Data Management Analyst extension was used to create the raster from the LAS data in this step.
Using the density raster, all cell values greater than zero are assigned a cell value of 1. The output in Figure 2C displays the binary mask and was generated using the ArcGIS command Con in Spatial Analyst.
Working with the binary mask, small NoDATA areas (salt and pepper effect) are removed by expanding neighboring (zonal) cell values (foreground cells with a value of 1 from the binary mask) by one cell into the background NoDATA cells (Figure 2D) using the ArcGIS command Expand in Spatial Analyst.
The overall extent of data cells (foreground cells with value of 1) is reduced in the expanded grid by one cell to improve the spatial fit of the overall raster data coverage to the most exterior ground lidar points (Figure 2E.) The ArcGIS command Shrink in Spatial Analyst was used in this step.
The resulting shrink raster is converted to a polygon feature class. Polygon features in this layer represent land and water bodies connected to the ocean, such as channels, bays, estuary, and ocean edge. Inland polygons in this layer represent areas where the point spacing is too sparse (lack of ground points) and should be voided in the terrain to avoid excessive interpolation of elevation features. Many of the inland feature polygons are in vegetated or urban areas where buildings have been removed (void of ground points), but some inland polygons also indicate valid inland water bodies, such as lakes and ponds. In Figure 2F, the red polygon boundary outlines the extent of the MCH land/water mask overlaid on the ground lidar points (class 2) with higher elevation values displayed in green.
Small interior polygons can be removed by eliminating those polygons with an area 90% less than the total outer polygon area, leaving only the primary land/water delineation. The eliminated mask could be useful to create a surface entirely free of voids in the terrain, especially when no alternate data source is available to gap-fill the holes. The ArcGIS management function Eliminate Polygon Part was used in this step.
If required, step six in the topographic component is to hydrologically enforce the bare-earth lidar-based raster, where feasible, to ensure that downstream flow continues through man-made structures (bridges and culverts) in the modeled DEM (Poppenga et al., 2014). Step seven uses the MCH land/water mask to constrain the terrain model based on TINs and associated breaklines to interpolate (grid) the ground points at a 1-m spatial resolution using a natural neighbor algorithm. The gridded raster surface from step seven is passed into the raster stack for merging in the integration component. Finally, step eight in the topographic component computes the interpolation accuracy in RMSE by comparing elevation values in the random subset of points to values extracted from the derived gridded elevation surface.
The bathymetry processing component has six primary steps (Figure 3). The first step is to quality control check the vertical and horizontal datum and projection information of the input bathymetric source to ensure the data are referenced to NAVD88 and NAD83 and projected in UTM. If the source data are not NAVD88, transform the input bathymetric data to NAVD88 reference frame using VDatum. Likewise, if required, convert the input source data to NAD83 and project in UTM.
The second step in the bathymetry component prioritizes and spatially sorts the bathymetric survey points based on accuracy, acquisition date, spatial distribution, and point density. This eliminates any overlapping bathymetric points in order to minimize interpolation artifacts that can introduce artificial morphological changes or erroneous edge anomalies. The ArcGIS for Maritime: Bathymetry extension is useful for sorting Bathymetric Attributed Grid (BAG) files using many associated attributes. The BAG format is a non-proprietary, open source data exchange format created to facilitate the processing and storage of large-volume multi-beam sonar data. Step three takes a random spatial subset of 5% of the bathymetric points to compute the interpolation accuracy (RMSE) from the derived surface. The remaining points will be gridded in the EBK model in step four along with associated spatial masks to generate the bathymetric surface.
Empirical Bayesian Kriging (EBK) Algorithm
The spatially sorted bathymetric single-beam, multi-beam, and hydrographic survey source data are interpolated using an EBK gridding algorithm. Bathymetric data (acoustic sonar) have a different spatial distribution than topographic lidar data. The points from both data sources are randomly distributed but the lidar point pattern is denser and more uniform and contains fewer gaps than the bathymetric point pattern. The spatial gaps are more prevalent in the older trackline bathymetric data than current high-resolution swath or multi-beam bathymetric data. Figure 4 displays an example bathymetric survey where the point spacing within and along a trackline is 1 m but the gap between tracklines is 100 m. The selection of an appropriate interpolation method is important so that a smooth continuous bathymetric surface is generated versus a surface with artifacts such as triangulated facets in flat water or misrepresented features caused by the use of TINs.
Using overlapping data subsets with a moving window, EBK is a geostatistical interpolation method that accounts for the error in estimating the underlying semivariogram through repeated simulations. This method is useful in making predictions and for estimating spatial autocorrelation. For every interpolated area, the EBK algorithm assumes the computed semivariogram is the actual semivariogram. For all kriging-based algorithms, the semivariogram is the function used to compute the variance (differences) between point samples at varying distances. The EBK algorithm supports three semivariogram implementations using power, linear, or thin plate spline. For the purpose of EBK in the topobathymetric methods framework, the linear semivariogram model was selected and is defined in Equation 2.
Because there are no upper data limits, the sill and range values in the linear semivariogram model for EBK are undefined but the nugget (low value) is fixed. In the linear equation, the h parameter is the distance, the b parameter is the slope, and the Nugget is defined as the lowest value. Semivariogram parameters are estimated using a restricted maximum likelihood algorithm, while most other kriging methods use a weighted least squares approach to estimate semivariogram parameters. Finally, the kriging model in EBK uses an intrinsic random function, which is useful in rectifying spatial trends in the input data. The EBK algorithm has been implemented in ArcGIS 10.2.2 Geostatistical Analyst and was used for bathymetric gridding in the topobathymetric methods framework. EBK in ArcGIS requires the input source points to be multipoint features and supports only one source file as input so all source points need to be merged into a single feature class dataset. The key EBK parameters used to implement the tool in ArcGIS for CoNED bathymetric gridding are explained in the attached Appendix.
Step five in the bathymetry component uses cross-validation to compare the predicted value in the geostatistical model to the actual observed value to assess the accuracy and effectiveness of model parameters by removing each data location one at a time and predicting the associated data value. The results are reported in terms of RMSE. Finally, step six computes the interpolation accuracy (as RMSE) by comparing the elevation values in the 5% random sample of points to values extracted from the derived kriging predictive surface.
There are seven primary steps in the integration processing component (Figure 5). Step one develops a mosaic dataset geodatabase model in ArcGIS. Step two loads the input individual rasters from the topography and bathymetry components into the mosaic dataset and creates spatial seamlines using the MCH land/water boundary or associated breakline for each raster layer included in the integrated TBDEM. Seamlines are important since they provide the blending rules between adjacent raster datasets. Seamlines are checked to ensure that older topographic boundaries do not extend beyond the most recent topographic dataset boundary. Otherwise, the resulting mosaic may contain one or more false shorelines. Step three generalizes the seamline edges to smooth transition boundaries between neighboring raster layers and to split complex raster datasets with isolated regions into individual unique raster groups. Within this step, any raster that corresponds to a modified seamline boundary must be re-masked to remove the excess shoreline area. Step four in the integration component develops an integrated shoreline transition zone from the best available topographic and bathymetric data to blend the topographic and bathymetric elevation sources. Using the MCH boundary, a 100-m buffer is created to logically mask input topography/bathymetry data. Then, EBK is used to interpolate the selected topographic and bathymetric points to gap-fill any near-shore holes in the bathymetric coverage, if required. Topobathymetric lidar data sources such as the EAARL-B or CZMIL systems provide up-to-date, high-resolution data along the critical land/water interface within the inter-tidal zone.
Step five prioritizes and spatially sorts the input topographic and bathymetric raster layers based on accuracy and acquisition date to sequence the raster data in the geodatabase (mosaic dataset). Step six outputs the final integrated TBDEM based on the prioritization of the raster layers and generates spatially referenced metadata for each unique data source. The raster boundary feature classes are combined by iteratively executing the update tool in ArcGIS and inserting the highest priority boundary last. The resulting multi-part feature class is designated as the TBDEM's spatial metadata, representing the spatial footprint of each data source used in the generation of the topobathymetric dataset. Each polygon is populated with attributes that describe the source data, such as resolution, acquisition date, geoid, interpolation accuracy, source name, source organization, source contact, source project, source URL, and data type (topographic lidar, bathymetric lidar, multi-beam bathymetry, single-beam bathymetry). During the TBDEM update phase, the spatially referenced metadata serve as the processing area boundary to create TBDEMs from small mosaic datasets. The data must be processed into smaller subsets because a mosaic dataset cannot have more than 10,000 vertices in a seamline component, and a typical spatial metadata feature class can contain millions of vertices. Finally, in step seven after a TBDEM has been created and published, the geodatabase (mosaic dataset) can be updated as new elevation data are collected. This is accomplished by regenerating the combined spatial metadata with the boundary (MCH) of the new raster and then reprocessing only the affected portion of the TBDEM.
MCH Land/Water Mask Comparison
The MCH mask is especially useful in working with lidar data to define the land/water edge where no breaklines derived from photography are available. One advantage of the MCH algorithm is that it is repeatable and systematic from one lidar survey to another with little need to change algorithm parameters for different geographic regions or lidar surveys. Breaklines are typically created at major terrain breaks including hydrography, road edges, street centerlines, rail centerlines, and walls. A key purpose of hydrologic breaklines is defining the extents and elevations around rivers, channels, canals, and lakes for hydro-flattening of water features.
One practical way to test the operational effectiveness of the MCH algorithm is to spatially compare the MCH land/water result against boundaries traditionally derived from stereo photography. For this comparison, the Atchafalaya Basin was selected where recent 2013 topographic (NIR 1,064 nm) lidar was collected (Atchafalaya2, 2014). This lidar collection was processed to the requirements in the USGS Lidar Base Specifications version 1.2 (Heidemann, 2014). The breaklines derived from stereo photography by the lidar vendor that accompany the classified lidar are used as ground truth in this comparison. Figure 6 displays the Atchafalaya Basin zoomed in on the Atchafalaya Delta. Red lines on the map represent the MCH-derived boundaries and the blue lines represent breaklines derived from stereo photography. In this example, a few channels (red) were demarcated by MCH that were not captured in the breakline dataset. To quantify the change between the two boundary masks, the Coefficient of Areal Correspondence (CAC) metric was used. The CAC evaluates the corresponding overlap of two areal delineations (Taylor, 1977). The CAC is computed by dividing the intersecting area of two delineations by the union of the same two delineations. The result of the CAC metric comparison between the two delineations indicates a 92.22% spatial agreement between the MCH-derived land/water mask and the breakline mask derived from stereo photography. The reliability of the MCH boundary delineation is dependent on the lidar point cloud being classified, and any derived output is directly influenced by the quality and accuracy of the classification. Besides being used for topobathymetric land/water masking, the MCH method is also beneficial for delineating the extent of inland channel features and emergent marsh wetlands.
EBK Model Validation
Understanding the error associated with interpolation (gridding) is important for independently assessing the internal accuracy of each individual source in the integrated TBDEM. EBK was used to interpolate bathymetric acoustic sonar data in south San Francisco Bay for inclusion in the 2-m topobathymetric model. A 2005 hydrographic survey of south San Francisco Bay was used as source input for the EBK model (Foxgrover et al., 2007). This survey served as an essential baseline for tracking changes in the bay and provided insight into the sea floor topobathy from the 1980s to 2005. This survey was acquired with a 1-m spacing along trackline and 100-m spacing in the overlap between adjacent trackline zones. The non-uniformity of the survey with large spatial gaps between tracklines lends itself to EBK as an appropriate interpolation technique. A predictive kriging model surface was created using a linear semivariogram model.
An evaluation of the EBK gridding result was conducted to assess internal performance and accuracy of the EBK algorithm. One way to test the effectiveness of the kriging model is to use a geostatistical technique called cross-validation to compare predicted values in the model to actual observed values by removing each data location one at a time and predicting the associated data value. The predicted and actual values at the location of the omitted point are compared, and this process is repeated for all points in the dataset. Cross-validation error is typically represented in RMSE. The cross-validation result was 0.03 m RMSE, which indicates a reasonably close fit between the actual values and the predicted kriging values.
The interpolation accuracy can be assessed using test control points withheld from the source bathymetric points. Figure 7 displays 5% of the test control points draped over the predictive kriging model surface. The resulting interpolation error for the predictive kriging surface was 0.72 m (RMSE) compared to the test control points. The overall mean error of 0.01 m indicates minimal bias, with the majority of the points falling near zero (Figure 8). The resulting figure is also useful for identifying tracklines with issues and potential outliers. In particular, the figure shows two tracklines that appear to deviate above and below the cluster around zero error.
Integrated TBDEM models provide essential base information for coastal science applications such as sea-level rise assessments, tsunami impact forecasting, sediment transport analysis, and storm surge modeling. It is important to construct seamless TBDEM models using a consistent, repeatable, and systematic framework that maintains transformations, masking, horizontal and vertical accuracy, interpolation uncertainty, and easy updating and maintenance.
The methodological approach presented in this paper outlines a geospatial approach for developing CoNED-based TBDEM models that incorporate an updated technique called Minimum Convex Hull for land/water masking and exploits Empirical Bayesian Kriging for bathymetric gridding while providing a consistent and replicable end-to-end workflow for TBDEM generation. Segments of the CoNED methodology could also be used to enhance other geospatial applications, such as using EBK to generate a probability error surface from control points or expanding the appliance of the MCH technique to map wetland extent, inland water features, shoreline boundaries, and change vectors from multi-temporal lidar point cloud data. This methodology has been applied to construct integrated TBDEM models in Mobile Bay, Alabama, the northern Gulf of Mexico, San Francisco Bay, the Hurricane Sandy region, and southern California.
The authors would like to thank Daniel Howard, Sandra Prince, Michael Oimoen, Patricia Schrader, Bruce Worstell, and Zheng Zhang, all of Stinger Ghaffarian Technologies (SGT), contractor to the USGS Earth Resources Observation and Science (EROS) Center for their contributions to the CoNED Applications project. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Empirical Bayesian Kriging (Geostatistical Analyst) Parameters
Input features = Multipoint features located in a Geodatabase or shapefile
Z value field = Shape.Z (Z values stored in the shape geometry field)
Output raster = GeoTIFF
Output cell size = Variable (100 m to 1 m, depending on spatial point density)
Data transformation type = None
Semivariogram model type = Linear
Output surface type = Prediction (Surface produced from interpolated values)
Additional Model Parameters:
Search Neighborhood Parameters:
Workspace = Set current workspace path
Output Coordinates = Set output coordinate system to the same as the multipoint feature data
Processing Extent = Set the spatial extent to the MCH boundary shapefile or feature class to constrain the EBK output extent.
Raster Analysis = Set cell size and set the mask to the MCH boundary shapefile or some other appropriate feature class layer.