Predictive soil property maps with prediction uncertainty at 30-meter resolution for the Colorado River Basin above Lake Mead
Data for journal manuscript: A hybrid approach for predictive soil property mapping using conventional soil survey data
Dates
Publication Date
2020-06-29
Time Period
2020
Citation
Nauman, T.W., and Duniway, M.C., 2020, Predictive soil property maps with prediction uncertainty at 30 meter resolution for the Colorado River Basin above Lake Mead: U.S. Geological Survey data release, https://doi.org/10.5066/P9SK0DO2.
Summary
These data were compiled to demonstrate new predictive mapping approaches and provide comprehensive gridded 30 meter resolution soil property maps for the Colorado River Basin above Hoover Dam. Random forest models related environmental raster layers representing soil forming factors with field samples to render predictive maps that interpolate between sample locations. Maps represented soil pH, texture fractions (sand, silt, clay, fine sand, very fine sand), rock, electrical conductivity (ec), gypsum, CaCO3, sodium adsorption ratio (sar), available water capacity (awc), bulk density (dbovendry), erodibility (kwfact), and organic matter (om) at 7 depths (0, 5, 15, 30, 60, 100, and 200 cm), as well as depth to restrictive layer (resdept) and surface rock size and cover. Accuracy and error estimated using 10-fold cross validation indicated a range of model performance, with the coefficient of determination (R2) ranging from 0.20 to 0.76, a mean of 0.52, and a standard deviation of 0.12. Models of pH, om, and ec had the best accuracy (R2 > 0.6). Most texture fraction, CaCO3, and sar models had R2 values from 0.5 to 0.6. Models of kwfact, dbovendry, resdept, rock, gypsum, and awc had R2 values from 0.4 to 0.5, except for near-surface models, which tended to perform better. Very fine sand models and the 200 cm estimates for other properties generally performed poorly (R2 from 0.2 to 0.4), and sample size for the 200 cm models was too low for reliable model building. More than 90% of the soils data used were sampled since 2000, but some older samples are included. Uncertainty estimates were also developed by creating relative prediction intervals, which allow end users to evaluate uncertainty easily.
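For readers unfamiliar with how a cross-validated R2 such as those quoted above can be obtained, the following is a minimal sketch only. It assumes R2 is computed as 1 - SSE/SST on out-of-fold predictions pooled across the 10 folds; the data release does not state the exact formula or pooling used. All function names are hypothetical, and a simple least-squares line stands in for the random forest models so that the example needs only the Python standard library.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def fit_line(xs, ys):
    """Stand-in model (NOT a random forest): least-squares line y = a + b*x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def cross_validated_r2(xs, ys, k=10, seed=0):
    """Predict every sample from a model that never saw it, then score with R2."""
    predictions = [None] * len(xs)
    for fold in kfold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            predictions[i] = model(xs[i])
    return r_squared(ys, predictions)
```

Because every prediction comes from a model fitted without that sample, the resulting R2 describes performance at unsampled locations rather than goodness of fit to the training data.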
Nauman, T.W. and Duniway, M.C., 2020, A hybrid approach for predictive soil property mapping using conventional soil survey data. Soil Science Society of America Journal
The primary purpose of these data was to demonstrate a new workflow for creating soil property maps across the United States. However, some of these maps have potential to assist 1) land managers with decision making, 2) earth system modeling applications, and 3) future sampling to improve soil survey and future predictive mapping products. Soil properties were chosen to address relevant soils data needs such as concerns about erosion, salinity, and dust emissions. Uncertainty was characterized for every pixel with 95% prediction interval bounds and a relative prediction interval (RPI) metric that standardizes prediction intervals to the original training sample distribution for each model. The RPI values are easily interpretable: values below 0.5 indicate a low likelihood of error being higher than the global root mean squared error, and values exceeding 1.0 indicate a greater likelihood of error beyond global error summaries. In short, RPI values < 0.5 are consistently reliable; values up to 0.9 are probably still reliable but likely carry some error; and values close to or above 1.0 should be regarded with suspicion and perhaps trigger field evaluation of estimates before use.
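As a rough illustration of how an RPI value might be computed and read, the sketch below assumes that RPI is the width of a pixel's 95% prediction interval divided by the width of the central 95% range (2.5th to 97.5th percentile) of the training samples for that model. That definition, the function names, and the category labels are assumptions made for illustration and are not taken from this data release. The 0.5 and 0.9 thresholds follow the guidance above, and values between 0.9 and 1.0 are grouped here with the suspect class.

```python
def relative_prediction_interval(pi_lower, pi_upper, train_q025, train_q975):
    """Assumed definition: 95% prediction interval width for one pixel,
    divided by the width of the central 95% range of the training samples."""
    train_range = train_q975 - train_q025
    if train_range <= 0:
        raise ValueError("training quantile range must be positive")
    return (pi_upper - pi_lower) / train_range


def classify_rpi(rpi):
    """Map an RPI value to a hypothetical label using the thresholds above."""
    if rpi < 0.5:
        return "low uncertainty"
    if rpi <= 0.9:
        return "moderate uncertainty"
    return "high uncertainty: consider field evaluation"


# Hypothetical pH pixel: predicted interval 6.5 to 7.5, with training
# samples spanning 5.0 to 9.0 between their 2.5th and 97.5th percentiles.
rpi = relative_prediction_interval(6.5, 7.5, 5.0, 9.0)
label = classify_rpi(rpi)
```

Under these assumptions, a pixel whose prediction interval is as wide as the training distribution itself (RPI near 1.0) tells the user little more than the raw sample data would.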
Rights
The author(s) of these data request that data users contact them regarding intended use and to assist with understanding limitations and interpretation. Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data for other purposes, nor on all computer systems, nor shall the act of distribution constitute any such warranty.