Sampling-Based Approaches to Estimating Two-Dimensional Large Wood Area from UAS Imagery

Matthew I. Barker; Jonathan D. Burnett; Francisco Mauro Gutiérrez; Steven M. Wondzell; Michael G. Wing

doi:10.4236/jgis.2022.146032

Journal of Geographic Information System > Vol.14 No.6, December 2022


Matthew I. Barker¹, Jonathan D. Burnett², Francisco Mauro Gutiérrez³, Steven M. Wondzell², Michael G. Wing¹*
¹Forest Engineering, Resources, and Management Department, College of Forestry, Oregon State University, Corvallis, OR, USA.
²USDA Forest Service, Pacific Northwest Research Station, Corvallis, OR, USA.
³EiFAB—iuFOR, Campus Duques de Soria s/n, Universidad de Valladolid, Soria, Spain.

Abstract

Large in-stream wood (LW) is a critical component of riparian systems that increases heterogeneity of flow regimes and provides high-quality habitat for salmonids and other fishes. We present four sampling-based methods to estimate two-dimensional LW area for a 61-hectare river restoration project on the South Fork McKenzie River near Rainbow, OR (USA). We manually delineated LW area from unoccupied aircraft systems (UAS) multispectral imagery for 40 randomly selected 51.46 m² hexagonal plots. Seven auxiliary variables were extracted from the imagery and imagery derivatives for use in the estimators, by summarizing spectral statistics for each plot and by Random Forest (RF) classification of segmented imagery (Cohen’s kappa = 0.75, balanced accuracy = 0.86). The four estimators were: difference estimator, simple linear regression estimator with one auxiliary variable, general regression estimator with seven auxiliary variables, and simple random sample without replacement. We assessed the variance of the estimators and found that the simple random sample without replacement produced the largest estimate of LW area and the widest confidence interval (17,283 m², 95% CI 10,613 - 23,952 m²), while the generalized regression approach produced the smallest estimate and the narrowest confidence interval (16,593 m², 95% CI 13,054 - 20,133 m²). These methods facilitate efficient estimates of critical habitat components and are especially suited to efforts that seek to quantify large amounts of these components through time. When combined with traditional sampling methods, classified imagery acquired via UAS promises to enhance the temporal resolution of the data products associated with restoration efforts while minimizing the necessity for potentially hazardous field work.

Keywords

UAS, Large in-Stream Wood, Sampling, Image Classification, River Restoration

Share and Cite:

Barker, M. , Burnett, J. , Gutiérrez, F. , Wondzell, S. and Wing, M. (2022) Sampling-Based Approaches to Estimating Two-Dimensional Large Wood Area from UAS Imagery. Journal of Geographic Information System, 14, 571-588. doi: 10.4236/jgis.2022.146032.

1. Introduction

Large in-stream wood (LW) plays a critical role in ecological processes in fluvial systems. A primary goal in river restoration projects focusing on process-based restoration is the recruitment and retention of LW in the stream environment as LW is associated with increased sediment deposition, greater geomorphic complexity, increased heterogeneity of flow regimes, and provides habitat for benthic macroinvertebrates that are food sources for fishes [1] [2] [3]. However, anthropogenic influences associated with installation of dams and roads, and forest overstory removal have resulted in reduced wood delivery to mountain streams in recent decades [4].

Process-based restoration methods such as restoration to a Stage 0 condition that has been implemented in the Pacific Northwest often include addition of wood to the stream to jump start in-stream processes that depend on wood, such as macroinvertebrate life cycles [5] [6] [7]. Monitoring stream dynamics and wood retention within river restoration areas is important for assessing the restoration results and informing design for future restoration projects. Additionally, restoration activities are unlikely to return the wood delivery process to the stream system without removal of the anthropogenic influences that inhibited the process to begin with. As such, long term monitoring is necessary to understand how LW dynamics are changing with time. This is especially important at the South Fork McKenzie River Stage 0 restoration site which is located downstream of Cougar Reservoir and Dam. The dam was completed in 1963, and since then it has limited the potential for wood delivery both due to the regulation of very high discharges that could deliver additional wood downstream, and the inherent barrier it poses to wood passage.

Large wood has been measured using many methods, including transect sampling, census (i.e., measuring all wood), and quadrat approaches based on a locally defined reach. For example, the Bureau of Land Management AIM National Aquatic Monitoring Field Protocol for Wadable Lotic Systems counts large wood on sample reaches by tallying pieces by diameter size class [8]. The Large Woody Debris Index is another formalized protocol that is based on 100 m longitudinal transects where diameter and length of wood are measured [9]. In contrast, terrestrial approaches such as the US Forest Service Forest Inventory and Analysis program estimate LW by tallying pieces that intersect transects in circular subplots at azimuths 30˚, 150˚ and 270˚ [10]. Woldendorp et al. [11] found that transect-based approaches offer acceptable levels of accuracy when the transect is long enough to cover the range of sizes and distributions throughout a site. However, on a restoration site where thousands of logs have been placed in the restored area, and a formerly single-threaded channel has become a broad multithreaded channel system, conventional methods are of limited utility for quantifying LW well enough to make inferences across space and time. Measuring individual pieces that intersect transects would not be practical, especially when a monitoring project requires multiple revisits to determine whether wood is being retained through time. Additionally, the safety of conventional in-stream measurement methods is of particular concern in fluvial environments, and these surveys are impossible to conduct during a high discharge event. In addition to the potential dangers posed to crew members, these surveys are time-consuming, requiring many hours to implement.

Imagery datasets can be time consuming to manually interpret. However, when paired with supervised classification methods, this effort can be greatly reduced. In a supervised classification, an interpreter manually classifies a subset of the imagery to train the classifier. A portion of the training data is withheld to be used as validation data, thereby assessing the performance of the classifier. Lastly, predictions are made to the remaining imagery.

In this study, we pair spectral variables with the results of a supervised classification to estimate the two-dimensional area of large wood throughout the project site. We propose four sampling methods that utilize high-resolution multispectral UAS imagery to estimate total in-stream LW two-dimensional area for a Stage 0 restoration project on the South Fork McKenzie River (USA). Our objective is to compare the LW estimates and variances for each sampling method.

This paper demonstrates a novel approach to quantifying in-stream LW by combining RF supervised classification with sampling-based estimators. Previous research by Queiroz et al. [12] classified terrestrial LW as standing snags or downed wood using a combination of aerial imagery, LiDAR data, image segmentation, and RF classification, but they did not seek to quantify wood area and therefore reported only the accuracy of the classified segments.

2. Methods

2.1. Site Description

The South Fork McKenzie River Stage 0 restoration project is located in the Western Cascade Mountains, approximately 70 km east of Eugene, OR (44.1586, −122.2883). The site is approximately 60 hectares with elevation ranging from 300 to 340 m. The site receives more than 1770 mm rainfall per year. Cougar Dam is located six km upstream of the study area and was constructed in 1963. The dam effectively disconnected the channel from the surrounding floodplain, due to reduction in sediment deposition resulting in increased incision and armoring of the channel bed. This has altered the plant community on the flood plain and along the riparian zone and has ultimately reduced available salmonid habitat relative to historic conditions. Stage 0 restoration seeks to restore processes that existed at the site prior to anthropogenic influence [13]. To this end, in 2018, crews implemented a multimillion-dollar restoration project that reconnected the fluvial plain with the historic channel by filling the previous channel with sediment and placing large wood throughout the restored area.

2.2. UAS Monitoring

On 23 September 2019, we acquired aerial imagery spanning the post-treatment site with a small UAS, the DJI Matrice 200 v2. We mounted a Micasense Altum combination multispectral/thermal sensor to the UAS oriented nadir relative to the landscape. UAS imaging required three flights that were conducted at approximately solar noon to minimize shadows. Flights were initiated at 11:59 and ended at 13:43 PDT (UTC-7). We recorded images of a calibrated reflectance panel with known albedo values prior to and following the flights. DJI Pilot flight planning software [14] ensured we recorded imagery with 80% front and side overlap between images. We produced a radiometrically-corrected orthomosaic with Agisoft Metashape photogrammetry software [15]. The resulting multispectral orthomosaic had a ground sampling distance (GSD) of 5.88 cm and six spectral bands (Table 1). Note that the LWIR band was automatically up-sampled from a GSD of 100 cm to 5.88 cm. Additionally, we installed 12 ground control points and recorded their locations with a Trimble GeoXH (0.1 m combined horizontal and vertical accuracy post differential correction) GNSS receiver and Tornado antenna to georeference the imagery.

2.3. Sample Design

Table 1. Micasense altum multispectral sensor electromagnetic wavelength specifications.

We generated a tessellation of 11,737 hexagons, each 51.46 m², within the site using ArcGIS Pro software [16]. We classified the hexagons as forested, wetted, or barren based on the dominant class within each hexagon resulting from a supervised classification using RGB orthomosaics from high-resolution (3 cm GSD) UAS imagery acquired in June 2019. We then systematically selected 400 hexagons from a random start by selecting every 29th hexagon. Systematic sampling was chosen over simple random sampling to ensure a spatially distributed sampling design. Additionally, we wanted to ensure that each of the three classes was represented in the sample population in proportion to its occurrence on the study site to support field sampling and measurement of other conditions not related to this aspect of the study. Because the objective was to focus on instream conditions, only the 169 hexagons classified as wetted or barren were selected for high-resolution photosampling with a Phantom 4 Pro UAS (note, these data were not used in this study). We randomly selected 40 of the 169 hexagons for field sampling (described in the next section). It was not reasonable to sample the conditions of the entire 51.46 m² plot, so the plot was divided into four circular subplots located at the hexagon center and 3 meters from center at azimuths 30˚, 150˚, and 270˚. A total of 3858 non-forested hexagons comprised our sampling frame (Figure 1). In retrospect, we would have simply drawn from this non-forested population to simplify plot expansion. However, for the sake of simplifying the analysis, we use a finite population approach [17] where each hexagon was considered as a population unit in order to estimate LW total area.
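The systematic selection and subplot layout described above can be sketched in a few lines. This is a minimal illustration, not the authors' ArcGIS Pro workflow: the function names, the seed, and the convention that azimuths are measured clockwise from north are assumptions. With 11,737 hexagons and an interval of 29, the selection returns 404 or 405 units depending on the random start, which is consistent with the roughly 400 hexagons reported.

```python
import math
import random


def systematic_sample(n_units, step, seed=None):
    """Return 0-based indices of every `step`-th unit after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random start in [0, step)
    return list(range(start, n_units, step))


def subplot_offsets(distance_m=3.0, azimuths_deg=(30, 150, 270)):
    """(east, north) offsets in meters of subplot centers from the hexagon center.

    Assumes azimuths are measured clockwise from north (hypothetical convention).
    The first entry is the subplot at the hexagon center itself.
    """
    offsets = [(0.0, 0.0)]
    for az in azimuths_deg:
        rad = math.radians(az)
        offsets.append((distance_m * math.sin(rad), distance_m * math.cos(rad)))
    return offsets


N_TESSELLATION = 11_737  # hexagons in the tessellation (from the text)
STEP = 29                # sampling interval (from the text)

selected = systematic_sample(N_TESSELLATION, STEP, seed=1)  # seed is arbitrary
print(len(selected), "hexagons selected")
for east, north in subplot_offsets():
    print(f"subplot offset: east={east:+.2f} m, north={north:+.2f} m")
```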

Testing and Validation Data

Figure 1. Phase 1 South Fork McKenzie orthomosaic with non-forested sample frame hexagons and 40 field-sampled plots.

Since we intended to quantify wood across the entire study site, we first needed to ascertain the level of agreement between the remotely sensed data and the field data. Field sampling for LW involved technicians navigating to plot locations, then visually estimating the percent of wood covering each of the four 1-meter radius (3.14 m²) subplots. We also tested measuring each piece of LW in each subplot to facilitate statistical scaling of two-dimensional estimates to wood volume; however, the approach was too laborious once crews encountered multiple plots where the entire plot was composed of LW. Additionally, we tested simple LW counts, which were less time intensive than measurement but still too arduous upon encountering a LW field.

To develop a digital dataset comparable to the field-sampled LW data and to ultimately quantify our target parameter for all sampling frame hexagons, we manually delineated LW in GIS software. We conducted a heads-up digitization to delineate LW in the 40 hexagonal plots by viewing the UAS-derived multispectral orthomosaics in ArcGIS Pro and drawing polygons around the visible wood (Figure 2). We obtained our target parameter by summarizing the area of manually delineated LW within the 40 sampled hexagonal plots. To ensure that manually delineated wood reasonably represented conditions observed in situ, field technicians visually estimated the percent of two-dimensional wood cover (i.e., 0% - 100%) within each of the four 1-meter radius subplots of the sampled hexagons. After removing subplots that were occluded by canopy in the orthomosaic image, we multiplied field-observed percent wood cover by subplot area (3.14 m²) to approximate wood area at subplot locations and compared it to the manually delineated wood area within subplots. The sample consisted of 92 subplots with field observations of wood area.

We examined the correlation between the heads-up manually delineated wood and field assessed wood area with the non-parametric Spearman rank correlation coefficient. Additionally, we performed a paired t-test to examine whether the means from the paired wood area measurements were different.
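To make the two comparisons concrete, the sketch below computes a Spearman rank correlation (Pearson correlation of tie-averaged ranks) and a paired t statistic with the Python standard library. The subplot values are invented for illustration and are not the study's data. The function names are hypothetical, and the p-value is left out because it needs a t distribution that the standard library does not provide.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def paired_t_statistic(x, y):
    """t statistic for the mean of the paired differences x - y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))


# Hypothetical subplot data: field percent cover and delineated area (m^2).
field_pct = [0, 10, 25, 50, 100, 5]
subplot_area = math.pi * 1.0 ** 2            # 1-meter radius subplot
field_area = [p / 100 * subplot_area for p in field_pct]
delineated_area = [0.1, 0.5, 0.6, 1.9, 3.0, 0.0]

print("rho =", round(spearman_rho(field_area, delineated_area), 3))
print("t   =", round(paired_t_statistic(field_area, delineated_area), 3))
```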

2.4. Estimating Wood Area with Statistical Estimators

Figure 2. Example hexagon plot. Left frame depicts RGB hexagonal plot, middle frame is the resulting heads-up manual delineation of LW performed in GIS software, and right frame is the result of RF classifying LW in image segments.

It was inefficient to conduct a heads-up digitization of all the wood on the site, for the same reason that it was inefficient to hand-measure all the wood in the field. Instead, we examined the feasibility of using a statistical estimator approach with auxiliary information to estimate the two-dimensional area occupied by LW and associated 95% confidence intervals within the 3858 non-forested hexagons. We utilized four estimators with different types of auxiliary information derived from the multispectral orthomosaic to facilitate expansion of plot estimates across a broader area (Figure 3).

Figure 3. The flowchart illustrates the process we followed to produce wood area estimates from multispectral imagery using a combination of image segmentation, GIS processing, and sampling-based estimators.

Auxiliary information must be spatially continuous across the area of inference. As such, mean reflectance values and radiant temperature measured for individual spectral bands within the multispectral orthomosaic were obvious candidates. In addition, we expected to improve the accuracy of the estimator by incorporating information on whether a given location was likely to be wood or not wood. We therefore assembled a Random Forest binary classification model to classify wood or not wood across the entire study area. Preliminary testing with a conventional pixel-based naïve Bayes supervised classification model resulted in a model with a high degree of ambiguity. Consequently, we opted to use an object-based image analysis approach known as image segmentation, which has been shown to be effective in ecological applications because it captures patches on the landscape that represent ecological conditions more appropriately than individual pixels [18]. We then used the resulting segments and associated spectral and segment attribute information as predictor variables in a binary (wood or non-wood) classification model.

We used Trimble eCognition software to segment the multispectral orthomosaic described earlier using all six spectral bands and a seventh NDVI band. NDVI is a transformation of the red and NIR bands that is known to be correlated positively with photosynthetic activity, and water features correspond to relatively low values of the index [19] [20] [21]. As such, we incorporated NDVI in our segmentation aiming to distinguish areas where LW bordered leafy material and water. Each of the resulting 112,939 segments contain 40 attributes related to the input spectral information as well as information related to the structure and texture of each segment such as asymmetry, border index, border length, brightness, compactness, density, length/width, max difference, skewness, standard deviations, rectangularity, roundness, and shape index which are described in the eCognition reference book [22]. These 40 attributes then served as predictor variables in a Random Forest (RF) classifier [23].
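For reference, NDVI is the normalized difference of the NIR and red bands, (NIR - Red) / (NIR + Red). The short sketch below evaluates it for three illustrative surfaces. The reflectance values are invented to show the expected sign and magnitude, and the function name is an assumption rather than part of the eCognition workflow.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel or segment mean."""
    denom = nir + red
    if denom == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / denom


# Hypothetical mean reflectances (NIR, red) for three surface types.
examples = {
    "leafy vegetation": (0.50, 0.10),
    "dry wood": (0.30, 0.25),
    "water": (0.02, 0.04),
}
for surface, (nir, red) in examples.items():
    print(f"{surface}: NDVI = {ndvi(nir, red):+.2f}")
```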

Random Forest is a machine learning algorithm that has gained popularity in ecological applications because it tends to be robust to overfitting when parameterized properly and, owing to its non-parametric nature, can account for interactions between covariates and is largely unaffected by multicollinearity [23] [24]. To train the model, we selected a 1.4-hectare subset of the orthomosaic image that represented the range of spectral variability and geomorphic conditions across the rest of the site in terms of presence of submerged and unsubmerged wood, gravel bars, live vegetation, riffles, and pools. The training area comprised 8363 segments, of which we randomly sampled 15%, manually classifying the sampled segments as wood or not wood.

We trained a binary supervised RF classifier on 1211 segments and assessed model performance using 10 repetitions of 10-fold cross-validation. Cross-validation has been shown to be an effective and robust approach to model performance evaluation when model tuning is not part of the process because it better accounts for the range of model outcomes. We omitted model parameter tuning from this workflow, opting to use recommended parameters of mtry = 6 (i.e., the number of predictor variables that are randomly sampled at each node) and number of trees = 500 to simplify the processing architecture and reduce processing time. We selected kappa as the summary metric used to optimize model performance. We used the resulting model to classify all 112,939 image segments in the entire study area as either wood or not wood. The area of the RF-classified wood segments was then summarized for each of the 3858 hexagonal plots in the sample frame. As a result, each of the 40 hexagonal plots randomly selected for sampling contains wood polygons that were both manually delineated in GIS and classified by the RF classifier (Figure 2).
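The resampling scheme can be illustrated independently of any particular classifier. The sketch below generates the train/test index sets for 10 repetitions of 10-fold cross-validation over 1211 labeled segments. It only shows the partitioning logic, not the authors' software: the function name and seed are hypothetical, and fitting the RF model on each split is omitted.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)                        # new random partition per repeat
        folds = [idx[f::k] for f in range(k)]   # k nearly equal-sized folds
        for f in range(k):
            test_idx = folds[f]
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, train_idx, test_idx


N_SEGMENTS = 1211  # labeled training segments (from the text)
splits = list(repeated_kfold(N_SEGMENTS))
print(len(splits), "train/test splits")
print("held-out fold sizes:", sorted({len(test) for _, _, _, test in splits}))
```

Each of the resulting splits would yield one kappa value, and averaging those values gives a cross-validated summary of the kind reported in the results.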

2.5. Estimators

Estimators utilize auxiliary information and design information to make an estimate about a target parameter across a broad area. The target parameter here is two-dimensional wood area across the 3858 non-forested 51.46 m² hexagons. The four estimators we developed and tested are briefly described below. More specifics including equations, assumptions, and associated references are described in the supplement.

The difference estimator (DIFF) works by postulating a relation between a response variable and auxiliary information [25]. The primary assumption of the difference estimator is that the auxiliary information explains the response reasonably well. In this instance, we assume total area classified as wood from the RF model to be a reasonable predictor for wood area.

The simple linear regression (SLR) estimator allows for adjustments to the difference estimator with the inclusion of coefficients β₀ and β₁. Fitting the regression line by least squares minimizes the sum of squared residuals between the line and the observed values. Of note, the regression estimator is not unbiased. However, this bias is minimized when the relationship between the auxiliary variable and the target parameter is reasonably linear and the correlation coefficient approaches 1.
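A minimal sketch of a regression estimator of the total follows, assuming the classical form in which the least-squares line fitted on the sample is evaluated at the known population mean of the auxiliary variable and the result is multiplied by N. The function name and the toy numbers are hypothetical. In the study's setting, `x` would be the RF-classified wood area per hexagon and `y` the manually delineated wood area.

```python
def slr_total(y_s, x_s, x_pop_mean, N):
    """Regression estimate of a population total under simple random sampling.

    y_s, x_s   : response and auxiliary values on the sampled units
    x_pop_mean : mean of the auxiliary variable over all N population units
    """
    n = len(y_s)
    ybar = sum(y_s) / n
    xbar = sum(x_s) / n
    sxx = sum((x - xbar) ** 2 for x in x_s)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(x_s, y_s))
    b1 = sxy / sxx                  # least-squares slope
    b0 = ybar - b1 * xbar           # least-squares intercept
    ybar_lr = b0 + b1 * x_pop_mean  # same as ybar + b1 * (x_pop_mean - xbar)
    return N * ybar_lr, b0, b1


# Toy example in which y = 1 + 2x holds exactly on the sample.
total, b0, b1 = slr_total(y_s=[3, 5, 7, 9], x_s=[1, 2, 3, 4], x_pop_mean=5.0, N=10)
print(total, b0, b1)
```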

The general regression (GREG) estimator is an extension of the previous SLR estimator. In this case, we have seven auxiliary variables, and we incorporate mean values from all bands described in Table 1. Mean values for each of the six bands are extracted using zonal statistics where 40 hexagonal plots are the zone features. Additionally, we included area estimated to be wood as determined by the RF model. This estimator is not unbiased. However, the bias will be small with larger sample sizes and reasonably correlated auxiliary data.
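For orientation, a commonly used textbook form of the GREG estimator of a total is given below. The notation is an assumption chosen to match the supplement ($U$ the population of hexagons, $S$ the sample, $\pi_{k} = n/N$), and the authors' exact formulation may differ:

$\hat{y}_{T_{greg}} = \sum_{U} \hat{y}_{k} + \sum_{S} \frac{y_{k} - \hat{y}_{k}}{\pi_{k}}$

where $\hat{y}_{k} = \mathbf{x}_{k}^{\top} \hat{\mathbf{B}}$, $\mathbf{x}_{k} = (1, x_{1k}, \ldots, x_{7k})^{\top}$ holds the seven auxiliary variables for hexagon $k$, and

$\hat{\mathbf{B}} = \left( \sum_{S} \frac{\mathbf{x}_{k} \mathbf{x}_{k}^{\top}}{\pi_{k}} \right)^{-1} \sum_{S} \frac{\mathbf{x}_{k} y_{k}}{\pi_{k}}$

With equal inclusion probabilities, $\hat{\mathbf{B}}$ reduces to the ordinary least-squares coefficients fitted on the sample.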

The simple random sampling without replacement (SRSwoR) estimator requires only a measured response variable, in this case, the heads-up digitized wood area for the sample set of hexagonal plots. The target parameter is a fixed value for our population, and the randomness is only associated with the selected sample. Our resulting estimate is unbiased because the expectation from the possible samples will equal the true population total [26].
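The SRSwoR expansion can be written directly: the sample mean is multiplied by N, and the variance carries the finite population correction (1 - n/N). The sketch below uses standard textbook formulas with a caller-supplied t critical value (about 2.02 for 39 degrees of freedom at the 95% level). The function name and the toy data are hypothetical.

```python
import math


def srswor_total(y_s, N, t_crit):
    """Estimate a population total, its variance, and a confidence interval
    under simple random sampling without replacement."""
    n = len(y_s)
    ybar = sum(y_s) / n
    s2 = sum((y - ybar) ** 2 for y in y_s) / (n - 1)  # sample variance
    total = N * ybar
    var_total = N ** 2 * (1 - n / N) * s2 / n         # includes the fpc
    half_width = t_crit * math.sqrt(var_total)
    return total, var_total, (total - half_width, total + half_width)


# Toy data: five sampled units from a population of 100 units.
# 2.776 is the two-sided 95% t critical value for 4 degrees of freedom.
total, var_total, ci = srswor_total([1, 2, 3, 4, 5], N=100, t_crit=2.776)
print(total, var_total, ci)
```

In the study's setting, N would be the 3858 non-forested hexagons and `y_s` the manually delineated wood area in each of the 40 sampled hexagons.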

3. Results and Discussion

Visual assessment of the wood area distribution indicates the data are right-skewed. As a result, we used the non-parametric Spearman rank correlation coefficient to assess correlation between field-estimated wood area and manually delineated wood area in the 92 subplots. Spearman’s correlation coefficient (ρ) indicates a high degree of correlation between manually delineated wood area and field-estimated wood area in the subplots (ρ = 0.57, p < 0.001). This suggests that manually delineated wood area from UAS imagery serves as a reasonable representation of actual wood area encountered in situ.

We assessed differences in means between the two wood area measurements with a paired t-test. Results suggest there is a nonzero difference in means (p < 0.005, 95% CI 0.17 m² to 0.33 m²). When the CI is converted to a proportion of the 3.14 m² subplot area, it ranges from 5% to 11%, which is reasonable because our in-field assessment was based on visual estimation of percent wood cover within subplots.
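The conversion from area to percent of subplot is a single division, shown here with the reported interval limits. The helper name is hypothetical.

```python
SUBPLOT_AREA_M2 = 3.14  # 1-meter radius subplot area used in the text


def as_percent_of_subplot(area_m2, subplot_area=SUBPLOT_AREA_M2):
    """Express an area as a percentage of the subplot area."""
    return 100 * area_m2 / subplot_area


ci_m2 = (0.17, 0.33)  # reported 95% CI of the mean paired difference
ci_pct = tuple(round(as_percent_of_subplot(a), 1) for a in ci_m2)
print(ci_pct)  # (5.4, 10.5), i.e. about 5% to 11% of a subplot
```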

The random forest classifier was trained using data partitioned by 10 repetitions of 10-fold cross validation resulting in an average kappa of 0.76 (95% CI 0.75 to 0.77), accuracy of 0.89, and out-of-bag estimate of error rate of 11.15%.

Model performance is visualized in the confusion matrix (Table 2). The final model kappa was 0.75, with a balanced accuracy of 0.86, sensitivity of 0.94, and specificity of 0.78 where “wood” is taken to be the positive class.
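The reported metrics are related through the confusion matrix: balanced accuracy is the mean of sensitivity and specificity, so (0.94 + 0.78) / 2 = 0.86. The sketch below derives the metrics from binary counts. The counts are hypothetical, chosen only to reproduce the reported sensitivity and specificity because the table values are not available here, so the resulting kappa does not match the reported 0.75.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, balanced accuracy, accuracy, and Cohen's kappa
    from confusion-matrix counts, with "wood" as the positive class."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Hypothetical counts (not the study's confusion matrix).
metrics = binary_metrics(tp=94, fn=6, fp=22, tn=78)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```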

The correlation matrix (Figure 4) illustrates Spearman’s correlation coefficients, ρ [27]. The matrix visualizes correlations between the target parameter (i.e., manually delineated wood area, manAREA) and the 7 auxiliary variables. The highest correlation, 0.72, is between manually delineated wood area (manAREA, our target parameter) and wood area estimated by the RF model (RFAREA, one of the 7 auxiliary variables). As a result, we applied RF wood area as the auxiliary variable in both the difference estimator and the simple linear regression estimator (i.e., the regression estimator with a single auxiliary variable). Field-estimated wood area is not presented in this matrix because it is not a variable utilized in any of the four estimators.

Table 2. Confusion matrix: Bold cells indicate number of segments where the RF classifier prediction agreed with reference data manually classified by a human interpreter. White cells indicate misclassifications of the RF classifier.

Figure 4. Correlation Matrix: Cells contain Spearman’s correlation coefficients (ρ). Manually delineated wood area (manAREA) is the target parameter we are estimating. RFAREA is measured wood area from the RF classifier, and spectral variables are mean reflectance values and radiant temperature in the 40 sample hex plots for the spectral band and lwir band, respectively.

We calculated estimates of total wood area for the area of interest, with accompanying 95% confidence intervals for the four estimators (Table 3). We expected these estimates to be biased low compared to in situ LW area because remotely sensed data are subject to occlusion by canopy.

Table 3. Wood area estimates and 95% confidence intervals.

Estimated total LW area ranged from 16,593 to 17,283 m² with the SRSwoR producing the largest estimate and widest confidence interval. In contrast, GREG produced the smallest estimate of LW area and narrowest confidence interval.

The results indicate that as we incorporate more auxiliary information, we increase the precision of our estimated wood area. SRSwoR has the lowest precision of the design-based estimators examined in this study. However, this method is the most straightforward to implement, requiring only manual delineation of wood area in the subplots. This results in increased efficiency compared to the methods that require auxiliary information as such methods require GIS/image processing beyond that of the SRSwoR approach.
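The gain in precision can be quantified from the reported intervals for the two extreme cases. The sketch below computes the half-width of each 95% confidence interval and expresses it relative to the estimate. The numbers are the reported SRSwoR and GREG values, and the variable names are illustrative.

```python
# Reported estimates and 95% CIs in m^2: (estimate, lower, upper).
reported = {
    "SRSwoR": (17_283, 10_613, 23_952),
    "GREG": (16_593, 13_054, 20_133),
}

half_widths = {}
relative_pct = {}
for name, (estimate, lower, upper) in reported.items():
    half_widths[name] = (upper - lower) / 2
    relative_pct[name] = 100 * half_widths[name] / estimate
    print(f"{name}: half-width {half_widths[name]:.1f} m^2 "
          f"({relative_pct[name]:.1f}% of the estimate)")

# Ratio of the GREG interval width to the SRSwoR interval width.
width_ratio = half_widths["GREG"] / half_widths["SRSwoR"]
print(f"GREG interval is {100 * width_ratio:.0f}% as wide as the SRSwoR interval")
```

On these figures the GREG interval is roughly half as wide as the SRSwoR interval, with a half-width of about 21% of the estimate against about 39%.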

Based on kappa and accuracy metrics of the associated 10-rep 10-fold cross-validation used to produce the machine learning RF model, the model performed relatively well. A caveat of using accuracy metrics to assess model performance is the lack of the ability to objectively quantify an estimate associated with the model.

Design-based regression estimators do not assume a distribution for the population of hex plots. Wood area is considered a fixed parameter using these regression estimators, and bias is associated with the estimator itself. The difference estimator provides an interesting approach to incorporate wood modeled from a combination of image segmentation and a machine learning predictive model. Combining the results of the RF classifier with the difference estimator provides the added benefit of calculating estimator variance and the accompanying 95% confidence interval. Although this method has greater uncertainty compared to the other regression estimators, the difference estimator provides an unbiased estimate of total wood area.

The SLR estimator is an extension of the difference estimator and provides adjustments via β₀ and β₁ coefficients. It is unsurprising that this estimator reduces variance and narrows the associated 95% confidence interval when compared against the difference estimator.

Lastly, GREG with 7 auxiliary variables reduces variance even further by incorporating additional auxiliary information. In this step, we did not incorporate any transformations of bands in the form of indices, e.g., NDVI. However, GREG resulted in the most precise estimate compared to the other three estimators examined in the study. Generating auxiliary information from a zonal statistics calculation is a straightforward process in GIS or other software. However, producing a random forest model to classify image segments generated in eCognition requires careful implementation and some knowledge of machine learning. Therefore, the GREG estimator is also the most technically complex. This estimator can be simplified by removing the random forest component, but the certainty of the estimate will decrease.

The wide confidence intervals are indicative of both the complexity of the site and the novel nature of the method. The area includes tree canopies that obscure the ground surface, exposed gravel beds, geomorphic features like islands, varying water velocities, and wood that is submerged in water. These features confound the classifier, providing a challenging backdrop of overlapping spectral signatures. However, the data and associated estimates of LW area serve as a reference condition for future monitoring activities at the restoration site. To conduct spatio-temporal analyses of changes in LW area, these methods can be readily repeated, minimizing field work in a site that can be potentially hazardous to field crews. Additionally, repeated surveys of LW would otherwise not be feasible due to the vast amount of wood deposited during restoration activities. In this instance, a crew consisting of two members was able to sample 70 1-meter radius subplots over the course of five days. However, we acquired UAS imagery for the entirety of the 60-hectare site in approximately 1.5 hours. Image acquisitions with UAS can be conducted safely and on-demand, with minimal coordination required between pilot in command, local forest aviation officials, and other interest groups.

Prior to the implementation of Stage 0 restoration at the South Fork McKenzie River, there was little to no LW present in the stream. We expect the concentration of deposited LW will diminish through time as pieces are displaced downstream and decompose. The repeated implementation of the methods we propose would help quantify wood dynamics over time and imagery could be collected at a higher temporal resolution than is typical with traditional field sampling methods. Jennings [7] describes the potential for LW area to support macroinvertebrate biomass and secondary production. It follows that our estimates of LW may be used in conjunction with these estimates to quantify site-level potential macroinvertebrate biomass and secondary production.

One potential limitation to the approach presented in this paper is wood area that is occluded by canopy in the UAS orthomosaic. In areas occluded by canopy, we lack auxiliary information to estimate wood area. Further, we expect substantial recovery of the riparian vegetation in the restored area over time, so that wood in locations that are currently clearly visible to the UAS may be obscured by dense shrub or tree canopies in the future. We do not know how changes in the canopy cover of trees and shrubs might bias UAS-based measurements of large-wood area over time. Aerial LiDAR may be used to supplement aerial imagery, as pulses can penetrate the forest canopy and characterize attributes of the subcanopy, but LW can be difficult to discern from surrounding terrain [28] [29]. Queiroz et al. [30] demonstrated the potential of multispectral aerial LiDAR used with aerial imagery and image segmentation to improve classification accuracy of subcanopy wood area. Submerged wood pieces present additional challenges. However, we can partially account for wood that may lie below the surface of still water by modifying the stretched display of colors or viewing false-color composites where wood features are more prominent and delineating these polygons in sampled hex plots during the manual delineation step. Future studies may improve on these methods by estimating wood area in wetted areas by incorporating digital elevation models derived from LiDAR or Structure-from-Motion photogrammetry [31] and implementing imputation methods to generate the auxiliary information for the occluded area.

4. Conclusion

We demonstrated the applicability of sample-based estimators to provide estimates of LW area in riparian restoration projects. SRSwoR is the most straightforward to implement, requiring the least data manipulation, but it produces estimates with the least precision. By incorporating auxiliary information and machine learning classifiers, we can improve the precision of our estimates. Combining UAS aerial imagery with a machine learning classifier and producing estimates of woody material using sampling-based approaches offers an efficient way of quantifying material in restoration sites where LW is critical to habitat structure and flow regimes. Traditional field sampling methods are time-consuming and potentially dangerous, especially in high-flow conditions. A single skilled technician can carry out the UAS survey proposed in this paper over the course of a few hours, compared to the multiple days a field crew would need for a field sampling campaign of the same scale. We anticipate future studies can further improve upon these methods by incorporating additional auxiliary data such as vegetation indices and/or elevation data from LiDAR or SfM, thereby increasing precision and reducing dependence on machine learning classifiers.

Acknowledgements

This research was supported in part by an appointment to the United States Forest Service (USFS) Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the U.S. Department of Agriculture (USDA). ORISE is managed by ORAU under DOE contract number DE-SC0014664. All opinions expressed in this paper are the author’s and do not necessarily reflect the policies and views of USDA, DOE, or ORAU/ORISE.

The first author would like to express his sincere gratitude to Sarah Hinshaw (Colorado State University), Katie Nicolato (Oregon State University), and William Hirsch (Oregon State University) for their help conducting fieldwork. The first author is also grateful for the heads-up LW delineation performed by Kiefer Kreps. Lastly, we thank the people associated with the design, implementation, and monitoring of the Stage 0 restoration at the South Fork McKenzie River.

Supplement

1) Difference Estimator

The difference estimator was originally applied in an accounting context where book values were assumed to be reasonably explained by audit values [25]. We will hold the β coefficient constant at 1.

Using the difference estimator, we estimate the population total, $y_{T}$, as follows:

$\bar{y}_{T_{dif}} = \sum_{U} y_{k}^{0} + \sum_{S} \hat{D}_{k}$

where $\hat{D}_{k} = D_{k} / \pi_{k} = (y_{k} - y_{k}^{0}) / \pi_{k}$

and $\pi_{k} = \frac{n}{N}$.

We assume $y_{k}^{0} = x_{k}$.

The variance of the estimated population mean and total can be estimated as follows:

$\widehat{Var}(\bar{y}_{dif}) = \left(1 - \frac{n}{N}\right) \frac{1}{n} \frac{1}{n - 1} \sum_{i \in S} (y_{i} - y_{i}^{0})^{2}$

$\widehat{Var}(\bar{y}_{T_{dif}}) = N^{2} \times \widehat{Var}(\bar{y}_{dif})$

The difference estimator formula with modified notation is from Särndal et al. [32].
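
The difference estimator above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name, argument names, and toy values are hypothetical, the proxy $y_{k}^{0}$ is taken to be a single auxiliary value per plot, and the leading variance factor is assumed to be the finite population correction $1 - n/N$.

```python
def difference_estimator(y_sample, x_sample, x_population_total, N):
    """Estimate a population total with the difference estimator (beta = 1).

    y_sample: observed values y_k for the n sampled plots
    x_sample: proxy values y_k^0 = x_k for the same plots
    x_population_total: sum of the proxy values over all N plots
    N: number of plots in the population
    Returns (estimated total, estimated variance of the total).
    """
    n = len(y_sample)
    pi = n / N  # equal inclusion probability under SRSwoR
    diffs = [y - x for y, x in zip(y_sample, x_sample)]

    # Proxy total over the population plus the expanded sample differences.
    total = x_population_total + sum(d / pi for d in diffs)

    # Variance of the mean as written above (uncentred squared differences),
    # assuming the leading factor is the finite population correction 1 - n/N.
    var_mean = (1 - n / N) * (1 / n) * (1 / (n - 1)) * sum(d ** 2 for d in diffs)
    var_total = N ** 2 * var_mean
    return total, var_total


# Toy data (hypothetical, not from the study): 3 sampled plots out of 6.
total, var_total = difference_estimator(
    y_sample=[2.0, 4.0, 6.0],
    x_sample=[1.0, 5.0, 5.0],
    x_population_total=20.0,
    N=6,
)
print(total, var_total)
```

Because the slope is fixed at 1, no model fitting is needed; the estimator only corrects the population total of the proxy by the expanded sample differences.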

2) Simple Linear Regression (SLR) Estimator

First, we estimate our population mean $\bar{y}_{lr}$ and multiply by our population size $N$ to estimate the population total $y_{T_{lr}}$.

We calculated the population mean $\bar{y}_{lr} = b_{0} + b_{1} \mu_{x}$, where $b_{1} = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sum_{i = 1}^{n} (x_{i} - \bar{x})^{2}}$ and $b_{0} = \bar{y} - b_{1} \bar{x}$. Additionally, $\mu_{x}$ is the population mean of the auxiliary variable, and $\bar{x}$ and $\bar{y}$ are the sample means.

We estimate the variance of $\bar{y}_{lr}$ with:

$\widehat{Var}(\bar{y}_{lr}) = \frac{N - n}{N \times n} \cdot \frac{\sum_{i = 1}^{n} (y_{i} - b_{0} - b_{1} x_{i})^{2}}{n - 2} = \frac{N - n}{N \times n} \cdot MSE$

Then, we estimate total wood area for the population of wetted pixels: $\bar{y}_{T_{lr}} = \bar{y}_{lr} \times N$

Last, we estimate the variance of $\bar{y}_{T_{lr}}$ with:

$\widehat{Var}(\bar{y}_{T_{lr}}) = N^{2} \, \widehat{Var}(\bar{y}_{lr}) = \frac{N \times (N - n)}{n} \cdot MSE$

The SLR equations for the population mean and beta coefficients above originate from Avery and Burkhart [33], with modified notation. The variance equations are modified from the linear regression estimator lesson on the Penn State STAT 506 website [34].
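
As an illustration of the SLR estimator steps above, the following Python sketch computes the coefficients, the estimated mean and total, and their variances. The function and argument names and the toy values are hypothetical and are not taken from the study.

```python
def slr_estimator(y, x, mu_x, N):
    """Simple linear regression estimator of a population mean and total.

    y, x: response and auxiliary values for the n sampled plots
    mu_x: population mean of the auxiliary variable
    N: number of plots in the population
    Returns (mean, var_mean, total, var_total).
    """
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares slope and intercept from the sample.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Regression estimate of the population mean.
    mean_lr = b0 + b1 * mu_x

    # Mean squared error of the residuals with n - 2 degrees of freedom.
    mse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)

    var_mean = (N - n) / (N * n) * mse
    total_lr = N * mean_lr
    var_total = N ** 2 * var_mean
    return mean_lr, var_mean, total_lr, var_total


# Toy data (hypothetical): 4 sampled plots from a population of 10.
mean_lr, var_mean, total_lr, var_total = slr_estimator(
    y=[2.0, 4.0, 5.0, 9.0], x=[1.0, 2.0, 3.0, 4.0], mu_x=3.0, N=10
)
print(mean_lr, total_lr, var_total)
```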

3) General Regression (GREG) Estimator with Multiple Auxiliary Variables

$\bar{y}_{GREG} = b_{0} + b_{1} \mu_{blue} + b_{2} \mu_{green} + b_{3} \mu_{red} + b_{4} \mu_{rededge} + b_{5} \mu_{nir} + b_{6} \mu_{lwir} + b_{7} \mu_{RFArea}$

$\bar{y}_{T_{GREG}} = \bar{y}_{GREG} \times N$

GREG estimated variance

The equations below give the estimated variance of the GREG estimator. Equations are from McConville et al. [35] with modified notation.

$\widehat{Var}(\bar{y}_{GREG}) = \left(1 - \frac{n}{N}\right) \frac{1}{n} \frac{1}{n - 1} \sum_{i \in s} (y_{i} - \hat{m}(x_{i}))^{2}$

where $\hat{m}(x_{i})$ is the sample-estimated prediction for plot $i$.

$\widehat{Var}(\bar{y}_{T_{GREG}}) = N^{2} \times \widehat{Var}(\bar{y}_{GREG})$
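
A minimal Python sketch of the GREG estimator under simple random sampling follows. It assumes the coefficients are ordinary least-squares fits to the sampled plots and that the second variance factor is $1/n$, consistent with the difference-estimator variance above. The helper names and toy values are hypothetical, and a small Gaussian elimination routine stands in for a linear algebra library so that the example stays self-contained.

```python
def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * m
    for r in range(m - 1, -1, -1):
        z[r] = (M[r][m] - sum(M[r][c] * z[c] for c in range(r + 1, m))) / M[r][r]
    return z


def greg_estimator(y, X, mu_X, N):
    """GREG estimate of a population total under SRSwoR.

    y: response for the n sampled plots
    X: one row of auxiliary values per sampled plot
    mu_X: population means of the auxiliary variables (same order as X columns)
    N: number of plots in the population
    Returns (estimated total, estimated variance of the total).
    """
    n = len(y)
    design = [[1.0] + list(row) for row in X]  # prepend the intercept column
    p = len(design[0])

    # Normal equations (X'X) b = X'y for the least-squares coefficients.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    b = solve_linear_system(xtx, xty)

    mean_greg = b[0] + sum(bj * mj for bj, mj in zip(b[1:], mu_X))

    # Residuals y_i - m_hat(x_i) on the sampled plots.
    fitted = [sum(bj * v for bj, v in zip(b, r)) for r in design]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

    # Assumes the 1/n factor in the variance of the mean.
    var_mean = (1 - n / N) * (1 / n) * (1 / (n - 1)) * sse
    return N * mean_greg, N ** 2 * var_mean


# Toy data (hypothetical): 5 sampled plots, 2 auxiliary variables, N = 20.
total_greg, var_total_greg = greg_estimator(
    y=[1.0, 3.0, 4.0, 6.0, 8.0],
    X=[[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]],
    mu_X=[1.0, 0.5],
    N=20,
)
print(total_greg, var_total_greg)
```

With the seven auxiliary variables used here, each row of `X` would hold the per-plot band summaries and the RF-classified wood area, and `mu_X` their means over all plots in the population.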

4) SRSwoR

We estimate the population total, $\tau$, as follows:

$\hat{\tau} = \frac{N}{n} \sum_{i = 1}^{n} y_{i}$

Our sample variance $\hat{S}_{y}^{2}$ is estimated as follows, where $\hat{\mu} = \frac{1}{n} \sum_{i = 1}^{n} y_{i}$ is the sample mean:

$\hat{S}_{y}^{2} = \frac{1}{n - 1} \sum_{i = 1}^{n} (y_{i} - \hat{\mu})^{2}$

Lastly, we estimate the variance $V(\hat{\tau})$:

$\hat{V}(\hat{\tau}) = (N^{2} - N n) \frac{\hat{S}_{y}^{2}}{n} = fpc \cdot \frac{N^{2} \hat{S}_{y}^{2}}{n}$

where $fpc = 1 - \frac{n}{N}$ is the finite population correction.
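
The SRSwoR estimator needs only the sampled values and the population size, as the short Python sketch below illustrates. The function name and toy values are hypothetical and are not taken from the study.

```python
import math


def srswor_total(y, N):
    """Estimate a population total from a simple random sample without replacement.

    y: observed values for the n sampled plots
    N: number of plots in the population
    Returns (estimated total, estimated variance, standard error).
    """
    n = len(y)
    mu_hat = sum(y) / n                       # sample mean
    tau_hat = N * mu_hat                      # expanded total, (N / n) * sum(y)
    s2 = sum((yi - mu_hat) ** 2 for yi in y) / (n - 1)  # sample variance
    var_tau = (N ** 2 - N * n) * s2 / n       # includes the fpc, 1 - n/N
    return tau_hat, var_tau, math.sqrt(var_tau)


# Toy data (hypothetical): 3 sampled plots out of 6.
tau_hat, var_tau, se_tau = srswor_total([2.0, 4.0, 6.0], N=6)
print(tau_hat, var_tau, se_tau)
```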

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Swanson, F.J. and Lienkaemper, G.W. (1978) Physical Consequences of Large Organic Debris in Pacific Northwest Streams. USDA Forest Service General Technical Report PNW-GTR-069, U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station, Portland, 1-12.
[2]	Sass, G.G. (2009) Coarse Woody Debris in Lakes and Streams. In: Likens, G.E., Ed., Encyclopedia of Inland Waters, Academic Press, Cambridge, 60-69. https://doi.org/10.1016/B978-012370626-3.00221-0
[3]	Wohl, E., Cenderelli, D.A., Dwire, K.A., Ryan-Burkett, S.E., Young, M.K. and Fausch, K.D. (2010) Large In-Stream Wood Studies: A Call for Common Metrics. Earth Surface Processes and Landforms, 35, 618-625. https://doi.org/10.1002/esp.1966
[4]	Faustini, J.M. and Jones, J.A. (2003) Influence of Large Woody Debris on Channel Morphology and Dynamics in Steep, Boulder-Rich Mountain Streams, Western Cascades, Oregon. Geomorphology, 51, 187-205. https://doi.org/10.1016/S0169-555X(02)00336-7
[5]	Flitcroft, R.L., Brignon, W.R., Staab, B., Bellmore, J.R., Burnett, J., Burns, P., Cluer, B., Giannico, G., Helstab, J.M., Jennings, J., Mayes, C., Mazzacano, C., Mork, L., Meyer, K., Munyon, J., Penaluna, B.E., Powers, P., Scott, D.N. and Wondzell, S.M. (2022) Rehabilitating Valley Floors to a Stage 0 Condition: A Synthesis of Opening Outcomes. Frontiers in Environmental Science, 10, Article 892268. https://doi.org/10.3389/fenvs.2022.892268
[6]	Hinshaw, S., Wohl, E., Burnett, J.D. and Wondzell, S. (2022) Development of a Geomorphic Monitoring Strategy for Stage 0 Restoration in the South Fork McKenzie River, Oregon, USA. Earth Surface Processes and Landforms, 47, 1937-1951. https://doi.org/10.1002/esp.5356
[7]	Jennings, J.C. (2022) Effects of Stage 0 Stream Restoration on Aquatic Macroinvertebrate Production. Oregon State University, Corvallis.
[8]	Bureau of Land Management (2021) AIM National Aquatic Monitoring Framework: Field Protocol for Wadeable Lotic Systems. Technical Reference 1735-2, Version 2. U.S. Department of the Interior, Bureau of Land Management, National Operations Center, Denver, CO.
[9]	Harman, W.A., Barrett, T.B., Jones, C.J., James, A. and Peel, H.M. (2017) Application of the Large Woody Debris Index: A Field User Manual Version 1. Stream Mechanics and Ecosystem Planning & Restoration, Raleigh.
[10]	Woodall, C. and Williams, M.S. (2005) Sampling Protocol, Estimation, and Analysis Procedures for the Down Woody Materials Indicator of the FIA Program. General Technical Report NC-256. U.S. Department of Agriculture, Forest Service, North Central Research Station, St. Paul, 47 p. https://doi.org/10.2737/NC-GTR-256
[11]	Woldendorp, G., Keenan, R.J., Barry, S. and Spencer, R.D. (2004) Analysis of Sampling Methods for Coarse Woody Debris. Forest Ecology and Management, 198, 133-148. https://doi.org/10.1016/j.foreco.2004.03.042
[12]	Queiroz, G.L., McDermid, G.J., Castilla, G., Linke, J. and Rahman, M.M. (2019) Mapping Coarse Woody Debris with Random Forest Classification of Centimetric Aerial Imagery. Forests, 10, Article No. 471. https://doi.org/10.3390/f10060471
[13]	Powers, P.D., Helstab, M. and Niezgoda, S.L. (2019) A Process-Based Approach to Restoring Depositional River Valleys to Stage 0, an Anastomosing Channel Network. River Research and Applications, 35, 3-13. https://doi.org/10.1002/rra.3378
[14]	DJI (2020) DJI Pilot-DJI Download Center-DJI. https://www.dji.com/downloads/djiapp/dji-pilot
[15]	Agisoft (2019) Agisoft Metashape. https://www.agisoft.com/
[16]	ESRI (2021) ArcGIS Pro. Esri Inc., Redlands, CA. https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview
[17]	Zhao, X. and Grafström, A. (2020) A Sample Coordination Method to Monitor Totals of Environmental Variables. Environmetrics, 31, e2625. https://doi.org/10.1002/env.2625
[18]	Sibaruddin, H.I., Shafri, H.Z.M., Pradhan, B. and Haron, N.A. (2018) Comparison of Pixel-Based and Object-Based Image Classification Techniques in Extracting Information from UAV Imagery Data. IOP Conference Series: Earth and Environmental Science, 169, Article ID: 012098. https://doi.org/10.1088/1755-1315/169/1/012098
[19]	Rouse, J.W., Haas, R.H., Schell, J.A. and Deering, D.W. (1974) Monitoring Vegetation Systems in the Great Plains with ERTS. 3rd ERTS Symposium, Washington DC, 10-14 December 1974, 309-317.
[20]	Lillesand, T.M., Kiefer, R.W. and Chipman, J.W. (2015) Remote Sensing and Image Interpretation. 7th Edition. John Wiley & Sons, Inc., Hoboken.
[21]	Han, Q. and Niu, Z. (2020) Construction of the Long-Term Global Surface Water Extent Dataset Based on Water-NDVI Spatio-Temporal Parameter Set. Remote Sensing, 12, Article No. 2675. https://doi.org/10.3390/rs12172675
[22]	Trimble (2022) Reference Book: Trimble eCognition Developer for Windows Operating System. Trimble Germany GmbH, Munich.
[23]	Breiman, L. (2001) Random Forests. Machine Learning, 45, 5-32. https://doi.org/10.1023/A:1010933404324
[24]	Cutler, D.R., Edwards, T.C., Beard, K.H., Cutler, A., Hess, K.T., Gibson, J. and Lawler, J.J. (2007) Random Forests for Classification in Ecology. Ecology, 88, 2783-2792. https://doi.org/10.1890/07-0539.1
[25]	Godfrey, J., Roshwalb, A. and Wright, R.L. (1984) Model-Based Stratification in Inventory Cost Estimation. Journal of Business & Economic Statistics, 2, 1-9. https://doi.org/10.1080/07350015.1984.10509365
[26]	Thompson, S.K. (2012) Sampling. 3rd Edition, Wiley, Hoboken. https://doi.org/10.1002/9781118162934
[27]	Wei, T. and Simko, V. (2021) R Package “Corrplot”: Visualization of a Correlation Matrix.
[28]	Campbell, M.J., Dennison, P.E., Hudak, A.T., Parham, L.M. and Butler, B.W. (2018) Quantifying Understory Vegetation Density Using Small-Footprint Airborne Lidar. Remote Sensing of Environment, 215, 330-342. https://doi.org/10.1016/j.rse.2018.06.023
[29]	Pesonen, A., Maltamo, M., Eerikäinen, K. and Packalén, P. (2008) Airborne Laser Scanning-Based Prediction of Coarse Woody Debris Volumes in a Conservation Area. Forest Ecology and Management, 255, 3288-3296. https://doi.org/10.1016/j.foreco.2008.02.017
[30]	Queiroz, G.L., McDermid, G., Linke, J., Hopkinson, C. and Kariyeva, J. (2020) Estimating Coarse Woody Debris Volume Using Image Analysis and Multispectral LiDAR. Forests, 11, Article No. 141. https://doi.org/10.3390/f11020141
[31]	Westoby, M.J., Brasington, J., Glasser, N.F., Hambrey, M.J. and Reynolds, J.M. (2012) “Structure-from-Motion” Photogrammetry: A Low-Cost, Effective Tool for Geoscience Applications. Geomorphology, 179, 300-314. https://doi.org/10.1016/j.geomorph.2012.08.021
[32]	Särndal, C.-E., Swensson, B. and Wretman, J. (2003) Model Assisted Survey Sampling. Springer Series in Statistics, Springer, New York.
[33]	Avery, T.E. and Burkhart, H.E. (2002) Forest Measurements. 5th Edition, McGraw-Hill Series in Forest Resources, McGraw-Hill, Boston.
[34]	PennState: Statistics Online Courses (n.d.) 5.1-Linear Regression Estimator: STAT 506. https://online.stat.psu.edu/stat506/lesson/5/5.1
[35]	McConville, K.S., Moisen, G.G. and Frescino, T.S. (2020) A Tutorial on Model-Assisted Estimation with Application to Forest Inventory. Forests, 11, Article No. 244. https://doi.org/10.3390/f11020244
