The purpose of this paper is to address the uncertainties associated with widely-held practices for deriving land cover models from single wavelength, discrete return light detection and ranging (LiDAR) data. Accurate and high-resolution models of land cover classification are crucial for a variety of disciplines including forestry and vegetation mapping (Hansen et al. 2013, Xie et al. 2008), ecology and conservation monitoring (Nagendra et al. 2013, Wang et al. 2010), mapping and monitoring wetlands (Klemas 2013), and monitoring changes in land cover and land use (Rogan and Chen 2004).

Airborne LiDAR surveys have been performed since the 1990s (Baltsavias 1999, Wehr and Lohr 1999). Such surveys have traditionally been used to generate accurate digital elevation models (DEMs); their potential for classifying land cover was first proposed by Song et al. (2002). Since then, land cover classification using LiDAR data has expanded to include a variety of land cover classes and sensor types, including multispectral (Morsy and Larocque 2016, Wang et al. 2014, Wichmann et al. 2015) and full-waveform instruments (Heinzel and Koch 2011, Neuenschwander et al. 2009, Ni-Meister et al. 2018, Reitberger et al. 2008). However, discrete return LiDAR instruments have been the most commonly used systems and remain prevalent today. If methodologies are established for the accurate classification of land cover from LiDAR data, then value can be added to archived LiDAR data, originally collected for the generation of DEMs, by extracting land cover. Airborne laser scanning (ALS) surveys also show significant flexibility in the available resolution of LiDAR data, since surveys can be customized based on the study requirements. Therefore, it is important to ensure that the methodology followed to derive land cover models from LiDAR data retains the highest resolution achievable, and that uncertainties and artefacts of common classification practices are well understood.

State-of-practice methodologies for classifying LiDAR data work directly with tiled data, using raster surfaces and object-based image analysis techniques. However, it has been proposed that the impact of scan angle on the intensity response of non-Lambertian reflectors may be mitigated by organizing the data by flightstrip (Crasto et al. 2015), and studies have indicated that direct classification of the LiDAR point cloud can lead to high accuracy models of the landscape without the loss of resolution that commonly occurs when using LiDAR-derived raster surfaces (Brzank et al. 2008, Höfle et al. 2009). Object-based image analysis (OBIA) techniques have been established as having higher accuracy than pixel-based image analysis (PBIA) techniques in high-resolution imagery (Blaschke 2010). However, the impact of the choice of image analysis technique on the configuration of land cover in the final land cover models derived from LiDAR data has not been investigated. Such differences may be relevant to other scientific fields, such as landscape ecology and landscape genetics, where both the accuracy of landscape models and the configuration of landscape features are highly relevant in analyses of the relationships between species and the environment. Current methods to address systematic errors in recorded intensity within a LiDAR survey due to range and incident angle typically assume that targets are Lambertian reflectors (Tan and Cheng 2017). However, specular reflectors such as open water bodies are common in natural landscapes.
Therefore, this paper aims to test the hypothesis posited by Crasto et al. (2015) that organizing LiDAR data into flightstrips prior to classification may minimize classification errors in non-Lambertian reflectors. The intensity response of a specular reflector is dependent on the angle of incidence of the LiDAR signal. In calm open water bodies, the intensity response would therefore be correlated with the scan angle of the LiDAR instrument. At low scan angles, water would be expected to have a high intensity signature, whereas at higher scan angles (>3-5 degrees) most of the signal would be reflected away from the sensor and water would exhibit a low intensity signature. While data collected from adjacent flightstrips are superimposed when data are organized into tiles, organizing the data by flightstrip prevents these two disparate intensity signatures of water bodies, from neighbouring flightstrips, from overlapping, and may therefore improve the classification accuracy of specular reflectors.

Early land cover models derived from 3D LiDAR point clouds rasterized the derived positional and intensity data prior to classification (Bakula et al. 2016, Brennan and Webster 2006, Charaniya et al. 2004, Crasto et al. 2015, Song et al. 2002). This has the advantage of averaging multiple points per pixel, and enables the derivation of integrative LiDAR parameters such as point density and variation in elevation (Charaniya et al. 2004, Crasto et al. 2015). However, it may also result in a loss of information, such as the raw intensity per point, and a subsequent lowering of resolution as data from individual points are averaged over the area of a pixel. While some studies have investigated the possibility of classifying the LiDAR point cloud directly (Brzank et al. 2008, Höfle et al. 2009, Malinowski et al. 2016), these studies focus only on distinguishing between land and water.

It is well established that object-based image analysis (OBIA) outperforms pixel-based image analysis (PBIA) in the classification of high resolution images (Blaschke 2010), a trend that has been verified when deriving land cover from LiDAR data (El-Ashmawy et al. 2011). However, differences in the unit of analysis between the two techniques may lead to significant differences in the distribution and configuration of land cover in the final model. In PBIA the unit of analysis is the pixel, whereas in OBIA it is an object composed of a group of pixels. This difference may influence the spatial configuration of land cover within a study area. In applications where the configuration of land cover is also significant in the analysis of the landscape, such as landscape genetics (Manel et al. 2003), these differences may have significant effects on downstream analyses.

The state of practice for determining the veracity of land cover models derived from LiDAR data is to provide a measure of accuracy, precision, recall, and/or kappa coefficient. However, other qualitative characteristics of landscapes may be of interest when classifying land cover. The configuration of landscape features within a study area is relevant for a variety of scientific fields including ecology, landscape ecology and landscape genetics. The field of landscape genetics seeks to correlate observed patterns of genetic structuring with landscape features hypothesized to affect the movement or gene flow of species (Manel et al. 2003). For this purpose, an accurate representation of the configuration of landscape features is crucial.
For instance, if it is hypothesized that a species' movement or gene flow is inhibited by a road, then the contiguity of the road is essential for accurately modelling it as an uninterrupted feature. Similarly, if wetlands are good habitat and rock barrens are poor habitat for a particular species, then frequent misclassification between these two land cover classes could result in an under- or over-estimation of suitable habitat. This may in turn impact studies of the species' ecology in the study area or conservation management strategies. Therefore, it is important to not only accurately model the landscape, but to do so in a way that maintains the configuration and contiguity of the land cover classes.

In this paper I compare six methodologies to classify land cover from airborne LiDAR data collected using a single wavelength, discrete return system. Three processing decisions that must be made during the classification procedure, and which have been untested or under-tested in the literature, were investigated: (i) use of data organized as tiles or flightstrips, (ii) use of raster surfaces or point clouds, and (iii) use of point- or object-based image analysis techniques.

Authors
  • Beaulne, Danielle
  • Fotopoulos, Georgia
  • Lougheed, Stephen C.

Summary

Recent technological advancements in next-generation sequencing techniques, enabling the use of thousands of genetic markers across an individual's genome, and continued improvements to the spatial resolution and information content of remote sensing data, present a unique opportunity to investigate the finest geographic scale at which genetic structuring within and between populations becomes detectable. However, in order to exploit the integration of high-resolution genetic and geospatial data, an understanding of the uncertainties associated with both data sets, and of the accuracy requirements of landscape genetic analyses of geospatial models, is required. In this thesis, I begin by highlighting some sources of uncertainty and bias in landscape models derived from high-resolution LiDAR data that have the potential to impact downstream analyses of the correlation between patterns of genetic structuring and landscape heterogeneity. I then investigate the patterns of genetic structuring between breeding aggregations of the spring peeper, Pseudacris crucifer.

I discovered that while different methodologies to derive land cover from airborne LiDAR data may result in similar overall accuracy, the configuration of heterogeneity within the landscape, and class-specific recall and precision, differed between models. A significant finding is that some classification methodologies did not accurately represent the contiguity of a road, which is often considered a putative barrier for amphibians. While ddRADseq could not resolve signatures of fine-scale genetic differentiation between breeding aggregations of Pseudacris crucifer within distances of <10 km, some differentiation between sampling locations separated by 60 km was detected. This grants some insight into the scale of genetic structuring of Pseudacris crucifer, and provides some representation of hylids in the population genetics literature. Ultimately this thesis highlights the importance of communication and collaboration between biologists and geospatial scientists to ensure both the optimal modeling of heterogeneity within landscapes, addressing a wider array of applications in ecology and landscape genetics, and an accurate representation of uncertainty in geospatial models in ecological and landscape genetic analyses.

Methodology

2.2.1 Study area and data description

An airborne LiDAR survey was performed from June 10-11, 2015 using a fixed-wing aircraft and an Optech Gemini Airborne Laser Terrain Mapper, a discrete return system that emits a laser pulse at 1064 nm and is capable of recording up to four return signals. The resultant point cloud had a density of 1 point/m² and a precision of ±15.0 cm and ±30.0 cm in open and vegetated areas, respectively. The study was performed over a subset of properties at the Queen's University Biological Station (QUBS), which covers 34 km² in South Frontenac County approximately 40 km northeast of Kingston, Ontario, Canada (see Figure 2.1). QUBS and environs comprise a reasonably pristine, heterogeneous landscape, including deciduous and mixed forests, inundated forests, various types of wetlands, lakes, rivers, streams, barren rock outcrops, and some fields. Deciduous forests, open water and wetlands dominate the landscape. Recent anthropogenic alterations to the landscape are typically minor and include the addition of paved two-lane roads, gravel lanes, some scattered buildings, hydro lines, and some maintained fields.

2.2.2 Classification workflow

A total of six land cover models were generated, the result of independently testing the impacts of three processing decisions on the accuracy and configuration of the resulting models. Classification parameters that were not tested in this study were held constant between models (with few exceptions). Figure 2.2 outlines the classification decisions that were tested in this study, and summarizes the workflow established to derive each land cover model. The classification decisions tested are: the direct classification of a 3-dimensional point cloud (POINT) versus classification of a derived raster image (PIXEL), the organization of data into flightstrips (FLIGHTSTRIP) versus tiles (TILE), and the use of pixel- (PXL-BASED) versus object-based (OBJ-BASED) image classification techniques. The final models generated from these methodologies are compared by analyzing their overall accuracy and class-specific recall and precision. However, a numerical value for accuracy may not always convey the distribution of uncertainty within models. Therefore, differences in the configuration of landscape features between models are also compared.
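As a minimal illustration of how these three binary decisions yield six, rather than eight, models (object-based analysis is only applied to rasterized data; see Section 2.2.2.3), the configurations compared in this chapter can be enumerated as follows. The sketch is purely illustrative and simply reuses the labels of Figure 2.2.

```python
from itertools import product

# The three processing decisions tested in this study. Object-based
# analysis is only applied to rasterized data (Section 2.2.2.3), so the
# 2 x 2 x 2 = 8 combinations reduce to six land cover models.
ORGANIZATIONS = ["TILE", "FLIGHTSTRIP"]
DATA_UNITS = ["POINT", "PIXEL"]
ANALYSES = ["PXL-BASED", "OBJ-BASED"]

models = [
    (org, unit, analysis)
    for org, unit, analysis in product(ORGANIZATIONS, DATA_UNITS, ANALYSES)
    if not (unit == "POINT" and analysis == "OBJ-BASED")
]

for model in models:
    print(" / ".join(model))  # prints the six model configurations
```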

2.2.2.1 Land types

In all models, eight land cover classes were chosen to be classified. These classes were chosen based on their anticipated influence on the dispersal of Pseudacris crucifer, informed by knowledge of their ecology and expert opinion, with additional land cover classes included to test the limits of the potential for LiDAR to discriminate between similar land cover classes. These land covers are: fields, mixed and coniferous forests, deciduous forests, inundated forests, rock barrens, wetlands, open water and roads. Preliminary tests to sub-classify wetlands into cattail marshes and other wetlands indicated a limited ability to differentiate between these classes, and thus these were merged into one 'wetland' class. Discrimination between coniferous and deciduous forests using LiDAR-derived data is best performed when the LiDAR survey is collected in leaf-off conditions (Liang and Matikainen 2007), but was attempted in this study to test the potential of LiDAR data to differentiate similar land cover classes in sub-optimal conditions.

2.2.2.2 Data processing

Radiometric correction was not performed on the LiDAR data. Studies by Habib et al. (2011) and Yan et al. (2012) suggested that performing geometric calibration and radiometric correction can improve the accuracy of a land cover model by up to 7% and 12%, respectively. However, the overall accuracies of the land cover models in those studies do not exceed 70%, and they do not incorporate other parameters derived from the LiDAR data, such as textural information. Im et al. (2008) demonstrated that >90% accuracy in land cover classification of LiDAR data can be achieved without radiometric correction or geometric calibration. It was assumed that since the study area is largely flat, range differences between points would have minimal impact on the recorded intensity of the LiDAR pulse, rendering radiometric correction unnecessary. Studies in areas with more highly variable topography, such as the Alps, benefit more from this type of calibration (Höfle et al. 2009). The incorporation of textural and point density information derived from the LiDAR data can also improve the accuracy of land cover classification without the need for geometric calibration or radiometric correction. Performing radiometric correction on these LiDAR data may improve accuracy, but it would likely affect all six models similarly and thus not affect the comparative results of this study.

Data were available segmented into non-overlapping tiles of up to 1 km × 1 km. When classifying by tile, no modifications to the data were made. When classifying by flightstrip, data were separated into their respective flightstrips. This resulted in overlapping data sets since flightstrip overlap was approximately 50%. For point-based classification, the LiDAR data were classified directly. For classification of rasterized LiDAR data, points were gridded for each parameter of interest. A resolution of 2 m was used for each raster surface, which optimized the resolution while minimizing the presence of no-data cells and the requirement to interpolate, which could introduce another source of uncertainty.

Parameters derived from the LiDAR data were held constant for the direct classification of the point cloud and the classification of the rasterized data, with some differences due to the nature of each unit of analysis. Eight parameters were used for the classification of both the rasterized data and the point cloud (Table 2.1). These were selected based on their ability to distinguish between land cover classes, and justifications for each parameter are outlined in Table 2.1. Intensity has been established as a parameter for discriminating between surfaces since Song et al. (2002) first tested its viability, and has been used extensively since (Amolins et al. 2008, Antonarakis et al. 2008, Brennan and Webster 2006, Charaniya et al. 2004, Chasmer et al. 2014, Chen and Gao 2014, Crasto et al. 2015, El-Ashmawy et al. 2011, Höfle et al. 2009, Im et al. 2008, Lang and McCarty 2009, Yan et al. 2012). Intensity-derived metrics were also included in this study to measure the variability in intensity of different land cover classes (Antonarakis et al. 2008, Crasto et al. 2015, Goodale et al. 2007, Höfle et al. 2009). Point density is also an established LiDAR parameter for distinguishing between land cover classes, particularly between water and land (Brzank et al. 2008, Crasto et al. 2015, Höfle et al. 2009).
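The gridding step described above can be sketched as follows. This is a simplified illustration, assuming the returns are available as coordinate and intensity arrays, and it computes only two of the eight parameters (point density and mean intensity) at the 2 m cell size used in this study; it is not the implementation used to generate the rasters.

```python
import numpy as np

def grid_lidar(x, y, intensity, cell=2.0):
    """Grid LiDAR returns onto a regular raster at the given cell size.

    A simplified sketch: computes only per-cell point density and mean
    intensity, and leaves empty cells as NaN rather than interpolating."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # Row/column index of each return at the chosen resolution.
    col = ((x - x.min()) // cell).astype(int)
    row = ((y.max() - y) // cell).astype(int)
    nrow, ncol = row.max() + 1, col.max() + 1

    count = np.zeros((nrow, ncol))
    intensity_sum = np.zeros((nrow, ncol))
    np.add.at(count, (row, col), 1)                  # returns per cell
    np.add.at(intensity_sum, (row, col), intensity)  # summed intensity per cell

    with np.errstate(invalid="ignore", divide="ignore"):
        mean_intensity = np.where(count > 0, intensity_sum / count, np.nan)
    density = count / (cell * cell)                  # points per square metre
    return density, mean_intensity
```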
Textural information derived from LiDAR point clouds is often used to complement land cover classification of optical remotely sensed data, but has also been used directly in the classification of LiDAR data into land cover models (Amolins et al. 2008, Antonarakis et al. 2008, Brennan and Webster 2006, Brzank and Heipke 2006, Charaniya et al. 2004, Chen and Gao 2014, Crasto et al. 2015, El-Ashmawy et al. 2011, Goodale et al. 2007, Im et al. 2008, Höfle et al. 2009). Metrics that can only be derived from the point cloud, including the point classification, return number, and the number of returns per pulse, cannot effectively be converted to a raster and so have not been used in the classification of rasterized LiDAR data. Direct classification of the LiDAR point cloud can incorporate these parameters, an approach explored in one study by Amolins et al. (2008), which reached approximately 75% accuracy. Crasto et al. (2015) also included scan angle in their classification scheme to account for the variability in the intensity signature of open water, a property of specular reflectors.
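For the direct classification of the point cloud, per-point analogues of these parameters can be derived from a local neighbourhood around each return. The following sketch is illustrative only: the 2 m planimetric search radius and the reduced feature set are assumptions for demonstration, not the full parameter list of Table 2.1.

```python
import numpy as np
from scipy.spatial import cKDTree

def point_features(xy, intensity, return_num, num_returns, radius=2.0):
    """Derive a reduced per-point feature set for direct classification
    of the point cloud (illustrative; radius and features are assumed)."""
    xy = np.asarray(xy, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # Local planimetric neighbourhood around each return.
    tree = cKDTree(xy)
    neighbours = tree.query_ball_point(xy, r=radius)

    mean_i = np.empty(len(xy))
    std_i = np.empty(len(xy))
    density = np.empty(len(xy))
    for i, idx in enumerate(neighbours):
        vals = intensity[idx]
        mean_i[i] = vals.mean()                        # local mean intensity
        std_i[i] = vals.std()                          # local intensity variability
        density[i] = len(idx) / (np.pi * radius ** 2)  # points per square metre

    # Return-based metric that cannot meaningfully be rasterized.
    return_ratio = np.asarray(return_num) / np.asarray(num_returns)

    return np.column_stack([intensity, mean_i, std_i, density, return_ratio])
```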

2.2.2.3 Classification

Classification of all models was performed using a supervised random forest approach (Breiman 2001). Random forests perform as well as or better than other machine learning algorithms, including support vector machines (Pal 2005), decision trees and naïve Bayes (Malinowski et al. 2016), neural nets, logistic regression, bagged trees, boosted trees and boosted stumps (Caruana and Niculescu-Mizil 2006). Random forests also have the added benefits of high performance without calibration (Caruana and Niculescu-Mizil 2006), speed (Gislason et al. 2006), lower sensitivity to the selection of training data, and robustness to the dimensionality of the input dataset (Belgiu and Drăguț 2016). Random forest classifiers have also been used extensively for the purpose of land cover classification (Belgiu and Drăguț 2016, Guo et al. 2011, Malinowski et al. 2016, Rodriguez-Galiano et al. 2012).

Comparisons of the relative performance of pixel- and object-based image analysis techniques for land cover classification from airborne LiDAR data have indicated higher performance from object-based techniques (El-Ashmawy et al. 2011), an expected result when working with high-resolution imagery (Blaschke 2010). While Höfle et al. (2009) proposed a method for object-based image analysis of discrete-return LiDAR point data for water/land classification, this methodology adopts a region-growing algorithm, which is difficult to apply when classifying multiple landscape features over a large spatial extent. Region-growing algorithms are seeded with points that meet a certain threshold; for instance, Höfle et al. (2009) used a threshold of >90% intensity density for seeding water regions. In models with multiple land cover classes that overlap in parameter space, this method would likely be prone to misclassification. A more adaptable approach to the direct classification of the LiDAR point cloud adopts the use of machine learning algorithms in a method similar to pixel-based image analysis of raster data, as performed by Malinowski et al. (2016) using full-waveform LiDAR data. Therefore, in this study OBIA was only performed on the rasterized LiDAR surfaces, while pixel- and point-based image analysis techniques were applied to the rasterized surfaces and the point cloud, respectively. Pixel-based image analysis was performed on the rasterized LiDAR surfaces as a more comparable analog to direct classification of the point cloud than OBIA.

When classifying LiDAR data organized into non-overlapping tiles, data from all tiles were merged such that the entire study area was classified simultaneously. This enabled the maximum geographic distribution of training and testing data, and was assumed to capture the most variation in spectral and textural response of each land cover class. When classifying LiDAR data organized into overlapping flightstrips, each flightstrip was classified individually. Since not all land cover classes are uniformly distributed throughout the study area, this resulted in some flightstrips containing only a subset of the eight land cover classes being classified in this study. Only deciduous forests, open water, rock barrens, and wetlands were distributed throughout the study area. Classification by flightstrip also resulted in overlapping land cover models.
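A minimal sketch of the supervised classification step, using scikit-learn's random forest implementation, is given below. The feature matrices, label vector and hyperparameter values are illustrative assumptions rather than the exact configuration used in this study; the per-class probabilities are retained because they are used to reconcile overlapping flightstrip models, as described next.

```python
from sklearn.ensemble import RandomForestClassifier

def classify_units(train_features, train_labels, scene_features, seed=0):
    """Train a random forest on samples drawn from the training polygons
    and classify every unit of analysis (pixel, object, or point).

    Hyperparameters here are placeholders, not those used in the study."""
    rf = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=seed)
    rf.fit(train_features, train_labels)

    predicted = rf.predict(scene_features)             # hard class label per unit
    probabilities = rf.predict_proba(scene_features)   # per-class probabilities
    confidence = probabilities.max(axis=1)             # probability of the assigned class

    return predicted, confidence
```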
Because the LiDAR points themselves do not overlap one another, no further processing of the classified point dataset was required before merging all flightstrips into one point-based land cover classification model for the study area. In the raster-based classification, however, overlapping pixels were not always assigned the same land cover class. To reconcile this, class probabilities were computed for each pixel and the land cover class with the higher class probability was assigned to the pixel. This resulted in one non-overlapping land cover model of the study area.
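This reconciliation of overlapping raster models can be sketched as below, assuming each flightstrip's classification and the probability of its assigned class have already been resampled onto a common grid, with NaN marking cells outside a flightstrip. It is an illustrative reconstruction of the rule described above, not the original implementation.

```python
import numpy as np

def merge_flightstrip_models(class_rasters, confidence_rasters):
    """Merge per-flightstrip raster classifications into a single model
    by keeping, for each pixel, the class assigned with the highest
    class probability (NaN marks cells not covered by a flightstrip)."""
    classes = np.stack(class_rasters).astype(float)        # (n_strips, rows, cols)
    confidence = np.stack(confidence_rasters).astype(float)

    # Cells outside a flightstrip must never win the comparison.
    confidence = np.where(np.isnan(confidence), -np.inf, confidence)

    winner = np.argmax(confidence, axis=0)                 # index of winning strip
    rows, cols = np.indices(winner.shape)
    merged = classes[winner, rows, cols]                   # final non-overlapping model
    return merged
```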

2.2.2.4 Validation

Models were trained and cross-validated using polygons manually derived from a combination of high-resolution optical imagery collected over the study area, visual inspection of the LiDAR data, and in situ field visits to the study area. The optical imagery includes a QuickBird multispectral image acquired on September 4, 2005, and WorldView-2 multispectral images with a resolution of 2 m acquired on August 26, 2016 and April 23, 2017. The availability of multispectral images in leaf-on and leaf-off conditions permitted the inclusion of the normalized difference vegetation index (NDVI) as an additional parameter to inform the selection of training and validation data for wetlands, deciduous and mixed/coniferous forests. NDVI was calculated for each WorldView-2 image using the red and first near-infrared (NIR1) bands (bands 5 and 7, respectively), following the equation proposed by Nouri et al. (2013) for the WorldView-2 mission specifically:

NDVI = (NIR1 − Red) / (NIR1 + Red)     (2)

Since the optical imagery used to inform the selection of training and validation data was not collected concurrently with the LiDAR data, there is limited ability to correctly train and validate land cover classes that fluctuate both seasonally and between years. This mostly affects the classification of wetlands, whose depth, extent and hydroperiod can be highly variable. No major land use changes occurred over the study area from 2015 to 2017; therefore, there should not be any discrepancies in more stable land cover classes such as forests, fields, barren rock, roads or open water.

The number of training and validation polygons required to inform the classifier and accurately test its performance is influenced by the number of parameters used in the classification and the organization of the input data into tiles or flightstrips. Typically, the rule of thumb is that the number of pixels (n) used to train a classifier and validate the resultant models should be 10p to 30p, where p is equal to the number of input parameters to the classifier (Mather 1999, Van Niel et al. 2005). However, this notion was tested by Van Niel et al. (2005), who discovered that 2p to 4p is sufficient to adequately train a classifier. When classifying data organized into non-overlapping tiles, all tiles can be classified simultaneously, requiring only one training and one testing data set for the entire study area. When the LiDAR data are organized into flightstrips, training and testing datasets must be developed for each flightstrip individually. This increases the total number of training and testing data sets required. Some flightstrips may contain only a subset of the eight land cover classes being classified, limiting the opportunity to apply the 8-class classifier universally across the study area. Some flightstrips may also contain only a small area of a given land cover class, limiting the number of training polygons that can be derived for it. When a land cover class could not be confidently identified within a flightstrip, it was omitted from the classification. This was done to increase the confidence of the resulting models, maintain the integrity of the land cover classes that were present, and minimize the potential for errors of commission to propagate in the resultant models through uncertain assignment of less common land cover classes. Deciduous forests, open water, rock barrens and wetlands were identified in all flightstrips. Fields, coniferous forests, inundated forests and roads were less common in the study area and were not present in all flightstrips.

For classification of LiDAR data organized into tiles, 80 training and validation polygons were identified for each land cover class. This is equal to 10p, since eight parameters were used as input for the classifier. These training and testing polygons were spatially distributed throughout the study area and represent the spectral and textural variation present for each class. For classification of LiDAR data organized into flightstrips, an average of 50 polygons were identified as training and validation data. No fewer than 23 polygons were identified for any given land cover class in any of the flightstrips. Therefore, n ranged from 20p to 50p, within the requirements for accurately training a classifier (Van Niel et al. 2005). These polygons were randomly divided into training and validation data, such that the data used to validate the land cover models were not used as input to train the classifier, which would otherwise bias the cross-validation accuracy. There is no standardized methodology for partitioning training and testing data when classifying land cover from LiDAR data, so two-thirds of the dataset was used to train the classifier and the remaining third was retained for validation, following Rodriguez-Galiano et al. (2012), who performed a land cover classification study using Landsat-5 TM data and a random forest classifier.
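The two-thirds / one-third partition can be sketched as a polygon-level split, which keeps every pixel or point belonging to a validation polygon out of the training data. The stratification by class label in this sketch is an assumption for illustration rather than a documented detail of the original workflow.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def split_reference_polygons(polygon_ids, polygon_classes, seed=0):
    """Randomly partition reference polygons into training (two thirds)
    and validation (one third) sets, stratified by land cover class."""
    train_ids, valid_ids = train_test_split(
        np.asarray(polygon_ids),
        test_size=1 / 3,                       # one third retained for validation
        stratify=np.asarray(polygon_classes),  # keep class proportions similar
        random_state=seed,
    )
    return train_ids, valid_ids
```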