Support vector regression and synthetically mixed training data for quantifying urban land cover

Abstract

Exploiting imaging spectrometer data with machine learning algorithms has been demonstrated to be an excellent choice for mapping ecologically meaningful land cover categories in spectrally complex urban environments. However, the potential of kernel-based regression techniques for quantitatively analyzing urban composition has not yet been fully explored. To a great extent, this can be explained by difficulties in deriving quantitative training information that reliably represents pairs of spectral signatures with associated land cover fractions, as needed for empirical modeling. In this paper, we present an approach to circumvent this limitation by combining support vector regression (SVR) with synthetically mixed training data to map sub-pixel fractions of single urban land cover categories of interest. This approach was tested on Hyperspectral Mapper (HyMap) data acquired over Berlin, Germany. Fraction estimates were validated with extensive manual mappings and compared to fractions derived from multiple endmember spectral mixture analysis (MESMA). Our regression results demonstrate that the sets of multiple mixtures yielded high accuracies for quantitative estimates of four spectrally complex urban land cover types, i.e., fractions of impervious rooftops and pavements, as well as grass- and tree-covered areas. Despite the extrapolation uncertainty of SVR, which resulted in fraction values below 0% and above 100%, physically meaningful model outputs were reported for a clear majority of pixels, and visual inspection underpinned the quality of the produced fraction maps. Statistical accuracy assessment with detailed reference information for 92 urban blocks showed linear relations with R² values of 0.86, 0.58, 0.81 and 0.85 for the four categories, respectively. Mean absolute errors (MAE) ranged from 6.4 to 12.8%, and block-wise sums of the four individually modeled category fractions were always around 100%. Results of MESMA followed similar trends, but with slightly lower accuracies. Our findings demonstrate that the combination of SVR and synthetically mixed training data enables the use of empirical regression for sub-pixel mapping. Thus, the strengths of kernel-based approaches for quantifying urban land cover from imaging spectrometer data can be well utilized. Remaining uncertainties and limitations were related to the known phenomena of spectral similarity or ambiguity of urban materials, the spectral deficiencies in shaded areas, and the dependency on comprehensive and representative spectral libraries. Therefore, the suggested workflow constitutes a new, flexible, and extendable universal modeling approach to map land cover fractions.
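
The core idea of pairing synthetically mixed spectra with known fractions and fitting an SVR model for one target category can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the toy spectral library is randomly generated (it is not HyMap data), the helper `synth_mixtures`, the Dirichlet-distributed fractions, the limit of three members per mixture, and the SVR hyperparameters are all hypothetical choices, and scikit-learn's `SVR` stands in for whatever SVR implementation was actually used.

```python
import numpy as np
from sklearn.svm import SVR


def synth_mixtures(library, classes, target, n_mix, rng, max_members=3):
    """Linearly mix library spectra; return (spectra, target-class fractions)."""
    library = np.asarray(library, dtype=float)
    classes = np.asarray(classes)
    X = np.empty((n_mix, library.shape[1]))
    y = np.empty(n_mix)
    for i in range(n_mix):
        k = int(rng.integers(1, max_members + 1))    # k = 1 keeps a pure spectrum
        idx = rng.choice(len(library), size=k, replace=False)
        frac = rng.dirichlet(np.ones(k))             # random fractions summing to 1
        X[i] = frac @ library[idx]                   # linear mixture of k spectra
        y[i] = frac[classes[idx] == target].sum()    # known fraction of the target
    return X, y


# Hypothetical toy library: 4 categories, 5 spectra each, 20 bands.
rng = np.random.default_rng(0)
names = ["roof", "pavement", "grass", "tree"]
n_bands = 20
base = rng.uniform(0.05, 0.6, size=(len(names), n_bands))
classes = np.repeat(names, 5)
library = np.repeat(base, 5, axis=0) + rng.normal(0.0, 0.01, size=(20, n_bands))

# One regression model per category of interest; here the target is "grass".
X_train, y_train = synth_mixtures(library, classes, "grass", 500, rng)
X_test, y_test = synth_mixtures(library, classes, "grass", 200, rng)

model = SVR(kernel="rbf", C=10.0, epsilon=0.02, gamma="scale")  # assumed settings
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Accuracy measures of the kind named in the abstract, on the toy data.
mae = np.mean(np.abs(pred - y_test))
r2 = np.corrcoef(pred, y_test)[0, 1] ** 2
# SVR output is unbounded, so some estimates may fall outside [0, 1].
out_of_range = np.mean((pred < 0.0) | (pred > 1.0))
print(f"MAE={mae:.3f}  R2={r2:.3f}  share outside [0, 1]={out_of_range:.3f}")
```

Because each category is modeled separately, repeating the fit with a different `target` would give one fraction estimate per category, and the per-category estimates are not constrained to sum to 100%. The numbers printed here describe only the toy data and say nothing about the accuracies reported above.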

Publication
Remote Sensing of Environment