Mapping urban-rural gradients of settlements and vegetation at national scale using Sentinel-2 spectral-temporal metrics and regression-based unmixing with synthetic training data

Abstract

The increasing impact of humans on land and ongoing global population growth require an improved understanding of land cover (LC) and land use (LU) processes related to settlements. The heterogeneity of built-up areas and infrastructure, together with the importance of not only mapping but also characterizing anthropogenic structures, suggests a sub-pixel mapping approach for analysing the related LC from space. To demonstrate the potential of sub-pixel mapping, we implement a regression-based unmixing approach for mapping built-up surfaces and infrastructure, woody vegetation and non-woody vegetation for all of Germany and Austria at 10 m resolution. We map LC fractions for one point in time, using all available Sentinel-2 data from 2017 and 2018 (<70% cloud cover). We combine the concept of synthetically mixed training data with spectral-temporal metrics (STM), i.e. statistical aggregations derived from Sentinel-2 reflectance time series. We specifically examine how STM can be used for creating synthetically mixed training data. STM are known to facilitate large-area mapping because they are largely independent of image acquisition dates and inherently incorporate phenological information. Vegetation is an important part of settlements, and time series information supports its mapping. Synthetically mixed training data streamline training by using pure reference spectra to generate artificial mixtures as input to regression modelling of LC fractions in mixed pixels. Here we show that combining both offers great potential for wall-to-wall LC fraction mapping. We further investigate the positive effect of STM on map results by comparing the performance of different STM combinations. Our results indicate that many STM combinations containing spectral variability and vegetation indices provide suitable input for creating synthetic training data for regression-based fraction mapping.
Results for built-up surfaces and infrastructure (MAE 0.13/RMSE 0.18 at 20 m resolution), woody vegetation (0.18/0.22) and non-woody vegetation (0.14/0.19) are highly consistent across Germany and Austria. Only a few surface types were not accurately predicted in our nation-wide mapping. Further research is required to optimize the mapping of temporally invariant bare soil and rock surfaces that are spectrally similar to built-up surfaces and infrastructure. The proposed methodology combines the benefits of regression-based modelling with synthetically mixed training data and of STM, and thus facilitates mapping of LC fractions at national scale and high resolution. Such information will allow settlements to be better characterized and processes such as densification, which are best represented by continuous LC mapping, to be identified.
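
The two building blocks named in the abstract, STM and synthetically mixed training data, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that STM are the mean, quartiles and standard deviation of a single index time series, and that STM vectors of pure pixels are mixed linearly with random fractions, which is a simplification for illustration. All names (`stm`, `synthetic_mixture`, `pure_series`) and all numeric values are hypothetical.

```python
import random
import statistics

CLASSES = ("built_up", "woody", "non_woody")


def stm(series):
    """Spectral-temporal metrics of one time series (assumed set of metrics):
    mean, 25th/50th/75th percentile and standard deviation."""
    q25, q50, q75 = statistics.quantiles(series, n=4, method="inclusive")
    return [statistics.fmean(series), q25, q50, q75, statistics.pstdev(series)]


def synthetic_mixture(library, rng, max_members=3):
    """Linearly mix STM vectors of randomly drawn pure pixels.

    Returns the mixed feature vector and the class fractions (summing to 1)
    that serve as regression targets."""
    members = rng.sample(CLASSES, rng.randint(1, max_members))
    weights = [rng.uniform(0.05, 1.0) for _ in members]
    total = sum(weights)
    fractions = {cls: 0.0 for cls in CLASSES}
    n_features = len(next(iter(library.values()))[0])
    features = [0.0] * n_features
    for cls, weight in zip(members, weights):
        frac = weight / total
        fractions[cls] = frac
        pure = rng.choice(library[cls])  # one pure STM vector of this class
        features = [f + frac * p for f, p in zip(features, pure)]
    return features, fractions


def mae(pred, ref):
    """Mean absolute error between predicted and reference fractions."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def rmse(pred, ref):
    """Root mean squared error between predicted and reference fractions."""
    return (sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)) ** 0.5


# Toy vegetation-index time series of "pure" pixels (made-up values).
pure_series = {
    "built_up": [[0.10, 0.12, 0.11, 0.13, 0.10]],
    "woody": [[0.55, 0.70, 0.85, 0.80, 0.60]],
    "non_woody": [[0.20, 0.60, 0.80, 0.30, 0.25]],
}
library = {cls: [stm(s) for s in series] for cls, series in pure_series.items()}

rng = random.Random(42)
training = [synthetic_mixture(library, rng) for _ in range(1000)]
```

Each `(features, fractions)` pair in `training` could then be passed to a regression model, one target class at a time, and the predicted fractions compared against reference fractions with `mae` and `rmse`. The choice of regressor is left open here.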

Publication
Remote Sens Environ