The automatic detection of changes in remote sensing data has been studied for decades. A necessary prerequisite for the actual change detection step is the accurate alignment of the images acquired before and after the change, usually in the form of image registration and radiometric calibration. However, especially for very high-resolution images of urban areas, image registration fails in the presence of significantly different viewing angles or different sensor technologies (e.g., optical and synthetic aperture radar (SAR)). This is due to the fact that elevated structures such as buildings, trees, or masts exhibit a geometric distortion that is proportional to both the viewing angle and the object height. Therefore, most existing remote sensing-based approaches to change detection are limited to images taken from the same, or at least a very similar, viewing angle and with the same sensor technology (e.g., change detection from optical to optical or from SAR to SAR). There are very few exceptions to this limitation. Regarding change detection with multiple sensors, most existing work focuses on the combination of medium-resolution sensors (e.g., Landsat and Sentinel-2, or Sentinel-1 and Sentinel-2). In these cases, the data homogeneity problem is limited to radiometric alignment, while geometric differences are negligible. To date, the automatic detection of changes in high-resolution images of urban areas, especially when taken by different sensors, remains an open scientific challenge.

In this work, we investigate whether advances in deep learning-based single-image height reconstruction can provide a perspective for detecting urban changes in a sensor- and viewing angle-independent way. The idea is to process the pre- and post-change images independently with a sensor-specific single-image height reconstruction model.
The reconstructed heights can then be projected into a common map geometry, where changes in the scene are, in theory, represented by changes in the reconstructed heights. However, heights produced from single images are always prone to a significant amount of noise. Besides, different sensors or observations from different viewing angles will lead to different occlusions. Thus, change detection cannot be implemented in a conventional pixel-by-pixel manner: in order to avoid a high number of false positives, regularization has to be employed. We argue that openly available auxiliary data, e.g., building footprints extracted from the OpenStreetMap database, can be used beneficially for this task.
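As a minimal sketch of how such footprint-based regularization might look, the snippet below aggregates per-pixel height differences over rasterized building footprints and flags a footprint as changed only when the median height change exceeds a threshold. The function name, the array layout, and the threshold values are illustrative assumptions, not the paper's actual implementation; occluded pixels are assumed to be marked as NaN in the projected height maps.

```python
import numpy as np

def footprint_height_changes(h_pre, h_post, footprint_ids,
                             min_delta=3.0, min_valid=0.5):
    """Aggregate per-pixel height differences over building footprints.

    This is an illustrative sketch, not the method from the paper.

    h_pre, h_post : 2-D arrays of reconstructed heights (metres), already
                    projected into a common map geometry; NaN marks pixels
                    occluded in one of the views.
    footprint_ids : 2-D integer array; 0 = background, >0 = footprint label
                    (e.g., rasterized OpenStreetMap building polygons).
    min_delta     : minimum absolute median height change (metres) to flag.
    min_valid     : minimum fraction of non-occluded pixels per footprint.
    """
    delta = h_post - h_pre  # NaN wherever either height is missing
    changed = {}
    for fid in np.unique(footprint_ids):
        if fid == 0:
            continue  # background, not a building footprint
        vals = delta[footprint_ids == fid]
        valid = vals[~np.isnan(vals)]
        if valid.size < min_valid * vals.size:
            continue  # footprint mostly occluded; skip to avoid false alarms
        # Median over the footprint suppresses per-pixel reconstruction noise
        med = float(np.median(valid))
        if abs(med) >= min_delta:
            changed[int(fid)] = med
    return changed
```

Aggregating over whole footprints rather than single pixels is one simple way to trade spatial detail for robustness against the noise and occlusion effects described above.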