Digital Twins
Cities are Cloned in the Virtual World to Transform Urban Planning
Automatically reconstructing the structure and roofs of buildings is a topic of great interest in urban research. Understanding these structures provides essential information for generating accurate 3D city models for digital twins, which can be used in applications such as urban planning, environmental analysis, and disaster management.
While deep-learning models have been used to identify roof areas from satellite images, these models struggle with complex roof shapes and processing issues. To improve this, researchers proposed a new framework in which aerial images and cadastral data are combined with an advanced neural network to identify roof corners and edges.
Rapid urban development and limited land availability have created a need for more infrastructure above and below ground. Managing all this infrastructure requires reliable city models.
As traditional 2D systems struggle with the complexity of modern cities, 3D city models are increasingly used to represent urban environments accurately. These detailed 3D models, which serve as a basis for city digital twins, are often created from remotely sensed data, point clouds from LIDAR (Laser Imaging Detection And Ranging) data, and other vector and raster geospatial information.
However, for some countries, LIDAR is simply too expensive and complicated to use. That’s why the researchers looked into a cost-effective approach that creates the 3D city model from aerial imagery alone.
As manually extracting building features from images can be a time-consuming and costly affair, machine-learning and deep-learning techniques are being explored to automate the process.
However, it is hard to develop algorithms that reconstruct complex building structures from images alone, especially when obstacles block roof edges or when roof designs are complex. In addition, many planning processes need vector output, not just raster output. In an attempt to remedy this, the researchers developed a new multi-step method for accurately mapping roof shapes and extracting them in vector format from regular images.
The multi-step process begins with detecting roof corners and edges using HEAT (holistic edge attention transformer), a deep-learning approach originally designed for building reconstruction. After preparing the necessary data from aerial images and cadastral data, the HEAT model is trained to identify roof outlines.
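To make the idea of vector output more concrete, here is a minimal sketch of how a set of detected corners and edges could be turned into individual roof-plane polygons. It is not HEAT and not the researchers' implementation. It is a generic planar-graph face traversal, and the function names (`signed_area`, `roof_planes`) and the gable-roof coordinates are hypothetical, chosen only for illustration. It assumes the edges meet only at corners and do not cross.

```python
import math
from collections import defaultdict


def signed_area(polygon):
    """Shoelace formula: positive for counter-clockwise polygons."""
    shifted = polygon[1:] + polygon[:1]
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(polygon, shifted))


def roof_planes(corners, edges):
    """Return each enclosed face of a planar corner/edge graph as a
    counter-clockwise list of corner indices (one list per roof plane)."""
    neighbours = defaultdict(list)
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    # Order the neighbours of every corner counter-clockwise by angle.
    for v, ring in neighbours.items():
        vx, vy = corners[v]
        ring.sort(key=lambda w: math.atan2(corners[w][1] - vy,
                                           corners[w][0] - vx))

    visited = set()  # directed half-edges already walked
    planes = []
    for a, b in edges:
        for start in ((a, b), (b, a)):
            if start in visited:
                continue
            face = []
            u, v = start
            while (u, v) not in visited:
                visited.add((u, v))
                face.append(u)
                ring = neighbours[v]
                # Leave v along the neighbour just clockwise of the one we
                # arrived from, which keeps the face on our left-hand side.
                w = ring[(ring.index(u) - 1) % len(ring)]
                u, v = v, w
            # The unbounded outer face comes out clockwise (negative area).
            if signed_area([corners[i] for i in face]) > 0:
                planes.append(face)
    return planes


# Hypothetical gable roof seen from above: a 4 x 2 outline split by a ridge.
corners = [(0, 0), (4, 0), (4, 2), (0, 2), (0, 1), (4, 1)]
edges = [(0, 1), (1, 5), (5, 2), (2, 3), (3, 4), (4, 0), (4, 5)]

planes = roof_planes(corners, edges)
for plane in planes:
    print(plane, signed_area([corners[i] for i in plane]))
```

Each returned polygon is a list of corner indices, which is the kind of vector representation that planning tools can consume directly, in contrast to a raster mask of roof pixels.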
The developed method also detected more than expected: it managed to reconstruct the roof structure of buildings partly hidden by vegetation. This was possible thanks to the variety of input information. The positive result was also linked to the fact that the training data came from two different countries, the Netherlands and Bulgaria.

3D modelling workflow
The researchers surmise that their proposed framework outperforms traditional techniques, especially when it comes to detecting inner roof planes.
That being said, limitations arise when multiple buildings and ground surfaces appear in the same image. The model may then confuse the ground with roof structures, which leads to inaccuracies in 3D models.
The framework also has trouble with intricate roofs with many corners and large, densely packed buildings. While the model is very good at capturing finer details, in some cases it detects more roof planes than actually exist.

Qualitative evaluations on building roof plane extraction
Despite the challenges mentioned, the proposed framework does show promise as a new approach to accurately identifying corners and edges in building roofs.
Possible future improvements include refining the methodology and creating an open-source end-to-end framework for 3D city modelling for digital twins.
This story is adapted from a journal article: Campoverde, C., Koeva, M., Persello, C., Maslov, K., Jiao, W., & Petrova-Antonova, D. (2024). Automatic Building Roof Plane Extraction in Urban Environments for 3D City Modelling Using Remote Sensing Data. Remote Sensing, 16(8), 1386. It has been adapted in accordance with the copyright license CC BY 4.0.
To read the original article, follow the link below: