Tree detection from very high-resolution imagery data in drought-prone areas
(1) Thomas Onfroy - Modelling, R&D Department, CCR
(2) Aurélien Couloumy - Digital Factory, CCR
(3) Antoine Labonne - Digital Factory, CCR
(4) Michel Médic - Institut Paris Région
(5) Jean-Baptiste Henry - Esri France
INTRODUCTION
The presence of trees near individual houses is one of the aggravating factors of the CSS (Clay Shrinking and Swelling) drought hazard in Metropolitan France [1]. In some cases, it may even be the determining cause of the damage. To limit this risk, the French Ministry of Ecological Transition and Territorial Cohesion recommends a distance between a tree and the outer perimeter of a house of 0.5H to 1.5H, i.e., 0.5 to 1.5 times the height H of the tree, depending on its species. Remote sensing and Artificial Intelligence techniques now allow the precise geolocation of each tree from Very High Resolution (VHR, finer than 1m) imagery data. To integrate tree presence as a parameter of CCR’s CSS drought hazard model, a tree detection study at sub-parcel level was carried out on the territory of a commune in Île-de-France exposed to CSS hazard: Montigny-le-Bretonneux.
Beyond distinguishing tree cover from other land cover types by remote sensing, the study uses tree characteristics such as crown diameter and height at a given date. These can be derived from the elevation values contained in LIDAR (Laser Imaging Detection and Ranging) data or from the processing of VHR DEM (Digital Elevation Model) and DTM (Digital Terrain Model) data.
Combining remote sensing of tree cover on aerial images at 20cm resolution (IGN’s BD ORTHO HR), Deep Learning Object Detection and GIS processing of elevation data makes it possible to locate trees precisely and assess their dimensions. To assess the feasibility of the approach with a view to applying it to the entire country, work was initially carried out on the territory of one commune. The results made it possible to locate the trees, determine their dimensions in the vicinity of each dwelling and propose an exposure indicator relating to the presence of trees.
If the method is extended to the rest of the country, the grid data produced will make it possible to improve understanding of the causes of CSS losses and to take into account the presence of trees in the modelling of consequential damage.
METHODOLOGY
To accurately locate and measure the height, distance, and density of trees near individual houses, a hybrid approach was adopted. It combines:
- remote sensing methods on the images of IGN’s BD ORTHO HR at 20cm;
- the automated detection of the footprint of each tree by Deep Learning from IGN’s BD ORTHO HR and extracts of LIDAR HD;
- processing of elevation data in VHR: Digital Surface Model at 20cm from the Institut Paris Région and Digital Terrain Model RGE ALTI at 1m from the IGN;
- GIS spatial analyses of buildings in IGN’s BD TOPO.
The implementation of a methodology fostering the complementary nature of remote sensing methods, Artificial Intelligence techniques (Deep Learning Object Detection) and GIS analyses has made it possible to propose an indicator based on the height and distance of trees to buildings: the TreeBatiScore.
First, the tree cover is mapped at 20cm resolution with the NDVI (Normalized Difference Vegetation Index) computed from the BD ORTHO HR RGB (Red, Green, Blue bands) and IRC (colour infrared) images of the IGN. The NDVI is considered to be one of the most effective indices for distinguishing vegetation from other land cover types in optical data that include red and near-infrared bands [2]. The NDVI result is a 20cm resolution image with grid cell values ranging from -1 to 1. The threshold used for extracting vegetation areas and trees is 0.1: all values below 0.1 correspond to other land cover types [3]. The height of vegetation and trees is then estimated by subtracting a Digital Terrain Model (DTM RGE ALTI 1m from IGN) from a Digital Surface Model (DSM 20cm from the Institut Paris Région).
# trees
# CSS drought
# VHR imagery
# remote sensing
# Deep Learning
# Object Detection
This results in the Canopy Height Model (CHM) at 20cm. The CHM gives the height above ground of all surface features, including vegetation. Most of the hedges and bushes in the study area do not exceed 3m in height, so the threshold chosen to define a tree is 3m. Masking all CHM values below 3m yields a map of the trees (identified upstream by the NDVI) and their height (Figure 1).
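The raster steps described so far (NDVI thresholding, CHM computation and height masking) can be sketched in a few lines. This is a minimal illustration and not the processing chain used in the study: the function names are hypothetical, the bands are assumed to be plain nested lists already aligned on the same 20cm grid (which implies that the 1m DTM has been resampled beforehand), and a real workflow would rely on a GIS or raster library.

```python
def ndvi(nir, red):
    """Per-cell NDVI = (NIR - Red) / (NIR + Red); 0.0 where both bands are zero."""
    return [
        [(n - r) / (n + r) if (n + r) != 0 else 0.0 for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]


def canopy_height(dsm, dtm):
    """CHM = DSM - DTM, cell by cell (both grids assumed to share one resolution)."""
    return [[s - t for s, t in zip(dsm_row, dtm_row)] for dsm_row, dtm_row in zip(dsm, dtm)]


def tree_height_map(nir, red, dsm, dtm, ndvi_min=0.1, height_min=3.0):
    """Keep the CHM height where NDVI >= 0.1 and height >= 3 m; None elsewhere."""
    index = ndvi(nir, red)
    chm = canopy_height(dsm, dtm)
    return [
        [h if (v >= ndvi_min and h >= height_min) else None for v, h in zip(v_row, h_row)]
        for v_row, h_row in zip(index, chm)
    ]


# Toy 2 x 3 grid with made-up values (not data from the study).
nir = [[200, 180, 60], [190, 50, 40]]
red = [[50, 60, 55], [40, 60, 45]]
dsm = [[172.0, 165.5, 168.0], [161.0, 170.0, 160.2]]
dtm = [[160.0, 160.0, 160.0], [160.0, 160.0, 160.0]]

trees = tree_height_map(nir, red, dsm, dtm)
# Only the first two cells of the top row are both vegetated and at least 3 m high.
print(trees)
```

In this toy grid, a tall but non-vegetated cell (a roof, for instance) is rejected by the NDVI threshold, and a vegetated but low cell (lawn or hedge) is rejected by the 3m threshold.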
In parallel with the remote sensing and GIS processing, the DeepForest Deep Learning model [4], an Object Detection model, was used to identify each tree from the 20cm resolution BD ORTHO HR RGB images. This model had to be re-trained because of the difference in resolution: the pre-trained model relies on 10cm National Ecological Observatory Network images, whereas BD ORTHO images have a 20cm resolution. Re-training required annotation work. Pre-annotation bounding boxes were derived from the IGN’s HD LIDAR database.
More precisely, the LIDAR points corresponding to tall vegetation were first selected and then analysed using the PyCrown library [5]. Post-processing was then used to produce a bounding box for each tree detected.
These pre-annotations greatly facilitated the final manual annotation work and allowed DeepForest to be re-trained on 2 LIDAR areas: Louhans (Saône-et-Loire) and Manosque (Alpes-de-Haute-Provence). After detection thresholds were applied (softmax function), a significant increase in performance compared to the pre-trained model was observed on the Montigny-le-Bretonneux test set. The output of this model is a rectangular footprint for each tree. As can be seen in some parts of Figure 3, some trees are not detected by the NDVI but are detected by the Deep Learning footprints, and vice versa.
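As an illustration of how detection thresholds and the comparison with the NDVI-based tree map might be handled, the sketch below filters bounding boxes by confidence score and then separates boxes that overlap at least one tree cell from those that do not. All names are hypothetical, boxes are assumed to be expressed in grid cells, non-tree cells are assumed to be encoded as None, and the 0.3 score threshold is an arbitrary placeholder: the study does not state the value it used.

```python
from typing import List, NamedTuple, Optional, Tuple


class Detection(NamedTuple):
    """Rectangular tree footprint in grid cells (max bounds exclusive) with its score."""
    col_min: int
    row_min: int
    col_max: int
    row_max: int
    score: float


def filter_by_score(detections: List[Detection], min_score: float = 0.3) -> List[Detection]:
    """Drop detections whose confidence score is below the (assumed) threshold."""
    return [d for d in detections if d.score >= min_score]


def split_by_tree_map(
    detections: List[Detection],
    tree_map: List[List[Optional[float]]],
) -> Tuple[List[Detection], List[Detection]]:
    """Return (boxes overlapping a tree cell, boxes found by Deep Learning only)."""
    confirmed, deep_learning_only = [], []
    for d in detections:
        cells = [
            tree_map[r][c]
            for r in range(d.row_min, d.row_max)
            for c in range(d.col_min, d.col_max)
        ]
        if any(value is not None for value in cells):
            confirmed.append(d)
        else:
            deep_learning_only.append(d)
    return confirmed, deep_learning_only


# Toy tree map (heights in metres, None = no tree) and made-up detections.
tree_map = [
    [12.0, None, None, None],
    [11.0, None, None, None],
    [None, None, None, None],
]
detections = [
    Detection(0, 0, 2, 2, 0.9),  # overlaps the tree cells in the first column
    Detection(2, 1, 4, 3, 0.6),  # no tree cell underneath
    Detection(1, 0, 3, 1, 0.1),  # below the score threshold
]

kept = filter_by_score(detections)
confirmed, deep_learning_only = split_by_tree_map(kept, tree_map)
print(len(kept), len(confirmed), len(deep_learning_only))
```

Boxes in the second list correspond to trees that the image-based mask missed, which is where the Deep Learning footprints add information.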
Thus, the different image processing approaches (NDVI, CHM and Deep Learning tasks) complement each other to minimise missing data. Based on the 20cm CHM and a function of the R-ArcGIS ForestTools package [6], the tree tops are located by a sliding window algorithm. When a given CHM cell is found to be the highest in the window, it is labelled as the highest point in the canopy. This point is called the Tree Top Maxima. A buffer zone is then drawn around each tree top, with a radius derived from the maximum height of the tree; this radius represents the safety distance between the tree and the outer perimeter of the dwelling. For each tree detected by the Deep Learning approach but not by the NDVI, the Tree Top Maxima is also recovered within each footprint by a zonal statistics function.
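The sliding window principle and the zonal statistics step can be illustrated with a simplified sketch. It uses a fixed square window and plain nested lists, which is an assumption made for brevity: the ForestTools function used in the study may behave differently (for instance in how the window size is chosen), and the function names below are hypothetical.

```python
def tree_tops(chm, radius=1, min_height=3.0):
    """Return (row, col, height) for cells that are the unique maximum of their window.

    The window is a square of side 2 * radius + 1 cells, clipped at the grid edges.
    """
    n_rows, n_cols = len(chm), len(chm[0])
    tops = []
    for r in range(n_rows):
        for c in range(n_cols):
            height = chm[r][c]
            if height < min_height:
                continue  # below the tree threshold
            window = [
                chm[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            if height == max(window) and window.count(height) == 1:
                tops.append((r, c, height))
    return tops


def zonal_max(chm, row_min, row_max, col_min, col_max):
    """Highest CHM value inside a rectangular footprint (max bounds exclusive)."""
    return max(chm[r][c] for r in range(row_min, row_max) for c in range(col_min, col_max))


# Toy CHM in metres (made-up values).
chm = [
    [0.0, 2.0, 1.0, 0.0, 0.0],
    [2.0, 9.0, 3.0, 4.0, 5.0],
    [1.0, 3.0, 2.0, 6.0, 12.0],
    [0.0, 1.0, 1.0, 5.0, 7.0],
]

tops = tree_tops(chm)
print(tops)                        # two tree tops: 9 m and 12 m
print(zonal_max(chm, 1, 4, 3, 5))  # tallest cell inside a footprint on the right
```

With a 1H safety distance, the buffer radius around each detected top would simply be its height, for example 12m around the 12m tree.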
The recommended distance between a tree and a house can vary from 0.5H to 1.5H depending on the tree species. The classification of tree species from VHR imagery requires an in-depth remote sensing study, which was not prioritised here. The tree-to-house distance is therefore set at 1H. The exposed wall length can be calculated in GIS with IGN’s BD TOPO layer. This is the wall length in red in Figure 4.
The generalisation of this methodology to the rest of the country raises major practical and technical issues. Firstly, the availability of imagery data comparable in spatial and spectral resolution and of a sufficiently recent vintage will be central to the reproducibility of the approach. This study was carried out at commune level with a specific sample of vegetation. Therefore, extension to a wider area will require additional and extensive training of Deep Learning models. The next issue is the technical management of the volume of data generated by these various very high-resolution coverages, in terms of storage, distribution and processing. Finally, this methodology will only reach its full potential if it can monitor the changes in vegetation cover and soil sealing over time, based on regularly updated data, thus further emphasising the need for the full automation of extraction processes.
Figure 1 - Canopy Height Model and Digital Surface Model at 20cm
Figure 2 - Example of PyCrown processing of LiDAR data and post-processing
Figure 3 - Tree footprints produced by Deep Learning with the Canopy Height Model (CHM)
Figure 4 - Exposed perimeter per dwelling (linear meter of wall in red)
Figure 5 - TreeBatiScore results on private houses (Percentage of exposed linear meter of wall)
THE PARTNERS
The Institut Paris Région’s main mission is to carry out the necessary studies and work for the decision-making of the Greater Paris Region and its partners. Ranging from local to large metropolitan areas, it is involved in many fields such as urban planning, transport and mobility, the environment, the economy and social issues. It supports the planning and development policies of communes, local authorities, inter-communes and departments. It also carries out on-demand studies for organisations, both in the Greater Paris region and elsewhere (including abroad).
Esri France distributes and develops the uses of ArcGIS, Esri Inc’s cartographic and spatial analysis platform. The company offers full geographic desktop and online solutions. Esri Inc, founded in 1969 as the Environmental Systems Research Institute, a land-use consulting firm, has 10 regional offices in the United States and more than 80 distributors internationally, with approximately one million users in 200 countries and 3,800 employees worldwide.
RESULTS
This work made it possible to locate each tree on VHR imagery data and to measure its height and distance from dwellings. For each house, the percentage of exterior wall length exposed to one or more trees that are too close for their size was calculated, using a tree-to-dwelling safety distance of 1H. The TreeBatiScore shown in Figure 5 is this percentage of exposed exterior wall length per dwelling. The presence of trees in the direct vicinity of the foundation of a single-family home contributes to the aggravation of the CSS drought hazard. However, other physical and structural factors also contribute to CSS damage to single-family homes [7].
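To make the definition of the TreeBatiScore concrete, the sketch below computes the share of a building perimeter lying within 1H of at least one tree top. It is a geometric illustration under stated assumptions, not the GIS workflow of the study: footprints and tree positions are assumed to be in planar metric coordinates, the distance is measured from the tree top position, and all function names are hypothetical.

```python
import math


def exposed_intervals(p, q, trees, factor=1.0):
    """Portions of wall p -> q (as fractions in [0, 1]) within factor * H of a tree.

    Each tree is (x, y, height). The wall point p + t * (q - p) lies inside the
    buffer when a*t^2 + b*t + c <= 0, a quadratic solved for each tree.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    a = dx * dx + dy * dy
    intervals = []
    for tx, ty, height in trees:
        radius = factor * height
        fx, fy = p[0] - tx, p[1] - ty
        b = 2 * (dx * fx + dy * fy)
        c = fx * fx + fy * fy - radius * radius
        disc = b * b - 4 * a * c
        if disc <= 0:
            continue  # wall line misses the buffer or only touches it
        root = math.sqrt(disc)
        lo = max(0.0, (-b - root) / (2 * a))
        hi = min(1.0, (-b + root) / (2 * a))
        if lo < hi:
            intervals.append((lo, hi))
    return intervals


def merged_length(intervals):
    """Total length covered by possibly overlapping intervals within [0, 1]."""
    total, end = 0.0, 0.0
    for lo, hi in sorted(intervals):
        lo = max(lo, end)
        if hi > lo:
            total += hi - lo
            end = hi
    return total


def tree_bati_score(footprint, trees, factor=1.0):
    """Percentage of the footprint perimeter exposed to at least one tree."""
    exposed = perimeter = 0.0
    for p, q in zip(footprint, footprint[1:] + footprint[:1]):
        length = math.dist(p, q)
        if length == 0:
            continue  # skip duplicated vertices
        perimeter += length
        exposed += length * merged_length(exposed_intervals(p, q, trees, factor))
    return 100.0 * exposed / perimeter if perimeter else 0.0


# A 10 m x 10 m house and one 5 m tree located 3 m south of the southern wall.
house = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
score = tree_bati_score(house, [(5.0, -3.0, 5.0)])
print(round(score, 1))  # 8 m of wall exposed out of a 40 m perimeter
```

Overlapping buffers are merged before summing so that a stretch of wall exposed to several trees is counted only once. The `factor` argument could be set between 0.5 and 1.5 if species information became available.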
The environment of each plot must also be considered: slopes, presence of drains, rainwater collection networks, water supply trenches or sumps, depth and right of way of electricity or gas networks. These elements are not all available and their digitisation on the scale of the metropolitan territory is still incomplete. Data on the building materials and structure of each building can be used and linked to the information on CSS drought hazard. On this last point, the Base de Données Nationale des Bâtiments (BDNB) [National Buildings Database] will be explored. When all these data (plot environment and building materials) are available in the form of a geo-database, they can be considered as aggravating factors of the CSS hazard and integrated into a multi-criteria indicator.
CONCLUSION
The combination of remote sensing methods using Very High Resolution (VHR) imagery and a Deep Learning model has allowed the precise location of trees in the vicinity of houses. To delineate tree crowns and estimate the height of each tree, VHR elevation data are essential: a Digital Surface Model and a Digital Terrain Model, or LIDAR data.
The results of the TreeBatiScore, based on the presence of trees, can be cross-referenced with geo-localised data linked to the CSS hazard. As tree-to-building proximity is not the only aggravating factor, data on the environment of the dwellings and building materials should be used to establish a comprehensive diagnosis of house vulnerability.
To apply the methods used throughout the country, the issue of data management and storage must be considered. The ability to monitor the evolution of the vegetation cover and the height of each tree over time requires the full automation of extraction processes.
IGN’s HD LIDAR data, available from the end of 2025 throughout mainland France, will make it possible to describe land elevation and above-ground features, including the vegetation cover, very accurately. Ultimately, when tree data is produced for the entire country, it can be used as an input to the internal CSS model to improve loss prediction.
REFERENCES
1. Li, J., & Guo, L. (2017). Field investigation and numerical analysis of residential building damaged by expansive soil movement caused by tree root drying. Journal of Performance of Constructed Facilities.
2. Tucker, C.J. (1979). Red and Photographic Infrared Linear Combinations for Monitoring Vegetation. Remote Sensing of Environment.
3. Kang, et al. (2021). Land Cover and Crop Classification Based on Red Edge Indices Features. MDPI Remote Sensing.
4. Weinstein, et al. (2019). Individual Tree-Crown Detection in RGB Imagery Using Semi-Supervised Deep Learning Neural Networks. Remote Sensing.
5. Zörner, et al. (2018). PyCrown - Fast raster-based individual tree segmentation for LiDAR data. Landcare Research NZ Ltd.
6. Plowright (2021). Canopy analysis in R using Forest Tools. CRAN-R Project.
7. IFSTTAR (2017). Retrait et gonflement des argiles: Analyse et traitement des désordres créés par la sécheresse. IFSTTAR Technical Guide.
CITATION
Onfroy et al., Tree detection from Very High-Resolution imagery data in areas at risk of clay shrinking and swelling. In CCR 2022 Scientific Report; CCR, Paris, France, 2022, pp. 23-27.