Deep learning for powerline removal from LiDAR point clouds

when quantifying ecosystem structure from massive ALS data


What is LiDAR?

Light Detection And Ranging (LiDAR) is a laser-based remote sensing technology. It measures the time it takes for emitted laser pulses to reach the ground and return, converting that travel time into distance. Billions of these rapidly collected measurements (points) can create extremely detailed three-dimensional models of the Earth's surface. This technology is used in geographical information systems (GIS) to produce a digital elevation model (DEM) or a digital terrain model (DTM) for 3D mapping.
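As a back-of-the-envelope illustration of the ranging principle (not specific to AHN or any particular sensor), the distance follows directly from the two-way travel time of a pulse:

# Minimal sketch of LiDAR ranging: distance = (speed of light * travel time) / 2.
# The division by 2 accounts for the pulse travelling to the target and back.
C = 299_792_458.0  # speed of light in m/s

def pulse_range(travel_time_s: float) -> float:
    """One-way distance (m) for a given two-way travel time (s)."""
    return C * travel_time_s / 2.0

print(f"{pulse_range(2e-6):.1f} m")  # a return received 2 microseconds after emission is ~300 m away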

We are using country-wide airborne LiDAR data for ecosystem structure monitoring.

Actueel Hoogtebestand Nederland (AHN) is a collection of national airborne LiDAR flight campaigns (AHN1, AHN2, AHN3, and AHN4), flown during the leaf-off season and covering the whole of the Netherlands.

You can find a detailed description and download options here.

AHN1: 1996-2003
AHN2: 2007-2012
AHN3: 2014-2019
AHN4: 2020-2022

A glimpse of the AHN4 data

Point density: 20-35 points/m²

Classification:

1. Unassigned
2. Ground
3. Building
4. Water
5. Powerline
6. Reserved


LiDAR-derived products

1. Ecosystem height

Example of the 95th percentile of normalized height (10-meter resolution).

Left: generated from AHN2 data (collected 2007-2012). Right: generated from AHN3 data (collected 2014-2019). The height ranges from 0 to 50 meters.

Lighter yellow indicates lower vegetation; darker purple indicates taller vegetation.

2. Ecosystem cover

Example of the pulse penetration ratio (10-meter resolution).

Left: generated from AHN2 data (collected 2007-2012). Right: generated from AHN3 data (collected 2014-2019). The pulse penetration ratio ranges from 0 to 1.

Lighter yellow indicates denser vegetation cover; darker purple indicates more open vegetation cover.

3. Ecosystem structural complexity

Example of the Shannon index (10-meter resolution).

Left: generated from AHN2 data (collected 2007-2012). Right: generated from AHN3 data (collected 2014-2019). The Shannon index ranges from 0 to 8.

Lighter yellow indicates less vertical variability of vegetation; darker purple indicates more vertical variability of vegetation. A computational sketch of these three metrics is given below.
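For illustration, here is a minimal sketch of how the three metrics above can be computed from the normalized point heights of a single 10 m x 10 m cell. The function name, the 0.5 m height bins, and the use of the ground classification for the penetration ratio are assumptions made for this example; they are not the exact Laserchicken implementation used for the maps above.

import numpy as np

def cell_metrics(z, is_ground, bin_width=0.5):
    """Sketch of the three per-cell metrics for one 10 m x 10 m cell.

    z: normalized heights (m) of all points in the cell.
    is_ground: boolean mask marking ground-classified points.
    """
    # 1. Ecosystem height: 95th percentile of normalized height.
    height_p95 = np.percentile(z, 95)

    # 2. Ecosystem cover: pulse penetration ratio = ground points / all points.
    ppr = is_ground.sum() / len(z)

    # 3. Structural complexity: Shannon index over vertical height bins.
    edges = np.arange(0.0, z.max() + 2 * bin_width, bin_width)
    counts, _ = np.histogram(z, bins=edges)
    p = counts[counts > 0] / counts.sum()
    shannon = -np.sum(p * np.log(p))

    return height_p95, ppr, shannon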


Powerline issue

When looking closely at the ecosystem height derived from AHN3 data, you will find pixels higher than 30 meters. Are these all tall trees, or is something else going on?

How powerlines appear in LiDAR-derived data products

Let's have a look at what is in the point cloud.

Powerlines in the original LiDAR dataset


Deep learning method

PointCNN

PointCNN, proposed by Li et al. (2018), is a deep learning generalization of convolutional neural networks (CNNs) that performs feature learning from point clouds.

PointCNN shares the hierarchical convolution design of grid-based 2D CNNs and generalizes it to point clouds. The PointCNN repository can be found here.

We use the Dayton Annotated LiDAR Earth Scan (DALES) dataset for PointCNN model training.

DALES is a large-scale aerial LiDAR dataset with over half a billion points spanning 10 square kilometers. It contains forty scenes of dense, labeled aerial data covering multiple scene types, including urban, suburban, rural, and commercial. The data was hand-labeled into eight categories: ground, vegetation, cars, trucks, poles, power lines, fences, and buildings.

We use the arcgis.learn module and related deep learning frameworks to train the PointCNN model and classify the AHN3 data.

After setting up the deep learning environment, we can train the model with the DALES data and save the trained model for prediction. We use only the 3D coordinates (X, Y, and Z) as input, to avoid introducing unstandardized attributes (e.g., unnormalized intensity or RGB values) into training and prediction.
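As a rough sketch of this training step, assuming the arcgis.learn point cloud API with placeholder paths and hyperparameters (the exact arguments depend on the arcgis.learn version and on how the DALES tiles were exported as training blocks):

from arcgis.learn import prepare_data, PointCNN

# DALES tiles exported as training blocks beforehand, keeping only X, Y, Z as features.
data = prepare_data(r'/path/to/dales_training_blocks',
                    dataset_type='PointCloud',
                    batch_size=2)

model = PointCNN(data)           # initialize PointCNN on the prepared data
model.fit(epochs=25, lr=0.001)   # epochs and learning rate are illustrative
model.save('pointcnn_dales')     # save the trained model for later prediction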

The prediction results provide the point clouds classified into the eight classes mentioned above, including powerlines.
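Once a tile has been classified, the powerline points can be dropped before computing the ecosystem metrics. A minimal sketch using laspy follows; the file names and the powerline class code are assumptions for illustration and should be taken from your own classification scheme.

import laspy

POWERLINE_CLASS = 5  # assumed class code for powerline in the predicted classification

las = laspy.read('tile_classified.las')        # a tile classified by the trained model
keep = las.classification != POWERLINE_CLASS   # mask of non-powerline points
las.points = las.points[keep]                  # drop the powerline points
las.write('tile_powerline_removed.las')        # write the cleaned point cloud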


Model performance

Training PointCNN in a Jupyter notebook

PointCNN model training and prediction using Jupyter Notebook

Training performance monitored in TensorBoard

Powerline extraction results

We tested the performance of PointCNN in predicting powerline points in ten study areas with varying vegetation heights, with median heights ranging from 0 to 18.8 m.

The results show that PointCNN produces stable, highly accurate classifications, with no evident relationship between accuracy and vegetation height.

Statistical analysis and time efficiency

Accuracy of PointCNN results

The point-based deep learning method (the PointCNN model) shows stable, high accuracy across the ten study areas, with an average F1 score of 96.47% and a precision of 98.01%, and no distinct change in its performance is observed in relation to vegetation height.

The dip in area D may be caused by the misclassification of water points.

Execution time of PointCNN in the ten study areas

The processing speed of PointCNN is ~5,900 points/sec.
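As a back-of-the-envelope estimate of what this throughput means for AHN4-density data (the tile size and point density below are illustrative assumptions within the 20-35 points/m² range quoted earlier):

# Rough single-GPU runtime estimate at ~5,900 points/sec.
throughput = 5_900             # points per second (measured above)
density = 25                   # points/m2, within the 20-35 points/m2 AHN4 range
tile_area = 1_000 * 1_000      # an illustrative 1 km x 1 km tile, in m2

points = density * tile_area                   # ~25 million points per tile
hours = points / throughput / 3600
print(f"~{hours:.1f} h per tile on one GPU")   # ~1.2 h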

An NVIDIA GeForce GTX 1650 GPU with 4 GB of dedicated memory and 16 GB of shared memory was used to train the PointCNN model. When upscaling the process to a high-performance computing (HPC) or cloud environment, the execution time can be shortened significantly through parallel processing on multi-node GPU clusters.


Ecosystem height derived from the tested methods

We calculate the 95th percentile of normalized height (10-meter resolution) using the tested methods and compare the results.

The 95th percentile of normalized height (10-meter resolution) using the tested methods


Summary

  • PointCNN showed a strong capability for removing powerlines from ALS data, with an average F1 score of 96.47% and a precision of 98.01%.
  • The performance of PointCNN in identifying powerline points is stable across areas with varying vegetation heights.
  • Improving the performance of the deep learning model and exploring parallel processing in HPC and cloud environments are worth considering in future studies of powerline extraction, especially for generating high-quality ecosystem structure data products at regional or national scales.

Credits and Resources

Meijer, C., Grootes, M. W., Koma, Z., Dzigan, Y., Gonçalves, R., Andela, B., ... Kissling, W. D. (2020). Laserchicken—A tool for distributed feature calculation from massive LiDAR point cloud datasets. SoftwareX, 12, 100626. https://doi.org/10.1016/j.softx.2020.100626

Jung, J., Che, E., Olsen, M. J., & Shafer, K. C. (2020). Automated and efficient powerline extraction from laser scanning data using a voxel-based subsampling with hierarchical approach. ISPRS Journal of Photogrammetry and Remote Sensing, 163, 343-361. https://doi.org/10.1016/j.isprsjprs.2020.03.018

Kim, H. B., & Sohn, G. (2013). Point-based classification of power line corridor scene using random forests. Photogrammetric Engineering and Remote Sensing, 79(9), 821-833. https://doi.org/10.14358/PERS.79.9.821

Washington Geological Survey (2019). Washington State Lidar Plan. Washington Geological Survey. https://wadnr.maps.arcgis.com/apps/Cascade/index.html?appid=b93c17aa1ef24669b656dbaea009b5ce
