Deep learning for powerline removal from LiDAR point clouds

when quantifying ecosystem structure from massive ALS data


What is LiDAR?

Light Detection And Ranging (LiDAR) is a laser-based remote sensing technology. It measures the time it takes for emitted laser pulses to reach the ground and return, converting that travel time into distance. Billions of these rapidly collected measurements (points) can create extremely detailed three-dimensional models of the Earth's surface. This technology is used in geographical information systems (GIS) to produce a digital elevation model (DEM) or a digital terrain model (DTM) for 3D mapping.
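As a back-of-the-envelope illustration of the ranging principle (not specific to AHN or any particular sensor), the distance follows directly from the two-way travel time of a pulse:

# Minimal sketch of LiDAR ranging: distance = (speed of light * travel time) / 2.
# The division by 2 accounts for the pulse travelling to the target and back.
C = 299_792_458.0  # speed of light in m/s

def pulse_range(travel_time_s: float) -> float:
    """One-way distance (m) for a given two-way travel time (s)."""
    return C * travel_time_s / 2.0

print(f"{pulse_range(2e-6):.1f} m")  # a return received 2 microseconds after emission is ~300 m away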

We are using country-wide airborne LiDAR data for ecosystem structure monitoring.

Actueel Hoogtebestand Nederland (AHN) is a collection of national airborne LiDAR flight campaigns (AHN1, AHN2, AHN3, and AHN4), flown during the leaf-off season and covering the whole of the Netherlands.

You can find a detailed description and download options here.

AHN1: 1996-2003
AHN2: 2007-2012
AHN3: 2014-2019
AHN4: 2020-2022

A glimpse of the AHN4 data

Point density: 20-35 points/m²

Classification:

1. Unassigned
2. Ground
3. Building
4. Water
5. Powerline
6. Reserved


LiDAR-derived products

1. Ecosystem height

Example of the 95th percentile of normalized height (10-meter resolution).

Left: generated from AHN2 data (collected 2007-2012). Right: generated from AHN3 data (collected 2014-2019). The height ranges from 0 to 50 meters.

Lighter yellow indicates lower vegetation; darker purple indicates taller vegetation.

2. Ecosystem cover

Example of the pulse penetration ratio (10-meter resolution).

Left: generated from AHN2 data (collected 2007-2012). Right: generated from AHN3 data (collected 2014-2019). The pulse penetration ratio ranges from 0 to 1.

Lighter yellow indicates denser vegetation cover; darker purple indicates more open vegetation cover.

3. Ecosystem structural complexity

Example of the Shannon index (10-meter resolution).

Left: generated from AHN2 data (collected 2007-2012). Right: generated from AHN3 data (collected 2014-2019). The Shannon index ranges from 0 to 8.

Lighter yellow indicates less vertical variability of vegetation; darker purple indicates more vertical variability of vegetation. A computational sketch of these three metrics is given below.
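For illustration, here is a minimal sketch of how the three metrics above can be computed from the normalized point heights of a single 10 m x 10 m cell. The function name, the 0.5 m height bins, and the use of the ground classification for the penetration ratio are assumptions made for this example; they are not the exact Laserchicken implementation used for the maps above.

import numpy as np

def cell_metrics(z, is_ground, bin_width=0.5):
    """Sketch of the three per-cell metrics for one 10 m x 10 m cell.

    z: normalized heights (m) of all points in the cell.
    is_ground: boolean mask marking ground-classified points.
    """
    # 1. Ecosystem height: 95th percentile of normalized height.
    height_p95 = np.percentile(z, 95)

    # 2. Ecosystem cover: pulse penetration ratio = ground points / all points.
    ppr = is_ground.sum() / len(z)

    # 3. Structural complexity: Shannon index over vertical height bins.
    edges = np.arange(0.0, z.max() + 2 * bin_width, bin_width)
    counts, _ = np.histogram(z, bins=edges)
    p = counts[counts > 0] / counts.sum()
    shannon = -np.sum(p * np.log(p))

    return height_p95, ppr, shannon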


Powerline issue

When looking closely at the ecosystem height derived from AHN3 data, you will find pixels higher than 30 meters. Are these all tall trees, or is something else going on?

How powerlines appear in LiDAR-derived data products

Let's have a look at what is in the point cloud.

Powerlines in the original LiDAR dataset


Deep learning method

PointCNN

PointCNN, proposed by Li et al. (2018), is a deep learning generalization of convolutional neural networks (CNNs) that performs feature learning from point clouds.

PointCNN shares the hierarchical convolution design of grid-based 2D CNNs and generalizes it to point clouds. The PointCNN repository can be found here.

We use the Dayton Annotated LiDAR Earth Scan (DALES) dataset for PointCNN model training.

DALES is a large-scale aerial LiDAR dataset with over half a billion points spanning 10 square kilometers. It contains forty scenes of dense, labeled aerial data covering multiple scene types, including urban, suburban, rural, and commercial. The data was hand-labeled into eight categories: ground, vegetation, cars, trucks, poles, power lines, fences, and buildings.

We use the arcgis.learn module and related deep learning frameworks to train the PointCNN model and classify the AHN3 data.

After setting up the deep learning environment, we can train the model with the DALES data and save the trained model for prediction. We use only the 3D coordinates (X, Y, and Z) as input, to avoid introducing unstandardized attributes (e.g., unnormalized intensity or RGB values) into training and prediction.
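As a rough sketch of this training step, assuming the arcgis.learn point cloud API with placeholder paths and hyperparameters (the exact arguments depend on the arcgis.learn version and on how the DALES tiles were exported as training blocks):

from arcgis.learn import prepare_data, PointCNN

# DALES tiles exported as training blocks beforehand, keeping only X, Y, Z as features.
data = prepare_data(r'/path/to/dales_training_blocks',
                    dataset_type='PointCloud',
                    batch_size=2)

model = PointCNN(data)           # initialize PointCNN on the prepared data
model.fit(epochs=25, lr=0.001)   # epochs and learning rate are illustrative
model.save('pointcnn_dales')     # save the trained model for later prediction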

The prediction results provide the point clouds classified into the eight classes mentioned above, including powerlines.
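Once a tile has been classified, the powerline points can be dropped before computing the ecosystem metrics. A minimal sketch using laspy follows; the file names and the powerline class code are assumptions for illustration and should be taken from your own classification scheme.

import laspy

POWERLINE_CLASS = 5  # assumed class code for powerline in the predicted classification

las = laspy.read('tile_classified.las')        # a tile classified by the trained model
keep = las.classification != POWERLINE_CLASS   # mask of non-powerline points
las.points = las.points[keep]                  # drop the powerline points
las.write('tile_powerline_removed.las')        # write the cleaned point cloud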


Model performance

Training PointCNN in a Jupyter notebook

PointCNN model training and prediction using Jupyter Notebook

Training performance monitored in TensorBoard

Powerline extraction results

We tested the performance of PointCNN in predicting powerline points in ten study areas with varying vegetation heights, with median heights ranging from 0 to 18.8 m.

The results show that PointCNN produces stable, highly accurate classifications, with no evident relationship between accuracy and vegetation height.

Statistical analysis and time efficiency

Accuracy of PointCNN results

The point-based deep learning method (the PointCNN model) shows stable, high accuracy across the ten study areas, with an average F1 score of 96.47% and a precision of 98.01%, and no distinct change in its performance is observed in relation to vegetation height.

The dip in area D may be caused by the misclassification of water points.

Execution time of PointCNN in the ten study areas

The processing speed of PointCNN is ~5,900 points/sec.
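As a back-of-the-envelope estimate of what this throughput means for AHN4-density data (the tile size and point density below are illustrative assumptions within the 20-35 points/m² range quoted earlier):

# Rough single-GPU runtime estimate at ~5,900 points/sec.
throughput = 5_900             # points per second (measured above)
density = 25                   # points/m2, within the 20-35 points/m2 AHN4 range
tile_area = 1_000 * 1_000      # an illustrative 1 km x 1 km tile, in m2

points = density * tile_area                   # ~25 million points per tile
hours = points / throughput / 3600
print(f"~{hours:.1f} h per tile on one GPU")   # ~1.2 h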

An NVIDIA GeForce GTX 1650 GPU with 4 GB of dedicated memory and 16 GB of shared memory was used to train the PointCNN model. When upscaling the process to a high-performance computing (HPC) or cloud environment, the execution time can be shortened significantly through parallel processing on multi-node GPU clusters.


Ecosystem height derived from the tested methods

We calculate the 95th percentile of normalized height (10-meter resolution) using the tested methods and compare the results.

The 95th percentile of normalized height (10-meter resolution) using the tested methods


Summary

  • PointCNN showed a strong capability for removing powerlines from ALS data, with an average F1 score of 96.47% and a precision of 98.01%.
  • The performance of PointCNN in identifying powerline points is stable across areas with varying vegetation heights.
  • Improving the performance of the deep learning model and exploring parallel processing in HPC and cloud environments are worth considering in future studies of powerline extraction, especially for generating high-quality ecosystem structure data products at regional or national scales.

Credits and Resources

Meijer, C., Grootes, M. W., Koma, Z., Dzigan, Y., Gonçalves, R., Andela, B., ... Kissling, W. D. (2020). Laserchicken—A tool for distributed feature calculation from massive LiDAR point cloud datasets. SoftwareX, 12, 100626. https://doi.org/10.1016/j.softx.2020.100626

Jung, J., Che, E., Olsen, M. J., & Shafer, K. C. (2020). Automated and efficient powerline extraction from laser scanning data using a voxel-based subsampling with hierarchical approach. ISPRS Journal of Photogrammetry and Remote Sensing, 163, 343-361. https://doi.org/10.1016/j.isprsjprs.2020.03.018

Kim, H. B., & Sohn, G. (2013). Point-based classification of power line corridor scene using random forests. Photogrammetric Engineering and Remote Sensing, 79(9), 821-833. https://doi.org/10.14358/PERS.79.9.821

Washington Geological Survey (2019). Washington State Lidar Plan. Washington Geological Survey. https://wadnr.maps.arcgis.com/apps/Cascade/index.html?appid=b93c17aa1ef24669b656dbaea009b5ce
