Image Processing for Basic Depth Completion
Depth completion is the task of converting a sparse depth map D_sparse into a dense depth map D_dense. This algorithm was originally created to help visualize 3D object detection results for AVOD.
An accurate dense depth map can also benefit 3D object detection or SLAM algorithms that use point cloud input. This method uses an unguided approach (images are ignored, only LIDAR projections are used). Basic depth completion is done with OpenCV and NumPy operations in Python. For more information, please see our paper: In Defense of Classical Image Processing: Fast Depth Completion on the CPU.
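To make the unguided idea concrete, here is a minimal NumPy-only sketch of filling empty pixels in a sparse depth map by dilation. It is not the pipeline from the paper: the function names, the square kernel, and the 100 m cap used for inversion are assumptions chosen for illustration, and the actual code uses OpenCV operations with additional steps. Depths are inverted before dilating so that, where neighbourhoods overlap, the closer measurement wins.

```python
import numpy as np

MAX_DEPTH = 100.0  # assumed cap in metres, used only to invert depths


def dilate_max(image, size):
    """Grey-scale dilation with a size x size square kernel (size must be odd)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="constant", constant_values=0.0)
    out = np.zeros_like(image)
    height, width = image.shape
    # Take the maximum over every shifted copy of the image inside the kernel.
    for dy in range(size):
        for dx in range(size):
            out = np.maximum(out, padded[dy:dy + height, dx:dx + width])
    return out


def fill_sparse_depth(sparse, size=5):
    """Fill empty (zero) pixels from nearby valid depths; keep measured ones."""
    valid = sparse > 0.1
    # Invert so that near points get large values and survive the max filter.
    inverted = np.where(valid, MAX_DEPTH - sparse, 0.0)
    dilated = dilate_max(inverted, size)
    filled = np.where(dilated > 0.1, MAX_DEPTH - dilated, 0.0)
    # Pixels that already had a measurement keep their original value.
    return np.where(valid, sparse, filled)


sparse = np.zeros((5, 5), dtype=np.float32)
sparse[2, 2] = 10.0   # a near point
sparse[2, 4] = 40.0   # a far point
dense = fill_sparse_depth(sparse, size=3)
```

In this toy example the pixel between the two measurements takes the nearer depth (10 m), while pixels that only neighbour the far point take 40 m and pixels outside every kernel stay empty.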
Please visit https://github.com/kujason/scene_vis for 3D point cloud visualization demos on raw KITTI data.
If you use this code, we would appreciate it if you cite our paper:
In Defense of Classical Image Processing: Fast Depth Completion on the CPU
@inproceedings{ku2018defense,
title={In Defense of Classical Image Processing: Fast Depth Completion on the CPU},
author={Ku, Jason and Harakeh, Ali and Waslander, Steven L},
booktitle={2018 15th Conference on Computer and Robot Vision (CRV)},
pages={16--22},
year={2018},
organization={IEEE}
}
Click here for a short demo video with comparison of different versions.
Click here to see point clouds from additional KITTI raw sequences. Note that the structure of small or thin objects (e.g. poles, bicycle wheels, pedestrians) is well preserved after depth completion.
Also see an earlier version of the algorithm in action here (2 top views).
Tested on Ubuntu 16.04 with Python 3.5.
Download the KITTI depth completion benchmark data into ~/Kitti/depth (only the val_selection_cropped and test data sets are required to run the demo).
Clone the repository and install the Python dependencies:
git clone git@github.com:kujason/ip_basic.git
cd ip_basic
pip3 install -r requirements.txt
Run the demo script:
python3 demos/depth_completion.py
This will run the algorithm on the cropped validation set and save the outputs to a new folder in demos/outputs. Refer to the readme in the downloaded devkit to evaluate the results.
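When inspecting depth maps by hand, it may help to know that KITTI depth maps are conventionally stored as 16-bit PNGs in which the depth in metres is the pixel value divided by 256, and a value of 0 marks a pixel with no measurement. The sketch below shows that conversion on an in-memory array. The helper names are hypothetical, and whether the files written to demos/outputs follow exactly this convention is an assumption to check against the devkit readme.

```python
import numpy as np


def png_to_metres(png_values):
    """Convert 16-bit KITTI-style depth values to metres (0 stays 0 = invalid)."""
    return png_values.astype(np.float32) / 256.0


def metres_to_png(depth_m):
    """Convert depths in metres back to 16-bit values, clipping to the uint16 range."""
    scaled = np.round(depth_m * 256.0)
    return np.clip(scaled, 0, 65535).astype(np.uint16)


# Stand-in for an image loaded from disk with unchanged bit depth.
raw = np.array([[0, 256], [2560, 25600]], dtype=np.uint16)
depth = png_to_metres(raw)   # 0 m (invalid), 1 m, 10 m, 100 m
valid_mask = raw > 0         # True where a depth value is present
```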
Options can be set in depth_completion.py. To run on the test set, comment out the lines below # Validation set and uncomment the lines below # Test set.
fill_type:
- 'fast' - Version described in the paper
- 'multiscale' - Multi-scale dilations based on depth, with additional noise removal

extrapolate:
- True - Extends depths to the top of the frame and runs a 31x31 full kernel dilation
- False - Skips the extension and large dilation

save_output:
- True - Saves the output depth maps to disk
- False - Shows the filling process. Only works with fill_type == 'multiscale'
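As an illustration of how the three options interact, the following sketch validates a combination before a run. The option names and values come from the list above; the check_options helper is hypothetical and is not part of depth_completion.py.

```python
VALID_FILL_TYPES = ("fast", "multiscale")


def check_options(fill_type, extrapolate, save_output):
    """Return the options as a dict, raising ValueError on a bad combination."""
    if fill_type not in VALID_FILL_TYPES:
        raise ValueError("fill_type must be one of {}".format(VALID_FILL_TYPES))
    if not isinstance(extrapolate, bool) or not isinstance(save_output, bool):
        raise ValueError("extrapolate and save_output must be True or False")
    # Showing the filling process (save_output == False) is only described
    # as working with the multi-scale version.
    if not save_output and fill_type != "multiscale":
        raise ValueError("save_output=False requires fill_type='multiscale'")
    return {
        "fill_type": fill_type,
        "extrapolate": extrapolate,
        "save_output": save_output,
    }


options = check_options(fill_type="fast", extrapolate=True, save_output=True)
```

Calling check_options("fast", True, False) raises a ValueError, which mirrors the note that the filling process can only be shown for the 'multiscale' fill type.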
To run the algorithm on a specific CPU core (core 0 in this example):
taskset --cpu-list 0 python3 demos/depth_completion.py
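On Linux the same kind of pinning can also be requested from inside Python with os.sched_setaffinity, which may be convenient when timing from a script. This is a sketch, not something the demo does itself: the pin_to_single_cpu helper is hypothetical, and it picks the lowest-numbered CPU the process is already allowed to use rather than assuming core 0 is available.

```python
import os


def pin_to_single_cpu():
    """Restrict this process to one CPU and return the resulting CPU set.

    Returns None on platforms without os.sched_setaffinity (e.g. macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)   # 0 means "the calling process"
    target = min(allowed)               # lowest-numbered permitted CPU
    os.sched_setaffinity(0, {target})
    return os.sched_getaffinity(0)


cpus = pin_to_single_cpu()
```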
Method | iRMSE | iMAE | RMSE | MAE | Device | Runtime | FPS |
---|---|---|---|---|---|---|---|
NadarayaW | 6.34 | 1.84 | 1852.60 | 416.77 | CPU (1 core) | 0.05 s | 20 |
SparseConvs | 4.94 | 1.78 | 1601.33 | 481.27 | GPU | 0.01 s | 100 |
NN+CNN | 3.25 | 1.29 | 1419.75 | 416.14 | GPU | 0.02 s | 50 |
IP-Basic | 3.75 | 1.29 | 1288.46 | 302.60 | CPU (1 core) | 0.011 s | 90 |
Table: Comparison of results with other published unguided methods on the KITTI Depth Completion benchmark.
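For reference, the four error metrics in the table can be computed as in the sketch below. It assumes the usual KITTI conventions, which are not stated in this readme: RMSE and MAE are reported in millimetres, iRMSE and iMAE are errors in inverse depth reported in 1/km, and only pixels with ground truth are evaluated. The function name and the metre-valued inputs are choices made for this example.

```python
import numpy as np


def depth_metrics(pred_m, gt_m):
    """Compute RMSE/MAE (mm) and iRMSE/iMAE (1/km) over pixels with ground truth.

    Both inputs are depth maps in metres; gt_m == 0 marks missing ground truth.
    Predictions must be positive wherever ground truth exists.
    """
    valid = gt_m > 0
    pred = pred_m[valid].astype(np.float64)
    gt = gt_m[valid].astype(np.float64)

    err_mm = (pred - gt) * 1000.0
    # Inverse depth in 1/km: 1 / (d / 1000) = 1000 / d for d in metres.
    err_inv_km = 1000.0 / pred - 1000.0 / gt

    return {
        "RMSE": float(np.sqrt(np.mean(err_mm ** 2))),
        "MAE": float(np.mean(np.abs(err_mm))),
        "iRMSE": float(np.sqrt(np.mean(err_inv_km ** 2))),
        "iMAE": float(np.mean(np.abs(err_inv_km))),
    }


gt = np.array([[0.0, 10.0], [20.0, 0.0]])      # two pixels with ground truth
pred = np.array([[5.0, 10.5], [19.5, 7.0]])    # dense prediction
metrics = depth_metrics(pred, gt)
```

Because the inverse-depth metrics weight nearby errors more heavily than distant ones, a method can rank differently on iRMSE than on RMSE, as the table shows.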
Several versions are provided for experimentation.
- Gaussian (Paper Result, Lowest RMSE): Provides lowest RMSE, but adds many additional 3D points to the scene.
- Bilateral: Preserves local structure, but large extrapolations make it slower.
- Gaussian, No Extrapolation: Fastest version, but adds many additional 3D points to the scene.
- Bilateral, No Extrapolation (Lowest MAE): Preserves local structure, and skips the large extrapolation steps. This is the recommended version for practical applications.
- Multi-Scale, Bilateral, Noise Removal, No Extrapolation: Slower version with additional noise removal. See the above video for a qualitative comparison.

The table below shows a comparison of timing on an Intel Core i7-7700K for different versions. The Gaussian versions can be run on a single core, while other versions run faster with multiple cores. The bilateral blur version with no extrapolation is recommended for practical applications.
Version | Runtime | FPS |
---|---|---|
Gaussian (Paper Result, Lowest RMSE) | 0.0111 s | 90 |
Bilateral | 0.0139 s | 71 |
Gaussian, No Extrapolation | 0.0075 s | 133 |
Bilateral, No Extrapolation (Lowest MAE) | 0.0115 s | 87 |
Multi-Scale, Bilateral, Noise Removal, No Extrapolation | 0.0328 s | 30 |
Table: Timing comparison for different versions.
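The FPS column is approximately the reciprocal of the per-frame runtime (for example, 1 / 0.0111 s is about 90). A simple way to take comparable measurements on other hardware is sketched below. The measure_runtime helper and the stand-in workload are hypothetical, and the numbers it produces say nothing about the timings reported above.

```python
import time


def measure_runtime(func, frames, warmup=1):
    """Return (mean seconds per frame, frames per second) for func over frames."""
    # Warm-up calls are excluded so one-off start-up costs do not skew the mean.
    for frame in frames[:warmup]:
        func(frame)
    start = time.perf_counter()
    for frame in frames:
        func(frame)
    elapsed = time.perf_counter() - start
    runtime = elapsed / len(frames)
    return runtime, 1.0 / runtime


def fake_completion(frame):
    """Stand-in workload; replace with the depth completion call being timed."""
    return [value * 2 for value in frame]


frames = [list(range(1000)) for _ in range(20)]
runtime_s, fps = measure_runtime(fake_completion, frames)
```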
Qualitative results from the Multi-Scale, Bilateral, Noise Removal, No Extrapolation version on samples from the KITTI object detection benchmark.