Get to know Point Cloud

2 minute read

Published:

This blog post go through the basics of point cloud data and related projects I have done, including point cloud registration, point cloud segmentation.

Basics of PCD

Point cloud data contains information of a set of points in three-dimensional space. Each point is represented as a vector of its 3D Cartesian coordinates (x, y, z), plus other information such as color (RGB), reflection intensity (Intensity) etc. Point cloud data can be acquired from LiDAR, RADAR, RGB-D cameras, or derived from images using multi-view stereo vision algorithms. A scan of LiDAR with annotations

Point Cloud Registration

One of the base tasks in the applications of PCD is point cloud registration. The idea behind is to get the transformation information by matching adjacent point clouds, thus to ground the high level tasks such as state estimation of a moving robot. Matching two point clouds actually means finding the same set of points that shows in both point clouds and calculate the transformation matrix. However, it’s nontrivial to find the points consistent in two point clouds. There are many sophisticated hand crafted descriptors to help find similar points in two point clouds, deep learning based methods are also on the table. A pipeline of point cloud registration

Point Cloud Segmentation

Point cloud segmentation is a classic task to cluster points with similar characteristics into homogeneous regions. Point cloud semantic segmentation takes a step further, aim to assign each 3D point a semantic label. After many explorations in supervised machine learning methods including Support Vector Machine, Gaussian Mixture Model, Random Forest and statistical contextual models, such as Conditional Random Field, deep learning based methods now achieve the state of the art for point cloud segmentation.
Instance segmentation achieves detection and semantic segmentation of objects at the same time, by giving different labels for separate instances belonging to the same class. If we predict background semantics while doing instance segmentation, we get panoptic segmentation, the goal of which is to assign each point (pixel for images) a semantic label and instance ID if the point belongs to an object. Then if we can track objects moving in a sequence of consecutive point clouds and conduct the panoptic segmentation on every point cloud, we obtain a 4D panotic segmentation (3D space and 1D time).