Accurate Extrinsic Calibration of LiDAR and Camera with Refined Vertex Features

Shuo Wang1, Zheng Rong2, Pengju Zhang2 and Yihong Wu1,2,*
1 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
2 National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China

IPIN 2022 WiP Proceedings, September 5-7, 2022, Beijing, China
* Corresponding author.
wangshuo2020@ia.ac.cn (S. Wang); zheng.rong@nlpr.ia.ac.cn (Z. Rong); pengju.zhang@nlpr.ia.ac.cn (P. Zhang); yhwu@nlpr.ia.ac.cn (Y. Wu)
ORCID: 0000-0003-1269-7506 (S. Wang); 0000-0002-9096-6049 (Z. Rong); 0000-0001-8245-0205 (P. Zhang); 0000-0003-2198-9113 (Y. Wu)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract
LiDARs and cameras are widely used in many research fields because their data are complementary. Calibrating the extrinsic parameters between the LiDAR frame and the camera frame is essential for fusing the two kinds of data. Calibration methods based on a calibration board extract geometric features from point clouds and images and then build geometric constraints to estimate the extrinsic parameters. In this paper, we analyze in detail the measurement characteristics of LiDARs that introduce notable negative effects on calibration, including the noise of depth measurements and the divergence of laser beams. We then propose a refining method for vertex features extracted from LiDAR point clouds that uses prior information about the board, which effectively mitigates the effect of systematic measurement errors and improves the accuracy of the calibration results. Our calibration method also reduces the minimal number of calibration datasets to one, which improves the efficiency of the calibration process. Besides, we propose an objective and independent evaluation method for target-based calibration methods. Extensive experiments and comparisons with state-of-the-art methods show that using refined vertex features can notably improve the accuracy and efficiency of extrinsic parameter calibration.

Keywords
Extrinsic Calibration, Vertex Feature, Calibration Evaluation

1. Introduction
LiDARs and cameras are indispensable and ubiquitous in positioning and navigation, and are used in many application fields such as autonomous driving and industrial robotics. LiDARs measure the structure of the surroundings in the form of point clouds, while cameras measure the texture of the surroundings in the form of images. Fusing LiDAR and camera information can enrich images with accurate depth and provide color information for point clouds. The prerequisite of this fusion is the calibration of the extrinsic parameters between a LiDAR frame and a camera frame. Calibration board-based methods are the most popular calibration methods nowadays [1, 2, 3, 4, 5, 6, 7, 8]. In these methods, one or more boards are deployed in a static environment. By extracting and matching geometric features of the calibration board from the point cloud and the image, the constraints on the extrinsic parameters between LiDAR and camera can be constructed, and the extrinsic parameters can be estimated from these constraints.
In most existing calibration methods, geometric features are extracted directly from point clouds and refined with fitting algorithms, such as plane fitting and line fitting, to reduce sensor noise. However, these fitting methods only consider the geometric characteristics of the calibration boards and ignore the measurement characteristics of the LiDAR. In this paper, we analyze the measurement characteristics of LiDARs and the systematic errors caused by them, including the noise of range measurements and the divergence of LiDAR beams. We then propose a refining method for vertex features to reduce the negative impact of these systematic errors, and we calibrate the extrinsic parameters of LiDAR and camera using the refined vertex features to improve the accuracy of the calibration results. Owing to the high accuracy of the refined vertex features, the proposed method can calibrate the extrinsic parameters using only one calibration dataset.
To evaluate calibration results, some papers compare their estimated extrinsic parameters with the ground truth, for example, simulated values. Others calculate the re-projection errors of geometric features, such as point or line features of calibration boards. However, in the real world the ground truth is hard to obtain, and the various kinds of re-projection error are not holistic measurements of the alignment between two 3D frames. To evaluate our calibration method effectively and to compare it fairly with other calibration methods, we propose an evaluation method using raw LiDAR measurements and 3D point-to-line distances that is logically independent of any method used in the calibration process.
The contributions of this paper are as follows: 1) a refining method for vertex features extracted from LiDAR point clouds that yields a notable accuracy improvement of the calibration results; 2) a fair and independent evaluation method using 3D space distances; 3) extensive experiments and comparisons between our proposed method and other state-of-the-art calibration methods.
The rest of this paper is organized as follows. Section 2 reviews calibration methods. Section 3 analyzes the measurement characteristics of LiDARs, proposes a refining method for vertex features, and introduces a calibration method based on the refined vertex features; an evaluation method using 3D space distances is also presented in this part. Section 4 shows the experimental results. Section 5 concludes this paper.

2. Related Work
Extrinsic calibration methods between LiDAR and camera can be classified into three kinds: target-based methods, target-less methods and motion-based methods. In target-based calibration methods, there are two kinds of targets: calibration boards [1, 2, 3, 4, 5, 6, 7, 8] and other artificial markers that differ from typical calibration boards, such as boards with different patterns like circles [9, 10] or targets with different shapes like spheres [11, 12]. Calibration boards are the main calibration targets. With explicit data association between LiDAR and camera, calibration board-based methods usually use the correspondences between features extracted from point clouds and images to estimate the extrinsic parameters.
Calibration board-based methods can be classified into plane-based, edge-based and point-based methods, according to the features they use. The plane-based constraint uses the plane features of calibration boards.
This kind of constraint can be constructed from the points on the board extracted from point clouds and the plane parameters calculated from images in the camera frame [1, 2, 3]. Another way to construct the plane-based constraint is to compute the plane parameters from the point clouds [1, 4, 8]. The edge-based constraint uses the edge features of calibration boards. For point clouds, a board edge can be modeled as a series of points on the edge [1] or as line parameters fitted to these points [2, 4, 5]. For cameras, a board edge can be modeled as a back-projected plane [1] or as a 3D line in the camera frame [2, 4, 5]. The point-based constraint uses the point features of the board vertices. The vertices in the LiDAR frame can be computed from the points on the calibration board. From images, the projections of the board vertices can be extracted [6, 7], and the coordinates of the board vertices in the camera frame can be derived from the relative pose between the calibration board and the camera [4, 8]. Calibration board-based methods usually perform better than other calibration methods because the extraction and matching of features are explicit and unambiguous.
Target-less calibration methods depend on information extracted from the environment rather than on artificial targets. Structure information [13, 14, 15], semantic information [16] and mutual information [17, 18] are widely used in target-less calibration methods. Their weakness is that, compared with target-based methods, there is no explicit correspondence between point clouds and images.
Motion-based calibration methods [19, 20, 21, 22, 23, 24] leverage odometry information to estimate the extrinsic parameters. They simplify the data collection procedure, but the accuracy of the extrinsic parameters greatly depends on the performance of the odometry algorithms.
In this paper, we focus on calibration board-based methods. Vertex features extracted from point clouds are refined according to prior information about the calibration board to reduce the systematic errors of LiDAR measurements and to improve the accuracy and efficiency of the calibration. We compare our calibration method with other state-of-the-art methods using our proposed fair evaluation method based on space distances.

3. Calibration Method
3.1. Overview of Our Calibration Method
We denote the coordinates of a point in the LiDAR coordinate frame {L} as $P^L = (X_L, Y_L, Z_L)^T$, and the coordinates of the same point in the camera coordinate frame {C} as $P^C = (X_C, Y_C, Z_C)^T$. The transformation between $P^C$ and $P^L$ can be written as
$$P^C = R^C_L P^L + t^C_L, \quad (1)$$
where $R^C_L$ is the rotation matrix and $t^C_L$ is the translation vector. The transformation matrix $T^C_L$ can be written as
$$T^C_L = \begin{pmatrix} R^C_{L\,3\times3} & t^C_{L\,3\times1} \\ 0^T_{3\times1} & 1 \end{pmatrix}_{4\times4}. \quad (2)$$
The goal of the extrinsic calibration is to estimate $R^C_L$ and $t^C_L$.
In this paper, two ArUCO markers, pasted on two separate boards, are used as calibration targets to perform the extrinsic calibration between a LiDAR and a camera. A calibration board coordinate frame {B} is built to represent the calibration features, as shown in Figure 1. The XOY plane of the calibration board frame is the board plane, and the Z axis is perpendicular to the XOY plane and points out of the marker plane.
Figure 1: The configuration of the two board frames, the LiDAR frame and the camera frame. There is no specific spatial relationship between the two board frames. The blue box indicates the image. The golden points indicate the LiDAR points on the calibration boards.
The whole calibration process is described below. First, we analyze the measurement characteristics and systematic errors of LiDARs. Second, in the LiDAR frame, the vertex points are refined according to the board points extracted from the point cloud and the known size of the calibration board. Third, the relative transformation between the calibration board frame and the camera frame is calculated from the known size of the marker, and the vertices of the calibration board are estimated in the camera frame. Finally, with the vertices expressed in the two sensor frames, the correspondences between the frames are constructed and the extrinsic parameters are calibrated. The details of each stage are introduced in the following paragraphs.

3.2. Measurement Characteristics and Systematic Errors of LiDARs
There are two kinds of systematic errors of LiDARs.
3.2.1. Range Error
The LiDAR measures the surroundings in spherical coordinates with range $r$, elevation $\omega$ and azimuth $\alpha$. The Cartesian coordinates are calculated as
$$X_L = r \cos\omega \sin\alpha, \quad Y_L = r \cos\omega \cos\alpha, \quad Z_L = r \sin\omega. \quad (3)$$
The elevation $\omega$ and azimuth $\alpha$ are accurate, but the range $r$ contains error. This paper uses a Velodyne VLP-16 LiDAR, whose range error is typically up to ±3 cm. As shown in Figure 2, when we project the LiDAR points onto a board plane about 2 m away from the LiDAR, the thickness of the point cloud can be up to 4 cm, while the true thickness is zero.
Figure 2: The point cloud of a calibration board about 2 m away from the LiDAR ((a) front view, (b) top view). The thickness of the point cloud is about 4 cm, which is caused by the range error in the measurement of every LiDAR point.
3.2.2. Divergence Error
As shown in Figure 3a, the LiDAR beam diverges and the LiDAR spot becomes larger as the distance increases. For example, the horizontal size of the spot is about 18.2 mm and the vertical size is about 12.5 mm at a range of 2 m. Therefore, when a LiDAR point falls on the edge of the calibration board, part of the LiDAR spot is on the board and the other part is on the background. For these edge points, when the reflectivity of the calibration board is stronger than that of the background, the returned range measurement is the distance between the LiDAR and the board edge, even if the centerline of the LiDAR beam is off the board (the LiDAR spot is mostly on the background). The resulting point measurement is called a ghost point and makes the point cloud of the calibration board larger than its true size. When the reflectivity of the calibration board is similar to that of the background, the returned ranges of the edge points are a weighted average of the range measurements of the calibration board and the background. This kind of LiDAR point is also a ghost point and produces a series of points behind the calibration board whose shape resembles a streamline. In Figure 3b, the points in the highlighted ellipse are the points near the board edge, which is represented by the bright line.
Figure 3: The illustration of the divergence of the LiDAR beam ((a) divergence of the LiDAR beam, (b) ghost points caused by divergence), which makes the board's point cloud larger than its true size. The ghost points can be seen in the highlighted ellipse. The bright line represents the true edge of the board.
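To make the range error of Section 3.2.1 concrete, the sketch below (an added illustration; the board geometry and the uniform ±2 cm range noise, within the ±3 cm spec, are assumptions) converts simulated spherical measurements to Cartesian coordinates with Eq. (3) and shows how a perfectly planar board acquires an apparent thickness of several centimetres.

```python
import numpy as np

def spherical_to_cartesian(r, omega, alpha):
    """Eq. (3): range r, elevation omega, azimuth alpha -> LiDAR-frame XYZ."""
    return np.stack([r * np.cos(omega) * np.sin(alpha),
                     r * np.cos(omega) * np.cos(alpha),
                     r * np.sin(omega)], axis=-1)

rng = np.random.default_rng(0)
# Rays from 16 scan rings hitting a vertical board plane at Y = 2 m (assumed setup).
omega = np.deg2rad(np.repeat(np.arange(-15, 16, 2), 50))
alpha = np.deg2rad(np.tile(np.linspace(-10, 10, 50), 16))
r_true = 2.0 / (np.cos(omega) * np.cos(alpha))            # exact range to the plane
r_meas = r_true + rng.uniform(-0.02, 0.02, r_true.shape)  # +/- 2 cm range error

pts = spherical_to_cartesian(r_meas, omega, alpha)
print("apparent board thickness [m]:", pts[:, 1].max() - pts[:, 1].min())  # close to 0.04 instead of 0
```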
Due to the divergence error of the LiDAR beams, the LiDAR points on the edge are off the true edge of the calibration board and a series of points appears behind the board edge.
Because of the range error and divergence error of LiDAR measurements, the points on the board plane are not coplanar and the points on the edge are off the true edge, which makes the computed positions of the calibration board vertices inaccurate. These problems further affect the calibration results between LiDAR and camera. To solve them, we use prior information about the calibration board to find the true position of the board in the LiDAR point cloud. Since a board can be defined by its four vertices, we propose a refining method to extract accurate vertices from LiDAR point clouds and use the vertex information to calibrate the extrinsic parameters between LiDAR and camera.

3.3. Calibration Information in the LiDAR Frame
3.3.1. Extraction of Plane Points
With the known size of a calibration board and the relative pose between the calibration board and the LiDAR, the points located on the board can be segmented out of the LiDAR point cloud. The parameters of the board plane in the LiDAR frame, $\pi_{L,plane} = (A, B, C, D)^T$, are fitted by RANSAC with a fitting threshold of 0.01 m in this paper. The inlier points of the fitted plane are considered the LiDAR points on the board plane. We denote these points as $P^k_{L,plane}$ $(1 \le k \le M)$, where $M$ is the number of inlier points.
3.3.2. Extraction of Edge Points
The points lying on the edges are extracted from the board plane points and divided into four groups corresponding to the four edges of the board. We denote the edge points in the LiDAR frame as $P^{j,k}_{L,edge}$ $(j = 1, 2, 3, 4,\; 1 \le k \le M_j)$, where $M_j$ is the number of points on each edge. From the points on each edge, the line parameters of the calibration board edges in the LiDAR frame are fitted by RANSAC with a threshold of 0.01 m. The line parameters are denoted as $L^j_{L,edge} = ((d^j_{L,edge})^T, (P^j_{L,edge})^T)^T$, where $d^j_{L,edge}$ is the direction vector of the edge line and $P^j_{L,edge}$ is a point on the edge. The inlier points of the edge line fitting are considered the edge points.
3.3.3. Extraction and Refinement of Vertex Points
From the resulting edge line parameters, the coordinates of the board vertices in the LiDAR frame, $P^j_{L,vertex}$ $(j = 1, 2, 3, 4)$, can be calculated. Due to the LiDAR measurement errors discussed in Section 3.2, the two neighboring edge lines are in fact not coplanar and do not intersect. We therefore compute the shortest segment between the two neighboring edge lines and regard its midpoint as the vertex of the calibration board. These coordinates are used as initial values for the further refinement based on prior information about the calibration board.
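Since the two fitted neighbouring edge lines are generally skew, the initial vertex is the midpoint of their common perpendicular segment. A small numpy sketch of this step (an added illustration; the variable names and toy values are not from the paper):

```python
import numpy as np

def initial_vertex(p1, d1, p2, d2):
    """Midpoint of the shortest segment between two (generally skew) edge lines.

    Each line is given by a point p and a direction d, as produced by the
    RANSAC line fits of Section 3.3.2.
    """
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    w0 = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b                      # ~0 only if the edges were parallel
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1, q2 = p1 + s * d1, p2 + t * d2          # closest points on each line
    return 0.5 * (q1 + q2)

# Two nearly perpendicular edge lines that just miss each other (toy values).
v = initial_vertex(np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]),
                   np.array([0.0, 0.0, 0.01]), np.array([0.0, 1.0, 0.0]))
print(v)   # ~[0, 0, 0.005]
```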
The proposed method uses two kinds of prior information: geometry-based information and pose-based information.
The geometry-based information yields two kinds of constraints. The first is the constraint on the length of the board edges, i.e., the distance between each pair of neighboring board vertices should equal the edge length $L$. Denoting two neighboring vertices as the $m$th and $n$th $(1 \le m \le 4,\; 1 \le n \le 4,\; m \ne n)$ vertices, this constraint is
$$L = \|P^m_{L,vertex} - P^n_{L,vertex}\|. \quad (4)$$
The second is the constraint of perpendicularity between neighboring edges. Denoting the two perpendicular edge lines by the $m$th, $n$th and $q$th $(1 \le m \le 4,\; 1 \le n \le 4,\; 1 \le q \le 4,\; m \ne n \ne q)$ vertex points, the constraint is
$$0 = (P^m_{L,vertex} - P^n_{L,vertex})^T (P^q_{L,vertex} - P^n_{L,vertex}). \quad (5)$$
Under these constraints, the four vertices are guaranteed to be coplanar and to have the correct size; the shape and size of the board determined by the four vertices are the same as those of the actual calibration board. Consequently, the measurement errors caused by the LiDAR, including the range error and the divergence error, are removed from the vertex estimates.
The geometry-based information constrains the geometry of the vertices, but during the optimization the pose of the vertices in the LiDAR frame may drift. The pose-based constraint is used to further constrain the pose of the board determined by the four vertices in the LiDAR frame. We use the laser points on the edges to constrain this pose. Each edge point $P^{j,k}_{L,edge}$ $(j = 1, 2, 3, 4,\; 1 \le k \le M_j)$ should lie on the edge line determined by the vertex points, which can be formulated as
$$0 = \frac{\|(P^j_{L,vertex} - P^{j,k}_{L,edge}) \times d^j_{L,edge}\|}{\|d^j_{L,edge}\|}, \quad (6)$$
where $d^j_{L,edge}$ $(j = 1, 2, 3, 4)$ is the direction vector of the $j$th edge line in the LiDAR frame and can be calculated as
$$d^j_{L,edge} = P^m_{L,vertex} - P^n_{L,vertex}, \quad (7)$$
with $m$ and $n$ the indices of the vertex points at the two ends of the $j$th board edge. The vertex points are refined under these constraints in the LiDAR frame. We solve this optimization problem with the Levenberg-Marquardt algorithm implemented in Ceres.

3.4. Calibration Information in the Camera Frame
Using the ArUCO library [25, 26], with known camera intrinsic matrix $K$ and distortion coefficients $D$, the relative pose $T^C_B$ between the calibration board frame and the camera frame can be determined as
$$T^C_B = \begin{pmatrix} R^C_B & t^C_B \\ 0^T & 1 \end{pmatrix}. \quad (8)$$
Denoting the coordinates of the four board vertices in the calibration board frame as $P^j_{B,vertex}$ $(j = 1, 2, 3, 4)$, their corresponding coordinates in the camera frame are
$$P^j_{C,vertex} = T^C_B P^j_{B,vertex}, \quad j = 1, 2, 3, 4. \quad (9)$$

3.5. Extrinsic Parameter Estimation
After the vertex information of the calibration board has been accurately extracted from the point clouds (Section 3.3) and the images (Section 3.4), the extrinsic parameters between the LiDAR frame and the camera frame can be estimated from the point-to-point correspondences. With the vertex coordinates in the LiDAR frame and the camera frame, the point-to-point cost is
$$C = \sum_{i=1}^{n} \sum_{j=1}^{4} \|T^C_L P^j_{L,vertex,i} - P^j_{C,vertex,i}\|, \quad (10)$$
where $n$ is the number of calibration datasets and $j$ $(1 \le j \le 4)$ is the index of the board vertices. The optimization problem is solved with the Levenberg-Marquardt algorithm implemented in Ceres.
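For illustration only, a minimal Python stand-in for the Ceres-based solver of (10) is sketched below; it is not the implementation used in this paper, it assumes scipy with an axis-angle parameterisation, and it minimises the squared form of the residuals.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def residuals(x, P_L, P_C):
    """Stacked residuals of the cost (10); x = [rotation vector (3), translation (3)]."""
    R, t = Rotation.from_rotvec(x[:3]), x[3:]
    return (R.apply(P_L) + t - P_C).ravel()

def calibrate(P_L, P_C, x0=np.zeros(6)):
    """Estimate T_CL from matched board vertices (one board, i.e. 4 points, is enough)."""
    sol = least_squares(residuals, x0, args=(P_L, P_C), method="lm")
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(sol.x[:3]).as_matrix()
    T[:3, 3] = sol.x[3:]
    return T

# Toy check: recover a known transform from four noiseless "vertices".
R_true = Rotation.from_euler("xyz", [10.0, -5.0, 3.0], degrees=True)
t_true = np.array([0.10, -0.05, 0.20])
P_L = np.array([[2.0, 0.3, 0.0], [2.0, 0.9, 0.0], [2.0, 0.9, 0.6], [2.0, 0.3, 0.6]])
P_C = R_true.apply(P_L) + t_true
print(np.round(calibrate(P_L, P_C), 6))
```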
We summarize our proposed calibration method in Algorithm 1.

Algorithm 1: Calibration of extrinsic parameters between LiDAR and camera
Input: camera intrinsic matrix $K$ and distortion coefficients $D$; calibration data collected from $N$ board poses.
Output: extrinsic parameters $T^C_L$.
1: Extract the LiDAR points on the board and on the edges from the point clouds;
2: Calculate and refine the board vertices $P^j_{L,vertex,i}$, $j = 1, 2, 3, 4$;
3: Calculate the vertices in the camera frame $P^j_{C,vertex,i}$, $j = 1, 2, 3, 4$;
4: Estimate the extrinsic parameters $T^C_L$ using (10);
5: return $T^C_L$.

3.6. Evaluation Method of Calibration Results
Because the ground truth is not available in most cases, some papers use re-projection errors to evaluate the accuracy of calibration results. By transforming the points lying on the edges or at the vertices from the LiDAR frame to the camera frame with the estimated extrinsic parameters and projecting them onto the images, the re-projection error can be defined as the distance between the projected point and its corresponding line or vertex. However, the re-projection error is not an independent evaluation because it depends on the specific calibration method. For example, an evaluation based on the re-projection error of vertex points tends to favor point-based calibration methods. Intuitively, the re-projection error reflects the alignment between objects in 3D space and in the 2D image space, but it is not a sufficient and necessary condition for the alignment of two 3D frames. To evaluate various calibration methods fairly, we propose a 3D space distance to evaluate the calibration results, as given in (11), which is independent of the specific calibration method:
$$D = \frac{1}{N} \sum_{i=1}^{N} \frac{\|(T^C_L P^i_{L,edge} - P_{C,edge}) \times d_{C,edge}\|}{\|d_{C,edge}\|}, \quad (11)$$
where $T^C_L$ is the calibration result, $P^i_{L,edge}$ are the coordinates of a LiDAR point on the edge in the LiDAR frame, $P_{C,edge}$ are the coordinates of a point on the edge in the camera frame, $d_{C,edge}$ is the direction vector of the edge in the camera frame and $N$ is the number of points. $P^i_{L,edge}$ is directly extracted from the point cloud as introduced in Section 3.3.2. Although these laser points on the edge are not ideally accurate, as discussed in Section 3.2, using these raw sensor measurements to evaluate the calibration results is fair and directly reflects the calibration quality. This 3D space distance cannot be zero even for ideal calibration results, because of the systematic errors of LiDAR measurements, but a smaller space distance directly indicates a better calibration result.
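For clarity, a direct transcription of the evaluation distance (11) in Python/numpy (an added illustration; the argument layout is an assumption) is given below.

```python
import numpy as np

def edge_alignment_error(T_CL, P_L_edge, P_C_edge, d_C_edge):
    """Mean 3D point-to-line distance of Eq. (11).

    P_L_edge: (N, 3) raw LiDAR edge points in the LiDAR frame.
    P_C_edge: (3,) a point on the corresponding edge line in the camera frame.
    d_C_edge: (3,) direction vector of that edge line in the camera frame.
    """
    P = (T_CL[:3, :3] @ P_L_edge.T).T + T_CL[:3, 3]          # transform into the camera frame
    dist = np.linalg.norm(np.cross(P - P_C_edge, d_C_edge), axis=1) / np.linalg.norm(d_C_edge)
    return dist.mean()
```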
4. Experiment
4.1. Calibration Data Collection
In this paper, we use two multi-sensor device sets, denoted Device-1 and Device-2, to collect calibration data. Each device set comprises a Velodyne VLP-16 LiDAR and an RGB camera with a resolution of 752 × 480, rigidly mounted on a common base. As introduced in Section 3.1, two boards with different ArUCO markers are used as calibration targets to make full use of the camera field of view and to shorten the tedious and laborious calibration process. For each device, we collect calibration datasets from 27 different poses of the calibration boards with respect to the sensors, extract the feature information of the two calibration boards as introduced in Section 3.3 and Section 3.4, and finally obtain 54 sets of board information for the experiments.

4.2. Experiment of Vertex Refinement
The refinement of the vertex points is evaluated by computing the length of each board edge and the angle between neighboring edges, before and after the refinement introduced in Section 3.3. We denote the true length of the board edges as $L_{gt}$, the length calculated from the vertices as $L_{est}$, the true angle between neighboring edges as $\alpha_{gt}$, and the angle calculated from the vertices as $\alpha_{est}$. The errors are
$$e_L = \|L_{gt} - L_{est}\|, \quad (12)$$
$$e_\alpha = \|\alpha_{gt} - \alpha_{est}\|. \quad (13)$$
We compute the board edge length and the board angle of all 54 board poses for each device, with and without the vertex refinement, and report the mean and standard deviation of the errors in Table 1 and Table 2.

Table 1: The length error between the calculated value and the ground truth. 'w' means the error calculated from the vertices with the refinement and 'w/o' means the error calculated from the vertices without the refinement.
Device   | Length error mean, STD (m), w/o | Length error mean, STD (m), w
Device-1 | 4.23 × 10^-2, 1.90 × 10^-1      | 6.55 × 10^-4, 1.72 × 10^-4
Device-2 | 4.36 × 10^-2, 1.89 × 10^-1      | 6.87 × 10^-4, 1.79 × 10^-4

Table 2: The angle error between the calculated value and the ground truth. 'w' means the error calculated from the vertices with the refinement and 'w/o' means the error calculated from the vertices without the refinement.
Device   | Angle error mean, STD (deg), w/o | Angle error mean, STD (deg), w
Device-1 | 2.12, 5.66                       | 2.78 × 10^-3, 1.91 × 10^-3
Device-2 | 2.19, 5.82                       | 2.92 × 10^-3, 2.09 × 10^-3

After refining the vertices of the calibration boards using the prior information, the error caused by the measurement characteristics of the LiDAR is reduced to a remarkably low level. Without the vertex refinement, the length error can be up to 4 cm and the angle error up to 2 deg, which introduces error into the extraction of points from the board edges and the calculation of the vertex points, and deteriorates the extrinsic calibration between LiDAR and camera. Eliminating the error of the points extracted from the point clouds effectively improves the accuracy of the extrinsic parameter calibration, and we use the refined vertex points in the following extrinsic calibration experiments.

4.3. Experiment Results of Calibration
We evaluate the performance of our proposed method and compare it with other state-of-the-art methods [1, 6]. [1] introduced two kinds of geometric constraints to calibrate the extrinsic parameters. The first uses point-to-plane correspondences, built from the LiDAR points on the board plane and the plane parameters estimated in the camera frame; we denote this as Mishra's plane method. The second uses point-to-line correspondences, built from the LiDAR points on the edges and the back-projected plane parameters estimated in the camera frame; we denote this as Mishra's edge method. [6] used point-to-point correspondences between the board vertices to calibrate the extrinsic parameters, which we denote as Ankit's vertex method; however, that work computes the vertices only from the edge line parameters, without the refinement used in our method.
Figure 4: The mean and standard deviation of the 3D space distances are used to evaluate our proposed method and the other state-of-the-art calibration methods; in each panel the legend lists Mishra's plane method, Mishra's edge method, Ankit's vertex method and our proposed method, and the mean (m) or standard deviation (m) is plotted against the number of calibration datasets. (a) Mean of space distances for Device-1. (b) Standard deviation of space distances for Device-1. (c) Mean of space distances for Device-2. (d) Standard deviation of space distances for Device-2. n ∈ [2, 50] datasets are randomly selected for calibration and the others for evaluation. The results show that our method performs much better in calibration accuracy and stability, even when using fewer datasets.

We use the collected 54 sets of board information to perform the calibration for each LiDAR-camera device set, as shown in Figure 4. The experiments are conducted as cross-validation: we calibrate the extrinsic parameters using the different methods with two or more datasets and evaluate the calibration results on the remaining datasets. In detail, we randomly choose n ∈ [2, 50] datasets to calibrate the extrinsic parameters. For each n, we execute each calibration method 100 times and compute the mean and standard deviation of the space distance (11) over the executions for evaluation.
Figure 4a and Figure 4c show that the mean 3D space distance of our proposed method is much smaller than that of the other methods, which means our calibration method achieves more accurate calibration results. It is worth mentioning that even with only 2 datasets, our proposed calibration method suffers almost no loss in accuracy. Figure 4b and Figure 4d show that the other state-of-the-art methods obtain more stable calibration results as they use more calibration datasets, whereas our proposed method maintains stable calibration even with a few datasets.
For Mishra's plane method, the space distance is much larger than that of the other methods, because the point-to-plane correspondences constructed from one dataset can only provide 3 DoF (degree of freedom) constraints, as shown in Figure 5. The cost function used in the point-to-plane based calibration method is the distance between the LiDAR points and the board in the camera frame. A rotation of the laser points about the normal vector of the board plane or a translation of the laser points within the board plane does not violate this constraint, i.e., the cost function does not reflect such a change. This is why Mishra's plane method needs at least three non-parallel board poses for the extrinsic parameter calibration, and why more datasets lead to better results, as shown in Figure 4a and Figure 4c. The performance of Mishra's plane method depends on the quantity and quality of the collected calibration datasets. Note that the range error discussed in Section 3.2 also reduces the accuracy of this calibration method.
Figure 5: Point-to-plane correspondences of one calibration board cannot constrain all 6 DoFs of the extrinsic parameters. A rotation of the laser points about the normal vector of the plane or a translation of the laser points within the board plane does not violate the point-to-plane correspondence.
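The 3-DoF argument can be checked numerically: sliding the board points within the plane, or rotating them about the plane normal, leaves every point-to-plane residual unchanged, so a single board cannot fix those motions. A small sketch (an added illustration, with an assumed board at Y = 2 m):

```python
import numpy as np
from scipy.spatial.transform import Rotation

n, d = np.array([0.0, 1.0, 0.0]), -2.0        # board plane: n . x + d = 0, i.e. Y = 2 m
uv = np.random.default_rng(1).uniform(-0.5, 0.5, (100, 2))
pts = np.column_stack([uv[:, 0], np.full(100, 2.0), uv[:, 1]])   # points on the board plane

def point_to_plane(p):
    """Residual used by a plane-based calibration cost."""
    return p @ n + d

# An in-plane translation and a rotation about the plane normal leave the residuals at zero.
shifted = pts + np.array([0.3, 0.0, -0.2])
center = pts.mean(axis=0)
rotated = (pts - center) @ Rotation.from_rotvec(0.5 * n).as_matrix().T + center
print(np.abs(point_to_plane(pts)).max(),
      np.abs(point_to_plane(shifted)).max(),
      np.abs(point_to_plane(rotated)).max())   # all ~0: these 3 DoFs are unobservable
```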
For Mishra's edge method, since the point-to-line based calibration needs at least two datasets collected at different poses to determine the extrinsic parameters, its 3D space distance errors are smaller than those of Mishra's plane method. At the same time, the divergence error of LiDAR measurements discussed in Section 3.2 also influences the calibration results of this kind of method.
Ankit's vertex method performs better than Mishra's plane method and Mishra's edge method, because a dataset collected from only one pose is enough for the point-to-point based method to complete the extrinsic calibration. When the number of datasets increases, the 3D space distance error of Ankit's vertex method decreases but remains much larger than that of our method. The reason is that the vertex points are calculated from the line parameters of the board edges, which are estimated using the LiDAR points on the board edges, and the range error and divergence error of LiDAR measurements discussed in Section 3.2 notably decrease the accuracy of the edge line estimation.
Our proposed method uses a similar point-to-point correspondence to estimate the extrinsic parameters as Ankit's vertex method, but the board vertices used for the estimation are refined using the prior information of the calibration board to mitigate the influence of the systematic errors of LiDAR measurements. The 3D space distance error of our proposed method is remarkably smaller than that of the other methods and remains at a stable, low level even with just a few calibration datasets. In fact, our proposed method can accurately calibrate the extrinsic parameters with only one dataset collected at one pose, without any loss of performance.

Figure 6: Qualitative calibration results of Mishra's plane method, Mishra's edge method, Ankit's vertex method and our proposed method (panels (a)-(d) in the first row and (e)-(h) in the second row, respectively). The images in the first row show the re-projection of LiDAR points using the calibrated extrinsic parameters, which intuitively evaluates the calibration results; the colors of the projected LiDAR points indicate the distance from the origin of the camera frame. We conduct the calibrations using the different methods with the minimal number of required datasets: Mishra's plane method uses three datasets, Mishra's edge method uses two datasets, and Ankit's vertex method and our proposed method use one dataset; the datasets used for calibration are randomly selected from the 54 datasets. The images in the second row show the re-projection of the refined vertex points with the estimated extrinsic parameters. The red point on the board vertex is the true vertex, and the blue point close to it is the projection of the refined vertex using the extrinsic parameters estimated by each method. The distance between these two points clearly shows that our method achieves the best calibration accuracy.

To intuitively show the differences between the calibration results of the different methods, we re-project the raw laser scan and the calculated vertices onto the image using the estimated extrinsic parameters, as shown in Figure 6. The colors of the projected LiDAR points indicate the point depth from the origin of the camera frame.
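The re-projection used for this qualitative check can be reproduced by transforming the LiDAR points with the estimated extrinsics and projecting them through the camera model; a minimal OpenCV sketch (an added illustration, with placeholder intrinsics rather than the calibrated values of the 752 × 480 camera) is:

```python
import cv2
import numpy as np

def project_lidar_points(P_L, T_CL, K, dist):
    """Re-project LiDAR points onto the image; returns pixel coordinates and depths."""
    P_C = (T_CL[:3, :3] @ P_L.T).T + T_CL[:3, 3]
    P_C = P_C[P_C[:, 2] > 0]                       # keep only points in front of the camera
    uv, _ = cv2.projectPoints(P_C, np.zeros(3), np.zeros(3), K, dist)
    return uv.reshape(-1, 2), P_C[:, 2]            # depths can colour the points, as in Figure 6

# Placeholder intrinsics and distortion for a 752x480 image (assumed values).
K = np.array([[460.0, 0.0, 376.0], [0.0, 460.0, 240.0], [0.0, 0.0, 1.0]])
dist = np.zeros(5)
uv, depth = project_lidar_points(np.array([[0.5, 0.2, 3.0]]), np.eye(4), K, dist)
print(uv, depth)
```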
From the figure we can clearly see that our proposed method fits the laser points better to the calibration board and other foreground objects, such as the tripod, compared with the other state-of-the-art methods. Further, we zoom in on the re-projection results of the vertices in the highlighted ellipses, as shown in the second row of Figure 6. The red point on the board vertex serves as the reference location of the true board vertex. The blue point close to the board vertex is the projection of the refined board vertex extracted from the point cloud. The distance between these two points reflects the calibration performance. As shown in Figure 6h, this distance is smaller for our proposed method than for the other methods, which indicates that our calibration method is more accurate.

5. Conclusion
In this paper, we analyze the measurement characteristics and systematic errors of a LiDAR sensor, propose a refining method for vertex points aimed at eliminating the errors in LiDAR measurements, and further propose a calibration method for the extrinsic parameters using the refined vertex points. Through quantitative and qualitative experiments, we demonstrate the performance of our method and compare it with three other state-of-the-art calibration methods. The experimental results show that the refinement of vertex points effectively mitigates the influence of the measurement errors, and consequently our proposed calibration method performs notably better than the state-of-the-art methods in calibration accuracy and stability. To evaluate the calibration results fairly, we also introduce a 3D space distance based evaluation method, which uses raw LiDAR measurements and is fully independent of the different calibration methods discussed in this paper. Future work includes the error analysis of the calibration information from images, and a refinement method for the geometric features extracted from images to further improve the accuracy of the calibration results.

Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grants No. 62002359 and 61836015.

References
[1] S. Mishra, P. R. Osteen, G. Pandey, S. Saripalli, Experimental evaluation of 3d-lidar camera extrinsic calibration, in: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 9020–9026. doi:10.1109/IROS45743.2020.9340911.
[2] L. Zhou, Z. Li, M. Kaess, Automatic extrinsic calibration of a camera and a 3d lidar using line and plane correspondences, in: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, pp. 5562–5569. doi:10.1109/IROS.2018.8593660.
[3] G. Koo, J. Kang, B. Jang, N. Doh, Analytic plane covariances construction for precise planarity-based extrinsic calibration of camera and lidar, in: 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 6042–6048. doi:10.1109/ICRA40945.2020.9197149.
[4] R. Voges, B. Wagner, Set-membership extrinsic calibration of a 3d lidar and a camera, in: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 9012–9019. doi:10.1109/IROS45743.2020.9341266.
[5] Y. Bok, D. Choi, Y. Jeong, I. S. Kweon, Capturing village-level heritages with a hand-held camera-laser fusion sensor, in: 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 2009, pp. 947–954. doi:10.1109/ICCVW.2009.5457600.
[6] A. Dhall, K. Chelani, V. Radhakrishnan, K. M.
Krishna, Lidar-camera calibration using 3d-3d point correspondences, CoRR abs/1705.09785 (2017). URL: http://arxiv.org/abs/1705.09785. arXiv:1705.09785.
[7] Y. Xie, R. Shao, P. Guli, B. Li, L. Wang, Infrastructure based calibration of a multi-camera and multi-lidar system using apriltags, in: 2018 IEEE Intelligent Vehicles Symposium (IV), 2018, pp. 605–610. doi:10.1109/IVS.2018.8500646.
[8] S. Verma, J. S. Berrio, S. Worrall, E. Nebot, Automatic extrinsic calibration between a camera and a 3d lidar using 3d point and plane correspondences, in: 2019 IEEE Intelligent Transportation Systems Conference (ITSC), 2019, pp. 3906–3912. doi:10.1109/ITSC.2019.8917108.
[9] H. Alismail, L. D. Baker, B. Browning, Automatic calibration of a range sensor and camera system, in: 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission, 2012, pp. 286–292. doi:10.1109/3DIMPVT.2012.52.
[10] S. A. Rodriguez F., V. Fremont, P. Bonnifait, Extrinsic calibration between a multi-layer lidar and a camera, in: 2008 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, 2008, pp. 214–219. doi:10.1109/MFI.2008.4648067.
[11] J. Kümmerle, T. Kühner, M. Lauer, Automatic calibration of multiple cameras and depth sensors with a spherical target, in: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, pp. 1–8. doi:10.1109/IROS.2018.8593955.
[12] T. Tóth, Z. Pusztai, L. Hajder, Automatic lidar-camera calibration of extrinsic parameters using a spherical target, in: 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 8580–8586. doi:10.1109/ICRA40945.2020.9197316.
[13] J. Kang, N. L. Doh, Automatic targetless camera-lidar calibration by aligning edge with gaussian mixture model, Journal of Field Robotics 37 (2020) 158–179. URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/rob.21893. doi:10.1002/rob.21893.
[14] P. Moghadam, M. Bosse, R. Zlot, Line-based extrinsic calibration of range and image sensors, in: 2013 IEEE International Conference on Robotics and Automation, 2013, pp. 3685–3691. doi:10.1109/ICRA.2013.6631095.
[15] J. Castorena, U. S. Kamilov, P. T. Boufounos, Autocalibration of lidar and optical cameras via edge alignment, in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 2862–2866. doi:10.1109/ICASSP.2016.7472200.
[16] Y. Zhu, C. Li, Y. Zhang, Online camera-lidar calibration with sensor semantic information, in: 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 4970–4976. doi:10.1109/ICRA40945.2020.9196627.
[17] G. Pandey, J. R. McBride, S. Savarese, R. M. Eustice, Automatic targetless extrinsic calibration of a 3d lidar and camera by maximizing mutual information, in: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, AAAI'12, AAAI Press, 2012, pp. 2053–2059.
[18] A. Mastin, J. Kepner, J. Fisher, Automatic registration of lidar and optical images of urban scenes, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 2639–2646. doi:10.1109/CVPR.2009.5206539.
[19] C. Park, P. Moghadam, S. Kim, S. Sridharan, C. Fookes, Spatiotemporal camera-lidar calibration: A targetless and structureless approach, IEEE Robotics and Automation Letters 5 (2020) 1556–1563. doi:10.1109/LRA.2020.2969164.
[20] Z. Taylor, J.
Nieto, Motion-based calibration of multimodal sensor extrinsics and timing offset estimation, IEEE Transactions on Robotics 32 (2016) 1215–1229. doi:10.1109/TRO.2016.2596771.
[21] T. Scott, A. A. Morye, P. Piniés, L. M. Paz, I. Posner, P. Newman, Choosing a time and place for calibration of lidar-camera systems, in: 2016 IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 4349–4356. doi:10.1109/ICRA.2016.7487634.
[22] T. Scott, A. A. Morye, P. Piniés, L. M. Paz, I. Posner, P. Newman, Exploiting known unknowns: Scene induced cross-calibration of lidar-stereo systems, in: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 3647–3653. doi:10.1109/IROS.2015.7353887.
[23] Z. Xiao, H. Li, D. Zhou, Y. Dai, B. Dai, Accurate extrinsic calibration between monocular camera and sparse 3d lidar points without markers, in: 2017 IEEE Intelligent Vehicles Symposium (IV), 2017, pp. 424–429. doi:10.1109/IVS.2017.7995755.
[24] G. Pascoe, W. Maddern, P. Newman, Direct visual localisation and calibration for road vehicles in changing city environments, in: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), 2015, pp. 98–105. doi:10.1109/ICCVW.2015.23.
[25] F. J. Romero-Ramirez, R. Muñoz-Salinas, R. Medina-Carnicer, Speeded up detection of squared fiducial markers, Image and Vision Computing 76 (2018) 38–47. URL: https://www.sciencedirect.com/science/article/pii/S0262885618300799. doi:10.1016/j.imavis.2018.05.004.
[26] S. Garrido-Jurado, R. Muñoz-Salinas, F. Madrid-Cuevas, R. Medina-Carnicer, Generation of fiducial marker dictionaries using mixed integer linear programming, Pattern Recognition 51 (2016) 481–491. URL: https://www.sciencedirect.com/science/article/pii/S0031320315003544. doi:10.1016/j.patcog.2015.09.023.