Add more references. Update missing references with webarchive. Use mathbf for matrices. Check that the determinant of the rotation matrix is not negative (reflection), and correct it if it is the case.
This tutorial will demonstrate the basic concepts of the homography with some codes.
For detailed explanations about the theory, please refer to a computer vision course or a computer vision book, e.g.:
* Multiple View Geometry in Computer Vision, @cite HartleyZ00.
* An Invitation to 3-D Vision: From Images to Geometric Models, @cite Ma:2003:IVI
* Computer Vision: Algorithms and Applications, @cite RS10
* Multiple View Geometry in Computer Vision, Richard Hartley and Andrew Zisserman, @cite HartleyZ00 (some sample chapters are available [here](https://www.robots.ox.ac.uk/~vgg/hzbook/), CVPR Tutorials are available [here](https://www.robots.ox.ac.uk/~az/tutorials/))
* An Invitation to 3-D Vision: From Images to Geometric Models, Yi Ma, Stefano Soatto, Jana Kosecka, and S. Shankar Sastry, @cite Ma:2003:IVI (a computer vision book handout is available [here](https://cs.gmu.edu/%7Ekosecka/cs685/VisionBookHandout.pdf))
* Computer Vision: Algorithms and Applications, Richard Szeliski, @cite RS10 (an electronic version is available [here](https://szeliski.org/Book/))
* Deeper understanding of the homography decomposition for vision-based control, Ezio Malis, Manuel Vargas, @cite Malis2007 (open access [here](https://hal.inria.fr/inria-00174036))
* Pose Estimation for Augmented Reality: A Hands-On Survey, Eric Marchand, Hideaki Uchiyama, Fabien Spindler, @cite Marchand16 (open access [here](https://hal.inria.fr/hal-01246370))
The tutorial code can be found here [C++](https://github.com/opencv/opencv/tree/3.4/samples/cpp/tutorial_code/features2D/Homography),
This is a quick solution (see also \ref projective_transformations "2") as this does not ensure that the resulting rotation matrix will be orthogonal and the scale is estimated roughly by normalize the first column to 1.
A solution to have a proper rotation matrix (with the properties of a rotation matrix) consists to apply a polar decomposition
(see \ref polar_decomposition "6" or \ref polar_decomposition_svd "7" for some information):
A solution to have a proper rotation matrix (with the properties of a rotation matrix) consists to apply a polar decomposition, or orthogonalization of the rotation matrix
(see \ref polar_decomposition "6" or \ref polar_decomposition_svd "7" or \ref polar_decomposition_svd_2 "8" or \ref Kabsch_algorithm "9" for some information):
@ -239,7 +241,7 @@ To check the correctness of the calculation, the matching lines are displayed:
### Demo 3: Homography from the camera displacement {#tutorial_homography_Demo3}
The homography relates the transformation between two planes and it is possible to retrieve the corresponding camera displacement that allows to go from the first to the second plane view (see @cite Malis for more information).
The homography relates the transformation between two planes and it is possible to retrieve the corresponding camera displacement that allows to go from the first to the second plane view (see @cite Malis2007 for more information).
Before going into the details that allow to compute the homography from the camera displacement, some recalls about camera pose and homogeneous transformation.
The function @ref cv::solvePnP allows to compute the camera pose from the correspondences 3D object points (points expressed in the object frame) and the projected 2D image points (object points viewed in the image).
@ -269,7 +271,7 @@ The intrinsic parameters and the distortion coefficients are required (see the c
Z_o \\
1
\end{bmatrix} \\
&= \boldsymbol{K} \hspace{0.2em} ^{c}\textrm{M}_o
&= \mathbf{K} \hspace{0.2em} ^{c}\mathbf{M}_o
\begin{bmatrix}
X_o \\
Y_o \\
@ -279,9 +281,9 @@ The intrinsic parameters and the distortion coefficients are required (see the c
\end{align*}
\f]
\f$ \boldsymbol{K} \f$ is the intrinsic matrix and \f$ ^{c}\textrm{M}_o \f$ is the camera pose. The output of @ref cv::solvePnP is exactly this: `rvec` is the Rodrigues rotation vector and `tvec` the translation vector.
\f$ \mathbf{K} \f$ is the intrinsic matrix and \f$ ^{c}\mathbf{M}_o \f$ is the camera pose. The output of @ref cv::solvePnP is exactly this: `rvec` is the Rodrigues rotation vector and `tvec` the translation vector.
\f$ ^{c}\textrm{M}_o \f$ can be represented in a homogeneous form and allows to transform a point expressed in the object frame into the camera frame:
\f$ ^{c}\mathbf{M}_o \f$ can be represented in a homogeneous form and allows to transform a point expressed in the object frame into the camera frame:
\f[
\begin{align*}
@ -291,7 +293,7 @@ The intrinsic parameters and the distortion coefficients are required (see the c
Z_c \\
1
\end{bmatrix} &=
\hspace{0.2em} ^{c}\textrm{M}_o
\hspace{0.2em} ^{c}\mathbf{M}_o
\begin{bmatrix}
X_o \\
Y_o \\
@ -300,7 +302,7 @@ The intrinsic parameters and the distortion coefficients are required (see the c
\end{bmatrix} \\
&=
\begin{bmatrix}
^{c}\textrm{R}_o & ^{c}\textrm{t}_o \\
^{c}\mathbf{R}_o & ^{c}\mathbf{t}_o \\
0_{1\times3} & 1
\end{bmatrix}
\begin{bmatrix}
@ -327,19 +329,19 @@ The intrinsic parameters and the distortion coefficients are required (see the c
Transform a point expressed in one frame to another frame can be easily done with matrix multiplication:
* \f$ ^{c_1}\textrm{M}_o \f$ is the camera pose for the camera 1
* \f$ ^{c_2}\textrm{M}_o \f$ is the camera pose for the camera 2
* \f$ ^{c_1}\mathbf{M}_o \f$ is the camera pose for the camera 1
* \f$ ^{c_2}\mathbf{M}_o \f$ is the camera pose for the camera 2
To transform a 3D point expressed in the camera 1 frame to the camera 2 frame:
@ -362,11 +364,11 @@ On this figure, `n` is the normal vector of the plane and `d` the distance betwe
The [equation](https://en.wikipedia.org/wiki/Homography_(computer_vision)#3D_plane_to_plane_equation) to compute the homography from the camera displacement is:
Where \f$ ^{2}\textrm{H}_{1} \f$ is the homography matrix that maps the points in the first camera frame to the corresponding points in the second camera frame, \f$ ^{2}\textrm{R}_{1} = \hspace{0.2em} ^{c_2}\textrm{R}_{o} \cdot \hspace{0.1em} ^{c_1}\textrm{R}_{o}^{T} \f$
is the rotation matrix that represents the rotation between the two camera frames and \f$ ^{2}\textrm{t}_{1} = \hspace{0.2em} ^{c_2}\textrm{R}_{o} \cdot \left( - \hspace{0.1em} ^{c_1}\textrm{R}_{o}^{T} \cdot \hspace{0.1em} ^{c_1}\textrm{t}_{o} \right ) + \hspace{0.1em} ^{c_2}\textrm{t}_{o} \f$
Where \f$ ^{2}\mathbf{H}_{1} \f$ is the homography matrix that maps the points in the first camera frame to the corresponding points in the second camera frame, \f$ ^{2}\mathbf{R}_{1} = \hspace{0.2em} ^{c_2}\mathbf{R}_{o} \cdot \hspace{0.1em} ^{c_1}\mathbf{R}_{o}^{\top} \f$
is the rotation matrix that represents the rotation between the two camera frames and \f$ ^{2}\mathbf{t}_{1} = \hspace{0.2em} ^{c_2}\mathbf{R}_{o} \cdot \left( - \hspace{0.1em} ^{c_1}\mathbf{R}_{o}^{\top} \cdot \hspace{0.1em} ^{c_1}\mathbf{t}_{o} \right ) + \hspace{0.1em} ^{c_2}\mathbf{t}_{o} \f$
the translation vector between the two camera frames.
Here the normal vector `n` is the plane normal expressed in the camera frame 1 and can be computed as the cross product of 2 vectors (using 3 non collinear points that lie on the plane) or in our case directly with:
@ -377,7 +379,7 @@ The distance `d` can be computed as the dot product between the plane normal and
The projective homography matrix \f$ \textbf{G} \f$ can be computed from the Euclidean homography \f$ \textbf{H} \f$ using the intrinsic matrix \f$ \textbf{K} \f$ (see @cite Malis), here assuming the same camera between the two plane views:
The projective homography matrix \f$ \textbf{G} \f$ can be computed from the Euclidean homography \f$ \textbf{H} \f$ using the intrinsic matrix \f$ \textbf{K} \f$ (see @cite Malis2007), here assuming the same camera between the two plane views:
@ -466,8 +468,8 @@ As you can see, there is one solution that matches almost perfectly with the com
At least two of the solutions may further be invalidated if point correspondences are available by applying positive depth constraint (all points must be in front of the camera).
```
As the result of the decomposition is a camera displacement, if we have the initial camera pose \f$ ^{c_1}\textrm{M}_{o} \f$, we can compute the current camera pose
\f$ ^{c_2}\textrm{M}_{o} = \hspace{0.2em} ^{c_2}\textrm{M}_{c_1} \cdot \hspace{0.1em} ^{c_1}\textrm{M}_{o} \f$ and test if the 3D object points that belong to the plane are projected in front of the camera or not.
As the result of the decomposition is a camera displacement, if we have the initial camera pose \f$ ^{c_1}\mathbf{M}_{o} \f$, we can compute the current camera pose
\f$ ^{c_2}\mathbf{M}_{o} = \hspace{0.2em} ^{c_2}\mathbf{M}_{c_1} \cdot \hspace{0.1em} ^{c_1}\mathbf{M}_{o} \f$ and test if the 3D object points that belong to the plane are projected in front of the camera or not.
Another solution could be to retain the solution with the closest normal if we know the plane normal expressed at the camera 1 pose.
The same thing but with the homography matrix estimated with @ref cv::findHomography
@ -516,7 +518,7 @@ The [stitching module](@ref stitching) provides a complete pipeline to stitch im
The homography transformation applies only for planar structure. But in the case of a rotating camera (pure rotation around the camera axis of projection, no translation), an arbitrary world can be considered
The homography can then be computed using the rotation transformation and the camera intrinsic parameters as (see for instance \ref homography_course "8"):
The homography can then be computed using the rotation transformation and the camera intrinsic parameters as (see for instance \ref homography_course "10"):
\f[
s
@ -534,7 +536,7 @@ The homography can then be computed using the rotation transformation and the ca
\f]
To illustrate, we used Blender, a free and open-source 3D computer graphics software, to generate two camera views with only a rotation transformation between each other.
More information about how to retrieve the camera intrinsic parameters and the `3x4` extrinsic matrix with respect to the world can be found in \ref answer_blender "9" (an additional transformation
More information about how to retrieve the camera intrinsic parameters and the `3x4` extrinsic matrix with respect to the world can be found in \ref answer_blender "11" (an additional transformation
is needed to get the transformation between the camera and the object frames) with Blender.
The figure below shows the two generated views of the Suzanne model, with only a rotation transformation:
* \anchor szeliski 3. [Computer Vision: Algorithms and Applications](https://szeliski.org/Book/), Richard Szeliski
* \anchor answer_dsp 4. [Step by Step Camera Pose Estimation for Visual Tracking and Planar Markers](https://dsp.stackexchange.com/a/2737)
* \anchor pose_ar 5. [Pose from homography estimation](https://team.inria.fr/lagadic/camera_localization/tutorial-pose-dlt-planar-opencv.html)
* \anchor pose_ar 5. [Pose from homography estimation](https://visp-doc.inria.fr/doxygen/camera_localization/tutorial-pose-dlt-planar-opencv.html)
* \anchor polar_decomposition 6. [Polar Decomposition (in Continuum Mechanics)](http://www.continuummechanics.org/polardecomposition.html)
* \anchor polar_decomposition_svd 7. [A Personal Interview with the Singular Value Decomposition](https://web.stanford.edu/~gavish/documents/SVD_ans_you.pdf), Matan Gavish
* \anchor homography_course 8. [Homography](http://people.scs.carleton.ca/~c_shu/Courses/comp4900d/notes/homography.pdf), Dr. Gerhard Roth
* \anchor answer_blender 9. [3x4 camera matrix from blender camera](https://blender.stackexchange.com/a/38210)
* \anchor polar_decomposition_svd 7. [Chapter 3 - 3.1.2 From matrices to rotations - Theorem 3.1 (Least-squares estimation of a rotation from a matrix K)](https://www-sop.inria.fr/asclepios/cours/MVA/Rotations.pdf)
* \anchor polar_decomposition_svd_2 8. [A Personal Interview with the Singular Value Decomposition](https://web.stanford.edu/~gavish/documents/SVD_ans_you.pdf), Matan Gavish
* \anchor Kabsch_algorithm 9. [Kabsch algorithm, Computation of the optimal rotation matrix](https://en.wikipedia.org/wiki/Kabsch_algorithm#Computation_of_the_optimal_rotation_matrix)
* \anchor homography_course 10. [Homography](http://people.scs.carleton.ca/~c_shu/Courses/comp4900d/notes/homography.pdf), Dr. Gerhard Roth
* \anchor answer_blender 11. [3x4 camera matrix from blender camera](https://blender.stackexchange.com/a/38210)