Qitao Zhao (qitaoz), Fall 2024
The 8-point algorithm estimates the fundamental matrix ( F ) from eight or more point correspondences between two images. Each correspondence between points (x, y) in the first image and (x', y') in the second image contributes one row to the matrix ( A ), as shown below:
Given correspondences:
(x_i, y_i): coordinates in the first image
(x'_i, y'_i): coordinates in the second image
The matrix ( A ) is constructed as follows:
Each row of ( A ) is derived from a single correspondence, and the final matrix is ( n
Linear Solution: Solve for ( F ) by finding the null space of ( A ), specifically the vector ( f ) corresponding to ( A )'s smallest singular value. Reshape ( f ) into a ( 3
Constraint Enforcement: Enforce the rank-2 constraint on ( F ) by adjusting it to its nearest rank-2 approximation using Singular Value Decomposition (SVD). This ensures that ( F ) represents a valid fundamental matrix.
For the bench example:
[[ 1.19197327e-07 2.53110054e-06 -2.07464874e-04]
[ 3.96237477e-07 -2.26965824e-07 -2.48724802e-02]
[ 1.06829351e-03 2.29919977e-02 -9.99425607e-01]]
For the remote example:
[[-7.18941087e-07 1.90565734e-07 -2.36193611e-03]
[ 1.88141885e-06 8.60432205e-07 8.45608352e-03]
[ 3.98021997e-03 -9.46413878e-03 -9.99908748e-01]]
Viewpoint 1 (Points) | Viewpoint 2 (Epipolar Lines) |
---|---|
![]() | ![]() |
![]() | ![]() |
Given the fundamental matrix we computed above, the essential matrix is given by:
For the bench example:
[[ 0.18517716 3.93215204 0.95903963]
[ 0.6155686 -0.35259924 -30.80735191]
[ 1.60047095 30.53609736 -0.44594919]]
For the remote example:
[[ -1.01658402 0.2694603 -2.94588746]
[ 2.66032971 1.21665272 11.37159011]
[ 5.82700359 -10.61597778 -0.40184261]]
Normalization of Points: Normalize the coordinates of the points in both images to improve numerical stability. This involves transforming the points so that they are centered and have an average distance of (
Construct Matrix ( A ): Use the normalized points to construct the matrix ( A ). For each correspondence pair ((x, y)) and ((x', y')), create a row in ( A ) as the Kronecker product of the points:
Singular Value Decomposition (SVD): Perform SVD on ( A ) to find the fundamental matrices ( F_1 ) and ( F_2 ), which correspond to the last two columns of the null space of ( A ) (since ( A ) has seven rows and is ( 7
Linear Combination: Compute the fundamental matrix ( F ) as a linear combination of ( F_1 ) and ( F_2 ):
Find (
Unnormalization: Convert ( F ) back to the original coordinate system by unnormalizing it:
where ( T ) and ( T' ) are the normalization matrices for the first and second image, respectively.
This process yields up to several possible solutions for ( F ), from which the most suitable one is chosen based on additional criteria, if available. The 7-point algorithm is especially useful when exactly seven point correspondences are available, providing a reliable estimate of the fundamental matrix.
Viewpoint 1 (Points) | Viewpoint 2 (Epipolar Lines) |
---|---|
![]() | ![]() |
![]() | ![]() |
Fundamental matrix:
For the hydrant example:
[[ 1.06836663e-07 -3.30854241e-06 -1.37397509e-03]
[ 4.79268901e-06 1.63305018e-07 -1.67456794e-02]
[ 1.12974017e-03 1.64136613e-02 1.00000000e+00]]
For the ball example:
[[-3.35078059e-09+0.j -2.91545005e-06+0.j -6.74891733e-03+0.j]
[ 3.51682683e-06+0.j -8.84237457e-07+0.j -1.50044163e-02+0.j]
[ 9.09398538e-03+0.j 1.48748649e-02+0.j 1.00000000e+00+0.j]]
Essential matrix:
For the hydrant example:
[[ 0.15857475 -4.91077927 -4.16191946]
[ 7.1136576 0.24238918 -18.26768512]
[ 5.06442012 18.65933802 1.11386097]]
For the ball example:
[[-1.29362504e-02+0.j -1.12555838e+01+0.j -1.86746229e+01+0.j]
[ 1.35772998e+01+0.j -3.41374700e+00+0.j -2.74530807e+01+0.j]
[ 2.43843346e+01+0.j 2.45414016e+01+0.j 1.64166858e+00+0.j]]
Random Sampling:
Randomly sample 7 or 8 point correspondences to compute a candidate fundamental matrix ( F ) using either the 7-point or 8-point algorithm.
Reprojection and Inlier Counting:
Use ( F ) to reproject points from the second image onto the first (or vice versa), calculating the epipolar line ( l_2 ).
Measure the distance of each point in the second image (
Inlier Ratio Evaluation:
Compute the ratio of inliers to total correspondences. If the ratio is higher than any previous iteration, update ( F ) as the best estimate.
Iterate and Select Best Model:
Repeat this process, retaining the ( F ) that maximizes the inlier ratio, providing a robust estimate of the fundamental matrix that is less sensitive to outliers.
Here we use Hydrant as an example.
The estimated fundamental matrix:
[[ 1.30066947e-07 -2.67447440e-06 -1.73480584e-03]
[ 4.08684223e-06 2.46604403e-07 -1.62214550e-02]
[ 1.54457266e-03 1.57896599e-02 1.00000000e+00]]
The estimated essential matrix:
[[ 0.19305482 -3.96964942 -4.10784484]
[ 6.06598847 0.36602819 -17.86717574]
[ 5.04148418 18.23486105 1.09006968]]
Viewpoint 1 (Points) | Viewpoint 2 (Epipolar Lines) | Iter v.s. Inlier |
---|---|---|
![]() | ![]() | ![]() |
The estimated fundamental matrix:
[[-1.29431630e-07 2.43170768e-06 1.76940192e-03]
[-3.81790712e-06 -2.81330608e-07 1.57887795e-02]
[-1.61273104e-03 -1.52890588e-02 -9.99755584e-01]]
The estimated essential matrix:
[[ -0.19211184 3.6093174 3.96475579]
[ -5.66681543 -0.41757135 17.42778055]
[ -4.91945338 -17.7564847 -1.06676404]]
Viewpoint 1 (Points) | Viewpoint 2 (Epipolar Lines) | Iter v.s. Inlier |
---|---|---|
![]() | ![]() | ![]() |
Input Data:
Use the projection matrices ( P_1 ) and ( P_2 ) of the two cameras, along with matched 2D points ( x_1 ) and ( x_2 ) in each image.
Formulate Equations:
For each matched pair of points ( x_1 ) and ( x_2 ), define the corresponding 3D point ( X ) using the relationships ( P_1 X = x_1 ) and ( P_2 X = x_2 ).
Rearrange these into the form ( P_1 X
Stack Equations into Matrix ( A ):
Combine these equations into a single matrix equation ( A X = 0 ), where ( A ) is constructed as:
Here, ( P_1(i, :) ) and ( P_2(i, :) ) represent the ( i )-th row of the respective projection matrices, and each row in ( A ) corresponds to one of the projection constraints.
Solve with Singular Value Decomposition (SVD):
Perform SVD on ( A ), and obtain the 3D point ( X ) as the last column of the matrix ( V ) (from the decomposition ( A = U \Sigma V^T )), which minimizes the reprojection error.
The output of this system is as follows:
Here we tried COLMAP on two scenes:
Due to the limited space, we only show 4 of the input images (~50 in total)
Due to the limited space, we only show 6 of the input images (~50 in total)