16-822 Geometry-based Methods in Vision

Assignment 3: 3D Reconstruction

Qitao Zhao (qitaoz), Fall 2024

Q1: 8-point and 7-point algorithm

(A1) F matrix using 8-point algorithm

The 8-point algorithm estimates the fundamental matrix ( F ) from eight or more point correspondences between two images. Each correspondence between points (x, y) in the first image and (x', y') in the second image contributes one row to the matrix ( A ), as shown below:

Given correspondences:

The matrix ( A ) is constructed as follows:

(1)A=[x1x1x1y1x1y1x1y1y1y1x1y11x2x2x2y2x2y2x2y2y2y2x2y21xnxnxnynxnynxnynynynxnyn1]

Each row of ( A ) is derived from a single correspondence, and the final matrix is ( n × 9 ), where ( n 8 ) (the number of point correspondences).

Steps in the Algorithm

  1. Linear Solution: Solve for ( F ) by finding the null space of ( A ), specifically the vector ( f ) corresponding to ( A )'s smallest singular value. Reshape ( f ) into a ( 3 × 3 ) matrix, yielding an initial estimate of ( F ).

  2. Constraint Enforcement: Enforce the rank-2 constraint on ( F ) by adjusting it to its nearest rank-2 approximation using Singular Value Decomposition (SVD). This ensures that ( F ) represents a valid fundamental matrix.

Viewpoint 1 (Points)Viewpoint 2 (Epipolar Lines)

(A2) E matrix using 8-point algorithm

(B) 7-point algorithm

Steps in the 7-point Algorithm

  1. Normalization of Points: Normalize the coordinates of the points in both images to improve numerical stability. This involves transforming the points so that they are centered and have an average distance of (2) from the origin. The normalization matrices are denoted by ( T ) and ( T' ) for each image.

  2. Construct Matrix ( A ): Use the normalized points to construct the matrix ( A ). For each correspondence pair ((x, y)) and ((x', y')), create a row in ( A ) as the Kronecker product of the points:

    (3)A=[x1x1x1y1x1y1x1y1y1y1x1y11x7x7x7y7x7y7x7y7y7y7x7y71]
  3. Singular Value Decomposition (SVD): Perform SVD on ( A ) to find the fundamental matrices ( F_1 ) and ( F_2 ), which correspond to the last two columns of the null space of ( A ) (since ( A ) has seven rows and is ( 7 × 9 )).

  4. Linear Combination: Compute the fundamental matrix ( F ) as a linear combination of ( F_1 ) and ( F_2 ):

    (4)F=αF1+(1α)F2

    Find ( α ) such that ( det(F) = 0 ). This condition creates a cubic equation in ( α ), which can have up to three real roots. Choose the real root that satisfies the rank-2 constraint on ( F ).

  5. Unnormalization: Convert ( F ) back to the original coordinate system by unnormalizing it:

    (5)F=TFT

    where ( T ) and ( T' ) are the normalization matrices for the first and second image, respectively.

This process yields up to several possible solutions for ( F ), from which the most suitable one is chosen based on additional criteria, if available. The 7-point algorithm is especially useful when exactly seven point correspondences are available, providing a reliable estimate of the fundamental matrix.

Viewpoint 1 (Points)Viewpoint 2 (Epipolar Lines)

Q2: RANSAC with 7-point and 8-point algorithm

Steps in the Algorithm:

  1. Random Sampling:

    • Randomly sample 7 or 8 point correspondences to compute a candidate fundamental matrix ( F ) using either the 7-point or 8-point algorithm.

  2. Reprojection and Inlier Counting:

    • Use ( F ) to reproject points from the second image onto the first (or vice versa), calculating the epipolar line ( l_2 ).

    • Measure the distance of each point in the second image ( pts2 ) to the epipolar line ( l_2 ) and count inliers where the sum of distances is below a threshold (e.g., 2 pixels).

  3. Inlier Ratio Evaluation:

    • Compute the ratio of inliers to total correspondences. If the ratio is higher than any previous iteration, update ( F ) as the best estimate.

  4. Iterate and Select Best Model:

    • Repeat this process, retaining the ( F ) that maximizes the inlier ratio, providing a robust estimate of the fundamental matrix that is less sensitive to outliers.

Here we use Hydrant as an example.

7-point Algorithm

The estimated fundamental matrix:

[[ 1.30066947e-07 -2.67447440e-06 -1.73480584e-03]

[ 4.08684223e-06 2.46604403e-07 -1.62214550e-02]

[ 1.54457266e-03 1.57896599e-02 1.00000000e+00]]

The estimated essential matrix:

[[ 0.19305482 -3.96964942 -4.10784484]

[ 6.06598847 0.36602819 -17.86717574]

[ 5.04148418 18.23486105 1.09006968]]

Viewpoint 1 (Points)Viewpoint 2 (Epipolar Lines)Iter v.s. Inlier

8-point Algorithm

The estimated fundamental matrix:

[[-1.29431630e-07 2.43170768e-06 1.76940192e-03]

[-3.81790712e-06 -2.81330608e-07 1.57887795e-02]

[-1.61273104e-03 -1.52890588e-02 -9.99755584e-01]]

The estimated essential matrix:

[[ -0.19211184 3.6093174 3.96475579]

[ -5.66681543 -0.41757135 17.42778055]

[ -4.91945338 -17.7564847 -1.06676404]]

Viewpoint 1 (Points)Viewpoint 2 (Epipolar Lines)Iter v.s. Inlier

Q3: Triangulation

Steps in the Triangulation Algorithm

  1. Input Data:

    • Use the projection matrices ( P_1 ) and ( P_2 ) of the two cameras, along with matched 2D points ( x_1 ) and ( x_2 ) in each image.

  2. Formulate Equations:

    • For each matched pair of points ( x_1 ) and ( x_2 ), define the corresponding 3D point ( X ) using the relationships ( P_1 X = x_1 ) and ( P_2 X = x_2 ).

    • Rearrange these into the form ( P_1 X × x_1 = 0 ) and ( P_2 X × x_2 = 0 ), which express that ( X ) should lie on the epipolar lines induced by the projections.

  3. Stack Equations into Matrix ( A ):

    • Combine these equations into a single matrix equation ( A X = 0 ), where ( A ) is constructed as:

      (6)A=[x1TP1(3,:)P1(1,:)x1TP1(3,:)P1(2,:)x2TP2(3,:)P2(1,:)x2TP2(3,:)P2(2,:)]

    Here, ( P_1(i, :) ) and ( P_2(i, :) ) represent the ( i )-th row of the respective projection matrices, and each row in ( A ) corresponds to one of the projection constraints.

  4. Solve with Singular Value Decomposition (SVD):

    • Perform SVD on ( A ), and obtain the 3D point ( X ) as the last column of the matrix ( V ) (from the decomposition ( A = U \Sigma V^T )), which minimizes the reprojection error.

The output of this system is as follows:

Q4: Reconstruct your own scene!

Here we tried COLMAP on two scenes:

Hydrant:

Due to the limited space, we only show 4 of the input images (~50 in total)

Loading animation

Teddybear:

Due to the limited space, we only show 6 of the input images (~50 in total)

Loading animation