16822 Assignment 3¶

Zi Wang (ziwang2)

Q1: 8-point and 7-point algorithm (40 points)¶

(A1) F matrix using 8-point algorithm (15 points)¶

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} -1.19265833e-07 & -2.53255522e-06 & 2.07584109e-04 \\ -3.96465204e-07 & 2.27096267e-07 & 2.48867750e-02 \\ -1.06890748e-03 & -2.30052117e-02 & 1.00000000e+00 \end{bmatrix} $$$$ E = \begin{bmatrix} -0.18528359 & -3.93441194 & -0.95959081 \\ -0.61592238 & 0.35280189 & 30.82505761 \\ -1.60139078 & -30.55364717 & 0.44620549 \end{bmatrix} $$
Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} 7.19006698e-07 & -1.90583125e-07 & 2.36215166e-03 \\ -1.88159055e-06 & -8.60510729e-07 & -8.45685523e-03 \\ -3.98058321e-03 & 9.46500248e-03 & 1.00000000e+00 \end{bmatrix} $$$$ E = \begin{bmatrix} 1.01667679 & -0.26948489 & 2.9461563 \\ -2.66057249 & -1.21676375 & -11.37262789 \\ -5.82753537 & 10.6169466 & 0.40187929 \end{bmatrix} $$

Implementation Details:¶

For the fundamental matrix, we use 8-point algorithm. First, the points are normalized with transformation $ T_1 $ and $ T_2 $. Here $$\mathbf{T}_i = \begin{bmatrix}s_i & 0 & - s_i x_{io} \\ 0 & s_i & - s_i y_{io} \\ 0 & 0 & 1 \end{bmatrix}$$ $$(x_{io}, y_{io}) = \frac{1}{\mathbf{N}} \sum{p_i}$$ $$s_i = \frac{\sqrt{2}}{d_{avg}}$$ $$d_{avg} = \frac{1}{\mathbf{N}} \sum{\sqrt{(x_i - x_{io})^2 + (y_i - y_{io})^2}}$$ Then, we construct the matrix $ A $ using the normalized points. Each row of $ A $ is written as: $$ A_i = [x_i x'_i, x_i y'_i, x_i, y_i x'_i, y_i y'_i, y_i, x'_i, y'_i, 1]. $$ Next, singular value decomposition (SVD) is applied to $ A $, and the fundamental matrix $ F $ is derived from the last column of the matrix $ V $. To enforce the rank-2 constraint, SVD is performed again on $ F $, and the smallest singular value is set to zero to obtain the final matrix $ F $. Finally, the fundamental matrix is denormalized by applying $ F = T_2^T F T_1 $.

We can further calculate the essential matrix $E$ as $E=K^T_2 F K_1$, where $K1$ and $K_2$ are the intrinsic matrices.

(B) 7-point algorithm (20 points)¶

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} -3.35078059e-09 & -2.91545005e-06 & -6.74891733e-03 \\ 3.51682683e-06 & -8.84237457e-07 & -1.50044163e-02 \\ 9.09398538e-03 & 1.48748649e-02 & 1.00000000e+00 \end{bmatrix} $$$$ E = \begin{bmatrix} -1.29362504e-02 & -1.12555838e+01 & -1.86746229e+01 \\ 1.35772998e+01 & -3.41374700e+00 & -2.74530807e+01 \\ 2.43843346e+01 & 2.45414016e+01 & 1.64166858e+00 \end{bmatrix} $$
Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} 1.06836663e-07 & -3.30854241e-06 & -1.37397509e-03 \\ 4.79268901e-06 & 1.63305018e-07 & -1.67456794e-02 \\ 1.12974017e-03 & 1.64136613e-02 & 1.00000000e+00 \end{bmatrix} $$$$ E = \begin{bmatrix} 0.15857475 & -4.91077927 & -4.16191946 \\ 7.1136576 & 0.24238918 & -18.26768512 \\ 5.06442012 & 18.65933802 & 1.11386097 \end{bmatrix} $$

Implementation Details:¶

We use 7-point algorithm to calculate the fundamental matrix.

The reconstruction of the matrix $A$ is the same as Q1(a), and we also conduct SVD on this matrix. We can then get two possible solusions $F_1$ and $F_2$ using the last two columns of the matrix $V$. The set of possible fundamental matrices is determined by the values of $ \alpha $ that satisfy $ \det(\alpha\mathbf{F}_1 + (1 - \alpha)\mathbf{F}_2) = 0 $. Solving this cubic equation can result in multiple real roots. For each potential fundamental matrix $ \mathbf{F} $, the total error is computed based on how far the corresponding points are from their respective epipolar lines. The fundamental matrix that produces the smallest error is chosen as the best solution. Then we perform the same transformation as Q1(a) to obtain the fundamental matrix.

Q2: RANSAC with 7-point and 8-point algorithm (20 points)¶

RANSAC + 8-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} -1.16159794e-07 & -2.82640795e-06 & 8.55354514e-05 \\ -1.88045745e-08 & 2.77560899e-07 & 2.40778862e-02 \\ -9.20385604e-04 & -2.23850401e-02 & 1.00000000e+00 \end{bmatrix} $$

RANSAC + 7-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} -1.57555217e-07 & -3.54329069e-06 & 2.15099672e-04 \\ 5.53352066e-07 & 2.11592147e-08 & 2.49076418e-02 \\ -1.03020517e-03 & -2.29717205e-02 & 1.00000000e+00 \end{bmatrix} $$

RANSAC + 8-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} 1.53418120e-06 & -2.59050815e-06 & 3.15713946e-03 \\ -8.60736402e-07 & -7.03016912e-07 & -9.93002311e-03 \\ -5.00963458e-03 & 1.14479731e-02 & 1.00000000e+00 \end{bmatrix} $$

RANSAC + 7-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} 1.02291096e-06 & -1.17216049e-06 & 2.57028730e-03 \\ -1.63342399e-06 & -5.93239453e-07 & -9.14184787e-03 \\ -4.09580318e-03 & 1.02186925e-02 & 1.00000000e+00 \end{bmatrix} $$

RANSAC + 8-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} -9.52109981e-08 & -3.14779558e-06 & -5.31780567e-03 \\ 3.72887064e-06 & -9.11811489e-07 & -1.36359696e-02 \\ 7.26709570e-03 & 1.36557626e-02 & 1.00000000e+00 \end{bmatrix} $$

RANSAC + 7-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} 9.22737906e-09 & -2.88027862e-06 & -7.71150052e-03 \\ 3.45362992e-06 & -8.89319044e-07 & -1.59957905e-02 \\ 1.03988901e-02 & 1.57720905e-02 & 1.00000000e+00 \end{bmatrix} $$

RANSAC + 8-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} 1.12852148e-07 & -3.28745950e-06 & -1.47199005e-03 \\ 4.74735673e-06 & 1.23210502e-07 & -1.65336893e-02 \\ 1.23684392e-03 & 1.62180674e-02 & 1.00000000e+00 \end{bmatrix} $$

RANSAC + 7-point algorithm

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
$$ F = \begin{bmatrix} 6.79876148e-08 & -5.10001081e-06 & -1.00394230e-03 \\ 6.64456871e-06 & -1.64380270e-07 & -1.77984741e-02 \\ 6.97430158e-04 & 1.77758776e-02 & 1.00000000e+00 \end{bmatrix} $$

Implementation Details:¶

During each iteration, a random selection of the required number of correspondences is made from the available set. Using these correspondences, the fundamental matrix $ \mathbf{F} $ is estimated by applying either the 7-point or 8-point algorithm, as described previously. Once the matrix is computed, the error for each correspondence is calculated. For a given pair of points $ (p_1, p_2) $, the corresponding epipolar lines are determined by calculating $ l_1 = \mathbf{F}_i p_1 $ and $ l_2 = \mathbf{F}_i^T p_2 $. The total error is derived by summing the distance between point $ p_2 $ and line $ l_1 $, and the distance between $ p_1 $ and line $ l_2 $.

Afterward, the number of inliers is determined by comparing the total error to a predefined threshold (I set threshold=3). The fundamental matrix $ \mathbf{F} $ candidate that results in the most inliers is tracked throughout the iterations. Once the iterations are complete (the total number of iterations is 10000), the final fundamental matrix is computed using the inliers identified from the best candidate.

Q3: Triangulation (20 points)¶

Viewpoint 1 Viewpoint 2 Viewpoint 3

Implementation Details:¶

The algorithm uses the projection matrices $ P_1 $ and $ P_2 $ from both cameras, along with their corresponding matched points. For each set of matched points, we aim to compute the 3D point $ X $. Given the relationships $ P_1 X = x_1 $ and $ P_2 X = x_2 $, we can formulate the problem as solving $ P_1 X \times p_1 = 0 $ and $ P_2 X \times p_2 = 0 $. These two equations can be combined into a matrix form $ A X = 0 $, where the matrix $ A $ is defined as:

$$ A = \begin{bmatrix} p_1[0]~P_1[2,:] - P_1[0,:] \\ p_1[1]~P_1[2,:] - P_1[1,:] \\ p_2[0]~P_2[2,:] - P_2[0,:] \\ p_2[1]~P_2[2,:] - P_2[1,:] \end{bmatrix} $$

Next, SVD is applied to the matrix $ A $, and the 3D point $ X $ is derived from the last column of the matrix $ V $.

Q4: Reconstruct your own scene! (20 points)¶

I collected the images from Mip-NeRF 360 dataset.

Scene 1:

Input images: (311 in total)

Output:

Scene 2:

Input images: (125 in total)

Output:

Q5: Bonus 1 - Fundamental matrix estimation on your own images. (10 points)¶

Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)
Viewpoint 1 (Points) Viewpoint 2 (Epipolar Lines)

Implementation Details:¶

I use a SIFT detector to detect keypoints and compute their descriptors for both images. After that, I use a brute-force matcher to find the closest matches between the two sets of descriptors. To filter out false matches, I apply the ratio test: only matches where the closest match is significantly better than the second closest are kept. So far, I can get the matching keypoints from both images, along with the list of good matches. Finally, I use RANSAC and 8-point algorithm previously mentioned to estimate the fundamental matrix.