16-822 | Qin Han | qinh@andrew.cmu.edu
Bench: \[ F = \begin{bmatrix} -1.19265833e-07 &-2.53255522e-06 & 2.07584109e-04 \\ -3.96465204e-07 & 2.27096267e-07 & 2.48867750e-02 \\ -1.06890748e-03 &-2.30052117e-02 &1.0 \end{bmatrix} \] Remote: \[ F = \begin{bmatrix} 7.19006698e-07 &-1.90583125e-07 &2.36215166e-03 \\ -1.88159055e-06& -8.60510729e-07 &-8.45685523e-03 \\ -3.98058321e-03 & 9.46500248e-03 & 1.0 \end{bmatrix} \]
Fundamental matrix \(F\) is calculated by the 8-point algorithm.
First, we normalize the points. Then we calculate the matrix \(A\) by the normalized points. Each row of the matrix \(A\) is the Kronecker product of the two points, e.g. \(A_i = [x_ix'_i, x_iy'_i, x_i, y_ix'_i, y_iy'_i, y_i, x'_i, y'_i, 1]\).
Then we conduct singular value decomposition on the matrix \(A\), and get the fundamental matrix \(F\) by the last column of the matrix \(V\).
To enforce the rank-2 constraint, we conduct singular value decomposition on the matrix \(F\). And set the smallest singular value to 0 to get the final fundamental matrix \(F\).
Finally, we unnormalize the fundamental matrix \(F\) by \(F = T_2^T F T_1\), where \(T_1\) and \(T_2\) are the normalization matrices.
Bench: \[ E = \begin{bmatrix} -0.41524274 & -8.81748894 & -2.15055808 \\ -1.3803559 & 0.79067134 & 69.08265036 \\ -3.58890876 & -68.47438701 & 1. \end{bmatrix} \] Remote: \[ E = \begin{bmatrix} 2.5298064 & -0.67056178 & 7.33094837 \\ -6.62032749 & -3.02768466 & -28.29861665 \\ -14.50071092 & 26.41824781 & 1. \end{bmatrix} \]
Essential matrix \(E\) is calculated by \(E = K_2^T F K_1\), where \(K_1\) and \(K_2\) are the camera calibration matrices.
Ball: \[ F = \begin{bmatrix} -3.3507805e-09 &-2.9154501e-06 &-6.7489175e-03 \\ 3.5168268e-06& -8.8423747e-07 &-1.5004416e-02 \\ 9.0939850e-03 & 1.4874865e-02 & 1.0 \end{bmatrix} \] \[ E = \begin{bmatrix} -7.87994079e-03 &-6.85618544e+00 &-1.13753920e+01 \\ 8.27042675e+00 &-2.07943749e+00& -1.67226696e+01 \\ 1.48533850e+01 & 1.49490595e+01& 1.0 \end{bmatrix} \] Hydrant: \[ F = \begin{bmatrix} 1.06836666e-07 & -3.30854232e-06 &-1.37397507e-03 \\ 4.79268920e-06 & 1.63305018e-07& -1.67456791e-02 \\ 1.12974015e-03 & 1.64136607e-02 & 1.0 \end{bmatrix} \] \[ E = \begin{bmatrix} 0.14236493 &-4.40879 & -3.7364802 \\ 6.386486 & 0.2176117 & -16.400328\\ 4.5467257 & 16.751945 & 1. \end{bmatrix} \]
We use the 7-point algorithm to calculate the fundamental matrix \(F\).
We first normalize the points. Then we calculate the matrix \(A\) by the normalized points. Each row of the matrix \(A\) is the Kronecker product of the two points, e.g. \(A_i = [x_ix'_i, x_iy'_i, x_i, y_ix'_i, y_iy'_i, y_i, x'_i, y'_i, 1]\).
Then we conduct singular value decomposition on the matrix \(A\), and get the fundamental matrix \(F_1\) and \(F_2\) by the last two columns of the matrix \(V\).
Then we calculate the fundamental matrix \(F\) by the linear combination of \(F_1\) and \(F_2\), e.g. \(F = \alpha F_1 + (1-\alpha) F_2\). And we choose the \(F\) so that the determinant of the matrix \(F\) is 0. This requires solving a cubic equation over \(\alpha\) and choosing the real root.
Finally, we unnormalize the fundamental matrix \(F\) by \(F = T_2^T F T_1\), where \(T_1\) and \(T_2\) are the normalization matrices.
Ball: \( F \) (8 pts) \[ F = \begin{bmatrix} -3.42940323 \times 10^{-8} & -2.97769567 \times 10^{-6} & -6.25181665 \times 10^{-3} \\ 3.56443007 \times 10^{-6} & -8.94621337 \times 10^{-7} & -1.44781538 \times 10^{-2} \\ 8.46042866 \times 10^{-3} & 1.44011210 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (8 pts) \[ E = \begin{bmatrix} -0.08310355 & -7.21574727 & -11.20129323 \\ 8.63756052 & -2.16790505 & -16.56350512 \\ 14.5594713 & 14.76713017 & 1 \end{bmatrix} \] \( F \) (7 pts) \[ F = \begin{bmatrix} -1.11772177 \times 10^{-8} & -2.91062190 \times 10^{-6} & -6.86800547 \times 10^{-3} \\ 3.48587111 \times 10^{-6} & -8.99461015 \times 10^{-7} & -1.51975570 \times 10^{-2} \\ 9.29590770 \times 10^{-3} & 1.50939928 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (7 pts) \[ E = \begin{bmatrix} -0.02566405 & -6.6830884 & -11.24531261 \\ 8.00391999 & -2.06525536 & -16.58927551 \\ 14.69932934 & 14.83805038 & 1 \end{bmatrix} \]
Bench: \( F \) (8 pts) \[ F = \begin{bmatrix} -6.88584473 \times 10^{-8} & -1.67610929 \times 10^{-6} & 4.69822495 \times 10^{-5} \\ -1.11654117 \times 10^{-6} & 4.68110428 \times 10^{-7} & 2.42539583 \times 10^{-2} \\ -9.44592249 \times 10^{-4} & -2.25743076 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (8 pts) \[ E = \begin{bmatrix} -0.24712224 & -6.01529502 & -1.70920254 \\ -4.00709228 & 1.67997538 & 68.32133828 \\ -3.98752329 & -67.53545424 & 1 \end{bmatrix} \] \( F \) (7 pts) \[ F = \begin{bmatrix} -6.23536892 \times 10^{-8} & -1.60832764 \times 10^{-6} & 1.20086520 \times 10^{-5} \\ -1.16702024 \times 10^{-6} & 4.98254916 \times 10^{-7} & 2.41171457 \times 10^{-2} \\ -9.12092040 \times 10^{-4} & -2.24703933 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (7 pts) \[ E = \begin{bmatrix} -0.22531885 & -5.81178989 & -1.74049402 \\ -4.21709872 & 1.80047449 & 68.33351381 \\ -3.96095543 & -67.54407114 & 1 \end{bmatrix} \]
Hydrant: \( F \) (8 pts) \[ F = \begin{bmatrix} 1.25247931 \times 10^{-7} & -3.24217592 \times 10^{-6} & -1.43490838 \times 10^{-3} \\ 4.75297456 \times 10^{-6} & 1.44356565 \times 10^{-7} & -1.63507808 \times 10^{-2} \\ 1.14477508 \times 10^{-3} & 1.59994179 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (8 pts) \[ E = \begin{bmatrix} 0.17112754 & -4.42981845 & -3.84566317 \\ 6.49403823 & 0.19723587 & -16.40092298 \\ 4.65816282 & 16.72578606 & 1 \end{bmatrix} \] \( F \) (7 pts) \[ F = \begin{bmatrix} 8.92635819 \times 10^{-8} & -4.45846360 \times 10^{-6} & -1.15598106 \times 10^{-3} \\ 5.99486141 \times 10^{-6} & -6.07787488 \times 10^{-8} & -1.72358414 \times 10^{-2} \\ 8.52547392 \times 10^{-4} & 1.70969768 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (7 pts) \[ E = \begin{bmatrix} 0.11663309 & -5.82549321 & -4.21001104 \\ 7.83297287 & -0.07941439 & -16.30401764 \\ 4.96017701 & 16.57507548 & 1 \end{bmatrix} \]
Remote: \( F \) (8 pts) \[ F = \begin{bmatrix} 8.22027173 \times 10^{-7} & -6.82877632 \times 10^{-7} & 2.52934250 \times 10^{-3} \\ -1.42863486 \times 10^{-6} & -9.49110965 \times 10^{-7} & -9.02512580 \times 10^{-3} \\ -4.36716885 \times 10^{-3} & 1.02893028 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (8 pts) \[ E = \begin{bmatrix} 2.56228725 & -2.12855329 & 6.24573692 \\ -4.45310445 & -2.95841184 & -26.29017162 \\ -13.05143224 & 24.99553881 & 1 \end{bmatrix} \] \( F \) (7 pts) \[ F = \begin{bmatrix} 2.84913501 \times 10^{-7} & 1.06901878 \times 10^{-5} & 2.06618574 \times 10^{-3} \\ -1.84422013 \times 10^{-5} & -8.13541552 \times 10^{-6} & -1.80984719 \times 10^{-2} \\ -3.89150514 \times 10^{-3} & 2.67791256 \times 10^{-2} & 1 \end{bmatrix} \] \( E \) (7 pts) \[ E = \begin{bmatrix} 0.36875403 & 13.83595332 & 9.33713039 \\ -23.86912564 & -10.52939678 & -31.69428925 \\ -16.35288429 & 27.88936434 & 1 \end{bmatrix} \]
To do RANSAC, we first randomly select 7 or 8 points from the matched points. Then we use the 7-point or 8-point algorithm to calculate the fundamental matrix \(F\). We then calculate the epipolar lines using the estimated fundamental matrix \(F\) and the points in the 1st image, and then compute the distance between the epipolar lines and the corresponding points in the 2nd image. We count the number of inliers whose distance is less than a threshold, which was set as 1 in my experiment. We repeat the above steps for 10_000 iterations, and choose the fundamental matrix with the most inliers. Finally, we use the most inliers to re-estimate the fundamental matrix \(F\) using 8-point algorithm.
![]() |
![]() |
![]() |
---|
To do triangulation, the algorithm takes in the projection matrices \(P_1\) and \(P_2\) of the two cameras and the matched points. For each pair of matched points, we define the corresponding 3D point as \(X\). We know that \(P_1X = x_1\) and \(P_2X = x_2\). So we can write the equation as \(P_1X \times x_1 = 0\) and \(P_2X \times x_2 = 0\). We can stack these two equations into a matrix form \(AX = 0\), where \[ A = \begin{bmatrix} x_1^T P_1(3,:) - P_1(1,:) \\ x_1^T P_1(3,:) - P_1(2,:) \\ x_2^T P_2(3,:) - P_2(1,:) \\ x_2^T P_2(3,:) - P_2(2,:) \\ \end{bmatrix} \] We then conduct singular value decomposition on the matrix \(A\), and get the 3D point \(X\) by the last column of the matrix \(V\).
Scene 1: Trevi Fountain
Input multiview images: I collected 7 images of the Trevi Fountain from different angles on the internet.
The 3D reconstruction of the Trevi Fountain is shown in the following GIF.
Scene 2: CMU Campus
Input multiview images: I collected 9 images of the CMU campus from different angles on the internet.
The 3D reconstruction of the CMU campus is shown in the following GIF.
I first utilize the Scale-Invariant Feature Transform (SIFT) algorithm to detect and extract distinctive keypoints and descriptors from both images, capturing scale- and rotation-invariant features.
Then I employ a brute-force matcher to identify feature correspondences between the two images, using k-nearest neighbors (k-NN) with k=2 to find the two closest matches for each feature descriptor. To ensure robust and accurate matching, I apply a filtering criterion, retaining only those matches where the distance of the first nearest match is less than 0.75 times the distance of the second nearest match. This ratio test helps eliminate ambiguous matches, enhancing the quality of the matches and contributing to more accurate image alignment and feature matching reliability.
Here is the visualization of the matched points.
After obtaining keypoint matches, I use Random Sample Consensus (RANSAC) to estimate the fundamental matrix using the 8-point algorithm. Here is the visualization of the epipolar lines.
The fundamental matrix is \[ F = \begin{bmatrix} -1.81907087e-05 & 1.33636189e-04 & -8.42624687e-02 \\ -1.00742052e-04 & 1.09051085e-05 & 1.56390912e-01 \\ 1.05906704e-01 & -2.05051308e-01 & 1.0 \end{bmatrix} \]. And the percentage of inliers is 0.43636364.