HW3: 3D Reconstruction
cque
HW3: 3D Reconstruction
Q1: 8-point and 7-point algorithm (40 points)
(A1) F matrix using 8-point algorithm (15 points)
Brief explanation:
- Load the Data
- Use
np.load
to load the data
- Use
plt.imread
to read the files
- Call the
eightpoint
function
- Normalize the Coordinates
- Convert the points to
float64
for easy processing
- Divide each coordinate pts1 and pts2 by
M
to scale for numerical stability
- Construct the Matrix A
Using formula:
\[\begin{bmatrix}
x' & y' & 1
\end{bmatrix}F
\begin{bmatrix}
x \\
y \\
1
\end{bmatrix} = 0\]
\[xx' f_{11} + xy' f_{12} + x f_{13} + yx' f_{21} + yy' f_{22} + y f_{23} + x' f_{31} + y' f_{32} +
f_{33} = 0\]
In form Af = 0
\[\begin{bmatrix} x_1 x_1' & x_1 y_1' & x_1 & y_1 x_1' & y_1 y_1' & y_1 & x_1' & y_1' & 1 \\ \vdots &
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_n x_n' & x_n y_n' & x_n & y_n
x_n' & y_n y_n' & y_n & x_n' & y_n' & 1 \end{bmatrix} \begin{bmatrix}
f_{11} \\
f_{12} \\
f_{13} \\
f_{21} \\
f_{22} \\
f_{23} \\
f_{31} \\
f_{32} \\
f_{33}
\end{bmatrix} = 0\]
f would contain values of the fundamental matrix F
- Solve for F Using SVD
Perform SVD on A. F is the singular vector corresponding to the smallest singular value of A reshaped to
(3,3).
- Enforce the Rank-2 Constraint
Use SVD on \( F \) to enforce the rank-2 constraint, zeroing out the smallest singular value of \( F \).
- Unscale the Fundamental Matrix
Unscale F: \( F_{\text{unscaled}} = T^T \cdot F \cdot T\)
where
T = np.diag([1/M,1/M,1])
Normalize F such that
F[2][2] = 1
- Plot
Created a matplotlib plot with 10 random x,y points and their corresponding epipolar lines.
Using formula l' = F (dot) x to find the line equation
normalize the line the magnitude of the line
- calculate intersection points for the line
- found 2 points on image2 that intersect this line to draw the line
Epipolar lines: Show lines from fundamental matrix over the two images.
Viewpoint 1 (Points) |
Viewpoint 2 (Epipolar Lines) |
 |
 |
 |
 |
(A2) E matrix using 8-point algorithm (5 points)
Brief explanation
- Used the previous eightpoint formula to find F
- Loaded
K_1
and K_2
from corresp.npz
- Utilized the equation
\[ E = K_2^T\cdot F \cdot K_1\ \]
Provide your estimated E
Bench
\( E = \begin{bmatrix} -0.2763 & -1.8676 & 0.9918 \\ -2.8557 & -1.9883 & 35.8908 \\ -3.8694 & -35.0732 & 0.6788
\end{bmatrix} \)
Remote
\( E = \begin{bmatrix} 1.0167 & -0.2695 & 2.9462 \\ -2.6606 & -1.2168 & -11.3726 \\ -5.8275 & 10.6169 & 0.4019
\end{bmatrix} \)
(B) 7-point algorithm (20 points)
Brief explanation:
- Load the Data
- Use
np.load
to load the data
- Use
plt.imread
to read the files
- Call the
eightpoint
function
- Normalize the Coordinates
- Convert the points to
float64
for easy processing
- Divide each coordinate pts1 and pts2 by
M
to scale for numerical stability
- Construct the Matrix A
Using formula:
As derived in Q1a
In form Af = 0
\[\begin{bmatrix} x_1 x_1' & x_1 y_1' & x_1 & y_1 x_1' & y_1 y_1' & y_1 & x_1' & y_1' & 1 \\ \vdots &
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_n x_n' & x_n y_n' & x_n & y_n
x_n' & y_n y_n' & y_n & x_n' & y_n' & 1 \end{bmatrix} \begin{bmatrix}
f_{11} \\
f_{12} \\
f_{13} \\
f_{21} \\
f_{22} \\
f_{23} \\
f_{31} \\
f_{32} \\
f_{33}
\end{bmatrix} = 0\]
f would contain values of the fundamental matrix F
- Solve for F Using SVD
Perform SVD on A. F is the singular vector corresponding to the smallest singular value of A reshaped to
(3,3).
- Polynomial
A has rank 7.
The dimension of the nullspace is 2, with f1 and f2 as the nullspaces.
The solution can be written in the form
\[ F = aF_1 + (1-a)F_2 \]
where a is a scalar.
Use the singular constraint det(F) = 0 to construct the equations for finding a.
- substitute
\(\text{det}(aF_1 + (1-a)F_2) = 0\)
\(\text{det}(aF_1) + \text{det}((1-a)F_2) = 0\)
\(a^3 \text{det}(F_1) + (1-a)^3 \text{det}(F_2) = 0\)
- since we already know F1 and F2 from SVD, we can expand this polynomial to solve for a and F
This creates 1-3 fundamental matrices. we later find the best one
- Unscale the Fundamental Matrix
Unscale F: \( F_{\text{unscaled}} = T^T \cdot F \cdot T\)
where
T = np.diag([1/M,1/M,1])
Normalize F such that
F[2][2] = 1
- Iterate through all solutions of F to find the optimal one using error minimization
In calc epi error, we project the points to the epipolar line using l' = F x.
We find the distance between the corresponding points in the second image and see how much the projection
error is for each potential fundamental matrix.
At this point, we find the correct fundamental matrix.
- Plot
Created a matplotlib plot with 10 random x,y points and their corresponding epipolar lines.
Using formula l' = F (dot) x to find the line equation
normalize the line the magnitude of the line
- calculate intersection points for the line
- found 2 points on image2 that intersect this line to draw the line
Fundamental Matrix
Viewpoint 1 (Points) |
Viewpoint 2 (Epipolar Lines) |
 |
 |
 |
 |
Q2: RANSAC with 7-point and 8-point algorithm (20 points)
Submission
Brief explanation
- Within a 10,000 or specified iterations, we perform the following:
- Find 8 random (or 7 for seven point algorithm) points to calculate a fundamental matrix.
- Calculate projection error of all points using this fundamental matrix.
- Criteria for considering inliers
- We use the sum of squared distance between the corresponding points and the estimated epipolar lines to calculate the error.
- We use a certain threshold to count a data as inlier if the error is less than the threshold.
- For each point pair and projection error, if the projection error for a point pair is less than a certain threshold, count it as an inlier.
- Report the fundamental matrix with most inliers.
Report your best solution and plot the epipolar lines -- show lines from the fundamental matrix that you calculate over the inliers.
Ball:
7 point epipolar line |
8 point epipolar line |
 |
 |
RANSAC |
 |
Bench:
7 point epipolar line |
8 point epipolar line |
 |
 |
RANSAC |
 |
Remote:
7 point epipolar line |
8 point epipolar line |
 |
 |
RANSAC |
 |
Hydrant:
7 point epipolar line |
8 point epipolar line |
 |
 |
RANSAC |
 |
Q3: Triangulation (20 points)
Given 2D correspondences and 2 camera matrices, your goal is to triangulate the 3D points.
Submission
Brief explanation
- Use formulas
\[x_i = \frac{p_{00}X + p_{01}Y + p_{02}Z + p_{03}W}{p_{20}X+p_{21}Y + p_{22}Z + p_{23}W}\]
and
\[y_i = \frac{p_{10}X + p_{11}Y + p_{12}Z + p_{13}W}{p_{20}X+p_{21}Y + p_{22}Z + p_{23}W}\]
Solve for
\[[X,Y,Z,W].T\]
\[x_i (p_{20}X+p_{21}Y + p_{22}Z + p_{23}W) - (p_{00}X + p_{01}Y + p_{02}Z + p_{03}W) =0\]
e.g.
\[(x_i*p_{20} - p_{00})X=0\]
\[A_i =
\begin{bmatrix}
x_1 C_{1}[2,0] - C_{1}[0,0],
x_1 C_{1}[2,1] - C_{1}[0,1],
x_1 C_{1}[2,2] - C_{1}[0,2],
x_1 C_{1}[2,3] - C_{1}[0,3] \\
y_1 C_{1}[2,0] - C_{1}[1,0],
y_1 C_{1}[2,1] - C_{1}[1,1],
y_1 C_{1}[2,2] - C_{1}[1,2],
y_1 C_{1}[2,3] - C_{1}[1,3] \\
x_2 C_{2}[2,0] - C_{2}[0,0],
x_2 C_{2}[2,1] - C_{2}[0,1],
x_2 C_{2}[2,2] - C_{2}[0,2],
x_2 C_{2}[2,3] - C_{2}[0,3] \\
y_2 C_{2}[2,0] - C_{2}[1,0],
y_2 C_{2}[2,1] - C_{2}[1,1],
y_2 C_{2}[2,2] - C_{2}[1,2],
y_2 C_{2}[2,3] - C_{2}[1,3]
\end{bmatrix}
\]
-
Use SVD to solve for the null space of A for each 3D point.
-
Normalize the 3D point.
-
For each 3D point, find its corresponding 2D point and the color for that 2D point on the original image.
-
Use this color to paint the 3D points.
A colored point cloud as below:
Q4: Reconstruct your own scene! (20 points)
Submissions
- Multi-view input images.
- A gif to visualize the reconstruction of the scene and location of cameras (extrinsics).
- Run this on at least 2 sequences / objects / scenes.
Example Multi-view images |
Output |
 |
 |
 |
 |
 |
 |
Q5: Bonus 1 - Fundamental matrix estimation on your own images. (10 points)
Brief explanation
- I used SIFT to capture the keypoints and descriptors, and founded the matches using cv2.BFMatcher with descriptor.
- bfmatcher with l2 norm.
- Since the points can be noisy, I used RANSAC eight points to eliminate the noise.
Viewpoint 1 (Points) |
Viewpoint 2 (Epipolar Lines) |
 |
 |
 |
 |
Q6: Bonus 2 - Stress test the hyperparameters of COLMAP (10 points)
- What happens if we reduce the number of input images?
There will be fewer points in the point cloud as not as much information is given. The image becomes less detailed, and the resolution is lower.
11 cameras |
6 cameras connected |
3 cameras connected |
 |
 |
 |
- Under what scenario and conditions does the reconstruction pipeline break?
When camera views are too far apart that the images do not have enough overlap to detect matching features, as seen below.
With the same number of camera view points, one works (see below on the right) and one breaks (see below on the left) due to the location of the cameras and the different overlap of viewpoints. Since 3 connecting camera view points have enough overlap, it still was able to reconstruct the image, while the 3 separate camera views from different locations are not able to provide enough overlap of viewpoints for the reconstruction.
Pipeline breaks |
3 cameras connected |
 |
 |