[Bench] Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) |
---|
![]() |
[Remote] Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) |
---|
![]() |
(left) Viewpoint 1 & (right) Viewpoint 2. The points and the corresponding epipolar lines have the same color. We can see that the epipolar lines cross the corresponding point on the image of different viewpoints.
= normalize_points(pts1)
pts1_normalized, T1 = normalize_points(pts2) pts2_normalized, T2
Step 2: Construct the Constraint Matrix A Construct matrix \(A\) using normalized points, where each correspondence provides one constraint.
Step 3: Solve for F using SVD Use SVD to solve \(Af = 0\) and get the fundamental matrix in the normalized space:
= np.linalg.svd(A)
U, S, Vt = Vt[-1].reshape(3, 3) F_normalized
= np.linalg.svd(F_normalized)
U, S, Vt 2] = 0
S[= U @ np.diag(S) @ Vt F_normalized
F
back to the original pixel space:= T2.T @ F_normalized @ T1
F /= F[-1, -1] F
Brief Explanation: Given the fundamental matrix \(F\), the essential matrix can be computed using the formula:
\(E = K'^T F K\) where \(K\) and \(K'\) are the intrinsic camera matrices. This allows us to convert \(F\), which is in pixel coordinates, into \(E\), which represents the geometry in normalized camera coordinates.
Alternatively, the essential matrix can also be calculated directly using the 8-point algorithm:
E
to a rank-2 matrix by setting the singular values to \(\text{diag}(d', d', 0)\), where
\(d' = (d_1 + d_2) / 2\).Implementing this method yields the same essential matrix, as
demonstrated in the code q1.py
.
Estimated E:
Bench:
\(E_1 \begin{bmatrix} -0.41524274 & -8.81748894 & -2.15055808\\ -1.3803559 & 0.79067134 & 69.08265036\\ -3.58890876 &-68.47438701 & 1. \\ \end{bmatrix}\)
Remote:
\(E_2 = \begin{bmatrix} 2.52980651 & -0.67056175 & 7.33094879 \\ -6.62032788 & -3.02768481 &-28.29861829 \\ -14.50071107 & 26.41824808 & 1. \end{bmatrix}\)
[Ball] Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) |
---|
![]() |
[Hydrant] Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) |
---|
![]() |
(Left) Viewpoint 1 and (right) Viewpoint 2. Points and their corresponding epipolar lines are shown in matching colors. Notice how each epipolar line passes through the corresponding point in the image from the other viewpoint.
Step 1: Normalize the Points: Normalize points using similarity transforms \(T\) and \(T'\) to ensure zero mean and unit variance:
Step 2: Construct the Constraint Matrix A: Construct matrix \(A\) using normalized points, where each correspondence provides one constraint. In this case, a \(7 \times 9\) matrix is constructed since there are 7 correspondences.
Step 3: Solve for F using SVD: Use SVD to solve \(Af = 0\) and obtain the fundamental matrix in the normalized space. Since the rank of the null space is 2, two vectors \(f_1\) and \(f_2\) can be found:
= np.linalg.svd(A)
U, S, Vt = Vt[-1].reshape(3, 3)
F1 = Vt[-2].reshape(3, 3) F2
The general solution for F
is given by:
\(F = \lambda F_1 + (1 - \lambda) F_2\)
F
has rank 2, enforce the determinant condition \(\text{det}(F) = 0\). This determinant is
represented as a third-order polynomial in \(\lambda\):\(a_3 \lambda^3 + a_2 \lambda^2 + a_1 \lambda + a_0\)
The coefficients can be found as follows:
def det_polynomial(alpha):
= alpha * F1 + (1 - alpha) * F2
F return np.linalg.det(F)
= np.zeros(4)
coefficients 0] = det_polynomial(0)
coefficients[2] = (det_polynomial(1) - det_polynomial(-1)) / 2
coefficients[3] = (det_polynomial(2) - 2 * det_polynomial(1) - 2 * coefficients[2] + coefficients[0]) / 6
coefficients[1] = det_polynomial(1) - coefficients[0] - coefficients[2] - coefficients[3] coefficients[
Using np.roots(coefficients)
, the values for \(\lambda\) are found. Only the real-valued
solution is used.
Step 5: Transform F Back to Original Space:
Transform F
back to the original pixel space using the
similarity transforms from Step 1.
Step 6: Choose the Best Solution: Since multiple
solutions for F
may exist, choose the one that minimizes
the epipolar error, \(\sum_i x_i'^T F
x_i\):
= np.zeros(len(Fs))
errors for j in range(len(Fs)):
= Fs[j]
F = 0
error for i in range(pts1.shape[0]):
= np.array([pts1[i][0], pts1[i][1], 1])
x1 = np.array([pts2[i][0], pts2[i][1], 1])
x2 += np.abs(x2.T @ F @ x1)
error
= error
errors[j]
= Fs[np.argmin(errors)] F
In all the results, the points and their corresponding epipolar lines are depicted in matching colors. The four points are randomly sampled from the annotations, which may include some noisy labels. Note that only the valid correspondences without noise are accurately matched by the computed fundamental matrix, as the epipolar lines pass through only the correct annotations.
Ball
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (8pt algorithm) |
---|
![]() |
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (7pt algorithm) |
---|
![]() |
% of inliers vs. # of RANSAC iterations (blue: 8pt, orange: 7pt) |
---|
![]() |
\(F = \begin{bmatrix} -2.09107219e-08 & -3.02470117e-06 & -6.89123821e-03\\ 3.60956323e-06 & -9.30216369e-07 & -1.53286060e-02\\ 9.33512247e-03 & 1.52377635e-02 & 1.00000000e+00\\ \end{bmatrix}\)
Bench
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (8pt algorithm) |
---|
![]() |
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (7pt algorithm) |
---|
![]() |
% of inliers vs. # of RANSAC iterations (blue: 8pt, orange: 7pt) |
---|
![]() |
\(F = \begin{bmatrix} 7.29040371e-07 & 8.32382175e-06 & -3.65908188e-03 \\ -8.52448262e-06 & 4.33718835e-06 & 7.11822616e-03 \\ 2.38509499e-03 & -9.74101756e-03 & 1.00000000e+00 \\ \end{bmatrix}\)
Hydrant
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (8pt algorithm) |
---|
![]() |
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (7pt algorithm) |
---|
![]() |
% of inliers vs. # of RANSAC iterations (blue: 8pt, orange: 7pt) |
---|
![]() |
\(F = \begin{bmatrix} 6.35918894e-07 & 7.79037556e-06 & -6.51790205e-03\\ -7.26010653e-06 & 1.71216453e-06 & 7.68984491e-04\\ 6.34554497e-03 & -3.55478347e-03 & 1.00000000e+00\\ \end{bmatrix}\)
Remote
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (8pt algorithm) |
---|
![]() |
Viewpoint 1 & Viepoint 2 (Both points and epipolar lines) (7pt algorithm) |
---|
![]() |
% of inliers vs. # of RANSAC iterations (blue: 8pt, orange: 7pt) |
---|
![]() |
\(F = \begin{bmatrix} 1.71791593e-06 & -8.04922051e-06 & 3.07539900e-03 \\ 7.84457044e-06 & 3.34304187e-06 & -4.19533878e-03 \\ -4.71734082e-03 & 1.33775959e-03 & 1.00000000e+00 \\ \end{bmatrix}\)
Explanation
Explanation of RANSAC Implementation in : The RANSAC
algorithm is used to robustly estimate the fundamental matrix in the
presence of noisy point correspondences, based on the algorithm
implemented in q1
. The algorithm repeats
n_iterations
times to find the best estimate of
F
.
Draw Minimum Set of Correspondences: Randomly sample the minimum required points—8 points for the 8-point algorithm and 7 points for the 7-point algorithm.
Fit Fundamental Matrix: Estimate the fundamental
matrix (F
) using the sampled points based on the method
described in q1
.
Count Inliers: Calculate the number of
correspondences whose distances to the fitted epipolar lines are below
the threshold thr
. These are considered inliers.
Refine Estimate if Enough Inliers: If the ratio
of inliers to total points exceeds d_min
, re-estimate
F
using all inliers to obtain a more accurate
model:
if (len(inliers) / num_points) > d_min:
= pts1[inliers]
inlier_pts1 = pts2[inliers]
inlier_pts2 = estimate_fundamental_matrix(inlier_pts1, inlier_pts2)
F_candidate
# Recompute the inliers using the updated fundamental matrix
= np.abs(np.sum(pts2_h @ F_candidate * pts1_h, axis=1))
errors = np.where(errors < thr)[0] inliers
The hyperparameter was manually set to either \(thr = 0.001\) or \(thr = 0.01\), as it is sensitive to the image size (when not normalized) and the proportion of inliers. Additionally, d_min was set to 0.3 based on the provided hint.
Due to the computation burden, I randomly sampled 1/5 of the total points for visualization.
The triangulation problem is based on the relationship:
\(x \times P X = 0, \quad x' \times P' X = 0.\)
This leads to the formulation of the constraint matrix \([x]_{\times} P X = 0\), which can be implemented in code as follows:
= np.zeros((num_points, 4, 4))
A 0] = pts1[:, 1, np.newaxis] * P1[2] - P1[1]
A[:, 1] = P1[0] - pts1[:, 0, np.newaxis] * P1[2]
A[:, 2] = pts2[:, 1, np.newaxis] * P2[2] - P2[1]
A[:, 3] = P2[0] - pts2[:, 0, np.newaxis] * P2[2] A[:,
The 3D point is then recovered by solving this system \(AX = 0\) using SVD:
= np.linalg.svd(A)
_, _, Vt = Vt[:, -1] X
I used COLMAP
for Structure from Motion (SfM). I
recorded a short video of an indoor scene consisting of a table and
chairs. Due to the variety of textures present, the scene was
reconstructed successfully
Input images (Video) | Reconstructed Points |
---|---|
![]() |
![]() |
Following that, I questioned whether an object with minimal texture could also be reconstructed. To test this, I recorded a short video of a black cup with low texture. The textureless black regions could not be reconstructed, while the red text and the boundaries were successfully reconstructed.
Input images (Video) | Reconstructed Points |
---|---|
![]() |
![]() |
For feature matching, I used cv2 features. Specifically, the SIFT algorithm was used to extract features, and cv2.FlannBasedMatcher was applied for feature matching. From all the matches, only the best matches were selected. The feature matching results are shown below.
# Applying SIFT detector
= cv2.SIFT_create()
sift = sift.detectAndCompute(gray_1, None)
kp_1, des_1 = sift.detectAndCompute(gray_2, None)
kp_2, des_2
= 1
FLANN_INDEX_KDTREE = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
index_params = dict(checks=100) # Increase the number of checks for more accurate matching
search_params = cv2.FlannBasedMatcher(index_params, search_params)
flann = flann.knnMatch(des_1, des_2, k=2)
matches
# Apply a stricter ratio test to find good matches
= []
good_matches for m, n in matches:
if m.distance < 0.68 * n.distance:
good_matches.append(m)
Subsequently, the RANSAC variant of the 8-point algorithm was used to estimate the fundamental matrix. The results are shown below.
Input images 1 | Input images 2 | Feature Matching |
---|---|---|
![]() |
![]() |
![]() |
Results |
---|
![]() |
The picture taken at Abu Dhabi where I visited for the IROS last week.
Input images 1 | Input images 2 | Feature Matching |
---|---|---|
![]() |
![]() |
![]() |
Results |
---|
![]() |