HW2: Single-view Reconstruction

Late day image:


Q1: Camera matrix `P` from 2D-3D correspondences (30 points)

In this question, your goal is to compute P from 2D-3D point correspondences.

Instructions

Compute the camera matrix P using the provided 2D-3D correspondences.

Camera matrix P:

[[ 6.43169368e+03 -2.94843744e+03  1.14658061e+03  2.22724350e+03]
 [-9.34819249e+02 -6.75486473e+03  2.02949013e+03  1.82218778e+03]
 [ 5.79307220e-01 -1.42385366e+00 -7.35268478e-01  1.00000000e+00]]

We provide a set of 3D surface points in data/q1/bunny_pts.npy. Project these points to the image using your calculated P. See the example below.
We provide the 12 edges of the bounding box in data/q1/bunny_bd.npy. Each line contains 6 numbers, where every 3 numbers denote 1 point. Project these points to the image and draw the cuboid. See the example below.
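
The estimation and projection steps above can be sketched with the Direct Linear Transform (DLT). This is a minimal illustration, not necessarily the exact code used for the result above: the function names `estimate_camera_matrix` and `project` are hypothetical, no coordinate normalization is applied, and the scale is fixed by setting the bottom-right entry of P to 1 (as in the matrix reported above), which assumes that entry is non-zero.

```python
import numpy as np

def estimate_camera_matrix(pts_2d, pts_3d):
    """Estimate a 3x4 camera matrix P from N >= 6 2D-3D correspondences (DLT)."""
    pts_2d = np.asarray(pts_2d, dtype=float)
    pts_3d = np.asarray(pts_3d, dtype=float)
    rows = []
    for (u, v), X in zip(pts_2d, pts_3d):
        Xh = np.append(X, 1.0)            # homogeneous 3D point
        zeros = np.zeros(4)
        # Two linear equations per correspondence, from x cross (P X) = 0
        rows.append(np.concatenate([Xh, zeros, -u * Xh]))
        rows.append(np.concatenate([zeros, Xh, -v * Xh]))
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    return P / P[2, 3]                    # assumed normalization: P[2, 3] = 1

def project(P, pts_3d):
    """Project (N, 3) points to (N, 2) pixel coordinates with P."""
    pts_3d = np.asarray(pts_3d, dtype=float)
    Xh = np.hstack([pts_3d, np.ones((len(pts_3d), 1))])
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]
```

With the provided data, the surface points could then be projected with something like `project(P, np.load("data/q1/bunny_pts.npy"))`, assuming the file holds an (N, 3) array.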

Surface Points	Bounding Box

(b) Cuboid (15 points)

Instructions

Find (or capture) 1 image of a cuboid. Come up with a 3D coordinate system (by measuring relative dimensions of the cuboid) and annotate 6 pairs of point correspondences.
Compute the camera matrix P using your annotated 2D-3D correspondences.

Camera matrix:

[[-2.26628047e+02  2.44924626e+02  3.20464669e-01  3.24470287e+02]
 [-7.06582192e+01 -9.59302995e+01 -2.82428673e+02  5.85450909e+02]
 [ 9.50952606e-02  8.03295458e-02  8.58556714e-03  1.00000000e+00]]

Draw the edges of the cuboid using your calculated P or do something fun!
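
As a rough sketch of the drawing step, the cuboid edges can be projected as 2D line segments. The helper `project_edges` is a hypothetical name, and the edge layout (one edge per row, 6 numbers giving two 3D endpoints) is assumed to follow the bounding-box format described in the first part of Q1.

```python
import numpy as np

def project_edges(P, edges):
    """Project (E, 6) edges, each row [x1 y1 z1 x2 y2 z2], to (E, 2, 2) pixel segments."""
    edges = np.asarray(edges, dtype=float).reshape(-1, 2, 3)
    ones = np.ones(edges.shape[:2] + (1,))
    Xh = np.concatenate([edges, ones], axis=-1)   # (E, 2, 4) homogeneous endpoints
    x = Xh @ P.T                                  # (E, 2, 3) homogeneous pixels
    return x[..., :2] / x[..., 2:3]

# Each returned segment can be drawn over the image, for example with
# matplotlib: plt.plot(seg[:, 0], seg[:, 1]) for seg in the result.
```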

Input Image	Annotated 2D points	Example Result

Q2: Camera calibration `K` from annotations (40 points + 10 points bonus)

(a) Camera calibration from vanishing points (20 points)

In this question, your goal is to compute K from a triad of orthogonal vanishing points, assuming that the camera has zero skew, and that the pixels are square.

Submission

Output plots of the vanishing points and the principal point.

Input Image	Annotated Parallel Lines	Vanishing points and principal point

Vanishing point 1: [-1204.64633052  1425.62820743]
Vanishing point 2: [ 559.88532351 -935.83692793]
Vanishing point 3: [1859.40405616 1391.62090484]
Principal point: [404.88101638 627.13739478]
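
Each vanishing point is the intersection of two annotated lines that are parallel in 3D. A minimal sketch of that computation in homogeneous coordinates is shown below; the names `line_through` and `vanishing_point`, and the segment format (a pair of (x, y) endpoints), are assumptions for illustration.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points p and q."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(seg_a, seg_b):
    """Intersect the image lines through two segments that are parallel in 3D."""
    l1 = line_through(*seg_a)
    l2 = line_through(*seg_b)
    v = np.cross(l1, l2)                  # homogeneous intersection point
    if abs(v[2]) < 1e-12:
        raise ValueError("lines are parallel in the image; vanishing point is at infinity")
    return v[:2] / v[2]
```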

Report K for the input image.

Camera matrix:

[[1.15526035e+03 0.00000000e+00 5.70765107e+02]
 [0.00000000e+00 1.15526035e+03 4.44291290e+01]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]

Brief description of your implementation (i.e., the algorithm followed with relevant equations).

Brief Description: Algorithm to Compute Camera Intrinsics $\mathbf{K}$ Using Vanishing Points

The goal is to estimate the camera intrinsic matrix $\mathbf{K}$ using three orthogonal vanishing points $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$.

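Each vanishing point $\mathbf{v}_i$ (in homogeneous coordinates) is the image of a 3D direction $\mathbf{d}_i = \mathbf{K}^{-1} \mathbf{v}_i$. Because the three directions are mutually orthogonal, the vanishing points must satisfy the following constraint with respect to the camera intrinsics:

$$ \mathbf{v}_i^\top \mathbf{K}^{- \top} \mathbf{K}^{-1} \mathbf{v}_j = 0 \quad \text{for} \ i \neq j $$
The image of the absolute conic is defined as:

$$ \omega = \mathbf{K}^{- \top} \mathbf{K}^{-1} $$

$\omega$ is a symmetric $3 \times 3$ matrix with six distinct entries, defined up to scale, so it has 5 degrees of freedom in general. In this case, we assume that the skew is zero and the pixels are square, which gives $\omega_{12} = 0$ and $\omega_{11} = \omega_{22}$. This leaves four unknowns up to scale, so the three constraints from the three pairs of orthogonal vanishing points are enough to determine $\omega$.
Stacking the three constraints gives a homogeneous linear system:

$$ A \mathbf{w} = 0 $$

where $\mathbf{w}$ is the vector of unknown entries of $\omega$. The solution is the right singular vector of $A$ with the smallest singular value, obtained via SVD.
Since $\omega^{-1} = \mathbf{K} \mathbf{K}^\top$, $\mathbf{K}$ is recovered as the upper-triangular factor of $\omega^{-1}$ using a Cholesky decomposition. The matrix is then normalized such that $\mathbf{K}_{33} = 1$.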
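
The procedure can be sketched as follows, assuming zero skew and square pixels so that $\omega$ has the four unknowns $(\omega_{11}, \omega_{13}, \omega_{23}, \omega_{33})$. The function name is hypothetical. For convenience, this sketch factors $\omega = L L^\top$ directly (so $L = \mathbf{K}^{-\top}$), which is equivalent to factoring $\omega^{-1} = \mathbf{K}\mathbf{K}^\top$.

```python
import numpy as np

def calibrate_from_vanishing_points(vps):
    """Recover K from three orthogonal vanishing points given as (x, y) pixels.

    Assumes zero skew and square pixels, and that all three points are finite.
    """
    v = np.hstack([np.asarray(vps, dtype=float), np.ones((3, 1))])
    rows = []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        a, b = v[i], v[j]
        # Coefficients of a^T omega b in w = [w11, w13, w23, w33]
        rows.append([a[0] * b[0] + a[1] * b[1],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])
    _, _, Vt = np.linalg.svd(np.array(rows))
    w11, w13, w23, w33 = Vt[-1]           # null vector of the 3x4 system
    omega = np.array([[w11, 0.0, w13],
                      [0.0, w11, w23],
                      [w13, w23, w33]])
    if omega[0, 0] < 0:                   # fix the sign so omega is positive definite
        omega = -omega
    L = np.linalg.cholesky(omega)         # omega = L L^T, with L = K^{-T} up to scale
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

With noisy annotations the estimated $\omega$ may fail to be positive definite, in which case `np.linalg.cholesky` raises an error and the annotations usually need to be revisited.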

(b) Camera calibration from metric planes (20 points)

In this question, your goal is to compute K from an image of three squares. Unlike in (a), you will not make any additional assumptions about K (except that it is a projective camera).

Submission

Visualizations of annotations that you used. See the following figure as an example:

Input Image	Annotated Square 1	Annotated Square 2	Annotated Square 3

Evaluate angles between each pair of planes. This will reflect the correctness of your calibration result.

From my implementation, I got the following angles between the planes:

Angle between plane 1 and plane 2: 112.42 degrees or 67.58 degrees
Angle between plane 1 and plane 3: 92.25 degrees or 87.75 degrees
Angle between plane 2 and plane 3: 85.22 degrees or 94.78 degrees

The angles are close to the provided values (taking the supplementary angle where appropriate), which suggests that the calibration is correct.

Plane pair	Angle between planes (degrees)
Plane 1 & Plane 2	67.40
Plane 1 & Plane 3	92.22
Plane 2 & Plane 3	94.70

Report K for the input image.

Camera matrix:

[[ 1.08447642e+03 -1.35121131e+01  5.20013594e+02]
 [ 1.17407507e-13  1.07900526e+03  4.02544642e+02]
 [-0.00000000e+00  0.00000000e+00  1.00000000e+00]]

Brief description of your implementation (i.e., the algorithm followed with relevant equations).

The goal is to compute the camera intrinsic matrix $\mathbf{K}$ using homographies derived from squares in an image.

Compute Homographies: For each square, compute the homography $\mathbf{H}$ that maps a reference (template) square in world coordinates to the corresponding annotated square in the image.

Construct the system of equations based on the homography equation: $$ \mathbf{H} \mathbf{X}_{world} = \mathbf{X}_{image} $$
Compute the homography matrix $\mathbf{H}$ by solving $A \mathbf{h} = 0$ using singular value decomposition (SVD).
Normalize the homography matrix
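
A minimal sketch of this homography step is given below. The name `estimate_homography` is hypothetical, and normalizing to unit Frobenius norm is one possible choice, since the text above does not say which normalization was used.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate H (3x3) such that dst ~ H @ src from N >= 4 point pairs (DLT).

    src: template-square corners in world coordinates, shape (N, 2)
    dst: annotated corners in the image, shape (N, 2)
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence in the 9 entries of H
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / np.linalg.norm(H)          # assumed normalization: unit Frobenius norm
```

For a square, `src` would typically be the unit-square corners `[(0, 0), (1, 0), (1, 1), (0, 1)]`, listed in the same order as the annotated image corners.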

Set Up Constraints Using Homographies: Use the computed homographies to set up constraints on the image of the absolute conic $\omega$, which is related to the camera intrinsic matrix $\mathbf{K}$.

For each homography $\mathbf{H}_i$, extract its columns $\mathbf{h}_1$ and $\mathbf{h}_2$.
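Use the constraints that the image of the absolute conic places on the images of the square's two sides, which are orthogonal and of equal length: $$ \mathbf{h}_1^\top \omega \mathbf{h}_2 = 0, \qquad \mathbf{h}_1^\top \omega \mathbf{h}_1 = \mathbf{h}_2^\top \omega \mathbf{h}_2 $$ where $\omega = \mathbf{K}^{- \top} \mathbf{K}^{-1}$. Each square therefore contributes two equations, so three squares give the six equations used to estimate a general $\omega$.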
Form a system of linear equations based on the homographies, solving for the unknowns in $\omega$.

Solve for $\omega$ and Recover $\mathbf{K}$: The image of the absolute conic $\omega$ is a symmetric $3 \times 3$ matrix with six distinct entries, which we collect in the vector $\mathbf{w} = [\omega_{11}, \omega_{12}, \omega_{22}, \omega_{13}, \omega_{23}, \omega_{33}]$. It is defined only up to scale, so it has 5 degrees of freedom. Unlike in (a), no assumption is made about the skew or the pixel aspect ratio, so all six entries are estimated from the constraints provided by the three squares.

Set up a linear system by stacking the constraints from multiple homographies: $$ A \mathbf{w} = 0 $$ where $\mathbf{w}$ contains the elements of $\omega$.
Solve the system using SVD to find the elements of $\omega$.
Recover $\mathbf{K}$ from $\omega^{-1}$ using Cholesky decomposition.
Normalize $\mathbf{K}$ such that $\mathbf{K}_{33} = 1$.
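
The steps above might look like the sketch below. It assumes each $\mathbf{H}_i$ maps a template square (equal side lengths) to the image, so that both $\mathbf{h}_1^\top \omega \mathbf{h}_2 = 0$ and $\mathbf{h}_1^\top \omega \mathbf{h}_1 = \mathbf{h}_2^\top \omega \mathbf{h}_2$ hold; the function names are hypothetical. As a convenience, it factors $\omega = L L^\top$ directly, which is equivalent to factoring $\omega^{-1} = \mathbf{K}\mathbf{K}^\top$.

```python
import numpy as np

def _coeffs(a, b):
    """Coefficients of a^T omega b in w = [w11, w12, w22, w13, w23, w33]."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[1] * b[1],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def calibrate_from_square_homographies(Hs):
    """Recover a general K from >= 3 square-to-image homographies."""
    rows = []
    for H in Hs:
        h1, h2 = H[:, 0], H[:, 1]
        rows.append(_coeffs(h1, h2))                    # sides are orthogonal
        rows.append(_coeffs(h1, h1) - _coeffs(h2, h2))  # sides have equal length
    _, _, Vt = np.linalg.svd(np.array(rows))
    w11, w12, w22, w13, w23, w33 = Vt[-1]
    omega = np.array([[w11, w12, w13],
                      [w12, w22, w23],
                      [w13, w23, w33]])
    if omega[0, 0] < 0:                   # fix the sign so omega is positive definite
        omega = -omega
    L = np.linalg.cholesky(omega)         # omega = L L^T, with L = K^{-T} up to scale
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```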

Evaluate the Angles Between Planes: Compute the angles between the planes defined by the squares in the image. The angle $\theta$ between two planes follows from the dot product of their unit normal vectors, $\cos \theta = \mathbf{n}_i^\top \mathbf{n}_j$.

Get the normal vectors using vanishing points.
Compute the angles between the planes using the dot product formula.
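
A short sketch of the angle evaluation is given below, with hypothetical function names. For a square, the vanishing points of its two side directions can be taken as the first two columns of its homography. Because the sign of a normal is arbitrary, both the angle and its supplement are returned, matching the two values reported per pair above.

```python
import numpy as np

def plane_normal(K, v1, v2):
    """Unit normal of a plane from the homogeneous vanishing points of two of its directions."""
    K_inv = np.linalg.inv(K)
    d1 = K_inv @ np.asarray(v1, dtype=float)   # 3D direction behind v1
    d2 = K_inv @ np.asarray(v2, dtype=float)   # 3D direction behind v2
    n = np.cross(d1, d2)
    return n / np.linalg.norm(n)

def angle_between_planes(n1, n2):
    """Return (angle, supplementary angle) in degrees between two unit normals."""
    cos_theta = np.clip(np.dot(n1, n2), -1.0, 1.0)
    theta = np.degrees(np.arccos(cos_theta))
    return theta, 180.0 - theta
```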

Q3: Single View Reconstruction (30 points + 10 points bonus)

(a) (30 points)

Submissions

Output reconstruction from at least two different views. Also include visualizations of annotations that you used. See the following figure as an example:

Description	Image
Input Image
All Annotations
Annotations for Reconstruction
Reconstruction View 1
Reconstruction View 2

Brief description of your implementation (i.e., the algorithm followed with relevant equations).

In this question, the goal is to perform single-view reconstruction of a colored point cloud from an image with annotated plane boundaries using the following steps:

Read provided image and annotations to check the plane boundaries
- Check the order of annotated points in each plane.
- Visualize the annotated planes in the image.
- Select the reference point for each plane and set its depth. In my implementation, I chose points that are shared by several planes and set their depth to a constant. In the Annotations image above, the two white points are the reference points, and I set the depth of both to 1.
Get all points for each plane
- Create a mask for the whole image and use cv2.fillPoly to label every pixel inside each plane's polygon with that plane's index.
- Also, get the color of each point in the plane.
Compute the camera intrinsic matrix $\mathbf{K}$ using vanishing points
- Compute vanishing points from annotated parallel lines. Each vanishing point is the intersection of two image lines formed by parallel edges of the plane, computed with the cross product: $\mathbf{vp} = \mathbf{l}_1 \times \mathbf{l}_2$.
- Estimate the camera intrinsic matrix $\mathbf{K}$ using the vanishing points. $\mathbf{K}$ is estimated by solving the system of equations that relates orthogonal vanishing points to the image of the absolute conic: $$ \mathbf{v}_i^\top \mathbf{K}^{- \top} \mathbf{K}^{-1} \mathbf{v}_j = 0 \quad \text{for} \ i \neq j $$
Compute the 3D coordinates of all points on the annotated planes
- Compute the normal vector of each plane using the vanishing points of two different directions in that plane. The vanishing points are back-projected into 3D directions using the camera intrinsic matrix $\mathbf{K}$: $\mathbf{d}_1 = \mathbf{K}^{-1} \mathbf{vp}_1, \quad \mathbf{d}_2 = \mathbf{K}^{-1} \mathbf{vp}_2$. Then, the normal vector is the cross product of these two direction vectors: $\mathbf{n} = \mathbf{d}_1 \times \mathbf{d}_2$.
- For each plane, use the reference point to get the plane equation $\mathbf{n}^\top \mathbf{X} + a = 0$. The reference point is back-projected as $\mathbf{X}_{ref} = d \, \mathbf{K}^{-1} \mathbf{p}_{2D}$, where $\mathbf{p}_{2D}$ is the 2D homogeneous coordinates of the point and $d$ is its chosen depth. Then $a = -\mathbf{n}^\top \mathbf{X}_{ref}$.
- Compute the 3D coordinates of each point on the plane by intersecting the ray from the image point $\mathbf{p}$ with the plane: $\mathbf{X} = -\dfrac{a}{\mathbf{n}^\top \mathbf{K}^{-1} \mathbf{p}} \, \mathbf{K}^{-1} \mathbf{p}$.
Visualize the 3D point cloud
- Plot the 3D points in a 3D plot with the color of each point.
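
The plane-fitting and ray-intersection steps can be sketched as below. This is an illustration under stated assumptions rather than the exact code behind the results: `backproject_to_plane` is a hypothetical name, the plane normal `n` is assumed to be already computed from the vanishing points, and "depth" is taken to mean the z-coordinate of the reference point in the camera frame.

```python
import numpy as np

def backproject_to_plane(K, n, ref_pixel, ref_depth, pixels):
    """Lift pixels onto the plane n^T X + a = 0 fixed by one reference point.

    K: 3x3 intrinsics, n: plane normal (3,), ref_pixel: (x, y) of the reference point,
    ref_depth: its assumed z-coordinate, pixels: (N, 2) pixel coordinates on the plane.
    Returns (a, points) where points has shape (N, 3).
    """
    K_inv = np.linalg.inv(K)
    n = np.asarray(n, dtype=float)
    ref_ray = K_inv @ np.array([ref_pixel[0], ref_pixel[1], 1.0])
    X_ref = ref_depth * ref_ray           # the ray has z = 1, so X_ref has z = ref_depth
    a = -n @ X_ref                        # plane offset from n^T X_ref + a = 0
    pixels = np.asarray(pixels, dtype=float)
    pix_h = np.hstack([pixels, np.ones((len(pixels), 1))])
    rays = pix_h @ K_inv.T                # one ray direction per pixel, shape (N, 3)
    t = -a / (rays @ n)                   # ray parameter at the plane intersection
    return a, rays * t[:, None]
```

The returned points can be paired with the pixel colors collected earlier and shown as a colored scatter plot.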

The final output consists of the computed 3D coordinates of all points on the annotated planes, visualized in 3D space. The normal vectors and camera intrinsic matrix $\mathbf{K}$ are also calculated as intermediate results.

Camera matrix: 
[[832.3588667    0.         745.40040869]
 [  0.         832.3588667  440.98046246]
 [  0.           0.           1.        ]]
Plane equation for plane 1: [-0.77778781 -0.19033158  0.59901587 -0.79341617]
Plane equation for plane 2: [-0.66454733  0.09678914 -0.74095122  0.53978188]
Plane equation for plane 3: [-0.00953907 -0.94913132 -0.31473598  0.25648284]
Plane equation for plane 4: [ 0.46454971  0.48702624  0.73959381 -0.46913142]
Plane equation for plane 5: [ 0.42165201 -0.70458315 -0.57076455  0.48888422]