16-822 Assignment 2: Single-view Reconstruction

Q1: Camera matrix P from 2D-3D correspondences (30 points)

(a) Stanford Bunny (15 points)

Camera matrix P:

$$ P = \begin{bmatrix} 6.43e+03 & -2.95e+03 & 1.15e+03 & 2.23e+03 \\ -9.35e+02 & -6.75e+03 & 2.03e+03 & 1.82e+03 \\ 5.79e-01 & -1.42 & -7.35e-01 & 1.00 \end{bmatrix} $$

Input Image	Annotated 2D Points	Surface Points	Bounding Box

(b) Cuboid (15 points)

Camera matrix P:

$$ P = \begin{bmatrix} 4.82e+01 & -3.50e+01 & -5.63e+00 & 1.85e+02 \\ -1.88e+01 & -2.02e+01 & -4.91e+01 & 1.28e+02 \\ 4.17e-02 & 3.15e-02 & -3.14e-02 & 1.00 \end{bmatrix} $$

Input Image	Annotated 2D Points	Bounding Box
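The report does not state how P was estimated. One common way to recover a camera matrix from 2D-3D correspondences is the direct linear transform (DLT), sketched below under that assumption. The function name `estimate_camera_matrix` is hypothetical, and the final normalization to `P[2, 3] = 1` is chosen only to match the scaling of the matrices shown above.

```python
import numpy as np

def estimate_camera_matrix(points_3d, points_2d):
    """Estimate a 3x4 camera matrix P from n >= 6 correspondences (DLT sketch)."""
    pts3d = np.asarray(points_3d, dtype=float)
    pts2d = np.asarray(points_2d, dtype=float)
    Xh = np.hstack([pts3d, np.ones((len(pts3d), 1))])  # homogeneous 3D points, (n, 4)
    zeros = np.zeros(4)
    rows = []
    for X, (u, v) in zip(Xh, pts2d):
        # Each correspondence x ~ P X gives two linear equations in the entries of P.
        rows.append(np.concatenate([X, zeros, -u * X]))
        rows.append(np.concatenate([zeros, X, -v * X]))
    A = np.array(rows)
    # The right singular vector of the smallest singular value minimizes ||A p||.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    # Fix the arbitrary scale so that the bottom-right entry is 1 (assumes it is nonzero).
    return P / P[2, 3]
```

With noisy annotations, normalizing the 2D and 3D coordinates before building `A` usually improves conditioning; that step is omitted here for brevity.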

Q2: Camera calibration K from annotations (40 points + 10 points bonus)

(a) Camera calibration from vanishing points (20 points)

Camera intrinsic K:

$$ K = \begin{bmatrix} 1.15e+03 & 0.00 & 5.75e+02 \\ 0.00 & 1.15e+03 & 4.32e+02 \\ 0.00 & 0.00 & 1.00 \end{bmatrix} $$

Principal point = K[:2, 2] = [575, 432]

Input Image	Annotated Parallel Lines	Vanishing points and principal point

Brief introduction of implementation:

Calculate the vanishing point of each pair of parallel lines.
Use each pair of orthogonal vanishing points to construct a constraint on ω: $v_i^T \omega v_j = 0$.

With $(x_i, y_i, 1)$ and $(x_j, y_j, 1)$ the coordinates of vanishing points $i$ and $j$, each constraint contributes one row of a homogeneous linear system in the unknown entries of ω: $$ A=\left[\begin{array}{llll} x_{i} \cdot x_{j} + y_{i} \cdot y_{j} & x_{i}+x_{j} & y_{i}+y_{j} & 1 \end{array}\right] $$

Given that ω is symmetric, the camera has zero skew, and the pixels are square, ω has only four unknowns, defined up to scale, so three constraints are enough to solve for it: $$ \omega = \begin{bmatrix} w_1 & 0 & w_2 \\ 0 & w_1 & w_3 \\ w_2 & w_3 & w_4 \end{bmatrix} $$
We then solve for ω with singular value decomposition (SVD) and recover the intrinsic matrix K from ω by Cholesky decomposition, using $\omega = K^{-T} K^{-1}$.
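The steps above can be sketched as follows. This is a minimal illustration, not the report's actual code: the function names are hypothetical, and the rescaling of the vanishing points by `s` is an added assumption that only improves numerical conditioning.

```python
import numpy as np

def vanishing_point(line_a, line_b):
    """Intersect two image lines, each given as ((x1, y1), (x2, y2))."""
    def to_line(seg):
        (x1, y1), (x2, y2) = seg
        return np.cross([x1, y1, 1.0], [x2, y2, 1.0])
    v = np.cross(to_line(line_a), to_line(line_b))
    return v / v[2]  # assumes the lines are not parallel in the image

def calibrate_from_vanishing_points(vps):
    """Recover K (zero skew, square pixels) from three mutually orthogonal vanishing points."""
    vps = np.asarray(vps, dtype=float)
    vps = vps / vps[:, 2:3]
    s = np.abs(vps[:, :2]).max()  # conditioning scale (assumption, not from the report)
    x, y = vps[:, 0] / s, vps[:, 1] / s
    # One row [x_i x_j + y_i y_j, x_i + x_j, y_i + y_j, 1] per orthogonal pair.
    A = np.array([[x[i] * x[j] + y[i] * y[j], x[i] + x[j], y[i] + y[j], 1.0]
                  for i, j in [(0, 1), (0, 2), (1, 2)]])
    _, _, Vt = np.linalg.svd(A)
    w1, w2, w3, w4 = Vt[-1]  # null vector of A
    omega = np.array([[w1, 0.0, w2], [0.0, w1, w3], [w2, w3, w4]])
    if w1 < 0:
        omega = -omega  # omega is defined up to sign; make it positive definite
    # omega = L L^T with L lower triangular, and omega = K^-T K^-1, so L = K^-T.
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    K = K / K[2, 2]
    return np.diag([s, s, 1.0]) @ K  # undo the conditioning scale
```

`np.linalg.cholesky` raises `LinAlgError` when the estimated ω is not positive definite, which can happen when the annotated lines are noisy or the vanishing points are far from orthogonal.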

(b) Camera calibration from metric planes (20 points)

Input Image	Annotated Square 1	Annotated Square 2	Annotated Square 3

Evaluate angles:

Plane pair	Angle between planes (degrees)
Plane 1 & Plane 2	67.60
Plane 1 & Plane 3	92.25
Plane 2 & Plane 3	94.80

Camera intrinsic K:

$$ K = \begin{bmatrix} 1.08e+03 & -1.47e+01 & 5.21e+02 \\ 0.00 & 1.08e+03 & 4.02e+02 \\ 0.00 & 0.00 & 1.00 \end{bmatrix} $$

Brief introduction of implementation:

Calculate the homography $H_i$ that maps the unit-square corners (0,1), (1,1), (1,0), (0,0) to the annotated corners of square $i$.
Use each $H_i$ to construct constraints on ω, where $h_1$ and $h_2$ are the first two columns of $H_i$: $$ \begin{gathered} h_1^T \omega h_2=0 \\ h_1^T \omega h_1=h_2^T \omega h_2 \end{gathered} $$
ω is symmetric with 6 unknowns (5 degrees of freedom up to scale) and each homography gives two constraints, so three homographies are enough to solve for ω.
We then solve for ω with singular value decomposition (SVD) and recover the intrinsic matrix K from ω by Cholesky decomposition, using $\omega = K^{-T} K^{-1}$.
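A minimal sketch of this procedure is given below. The function names and the `scale` parameter are hypothetical: `scale` is an added assumption that rescales pixel coordinates for better conditioning and is not described in the report.

```python
import numpy as np

def homography(src, dst):
    """DLT homography H with dst ~ H @ src, from four or more 2D point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        p = np.array([x, y, 1.0])
        rows.append(np.concatenate([p, np.zeros(3), -u * p]))
        rows.append(np.concatenate([np.zeros(3), p, -v * p]))
    _, _, Vt = np.linalg.svd(np.array(rows))
    return Vt[-1].reshape(3, 3)

def omega_row(h, g):
    """Coefficients of h^T omega g in (w11, w12, w13, w22, w23, w33)."""
    return np.array([h[0] * g[0], h[0] * g[1] + h[1] * g[0], h[0] * g[2] + h[2] * g[0],
                     h[1] * g[1], h[1] * g[2] + h[2] * g[1], h[2] * g[2]])

def calibrate_from_homographies(Hs, scale=1.0):
    """Recover a general K from three or more plane-to-image homographies."""
    S = np.diag([1.0 / scale, 1.0 / scale, 1.0])
    A = []
    for H in Hs:
        Hn = S @ H
        h1, h2 = Hn[:, 0], Hn[:, 1]
        A.append(omega_row(h1, h2))                      # h1^T omega h2 = 0
        A.append(omega_row(h1, h1) - omega_row(h2, h2))  # h1^T omega h1 = h2^T omega h2
    _, _, Vt = np.linalg.svd(np.array(A))
    a, b, c, d, e, f = Vt[-1]
    omega = np.array([[a, b, c], [b, d, e], [c, e, f]])
    if omega[0, 0] < 0:
        omega = -omega  # fix the sign so that omega can be positive definite
    L = np.linalg.cholesky(omega)  # omega = L L^T, with L = K^-T
    K = np.linalg.inv(L.T)
    K = K / K[2, 2]
    return np.diag([scale, scale, 1.0]) @ K  # undo the coordinate rescaling
```

A typical call would build one homography per annotated square, for example `homography([(0, 1), (1, 1), (1, 0), (0, 0)], corners_i)`, and pass the three results with `scale` set to roughly the image width.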

(c) Camera calibration from rectangles with known sizes (10 points bonus)

Input Image	Annotated Square 1	Annotated Square 2	Annotated Square 3

Evaluate angles:

Plane pair	Angle between planes (degrees)
Plane 1 & Plane 2	119.27
Plane 1 & Plane 3	78.14
Plane 2 & Plane 3	116.57

Camera intrinsic K:

$$ K = \begin{bmatrix} 881.40 & -16.09 & 684.85 \\ 0.00 & 894.16 & 496.52 \\ 0.00 & 0.00 & 1.00 \end{bmatrix} $$

Brief introduction of implementation:

Calculate the homography $H_i$ that maps (0,1), (ratio,1), (ratio,0), (0,0) to the annotated corners of rectangle $i$, where $\text{ratio} = x_{\text{length}} / y_{\text{length}}$. (Here the ratio of the MacBook is 1.413 and the ratio of the monitor is 1.778.)
Use each $H_i$ to construct constraints on ω, where $h_1$ and $h_2$ are the first two columns of $H_i$: $$ \begin{gathered} h_1^T \omega h_2=0 \\ h_1^T \omega h_1=h_2^T \omega h_2 \end{gathered} $$
ω is symmetric with 6 unknowns (5 degrees of freedom up to scale) and each homography gives two constraints, so three homographies are enough to solve for ω.
We then solve for ω with singular value decomposition (SVD) and recover the intrinsic matrix K from ω by Cholesky decomposition, using $\omega = K^{-T} K^{-1}$.
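The report does not say how the plane angles in (b) and (c) were evaluated. One plausible way, sketched here as an assumption, is to back-project the first two columns of each homography through K to get two in-plane directions, take their cross product as the plane normal, and measure the angle between normals. The function names are hypothetical.

```python
import numpy as np

def plane_normal(K, H):
    """Unit normal of the plane whose metric coordinates map to the image via H."""
    d1 = np.linalg.solve(K, H[:, 0])  # direction of the plane's x axis in the camera frame
    d2 = np.linalg.solve(K, H[:, 1])  # direction of the plane's y axis in the camera frame
    n = np.cross(d1, d2)
    return n / np.linalg.norm(n)

def angle_between_planes(K, H_a, H_b):
    """Angle in degrees between the normals of two planes."""
    cos_angle = np.clip(plane_normal(K, H_a) @ plane_normal(K, H_b), -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle))
```

The normal's sign depends on the order in which the corners were annotated, so a result of θ and one of 180° − θ describe the same pair of planes.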

Q3: Single View Reconstruction (30 points + 10 points bonus)

(a) (30 points)

Input Image	Annotations	Reconstruction View 1	Reconstruction View 2

Brief introduction of implementation:

Use the method from Q2(a) to compute K. Here we use opposite edges of the rectangle as the parallel lines.
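Compute the 3D direction of each pair of parallel edges as $d = K^{-1} v$, where $v$ is the corresponding vanishing point. The plane normal $n$ is the cross product of the two edge directions of the plane.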
Choose a reference 2D point $p_0$ (in homogeneous coordinates) and set its depth to 1. The 3D coordinates of the reference point $P_0$ are then

$$ P_0 = \text{depth} \cdot \frac{K^{-1} p_0}{\| K^{-1} p_0 \|} $$

Then we can construct the plane equation $n \cdot X + d = 0$ from the plane normal $n$ and $P_0$:

$$ \begin{gathered} l = [n, d] \\ d = -n \cdot P_0 \end{gathered} $$

Compute the 3D coordinates of all other points $(x, y)$ on the plane using

$$ P = r \, \frac{-d}{n \cdot r}, \quad \text{where} \quad r = K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$

Repeat the above steps for every plane, calculating depths for new reference points where needed. Finally, we obtain the 3D reconstruction of the scene.
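The per-plane back-projection can be sketched as below, assuming the plane normal `n` has already been computed and using the convention that points on the plane satisfy n · X + d = 0. The function names are hypothetical, and the sketch covers a single plane only.

```python
import numpy as np

def reference_point(K, p0, depth=1.0):
    """3D point at distance `depth` along the viewing ray through pixel p0 = (x, y)."""
    r = np.linalg.solve(K, np.array([p0[0], p0[1], 1.0]))
    return depth * r / np.linalg.norm(r)

def plane_from_normal(n, P0):
    """Plane (n, d) through P0 with normal n, so that n . X + d = 0 on the plane."""
    n = np.asarray(n, dtype=float)
    return n, -float(n @ P0)

def backproject_to_plane(K, pixels, n, d):
    """Intersect the viewing rays of the given pixels with the plane (n, d)."""
    pix = np.asarray(pixels, dtype=float)
    homog = np.column_stack([pix, np.ones(len(pix))])  # (m, 3) homogeneous pixels
    rays = np.linalg.solve(K, homog.T).T               # r = K^-1 [x, y, 1]^T per pixel
    scale = -d / (rays @ n)                            # P = r * (-d) / (n . r)
    return rays * scale[:, None]
```

For a neighboring plane, a pixel on the shared edge can be back-projected onto the plane already reconstructed, and the resulting 3D point reused as the new reference point in `plane_from_normal`.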