HW2: Single-view Reconstruction

Q1: Camera matrix $\mathbf{P}$ from 2D-3D correspondences

(a) Stanford Bunny

Description

Normalize the Coordinates

Compute a similarity transformation $\mathbf{T}$ on the annotated image points that moves their centroid to the origin and scales them so that the maximum distance from the origin is $\sqrt{2}$.
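A minimal sketch of this normalization, assuming the annotated points are stored as an `(N, 2)` NumPy array. The function name `normalizing_transform` and the sample points are illustrative, not taken from the assignment code.

```python
import numpy as np

def normalizing_transform(pts):
    """Return a 3x3 similarity T for an (N, 2) array of image points.

    T moves the centroid to the origin and scales the points so that the
    farthest one lies at distance sqrt(2) from the origin.
    """
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    max_dist = np.linalg.norm(pts - centroid, axis=1).max()
    s = np.sqrt(2.0) / max_dist
    return np.array([[s, 0.0, -s * centroid[0]],
                     [0.0, s, -s * centroid[1]],
                     [0.0, 0.0, 1.0]])

# Hypothetical annotated pixels, for illustration only.
pts = np.array([[100.0, 200.0], [300.0, 250.0], [220.0, 400.0], [150.0, 320.0]])
T = normalizing_transform(pts)
pts_h = np.column_stack([pts, np.ones(len(pts))])
normalized = (T @ pts_h.T).T[:, :2]
```

After the transform, `normalized` has zero mean and its largest point norm equals $\sqrt{2}$.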

Formulate the Equation

For each correspondence between a 3D point $(x,y,z)$ and its image point $(u,v)$, expanding the equation $(\mathbf{P}(x,y,z,1)^T) \times (u,v,1)^T = 0$ gives two independent equations:

\[ x p_{11} + y p_{12} + z p_{13} + p_{14} - u x p_{31} - u y p_{32} - u z p_{33} - u p_{34} = 0 \]

\[ x p_{21} + y p_{22} + z p_{23} + p_{24} - v x p_{31} - v y p_{32} - v z p_{33} - v p_{34} = 0 \]

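Stacking these equations for all correspondences gives a homogeneous linear system $\mathbf{A}\mathbf{p} = \mathbf{0}$, where $\mathbf{p}$ holds the 12 entries of $\mathbf{P}$.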

Solve for $\mathbf{P}$

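We solve the system with Singular Value Decomposition (SVD): the right singular vector associated with the smallest singular value gives the camera matrix $\mathbf{P}$ in normalized image coordinates. Undoing the normalization, the final result is given by $\mathbf{T}^{-1} \mathbf{P}$.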
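The three steps above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the submitted code: the function name `estimate_camera_matrix`, the optional argument `T`, and the synthetic camera `P_true` are all hypothetical.

```python
import numpy as np

def estimate_camera_matrix(X, x, T=None):
    """Estimate a 3x4 camera matrix from (N, 3) points X and (N, 2) pixels x.

    T is an optional 3x3 normalizing transform for the image points;
    the identity is used if it is omitted.
    """
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    if T is None:
        T = np.eye(3)
    # Normalize the image points.
    x_h = (T @ np.column_stack([x, np.ones(len(x))]).T).T
    x_n = x_h[:, :2] / x_h[:, 2:3]
    # Two rows of the linear system per correspondence.
    rows = []
    for (px, py, pz), (u, v) in zip(X, x_n):
        Xh = [px, py, pz, 1.0]
        rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
        rows.append([0.0] * 4 + Xh + [-v * c for c in Xh])
    A = np.array(rows)
    # The null vector of A is the last right singular vector.
    _, _, Vt = np.linalg.svd(A)
    P_n = Vt[-1].reshape(3, 4)
    # Undo the normalization.
    return np.linalg.inv(T) @ P_n

# Synthetic check with a made-up camera and the corners of a unit cube.
P_true = np.array([[800.0, 0.0, 320.0, 100.0],
                   [0.0, 800.0, 240.0, 50.0],
                   [0.0, 0.0, 1.0, 5.0]])
X = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
              [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]], dtype=float)
proj = (P_true @ np.column_stack([X, np.ones(len(X))]).T).T
x = proj[:, :2] / proj[:, 2:3]
P_est = estimate_camera_matrix(X, x)
```

`P_est` matches `P_true` only up to a scale factor, so the estimate is best checked by reprojecting `X` and comparing against `x`.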

Output Images

Input Image	Annotated 2D points	Surface Points	Bounding Box

(b) Cuboid

Description

We reuse the function from question (a) to estimate $\mathbf{P}$. We then draw the midlines of each face of the cube and fill each face with a different color, which makes the cube resemble a fidget cube.
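As a rough illustration of the drawing step, the sketch below projects the midline endpoints of one face of a unit cube into the image. The cube coordinates, the camera matrix `P`, and the helper name `project` are assumptions made for this example.

```python
import numpy as np

def project(P, X):
    """Project (N, 3) world points to (N, 2) pixels with a 3x4 camera matrix."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    x_h = (P @ np.column_stack([X, np.ones(len(X))]).T).T
    return x_h[:, :2] / x_h[:, 2:3]

# Endpoints of the two midlines of the face z = 1 of a unit cube
# (rows 0-1 form one midline, rows 2-3 the other).
midlines_3d = np.array([[0.5, 0.0, 1.0], [0.5, 1.0, 1.0],
                        [0.0, 0.5, 1.0], [1.0, 0.5, 1.0]])

# Hypothetical camera matrix, standing in for the estimated P.
P = np.array([[800.0, 0.0, 320.0, 100.0],
              [0.0, 800.0, 240.0, 50.0],
              [0.0, 0.0, 1.0, 5.0]])
midlines_2d = project(P, midlines_3d)
```

The resulting 2D endpoints can be passed to any line-drawing routine; the other faces follow by changing which coordinate is held fixed.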

Output Images

Input Image	Annotated 2D points	Result

Q2: Camera calibration $\mathbf{K}$ from annotations

(a) Camera calibration from vanishing points

Description

Compute Vanishing Points

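We compute the vanishing point of each of the 3 pairs of parallel lines with cross products in homogeneous coordinates: the cross product of two annotated points gives the line through them, and the cross product of two lines gives their intersection.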
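A short sketch of the cross-product construction, assuming each annotated line is given by two pixel endpoints. The helper names and the two sample segments are illustrative.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two 2D points."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(seg_a, seg_b):
    """Homogeneous intersection of the lines through two segments."""
    return np.cross(line_through(*seg_a), line_through(*seg_b))

# Two image lines, y = x and y = 0.5 x + 2, which meet at (4, 4).
seg_a = ((0.0, 0.0), (1.0, 1.0))
seg_b = ((0.0, 2.0), (1.0, 2.5))
v = vanishing_point(seg_a, seg_b)
v_xy = v[:2] / v[2]  # dehomogenize; fails only if the lines are parallel in the image
```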

Formulate the Equation for $\omega$ and Solve for $\omega$

Let

\[ \omega = \begin{bmatrix} a & 0 & b \\ 0 & a & c \\ b & c & d \end{bmatrix} \]

For each pair of vanishing points $\mathbf{x} = (x_0, x_1, x_2)$ and $\mathbf{y} = (y_0, y_1, y_2)$ corresponding to perpendicular directions, we have the equation $\mathbf{x}^T \omega \mathbf{y} = 0$. Expanding this, we get:

\[ (x_0 y_0 + x_1 y_1) a + (x_0 y_2 + x_2 y_0) b + (x_1 y_2 + x_2 y_1) c + x_2 y_2 d = 0 \]

Stacking the equations from the three pairs of vanishing points gives a homogeneous linear system in the unknowns $(a, b, c, d)$ of $\omega$.

We then use SVD to solve for $\omega$.
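The linear system can be sketched as below, assuming three mutually orthogonal vanishing points in homogeneous coordinates. The function name and the synthetic camera `K_true` are assumptions used only to check the result.

```python
import numpy as np

def omega_from_vanishing_points(v1, v2, v3):
    """Solve for omega = [[a, 0, b], [0, a, c], [b, c, d]] up to scale."""
    def row(x, y):
        # Coefficients of (a, b, c, d) in x^T omega y.
        return [x[0] * y[0] + x[1] * y[1],
                x[0] * y[2] + x[2] * y[0],
                x[1] * y[2] + x[2] * y[1],
                x[2] * y[2]]
    A = np.array([row(v1, v2), row(v1, v3), row(v2, v3)], dtype=float)
    _, _, Vt = np.linalg.svd(A)
    a, b, c, d = Vt[-1]  # null vector of the 3x4 system
    return np.array([[a, 0.0, b], [0.0, a, c], [b, c, d]])

# Synthetic check: vanishing points of three orthogonal directions.
K_true = np.array([[1000.0, 0.0, 500.0],
                   [0.0, 1000.0, 400.0],
                   [0.0, 0.0, 1.0]])
dirs = np.array([[1, 2, 2], [2, 1, -2], [2, -2, 1]], dtype=float) / 3.0
v1, v2, v3 = (K_true @ d for d in dirs)
omega = omega_from_vanishing_points(v1, v2, v3)

# For this form of omega the principal point is (-b/a, -c/a).
principal_point = -omega[:2, 2] / omega[0, 0]
```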

Compute $\mathbf{K}$

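We use Cholesky decomposition to compute $\mathbf{K}$. Let $\omega = \mathbf{L} \mathbf{L}^T$ with $\mathbf{L}$ lower triangular; then $\mathbf{K}$ is given by $\mathbf{L}^{-T}$, scaled so that its bottom-right entry is 1. The principal point is given by the first two entries of the last column of $\mathbf{K}$.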
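A minimal sketch of this step, assuming $\omega$ is known up to a nonzero scale (and possibly a sign flip, since an SVD null vector has arbitrary sign). The function name `K_from_omega` and the test camera are hypothetical.

```python
import numpy as np

def K_from_omega(omega):
    """Recover an upper-triangular K from omega, which is K^{-T} K^{-1} up to scale."""
    omega = np.asarray(omega, dtype=float)
    if omega[0, 0] < 0:
        omega = -omega               # fix the sign so omega is positive definite
    L = np.linalg.cholesky(omega)    # omega = L L^T, L lower triangular
    K = np.linalg.inv(L).T           # K = L^{-T}, upper triangular
    return K / K[2, 2]               # normalize so K[2, 2] == 1

# Round-trip check on a made-up camera with an arbitrary scale and sign.
K_true = np.array([[1000.0, 0.0, 500.0],
                   [0.0, 1000.0, 400.0],
                   [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K_true)
omega = -3.7 * (K_inv.T @ K_inv)
K = K_from_omega(omega)
principal_point = K[:2, 2]
```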

Output Images

Input Image	Annotated Parallel Lines	Vanishing points and principal point

The output $\mathbf{K}$ is

\[ \begin{bmatrix} 1154.18 & 0 & 575.07 \\ 0 & 1154.18 & 431.94 \\ 0 & 0 & 1 \end{bmatrix} \]

(b) Camera calibration from metric planes

Description

Normalize the coordinates

Normalize the image coordinates so that all pixels in the image lie in the range $[-1, 1]$.
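One possible way to write this normalization is sketched below. The exact mapping used in the assignment is not stated, so the per-axis scaling, the function name `pixel_normalization`, and the image size are assumptions.

```python
import numpy as np

def pixel_normalization(width, height):
    """3x3 transform mapping x in [0, width] and y in [0, height] to [-1, 1]."""
    return np.array([[2.0 / width, 0.0, -1.0],
                     [0.0, 2.0 / height, -1.0],
                     [0.0, 0.0, 1.0]])

# Hypothetical image size.
N = pixel_normalization(1000, 800)
corners = np.array([[0.0, 0.0, 1.0], [1000.0, 800.0, 1.0], [500.0, 400.0, 1.0]])
normalized = (N @ corners.T).T[:, :2]

# If K_n is the calibration estimated in normalized coordinates,
# the calibration in pixel units is inv(N) @ K_n.
```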

Compute Homography

We compute the homography $\mathbf{H}$ from a square on the plane to the square in the image using the 4 annotations of the square.
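A sketch of the homography estimation from four correspondences, using the same linear formulation as in Q1. The function name and the sample quadrilateral are assumptions for illustration.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (3x3, up to scale) with dst ~ H @ src from 4 or more point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, -u])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(rows, dtype=float))
    return Vt[-1].reshape(3, 3)  # null vector of the stacked system

# Unit square on the plane and a made-up annotated quadrilateral
# in normalized image coordinates.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
quad = [(-0.5, -0.4), (0.6, -0.5), (0.7, 0.5), (-0.4, 0.6)]
H = homography_from_points(square, quad)

# Map the square corners through H to verify the fit.
mapped_h = (H @ np.column_stack([square, np.ones(4)]).T).T
mapped = mapped_h[:, :2] / mapped_h[:, 2:3]
```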

Formulate the Equation for $\omega$ and Solve for $\omega$

Let

\[ \mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3] \]

Using the fact that the images of the circular points, $\mathbf{H}(1, \pm i, 0)^\mathsf{T} = \mathbf{h}_1 \pm i \mathbf{h}_2$, lie on the conic $\omega$, we have:

\[ (\mathbf{h}_1 \pm i \mathbf{h}_2)^\mathsf{T} \boldsymbol{\omega} (\mathbf{h}_1 \pm i \mathbf{h}_2) = 0 \]

Separating the imaginary and real parts gives two conditions per homography:

\[ \mathbf{h}_1^\mathsf{T} \boldsymbol{\omega} \mathbf{h}_2 = 0 \]

\[ \mathbf{h}_1^\mathsf{T} \boldsymbol{\omega} \mathbf{h}_1 - \mathbf{h}_2^\mathsf{T} \boldsymbol{\omega} \mathbf{h}_2 = 0 \]

Combining all these equations results in a linear equation for $\omega$, which we solve using SVD.
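The stacked system can be sketched as follows, assuming a general symmetric $\omega$ with six unknowns and three homographies from non-parallel planes. The function names, the synthetic camera `K_true`, and the plane poses are hypothetical.

```python
import numpy as np

def _coeffs(p, q):
    """Coefficients of (w11, w12, w13, w22, w23, w33) in p^T omega q."""
    return np.array([p[0] * q[0],
                     p[0] * q[1] + p[1] * q[0],
                     p[0] * q[2] + p[2] * q[0],
                     p[1] * q[1],
                     p[1] * q[2] + p[2] * q[1],
                     p[2] * q[2]])

def omega_from_homographies(Hs):
    """Solve for a symmetric omega (up to scale) from plane-to-image homographies."""
    rows = []
    for H in Hs:
        h1, h2 = H[:, 0], H[:, 1]
        rows.append(_coeffs(h1, h2))                    # h1^T omega h2 = 0
        rows.append(_coeffs(h1, h1) - _coeffs(h2, h2))  # equal "norms"
    _, _, Vt = np.linalg.svd(np.array(rows))
    w11, w12, w13, w22, w23, w33 = Vt[-1]
    return np.array([[w11, w12, w13], [w12, w22, w23], [w13, w23, w33]])

def rot_xy(ax, ay):
    """Rotation about x by ax followed by rotation about y by ay (radians)."""
    cx, sx, cy, sy = np.cos(ax), np.sin(ax), np.cos(ay), np.sin(ay)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    return Rx @ Ry

# Synthetic check: H = K [r1 r2 t] for three differently oriented planes.
K_true = np.array([[2.0, 0.01, 0.1], [0.0, 2.1, -0.05], [0.0, 0.0, 1.0]])
t = np.array([0.0, 0.0, 5.0])
Hs = []
for ax, ay in [(0.3, 0.4), (-0.5, 0.2), (0.1, -0.6)]:
    R = rot_xy(ax, ay)
    Hs.append(K_true @ np.column_stack([R[:, 0], R[:, 1], t]))
omega = omega_from_homographies(Hs)
omega_true = np.linalg.inv(K_true).T @ np.linalg.inv(K_true)
```

Since $\omega$ is recovered only up to scale, the comparison with `omega_true` should be made after dividing both by the same entry, for example the bottom-right one.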

Compute $\mathbf{K}$

The procedure to compute $\mathbf{K}$ is the same as in question (a).

Compute the Angle Between Planes

For a plane $\pi$, we first find its vanishing line (horizon) $\mathbf{l}$ from the vanishing points of two pairs of parallel lines on $\pi$. The vanishing point $\mathbf{n}$ of the plane's normal direction is then given by $\omega^{-1} \mathbf{l}$. The angle between two planes is the angle between their normals:

\[ \cos(\theta) = \frac{\mathbf{n}_1^\mathsf{T} \omega \mathbf{n}_2}{\sqrt{\mathbf{n}_1^\mathsf{T} \omega \mathbf{n}_1} \sqrt{\mathbf{n}_2^\mathsf{T} \omega \mathbf{n}_2}} \]
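The angle formula can be sketched as below, assuming $\omega$ is positive definite and each plane is represented by its vanishing line. The function name and the synthetic planes are assumptions; because the sign of a homogeneous line is arbitrary, the result may come out as $\theta$ or $180^\circ - \theta$.

```python
import numpy as np

def plane_angle_deg(l1, l2, omega):
    """Angle in degrees between two planes given their vanishing lines."""
    omega_inv = np.linalg.inv(omega)
    n1 = omega_inv @ l1  # vanishing point of the first plane's normal
    n2 = omega_inv @ l2  # vanishing point of the second plane's normal
    cos_t = (n1 @ omega @ n2) / np.sqrt((n1 @ omega @ n1) * (n2 @ omega @ n2))
    return np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0)))

# Synthetic check: a plane with camera-frame normal N has vanishing line K^{-T} N.
K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 400.0], [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K)
omega = K_inv.T @ K_inv
N1 = np.array([0.0, 0.0, 1.0])
N2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2.0)
N3 = np.array([0.0, 1.0, 0.0])
angle_12 = plane_angle_deg(K_inv.T @ N1, K_inv.T @ N2, omega)  # normals 45 degrees apart
angle_13 = plane_angle_deg(K_inv.T @ N1, K_inv.T @ N3, omega)  # perpendicular planes
```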

Output Images

Input Image	Annotated Square 1	Annotated Square 2	Annotated Square 3

Plane pair	Angle between planes (degrees)
Plane 1 & Plane 2	67.41
Plane 1 & Plane 3	92.25
Plane 2 & Plane 3	94.76

The output $\mathbf{K}$ is

\[ \begin{bmatrix} 1079.99 & -8.27 & 515.42 \\ 0 & 1078.14 & 399.70 \\ 0 & 0 & 1 \end{bmatrix} \]

(c) Camera calibration from rectangles with known sizes

Description

The procedure is the same as in (b), except that each homography maps a rectangle of known size on the plane to the annotated rectangle in the image.

Output Images

Input Image	Annotated Rectangle 1	Annotated Rectangle 2	Annotated Rectangle 3

Plane pair	Angle between planes (degrees)
Plane 1 & Plane 2	94.12
Plane 1 & Plane 3	88.94
Plane 2 & Plane 3	121.72

The output $\mathbf{K}$ is

\[ \begin{bmatrix} 1127.83 & 24.74 & 742.33 \\ 0 & 1117.23 & 576.02 \\ 0 & 0 & 1 \end{bmatrix} \]

Q3: Single View Reconstruction

(a)

Description

1. Compute $\mathbf{K}$ using the method from question 2(a).
2. Compute the ray for each pixel using the formula $\mathbf{d} = \mathbf{K}^{-1} \mathbf{x}$.
3. Compute the normal of each annotated plane by determining the vanishing points for 2 sets of parallel lines on the plane.
4. If the position of a point on the plane is known, the 3D coordinates of all points on the plane can be computed via ray-plane intersection.

By repeating step 4, the 3D position of each point in the image can be reconstructed.
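The normal computation and the ray-plane intersection can be sketched as follows. This is an illustration under assumptions: the known point is specified here by a reference pixel and its depth along the ray, and the function names, camera `K`, and sample values are all hypothetical.

```python
import numpy as np

def plane_normal(K, v1, v2):
    """Unit normal of a plane from two vanishing points of directions on it."""
    l = np.cross(v1, v2)  # vanishing line of the plane
    n = K.T @ l           # normal in the camera frame, up to scale
    return n / np.linalg.norm(n)

def backproject_to_plane(pixels, K, n, ref_pixel, ref_depth):
    """Intersect pixel rays with the plane through a known reference point."""
    K_inv = np.linalg.inv(K)
    ref_point = ref_depth * (K_inv @ np.array([ref_pixel[0], ref_pixel[1], 1.0]))
    offset = n @ ref_point  # plane equation: n . X = offset
    pixels = np.asarray(pixels, dtype=float)
    rays = (K_inv @ np.column_stack([pixels, np.ones(len(pixels))]).T).T
    scale = offset / (rays @ n)  # distance along each ray d = K^{-1} x
    return rays * scale[:, None]

# Synthetic example: plane spanned by two made-up directions.
K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 400.0], [0.0, 0.0, 1.0]])
v1 = K @ np.array([1.0, 0.0, 0.0])
v2 = K @ np.array([0.0, 0.8, -0.6])
n = plane_normal(K, v1, v2)
pixels = np.array([[500.0, 400.0], [600.0, 400.0], [500.0, 300.0]])
points = backproject_to_plane(pixels, K, n, ref_pixel=(500.0, 400.0), ref_depth=4.0)
```

Every row of `points` satisfies the same plane equation as the reference point, so adjacent planes can be chained by reusing a shared point as the next reference.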

Output Images

Input Image	Annotations	Reconstruction View 1	Reconstruction View 2

(b)

Input Image	Output
