16-822: Geometry Based Methods in Vision

Assignment 2: Single View Reconstruction

Q1: Camera matrix from 2D-3D correspondences

(a) Stanford Bunny

Surface Points	Bounding Box

$$ P = \begin{bmatrix} 6.12468612e-01 & -2.80769806e-01 & 1.09185025e-01 & 2.12092927e-01 \\ -8.90197009e-02 & -6.43243106e-01 & 1.93261536e-01 & 1.73520830e-01 \\ 5.51654830e-05 & -1.35588807e-04 & -7.00171505e-05 & 9.52266452e-05 \end{bmatrix} $$

(b) Cube

Coordinate system: The cube corner at the bottom center of the image is taken as the origin. The edge running to the right is the x-axis, the edge running to the left is the y-axis, and the edge perpendicular to both is the z-axis.

Input Image	Annotated 2D Points	Result

$$ P = \begin{bmatrix} -3.39842151e-01 & 2.89686042e-01 & 3.60814789e-02 & -5.18158407e-01 \\ 3.20477795e-02 & 4.64181213e-04 & 4.41192377e-01 & -5.78896132e-01 \\ -1.16386054e-04 & -9.44316625e-05 & 2.65774824e-05 & -7.46708068e-04 \end{bmatrix} $$
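The write-up does not say how $P$ was estimated. One standard choice is the direct linear transform (DLT), sketched here as an assumption rather than as the method actually used; `camera_matrix_dlt` and its argument names are hypothetical. Each 2D-3D correspondence contributes two rows to a homogeneous system $Ap = 0$, and the right singular vector for the smallest singular value is reshaped into $P$. That vector has unit norm, which is consistent with the matrices above, whose Frobenius norms are close to 1.

```python
import numpy as np

def camera_matrix_dlt(points_3d, points_2d):
    """Estimate a 3x4 camera matrix from n >= 6 correspondences (DLT sketch).

    points_3d: (n, 3) world points; points_2d: (n, 2) pixel coordinates.
    The result is defined only up to scale and sign.
    """
    points_3d = np.asarray(points_3d, dtype=float)
    points_2d = np.asarray(points_2d, dtype=float)
    X = np.hstack([points_3d, np.ones((len(points_3d), 1))])  # homogeneous 3D points
    zeros = np.zeros(4)
    rows = []
    for Xi, (u, v) in zip(X, points_2d):
        # u = (p1 . X) / (p3 . X)  ->  p1 . X - u * (p3 . X) = 0
        rows.append(np.concatenate([Xi, zeros, -u * Xi]))
        # v = (p2 . X) / (p3 . X)  ->  p2 . X - v * (p3 . X) = 0
        rows.append(np.concatenate([zeros, Xi, -v * Xi]))
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    # Last row of Vt is the unit-norm null vector of A
    return Vt[-1].reshape(3, 4)
```

With noisy annotations, normalizing the 2D and 3D points before building $A$ usually improves conditioning; that step is omitted here for brevity.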

Q2: Intrinsics from Annotations

(a) Camera calibration from vanishing points

Algorithm:

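Compute each vanishing point as the cross product of two annotated image lines that are parallel in 3D (each line is itself the cross product of its two homogeneous endpoints)
Use the relation $x_1^T \omega x_2 = 0$, which holds for vanishing points $x_1$ and $x_2$ of orthogonal directions, where $\omega = K^{-T}K^{-1}$
Solve for $\omega$ by creating a system of linear equations of the form $Ax = b$
Factor $\omega = LL^T$ with a Cholesky decomposition, so that $K^{-1} = L^T$ up to scale; invert it and normalize so that $K_{33} = 1$ to get $K$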
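The first step can be sketched in a few lines of NumPy. The helper names `line_through` and `vanishing_point`, and the convention that each annotation is a pair of pixel endpoints, are assumptions made for illustration.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two pixel points (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(segment_a, segment_b):
    """Intersection of two image lines that are parallel in 3D.

    Each segment is a pair of pixel endpoints ((x0, y0), (x1, y1)).
    The result is homogeneous; its last entry is zero if the image
    lines are also parallel in the image.
    """
    return np.cross(line_through(*segment_a), line_through(*segment_b))
```

For example, `vanishing_point(((0, 0), (50, 25)), ((0, 100), (50, 75)))` returns a homogeneous point that dehomogenizes to `(100, 50)`.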

Equations:

Since $K$ is an intrinsics matrix with zero skew and square pixels, it is of the form

$$ K = \begin{bmatrix} f & 0 & px \\ 0 & f & py \\ 0 & 0 & 1 \end{bmatrix} $$

This means $\omega = K^{-T}K^{-1}$ is of the form

$$ \omega = \begin{bmatrix} a & 0 & b \\ 0 & a & c \\ b & c & d \end{bmatrix} $$

Because $\omega$ is defined only up to scale, we can set $d = 1$ and rewrite $x_1^T \omega x_2 = 0$ as $Ax = b$ and solve. For vanishing points $(x_1, y_1, w_1)$ and $(x_2, y_2, w_2)$ of orthogonal directions, each constraint is of the form

$$ \begin{bmatrix} x_1x_2 + y_1y_2 & x_1w_2 + w_1x_2 & y_1w_2 + w_1y_2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = -w_1w_2 $$
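A minimal sketch of the solve, assuming exactly three vanishing points of mutually orthogonal directions. It fixes the bottom-right entry of $\omega$ to 1, as the right-hand side $-w_1w_2$ implies, and takes the two leading diagonal entries of $\omega$ to be equal, which is what zero skew and square pixels give. The function name is hypothetical.

```python
import numpy as np

def intrinsics_from_vanishing_points(vps):
    """Recover K (zero skew, square pixels) from three orthogonal vanishing points.

    vps: three homogeneous points (x, y, w). Assumes the 3x3 system is
    non-singular, which fails, for example, when a vanishing point is at infinity.
    """
    A, rhs = [], []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        x1, y1, w1 = vps[i]
        x2, y2, w2 = vps[j]
        A.append([x1 * x2 + y1 * y2, x1 * w2 + w1 * x2, y1 * w2 + w1 * y2])
        rhs.append(-w1 * w2)
    a, b, c = np.linalg.solve(np.array(A, dtype=float), np.array(rhs, dtype=float))
    omega = np.array([[a, 0.0, b],
                      [0.0, a, c],
                      [b, c, 1.0]])
    # omega = L L^T with L lower-triangular, so K^{-1} = L^T up to scale.
    # cholesky raises LinAlgError if noise makes omega non-positive-definite.
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

With more than three orthogonal pairs, `np.linalg.lstsq` could replace `np.linalg.solve` to get a least-squares estimate.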

Input Image	Annotated 2D Points	Vanishing points and principal point

$$ K = \begin{bmatrix} 1154 & 0 & 575 \\ 0 & 1154 & 431 \\ 0 & 0 & 1 \end{bmatrix} $$

(b) Camera calibration from metric planes

Algorithm:

For each annotated square, find the homography $H$ that maps the unit square centered at (0.5, 0.5) onto it
For each homography, use the constraints $h_1^T \omega h_2 = 0$ and $h_1^T \omega h_1 = h_2^T \omega h_2$, where $h_i$ is the $i$th column of $H$
Arrange the constraints in a matrix to generate a system of equations of the form $Ax = 0$
Compute the SVD of $A$; the right singular vector for the smallest singular value holds the entries of $\omega$
Factor $\omega = LL^T$ with a Cholesky decomposition, so that $K^{-1} = L^T$ up to scale; invert it and normalize so that $K_{33} = 1$ to get $K$
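The homography step can be sketched with a four-point DLT. The corner ordering in `UNIT_SQUARE` and the function name are assumptions; the annotated corners must be listed in the same order for the mapping to be meaningful.

```python
import numpy as np

# Unit square centered at (0.5, 0.5); this corner order is an assumed convention
UNIT_SQUARE = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

def homography_from_square(corners):
    """Homography mapping UNIT_SQUARE onto four annotated image corners.

    corners: (4, 2) pixel coordinates, ordered like UNIT_SQUARE.
    Returned up to scale and sign.
    """
    rows = []
    for (X, Y), (u, v) in zip(UNIT_SQUARE, np.asarray(corners, dtype=float)):
        rows.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        rows.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    # The 8x9 system has a one-dimensional null space for non-degenerate corners
    _, _, Vt = np.linalg.svd(np.array(rows, dtype=float))
    return Vt[-1].reshape(3, 3)
```

The scale of $H$ does not matter for the two constraints in the next step, because both are homogeneous in the columns of $H$.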

Equations:

Since we assume nothing about $K$, the $\omega$ matrix is a general symmetric matrix of the form

$$ \omega = \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix} $$

(here $f$ is an entry of $\omega$, not the focal length). For each homography with columns $h_1 = (x_1, y_1, w_1)$ and $h_2 = (x_2, y_2, w_2)$, we get

$$ \begin{bmatrix} x_1x_2 & x_1y_2 + y_1x_2 & y_1y_2 & x_1w_2 + w_1x_2 & y_1w_2 + w_1y_2 & w_1w_2 \\ x_1^2-x_2^2 & 2(x_1y_1-x_2y_2) & y_1^2 - y_2^2 & 2(x_1w_1 - x_2w_2) & 2(y_1w_1 - y_2w_2) & w_1^2-w_2^2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \end{bmatrix} = 0 $$
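The two constraint rows and the SVD solve can be sketched as follows, assuming at least three homographies from planes in general position. The function names are hypothetical. The sign flip is needed because the SVD null vector is defined only up to sign, while the Cholesky factorization requires a positive-definite $\omega$.

```python
import numpy as np

def constraint_rows(H):
    """Two linear constraints on (a, b, c, d, e, f) from one plane homography."""
    x1, y1, w1 = H[:, 0]
    x2, y2, w2 = H[:, 1]
    orthogonality = [x1 * x2, x1 * y2 + y1 * x2, y1 * y2,
                     x1 * w2 + w1 * x2, y1 * w2 + w1 * y2, w1 * w2]
    equal_norm = [x1**2 - x2**2, 2 * (x1 * y1 - x2 * y2), y1**2 - y2**2,
                  2 * (x1 * w1 - x2 * w2), 2 * (y1 * w1 - y2 * w2), w1**2 - w2**2]
    return [orthogonality, equal_norm]

def intrinsics_from_homographies(Hs):
    """Recover a general upper-triangular K from three or more plane homographies."""
    A = np.array([row for H in Hs for row in constraint_rows(H)], dtype=float)
    _, _, Vt = np.linalg.svd(A)
    a, b, c, d, e, f = Vt[-1]          # singular vector of the smallest singular value
    omega = np.array([[a, b, d],
                      [b, c, e],
                      [d, e, f]])
    if omega[2, 2] < 0:                # resolve the sign ambiguity of the null vector
        omega = -omega
    L = np.linalg.cholesky(omega)      # omega = L L^T, so K^{-1} = L^T up to scale
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

With noisy annotations $\omega$ may fail to be positive definite, in which case `np.linalg.cholesky` raises `LinAlgError`.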

Input Image	Annotated Squares

Planes	Angle (degrees)
Plane 1 & 2	67.603
Plane 2 & 3	94.805
Plane 1 & 3	92.250

$$ K = \begin{bmatrix} 1095 & -14.7 & 520 \\ 0 & 1079 & 401 \\ 0 & 0 & 1 \end{bmatrix} $$
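The write-up does not state how the angles between planes were computed. One possible approach, given here as an assumption, uses the fact that $K^{-1}h_1$ and $K^{-1}h_2$ are the 3D directions of the two sides of each square, so their cross product is the plane normal in the camera frame. The function names are hypothetical.

```python
import numpy as np

def plane_normal(H, K):
    """Unit normal of the plane behind homography H, in the camera frame."""
    K_inv = np.linalg.inv(K)
    side_1 = K_inv @ H[:, 0]
    side_2 = K_inv @ H[:, 1]
    n = np.cross(side_1, side_2)   # unchanged if H is multiplied by any nonzero scalar
    return n / np.linalg.norm(n)

def angle_between_planes_deg(H_a, H_b, K):
    """Angle in degrees between the normals of two annotated planes."""
    cos_theta = np.clip(plane_normal(H_a, K) @ plane_normal(H_b, K), -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))
```

The direction of each normal depends on the order in which the square's corners were annotated, so this sketch can return either $\theta$ or $180^\circ - \theta$ for the same pair of planes.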

Q3: Single View Reconstruction

Algorithm:

For each plane, find the 3D directions of two of its sides by computing each side's vanishing point $v$ and back-projecting it as $K^{-1}v$
Take the cross product of the two directions to get the direction of the plane normal in the camera frame
Fix the overall scale by setting the distance of the first corner of the first plane to an arbitrary number
Use this 3D point to solve for the offset $d$, which completes the equation of the first plane
Since the first plane intersects all other planes, use the points they share with it to complete the equations of the other planes
For each plane, get all the pixels corresponding to it and find the ray-plane intersection for each pixel to get the point cloud
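The plane-fitting steps above can be sketched as three small helpers. The names are hypothetical, the plane is written as $n \cdot X + d = 0$, and "distance" is interpreted as distance from the camera center along the viewing ray, which is an assumption.

```python
import numpy as np

def plane_normal_from_vanishing_points(K, vp_1, vp_2):
    """Unit plane normal from the vanishing points of two sides of the plane."""
    K_inv = np.linalg.inv(K)
    dir_1 = K_inv @ np.asarray(vp_1, dtype=float)   # 3D direction of the first side
    dir_2 = K_inv @ np.asarray(vp_2, dtype=float)   # 3D direction of the second side
    n = np.cross(dir_1, dir_2)
    return n / np.linalg.norm(n)

def point_at_distance(K, pixel, distance):
    """3D point on the ray through `pixel`, at `distance` from the camera center."""
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    return distance * ray / np.linalg.norm(ray)

def plane_offset(n, point_on_plane):
    """Offset d such that n . X + d = 0 holds at a known 3D point of the plane."""
    return -float(n @ point_on_plane)
```

For the first plane, `point_on_plane` comes from `point_at_distance` with the arbitrary distance. For every other plane it is a 3D point already reconstructed on a neighboring plane, which keeps all planes at a consistent scale.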

Equations:

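Ray direction: $K^{-1}x$, where $x$ is a pixel in 2D homogeneous coordinates
Ray-plane intersection: for a plane $n \cdot X + d = 0$ and a ray $X = t \begin{bmatrix} x & y & z \end{bmatrix}^T$ through the camera center, $t = \frac{-d} {(n_x \cdot x + n_y \cdot y + n_z \cdot z)} $, where $n = (n_x, n_y, n_z)$ is the normal vector and $\begin{bmatrix} x & y & z \end{bmatrix}$ is the ray direction.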
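A vectorized sketch of these two equations, assuming the plane is written as $n \cdot X + d = 0$ and the camera center is the origin. The function name is hypothetical.

```python
import numpy as np

def backproject_to_plane(K, pixels, n, d):
    """Intersect the viewing rays of `pixels` with the plane n . X + d = 0.

    pixels: (N, 2) pixel coordinates. Returns (N, 3) points in the camera frame.
    Rays parallel to the plane give a zero denominator and are not handled here.
    """
    pixels = np.asarray(pixels, dtype=float)
    x_h = np.hstack([pixels, np.ones((len(pixels), 1))])   # homogeneous pixels
    rays = x_h @ np.linalg.inv(K).T                         # row i is K^{-1} x_i
    t = -d / (rays @ np.asarray(n, dtype=float))            # ray parameter at the plane
    return rays * t[:, None]
```

For example, with $f = 1000$ and principal point $(500, 400)$, the pixel $(1500, 400)$ has ray direction $(1, 0, 1)$; on the plane $z = 5$ (that is, $n = (0, 0, 1)$ and $d = -5$) it reconstructs to $(5, 0, 5)$.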

Input Image	Annotation	View 1	View 2

Note: Point count was reduced to prevent lag