16-822: Geometry Based Methods in Vision

Assignment 2: Single View Reconstruction

Q1: Camera matrix from 2D-3D correspondences

(a) Stanford Bunny

Figures (left to right): Surface Points, Bounding Box.
$$ P = \begin{bmatrix} 6.12468612e-01 & -2.80769806e-01 & 1.09185025e-01 & 2.12092927e-01 \\ -8.90197009e-02 & -6.43243106e-01 & 1.93261536e-01 & 1.73520830e-01 \\ 5.51654830e-05 & -1.35588807e-04 & -7.00171505e-05 & 9.52266452e-05 \end{bmatrix} $$
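
The report does not state how $P$ was estimated. A common choice for this problem is the Direct Linear Transform (DLT), sketched below as an assumption rather than as the method actually used. The function name `estimate_camera_matrix` and its argument layout are hypothetical. The solution is scaled to unit Frobenius norm, which is what an SVD-based solve returns and which approximately matches the scale of the matrix reported above.

```python
import numpy as np

def estimate_camera_matrix(points_2d, points_3d):
    """Estimate a 3x4 camera matrix P (up to scale and sign) from n >= 6
    2D-3D correspondences using the Direct Linear Transform."""
    points_2d = np.asarray(points_2d, dtype=float)
    points_3d = np.asarray(points_3d, dtype=float)
    rows = []
    zeros = np.zeros(4)
    for (u, v), X in zip(points_2d, points_3d):
        Xh = np.append(X, 1.0)  # homogeneous 3D point
        # Each correspondence gives two linear equations in the 12 entries of P:
        #   p1.X - u * p3.X = 0   and   p2.X - v * p3.X = 0
        rows.append(np.concatenate([Xh, zeros, -u * Xh]))
        rows.append(np.concatenate([zeros, Xh, -v * Xh]))
    A = np.array(rows)
    # The right singular vector of the smallest singular value minimizes ||A p||
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    return P / np.linalg.norm(P)
```

Because $P$ is only defined up to scale, two estimates should be compared after normalization and up to a sign flip.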

(b) Cube

Coordinate system: The corner at the bottom center of the image is taken as the origin. The edge on the right is the x-axis, the edge on the left is the y-axis, and the edge perpendicular to both is the z-axis.

Figures (left to right): Input Image, Annotated 2D Points, Result.
$$ P = \begin{bmatrix} -3.39842151e-01 & 2.89686042e-01 & 3.60814789e-02 & -5.18158407e-01 \\ 3.20477795e-02 & 4.64181213e-04 & 4.41192377e-01 & -5.78896132e-01 \\ -1.16386054e-04 & -9.44316625e-05 & 2.65774824e-05 & -7.46708068e-04 \end{bmatrix} $$

Q2: Intrinsics from Annotations

(a) Camera calibration from vanishing points

Algorithm:

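  1. Compute each vanishing point as the cross product of two image lines that are parallel in 3D, where each line is itself the cross product of two annotated points
  2. Use the relation $x_1^T \omega x_2 = 0$ for vanishing points $x_1$ and $x_2$ of mutually orthogonal directions, where $\omega = K^{-T}K^{-1}$
  3. Solve for $\omega$ by stacking one constraint per pair of vanishing points into a system of linear equations of the form $Ax = b$
  4. Use Cholesky decomposition of $\omega$ to get $K^{-1}$ (up to transpose and scale), invert it to get $K$, and normalize so that $K_{33} = 1$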
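
Step 1 can be sketched as follows. This is a minimal illustration that assumes each annotation is a pair of 2D endpoints per line; the helper names `to_homogeneous` and `vanishing_point` are hypothetical.

```python
import numpy as np

def to_homogeneous(p):
    """Lift a 2D pixel (x, y) to homogeneous coordinates (x, y, 1)."""
    return np.array([p[0], p[1], 1.0])

def vanishing_point(a1, a2, b1, b2):
    """Vanishing point of two image segments (a1, a2) and (b1, b2)
    that are images of parallel 3D lines."""
    # The line through two points is their cross product
    line_a = np.cross(to_homogeneous(a1), to_homogeneous(a2))
    line_b = np.cross(to_homogeneous(b1), to_homogeneous(b2))
    # The intersection of two lines is again their cross product
    v = np.cross(line_a, line_b)
    # Normalize to unit length; this stays valid for points at infinity (w = 0)
    return v / np.linalg.norm(v)
```

Normalizing by the vector norm instead of dividing by the last coordinate avoids a division by zero when the two image lines are themselves parallel.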

Equations:

Assuming zero skew and square pixels, the intrinsics matrix $K$ is of the form

$$ K = \begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix} $$

This means $\omega = K^{-T}K^{-1}$ is of the form
$$ \omega = \begin{bmatrix} a & 0 & b \\ 0 & a & c \\ b & c & d \end{bmatrix} $$
Since $\omega$ is only defined up to scale, we fix $d = 1$. Using this, we can rewrite $x_1^T \omega x_2 = 0$ as $Ax = b$ and solve. For points $(x_1, y_1, w_1)$ and $(x_2, y_2, w_2)$, each constraint is of the form
$$ \begin{bmatrix} x_1x_2 + y_1y_2 & x_1w_2 + w_1x_2 & y_1w_2 + y_2w_1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = -w_1w_2 $$
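
A minimal sketch of the linear solve and the Cholesky step, assuming exactly three vanishing points of mutually orthogonal directions and an $\omega$ whose first two diagonal entries are equal (zero skew, square pixels) with its last entry fixed to 1. The function name `intrinsics_from_vanishing_points` is hypothetical.

```python
import numpy as np

def intrinsics_from_vanishing_points(v1, v2, v3):
    """Recover K (zero skew, square pixels) from three homogeneous
    vanishing points of mutually orthogonal directions."""
    A, rhs = [], []
    for p, q in [(v1, v2), (v2, v3), (v1, v3)]:
        x1, y1, w1 = p
        x2, y2, w2 = q
        # One row of A x = b, with unknowns (a, b, c) and omega[2, 2] fixed to 1
        A.append([x1 * x2 + y1 * y2, x1 * w2 + w1 * x2, y1 * w2 + y2 * w1])
        rhs.append(-w1 * w2)
    a, b, c = np.linalg.solve(np.array(A), np.array(rhs))
    omega = np.array([[a, 0.0, b],
                      [0.0, a, c],
                      [b, c, 1.0]])
    # omega = L L^T with L lower triangular, so L equals K^{-T} up to scale
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    # Remove the unknown scale
    return K / K[2, 2]
```

`np.linalg.cholesky` raises `LinAlgError` if the estimated $\omega$ is not positive definite, which in practice signals noisy or inconsistent annotations.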

Figures (left to right): Input Image, Annotated 2D Points, Vanishing points and principal point.
$$ K = \begin{bmatrix} 1154 & 0 & 575 \\ 0 & 1154 & 431 \\ 0 & 0 & 1 \end{bmatrix} $$

(b) Camera calibration from metric planes

Algorithm:

  1. For each annotated square, find the homography $H$ that maps a unit square centered at $(0.5, 0.5)$ to it
  2. For each homography, use the constraints $h_1^T \omega h_2 = 0$ and $h_1^T \omega h_1 = h_2^T \omega h_2$, where $h_i$ is the $i$th column of $H$
  3. Stack the constraints from all homographies in a matrix to generate a system of equations of the form $Ax = 0$
  4. Perform SVD of $A$ and use the right singular vector associated with the smallest singular value as the entries of $\omega$
  5. Use Cholesky decomposition of $\omega$ to get $K^{-1}$ (up to transpose and scale), invert it to get $K$, and normalize so that $K_{33} = 1$
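
Step 1 can be sketched with a four-point DLT. The corner ordering `(0,0), (1,0), (1,1), (0,1)` and the function name `homography_from_unit_square` are assumptions made for illustration; the annotated corners must be supplied in the matching order.

```python
import numpy as np

def homography_from_unit_square(corners):
    """Homography (up to scale) mapping the unit square corners
    (0,0), (1,0), (1,1), (0,1) to four annotated image corners."""
    src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
    dst = np.asarray(corners, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence in the 9 entries of H
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The null vector of the 8x9 system is the last row of Vt
    _, _, Vt = np.linalg.svd(np.array(rows))
    return Vt[-1].reshape(3, 3)
```

The returned matrix has arbitrary scale and sign, which is harmless here because the constraints on $\omega$ are homogeneous in $H$.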

Equations:

Since we know nothing about $K$, the $\omega$ matrix is a general symmetric matrix of the form
$$ \omega = \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix} $$
For each pair of columns $h_1 = (x_1, y_1, w_1)$ and $h_2 = (x_2, y_2, w_2)$, we get
$$ \begin{bmatrix} x_1x_2 & x_1y_2 + y_1x_2 & y_1y_2 & x_1w_2 + w_1x_2 & y_1w_2 + w_1y_2 & w_1w_2 \\ x_1^2-x_2^2 & 2(x_1y_1-x_2y_2) & y_1^2 - y_2^2 & 2(x_1w_1 - x_2w_2) & 2(y_1w_1 - y_2w_2) & w_1^2-w_2^2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \end{bmatrix} = 0 $$
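
The two rows above, the SVD solve, and the Cholesky step can be sketched as follows. This assumes at least three homographies from non-parallel planes, since a general symmetric $\omega$ has five degrees of freedom and each plane contributes two constraints. The function names `iac_rows` and `intrinsics_from_homographies` are hypothetical.

```python
import numpy as np

def iac_rows(H):
    """Two linear constraints on omega = [[a,b,d],[b,c,e],[d,e,f]]
    from the first two columns of a plane-to-image homography H."""
    x1, y1, w1 = H[:, 0]
    x2, y2, w2 = H[:, 1]
    # h1^T omega h2 = 0
    row_orth = [x1 * x2, x1 * y2 + y1 * x2, y1 * y2,
                x1 * w2 + w1 * x2, y1 * w2 + w1 * y2, w1 * w2]
    # h1^T omega h1 - h2^T omega h2 = 0
    row_norm = [x1**2 - x2**2, 2 * (x1 * y1 - x2 * y2), y1**2 - y2**2,
                2 * (x1 * w1 - x2 * w2), 2 * (y1 * w1 - y2 * w2), w1**2 - w2**2]
    return [row_orth, row_norm]

def intrinsics_from_homographies(homographies):
    """Recover a general upper-triangular K from >= 3 metric-plane homographies."""
    A = np.array([row for H in homographies for row in iac_rows(np.asarray(H))])
    _, _, Vt = np.linalg.svd(A)
    a, b, c, d, e, f = Vt[-1]
    omega = np.array([[a, b, d], [b, c, e], [d, e, f]])
    # The null vector has arbitrary sign; choose the positive-definite one
    if np.trace(omega) < 0:
        omega = -omega
    # omega = L L^T with L lower triangular, so L equals K^{-T} up to scale
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

With exactly three planes the system is 6x6 with a one-dimensional null space; additional planes turn it into a least-squares fit.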

Figures (left to right): Input Image, Annotated Squares.

| Planes | Angle (degrees) |
| --- | --- |
| Plane 1 & 2 | 67.603 |
| Plane 2 & 3 | 94.805 |
| Plane 1 & 3 | 92.250 |

$$ K = \begin{bmatrix} 1095 & -14.7 & 520 \\ 0 & 1079 & 401 \\ 0 & 0 & 1 \end{bmatrix} $$

Q3: Single View Reconstruction

Algorithm:

  1. For each plane, find the 3D direction vectors of two of its sides by back-projecting the corresponding vanishing points ($K^{-1}v$)
  2. Take the cross product of the two directions to get the direction of the plane normal in the camera frame
  3. Set the distance of the first corner of the first plane to an arbitrary number
  4. Use this distance to find the complete equation of the plane
  5. Since the first plane intersects all other planes, use common points to find the complete equation of the other planes
  6. For each plane, get all the pixels corresponding to it and find the ray-plane intersection for each pixel to get the point cloud
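
Steps 1 to 4 can be sketched as below, writing each plane as $n^T X + d = 0$ in the camera frame. The helper names `plane_normal` and `anchor_plane` are hypothetical, and interpreting the arbitrary "distance" as distance along the viewing ray of the reference corner is an assumption.

```python
import numpy as np

def plane_normal(K, v1, v2):
    """Unit normal of a plane from the homogeneous vanishing points
    v1, v2 of two non-parallel directions lying in that plane."""
    K_inv = np.linalg.inv(K)
    d1 = K_inv @ np.asarray(v1, dtype=float)  # 3D direction of the first side
    d2 = K_inv @ np.asarray(v2, dtype=float)  # 3D direction of the second side
    n = np.cross(d1, d2)
    return n / np.linalg.norm(n)

def anchor_plane(K, n, pixel, distance):
    """Place the 3D point seen at `pixel` at the given distance along its
    viewing ray and return (X0, d) such that n.X0 + d = 0."""
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    X0 = distance * ray / np.linalg.norm(ray)
    d = -float(n @ X0)
    return X0, d
```

For the remaining planes (step 5), the same offset formula $d = -n^T X_0$ can be reused with $X_0$ taken as a 3D point already reconstructed on a shared edge.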

Equations:

  1. Ray direction: $K^{-1}x$ where $x$ is a 2D homogeneous point
  2. Ray-plane intersection: for the plane $n^T X + d = 0$, $t = \frac{-d} {(n_x x + n_y y + n_z z)} $ where $n$ is the normal vector and $\begin{bmatrix} x & y & z \end{bmatrix}$ is the ray direction; the reconstructed 3D point is $t \begin{bmatrix} x & y & z \end{bmatrix}^T$.
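
A vectorized sketch of the two equations above, assuming the plane is written as $n^T X + d = 0$ in the camera frame and the pixels belonging to the plane are given as an `(N, 2)` array. The function name `backproject_to_plane` is hypothetical.

```python
import numpy as np

def backproject_to_plane(K, pixels, n, d):
    """Intersect the viewing rays of the given pixels with the plane
    n.X + d = 0 and return the 3D points as an (N, 3) array."""
    pixels = np.asarray(pixels, dtype=float)
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])
    # Ray direction for each pixel: K^{-1} x
    rays = (np.linalg.inv(K) @ homog.T).T
    # Ray-plane intersection parameter: t = -d / (n . ray)
    t = -d / (rays @ np.asarray(n, dtype=float))
    return rays * t[:, None]
```

For example, with a fronto-parallel plane at depth 4 (`n = (0, 0, 1)`, `d = -4`), the pixel at the principal point maps to `(0, 0, 4)`. Rays parallel to the plane give a zero denominator and should be filtered out before use.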

Figures (left to right): Input Image, Annotation, View 1, View 2.

Note: Point count was reduced to prevent lag