16-822: Geometry Based Methods in Vision
Assignment 2: Single View Reconstruction
Q1: Camera matrix from 2D-3D correspondences
(a) Stanford Bunny
[Figures: Surface Points, Bounding Box]
$$
P = \begin{bmatrix} 6.12468612e-01 & -2.80769806e-01 & 1.09185025e-01 & 2.12092927e-01 \\
-8.90197009e-02 & -6.43243106e-01 & 1.93261536e-01 & 1.73520830e-01 \\
5.51654830e-05 & -1.35588807e-04 & -7.00171505e-05 & 9.52266452e-05
\end{bmatrix}
$$
(b) Cube
Coordinate system: The corner at the bottom center of the image is taken as the origin. The edge to its right is the x-axis, the edge to its left is the y-axis, and the edge perpendicular to both is the z-axis.
[Figures: Input Image, Annotated 2D Points, Result]
$$
P = \begin{bmatrix} -3.39842151e-01 & 2.89686042e-01 & 3.60814789e-02 & -5.18158407e-01 \\
3.20477795e-02 & 4.64181213e-04 & 4.41192377e-01 & -5.78896132e-01 \\
-1.16386054e-04 & -9.44316625e-05 & 2.65774824e-05 & -7.46708068e-04
\end{bmatrix}
$$
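The report does not state how $P$ was estimated. A common choice for this problem is the direct linear transform (DLT), and both matrices above have approximately unit Frobenius norm, which is what taking a singular vector would produce. The following is a minimal sketch under that assumption; the function name `estimate_camera_matrix` and its argument layout are illustrative and not taken from the assignment code.

```python
import numpy as np

def estimate_camera_matrix(pts_2d, pts_3d):
    """Estimate a 3x4 camera matrix from n >= 6 2D-3D correspondences (DLT sketch)."""
    pts_2d = np.asarray(pts_2d, dtype=float)
    pts_3d = np.asarray(pts_3d, dtype=float)
    zero = np.zeros(4)
    rows = []
    for (u, v), X in zip(pts_2d, pts_3d):
        Xh = np.append(X, 1.0)  # homogeneous 3D point
        # Two independent rows of the constraint x cross (P X) = 0
        rows.append(np.concatenate([Xh, zero, -u * Xh]))
        rows.append(np.concatenate([zero, Xh, -v * Xh]))
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value
    _, _, vt = np.linalg.svd(A)
    P = vt[-1].reshape(3, 4)
    # P is defined only up to scale (and sign); fix the scale to unit Frobenius norm
    return P / np.linalg.norm(P)
```

No coordinate normalization is applied here, so with noisy annotations a normalized variant would likely be more stable.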
Q2: Intrinsics from Annotations
(a) Camera calibration from vanishing points
Algorithm:
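- Compute the vanishing point of each pair of parallel scene lines: the line through two annotated points is their cross product, and the vanishing point is the cross product of the two lines
- Use the relation $x_1^T \omega x_2 = 0$ for vanishing points $x_1$ and $x_2$ of orthogonal directions, where $\omega = K^{-T}K^{-1}$
- Solve for $\omega$ by stacking one such constraint per pair of vanishing points into a linear system of the form $Ax = b$
- Use Cholesky decomposition of $\omega$ to get $K^{-1}$, then invert it to get $K$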
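The first step can be sketched as follows. This is a minimal illustration, assuming each annotated line is given as a pair of pixel endpoints; the name `vanishing_point` and the input layout are hypothetical.

```python
import numpy as np

def vanishing_point(line_a, line_b):
    """Homogeneous vanishing point of two image lines.

    line_a, line_b: ((x1, y1), (x2, y2)) endpoints of the images of two
    parallel scene lines.
    """
    def to_line(segment):
        p, q = (np.array([x, y, 1.0]) for x, y in segment)
        return np.cross(p, q)  # homogeneous line through p and q

    # Two homogeneous lines intersect at their cross product
    return np.cross(to_line(line_a), to_line(line_b))

vp = vanishing_point(((0, 0), (1, 0)), ((0, 1), (1, 0.5)))
print(vp / vp[2])  # dehomogenized intersection of the two lines
```

The result is left in homogeneous form so that lines that remain parallel in the image (third coordinate zero) do not cause a division by zero.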
Equations:
Since $K$ is an intrinsics matrix with zero skew and square pixels, it is of the form
$$
K = \begin{bmatrix} f & 0 & px \\
0 & f & py \\
0 & 0 & 1
\end{bmatrix}
$$
This means $\omega = K^{-T}K^{-1}$ is of the form
$$
\omega = \begin{bmatrix} a & 0 & b \\
0 & a & c \\
b & c & d
\end{bmatrix}
$$
Using this, we can rewrite $x_1^T \omega x_2 = 0$ as $Ax = b$ and solve. Since $\omega$ is defined only up to scale, $d$ is fixed to 1, leaving the unknowns $(a, b, c)$. For points $(x_1, y_1, w_1)$ and $(x_2, y_2, w_2)$, each constraint is of the form
$$
\begin{bmatrix}
x_1x_2 + y_1y_2 & x_1w_2 + w_1x_2 & y_1w_2 + w_1y_2
\end{bmatrix}
\begin{bmatrix}
a \\ b \\ c
\end{bmatrix}
= -w_1w_2
$$
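A minimal sketch of the solve and the Cholesky step, assuming three vanishing points of mutually orthogonal directions and the scale of $\omega$ fixed by setting its bottom-right entry to 1. The function name is illustrative.

```python
import numpy as np

def intrinsics_from_vanishing_points(v1, v2, v3):
    """K (zero skew, square pixels) from three orthogonal vanishing points."""
    A, rhs = [], []
    for (x1, y1, w1), (x2, y2, w2) in [(v1, v2), (v2, v3), (v1, v3)]:
        # One row of A per orthogonal pair, matching the constraint above
        A.append([x1 * x2 + y1 * y2, x1 * w2 + w1 * x2, y1 * w2 + w1 * y2])
        rhs.append(-w1 * w2)
    a, b, c = np.linalg.solve(np.array(A, dtype=float), np.array(rhs, dtype=float))
    omega = np.array([[a, 0.0, b],
                      [0.0, a, c],
                      [b, c, 1.0]])
    # omega = L L^T with L lower triangular, so L^T equals K^{-1} up to scale
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    return K / K[2, 2]  # normalize so that K[2, 2] = 1
```

`np.linalg.cholesky` raises `LinAlgError` if the recovered $\omega$ is not positive definite, which can happen when the annotations are noisy or the three directions are not close to orthogonal.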
[Figures: Input Image, Annotated 2D Points, Vanishing points and principal point]
$$
K = \begin{bmatrix} 1154 & 0 & 575 \\
0 & 1154 & 431 \\
0 & 0 & 1
\end{bmatrix}
$$
(b) Camera calibration from metric planes
Algorithm:
- For each annotated square, find the homography $H$ that maps a unit square centered at (0.5, 0.5) to it
- For each homography, use the constraints $h_1^T \omega h_2 = 0$ and $h_1^T \omega h_1 = h_2^T \omega h_2$, where $h_i$ is the $i$th column of $H$
- Stack the constraints from all homographies into a matrix $A$ to form a system of equations $Ax = 0$
- Perform SVD of $A$ and take the right singular vector of the smallest singular value as the entries of $\omega$
- Use Cholesky decomposition of $\omega$ to get $K^{-1}$, then invert it to get $K$
Equations:
Since we assume nothing about $K$, the $\omega$ matrix is a general symmetric matrix of the form
$$
\omega = \begin{bmatrix} a & b & d \\
b & c & e \\
d & e & f
\end{bmatrix}
$$
For each homography, with first two columns $h_1 = (x_1, y_1, w_1)$ and $h_2 = (x_2, y_2, w_2)$, we get
$$
\begin{bmatrix}
x_1x_2 & x_1y_2 + y_1x_2 & y_1y_2 & x_1w_2 + w_1x_2 & y_1w_2 + w_1y_2 & w_1w_2 \\
x_1^2-x_2^2 & 2(x_1y_1-x_2y_2) & y_1^2 - y_2^2 & 2(x_1w_1 - x_2w_2) & 2(y_1w_1 - y_2w_2) & w_1^2-w_2^2
\end{bmatrix}
\begin{bmatrix}
a \\ b \\ c \\ d \\ e \\ f
\end{bmatrix}
= 0
$$
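The steps above can be sketched as follows. This is a minimal illustration, assuming each annotated square is given as four pixel corners listed in the same order as the unit-square corners `(0,0), (1,0), (1,1), (0,1)`; all function names are hypothetical and the homography is estimated with an unnormalized DLT.

```python
import numpy as np

UNIT_SQUARE = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

def homography_from_square(corners):
    """Homography mapping the unit-square corners to four image corners (DLT)."""
    rows = []
    for (X, Y), (u, v) in zip(UNIT_SQUARE, np.asarray(corners, dtype=float)):
        rows.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        rows.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 3)

def constraint_rows(H):
    """The two rows of A contributed by one homography."""
    (x1, y1, w1), (x2, y2, w2) = H[:, 0], H[:, 1]
    return [
        # h1^T omega h2 = 0
        [x1 * x2, x1 * y2 + y1 * x2, y1 * y2,
         x1 * w2 + w1 * x2, y1 * w2 + w1 * y2, w1 * w2],
        # h1^T omega h1 - h2^T omega h2 = 0
        [x1**2 - x2**2, 2 * (x1 * y1 - x2 * y2), y1**2 - y2**2,
         2 * (x1 * w1 - x2 * w2), 2 * (y1 * w1 - y2 * w2), w1**2 - w2**2],
    ]

def intrinsics_from_squares(squares):
    """K from three or more annotated squares lying on non-parallel planes."""
    A = np.array([row for sq in squares
                  for row in constraint_rows(homography_from_square(sq))])
    _, _, vt = np.linalg.svd(A)
    a, b, c, d, e, f = vt[-1]  # null vector of A holds the entries of omega
    omega = np.array([[a, b, d], [b, c, e], [d, e, f]])
    if omega[0, 0] < 0:  # the null vector's sign is arbitrary
        omega = -omega
    L = np.linalg.cholesky(omega)  # omega = L L^T, so L^T is K^{-1} up to scale
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

Each homography is only known up to scale, which does not matter here because both constraints are homogeneous in $H$.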
[Figures: Input Image, Annotated Squares]

| Planes | Angle (degrees) |
| --- | --- |
| Plane 1 & 2 | 67.603 |
| Plane 2 & 3 | 94.805 |
| Plane 1 & 3 | 92.250 |
$$
K = \begin{bmatrix} 1095 & -14.7 & 520 \\
0 & 1079 & 401 \\
0 & 0 & 1
\end{bmatrix}
$$
Q3: Single View Reconstruction
Algorithm:
- For each plane, find the 3D direction vectors of two of its sides from their vanishing points (a vanishing point $v$ corresponds to the direction $K^{-1}v$)
- Take the cross product of the two directions to get the direction of the plane normal in the camera frame
- Set the depth of the first corner of the first plane to an arbitrary value, which fixes the overall scale
- Use this corner's 3D position and the normal to find the complete equation of the first plane
- Since the first plane intersects all other planes, use the points they share to find the complete equations of the other planes
- For each plane, take all the pixels that belong to it and compute the ray-plane intersection for each pixel to get the point cloud
Equations:
- Ray direction: $r = K^{-1}x$, where $x$ is a 2D homogeneous point
- Ray-plane intersection: for the plane $n^T X + d = 0$ and the ray $X = t\,r$, $t = \frac{-d}{n_x r_x + n_y r_y + n_z r_z}$, where $n = \begin{bmatrix} n_x & n_y & n_z \end{bmatrix}$ is the normal vector and $r = \begin{bmatrix} r_x & r_y & r_z \end{bmatrix}$ is the ray direction
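The normal computation and the ray-plane intersection can be sketched as below. This is a minimal illustration, assuming the plane is written as $n \cdot X + d = 0$ in the camera frame with the camera center at the origin; `plane_normal` and `backproject_to_plane` are hypothetical names.

```python
import numpy as np

def plane_normal(K, v1, v2):
    """Unit normal of a plane from the vanishing points of two of its sides."""
    K_inv = np.linalg.inv(K)
    n = np.cross(K_inv @ np.asarray(v1, dtype=float),
                 K_inv @ np.asarray(v2, dtype=float))
    return n / np.linalg.norm(n)

def backproject_to_plane(pixels, K, normal, d):
    """Intersect the camera rays through `pixels` with the plane n . X + d = 0."""
    pixels = np.asarray(pixels, dtype=float)
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])
    rays = homog @ np.linalg.inv(K).T  # each row is K^{-1} x
    t = -d / (rays @ np.asarray(normal, dtype=float))  # ray parameter at the plane
    return rays * t[:, None]  # 3D points, one per pixel
```

Given a known 3D point `X0` on a plane (for example a corner shared with an already reconstructed plane), the offset follows as `d = -normal @ X0`.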
[Figures: Input Image, Annotation, View 1, View 2]
Note: Point count was reduced to prevent lag