16-822 Assignment 2

Q1: Camera matrix P from 2D-3D correspondences (30 points)

(a) Stanford Bunny (15 points)

Given the 2D-3D point correspondences, each correspondence yields two linear equations in the entries of the camera matrix P. Stacking these constraints into a homogeneous system and solving it with singular value decomposition (SVD) gives the camera matrix P:

[[ 6.43169368e+03 -2.94843744e+03  1.14658061e+03  2.22724350e+03]
 [-9.34819249e+02 -6.75486473e+03  2.02949013e+03  1.82218778e+03]
 [ 5.79307220e-01 -1.42385366e+00 -7.35268478e-01  1.00000000e+00]]
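The DLT solve described above can be sketched as follows; this is a minimal NumPy version (the function name `compute_P` is my own, not from the assignment starter code):

```python
import numpy as np

def compute_P(pts2d, pts3d):
    """Estimate a 3x4 camera matrix from n >= 6 2D-3D correspondences (DLT).

    pts2d: (n, 2) pixel coordinates; pts3d: (n, 3) world coordinates.
    """
    A = []
    for (u, v), X in zip(pts2d, pts3d):
        Xh = np.append(X, 1.0)                         # homogeneous 3D point
        # Each correspondence contributes two rows of the homogeneous system.
        A.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))
        A.append(np.concatenate([np.zeros(4), Xh, -v * Xh]))
    # The solution of A p = 0 is the right singular vector with the
    # smallest singular value, i.e. the last row of V^T.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)
    return P / P[-1, -1]                               # fix scale: P[2,3] = 1
```

The division at the end fixes the arbitrary scale of the homogeneous solution so that P[2,3] = 1, matching the normalization of the matrices reported here.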

The visualizations of the surface points and the bounding box are shown below:

(Figures: q1a1, q1a2)

(b) Cuboid (15 points)

Using the same method as in Q1(a), we annotate six pixel positions on the cuboid and map them to the corners of a 3D unit cube (side length = 1). The resulting camera matrix P is:

[[-3.65943937e+01  5.30597004e+01 -3.02669429e+00  8.48080960e+01]
 [-1.28736036e+01 -1.08885879e+01 -5.64824172e+01  9.19991273e+01]
 [ 6.92863940e-02  1.11058784e-01 -4.01273802e-02  1.00000000e+00]]

The annotations and the edges of the cuboid are drawn below:

(Figures: q1b10, q1a1)

Q2: Camera calibration K from annotations (40 points + 10 points bonus)

(a) Camera calibration from vanishing points (20 points)

The input image, the annotated parallel lines, and the computed vanishing points and principal point are shown below:

(Figures: q2a1, q2a2, q2a3)

The K for the input image is:

[[1.15417802e+03 0.00000000e+00 5.75066005e+02]
 [0.00000000e+00 1.15417802e+03 4.31939090e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]

My implementation is as follows. First, from the annotated points on each pair of parallel lines, I compute the line representations via cross products, and intersect each pair of lines to obtain the vanishing points. Each pair of vanishing points (v_i, v_j) corresponds to a pair of orthogonal scene directions, so cos θ = 0 gives the constraint v_i^T ω v_j = 0, where ω = K^{-T} K^{-1} is the image of the absolute conic. Since we assume zero skew and square pixels, ω has the form

[[ω1,  0, ω2]
 [ 0, ω1, ω3]
 [ω2, ω3, ω4]]

Stacking the constraints from all pairs of vanishing points gives a homogeneous system Ax = 0, where x collects the unknown parameters of ω. We solve it with SVD, taking the last row of V^T. After obtaining ω, we apply Cholesky decomposition to recover K, since ω = K^{-T} K^{-1}. The principal point can then be read off from K.
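The calibration step from the three vanishing points can be sketched as below; the function name and variable layout are my own, and the zero-skew, square-pixel assumption matches the ω parameterization above:

```python
import numpy as np

def K_from_vanishing_points(vps):
    """Recover K from three pairwise-orthogonal vanishing points.

    Assumes zero skew and square pixels, so
    omega = [[w1, 0, w2], [0, w1, w3], [w2, w3, w4]].
    vps: three homogeneous vanishing points, each of shape (3,).
    """
    A = []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        vi, vj = vps[i], vps[j]
        # v_i^T omega v_j = 0, expanded in the unknowns (w1, w2, w3, w4)
        A.append([vi[0] * vj[0] + vi[1] * vj[1],
                  vi[0] * vj[2] + vi[2] * vj[0],
                  vi[1] * vj[2] + vi[2] * vj[1],
                  vi[2] * vj[2]])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    w1, w2, w3, w4 = Vt[-1]
    omega = np.array([[w1, 0, w2], [0, w1, w3], [w2, w3, w4]])
    if omega[0, 0] < 0:              # omega is recovered only up to sign
        omega = -omega
    # omega = K^-T K^-1 = L L^T with L = K^-T, so K = inv(L^T)
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    return K / K[-1, -1]
```

Cholesky requires ω to be positive definite, which is why the sign of the SVD solution is fixed first.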

(b) Camera calibration from metric planes (20 points)

I used the default annotations, and they are shown as follows:

(Figure: q2b1)

The K for the input image is:

[[ 1.08447642e+03 -1.35121131e+01  5.20013594e+02]
 [ 0.00000000e+00  1.07900526e+03  4.02544642e+02]
 [ 0.00000000e+00  0.00000000e+00  1.00000000e+00]]

The angles I get between each pair of planes are as follows:

Planes               Angle between planes (degrees)
Plane 1 & Plane 2    67.58
Plane 1 & Plane 3    92.25
Plane 2 & Plane 3    94.78

My implementation is as follows. For each square, I compute the homography H mapping the source points [[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]] to the annotated image corners; each H is again solved with SVD. Writing H = [h1, h2, h3], each homography gives two constraints on ω: h1^T ω h2 = 0 and h1^T ω h1 - h2^T ω h2 = 0. Similarly, we stack these constraints into the form Ax = 0, solve with SVD to get ω, and finally apply Cholesky decomposition to recover K.
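A minimal sketch of this pipeline, assuming three imaged unit squares (function names are my own; here ω is a full symmetric matrix, since no zero-skew assumption is needed):

```python
import numpy as np

def homography(src, dst):
    """DLT homography mapping src (n, 2) onto dst (n, 2), n >= 4."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 3)

def _omega_row(hi, hj):
    """Coefficients of h_i^T omega h_j in (w11, w12, w13, w22, w23, w33)."""
    return np.array([hi[0] * hj[0],
                     hi[0] * hj[1] + hi[1] * hj[0],
                     hi[0] * hj[2] + hi[2] * hj[0],
                     hi[1] * hj[1],
                     hi[1] * hj[2] + hi[2] * hj[1],
                     hi[2] * hj[2]])

def K_from_squares(image_quads):
    """Recover K from three imaged unit squares (list of (4, 2) corner arrays)."""
    src = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
    A = []
    for quad in image_quads:
        H = homography(src, np.asarray(quad, dtype=float))
        h1, h2 = H[:, 0], H[:, 1]
        A.append(_omega_row(h1, h2))                       # h1^T w h2 = 0
        A.append(_omega_row(h1, h1) - _omega_row(h2, h2))  # h1^T w h1 = h2^T w h2
    _, _, Vt = np.linalg.svd(np.asarray(A))
    w11, w12, w13, w22, w23, w33 = Vt[-1]
    omega = np.array([[w11, w12, w13], [w12, w22, w23], [w13, w23, w33]])
    if np.trace(omega) < 0:          # omega is recovered only up to sign
        omega = -omega
    K = np.linalg.inv(np.linalg.cholesky(omega).T)
    return K / K[-1, -1]
```

Both constraints are homogeneous in H, so the arbitrary scale and sign of each estimated homography do not affect the solution.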

(c) Camera calibration from rectangles with known sizes (10 points bonus)

The input image and the annotations are as follows:

(Figures: q2c1, q2c2)

The K for the input image is:

[[ 3.08588499e+03 -1.46138057e+02  1.59587545e+03]
 [ 0.00000000e+00  3.20926163e+03  1.67590571e+03]
 [ 0.00000000e+00  0.00000000e+00  1.00000000e+00]]

I evaluated the angles between the 3 planes; they are as follows:

Planes               Angle between planes (degrees)
Plane 1 & Plane 2    133.51
Plane 1 & Plane 3    80.69
Plane 2 & Plane 3    91.04

My implementation is similar to Q2(b), with the same equations. The only difference is the source points, since we now have rectangles rather than squares. For the first two rectangles, the source points are [[0, 0, 1], [1.6, 0, 1], [1.6, 1, 1], [0, 1, 1]]; for the third rectangle, they are [[0, 0, 1], [0.7, 0, 1], [0.7, 1, 1], [0, 1, 1]]. We then follow the same process as Q2(b).

Q3: Single View Reconstruction (30 points + 10 points bonus)

(a) (30 points)

The visualization of the annotations and the two output views are as follows:

(Figures: annotations and two rendered views)

My implementation is as follows. We select 3 pairs of parallel lines from the annotated rectangles and compute K with the same method as in Q2(a). We then compute the plane normals, as in Q2, from the vanishing points and K. To get each plane equation n^T X + a = 0, we pick a reference point x shared by the planes (for example, the point where the 3 planes intersect) and fix its depth to 1. Back-projecting the pixel as X = K^{-1} x gives the ray direction; with the known depth we compute the scale λ = depth / ||X||, and then a = -λ n^T X. With a fixed, the scale for every other pixel on the plane follows from λ = -a / (n^T X), and the 3D point for each pixel is obtained from K^{-1} and λ.
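The two steps above (fixing a from one reference pixel with known depth, then lifting all plane pixels) can be sketched as follows; function names and the test scene are my own:

```python
import numpy as np

def plane_offset(x_ref, depth, n, K):
    """Recover a in n^T X + a = 0 from one pixel with known depth.

    The reference 3D point is X = depth * K^-1 x / ||K^-1 x||, and then
    a = -n^T X.
    """
    d = np.linalg.inv(K) @ np.append(np.asarray(x_ref, dtype=float), 1.0)
    X = depth * d / np.linalg.norm(d)
    return -n @ X

def unproject_plane(pts2d, n, a, K):
    """Lift pixels lying on the plane n^T X + a = 0 to 3D points.

    Each pixel x maps to a ray d = K^-1 x; its scale solves
    lambda * n^T d + a = 0, i.e. lambda = -a / (n^T d).
    """
    xh = np.hstack([np.asarray(pts2d, dtype=float), np.ones((len(pts2d), 1))])
    rays = (np.linalg.inv(K) @ xh.T).T     # one ray direction per pixel
    lam = -a / (rays @ n)                  # per-pixel scale along the ray
    return rays * lam[:, None]
```

Here "depth" is the distance of the reference point along its ray, consistent with the λ = depth / ||X|| normalization above.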

(b) (10 points bonus)

We run the same code on 3 different inputs, each with its own annotations and known points. The visualizations are as follows:

(Figures: input, annotations, and reconstructed views for each of the three examples)