**Geometry-based Methods in Vision**
**Assignment 2** - Camera Calibration and Single-view Reconstruction

Camera matrix P from 2D-3D correspondences
==============================================================

Stanford bunny
--------------------------------------------------------------

The computed camera matrix P is
[[ 6.43169368e+03     -2.94843744e+03     1.14658061e+03     2.22724350e+03]
[-9.34819249e+02     -6.75486473e+03     2.02949013e+03     1.82218778e+03]
[ 5.79307220e-01     -1.42385366e+00     -7.35268478e-01     1.00000000e+00]]
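
One common way to obtain such a matrix is the direct linear transform (DLT). The sketch below assumes that method: each 2D-3D correspondence contributes two linear equations in the 12 entries of P, and the solution is the null vector of the stacked system. The function name `estimate_camera_matrix` is hypothetical, and dividing by the last entry (to match the printed matrices, whose bottom-right entry is 1) is an assumption.

```python
import numpy as np

def estimate_camera_matrix(points_3d, points_2d):
    """Estimate a 3x4 camera matrix from n >= 6 correspondences (DLT sketch)."""
    X = np.asarray(points_3d, dtype=float)
    x = np.asarray(points_2d, dtype=float)
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous 3D points, (n, 4)

    rows = []
    for Xi, (u, v) in zip(Xh, x):
        # u = (p1 . X) / (p3 . X)  =>  p1 . X - u * (p3 . X) = 0, same for v with p2
        rows.append(np.concatenate([Xi, np.zeros(4), -u * Xi]))
        rows.append(np.concatenate([np.zeros(4), Xi, -v * Xi]))
    A = np.array(rows)

    # Null vector of A: right singular vector with the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    return P / P[2, 3]  # assumed normalization so that P[2, 3] == 1
```

For noisy annotations, normalizing the 2D and 3D points before building `A` usually improves conditioning; that step is left out here to keep the sketch short.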

Figures: Input Image, Annotated 2D points, Surface Points, Bounding Box.

Cuboid
--------------------------------------------------------------

The computed camera matrix P is
[[-1.46811226e+00     -2.26996553e+01     1.12543991e+01     1.70114482e+02]
[-2.03818484e+01     -8.35820483e+00     -8.40506796e+00     1.91191584e+02]
[-1.37133262e-02     1.90548409e-03     1.93440977e-03     1.00000000e+00]]

Figures: Input Image, Annotated 2D points, Edges using P, G for Geometry.

Camera calibration K from annotations
==============================================================

Camera calibration from vanishing points
--------------------------------------------------------------

The computed camera calibration K is
[[1.15417802e+03     0.00000000e+00     5.75066005e+02]
[0.00000000e+00     1.15417802e+03     4.31939090e+02]
[0.00000000e+00     0.00000000e+00     1.00000000e+00]]
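
A calibration of this form can be computed from three vanishing points of mutually orthogonal directions. The following is a minimal sketch of that computation, assuming zero skew and square pixels; the function names and the input format (each line given as a pair of `(x, y)` endpoints) are hypothetical.

```python
import numpy as np

def vanishing_point(line_a, line_b):
    """Homogeneous intersection of two image lines, each a pair of (x, y) endpoints."""
    def line_through(p, q):
        return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])
    return np.cross(line_through(*line_a), line_through(*line_b))

def calibrate_from_vanishing_points(vps):
    """K from three vanishing points of mutually orthogonal directions."""
    A = []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        v1, v2 = vps[i], vps[j]
        # One row per orthogonal pair: v1^T omega v2 = 0
        A.append([v1[0] * v2[0] + v1[1] * v2[1],
                  v1[0] * v2[2] + v1[2] * v2[0],
                  v1[1] * v2[2] + v1[2] * v2[1],
                  v1[2] * v2[2]])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    w = Vt[-1]  # null vector of A
    omega = np.array([[w[0], 0.0, w[1]],
                      [0.0, w[0], w[2]],
                      [w[1], w[2], w[3]]])
    if omega[0, 0] < 0:  # w is defined up to sign; make omega positive definite
        omega = -omega
    L = np.linalg.cholesky(omega)  # omega = L L^T with L = K^{-T}
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

The final division fixes the overall scale, which the homogeneous system leaves undetermined.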

Figures: Input Image, Annotated parallel lines, Vanishing points and principal point.

The principal point is (575.066, 431.939).

**Implementation details**
We start by computing the vanishing points from the annotated parallel lines by taking the cross product of each pair of parallel lines in homogeneous coordinates. The vanishing points correspond to mutually orthogonal directions, so each pair is related to the image of the absolute conic $\omega$ by
$$v_i^T \omega v_j = 0$$
Assuming the camera has zero skew and square pixels, $\omega$ can be written as
$$\omega = \begin{bmatrix} w_1 & 0 & w_2 \\ 0 & w_1 & w_3 \\ w_2 & w_3 & w_4 \end{bmatrix}$$
Each pair of vanishing points $v_1 = (x_1, y_1, z_1)^T$ and $v_2 = (x_2, y_2, z_2)^T$ contributes one row to a linear system $A w = 0$ with $w = (w_1, w_2, w_3, w_4)^T$, where the row is
$$a = \begin{bmatrix} x_1 x_2 + y_1 y_2 & x_1 z_2 + z_1 x_2 & y_1 z_2 + z_1 y_2 & z_1 z_2 \end{bmatrix}$$
We solve for $w$ by applying SVD to $A$ and taking the right singular vector with the smallest singular value, then substitute its entries into $\omega$ above. Finally, since $\omega = K^{-T} K^{-1}$, a Cholesky decomposition $\omega = L L^T$ gives $L = K^{-T}$, so $K = L^{-T}$, scaled so that its bottom-right entry is 1.

Camera calibration from metric planes
--------------------------------------------------------------

The computed camera calibration K is
[[ 1.07692614e+03     -4.52638065e+00     5.11568923e+02]
[ 0.00000000e+00     1.07626752e+03     3.95526278e+02]
[ 0.00000000e+00     0.00000000e+00     1.00000000e+00]]
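
A calibration like this can be recovered from the homographies of three squares. The sketch below is one way to do it, assuming each square is annotated by its four image corners in the order matching the reference points $(0, 1), (1, 1), (1, 0), (0, 0)$. The function names are hypothetical, and the homography is estimated with an unnormalized DLT for brevity.

```python
import numpy as np

def homography(src, dst):
    """DLT homography mapping src points (n, 2) to dst points (n, 2), n >= 4."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    return Vt[-1].reshape(3, 3)

def omega_row(a, b):
    """Coefficients of a^T omega b in w = (w11, w12, w13, w22, w23, w33)."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def calibrate_from_squares(image_squares, reference=((0, 1), (1, 1), (1, 0), (0, 0))):
    """K from the image corners of three (or more) squares on different planes."""
    A = []
    for corners in image_squares:
        H = homography(reference, corners)
        h1, h2 = H[:, 0], H[:, 1]
        A.append(omega_row(h1, h2))                      # h1^T omega h2 = 0
        A.append(omega_row(h1, h1) - omega_row(h2, h2))  # h1^T omega h1 = h2^T omega h2
    _, _, Vt = np.linalg.svd(np.array(A))
    w = Vt[-1]
    omega = np.array([[w[0], w[1], w[2]],
                      [w[1], w[3], w[4]],
                      [w[2], w[4], w[5]]])
    if omega[0, 0] < 0:  # fix the sign so that omega is positive definite
        omega = -omega
    L = np.linalg.cholesky(omega)  # omega = L L^T with L = K^{-T}
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

Passing different `reference` corners, for example those of a rectangle with a known aspect ratio, reuses the same pipeline without other changes.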

| Planes | Angle between planes (degree) |
|----------|---------|
| Plane 1 & Plane 2 | 67.28 |
| Plane 2 & Plane 3 | 94.71 |
| Plane 3 & Plane 1 | 92.20 |

**Implementation details**
Here we use squares whose metrically rectified coordinates are known to compute $K$. We start by computing the homography that maps the points $(0, 1), (1, 1), (1, 0), (0, 0)$ to the corners of each square in the image, which gives three homographies for the three squares. Writing $h_1$ and $h_2$ for the first two columns of a homography, each homography gives the constraints
$$h_1^T \omega h_2 = 0 \\ h_1^T \omega h_1 = h_2^T \omega h_2$$
We get 2 constraints from each square, and since $\omega$ has 5 DoF, the three squares are enough to determine it. We stack the constraints into a system $A w = 0$, where $w$ holds the six distinct entries of the symmetric matrix $\omega$, and obtain $w$ by performing SVD on $A$ and taking the right singular vector with the smallest singular value. Then, $K$ can be obtained from $\omega$ using Cholesky decomposition.

Camera calibration from rectangles with known sizes
--------------------------------------------------------------

The computed camera calibration K is
[[ 2.06025196e+03     -3.80169688e+01     1.37094015e+03]
[ 0.00000000e+00     2.18700850e+03     1.17644879e+03]
[ 0.00000000e+00     0.00000000e+00     1.00000000e+00]]

| Planes | Angle between planes (degree) |
|----------|---------|
| Plane 1 & Plane 2 | 94.51 |
| Plane 2 & Plane 3 | 89.89 |
| Plane 3 & Plane 1 | 90.23 |

**Implementation details**
The implementation is the same as in the previous section, except that the image contains rectangles instead of squares, so the metrically rectified points need appropriate coordinate values. Since we know the dimensions of the rectangles, we can either give the exact dimensions or use any values that keep the width-to-height ratio the same. In my image, all 3 rectangles have the same width/height ratio of 1.5, so I use $(0, 0), (1.5, 0), (1.5, 1), (0, 1)$ for all the rectangles and find their homographies. The rest of the process is the same as before.

Single View Reconstruction
==============================================================

The computed camera calibration K is
[[808.5202361     0.     510.71188676]
[ 0.     808.5202361     363.63611542]
[ 0.     0.     1. ]]

Figures: Input Image, Annotations, Reconstruction View 1, Reconstruction View 2, Reconstruction View 3, Reconstruction View 4.

**Implementation details**
We take 3 pairs of parallel lines from the given annotations and compute the calibration matrix $K$ as described in the section on camera calibration from vanishing points. For each plane, we compute the vanishing points of two of its edge directions from the annotation, join them into the plane's vanishing line $l = v_1 \times v_2$, and compute the normal to the plane using
$$n = K^{T} l$$
We then get the ray through each image point $x$ on the plane using
$$r = K^{-1} x$$
We choose a reference point, assign it a depth of 1, and use it to solve for the offset $a$ in the plane equation
$$n^{T} X + a = 0$$
Intersecting each ray with this plane gives $X = \lambda r$ with $\lambda = -a / (n^{T} r)$, which yields the 3D coordinates of all the points in the plane. Then, we choose a point shared with another plane, take its reconstructed 3D position as the reference point, and repeat the process for that plane. Repeating this for all the planes gives the 3D coordinates of all the points in all the planes, which can then be plotted in 3D.
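
The per-plane steps can be sketched as follows. This is a minimal illustration and not the submitted code: the function names are hypothetical, and the normal is computed from the vanishing line through two vanishing points of the plane, which is an assumption of this sketch. The reference point for the first plane is taken as $K^{-1} x$, whose third coordinate is 1 when $K$ has bottom-right entry 1.

```python
import numpy as np

def plane_normal(K, v1, v2):
    """Unit normal of a plane from the vanishing points of two directions in it."""
    n = K.T @ np.cross(v1, v2)  # vanishing line l = v1 x v2, normal n = K^T l
    return n / np.linalg.norm(n)

def back_project(K, pixels):
    """Rays r = K^{-1} x for pixel coordinates of shape (n, 2)."""
    pixels = np.asarray(pixels, dtype=float)
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])
    return homog @ np.linalg.inv(K).T

def reconstruct_plane(K, n, pixels, ref_point):
    """3D points on the plane with normal n that passes through ref_point."""
    rays = back_project(K, pixels)
    a = -n @ ref_point      # plane equation: n^T X + a = 0
    lam = -a / (rays @ n)   # ray-plane intersection: X = lam * r
    return rays * lam[:, None]

# Usage sketch with placeholder inputs (not values from the assignment):
# ref = back_project(K, [ref_pixel])[0]            # reference point at depth 1
# pts_a = reconstruct_plane(K, n_a, pixels_a, ref)
# shared = pts_a[k]                                # a point also lying on plane b
# pts_b = reconstruct_plane(K, n_b, pixels_b, shared)
```

Chaining the reference point through shared points keeps all planes at a consistent scale, which is fixed only by the depth chosen for the first reference point.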