**16-822
Projective Geometry and Homography**
**Anisha Jain (anishaja)**

Overview
===============================================================================
In this assignment, we explore camera calibration and single-view reconstruction. We first compute the camera matrix $P$ from 2D to 3D correspondences. We then compute the camera calibration matrix $K$ from annotations of vanishing points and metric planes. Finally, we perform single-view reconstructions using the computed intrinsic matrix $K$.

Camera Matrix $P$ from 2D to 3D correspondences
===============================================================================

Stanford Bunny
-------------------------------------------------------------------------------
The projection matrix computed from the 2D to 3D correspondences is:

$$P = \begin{bmatrix}
6.43169368 \times 10^{3} & -2.94843744 \times 10^{3} & 1.14658061 \times 10^{3} & 2.22724350 \times 10^{3} \\
-9.34819249 \times 10^{2} & -6.75486473 \times 10^{3} & 2.02949013 \times 10^{3} & 1.82218778 \times 10^{3} \\
5.79307220 \times 10^{-1} & -1.42385366 & -7.35268478 \times 10^{-1} & 1.0
\end{bmatrix}$$

![Annotated 2D points](media/q1_point_annotate.png)
![Surface points](media/q1_test_point_annotate.png)
![Bounding box](media/q1_line_annotate.png)

Cuboid
-------------------------------------------------------------------------------
The projection matrix computed from the 2D to 3D correspondences is:

$$P = \begin{bmatrix}
1.58444340 \times 10^{2} & -6.72514068 \times 10^{1} & -2.18275214 \times 10^{1} & 1.79608726 \times 10^{3} \\
-2.28113088 \times 10^{1} & -3.92422039 \times 10^{1} & -1.74001422 \times 10^{2} & 3.30614401 \times 10^{3} \\
1.22838792 \times 10^{-2} & 1.60287814 \times 10^{-2} & -1.28631657 \times 10^{-2} & 1.00000000
\end{bmatrix}$$

![Input image and annotated 2D points](media/q1_cube_annotate.png)
![Cube edges using P](media/q1_cube_edges.png)
![Shapes on the faces](media/q1_cube_shapes.png)

Camera Calibration $K$ from annotations
===============================================================================

From vanishing points (using given annotations)
-------------------------------------------------------------------------------
We
compute $K$ from a triad of orthogonal vanishing points, assuming that the camera has zero skew and square pixels. Below is a brief summary of the steps:

1. Compute the three mutually orthogonal vanishing points from the annotated parallel lines.
2. Each pair of orthogonal vanishing points gives one linear constraint on $\omega$, the image of the absolute conic:
   $$v_i^T \omega v_j = 0$$
3. $\omega$ is a symmetric matrix, and the zero-skew and square-pixel assumptions simplify it to:
   $$\omega = \begin{bmatrix} a & 0 & e \\ 0 & a & f \\ e & f & c \end{bmatrix}$$
   This leaves four unknowns $(a, e, f, c)$, defined only up to scale, so the three constraints are enough.
4. The constraints on $\omega$ can therefore be written as $A x = 0$, where $x$ is the flattened vector of unknowns.
5. The solution for $x$ is obtained by computing the SVD of $A$ and taking the last column of $V$.
6. The intrinsic matrix $K$ is recovered from $\omega$ using Cholesky decomposition:
   $$\omega = K^{-T} K^{-1}$$

Using the above steps, we compute the intrinsic matrix $K$ from the vanishing points as:

$$K = \begin{bmatrix}
1.1541781 \times 10^{3} & 0.0000000 & 5.7506604 \times 10^{2} \\
-1.0214895 \times 10^{13} & 1.1541781 \times 10^{3} & 4.3193912 \times 10^{2} \\
0.00 & 0.00 & 1.00
\end{bmatrix}$$

The principal point is thus at $(575.06604, 431.93912)$.

![Input image](media/q2a.png)
![Annotated parallel lines](media/q2a_annotations.png)
![Vanishing point and principal point](media/q2a_vanishing_points.png)

From vanishing points (using own annotations)
-------------------------------------------------------------------------------
We follow the same steps as above, except that the parallel lines are now annotated by hand, which means that they are not perfect.

The intrinsic matrix $K$ computed using these annotations is:

$$K = \begin{bmatrix}
8.1273724 \times 10^{2} & -5.7392771 \times 10^{-14} & 4.3947455 \times 10^{2} \\
1.4258761 \times 10^{-14} & 8.1273724 \times 10^{2} & 6.9093457 \times 10^{2} \\
-2.0531784 \times 10^{-16} & 0.00 & 1.00
\end{bmatrix}$$

The principal point is thus estimated to be at $(439.47455, 690.93457)$.
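
The vanishing-point calibration steps listed earlier in this section can be sketched in NumPy as follows. This is a minimal illustration under the same zero-skew, square-pixel assumptions, not the code used to produce the results above; the function name `calibrate_from_vanishing_points` and the sign fix before the Cholesky step are assumptions made for this example.

```python
import numpy as np

def calibrate_from_vanishing_points(v1, v2, v3):
    """Estimate K from three mutually orthogonal vanishing points.

    Each v is a homogeneous 3-vector. Assumes zero skew and square pixels,
    so omega = [[a, 0, e], [0, a, f], [e, f, c]] with unknowns x = (a, e, f, c).
    """
    def constraint(u, v):
        # Coefficients of (a, e, f, c) in the expansion of u^T omega v.
        return [u[0] * v[0] + u[1] * v[1],
                u[0] * v[2] + u[2] * v[0],
                u[1] * v[2] + u[2] * v[1],
                u[2] * v[2]]

    A = np.array([constraint(v1, v2),
                  constraint(v1, v3),
                  constraint(v2, v3)], dtype=float)

    # Null vector of A: right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    a, e, f, c = Vt[-1]
    omega = np.array([[a, 0.0, e],
                      [0.0, a, f],
                      [e, f, c]])

    # omega is only defined up to scale; flip the sign if needed so that it
    # is positive definite before taking the Cholesky factor.
    if omega[0, 0] < 0:
        omega = -omega

    # omega = L L^T with L lower triangular, so L = K^{-T} and K = (L^T)^{-1}.
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

The vanishing points themselves can be obtained as cross products of homogeneous image lines, which is outside the scope of this sketch.
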
![Input image](media/q2a.png)
![Annotated parallel lines](media/q2a_ec_annotations.png)
![Vanishing point and principal point](media/q2a_ec_vanishing_points.png)

This shows that annotation errors can lead to large errors in the computed intrinsic matrix: compared with the given annotations, the focal length estimate drops from about 1154 to about 813 pixels, and the principal point moves from roughly $(575, 432)$ to $(439, 691)$.

From metric planes - Squares (using given annotations)
-------------------------------------------------------------------------------
We compute the intrinsic matrix $K$ from three annotated square planes. The steps are:

1. For each square, compute the homography that maps $(0, 1)$, $(1, 1)$, $(1, 0)$, $(0, 0)$ to the corners of the square in the image.
2. Doing this for each square gives three homographies:
   $$Hs = \{H_1, H_2, H_3\}$$
3. For each $H_i = [h_1, h_2, h_3]$, the first two columns are images of orthogonal directions of equal length, which gives the constraints:
   $$h_1^T \omega h_2 = 0, \qquad h_1^T \omega h_1 - h_2^T \omega h_2 = 0$$
4. Each $H_i$ thus gives two constraints on $\omega$, which we write in the form $A x = 0$, where $x$ is the flattened vector of unknowns. Since $\omega$ is symmetric, it has 6 unknowns, and the three homographies provide 6 constraints.
5. The solution for $x$ is obtained by computing the SVD of $A$ and taking the last column of $V$.
6. The intrinsic matrix $K$ is recovered from $\omega$ using Cholesky decomposition:
   $$\omega = K^{-T} K^{-1}$$

The intrinsic matrix $K$ computed using these annotations is:

$$K = \begin{bmatrix}
1.08447642 \times 10^{3} & -1.35121131 \times 10^{1} & 5.20013594 \times 10^{2} \\
-1.17407507 \times 10^{-13} & 1.07900526 \times 10^{3} & 4.02544642 \times 10^{2} \\
-2.32155867 \times 10^{-16} & 3.691251151 \times 10^{-18} & 1.00
\end{bmatrix}$$

The principal point is thus at $(520.013594, 402.544642)$.

We evaluate the angles between each pair of planes.
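
A minimal NumPy sketch of how the square-based calibration and the plane angles might be computed is given below. It is an illustration rather than the code behind the reported numbers: the function names are hypothetical, and it assumes that each homography maps a unit square on the plane to the image, so that $K^{-1} h_1$ and $K^{-1} h_2$ are two orthogonal in-plane directions and their cross product is the plane normal.

```python
import numpy as np

def omega_row(u, v):
    """Coefficients of u^T omega v in x = (w11, w12, w13, w22, w23, w33)."""
    return np.array([u[0] * v[0],
                     u[0] * v[1] + u[1] * v[0],
                     u[0] * v[2] + u[2] * v[0],
                     u[1] * v[1],
                     u[1] * v[2] + u[2] * v[1],
                     u[2] * v[2]])

def calibrate_from_plane_homographies(Hs):
    """Estimate K from homographies of (at least three) metric squares."""
    rows = []
    for H in Hs:
        h1, h2 = H[:, 0], H[:, 1]
        rows.append(omega_row(h1, h2))                      # orthogonality
        rows.append(omega_row(h1, h1) - omega_row(h2, h2))  # equal length
    _, _, Vt = np.linalg.svd(np.array(rows))
    w11, w12, w13, w22, w23, w33 = Vt[-1]
    omega = np.array([[w11, w12, w13],
                      [w12, w22, w23],
                      [w13, w23, w33]])
    if omega[0, 0] < 0:          # omega is defined up to scale and sign
        omega = -omega
    L = np.linalg.cholesky(omega)  # omega = L L^T, with L = K^{-T}
    K = np.linalg.inv(L.T)
    return K / K[2, 2]

def plane_normal(K, H):
    """Unit normal of the plane, in the camera frame."""
    K_inv = np.linalg.inv(K)
    n = np.cross(K_inv @ H[:, 0], K_inv @ H[:, 1])
    return n / np.linalg.norm(n)

def angle_between_planes(K, H_a, H_b):
    """Angle in degrees between the normals of two planes."""
    cos_theta = np.clip(plane_normal(K, H_a) @ plane_normal(K, H_b), -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))
```

Because the normal is a cross product of two columns of the same homography, its direction does not depend on the arbitrary scale of $H$, only on the order in which the corners were annotated.
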
The angles are:

| Planes | Angle (degrees) |
|--------|-----------------|
| 1, 2   | 67.27           |
| 1, 3   | 92.16           |
| 2, 3   | 94.68           |

The visualization of the metric planes is shown below:

![Input image](media/q2b.png)
![Square plane 1](media/q2b_plane_0.png)
![Square plane 2](media/q2b_plane_1.png)
![Square plane 3](media/q2b_plane_2.png)

From metric planes - Rectangles (using own annotations)
-------------------------------------------------------------------------------
This is similar to the above case, except that the annotated planes are now rectangles. Hence, the homographies map the known width and height of each rectangle to its image annotations. In the example below, we use credit-card-sized library cards and a shopping card as the rectangles. Credit cards measure 3.375 inches x 2.125 inches, and we exploit this known size to compute the intrinsic matrix $K$.

The intrinsic matrix $K$ computed using these annotations is:

$$K = \begin{bmatrix}
1.4753640 \times 10^{3} & -2.4937010 \times 10^{1} & 8.5325555 \times 10^{2} \\
9.4552553 \times 10^{-14} & 1.4688972 \times 10^{3} & 6.9038965 \times 10^{2} \\
0.00 & 0.00 & 1.00
\end{bmatrix}$$

The principal point is thus at $(853.25555, 690.38965)$.

We evaluate the angles between each pair of planes. The angles are:

| Planes | Angle (degrees) |
|--------|-----------------|
| 1, 2   | 74.81           |
| 1, 3   | 51.50           |
| 2, 3   | 59.36           |

The visualization of the metric planes is shown below:

![Input image](media/q2c.jpg width=600px)
![Rectangle plane 1](media/q2c_plane_0.png)
![Rectangle plane 2](media/q2c_plane_1.png)
![Rectangle plane 3](media/q2c_plane_2.png)

Single View Reconstructions
===============================================================================
We follow the steps below to perform single-view reconstructions:

1. Compute the intrinsic matrix $K$ from the annotations using the vanishing-point method.
2. Compute the ray through one reference point and set its depth to 1, using
   $$d_{ray} = K^{-1} x$$
3. Compute the directions of the lines in the plane from the vanishing points:
   $$d_{line} = K^{-1} v$$
4. Compute the line equations using the reference point and the directions from step 3.
5. Compute the intersections of the lines with the corresponding rays to get the 3D points.
6. Find the plane equation from the 3D points by computing the normal to the plane in the camera frame.
7. Compute the 3D coordinates of all points on the plane via ray-plane intersection.
8. Use a point from the plane above as the reference to compute the 3D coordinates of all points on the other planes.
9. Repeat the above steps for all planes.

Fellows quad, Merton College, Oxford
-------------------------------------------------------------------------------
Using the given annotations and assuming that the camera has zero skew and square pixels, we compute the intrinsic matrix $K$ as:

$$K = \begin{bmatrix}
8.0995142 \times 10^{2} & 0.00 & 5.1072006 \times 10^{2} \\
6.6194041 \times 10^{-15} & 8.0995142 \times 10^{2} & 3.5626895 \times 10^{2} \\
0.00 & 0.000 & 1.00
\end{bmatrix}$$

The principal point is thus at $(510.72006, 356.26895)$.

![Input image](media/q3.png)
![Annotated planes](media/q3_annotations.png)
![Reconstruction view 1](media/q3_a_view_1.png)
![Reconstruction view 2](media/q3_a_view_2.png)
![Reconstruction view 3](media/q3_a_view_3.png)

Some more reconstructions
-------------------------------------------------------------------------------
![Input image](media/q3_ec_1.png width=400px)![Reconstruction GIF](media/q3_ec_1.gif)

![Input image](media/q3_ec_2.png width=400px)![Reconstruction GIF](media/q3_ec_2.gif)

![Input image](media/q3_ec_0.png width=400px)![Reconstruction GIF](media/q3_ec_0.gif)
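
For reference, the back-projection in step 2 and the ray-plane intersection in step 7 of the reconstruction procedure can be sketched in NumPy as follows. This is a minimal illustration, not the code used for the reconstructions shown; the function names are hypothetical, and the plane is assumed to be given by a normal and one 3D reference point, both in the camera frame.

```python
import numpy as np

def pixel_rays(K, pixels):
    """Back-project N x 2 pixel coordinates to rays d = K^{-1} x, one per row.

    Each returned ray has third component 1, i.e. it is the 3D point at depth 1.
    """
    pixels = np.asarray(pixels, dtype=float)
    pixels_h = np.column_stack([pixels, np.ones(len(pixels))])
    return pixels_h @ np.linalg.inv(K).T

def intersect_rays_with_plane(rays, normal, ref_point):
    """Scale each ray so that it lies on the plane n . X = n . X_ref.

    A point on a ray is t * d; solving n . (t * d) = n . X_ref gives
    t = (n . X_ref) / (n . d).
    """
    scale = (normal @ ref_point) / (rays @ normal)
    return rays * scale[:, None]
```

A ray that is parallel to the plane gives a zero denominator, so pixels on or beyond the plane's vanishing line would need to be filtered out before calling the second function.
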