16-822 Assignment 2 🎯

📅 Due Date: 10/08/2024
👤 Name: Nicholas Beach (nbeach)


🖼️ Q1: Camera matrix P from 2D-3D correspondences (30 points)

Objective:
The goal of this task is to compute P from 2D-3D point correspondences.
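
A minimal sketch of how P could be estimated with the Direct Linear Transform (DLT) is shown below. The function name `estimate_camera_matrix` and its argument layout are illustrative assumptions, not the code used for the results that follow; at least 6 correspondences in general position are assumed.

```python
import numpy as np

def estimate_camera_matrix(points_2d, points_3d):
    """Estimate a 3x4 camera matrix P from n >= 6 2D-3D correspondences (DLT).

    points_2d: (n, 2) pixel coordinates, points_3d: (n, 3) world coordinates.
    (Hypothetical helper; names and shapes are assumptions for illustration.)
    """
    points_2d = np.asarray(points_2d, dtype=float)
    points_3d = np.asarray(points_3d, dtype=float)
    n = points_3d.shape[0]
    X = np.hstack([points_3d, np.ones((n, 1))])  # homogeneous 3D points, (n, 4)

    A = np.zeros((2 * n, 12))
    for i, ((u, v), Xi) in enumerate(zip(points_2d, X)):
        # x ~ P X gives two linear equations in the 12 entries of P:
        #   p1.X - u * p3.X = 0   and   p2.X - v * p3.X = 0
        A[2 * i, 0:4] = Xi
        A[2 * i, 8:12] = -u * Xi
        A[2 * i + 1, 4:8] = Xi
        A[2 * i + 1, 8:12] = -v * Xi

    # Least-squares solution of A p = 0: right singular vector belonging to
    # the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    # Fix the arbitrary scale so that P[2, 3] = 1 (assumes P[2, 3] != 0).
    return P / P[2, 3]
```

Dividing by `P[2, 3]` only fixes the free scale of the homogeneous solution; it matches the convention of the matrices reported in the results, whose last entry is 1.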


Part A Results:

  • Stanford Bunny:

    $$ P = \begin{pmatrix} 6431.6937 & -2948.4374 & 1146.5806 & 2227.2435 \\ -934.8192 & -6754.8647 & 2029.4901 & 1822.1878 \\ 0.5793 & -1.4239 & -0.7353 & 1 \end{pmatrix} $$

Part B Results:

  • Cuboid:

    $$ P = \begin{pmatrix} 51.2163 & -35.6975 & -2.9882 & 83.8286 \\ -10.5438 & -10.9962 & -56.3862 & 89.9993 \\ 0.0884 & 0.0676 & -0.0397 & 1 \end{pmatrix} $$


📐 Q2: Camera calibration K from annotations (40 points)

Objective:
The goal is to compute K from a triad of orthogonal vanishing points, assuming that the camera has zero skew, and that the pixels are square.


Part A Algorithm Description

  1. Loading Annotation Data:

    • The function get_point_data(image) loads pre-captured annotation data from a .npy file, containing points corresponding to lines in the image. These points are used to calculate the lines in the image that should be parallel.
  2. Calculating Lines:

    • The function calc_lines(points) computes the line through each pair of annotated points as the cross product of their homogeneous coordinates. These lines are then intersected to find the vanishing points. Each line is represented as a homogeneous 3-vector $ l = [a, b, c] $, where the equation of the line is: $$ ax + by + c = 0 $$
  3. Finding Intersection Points:

    • The function get_intersect_points(lines) computes the intersections of lines, specifically the vanishing points: the images of the points at infinity where parallel lines meet. Each intersection is the cross product of the two corresponding line vectors.
  4. Computing Omega and K:

    • Lines that are parallel in the real world appear to converge to a vanishing point due to perspective distortion. The camera intrinsics K can be calculated from the image of the absolute conic, the symmetric 3x3 matrix $ \omega = K^{-T} K^{-1} $, by using the vanishing points of 3 mutually orthogonal directions. Each orthogonal pair gives one linear constraint: $ v_1^T \omega v_2 = 0 $, $ v_1^T \omega v_3 = 0 $, $ v_2^T \omega v_3 = 0 $. Only 3 constraints are needed because zero skew and square pixels were assumed, which provides 2 additional constraints and leaves 4 unknown elements of $ \omega $, defined up to scale. Stacking the constraints gives $ A w = 0 $, where $ w $ is the vector of the unknown elements of $ \omega $ and A is constructed from the 3 constraints. SVD is then used to get $ w $ and hence $ \omega $. Once $ \omega $ is calculated, K can be obtained from the Cholesky decomposition $ \omega = L L^T $ by taking the inverse transpose of L and scaling so that $ K_{33} = 1 $.
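
The steps above can be sketched as follows. This is an illustrative outline under the stated zero-skew, square-pixel assumption, not the assignment's actual code: the helper names `line_through`, `intersect` and `K_from_vanishing_points` are hypothetical stand-ins for the functions described in the list.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line l = [a, b, c] through two image points (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def intersect(l1, l2):
    """Homogeneous intersection point of two lines (a vanishing point
    when the lines are images of parallel world lines)."""
    return np.cross(l1, l2)

def K_from_vanishing_points(v1, v2, v3):
    """Intrinsics from 3 mutually orthogonal vanishing points.

    With zero skew and square pixels the image of the absolute conic is
        omega = [[w1, 0,  w2],
                 [0,  w1, w3],
                 [w2, w3, w4]],
    so 3 constraints v_i^T omega v_j = 0 determine it up to scale.
    """
    def row(a, b):
        # Coefficients of (w1, w2, w3, w4) in a^T omega b.
        return [a[0] * b[0] + a[1] * b[1],
                a[0] * b[2] + a[2] * b[0],
                a[1] * b[2] + a[2] * b[1],
                a[2] * b[2]]

    A = np.array([row(v1, v2), row(v1, v3), row(v2, v3)], dtype=float)
    _, _, Vt = np.linalg.svd(A)
    w1, w2, w3, w4 = Vt[-1]          # null vector of the 3x4 system
    omega = np.array([[w1, 0, w2], [0, w1, w3], [w2, w3, w4]])
    if omega[0, 0] < 0:              # omega is only defined up to sign
        omega = -omega
    L = np.linalg.cholesky(omega)    # omega = L L^T, hence L = K^{-T}
    K = np.linalg.inv(L).T
    return K / K[2, 2]
```

With noisy annotations the estimated `omega` may fail to be positive definite, in which case `np.linalg.cholesky` raises `LinAlgError`; the sketch does not handle that case.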

Part A Results:

  • Vanishing Points Method:

    $$ K = \begin{pmatrix} 1154.178 & 0 & 575.066 \\ 0 & 1154.178 & 431.9391 \\ 0 & 0 & 1 \end{pmatrix} $$

Part B Algorithm Description

  1. Loading Annotation Data:

    • The function get_point_data(image) loads pre-captured annotation data from a .npy file, containing points corresponding to lines in the image. These points are used to calculate the lines in the image that should be parallel.
  2. Homography Calculation:
    A homography is computed to transform points from the perspective image (pers_points) to the corresponding points in the normal image (norm_points), using the computeH() and computeH_norm() functions.

    Homography Matrix (H):
    The matrix is computed by solving the homogeneous linear system built from the point correspondences using Singular Value Decomposition (SVD): $$ A \mathbf{h} = 0 $$ where $A$ stacks two equations per point correspondence and $\mathbf{h}$ is the flattened homography matrix. The solution is the right singular vector of $A$ associated with the smallest singular value, reshaped to 3x3.

    Normalization:
    Before computing the homography, the points are normalized using translation and scaling transformations to improve numerical stability. The normalization includes:

    • Translating the points to the centroid.
    • Scaling the points so that the farthest point from the origin has a distance of $\sqrt{2}$.

    After homography estimation, the result is denormalized to get the final transformation matrix.

  3. Computing Omega and K:

    • Each annotated plane and its homography provide 2 constraints: $ h_1^T \omega h_2 = 0 $ and $ h_1^T \omega h_1 = h_2^T \omega h_2 $, where $h_i$ is the ith column of that plane's homography. Using 3 such planes gives 6 constraints, which can be used to calculate $\omega$ from the property $ A w = 0 $, where $ w $ is the flattened vector of the elements of $\omega$ and A is constructed from the constraints. SVD is then used to get $\omega$. Once $\omega$ is calculated, K can be obtained from the Cholesky decomposition $ \omega = L L^T $ by taking the inverse transpose of L.
  4. Computing Angles Between Planes:

    • Angles between the planes can be calculated using the property $$ \cos \theta = \frac{l_1^T \omega^* l_2}{\sqrt{l_1^T \omega^* l_1} \sqrt{l_2^T \omega^* l_2}} $$ where $l_i$ is the vanishing line of plane i and $\omega^* = K K^T$. Since the given planes are rectangular, their perimeter lines can be used to calculate vanishing points, which are then joined to give each plane's vanishing line using the functions described previously.
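
The homography and calibration steps above can be sketched as follows. The names `normalize`, `homography` and `K_from_homographies` are hypothetical and are not the notebook's computeH() or computeH_norm(). The sketch assumes each homography maps the metric square to the image, which is the direction in which the two constraints hold; a homography estimated in the opposite direction would be inverted first.

```python
import numpy as np

def normalize(pts):
    """Similarity that moves the centroid to the origin and scales the
    farthest point to distance sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2) / np.max(np.linalg.norm(pts - c, axis=1))
    return np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])

def homography(src, dst):
    """Normalized DLT: H with dst ~ H @ src, for (n, 2) arrays, n >= 4."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    T1, T2 = normalize(src), normalize(dst)
    hom = lambda p, T: (T @ np.column_stack([p, np.ones(len(p))]).T).T
    A = []
    for (x, y, _), (u, v, _) in zip(hom(src, T1), hom(dst, T2)):
        # Two equations per correspondence in the 9 entries of H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(A))
    Hn = Vt[-1].reshape(3, 3)
    return np.linalg.inv(T2) @ Hn @ T1   # undo the normalization

def K_from_homographies(Hs):
    """Intrinsics from >= 3 square-to-image homographies (no skew assumption)."""
    def row(a, b):
        # Coefficients of (w11, w12, w13, w22, w23, w33) in a^T omega b.
        return np.array([a[0] * b[0], a[0] * b[1] + a[1] * b[0],
                         a[0] * b[2] + a[2] * b[0], a[1] * b[1],
                         a[1] * b[2] + a[2] * b[1], a[2] * b[2]])
    A = []
    for H in Hs:
        h1, h2 = H[:, 0], H[:, 1]
        A.append(row(h1, h2))                 # h1^T omega h2 = 0
        A.append(row(h1, h1) - row(h2, h2))   # h1^T omega h1 = h2^T omega h2
    _, _, Vt = np.linalg.svd(np.array(A))
    w = Vt[-1]
    omega = np.array([[w[0], w[1], w[2]],
                      [w[1], w[3], w[4]],
                      [w[2], w[4], w[5]]])
    if omega[0, 0] < 0:                       # fix the sign ambiguity
        omega = -omega
    L = np.linalg.cholesky(omega)             # omega = L L^T, L = K^{-T}
    K = np.linalg.inv(L).T
    return K / K[2, 2]
```

Because all six elements of $\omega$ are estimated here, the recovered K can contain a small nonzero skew term, unlike the vanishing-point method of Part A.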

Part B Results:

  • Metric Planes Method:

    $$ K = \begin{pmatrix} 1083.725 & -11.7859 & 517.3594 \\ 0 & 1078.2298 & 398.2625 \\ 0 & 0 & 1 \end{pmatrix} $$

    Angle between planes (deg):
    Plane 1 and 2: 67.50040549736326
    Plane 1 and 3: 87.76621056186771
    Plane 2 and 3: 94.79064336880757
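
For reference, angles like those listed above can be computed along the following lines. This is a sketch only: `vanishing_line` and `plane_angle_deg` are hypothetical helper names, and the rectangle corners are assumed to be given in order around the perimeter.

```python
import numpy as np

def vanishing_line(corners):
    """Vanishing line of a plane from the 4 image corners of a rectangle
    lying on it, listed in order around the perimeter."""
    p = [np.array([x, y, 1.0]) for x, y in corners]
    # Opposite sides are parallel in the world, so each pair of opposite
    # sides meets at a vanishing point.
    v1 = np.cross(np.cross(p[0], p[1]), np.cross(p[3], p[2]))
    v2 = np.cross(np.cross(p[1], p[2]), np.cross(p[0], p[3]))
    # The vanishing line joins the two vanishing points.
    return np.cross(v1, v2)

def plane_angle_deg(l1, l2, K):
    """Angle in degrees between two planes with vanishing lines l1 and l2."""
    omega_star = K @ K.T   # dual image of the absolute conic
    c = (l1 @ omega_star @ l2) / np.sqrt(
        (l1 @ omega_star @ l1) * (l2 @ omega_star @ l2))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))
```

Since a homogeneous line is only defined up to sign, the value returned is either the angle between the plane normals or its supplement, which is why a result slightly above 90 degrees can appear for a pair of nearly perpendicular planes.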

🔗 Q3: Single View Reconstruction (30 points)

Objective:
The goal is to reconstruct a colored point cloud from a single image.

Algorithm Overview

  1. Loading Annotation Data:

    • The function get_point_data(image) loads pre-captured annotation data from a .npy file, containing points corresponding to lines in the image. These points are used to calculate the lines in the image that should be parallel.
  2. Calculating Lines:

    • The function calc_lines(points) computes the line through each pair of annotated points as the cross product of their homogeneous coordinates. These lines are then intersected to find the vanishing points. Each line is represented as a homogeneous 3-vector $ l = [a, b, c] $, where the equation of the line is: $$ ax + by + c = 0 $$
  3. Finding Intersection Points:

    • The function get_intersect_points(lines) computes the intersections of lines, specifically the vanishing points: the images of the points at infinity where parallel lines meet. Each intersection is the cross product of the two corresponding line vectors.
  4. Computing Omega and K:

    • Lines that are parallel in the real world appear to converge to a vanishing point due to perspective distortion. The camera intrinsics K can be calculated from the image of the absolute conic, the symmetric 3x3 matrix $ \omega = K^{-T} K^{-1} $, by using the vanishing points of 3 mutually orthogonal directions. Each orthogonal pair gives one linear constraint: $ v_1^T \omega v_2 = 0 $, $ v_1^T \omega v_3 = 0 $, $ v_2^T \omega v_3 = 0 $. Only 3 constraints are needed because zero skew and square pixels were assumed, which provides 2 additional constraints and leaves 4 unknown elements of $ \omega $, defined up to scale. Stacking the constraints gives $ A w = 0 $, where $ w $ is the vector of the unknown elements of $ \omega $ and A is constructed from the 3 constraints. SVD is then used to get $ w $ and hence $ \omega $. Once $ \omega $ is calculated, K can be obtained from the Cholesky decomposition $ \omega = L L^T $ by taking the inverse transpose of L and scaling so that $ K_{33} = 1 $.
  5. Calculating 3D Points:

    • The 3D points can be calculated by first computing the ray through the camera center and every image point using the property $ r_i = K^{-1} p_i $, where $p_i$ is the ith image point (in homogeneous coordinates) contained in the plane. After all the rays for one plane are computed, one of the rays can be set to an arbitrary reference depth, giving the reference point $r_{ref}$. Selecting a point/ray that is common to multiple planes allows the same reference depth to be used for all of those planes. A point's corresponding 3D point can then be calculated by $ P_i = \frac{d \, r_i}{n \cdot r_i} $, where $r_i$ is the ith ray of the plane, $d = n \cdot r_{ref}$ is the offset in the plane equation $ n \cdot X = d $, and $n$ is the plane normal $ n = K^T l $, where $l$ is the vanishing line of that plane. This can be repeated for all planes in the image to get all 3D points.
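
The back-projection in step 5 can be sketched as below. The function names `reference_point` and `backproject_plane` and their argument layout are assumptions made for illustration; K is assumed to have last row [0, 0, 1], so that each ray has unit third coordinate.

```python
import numpy as np

def reference_point(pixel, K, depth=1.0):
    """3D point on the ray through `pixel`, placed at the chosen depth."""
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    return depth * ray   # ray[2] == 1 when K's last row is [0, 0, 1]

def backproject_plane(pixels, K, vanishing_line, ref_point):
    """3D points for pixels lying on one plane.

    pixels:         (n, 2) image points on the plane
    vanishing_line: homogeneous vanishing line l of the plane
    ref_point:      a known 3D point on the plane (fixes the scale)
    """
    pixels = np.asarray(pixels, dtype=float)
    n = K.T @ vanishing_line                   # plane normal, up to scale
    d = n @ ref_point                          # plane equation: n . X = d
    p = np.column_stack([pixels, np.ones(len(pixels))])
    rays = (np.linalg.inv(K) @ p.T).T          # r_i = K^{-1} p_i
    # Scale each ray so that it meets the plane: P_i = d r_i / (n . r_i)
    return rays * (d / (rays @ n))[:, None]
```

Reusing one `ref_point` that lies on the edge shared by two planes keeps both planes at a consistent scale, which is the reason for choosing a common point as the reference.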

Results:

  • Reconstruction:

    $$ K = \begin{pmatrix} 808.5202 & 0 & 510.7118 \\ 0 & 808.5202 & 363.6361 \\ 0 & 0 & 1 \end{pmatrix} $$

    Angle between planes (deg):
    Plane left and right: 90.0
    Plane left and ground: 89.48754224219331
    Plane right and ground: 93.17281814528123