Geometry-based Methods in Vision

Assignment 2

Q1a:

Q1b:

Q2a:

Description:

1.1 Vanishing Point Calculation

1. Extract Annotated Parallel Lines: Read the coordinates of annotated parallel lines provided in the file data/q2/q2a.npy. Each pair of parallel lines corresponds to one vanishing point.

2. Calculate Vanishing Points: Vanishing points are determined by calculating the intersection of two lines. For each pair of parallel lines l1 and l2, the intersection (vanishing point) can be found using the following formula:

Vanishing Point = l1 × l2

where l1 and l2 are represented as line equations, and × denotes the cross product.
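
A minimal numpy sketch of these two steps (the assumed layout of data/q2/q2a.npy, four endpoints per pair of annotated parallel segments, is a guess and may differ from the actual file):

import numpy as np

def to_homogeneous(p):
    # Append 1 to a 2D pixel coordinate.
    return np.array([p[0], p[1], 1.0])

def vanishing_point(p1, p2, p3, p4):
    # The line through two points is the cross product of their homogeneous coordinates.
    l1 = np.cross(to_homogeneous(p1), to_homogeneous(p2))
    l2 = np.cross(to_homogeneous(p3), to_homogeneous(p4))
    # The intersection of the two lines is again a cross product.
    v = np.cross(l1, l2)
    return v / v[2]                      # normalize so the last coordinate is 1

# Hypothetical layout: annotations[i] holds the four endpoints of the i-th pair of segments.
annotations = np.load("data/q2/q2a.npy", allow_pickle=True)
vps = [vanishing_point(*pair) for pair in annotations]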

1.2 Estimation of the Camera Intrinsic Matrix

1. Principal Point Calculation: Given three orthogonal vanishing points v_x, v_y, v_z, their relationship with the camera intrinsic matrix K is defined as follows:

v_x^T K^{-T} K^{-1} v_y = 0

v_y^T K^{-T} K^{-1} v_z = 0

v_z^T K^{-T} K^{-1} v_x = 0

Using these three orthogonality constraints, the camera intrinsic matrix K can be derived.

2. Solve for K: Under the assumption of zero skew and square pixels, the intrinsic matrix K has the following form:

K =   [ f   0   cx ]
     [ 0   f   cy ]
     [ 0   0   1 ]

where f is the focal length, and (cx, cy) are the principal point coordinates.
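
The three constraints are linear in the entries of ω = K^{-T} K^{-1}, which under the zero-skew, square-pixel assumption has only four unknowns. A minimal sketch of solving for them and recovering K (illustrative only, not necessarily the exact solver used):

import numpy as np

def intrinsics_from_vps(vx, vy, vz):
    # vx, vy, vz: homogeneous vanishing points of three mutually orthogonal directions.
    # With zero skew and square pixels: omega = [[w1, 0, w2], [0, w1, w3], [w2, w3, w4]].
    def row(a, b):
        # Coefficients of v_a^T omega v_b = 0 in (w1, w2, w3, w4).
        return [a[0]*b[0] + a[1]*b[1],
                a[0]*b[2] + a[2]*b[0],
                a[1]*b[2] + a[2]*b[1],
                a[2]*b[2]]
    A = np.array([row(vx, vy), row(vy, vz), row(vz, vx)])
    w = np.linalg.svd(A)[2][-1]              # null vector of the 3x4 system
    omega = np.array([[w[0], 0.0,  w[1]],
                      [0.0,  w[0], w[2]],
                      [w[1], w[2], w[3]]])
    if omega[0, 0] < 0:                      # omega is defined up to sign; make it positive definite
        omega = -omega
    L = np.linalg.cholesky(omega)            # omega = L L^T, so K^{-1} = L^T
    K = np.linalg.inv(L.T)
    return K / K[2, 2]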

1.3 Implementation and Output

2. Derivation of Relevant Equations

1. Line Equations and Vanishing Point:

Given two line equations l1 = (a1, b1, c1) and l2 = (a2, b2, c2), the intersection (vanishing point) can be computed using the cross product:

v = l1 × l2 = ( b1·c2 − c1·b2,  c1·a2 − a1·c2,  a1·b2 − b1·a2 )

2. Orthogonality Relationship Between Vanishing Points:

The orthogonality relationship between vanishing points can be expressed as:

v_x^T ω v_y = 0

where ω = K^{-T} K^{-1} is the image of the absolute conic (IAC), which depends only on the camera intrinsics.

Q2b:

1. Image Annotation and Plane Extraction:

2. Homography Matrix Computation:

3. IAC Constraint Solving and Intrinsic Matrix Calculation:

4. Calculating Angles Between Planes:

5. Final Output:

The final output includes the calculated camera intrinsic matrix K and the computed angles between each pair of planes to verify the accuracy of the calibration.

Relevant Equations

Homography Matrix Constraints: For each plane, the homography H (mapping the metric plane to the image) provides two linear constraints on the IAC ω = K^{-T} K^{-1}:
h1^T ω h2 = 0    and    h1^T ω h1 = h2^T ω h2
where h1 and h2 are the first and second columns of H. Stacking these constraints from all planes gives a linear system whose solution yields ω and hence K (see the sketch after these equations).
Angle Calculation: Taking n_i = h1 × h2 as the vanishing line of plane i, the angle between planes i and j is given by:
cos θ = (n_i^T K K^T n_j) / ( ||K^T n_i|| · ||K^T n_j|| )
This computes the angle between the plane normals K^T n_i and K^T n_j using the recovered intrinsic matrix.
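
A minimal sketch of the calibration and angle computation, assuming the per-plane homographies H have already been estimated from the annotated correspondences (names and structure are illustrative):

import numpy as np

def iac_constraint_rows(H):
    # Two linear constraints on omega per plane: h1^T omega h2 = 0 and
    # h1^T omega h1 - h2^T omega h2 = 0, with omega symmetric (6 unknowns).
    h1, h2 = H[:, 0], H[:, 1]
    def coeffs(a, b):
        return np.array([a[0]*b[0],
                         a[0]*b[1] + a[1]*b[0],
                         a[0]*b[2] + a[2]*b[0],
                         a[1]*b[1],
                         a[1]*b[2] + a[2]*b[1],
                         a[2]*b[2]])
    return np.stack([coeffs(h1, h2), coeffs(h1, h1) - coeffs(h2, h2)])

def calibrate(Hs):
    A = np.concatenate([iac_constraint_rows(H) for H in Hs])
    w = np.linalg.svd(A)[2][-1]
    omega = np.array([[w[0], w[1], w[2]],
                      [w[1], w[3], w[4]],
                      [w[2], w[4], w[5]]])
    if omega[0, 0] < 0:                      # fix the arbitrary sign so omega is positive definite
        omega = -omega
    K = np.linalg.inv(np.linalg.cholesky(omega).T)
    return K / K[2, 2]

def plane_angle(H_i, H_j, K):
    # Vanishing line of each plane, then its normal K^T l; angle between the normals.
    n_i = K.T @ np.cross(H_i[:, 0], H_i[:, 1])
    n_j = K.T @ np.cross(H_j[:, 0], H_j[:, 1])
    c = n_i @ n_j / (np.linalg.norm(n_i) * np.linalg.norm(n_j))
    return np.degrees(np.arccos(np.clip(abs(c), 0.0, 1.0)))   # acute angle in degrees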

Q3a:

Description:

1. Camera Projection Model and Intrinsic Matrix

In single view geometry, all 2D image coordinates are obtained through the projection of 3D spatial points. The projection model is usually expressed in the homogeneous coordinate form:

s · x = K · [R | t] · X

where:

          K = | fx  0   cx |
              | 0   fy  cy |
              | 0   0   1  |

where fx, fy are the focal lengths in the horizontal and vertical directions, and cx, cy are the coordinates of the principal point (the intersection of the optical axis and image plane).
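
For example, with an identity rotation and zero translation (illustrative values only), a 3D point projects as:

import numpy as np

K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
Rt = np.hstack([np.eye(3), np.zeros((3, 1))])    # [R | t] with R = I, t = 0
X = np.array([0.2, -0.1, 2.0, 1.0])              # homogeneous 3D point
x = K @ Rt @ X                                   # s * (u, v, 1)
u, v = x[:2] / x[2]                              # pixel coordinates (370.0, 215.0)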

2. Vanishing Points and Linear Constraints

Parallel lines in the real world will converge at a point in the image, called a vanishing point. The vanishing point is computed based on the intersection of line equations. Given two line equations:

l1 = (a1, b1, c1)    and    l2 = (a2, b2, c2)

The intersection can be computed using:

v = l1 × l2

where × denotes the cross product.

3. Ray-Plane Intersection

For each pixel in the image, the corresponding spatial ray direction can be derived using the inverse of the intrinsic matrix:

r = K^{-1} · x

where r is the direction vector of the ray and x is the homogeneous pixel coordinate. With the camera at the origin, points along the ray are X = t · r; intersecting with a plane n · X = d gives t = d / (n · r), and hence the 3D point.
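
A small sketch of the back-projection under these assumptions (plane given by its normal n and offset d, names illustrative):

import numpy as np

def backproject(K, u, v, n, d):
    # Ray direction through pixel (u, v); the camera center is at the origin.
    r = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Points on the ray are X = t * r; substituting into n . X = d gives t.
    t = d / (n @ r)
    return t * r                                  # 3D point on the plane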

4. Plane Equation and Normal Vector Calculation

If multiple points on a plane in the image are known, the plane equation can be derived. Given three non-collinear points (P1, P2, P3), the normal vector n can be calculated using the cross product:

n = (P2 − P1) × (P3 − P1)
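
For instance, with three back-projected points (hypothetical values), the plane parameters needed for the ray intersection above follow directly:

import numpy as np

P1 = np.array([0.0, 0.0, 2.0])
P2 = np.array([1.0, 0.0, 2.0])
P3 = np.array([0.0, 1.0, 2.5])
n = np.cross(P2 - P1, P3 - P1)                    # plane normal
d = n @ P1                                        # plane offset, so the plane is n . X = d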

5. Color Mapping and 3D Point Cloud Generation

Each 3D point can be assigned a color value based on its corresponding 2D pixel in the image. For a point in the image at (x, y), the color value C is given by:

C = I(y, x)
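
A sketch of the color lookup when assembling the point cloud; the array names and shapes (image I of shape H x W x 3, matching (N, 2) pixel and (N, 3) point arrays) are assumptions:

import numpy as np

def colored_point_cloud(I, points_2d, points_3d):
    xs = np.round(points_2d[:, 0]).astype(int)
    ys = np.round(points_2d[:, 1]).astype(int)
    colors = I[ys, xs]                            # note the (row, column) = (y, x) indexing
    return np.hstack([points_3d, colors])         # (N, 6): XYZ followed by RGB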

6. View Transformation and 3D Visualization

Once the 3D point cloud is generated, different view transformations can be applied to visualize the 3D structure from various angles. A view transformation is typically expressed as a rigid-body transformation matrix that changes the observation perspective:

          V = | R  t |
              | 0  1 |
        

where R is a rotation matrix and t is a translation vector. By applying V to the 3D point cloud, different views of the 3D image can be obtained.
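
A minimal sketch of applying such a view transformation to an (N, 3) point cloud; the particular rotation and translation are arbitrary example values:

import numpy as np

theta = np.radians(30.0)                          # example: 30-degree rotation about the y-axis
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.0, 0.0, 0.5])                     # example translation

V = np.eye(4)
V[:3, :3], V[:3, 3] = R, t

def transform(points, V):
    # Convert to homogeneous coordinates, apply V, and drop the last coordinate.
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (homog @ V.T)[:, :3]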