16-822 Assignment 1 🎯¶

📅 Due Date: 09/24/2024
👤 Name: Nicholas Beach (nbeach)


🖼️ Q1: Affine Rectification (30 points)¶

Objective:
The goal of this task is to generate affinely correct warps for images captured through perspective cameras, assuming a pinhole camera model. You will be using annotations of at least 2 pairs of parallel lines to accomplish this.


Algorithm Description¶

  1. Loading Annotation Data:

    • The function get_point_data(image) loads pre-captured annotation data from a .npy file, containing points corresponding to lines in the image. These points are used to calculate the lines in the image that should be parallel.
  2. Calculating Lines:

    • The function calc_lines(points) computes the line equation for each pair of points using the cross product. These lines are then used for further calculations. Each line is represented as a 3D vector $ l = [a, b, c] $, where the equation of the line is: $$ ax + by + c = 0 $$
  3. Finding Intersection Points:

    • The function get_intersect_points(lines) computes the intersection of lines, specifically looking for points at infinity where parallel lines meet. The intersection is determined using the cross product of the corresponding line equations.
  4. Affine Transformation Matrix:

    • Parallel lines in the real world appear to converge in the image due to perspective distortion; a homography that maps the imaged vanishing line back to its canonical position $ (0, 0, 1)^T $ restores their parallelism, yielding an affinely correct image.
    • The intersection points of the lines at infinity are used to compute the horizon line, which is the vanishing line. The affine transformation matrix $ H $ is then calculated: $$ H = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ l_x & l_y & l_z \end{bmatrix} $$ where $ l_x $, $ l_y $, and $ l_z $ are the components of the vanishing line computed from the points at infinity.
  5. Rectifying the Image:

    • The affine transformation matrix is applied to the image to remove perspective distortion and correct parallel lines using MyWarp(image, H). This step aligns parallel lines that previously appeared to converge.
  6. Calculating Angles:

    • The function calc_angle(points, H) computes the angles between the parallel lines before and after rectification. The dot product is used to calculate the angle between two lines $ l_1 $ and $ l_2 $: $$ \cos(\theta) = \frac{{l_1 \cdot l_2}}{{\|l_1\| \|l_2\|}} $$ where $ \|l_1\| $ and $ \|l_2\| $ are the norms of the line vectors. This comparison shows how the affine transformation has affected the angle.
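The core of steps 2–4 can be sketched in a few lines of NumPy (a minimal sketch, not the assignment's exact code: the N×2 point array grouped into consecutive pairs, and the helper names `affine_H` and `cos_angle`, are assumptions):

```python
import numpy as np

def calc_lines(points):
    """Line through each consecutive pair of annotated points
    (cross product of the points in homogeneous coordinates)."""
    pts = np.hstack([points, np.ones((len(points), 1))])
    return np.array([np.cross(pts[i], pts[i + 1]) for i in range(0, len(pts), 2)])

def affine_H(lines):
    """Affine-rectifying homography from two pairs of imaged parallel lines:
    lines[0], lines[1] are one pair; lines[2], lines[3] the other."""
    p1 = np.cross(lines[0], lines[1])   # point at infinity of the first pair
    p2 = np.cross(lines[2], lines[3])   # point at infinity of the second pair
    l_inf = np.cross(p1, p2)            # imaged line at infinity (vanishing line)
    H = np.eye(3)
    H[2] = l_inf / l_inf[2]             # last row = (l_x, l_y, l_z), scaled so l_z = 1
    return H

def cos_angle(l1, l2):
    """Cosine of the angle between two lines from their (a, b) components."""
    n1, n2 = l1[:2], l2[:2]
    return float(n1 @ n2 / (np.linalg.norm(n1) * np.linalg.norm(n2)))
```

Applying `affine_H` to the image (e.g. via MyWarp) maps the vanishing line back to $(0, 0, 1)^T$, so the annotated pairs become parallel again.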

Results:¶

  • Chess:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.98784708 | 0.99977038 |
    | 0.98459413 | 0.99999785 |

  • Book1:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.99998958 | 0.99999344 |
    | 0.95998441 | 0.99529896 |

  • Tiles3:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.99284 | 0.99997443 |
    | 0.99585623 | 0.99955085 |

  • Tiles5:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.98683276 | 0.9999252 |
    | 0.99871911 | 0.99993835 |

  • Checker:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.89644481 | 0.99999437 |
    | 0.95745224 | 0.99997294 |

  • Facade:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.78423243 | 0.99997814 |
    | 0.99999702 | 0.99998694 |

  • Poster:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.99968551 | 0.99997965 |
    | 0.99995754 | 0.99994058 |

  • Floor:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.9856242 | 0.99979708 |
    | 0.9934574 | 0.99999959 |

📐 Q2: Metric Rectification (40 points)¶

Objective:
Building on your result from Q1, the objective here is to generate metrically correct warps for images captured through perspective cameras, again assuming a pinhole camera model. In this case, you will need annotations of at least 2 pairs of perpendicular lines.


Algorithm Description¶

1. Overview¶

This implementation performs affine and metric rectification of images. The algorithm utilizes user-annotated points on an image to identify lines and calculates transformations to rectify images with respect to affine and metric distortions.

2. Affine Rectification¶

Affine rectification removes the projective component of the distortion, leaving an image that is correct up to an affine transformation: lines that are parallel in the real world (but appear to converge under perspective) become parallel again, although angles and length ratios may still be distorted.

2.1 Key Steps¶
  1. Load Image and Annotations:

    • Images are loaded along with corresponding annotated points. These points mark features like parallel and perpendicular lines on the object in the image.
  2. Compute Lines from Points:

    • Lines are computed using pairs of annotated points.
    • For two points $ p_1 = (x_1, y_1, 1) $ and $ p_2 = (x_2, y_2, 1) $, the line equation is calculated as:

      $$ l = p_1 \times p_2 $$

  3. Compute Intersection Points:

    • Lines that are parallel in the real world will intersect at infinity. The intersections of pairs of parallel lines are computed using the cross product.
    • The line at infinity $ l_{\infty} $ is computed by taking the cross product of two intersection points at infinity.
  4. Affine Transformation:

    • The affine transformation matrix $ H_{affine} $ is calculated as:

      $$ H_{affine} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ l_1 & l_2 & l_3 \end{bmatrix}, \quad \text{where } l_{\infty} = (l_1, l_2, l_3)^T $$

    • The transformation is then applied to the image to rectify it.

3. Metric Rectification¶

Metric rectification goes further and removes the remaining affine distortion, recovering the image up to a similarity so that lines perpendicular in the real world appear perpendicular in the image.

3.1 Key Steps¶
  1. Compute Orthogonal Lines:

    • Orthogonal lines are annotated in the image, and their equations are computed from the annotated points.
  2. Construct the $ A $ Matrix:

    • From the orthogonal lines, the matrix $ A $ is constructed, which relates the elements of the conic at infinity. For two lines $ l_1 = (a_1, b_1, c_1) $ and $ l_2 = (a_2, b_2, c_2) $, the constraints on orthogonality give:

      $$ A = \begin{bmatrix} a_1a_2 & a_1b_2 + a_2b_1 & b_1b_2 \end{bmatrix} $$ with one such row for each annotated pair of orthogonal lines.

  3. Metric Rectification Transformation:

    • The conic at infinity is estimated using Singular Value Decomposition (SVD) on the matrix $ A $.
    • The metric rectification matrix $ H_{metric} $ is then computed, which transforms the image to ensure perpendicularity of lines.
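A compact sketch of steps 2–3 (hedged: it assumes the image has already been affine-rectified, so the conic at infinity reduces to the three unknowns $(s_1, s_2, s_3)$; the combined function below is an illustration, not the assignment's exact create_a_matrix/metric_rect interface):

```python
import numpy as np

def metric_rect(lines):
    """Metric-rectifying H from two pairs of perpendicular lines, given an
    already affine-rectified image. lines[2k], lines[2k+1] form a pair."""
    # one orthogonality constraint row per pair: l1^T C_inf l2 = 0
    A = np.array([[l1[0] * l2[0], l1[0] * l2[1] + l1[1] * l2[0], l1[1] * l2[1]]
                  for l1, l2 in zip(lines[0::2], lines[1::2])])
    s = np.linalg.svd(A)[2][-1]                  # null vector (s1, s2, s3)
    S = np.array([[s[0], s[1]], [s[1], s[2]]])   # 2x2 block of the conic
    if np.trace(S) < 0:                          # the block must be positive definite;
        S = -S                                   # the null vector's sign is arbitrary
    vals, vecs = np.linalg.eigh(S)
    K = vecs @ np.diag(np.sqrt(vals))            # S = K K^T
    H = np.eye(3)
    H[:2, :2] = np.linalg.inv(K)                 # maps the conic back to diag(1, 1, 0)
    return H
```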
4. Implementation¶
4.1 Functions¶
  • get_point_data(image, case): Returns the annotated points for an image, depending on the source (case).
  • calc_lines(points): Computes the lines from a list of points using the cross product.
  • get_intersect_points(lines): Calculates the intersection points of lines using the cross product.
  • metric_rect(A): Computes the metric rectification matrix $ H_{metric} $ using SVD.
  • calc_angle(points, H): Computes and compares the angles between lines before and after rectification.
  • draw_lines(image, points): Draws lines on an image based on annotated points.
  • output(im, image_name, case): Performs the complete affine and metric rectification process on an image.
4.2 Key Equations¶
  1. Line Calculation: $$ l = p_1 \times p_2 $$

  2. Affine Transformation: $$ H_{affine} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ l_1 & l_2 & l_3 \end{bmatrix} $$ where $ l_{\infty} = (l_1, l_2, l_3)^T $

  3. Metric Rectification: $ H_{metric} $ is built from the SVD-based estimate of the conic at infinity.

4.3 Steps¶
  1. Load and annotate the image.
  2. Perform affine rectification using parallel lines.
  3. Perform metric rectification to restore perpendicularity.
  4. Visualize the rectified image.
  5. Measure and compare angles before and after rectification to verify correctness.

Results:¶

  • Chess:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.66947617 | -0.0449211 |
    | -0.0472401 | -0.0982218 |

  • Book1:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | -0.11753991 | -0.00898161 |
    | 0.16034301 | -0.0493737 |

  • Tiles3:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.14789988 | -0.03999051 |
    | 0.39741012 | 0.00615781 |

  • Tiles5:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.16544187 | 0.1032251 |
    | 0.02765608 | -0.04071392 |

  • Checker:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | -0.25063073 | 0.03751169 |
    | 0.08556568 | 0.09906659 |

  • Facade:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | -0.18919012 | -0.20661038 |
    | -0.10552677 | 0.21458487 |

  • Poster:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | -0.45022493 | -0.83774631 |
    | -0.13841859 | -0.83949455 |

  • Floor:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | -0.06977927 | -0.11081206 |
    | -0.50008206 | 0.12119253 |

🔗 Q3: Planar Homography from Point Correspondences (30 points)¶

Objective:
Your task is to estimate homographies between two images using point correspondences.

Algorithm Overview¶

  1. Load Images:
    The images are loaded using cv2.imread(), and their color channels are adjusted to match standard RGB format (from BGR used by OpenCV).

  2. Annotating Points:
    Perspective images are annotated using four points. These points represent the corners of an object or region in the perspective view (pers_points). These annotations are pre-saved in the q3_captured.npy file and are loaded using the get_point_data() function.

  3. Homography Calculation:
    Homography is computed to transform points from the perspective image (pers_points) to corresponding points in the normal image (norm_points), using the computeH() and computeH_norm() functions.

    Homography Matrix (H):
    The matrix is computed by solving the linear system built from the point correspondences using Singular Value Decomposition (SVD): $$ A \mathbf{h} = 0 $$ where $A$ stacks two constraint rows per correspondence and $\mathbf{h}$ holds the nine entries of $H$ as a vector. The right singular vector associated with the smallest singular value of $A$, reshaped to $3 \times 3$, gives the homography.

    Normalization:
    Before computing the homography, the points are normalized using translation and scaling transformations to improve numerical stability. The normalization includes:

    • Translating the points so their centroid lies at the origin.
    • Scaling the points so that the farthest point lies at a distance of $\sqrt{2}$ from the origin.

    After homography estimation, the result is denormalized to get the final transformation matrix.

  4. Warping and Composition:
    The normal image is warped to match the perspective image using the computed homography. This is achieved by applying the homography matrix to the normal image. The warped image is then combined with the perspective image to create a composite image. A mask is created to blend the warped template into the perspective image using the compositeH() function.
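The DLT-plus-normalization pipeline described in step 3 might look like the following (a self-contained sketch under the normalization convention given above, centroid to the origin and farthest point at distance $\sqrt{2}$; not necessarily the exact computeH/computeH_norm code):

```python
import numpy as np

def computeH(x1, x2):
    """DLT: find H with x1 ~ H x2 from N >= 4 correspondences (N x 2 arrays)."""
    A = []
    for (x, y), (u, v) in zip(x1, x2):
        A.append([-u, -v, -1, 0, 0, 0, u * x, v * x, x])
        A.append([0, 0, 0, -u, -v, -1, u * y, v * y, y])
    # the right singular vector of the smallest singular value solves A h = 0
    h = np.linalg.svd(np.array(A))[2][-1]
    return h.reshape(3, 3)

def computeH_norm(x1, x2):
    """Normalize both point sets (centroid to origin, farthest point at
    sqrt(2)), run the DLT, then denormalize: H = T1^-1 Hn T2."""
    def T_of(pts):
        c = pts.mean(axis=0)
        s = np.sqrt(2) / np.linalg.norm(pts - c, axis=1).max()
        return np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])
    T1, T2 = T_of(x1), T_of(x2)
    n1 = (np.hstack([x1, np.ones((len(x1), 1))]) @ T1.T)[:, :2]
    n2 = (np.hstack([x2, np.ones((len(x2), 1))]) @ T2.T)[:, :2]
    return np.linalg.inv(T1) @ computeH(n1, n2) @ T2
```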

Key Equations¶

  • Homography Estimation:
    The homography matrix $H$ relates two sets of points ($\mathbf{x}_1$ and $\mathbf{x}_2$) using the transformation: $$ \mathbf{x}_2 = H \mathbf{x}_1 $$ The system is solved by minimizing the residual in the linear system: $$ A \mathbf{h} = 0 $$

  • Normalization:
    The points are normalized by translating them to the centroid and scaling them using: $$ \mathbf{x}_{\text{normalized}} = \frac{\mathbf{x} - \mathbf{x}_{\text{centroid}}}{\text{max distance from origin}} \times \sqrt{2} $$

  • Denormalization:
    After computing the homography, it is denormalized by applying inverse scaling and translation transformations.

Annotations Used¶

  • Perspective Image Points: Four annotated points on the perspective image, stored in q3_captured.npy, are used to compute the homography matrix.
  • Normal Image Points: The corners of the normal image are assumed to be at coordinates $(0, 0)$, $(x, 0)$, $(x, y)$, and $(0, y)$, where $x$ is the width and $y$ the height of the normal image.
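For concreteness, those corner coordinates can be generated straight from the image shape (a tiny sketch; the corner ordering shown is an assumption and must match the order in which pers_points were annotated):

```python
import numpy as np

def normal_corners(norm_image):
    """Corners of the normal image, ordered (0,0) -> (w,0) -> (w,h) -> (0,h);
    this clockwise ordering is an assumption of the sketch."""
    h, w = norm_image.shape[:2]
    return np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=float)
```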

Results:¶

  • CV Book onto Desk: (result image)

  • Star Wars Poster onto Wean Hall: (result image)


📏 Q4: Bonus: Metric Rectification from Perpendicular Lines (10 points)¶


1. Loading and Annotating Image Data:¶

  • get_point_data(image, case): Loads pre-saved annotation data (key points from the image) from files q4_captured.npy or ec_q4.npy, depending on the case argument.
  • load_image(i, case): Loads the image using cv2.imread() and swaps the red and blue channels for correct visualization since OpenCV uses BGR instead of RGB.

2. Calculation of Lines:¶

  • Cross Product to Define Lines:
    • calc_lines(points) calculates lines passing through pairs of points (annotations) using the cross product of two points in homogeneous coordinates.
    • The equation for a line $ l $ through two points $ p_1 $ and $ p_2 $ in homogeneous coordinates is: $$ l = p_1 \times p_2 $$
  • The lines are then stored for further use in metric rectification.

3. Building the Constraint Matrix:¶

  • Creating the Matrix for Metric Rectification:
    • create_a_matrix(lines) builds a matrix $ A $, used to estimate the conic at infinity $ C_{\infty} $, which encodes the orthogonality constraints between pairs of lines.
    • The structure of this matrix is based on constraints derived from: $$ l_1^T C_{\infty} l_2 = 0 $$ where $ l_1 $ and $ l_2 $ are supposed to be orthogonal.

4. Metric Rectification:¶

  • Singular Value Decomposition (SVD):
    • metric_rect(A) performs SVD on the matrix $ A $, extracting the last column of $ V $, which gives the parameters of the conic $ C_{\infty} $.
    • It constructs a homography $ H_{\text{met}} $ for metric rectification, which transforms the image to correct for projective distortions, ensuring that perpendicular lines in the scene remain perpendicular in the rectified image.

5. Drawing Lines and Visualization:¶

  • Annotation:
    • draw_lines(rec_image, affine_points, size) draws lines on the image between pairs of points, using random colors from the predefined list for visualization.
  • Plotting:
    • The function output() generates a visual comparison of the original image, annotated image with perpendicular lines, and the final rectified image using MyWarp() to apply the homography.

6. Angle Calculation:¶

  • Dot Product:
    • calc_angle(points, H) calculates the angle between lines before and after rectification using the dot product: $$ \cos(\theta) = \frac{l_1 \cdot l_2}{\|l_1\| \|l_2\|} $$
    • This shows how well the rectification has corrected the angles.

7. Main Function:¶

  • The program processes all images in the directory data/q1, applies metric rectification, and displays the results.
  • Finally, the program also processes the image 'Floor.jpg', a specific test case.

Equations Used:¶

  1. Line through two points: $$ l = p_1 \times p_2 $$
  2. Conic at infinity constraint: $$ l_1^T C_{\infty} l_2 = 0 $$
  3. SVD decomposition: $$ A = U \Sigma V^T $$
  4. Homography for rectification: $$ H_{\text{met}} = \text{diag}\left(\frac{1}{\sqrt{s_1}}, \frac{1}{\sqrt{s_2}}, 1\right) U_h^T $$ where $ C_{\infty} = U_h \, \text{diag}(s_1, s_2, s_3) \, U_h^T $ and $ s_3 \approx 0 $.
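Putting equations 1–4 together, the pipeline from perpendicular line pairs to $ H_{\text{met}} $ can be sketched as follows (a hedged sketch with an assumed function name; it estimates all six conic parameters, so at least five line pairs are required):

```python
import numpy as np

def metric_from_perp(lines):
    """Metric rectification directly from >= 5 pairs of perpendicular lines.
    Estimates the full conic at infinity, then builds H from its
    eigendecomposition. lines[2k], lines[2k+1] form a pair."""
    A = []
    for l1, l2 in zip(lines[0::2], lines[1::2]):
        a1, b1, c1 = l1
        a2, b2, c2 = l2
        A.append([a1 * a2, (a1 * b2 + a2 * b1) / 2, b1 * b2,
                  (a1 * c2 + a2 * c1) / 2, (b1 * c2 + b2 * c1) / 2, c1 * c2])
    s = np.linalg.svd(np.array(A))[2][-1]        # conic parameters (a,b,c,d,e,f)
    C = np.array([[s[0], s[1] / 2, s[3] / 2],
                  [s[1] / 2, s[2], s[4] / 2],
                  [s[3] / 2, s[4] / 2, s[5]]])
    if np.trace(C) < 0:                          # the conic is PSD of rank 2;
        C = -C                                   # the null vector's sign is arbitrary
    U, sig, _ = np.linalg.svd(C)                 # C = U diag(s1, s2, ~0) U^T
    return np.diag([1 / np.sqrt(sig[0]), 1 / np.sqrt(sig[1]), 1]) @ U.T
```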

Annotations:¶

  • The annotated image shows the rectified lines, while the angles between lines before and after rectification are printed as a quality measure of the rectification process.

Results:¶

  • Tiles3:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | -0.1921958 | 0.223017571 |
    | -0.050748446 | 0.267158122 |
    | -0.106192250 | -0.33064616 |
    | 0.2468485559 | -0.25417146 |
    | 0.1591520744 | -0.30023704 |
    | -0.040209134 | 0.372900133 |

  • Tiles5:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.18134901 | -0.00668417 |
    | -0.2008803 | -0.06464913 |
    | -0.0814482 | 0.005642101 |
    | 0.03668181 | -0.01543135 |
    | 0.00375937 | -0.01323422 |
    | 0.17078672 | 0.023707298 |

  • Floor:

    | cos(θ) before | cos(θ) after |
    | --- | --- |
    | 0.04133108 | 0.039819726 |
    | -0.0590544 | 0.051731302 |
    | 0.45674996 | 0.167035787 |
    | -0.2862859 | -0.00255669 |
    | 0.33015491 | 0.072987101 |
    | -0.0592635 | 0.051615821 |

📏 Q5: Bonus: More Planar Homography from Point Correspondences (10 points)¶


Implementation Description¶

This implementation focuses on a series of image overlays using homography transformations. The goal is to blend multiple images into a single composite output by aligning perspective images with normal images based on computed corner points.

Key Functions¶

  1. Data Loading:

    • get_point_data(image): Loads corner coordinates from a NumPy file containing the annotations for the perspective images.
  2. Image Loading:

    • load_image(i, case): Reads an image and swaps the red and blue channels for proper color representation.
  3. Homography Computation:

    • computeH(x1, x2): Computes the homography matrix $ H $ that maps points from one image to another using the Direct Linear Transformation (DLT).
    • computeH_norm(x1, x2): Normalizes the point sets by centering and scaling before computing the homography: $$ H = T_1^{-1} H_{\text{DLT}} T_2 $$
  4. Composite Image Creation:

    • compositeH(H2to1, template, img): Warps a template image according to the computed homography and overlays it onto the original image using a mask to manage the blending.
  5. Overlaying Images:

    • overlay(pers_image, norm_image, pers_points, custom): This function performs a series of overlays:
      • Loads the perspective and normal images.
      • Draws the corners on the perspective image for visualization.
      • Computes the homography to align the perspective image with the normal image.
      • Creates a composite image that blends the warped template and the original image.
      • This process is repeated for multiple normal images, resulting in a cumulative overlay effect.
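The warp-and-mask step of compositeH can be sketched without OpenCV using inverse mapping (a simplified nearest-neighbour version for illustration; the actual implementation presumably uses cv2.warpPerspective plus an explicit mask):

```python
import numpy as np

def compositeH(H2to1, template, img):
    """Warp `template` into img's frame with H2to1 (template -> img), then
    paste it over img wherever the warp lands; the pasted region doubles
    as the blending mask."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # (3, h*w) pixel grid
    src = np.linalg.inv(H2to1) @ dest                          # inverse-map into template
    src = (src[:2] / src[2]).round().astype(int)               # nearest-neighbour lookup
    th, tw = template.shape[:2]
    valid = (src[0] >= 0) & (src[0] < tw) & (src[1] >= 0) & (src[1] < th)
    out = img.copy().reshape(h * w, -1)
    out[valid] = template.reshape(th * tw, -1)[src[1, valid] * tw + src[0, valid]]
    return out.reshape(img.shape)
```

Repeating this call with each normal image, as `overlay` does, accumulates the composites into the final output.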

Process Flow¶

  • The sequence begins with a specified perspective image (e.g., a billboard) and a series of normal images (e.g., different backgrounds).
  • Each overlay step uses the corners of the perspective image to align it with the corresponding normal image.
  • The final output is a richly blended image that integrates elements from all normal images into the perspective view.

Final Output¶

  • The end result is a composite image created by successively overlaying the perspective image with multiple normal images, producing a cohesive visual effect.

Results:¶

  • Board 1: (result image)

  • Board 2: (result image)

  • Board 3: (result image)

  • Final Output: (result image)