📅 Due Date: 09/24/2024
👤 Name: Nicholas Beach (nbeach)
Objective:
The goal of this task is to generate affinely correct warps for images captured through perspective cameras, assuming a pinhole camera model. You will be using annotations of at least 2 pairs of parallel lines to accomplish this.
Loading Annotation Data:
get_point_data(image)
loads pre-captured annotation data from a .npy file, containing points corresponding to lines in the image. These points are used to calculate the lines in the image that should be parallel.
Calculating Lines:
calc_lines(points)
computes the line equation for each pair of points using the cross product. These lines are then used for further calculations. Each line is represented as a 3D vector $ l = [a, b, c] $, where the equation of the line is:
$$
ax + by + c = 0
$$
Finding Intersection Points:
get_intersect_points(lines)
computes the intersection of lines, specifically the points at infinity where world-parallel lines meet. The intersection is determined using the cross product of the corresponding line equations.
Affine Transformation Matrix:
The line through these intersection points is the imaged line at infinity, which forms the last row of the affine rectification homography $ H_{affine} $.
Rectifying the Image:
MyWarp(image, H)
applies the computed homography to the image. This step aligns parallel lines that previously appeared to converge.
Calculating Angles:
calc_angle(points, H)
computes the angles between the parallel lines before and after rectification. The dot product is used to calculate the angle between two lines $ l_1 $ and $ l_2 $:
$$
\cos(\theta) = \frac{{l_1 \cdot l_2}}{{\|l_1\| \|l_2\|}}
$$
where $ \|l_1\| $ and $ \|l_2\| $ are the norms of the line vectors. This comparison shows how the affine transformation has affected the angles.
Chess:
cos(θ) Before | cos(θ) After |
---|---|
0.98784708 | 0.99977038 |
0.98459413 | 0.99999785 |
Book1:
cos(θ) Before | cos(θ) After |
---|---|
0.99998958 | 0.99999344 |
0.95998441 | 0.99529896 |
Tiles3:
cos(θ) Before | cos(θ) After |
---|---|
0.99284 | 0.99997443 |
0.99585623 | 0.99955085 |
Tiles5:
cos(θ) Before | cos(θ) After |
---|---|
0.98683276 | 0.9999252 |
0.99871911 | 0.99993835 |
Checker:
cos(θ) Before | cos(θ) After |
---|---|
0.89644481 | 0.99999437 |
0.95745224 | 0.99997294 |
Facade:
cos(θ) Before | cos(θ) After |
---|---|
0.78423243 | 0.99997814 |
0.99999702 | 0.99998694 |
Poster:
cos(θ) Before | cos(θ) After |
---|---|
0.99968551 | 0.99997965 |
0.99995754 | 0.99994058 |
Floor:
cos(θ) Before | cos(θ) After |
---|---|
0.9856242 | 0.99979708 |
0.9934574 | 0.99999959 |
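The Q1 pipeline described above (lines from annotated point pairs, vanishing points from their intersections, and $H_{affine}$ from the vanishing line) can be sketched end to end. This is a minimal illustration under the document's conventions; the function and variable names are illustrative, not the actual assignment code:

```python
import numpy as np

def calc_line(p1, p2):
    # Line through two points in homogeneous coordinates (cross product).
    return np.cross(np.append(p1, 1.0), np.append(p2, 1.0))

def affine_rect_H(parallel_pairs):
    # Each entry of parallel_pairs is ((p1, p2), (p3, p4)): two annotated
    # segments that are parallel in the world.  Their imaged lines meet at a
    # vanishing point; the line through two vanishing points is the imaged
    # line at infinity l_inf, which becomes the last row of H_affine.
    vps = []
    for (p1, p2), (p3, p4) in parallel_pairs:
        l, m = calc_line(p1, p2), calc_line(p3, p4)
        vps.append(np.cross(l, m))       # intersection of the two lines
    l_inf = np.cross(vps[0], vps[1])
    l_inf = l_inf / l_inf[2]             # normalize the last entry to 1
    H = np.eye(3)
    H[2] = l_inf
    return H
```

After rectifying with the returned `H`, world-parallel lines intersect at ideal points again (their intersection has a vanishing last coordinate), which is what the cosine tables above measure.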
Objective:
Building on your result from Q1, the objective here is to generate metrically correct warps for images captured through perspective cameras, again assuming a pinhole camera model. In this case, you will need annotations of at least 2 pairs of perpendicular lines.
This implementation performs affine and metric rectification of images. The algorithm utilizes user-annotated points on an image to identify lines and calculates transformations to rectify images with respect to affine and metric distortions.
Affine rectification aims to remove the affine distortions in the image. This involves transforming parallel lines in the real world (which may appear non-parallel due to perspective distortion) back to parallel lines in the image.
Load Image and Annotations:
Compute Lines from Points:
For two points $ p_1 = (x_1, y_1, 1) $ and $ p_2 = (x_2, y_2, 1) $, the line equation is calculated as:
$$ l = p_1 \times p_2 $$
Compute Intersection Points:
Affine Transformation:
The affine transformation matrix $ H_{affine} $ is built from the imaged line at infinity $ l_{\infty} = (l_1, l_2, l_3) $:
$$ H_{affine} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ l_1 & l_2 & l_3 \end{bmatrix} $$
The transformation is then applied to the image to rectify it.
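As a quick numeric check of this construction (the vanishing-line values below are made up for illustration): any point lying on the imaged line at infinity is mapped to an ideal point, i.e. sent back to infinity.

```python
import numpy as np

# Hypothetical imaged line at infinity l_inf = (l1, l2, l3); in practice it
# is fitted from the intersections of the annotated parallel-line pairs.
l_inf = np.array([0.001, 0.002, 1.0])

H_affine = np.eye(3)
H_affine[2] = l_inf  # last row of H_affine carries l_inf

p = np.array([-1000.0, 0.0, 1.0])  # satisfies l_inf . p = 0, so p lies on l_inf
q = H_affine @ p                   # q[2] == 0: an ideal point (at infinity)
```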
Metric rectification goes further to remove perspective distortions, ensuring that perpendicular lines in the real world are perpendicular in the image.
Compute Orthogonal Lines:
Construct the $ A $ Matrix:
From each pair of orthogonal lines, one row of the matrix $ A $ is constructed; these rows constrain the elements of the conic at infinity. For two orthogonal lines $ l_1 = (a_1, b_1, c_1) $ and $ l_2 = (a_2, b_2, c_2) $, the orthogonality constraint gives the row:
$$ A = \begin{bmatrix} a_1a_2 & a_1b_2 + a_2b_1 & b_1b_2 \end{bmatrix} $$
Metric Rectification Transformation:
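The document only states that this step is an SVD-based solution, so the following is a minimal sketch of how `metric_rect` might recover $H_{metric}$ from $A$; the sign fix and the symmetric square-root decomposition are assumptions, not the assignment's exact code:

```python
import numpy as np

def metric_rect(A):
    # The null vector of A gives the parameters (s1, s2, s3) of the 2x2
    # symmetric matrix S encoding the conic at infinity.
    _, _, Vt = np.linalg.svd(A)
    s1, s2, s3 = Vt[-1]
    S = np.array([[s1, s2], [s2, s3]])
    if S[0, 0] < 0:          # fix the overall sign so S is positive definite
        S = -S
    # Decompose S = K K^T via its SVD; K is the residual affine distortion.
    U, sig, _ = np.linalg.svd(S)
    K = U @ np.diag(np.sqrt(sig)) @ U.T
    H = np.eye(3)
    H[:2, :2] = np.linalg.inv(K)  # undo the distortion
    return H
```

Since $S$ is only recovered up to scale (and $K$ up to a rotation), the result preserves angles rather than absolute lengths, which is all metric rectification requires.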
get_point_data(image, case)
: Returns the annotated points for an image, depending on the source (case).
calc_lines(points)
: Computes the lines from a list of points using the cross product.
get_intersect_points(lines)
: Calculates the intersection points of lines using the cross product.
metric_rect(A)
: Computes the metric rectification matrix $ H_{metric} $ using SVD.
calc_angle(points, H)
: Computes and compares the angles between lines before and after rectification.
draw_lines(image, points)
: Draws lines on an image based on annotated points.
output(im, image_name, case)
: Performs the complete affine and metric rectification process on an image.
Line Calculation: $$ l = p_1 \times p_2 $$
Affine Transformation: $$ H_{affine} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ l_1 & l_2 & l_3 \end{bmatrix} $$ where $ l_{\infty} = (l_1, l_2, l_3) $
Metric Rectification: $$ H_{metric} = \text{SVD-based solution from the conic at infinity} $$
Chess:
cos(θ) Before | cos(θ) After |
---|---|
0.66947617 | -0.0449211 |
-0.0472401 | -0.0982218 |
Book1:
cos(θ) Before | cos(θ) After |
---|---|
-0.11753991 | -0.00898161 |
0.16034301 | -0.0493737 |
Tiles3:
cos(θ) Before | cos(θ) After |
---|---|
0.14789988 | -0.03999051 |
0.39741012 | 0.00615781 |
Tiles5:
cos(θ) Before | cos(θ) After |
---|---|
0.16544187 | 0.1032251 |
0.02765608 | -0.04071392 |
Checker:
cos(θ) Before | cos(θ) After |
---|---|
-0.25063073 | 0.03751169 |
0.08556568 | 0.09906659 |
Facade:
cos(θ) Before | cos(θ) After |
---|---|
-0.18919012 | -0.20661038 |
-0.10552677 | 0.21458487 |
Poster:
cos(θ) Before | cos(θ) After |
---|---|
-0.45022493 | -0.83774631 |
-0.13841859 | -0.83949455 |
Floor:
cos(θ) Before | cos(θ) After |
---|---|
-0.06977927 | -0.11081206 |
-0.50008206 | 0.12119253 |
Objective:
Your task is to estimate homographies between two images using point correspondences.
Load Images:
The images are loaded using cv2.imread()
, and their color channels are adjusted to match standard RGB format (from BGR used by OpenCV).
Annotating Points:
Perspective images are annotated using four points. These points represent the corners of an object or region in the perspective view (pers_points
). These annotations are pre-saved in the q3_captured.npy
file and are loaded using the get_point_data()
function.
Homography Calculation:
Homography is computed to transform points from the perspective image (pers_points
) to corresponding points in the normal image (norm_points
), using the computeH()
and computeH_norm()
functions.
Homography Matrix (H
):
The matrix is computed by solving the linear system for point correspondences using Singular Value Decomposition (SVD):
$$
A \mathbf{h} = 0
$$
where $A$ is the matrix encoding the relationships between corresponding points and $\mathbf{h}$ is the flattened homography. After SVD, the right singular vector associated with the smallest singular value gives $\mathbf{h}$, which is reshaped into the $3 \times 3$ matrix $H$.
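As a concrete sketch of this DLT step (a minimal implementation under the stated conventions, not necessarily identical to the assignment's computeH):

```python
import numpy as np

def computeH(x1, x2):
    # DLT sketch: x1 and x2 are (N, 2) arrays of corresponding points and
    # the returned H maps x2 to x1, i.e. x1 ~ H @ x2 in homogeneous terms.
    rows = []
    for (x, y), (u, v) in zip(x1, x2):
        rows.append([-u, -v, -1, 0, 0, 0, u * x, v * x, x])
        rows.append([0, 0, 0, -u, -v, -1, u * y, v * y, y])
    _, _, Vt = np.linalg.svd(np.array(rows))
    # The right singular vector of the smallest singular value solves Ah = 0.
    return Vt[-1].reshape(3, 3)
```

Each correspondence contributes two rows to $A$, so four points already determine the eight degrees of freedom of $H$; extra correspondences are handled in the least-squares sense by the SVD.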
Normalization:
Before computing the homography, the points are normalized using translation and scaling transformations to improve numerical stability. The normalization translates each point set so its centroid is at the origin and scales it so that the maximum distance from the origin is $\sqrt{2}$.
After homography estimation, the result is denormalized to get the final transformation matrix.
Warping and Composition:
The normal image is warped to match the perspective image using the computed homography. This is achieved by applying the homography matrix to the normal image. The warped image is then combined with the perspective image to create a composite image. A mask is created to blend the warped template into the perspective image using the compositeH()
function.
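The warp-and-blend step can be illustrated with a small self-contained sketch. Here a nearest-neighbor inverse warp stands in for cv2.warpPerspective, and the helper names are illustrative rather than the assignment's actual code:

```python
import numpy as np

def warp_nearest(src, H, out_shape):
    # Minimal nearest-neighbor inverse warp; in practice cv2.warpPerspective
    # plays this role.
    out_h, out_w = out_shape
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    back = np.linalg.inv(H) @ dest          # map output pixels back to src
    sx = np.round(back[0] / back[2]).astype(int)
    sy = np.round(back[1] / back[2]).astype(int)
    valid = (0 <= sx) & (sx < src.shape[1]) & (0 <= sy) & (sy < src.shape[0])
    out = np.zeros((out_h, out_w) + src.shape[2:], dtype=src.dtype)
    out[ys.ravel()[valid], xs.ravel()[valid]] = src[sy[valid], sx[valid]]
    return out, valid.reshape(out_h, out_w)

def compositeH(H2to1, template, img):
    # Warp the template into the frame of img, then blend it in with a mask.
    warped, mask = warp_nearest(template, H2to1, img.shape[:2])
    out = img.copy()
    out[mask] = warped[mask]
    return out
```

The mask marks exactly the output pixels that map back inside the template, so only the warped region overwrites the perspective image.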
Homography Estimation:
The homography matrix $H$ relates two sets of points ($\mathbf{x}_1$ and $\mathbf{x}_2$) using the transformation:
$$
\mathbf{x}_2 = H \mathbf{x}_1
$$
The system is solved by minimizing the residual in the linear system:
$$
A \mathbf{h} = 0
$$
Normalization:
The points are normalized by translating them to the centroid and scaling them using:
$$
\mathbf{x}_{\text{normalized}} = \frac{\mathbf{x} - \mathbf{x}_{\text{centroid}}}{\text{max distance from origin}} \times \sqrt{2}
$$
Denormalization:
After computing the homography, it is denormalized by applying inverse scaling and translation transformations.
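Putting the normalization, DLT, and denormalization together, a sketch of what computeH_norm might look like (the details are assumptions, not the assignment's exact code):

```python
import numpy as np

def normalize_points(x):
    # Translate the centroid to the origin and scale so the farthest point
    # lies at distance sqrt(2), matching the normalization formula above.
    x = np.asarray(x, dtype=float)
    centroid = x.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(x - centroid, axis=1).max()
    T = np.array([[scale, 0, -scale * centroid[0]],
                  [0, scale, -scale * centroid[1]],
                  [0, 0, 1]])
    return (x - centroid) * scale, T

def computeH_norm(x1, x2):
    x1n, T1 = normalize_points(x1)
    x2n, T2 = normalize_points(x2)
    rows = []
    for (x, y), (u, v) in zip(x1n, x2n):
        rows.append([-u, -v, -1, 0, 0, 0, u * x, v * x, x])
        rows.append([0, 0, 0, -u, -v, -1, u * y, v * y, y])
    _, _, Vt = np.linalg.svd(np.array(rows))
    Hn = Vt[-1].reshape(3, 3)
    # Denormalize: Hn maps normalized x2 to normalized x1.
    return np.linalg.inv(T1) @ Hn @ T2
```

Since `Hn` relates the normalized coordinates, sandwiching it between $T_1^{-1}$ and $T_2$ recovers the homography in the original pixel coordinates.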
The annotated points, pre-saved in q3_captured.npy, are used to compute the homography matrix.
CV Book onto Desk:
Star Wars Poster onto Wean Hall:
get_point_data(image, case)
: Loads pre-saved annotation data (key points from the image) from files q4_captured.npy or ec_q4.npy, depending on the case argument.
load_image(i, case)
: Loads the image using cv2.imread() and swaps the red and blue channels for correct visualization, since OpenCV uses BGR instead of RGB.
calc_lines(points)
: Calculates lines passing through pairs of annotated points using the cross product of two points in homogeneous coordinates.
create_a_matrix(lines)
: Builds the matrix $ A $ used to estimate the conic at infinity $ C_{\infty} $, which encodes the orthogonality constraints between pairs of lines.
metric_rect(A)
: Performs SVD on the matrix $ A $, extracting the last column of $ V $, which gives the parameters of the conic $ C_{\infty} $.
draw_lines(rec_image, affine_points, size)
: Draws lines on the image between pairs of points, using random colors from a predefined list for visualization.
output()
: Generates a visual comparison of the original image, the annotated image with perpendicular lines, and the final rectified image, using MyWarp() to apply the homography.
calc_angle(points, H)
: Calculates the angle between lines before and after rectification using the dot product:
$$ \cos(\theta) = \frac{l_1 \cdot l_2}{\|l_1\| \|l_2\|} $$
The script loads images from data/q1, applies metric rectification, and displays the results; 'Floor.jpg' is a specific test case.
cos(θ) Before | cos(θ) After |
---|---|
-0.1921958 | 0.223017571 |
-0.050748446 | 0.267158122 |
-0.106192250 | -0.33064616 |
0.2468485559 | -0.25417146 |
0.1591520744 | -0.30023704 |
-0.040209134 | 0.372900133 |
cos(θ) Before | cos(θ) After |
---|---|
0.18134901 | -0.00668417 |
-0.2008803 | -0.06464913 |
-0.0814482 | 0.005642101 |
0.03668181 | -0.01543135 |
0.00375937 | -0.01323422 |
0.17078672 | 0.023707298 |
cos(θ) Before | cos(θ) After |
---|---|
0.04133108 | 0.039819726 |
-0.0590544 | 0.051731302 |
0.45674996 | 0.167035787 |
-0.2862859 | -0.00255669 |
0.33015491 | 0.072987101 |
-0.0592635 | 0.051615821 |
This implementation focuses on a series of image overlays using homography transformations. The goal is to blend multiple images into a single composite output by aligning perspective images with normal images based on computed corner points.
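As a toy illustration of how successive mask-based overlays accumulate on one composite (an axis-aligned paste stands in here for the warp-then-composite step, and all names and shapes are hypothetical):

```python
import numpy as np

def overlay_patch(canvas, patch, top_left):
    # Simplified stand-in for warp + compositeH: paste `patch` onto `canvas`
    # at `top_left`, writing only where the mask of template content is set.
    r, c = top_left
    h, w = patch.shape
    mask = patch > 0                      # mask of template content
    region = canvas[r:r + h, c:c + w]
    region[mask] = patch[mask]            # later overlays win where they overlap
    return canvas

canvas = np.zeros((8, 8), dtype=int)
overlay_patch(canvas, np.full((3, 3), 1), (0, 0))
overlay_patch(canvas, np.full((3, 3), 2), (2, 2))
```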
Data Loading:
get_point_data(image)
: Loads corner coordinates from a NumPy file, which contains annotations for perspective images.
Image Loading:
load_image(i, case)
: Reads an image and swaps the red and blue channels for proper color representation.
Homography Computation:
computeH(x1, x2)
: Computes the homography matrix $ H $ that maps points from one image to another using the Direct Linear Transform (DLT).
computeH_norm(x1, x2)
: Normalizes the point sets by centering and scaling before computing the homography:
$$
H = T_1^{-1} H_{\text{DLT}} T_2
$$
Composite Image Creation:
compositeH(H2to1, template, img)
: Warps a template image according to the computed homography and overlays it onto the original image using a mask to manage the blending.
Overlaying Images:
overlay(pers_image, norm_image, pers_points, custom)
: This function performs a series of overlays:
Board 1:
Board 2:
Board 3:
Final Output: