16-822 Geometry-based Methods in Vision

Assignment 2: Single-view Reconstruction

Qitao Zhao (qitaoz), Fall 2024

Q1: Camera matrix P from 2D-3D correspondences

(a) Stanford Bunny

[Figure: Surface Points | Bounding Box]

(b) Cuboid

[Figure: Input Image | Annotated 2D points | Edges]

Q2: Camera calibration K from annotations

(a) Camera calibration from vanishing points

[Figure: Input Image | Annotated Parallel Lines | Vanishing points and principal point]

\[
K = \begin{bmatrix}
1154.17802 & 0 & 575.066005 \\
0 & 1154.17802 & 431.939090 \\
0 & 0 & 1
\end{bmatrix}
\]

Brief Description of the Implementation

1. Vanishing Point Calculation

We first compute vanishing points from annotated lines. Each pair of 2D points forms a line, and two such lines are used to compute a vanishing point — the intersection of the two lines. The vanishing points represent directions in 3D space where parallel lines converge when projected onto the image plane.

For each pair of annotated points ( p_1(x_1, y_1) ) and ( p_2(x_2, y_2) ), the line equation is computed using the form:

\[
ax + by + c = 0 \tag{1}
\]

The intersection of two such lines gives the vanishing point ( (x, y) ).
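In homogeneous coordinates both steps reduce to cross products: the line through two image points is their cross product, and the intersection of two lines is again a cross product. A minimal numpy sketch of this computation (the function name and the (x, y) tuple format are illustrative, not the exact interface used in the assignment):

```python
import numpy as np

def vanishing_point(p1, p2, p3, p4):
    """Vanishing point of two annotated line segments (p1, p2) and (p3, p4).

    Each point is an (x, y) pixel coordinate. The line through two points
    (in homogeneous coordinates) is their cross product, and the
    intersection of two lines is again a cross product.
    """
    hom = lambda p: np.array([p[0], p[1], 1.0])
    l1 = np.cross(hom(p1), hom(p2))   # line l1 = (a, b, c) with ax + by + c = 0
    l2 = np.cross(hom(p3), hom(p4))
    v = np.cross(l1, l2)              # homogeneous intersection point
    return v / v[2]                   # normalize to (x, y, 1)
```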

2. Forming the Equation for IAC

In the case of square pixels, the image of the absolute conic is assumed to have the form:

\[
\omega = \begin{bmatrix}
w_1 & 0 & w_2 \\
0 & w_1 & w_3 \\
w_2 & w_3 & w_4
\end{bmatrix} \tag{2}
\]

Each pair of vanishing points ( v_i ) and ( v_j ), corresponding to orthogonal scene directions, generates a constraint of the form:

\[
v_i^T \, \omega \, v_j = 0 \tag{3}
\]

Using three pairs of vanishing points, these constraints are stacked to form a system of linear equations in the matrix form:

\[
A w = 0 \tag{4}
\]

where ( A ) is a ( 3 × 4 ) matrix and ( w = [w_1, w_2, w_3, w_4]^T ) is the vector of unknowns.
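With ω parameterized as in (2), each constraint (3) expands into one linear equation in ( [w_1, w_2, w_3, w_4] ): for ( v_i = (x_i, y_i, 1) ) and ( v_j = (x_j, y_j, 1) ), ( v_i^T ω v_j = w_1(x_i x_j + y_i y_j) + w_2(x_i + x_j) + w_3(y_i + y_j) + w_4 ). One way the matrix ( A ) could be assembled (function names are illustrative):

```python
import numpy as np
from itertools import combinations

def constraint_row(vi, vj):
    """Coefficients of v_i^T w v_j = 0 in (w1, w2, w3, w4), for the
    square-pixel parameterization of w in Eq. (2)."""
    xi, yi = vi[0] / vi[2], vi[1] / vi[2]
    xj, yj = vj[0] / vj[2], vj[1] / vj[2]
    return np.array([xi * xj + yi * yj,  # multiplies w1
                     xi + xj,            # multiplies w2
                     yi + yj,            # multiplies w3
                     1.0])               # multiplies w4

def build_A(vps):
    """Stack one row per pair of the three vanishing points -> 3 x 4 matrix A."""
    return np.stack([constraint_row(vps[i], vps[j])
                     for i, j in combinations(range(len(vps)), 2)])
```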

3. Solving for ω

The vector ( w ) is obtained as the null vector of ( A ), found by solving ( A w = 0 ) with SVD: the last row of ( V^T ) (the right singular vector associated with the smallest singular value) gives ( w ).

This determines the matrix ( ω ).
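A small sketch of this step, assuming ( A ) comes from the `build_A` snippet above:

```python
import numpy as np

def solve_omega(A):
    """Null vector of A via SVD, reshaped into the 3x3 IAC of Eq. (2)."""
    _, _, Vt = np.linalg.svd(A)
    w1, w2, w3, w4 = Vt[-1]          # right singular vector of the smallest
                                     # singular value (last row of V^T)
    return np.array([[w1, 0.0, w2],
                     [0.0, w1, w3],
                     [w2, w3, w4]])
```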

4. Estimating the Camera Intrinsic Matrix ( K )

Once ( ω ) is determined, the camera intrinsic matrix ( K ) is computed from the relation:

\[
\omega = (K K^T)^{-1} \tag{5}
\]

To extract ( K ), Cholesky factorization is applied to ( ω ), and the resulting triangular factor is inverted. Finally, the matrix ( K ) is normalized such that the last element ( K_{33} = 1 ).
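A possible sketch of this extraction, assuming ω comes from the SVD step above (the sign flip accounts for the null vector only being defined up to sign; `np.linalg.cholesky` returns the lower-triangular factor, so ( K ) is its inverse transpose):

```python
import numpy as np

def K_from_omega(omega):
    """Recover K from omega = (K K^T)^{-1} via Cholesky factorization."""
    if omega[0, 0] < 0:               # null vector is defined up to sign;
        omega = -omega                # omega must be positive definite
    L = np.linalg.cholesky(omega)     # omega = L L^T with L lower triangular
    K = np.linalg.inv(L).T            # then K K^T = omega^{-1}, K upper triangular
    return K / K[2, 2]                # normalize so that K[2, 2] = 1
```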

(b) Camera calibration from metric planes

[Figure: Input Image | Annotated Squares]
| Plane pair        | Angle between planes (degrees) |
|-------------------|--------------------------------|
| Plane 1 & Plane 2 | 67.575126638156                |
| Plane 1 & Plane 3 | 87.7527831744175               |
| Plane 2 & Plane 3 | 85.21620854556966              |

\[
K = \begin{bmatrix}
1084.47642 & 13.5121131 & 520.013594 \\
1.17407507 \times 10^{-13} & 1079.00526 & 402.544642 \\
0 & 0 & 1
\end{bmatrix}
\]

Brief Description of the Implementation

1. Homography Computation

For each planar square, we compute the homography H that maps the known corner points of the square (the corners of a canonical metric square, e.g. (0, 0), (1, 0), (1, 1), (0, 1)) to their corresponding imaged 2D points.

Given the observed image points for each square, the homography H describes the transformation between the real-world plane of the square and its image.
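A sketch of how H could be estimated with the Direct Linear Transform (the unit-square corners below are an assumed choice of metric frame; any known square works up to similarity):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H with dst ~ H @ src from N >= 4 correspondences (N x 2 arrays)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)          # null vector of the stacked constraints
    return H / H[2, 2]

# Map a canonical square to the four annotated image corners of one square:
# square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
# H = homography_dlt(square, image_corners)   # image_corners: 4 x 2 annotations
```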

2. Imaged Circular Points

Once the homography H for each square is obtained, we compute the imaged circular points. The circular points are defined as (1, ± i, 0), where i is the imaginary unit. Using the homography, these circular points are projected into the image as:

\[
H \, (1, \pm i, 0)^T \tag{6}
\]

Given that ( H = [h_1, h_2, h_3] ) (where ( h_1 ), ( h_2 ), and ( h_3 ) are the column vectors of the homography matrix), the imaged circular points become:

\[
h_1 \pm i \, h_2 \tag{7}
\]

These points lie on the image of the absolute conic (IAC), which allows us to impose constraints on the camera’s intrinsic matrix.

3. Fitting the Conic ( ω )

The IAC encodes the intrinsic parameters of the camera. The constraint that the imaged circular points ( h_1 ± i h_2 ) lie on ( ω ) provides two real constraints:

\[
h_1^T \, \omega \, h_2 = 0 \tag{8}
\]

\[
h_1^T \, \omega \, h_1 = h_2^T \, \omega \, h_2 \tag{9}
\]
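Both constraints follow from expanding the quadratic form for an imaged circular point and using the symmetry of ( ω ):

\[
(h_1 \pm i h_2)^T \, \omega \, (h_1 \pm i h_2)
= \left( h_1^T \omega h_1 - h_2^T \omega h_2 \right) \pm 2 i \, h_1^T \omega h_2 = 0 ,
\]

so the real and imaginary parts must vanish separately, which gives exactly equations (9) and (8).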

These are linear equations in the elements of ( ω ), the conic we are solving for. With the three annotated squares we obtain six such equations, more than the five needed to determine ( ω ) up to a scale factor.
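A sketch of the constraint assembly, assuming a full symmetric parameterization of ( ω ) (six entries, five degrees of freedom up to scale) and homographies such as those from the DLT sketch above:

```python
import numpy as np

def sym_coeffs(u, v):
    """Coefficients of u^T w v in (w11, w12, w13, w22, w23, w33) for symmetric w."""
    return np.array([u[0] * v[0],
                     u[0] * v[1] + u[1] * v[0],
                     u[0] * v[2] + u[2] * v[0],
                     u[1] * v[1],
                     u[1] * v[2] + u[2] * v[1],
                     u[2] * v[2]])

def omega_from_homographies(Hs):
    """Fit the IAC from the square homographies: two equations (8)-(9) per square."""
    rows = []
    for H in Hs:
        h1, h2 = H[:, 0], H[:, 1]
        rows.append(sym_coeffs(h1, h2))                       # Eq. (8)
        rows.append(sym_coeffs(h1, h1) - sym_coeffs(h2, h2))  # Eq. (9)
    _, _, Vt = np.linalg.svd(np.array(rows))
    a, b, c, d, e, f = Vt[-1]                                 # null vector
    return np.array([[a, b, c],
                     [b, d, e],
                     [c, e, f]])
```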

4. Calibration Matrix ( K )

Once the conic ( ω ) is computed, the final step is to extract the camera's intrinsic matrix ( K ), as in part (a).

Q3: Single View Reconstruction

[Figure: Input Image | Annotations | Reconstruction View 1 | Reconstruction View 2]

Brief Description of the Implementation

  1. Compute ( K ) from 3 vanishing points: Estimate the camera intrinsic matrix ( K ) using three vanishing points in the image.

  2. Select a reference point: Choose a reference point in the image to start the unprojection (we choose two points lying on the same vertical line covering all planes and set their depths to 1, so that we do not need to worry about scale).

  3. Unproject the reference point: Convert the reference point from 2D to 3D using the inverse of the intrinsic matrix ( K ) and assign a depth (scale) to the point:

    \[
    X_r = K^{-1} x \tag{10}
    \]
  4. Find the plane normal and scalar ( a ): Compute the plane's normal vector ( n ) and the scalar ( a ) using the known 3D point:

    \[
    a = n^T X_r \tag{11}
    \]
  5. Unproject other points to 3D: Apply the same unprojection to each remaining 2D point and scale the resulting ray by ( a / (n^T K^{-1} x) ) so that the point lies on the plane (see the sketch after this list):

    \[
    X = K^{-1} x \tag{12}
    \]
  6. Repeat for all planes: Repeat the process for every plane in the scene to obtain the 3D geometry.
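A minimal sketch of the per-plane unprojection. The steps above do not spell out how the plane normal is obtained; here it is assumed to come from the plane's vanishing line ( l ) as ( n ∝ K^T l ), which is one standard choice, and all variable names are illustrative:

```python
import numpy as np

def backproject(K, pixel):
    """Ray direction K^{-1} x for a pixel (x, y), as in Eqs. (10) and (12)."""
    return np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])

def reconstruct_plane(K, ref_pixel, ref_depth, vanishing_line, pixels):
    """Lift the pixels of one plane to 3D points in camera coordinates.

    ref_pixel, ref_depth : reference point on the plane and its assigned depth
    vanishing_line       : homogeneous vanishing line of the plane; the normal
                           is taken as n ~ K^T l (an assumed choice)
    pixels               : (N, 2) array of pixels belonging to the plane
    """
    n = K.T @ vanishing_line
    n = n / np.linalg.norm(n)
    X_ref = ref_depth * backproject(K, ref_pixel)          # Eq. (10), times the depth
    a = n @ X_ref                                           # Eq. (11): plane is n^T X = a
    rays = np.array([backproject(K, p) for p in pixels])    # Eq. (12)
    scales = a / (rays @ n)                                 # put each ray on the plane
    return scales[:, None] * rays
```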