16-822 Assignment 2

Q1: Camera matrix P from 2D-3D correspondences (30 points)

(a) Stanford Bunny (15 points)

Given the 2D-3D point correspondences, each correspondence yields two linear equations in the entries of the camera matrix P. Stacking these constraints into a homogeneous system and solving it with singular value decomposition (SVD) gives the camera matrix P:

[[ 6.43169368e+03 -2.94843744e+03  1.14658061e+03  2.22724350e+03]
 [-9.34819249e+02 -6.75486473e+03  2.02949013e+03  1.82218778e+03]
 [ 5.79307220e-01 -1.42385366e+00 -7.35268478e-01  1.00000000e+00]]
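The DLT solve described above can be sketched as follows; this is a minimal NumPy version (the function name `compute_P` is my own, not from the assignment starter code):

```python
import numpy as np

def compute_P(pts2d, pts3d):
    """Estimate a 3x4 camera matrix from n >= 6 2D-3D correspondences (DLT).

    pts2d: (n, 2) pixel coordinates; pts3d: (n, 3) world coordinates.
    """
    A = []
    for (u, v), X in zip(pts2d, pts3d):
        Xh = np.append(X, 1.0)                         # homogeneous 3D point
        # Each correspondence contributes two rows of the homogeneous system.
        A.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))
        A.append(np.concatenate([np.zeros(4), Xh, -v * Xh]))
    # The solution of A p = 0 is the right singular vector with the
    # smallest singular value, i.e. the last row of V^T.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)
    return P / P[-1, -1]                               # fix scale: P[2,3] = 1
```

The division at the end fixes the arbitrary scale of the homogeneous solution so that P[2,3] = 1, matching the normalization of the matrices reported here.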

The visualizations of the surface points and the bounding box are shown below:

(Figures: q1a1, q1a2)

(b) Cuboid (15 points)

Using the same method as in Q1(a), we annotate six pixel positions on the cuboid and map them to the corners of a 3D unit cube (side length = 1). The resulting camera matrix P is:

[[-3.65943937e+01  5.30597004e+01 -3.02669429e+00  8.48080960e+01]
 [-1.28736036e+01 -1.08885879e+01 -5.64824172e+01  9.19991273e+01]
 [ 6.92863940e-02  1.11058784e-01 -4.01273802e-02  1.00000000e+00]]

The annotations and the edges of the cuboid are drawn below:

(Figures: q1b10, q1a1)

Q2: Camera calibration K from annotations (40 points + 10 points bonus)

(a) Camera calibration from vanishing points (20 points)

The input image, the annotated parallel lines, and the computed vanishing points and principal point are shown below:

(Figures: q2a1, q2a2, q2a3)

The K for the input image is:

[[1.15417802e+03 0.00000000e+00 5.75066005e+02]
 [0.00000000e+00 1.15417802e+03 4.31939090e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]

My implementation is as follows. First, from the annotated points on each pair of parallel lines, I compute the line representations via cross products, and intersect each pair of lines to obtain the vanishing points. Each pair of vanishing points (v_i, v_j) corresponds to a pair of orthogonal scene directions, so cos θ = 0 gives the constraint v_i^T ω v_j = 0, where ω = K^{-T} K^{-1} is the image of the absolute conic. Since we assume zero skew and square pixels, ω has the form

[[ω1,  0, ω2]
 [ 0, ω1, ω3]
 [ω2, ω3, ω4]]

Stacking the constraints from all pairs of vanishing points gives a homogeneous system Ax = 0, where x collects the unknown parameters of ω. We solve it with SVD, taking the last row of V^T. After obtaining ω, we apply Cholesky decomposition to recover K, since ω = K^{-T} K^{-1}. The principal point can then be read off from K.
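The calibration step from the three vanishing points can be sketched as below; the function name and variable layout are my own, and the zero-skew, square-pixel assumption matches the ω parameterization above:

```python
import numpy as np

def K_from_vanishing_points(vps):
    """Recover K from three pairwise-orthogonal vanishing points.

    Assumes zero skew and square pixels, so
    omega = [[w1, 0, w2], [0, w1, w3], [w2, w3, w4]].
    vps: three homogeneous vanishing points, each of shape (3,).
    """
    A = []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        vi, vj = vps[i], vps[j]
        # v_i^T omega v_j = 0, expanded in the unknowns (w1, w2, w3, w4)
        A.append([vi[0] * vj[0] + vi[1] * vj[1],
                  vi[0] * vj[2] + vi[2] * vj[0],
                  vi[1] * vj[2] + vi[2] * vj[1],
                  vi[2] * vj[2]])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    w1, w2, w3, w4 = Vt[-1]
    omega = np.array([[w1, 0, w2], [0, w1, w3], [w2, w3, w4]])
    if omega[0, 0] < 0:              # omega is recovered only up to sign
        omega = -omega
    # omega = K^-T K^-1 = L L^T with L = K^-T, so K = inv(L^T)
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L.T)
    return K / K[-1, -1]
```

Cholesky requires ω to be positive definite, which is why the sign of the SVD solution is fixed first.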

(b) Camera calibration from metric planes (20 points)

I used the default annotations, and they are shown as follows:

(Figure: q2b1)

The K for the input image is:

[[ 1.08447642e+03 -1.35121131e+01  5.20013594e+02]
 [ 0.00000000e+00  1.07900526e+03  4.02544642e+02]
 [ 0.00000000e+00  0.00000000e+00  1.00000000e+00]]

The angles I get between each pair of planes are as follows:

Planes               Angle between planes (degrees)
Plane 1 & Plane 2    67.58
Plane 1 & Plane 3    92.25
Plane 2 & Plane 3    94.78

My implementation is as follows. For each square, I compute the homography H mapping the source points [[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]] to the annotated image corners; each H is again solved with SVD. Writing H = [h1, h2, h3], each homography gives two constraints on ω: h1^T ω h2 = 0 and h1^T ω h1 - h2^T ω h2 = 0. Similarly, we stack these constraints into the form Ax = 0, solve with SVD to get ω, and finally apply Cholesky decomposition to recover K.
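A minimal sketch of this pipeline, assuming three imaged unit squares (function names are my own; here ω is a full symmetric matrix, since no zero-skew assumption is needed):

```python
import numpy as np

def homography(src, dst):
    """DLT homography mapping src (n, 2) onto dst (n, 2), n >= 4."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 3)

def _omega_row(hi, hj):
    """Coefficients of h_i^T omega h_j in (w11, w12, w13, w22, w23, w33)."""
    return np.array([hi[0] * hj[0],
                     hi[0] * hj[1] + hi[1] * hj[0],
                     hi[0] * hj[2] + hi[2] * hj[0],
                     hi[1] * hj[1],
                     hi[1] * hj[2] + hi[2] * hj[1],
                     hi[2] * hj[2]])

def K_from_squares(image_quads):
    """Recover K from three imaged unit squares (list of (4, 2) corner arrays)."""
    src = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
    A = []
    for quad in image_quads:
        H = homography(src, np.asarray(quad, dtype=float))
        h1, h2 = H[:, 0], H[:, 1]
        A.append(_omega_row(h1, h2))                       # h1^T w h2 = 0
        A.append(_omega_row(h1, h1) - _omega_row(h2, h2))  # h1^T w h1 = h2^T w h2
    _, _, Vt = np.linalg.svd(np.asarray(A))
    w11, w12, w13, w22, w23, w33 = Vt[-1]
    omega = np.array([[w11, w12, w13], [w12, w22, w23], [w13, w23, w33]])
    if np.trace(omega) < 0:          # omega is recovered only up to sign
        omega = -omega
    K = np.linalg.inv(np.linalg.cholesky(omega).T)
    return K / K[-1, -1]
```

Both constraints are homogeneous in H, so the arbitrary scale and sign of each estimated homography do not affect the solution.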

(c) Camera calibration from rectangles with known sizes (10 points bonus)

The input image and the annotations are as follows:

(Figures: q2c1, q2c2)

The K for the input image is:

[[ 3.08588499e+03 -1.46138057e+02  1.59587545e+03]
 [ 0.00000000e+00  3.20926163e+03  1.67590571e+03]
 [ 0.00000000e+00  0.00000000e+00  1.00000000e+00]]

I evaluated the angles between the 3 planes; they are as follows:

Planes               Angle between planes (degrees)
Plane 1 & Plane 2    133.51
Plane 1 & Plane 3    80.69
Plane 2 & Plane 3    91.04

My implementation is similar to Q2(b), with the same equations. The only difference is the source points, since we now have rectangles rather than squares. For the first two rectangles, the source points are [[0, 0, 1], [1.6, 0, 1], [1.6, 1, 1], [0, 1, 1]]; for the third rectangle, they are [[0, 0, 1], [0.7, 0, 1], [0.7, 1, 1], [0, 1, 1]]. We then follow the same process as Q2(b).

Q3: Single View Reconstruction (30 points + 10 points bonus)

(a) (30 points)

The visualization of the annotations and the two output views are as follows:

(Figures: annotations and two rendered views)

My implementation is as follows. We select 3 pairs of parallel lines from the annotated rectangles and compute K with the same method as in Q2(a). We then compute the plane normals, as in Q2, from the vanishing points and K. To get each plane equation n^T X + a = 0, we pick a reference point x shared by the planes (for example, the point where the 3 planes intersect) and fix its depth to 1. Back-projecting the pixel as X = K^{-1} x gives the ray direction; with the known depth we compute the scale λ = depth / ||X||, and then a = -λ n^T X. With a fixed, the scale for every other pixel on the plane follows from λ = -a / (n^T X), and the 3D point for each pixel is obtained from K^{-1} and λ.
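The two steps above (fixing a from one reference pixel with known depth, then lifting all plane pixels) can be sketched as follows; function names and the test scene are my own:

```python
import numpy as np

def plane_offset(x_ref, depth, n, K):
    """Recover a in n^T X + a = 0 from one pixel with known depth.

    The reference 3D point is X = depth * K^-1 x / ||K^-1 x||, and then
    a = -n^T X.
    """
    d = np.linalg.inv(K) @ np.append(np.asarray(x_ref, dtype=float), 1.0)
    X = depth * d / np.linalg.norm(d)
    return -n @ X

def unproject_plane(pts2d, n, a, K):
    """Lift pixels lying on the plane n^T X + a = 0 to 3D points.

    Each pixel x maps to a ray d = K^-1 x; its scale solves
    lambda * n^T d + a = 0, i.e. lambda = -a / (n^T d).
    """
    xh = np.hstack([np.asarray(pts2d, dtype=float), np.ones((len(pts2d), 1))])
    rays = (np.linalg.inv(K) @ xh.T).T     # one ray direction per pixel
    lam = -a / (rays @ n)                  # per-pixel scale along the ray
    return rays * lam[:, None]
```

Here "depth" is the distance of the reference point along its ray, consistent with the λ = depth / ||X|| normalization above.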

(b) (10 points bonus)

We run the same code on 3 different inputs, each with its own annotations and known points. The visualizations are as follows:

(Figures: input, annotations, and reconstructed views for each of the three examples)