Single View to 3D

1. Exploring Loss Functions

1.1 Fitting a Voxel Grid (Binary Cross Entropy Loss)

Voxel fitting result 1
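
The binary cross entropy loss named in the heading can be sketched as follows. This is a minimal NumPy illustration, not the code used for the result above; the function name `voxel_bce_loss` and the assumption that the network outputs raw logits (rather than probabilities) are mine.

```python
import numpy as np

def voxel_bce_loss(logits, target):
    """Mean binary cross entropy between occupancy logits and a 0/1 voxel grid.

    Assumes `logits` are raw (pre-sigmoid) scores with the same shape as `target`.
    """
    logits = np.asarray(logits, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # Numerically stable form of -[y*log(s(x)) + (1-y)*log(1-s(x))]:
    #   max(x, 0) - x*y + log(1 + exp(-|x|))
    per_voxel = (np.maximum(logits, 0.0)
                 - logits * target
                 + np.log1p(np.exp(-np.abs(logits))))
    return per_voxel.mean()
```

With all-zero logits every voxel is predicted at probability 0.5, so the loss equals log 2 regardless of the target grid; this is a handy sanity check before fitting.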

1.2 Fitting a Point Cloud (Chamfer Loss)

Point cloud fitting result 1
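
For reference, a symmetric Chamfer loss between two point clouds can be sketched as below. This is a brute-force NumPy version under my own assumptions (squared Euclidean distances, mean reduction in both directions, hypothetical name `chamfer_loss`); it is not necessarily the exact variant used for the fit above.

```python
import numpy as np

def chamfer_loss(src, tgt):
    """Symmetric Chamfer loss between point sets of shape (N, 3) and (M, 3)."""
    src = np.asarray(src, dtype=np.float64)
    tgt = np.asarray(tgt, dtype=np.float64)
    # Pairwise squared distances, shape (N, M).
    diff = src[:, None, :] - tgt[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    # Nearest target for each source point, plus nearest source for each target point.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

The pairwise matrix costs O(N*M) memory, which is fine for a few thousand points but would need a nearest-neighbor structure for much larger clouds.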

1.3 Fitting a Mesh (Smoothness Loss)

Mesh fitting result 1
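
One common form of mesh smoothness loss is uniform Laplacian smoothing, which penalizes the distance between each vertex and the mean of its neighbors. The sketch below assumes that form; the function name `laplacian_smoothness` and the dense adjacency matrix are illustrative choices, not the implementation behind the result above.

```python
import numpy as np

def laplacian_smoothness(verts, faces):
    """Mean distance between each vertex and the centroid of its neighbors.

    verts: (V, 3) float array, faces: (F, 3) integer array of vertex indices.
    """
    verts = np.asarray(verts, dtype=np.float64)
    faces = np.asarray(faces, dtype=np.int64)
    n = len(verts)
    # Dense symmetric adjacency built from the three edges of every face.
    adj = np.zeros((n, n))
    for a, b in ((0, 1), (1, 2), (2, 0)):
        adj[faces[:, a], faces[:, b]] = 1.0
        adj[faces[:, b], faces[:, a]] = 1.0
    # Guard against isolated vertices with no neighbors.
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    neighbor_mean = adj @ verts / deg
    return np.linalg.norm(neighbor_mean - verts, axis=1).mean()
```

A dense adjacency matrix is only practical for small meshes; a sparse matrix would be the usual choice at scale.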

2. Reconstructing 3D from Single View

All training and evaluation runs use --load_feat and run on the CPU.

2.1 Image to Voxel Grid

Voxel prediction 1
Voxel prediction 2
Voxel prediction 3

2.2 Image to Point Cloud

Point cloud prediction 1
Point cloud prediction 2
Point cloud prediction 3

2.3 Image to Mesh

Mesh prediction 1
Mesh prediction 2
Mesh prediction 3

2.4 Quantitative Comparisons

Mesh prediction 1
Mesh prediction 2
Mesh prediction 3

2.5 Analyze Hyperparameter Variations

Hyperparameter Analysis:

I varied w_smooth to see its effect. The results are very spiky when w_smooth == 0.1, so I raised w_smooth to 5 to see how the results change.

As shown below, a larger w_smooth leads to smoother results, but it also makes all the results look very similar: different input images generate almost the same mesh. I think this is because w_smooth controls how strongly each vertex is penalized for deviating from its neighbors, so a large weight pulls every prediction toward the same smooth shape, especially when the number of training steps (around 1500) is not that large.
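
To make the trade-off concrete, the sketch below assumes the mesh objective is a weighted sum of a Chamfer term and a smoothness term. `w_smooth` is the parameter discussed above; `w_chamfer`, the function names, and the example loss values are hypothetical and only serve the illustration.

```python
def total_mesh_loss(loss_chamfer, loss_smooth, w_chamfer=1.0, w_smooth=0.1):
    """Weighted sum of the two mesh loss terms (assumed form of the objective)."""
    return w_chamfer * loss_chamfer + w_smooth * loss_smooth

def smooth_share(loss_chamfer, loss_smooth, w_chamfer=1.0, w_smooth=0.1):
    """Fraction of the total loss contributed by the smoothness term."""
    total = total_mesh_loss(loss_chamfer, loss_smooth, w_chamfer, w_smooth)
    return w_smooth * loss_smooth / total

# Made-up raw loss values, compared under the two weights used in this section.
for w in (0.1, 5.0):
    share = smooth_share(0.02, 0.01, w_smooth=w)
    print(f"w_smooth={w}: smoothness share of total loss = {share:.2f}")
```

When the smoothness share dominates, the gradient mostly rewards flat, regular surfaces rather than agreement with the target shape, which is consistent with the near-identical outputs observed at w_smooth == 5.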

Mesh prediction 1

w_smooth == 0.1

Mesh prediction 2

w_smooth == 0.1

Mesh prediction 3

w_smooth == 0.1

Mesh prediction 1

w_smooth == 5

Mesh prediction 2

w_smooth == 5

Mesh prediction 3

w_smooth == 5

2.6 Visualize Learned Representations

Add Gaussian Noise

Gaussian noise has been added to the loaded image features, with mean = 0.0 and std = [0.1, 0.5, 1.0, 2.0, 5.0].

std = 0.1 and std = 0.5 do not significantly influence the results. std = 1.0 or higher has a visible influence on the results.
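
The perturbation can be sketched as below. This is a NumPy illustration only: the function name `add_gaussian_noise`, the fixed seed, and the feature shape `(4, 512)` in the usage lines are assumptions, not details taken from the experiment.

```python
import numpy as np

def add_gaussian_noise(features, std, mean=0.0, seed=0):
    """Return a copy of `features` with i.i.d. Gaussian noise added to every entry."""
    features = np.asarray(features, dtype=np.float64)
    rng = np.random.default_rng(seed)  # fixed seed so runs are repeatable
    return features + rng.normal(loc=mean, scale=std, size=features.shape)

# Usage with placeholder features; the real ones come from the loaded image encoder output.
feats = np.ones((4, 512))
for std in (0.1, 0.5, 1.0, 2.0, 5.0):
    noisy = add_gaussian_noise(feats, std)
    print(std, float(np.abs(noisy - feats).mean()))
```

How visible a given std is depends on the scale of the features themselves, so comparing std against the typical feature magnitude is a useful first check.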

std=2.0 result 1

mean = 0.0, std = 0.1

std=5.0 result 1

mean = 0.0, std = 0.5

std=0.1 result 2

mean = 0.0, std = 1.0

std=0.5 result 2

mean = 0.0, std = 2.0

std=1.0 result 2

mean = 0.0, std = 5.0

std=2.0 result 1

mean = 0.0, std = 0.1

std=5.0 result 1

mean = 0.0, std = 0.5

std=0.1 result 2

mean = 0.0, std = 1.0

std=0.5 result 2

mean = 0.0, std = 2.0

std=1.0 result 2

mean = 0.0, std = 5.0

3. Exploring Other Architectures

3.2 Parametric Network

At first, the model had no positional encoding (PE). Even after training for over 3k steps, the results looked like pure noise. The current results are based on PE and an MLP.
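
A sinusoidal positional encoding of the kind commonly placed in front of an MLP can be sketched as below. The exact encoding used here is not specified, so the frequency schedule (powers of two times pi), the default `num_freqs=6`, and the choice to keep the raw coordinates in the output are all assumptions.

```python
import numpy as np

def positional_encoding(x, num_freqs=6):
    """Map coordinates of shape (..., D) to (..., D + 2 * D * num_freqs).

    Each coordinate is expanded into sin and cos at frequencies 2^k * pi.
    """
    x = np.asarray(x, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi   # (L,)
    angles = x[..., None] * freqs                    # (..., D, L)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (..., D, 2L)
    enc = enc.reshape(*x.shape[:-1], -1)             # (..., D * 2L)
    return np.concatenate([x, enc], axis=-1)

# Usage: encode hypothetical 2D parameter samples before feeding them to the MLP.
samples = np.random.default_rng(0).uniform(-1.0, 1.0, size=(100, 2))
print(positional_encoding(samples).shape)
```

Without such an encoding, an MLP tends to fit only low-frequency functions of its input coordinates, which is one plausible reason the un-encoded model produced noise-like output.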

Parametric network result 1
Parametric network result 2
Parametric network result 3