Single View to 3D

2. Reconstructing 3D from Single View

All training and evaluation runs use --load_feat and run on the CPU.

2.1 Image to Voxel Grid

2.2 Image to Point Cloud

2.3 Image to Mesh

2.4 Quantitative Comparisons

F1@0.05 scores for voxel grid, point cloud, and mesh: 62.28, 75.13, 70.04
Why the voxel result seems to be the poorest: 32*32*32 is a very low resolution, which limits how much geometric detail the model can represent and therefore how well it can train.
Why the point cloud is the best: 1. Points have no connectivity constraints, so the representation is more flexible. 2. I used a ResMLP here, which makes training more stable and leads to better performance.
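
For reference, an F1 score at a distance threshold is usually computed from precision (the fraction of predicted points within the threshold of some ground-truth point) and recall (the same in the other direction). The sketch below illustrates that definition only and is not the evaluation code used for the numbers above. The function name f1_at_threshold is hypothetical, the nearest-neighbor search is brute force, and the score is scaled to a percentage to match the values reported here.

```python
import math


def f1_at_threshold(pred_points, gt_points, threshold=0.05):
    """Return the F1 score (in percent) between two 3D point sets.

    Hypothetical helper for illustration; assumes both inputs are
    non-empty sequences of (x, y, z) tuples.
    """

    def fraction_within(src, dst):
        # Fraction of points in src whose nearest neighbor in dst
        # is closer than the threshold (brute-force search).
        hits = 0
        for p in src:
            nearest = min(math.dist(p, q) for q in dst)
            if nearest < threshold:
                hits += 1
        return hits / len(src)

    precision = fraction_within(pred_points, gt_points)
    recall = fraction_within(gt_points, pred_points)
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: one of two predicted points matches, and one of two
# ground-truth points is matched, so precision = recall = 0.5.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.01), (5.0, 5.0, 5.0)]
score = f1_at_threshold(pred, gt)  # 50.0
```

For voxel and mesh outputs, points would first have to be sampled from the predicted surface before a comparison like this applies; that sampling step is omitted here.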

2.5 Analyze Hyperparameter Variations

Hyperparameter Analysis:

I varied w_smooth to see its effect. The results are very spiky when w_smooth == 0.1, so I increased w_smooth to 5 to see how the results change.

As the results below show, a larger w_smooth leads to smoother meshes. However, it also makes all the results look very similar: different input images generate almost the same output. I think this is because w_smooth controls how strongly each vertex is penalized for deviating from its neighbors, so a large weight pulls every prediction toward the same smooth shape, especially when the number of training steps (around 1500) is not that large.
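
To make the role of w_smooth concrete, here is a minimal sketch of a uniform Laplacian smoothness term and a weighted total loss. It assumes the mesh loss has the form w_chamfer * chamfer + w_smooth * smoothness; only w_smooth comes from the text above, while w_chamfer, uniform_laplacian_loss, and total_loss are hypothetical names for illustration, and the actual training code may compute the smoothness term differently.

```python
import math


def uniform_laplacian_loss(vertices, edges):
    """Mean distance between each vertex and the centroid of its neighbors.

    vertices: list of (x, y, z) tuples; edges: list of (i, j) index pairs.
    Hypothetical helper: a spike raises this value.
    """
    neighbors = {i: set() for i in range(len(vertices))}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    total = 0.0
    for i, v in enumerate(vertices):
        if not neighbors[i]:
            continue  # isolated vertices contribute nothing
        n = len(neighbors[i])
        centroid = [sum(vertices[j][k] for j in neighbors[i]) / n
                    for k in range(3)]
        total += math.dist(centroid, v)
    return total / len(vertices)


def total_loss(chamfer_term, smooth_term, w_chamfer=1.0, w_smooth=0.1):
    # Assumed form of the objective: a weighted sum of the two terms.
    return w_chamfer * chamfer_term + w_smooth * smooth_term


# A square ring of four vertices, and the same ring with one vertex
# lifted out of the plane to form a spike.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
flat = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
spiky = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]

flat_smooth = uniform_laplacian_loss(flat, ring_edges)
spiky_smooth = uniform_laplacian_loss(spiky, ring_edges)
```

Under this assumed form, moving from w_smooth = 0.1 to w_smooth = 5 multiplies the smoothness penalty by 50 relative to the Chamfer term, so the optimizer has a much stronger incentive to flatten spikes than to match image-specific detail.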

w_smooth == 0.1