Task 1. ChatGPT
I decided to ask about L3DV-related topics.
Prompt: What are some of the most surprising facts about learning methods for 3D vision?
Response:
Great question — 3D vision is a fascinating field because it sits at the intersection of computer vision, graphics, and geometry, and a lot of the methods developed over the past few years are surprisingly counterintuitive. Here are some of the most surprising facts about learning methods for 3D vision:
1. You don’t always need 3D supervision to learn 3D
- Many state-of-the-art 3D reconstruction methods (e.g., NeRFs, implicit representations) are trained only on 2D images and their camera poses — no ground truth 3D mesh or depth required.
- The models learn geometry as a byproduct of explaining the 2D views. 👉 Surprising because it overturns the assumption that “to learn 3D, you need 3D labels.”
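This idea can be illustrated with a toy sketch (my own illustration, not from the response above): recover a hidden 3D occupancy grid purely from its 2D projections, with no 3D labels. The cube shape, grid size, and orthographic "renders" are all assumptions chosen for simplicity.

```python
import numpy as np

# Toy analogue of "learning 3D from 2D only": fit a 3D occupancy grid so that
# its 2D projections match observed views. No 3D supervision is ever used.
N = 8
truth = np.zeros((N, N, N))
truth[2:6, 2:6, 2:6] = 1.0                 # hidden ground-truth shape (a cube)

# Orthographic "renders" along each axis: these 2D images are all we observe.
views = [truth.sum(axis=a) for a in range(3)]

v = np.zeros((N, N, N))                    # learnable occupancy, starts empty
lr = 0.04
for _ in range(2000):
    grad = np.zeros_like(v)
    for a, target in enumerate(views):
        residual = v.sum(axis=a) - target          # 2D reprojection error
        grad += np.expand_dims(residual, axis=a)   # broadcast back into 3D
    v = np.clip(v - lr * grad, 0.0, 1.0)   # keep occupancy in [0, 1]

# Reprojection error becomes near zero: geometry emerged from 2D views alone
print(np.abs(v.sum(axis=0) - views[0]).mean())
```

Note that with only three views the 3D solution is not unique (as in tomography); the point is only that the 2D objective alone drives a 3D representation toward consistency, which is the same principle NeRF-style training exploits at much larger scale.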
2. 2D CNNs are often better than 3D CNNs
- One might expect 3D convolutions (operating on voxels) to be the natural fit. But in practice, 2D CNNs plus clever projection (multi-view, point-based, or differentiable rendering) often outperform direct 3D CNNs. 👉 This is because voxel grids are memory-heavy and waste resolution, while 2D methods leverage decades of progress in 2D vision.
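The memory argument is easy to make concrete with back-of-the-envelope arithmetic (my own sketch; the resolution of 512 is just an illustrative choice):

```python
# Memory cost of dense float32 storage (4 bytes per value): a 3D voxel grid
# scales cubically with resolution, a 2D feature map only quadratically.
def mib(n_values, bytes_per_value=4):
    return n_values * bytes_per_value / 2**20

res = 512
voxel_grid_mib = mib(res ** 3)   # one dense 3D grid at 512^3
image_mib = mib(res ** 2)        # one 2D feature map at 512^2
print(voxel_grid_mib, image_mib) # 512.0 MiB vs 1.0 MiB per channel
```

A single 512-channel 3D feature volume at this resolution would already exceed typical GPU memory, which is why multi-view 2D approaches remain so competitive.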
3. Differentiable rendering unlocked learning-based 3D
- Classical rasterization involves discrete visibility and occlusion decisions, so there was no gradient path from a 2D pixel loss back to 3D shape parameters. Once differentiable renderers (soft rasterization, volume rendering) were introduced, it became possible to optimize 3D structures using only images, changing the landscape of 3D learning.
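A minimal sketch of the differentiable part (my own single-pixel example, not from the response; the densities, step size, and target value are all made up): volume rendering makes the pixel a smooth function of per-sample densities, so image error flows back to 3D quantities by the chain rule.

```python
import numpy as np

# One-pixel differentiable volume renderer: pixel = 1 - exp(-sum(sigma_i * dt)).
# This is smooth in the densities sigma_i, so a 2D photometric loss can be
# backpropagated into the 3D representation -- the core trick behind NeRF-style
# optimization.
dt = 0.1
sigma = np.zeros(32)          # densities sampled along one camera ray
target = 0.7                  # observed pixel value we want to explain

for _ in range(200):
    T = np.exp(-sigma.sum() * dt)              # transmittance along the ray
    pixel = 1.0 - T
    # Chain rule: d(pixel)/d(sigma_i) = dt * T, identical for every sample
    grad = 2.0 * (pixel - target) * dt * T
    sigma = np.maximum(sigma - 1.0 * grad, 0.0)  # densities stay nonnegative

print(abs((1.0 - np.exp(-sigma.sum() * dt)) - target))  # near zero
```

A hard rasterizer has no such gradient: an infinitesimal change in geometry either leaves the pixel unchanged or flips it discontinuously, which is exactly what soft/volumetric formulations fix.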
4. Implicit representations beat explicit ones
- Traditional 3D representations (meshes, voxels, point clouds) are being replaced by implicit neural fields (e.g., signed distance functions, occupancy networks, NeRFs).
- These are continuous, memory-efficient, and can model fine details better than explicit discretized formats. 👉 Surprising because it means a neural network function can be a “shape” in itself.
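To make "a function can be a shape" concrete, here is a tiny sketch (mine, not from the response) using a closed-form sphere SDF where a learned method would use an MLP:

```python
import numpy as np

# An implicit shape: geometry is just a function mapping 3D points to signed
# distance (negative inside, positive outside, zero on the surface). In learned
# variants this function is a neural network; a closed-form sphere stands in.
def sdf_sphere(p, center=np.zeros(3), radius=1.0):
    return np.linalg.norm(p - center, axis=-1) - radius

# Continuous: query at ANY point, with no fixed grid resolution.
print(sdf_sphere(np.array([0.0, 0.0, 0.0])))   # -1.0 (deep inside)
print(sdf_sphere(np.array([2.0, 0.0, 0.0])))   #  1.0 (outside)

# Occupancy at arbitrary resolution is just more queries of the same function.
pts = np.random.default_rng(0).standard_normal((1000, 3))
inside = sdf_sphere(pts) < 0
```

The surface itself is the zero level set of this function, which is why the same representation can be sampled at any resolution without storing a mesh or grid.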
5. Regularization and priors matter more than you think
- Training on unconstrained 3D data can lead to degenerate solutions (e.g., flat reconstructions). Methods often rely on subtle priors like multi-view consistency, smoothness, or symmetry.
- These priors, not just raw supervision, are what make the geometry come out correctly.
6. Synthetic data is extremely powerful
- A lot of 3D learning methods train on synthetic datasets (like ShapeNet, Blender renders) and transfer surprisingly well to real-world data, even though textures and lighting differ. 👉 This is because geometry is more universal than appearance, making domain gaps smaller than expected.
7. Language can guide 3D understanding
- Models like CLIP and text-to-3D methods show that natural language supervision can substitute for explicit 3D data.
- A model that’s never seen a 3D mesh can generate one just from text, because it aligns concepts between images and text embeddings.
8. Multi-modality > pure 3D
- Some of the best-performing methods use multi-modal learning — combining text, images, videos, and sometimes depth.
- For example, training with image-language pairs can lead to stronger 3D feature learning than training on 3D data alone.
9. Humans are still way better with less data
- Even with massive datasets, models often struggle with occlusion, symmetry, or reasoning about hollow vs. solid shapes. Humans can infer these with a single glance. 👉 Suggests that current methods are still missing strong causal and physical priors.
10. Evaluation is tricky and unintuitive
- A model can look visually perfect but score poorly on standard metrics (e.g., IoU, Chamfer distance).
- Conversely, models with mediocre scores sometimes produce more usable 3D outputs. 👉 Surprising because metrics don’t always match human perception of quality.
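The two metrics named above are simple enough to sketch directly (my own NumPy versions, written for clarity rather than speed), which also shows why they can disagree with visual impressions:

```python
import numpy as np

# Chamfer distance between point sets: mean nearest-neighbor distance in both
# directions. A small rigid shift inflates it even when the shapes look alike.
def chamfer(a, b):
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Volumetric IoU: overlap of two occupancy grids over their union.
def voxel_iou(a, b):
    a, b = a.astype(bool), b.astype(bool)
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

pts = np.random.default_rng(0).random((100, 3))
print(chamfer(pts, pts))          # 0.0 for identical point sets
print(chamfer(pts, pts + 0.05))   # nonzero after a tiny uniform shift

grid = np.zeros((4, 4, 4)); grid[:2] = 1
print(voxel_iou(grid, grid))                       # 1.0
print(voxel_iou(grid, np.roll(grid, 1, axis=0)))   # drops after a 1-voxel shift
```

Both metrics penalize any geometric misalignment uniformly, with no notion of which errors a human viewer would actually notice.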
Would you like me to go deeper into the history of the big shifts (like voxel → point cloud → implicit field → NeRF → diffusion) and why each one was such a leap, or would you prefer a focus on practical tips for learning 3D methods yourself?
Task 2. 3D shape images
I found something called the Small Stellated Truncated Dodecahedron (Link).

3D image of a Small Stellated Truncated Dodecahedron (taken from Wikipedia, linked above).
Task 3. Funny GIF
I really like cats!

Taken from Giphy (Link); the frame rate was reduced to meet the file-size limit.