DreamGaussian: The Stable Diffusion Moment of AIGC 3D Generation

October 13, 2023 • 3 min read

It's hard to believe a year has passed since I released the 3D generation tool Dreamfields-3D last October (see my earlier post: "I, a programming novice, built an AI tool for generating 3D models from text (open-sourced)"). Many AIGC 3D generation algorithms have appeared recently, and the latest, DreamGaussian, feels like we have nearly reached the Stable Diffusion moment for 3D generation: it challenges the "impossible triangle" of quality, speed, and generalization!

Example of the quality of a mesh exported from DreamGaussian.

🥰DreamGaussian = DreamFusion + 3D Gaussian Splatting iteration + mesh export + SD-optimized texture.

DreamGaussian technical framework.

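✨DreamFusion is the first algorithm to transplant the 2D generation capabilities of a text-to-image diffusion model to 3D NeRF generation. The original paper used Imagen, and open-source reimplementations use Stable Diffusion. It optimizes the 3D representation iteratively with what is known as the SDS (Score Distillation Sampling) loss, detailed in my earlier popular science article: From Hand Modeling to Mouth Modeling: An Overview of the Latest AI Algorithms for Generating 3D Models from Text.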
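To make the SDS idea concrete, here is a minimal toy sketch in pure Python. It is not DreamFusion's or DreamGaussian's code. The "renderer" is the identity (the parameters are the image itself), the noise schedule is made up, and `toy_noise_predictor` is a hypothetical stand-in for a real pretrained denoiser: it simply assumes the clean image is a fixed `target`. What it does show is the shape of the update: add noise to the render, ask the denoiser what noise it sees, and push the parameters along the weighted difference `w(t) * (eps_hat - eps)`.

```python
import math
import random


def toy_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for a diffusion model's noise prediction.

    A real pipeline would query a pretrained, text-conditioned denoiser here.
    This toy version simply "believes" the clean image is `target`.
    """
    scale = math.sqrt(1.0 - alpha_bar)
    return [(xt - math.sqrt(alpha_bar) * tg) / scale for xt, tg in zip(x_t, target)]


def sds_step(theta, target, lr, rng):
    """One SDS-style update with an identity renderer (dx/dtheta = I)."""
    x = list(theta)                      # "render" the current parameters
    t = rng.uniform(0.02, 0.98)          # sample a random noise level
    alpha_bar = 1.0 - t                  # toy noise schedule (assumption)
    eps = [rng.gauss(0.0, 1.0) for _ in x]
    # Forward diffusion: mix the render with Gaussian noise.
    x_t = [math.sqrt(alpha_bar) * xi + math.sqrt(1.0 - alpha_bar) * ei
           for xi, ei in zip(x, eps)]
    eps_hat = toy_noise_predictor(x_t, alpha_bar, target)
    w = 1.0 - alpha_bar                  # weighting w(t) = sigma_t^2
    # SDS gradient: w(t) * (predicted noise - injected noise) * dx/dtheta.
    grad = [w * (eh - e) for eh, e in zip(eps_hat, eps)]
    return [th - lr * g for th, g in zip(theta, grad)]


def optimize(theta, target, steps=200, lr=0.5, seed=0):
    """Run repeated SDS-style steps from an initial parameter vector."""
    rng = random.Random(seed)
    for _ in range(steps):
        theta = sds_step(theta, target, lr, rng)
    return theta


if __name__ == "__main__":
    print(optimize([0.0, 0.0, 0.0], [0.2, 0.5, 0.9]))
```

In this toy setup the parameters drift toward whatever the denoiser considers a plausible image. In a real system, the gradient would additionally be backpropagated through a differentiable 3D renderer from many camera views.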

👀3D Gaussian Splatting is currently the state-of-the-art radiance field method for 3D scene representation. Instead of a neural network, it represents the scene with explicit 3D Gaussians, achieving the best quality and speed in 3D scene reconstruction (see this popular science article: Next-Generation NeRF, Gaussian Splatting is Here~).
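As a rough illustration of how splatting turns Gaussians into a pixel color, the sketch below composites a few already-projected 2D Gaussians front to back at a single pixel. It is a simplification under stated assumptions: the Gaussians are isotropic (one `sigma` each, whereas the real method uses full anisotropic covariances and view-dependent color), the dictionary layout is hypothetical, and there is no tiling or GPU rasterization.

```python
import math


def gaussian_alpha(pixel, center, sigma, opacity):
    """Alpha contributed by one isotropic 2D Gaussian at `pixel`."""
    dx = pixel[0] - center[0]
    dy = pixel[1] - center[1]
    return opacity * math.exp(-0.5 * (dx * dx + dy * dy) / (sigma * sigma))


def composite_pixel(pixel, splats):
    """Front-to-back alpha compositing: C = sum_i c_i * a_i * prod_{j<i} (1 - a_j).

    `splats` is a list of dicts with hypothetical keys:
    center (x, y), sigma, opacity, color (r, g, b), depth.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0                  # fraction of light not yet blocked
    for s in sorted(splats, key=lambda s: s["depth"]):  # nearest first
        a = gaussian_alpha(pixel, s["center"], s["sigma"], s["opacity"])
        for k in range(3):
            color[k] += transmittance * a * s["color"][k]
        transmittance *= 1.0 - a
    return color


if __name__ == "__main__":
    splats = [
        {"center": (0.0, 0.0), "sigma": 1.0, "opacity": 0.5,
         "color": (1.0, 0.0, 0.0), "depth": 1.0},   # semi-transparent red, in front
        {"center": (0.0, 0.0), "sigma": 1.0, "opacity": 1.0,
         "color": (0.0, 0.0, 1.0), "depth": 2.0},   # opaque blue, behind
    ]
    print(composite_pixel((0.0, 0.0), splats))
```

Because every step is a simple differentiable operation, gradients from an image-space loss can flow back to each Gaussian's position, size, opacity, and color, which is what makes the representation fast to optimize.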

💕By combining the two, we can use SD + 3D Gaussians to quickly (within 1 minute) optimize a rough Gaussian field from text or image input, then export the Gaussian field as a mesh with a UV texture and use SD to keep refining the texture without changing the mesh (also within 1 minute).

😍The result of the whole pipeline: SOTA (state-of-the-art) quality + short iteration time (about 1.5 minutes on a 3090) + support for multiple tasks (text-to-3D and image-to-3D) + strong generalization (inherited from the generalization ability of the SD model).

Examples of DreamGaussian converting images to 3D models.

🤔However, 3D Gaussian Splatting has only been open source for three months, so there is still significant room for optimization. In addition, since the release of DreamFusion, many other algorithms that leverage SD for 3D generation have emerged, and the various performance-improving tricks they introduced have yet to be incorporated here. So over the coming year, 3D generation algorithms will iterate rapidly, moving from the "SD 1.0" era to the "ControlNet + SDXL" era. (P.S.: Who remembers that Stable Diffusion was only released last August, just over a year ago? We use SD so naturally now, it's almost like breathing. 🐶)