abstract:639be174d98d26f1.tex

1: \begin{abstract}

2: Score distillation has emerged as one of the most prevalent approaches for text-to-3D asset synthesis.

3: Essentially, score distillation updates 3D parameters by lifting and back-propagating scores averaged over different views.

4: In this paper, we reveal that gradient estimation in score distillation inherently suffers from high variance.

5: Through the lens of variance reduction, the effectiveness of score distillation sampling (SDS) and variational score distillation (VSD) can be interpreted as the application of different control variates to the Monte Carlo estimator of the distilled score.

6: Motivated by this reinterpretation and based on Stein's identity, we propose a more general solution to reduce variance for score distillation, termed \textit{Stein Score Distillation (SSD)}. SSD incorporates control variates constructed via Stein's identity,

7: allowing for arbitrary baseline functions. This enables us to include flexible guidance priors and network architectures to explicitly optimize for variance reduction.

8: In our experiments, the overall pipeline, dubbed \textit{SteinDreamer}, is implemented by instantiating the control variate with a monocular depth estimator.

9: The results suggest that SSD can effectively reduce the distillation variance and consistently improve visual quality for both object- and scene-level generation.

10: Moreover, we demonstrate that SteinDreamer achieves faster convergence than existing methods due to more stable gradient updates.

11: \end{abstract}

12: