1: \begin{abstract}
2: \textcolor{black}{
3: Text-to-3D generation has attracted much attention from the computer vision community. Existing methods mainly optimize a neural field from scratch for each text prompt, incurring a heavy and repetitive training cost that impedes their practical deployment.
4: %
5: In this paper, we propose a novel framework for fast text-to-3D generation, dubbed {\name}.
6: Once trained, {\name} is able to create a 3D object for an unseen text prompt in less than one second with a single run of a feedforward network. 
7: %
8: We achieve this remarkable speed by devising a new network that directly constructs a 3D triplane from a text prompt.
9: The core innovation of {\name} lies in our exploration of strategies to effectively inject text conditions into the network. In particular, we propose to combine three key mechanisms, namely cross-attention, style injection, and token-to-plane transformation, which collectively ensure precise alignment of the output with the input text.
10: %
11: Furthermore, we propose a simple yet effective activation function, the scaled-sigmoid, as a replacement for the original sigmoid function; it speeds up training convergence by more than ten times.
12: %
13: Finally, to address the Janus (multi-head) problem in 3D generation, we propose an adaptive Perp-Neg algorithm that can dynamically adjust its concept negation scales according to the severity of the Janus problem during training, effectively reducing the multi-head effect.
14: %
15: Extensive experiments on a wide variety of benchmark datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods both qualitatively and quantitatively, while achieving significantly better efficiency.
16: % The project page is at \url{https://ming1993li.github.io/Instant3DProj/}.
17: The code, data, and models are available at \url{https://github.com/ming1993li/Instant3DCodes}.}
18: 
19: \keywords{Text-to-3D Generation \and Large-Scale Generative Models \and Neural Radiance Fields}
20: \end{abstract}
21: