Acknowledgements
We would like to express our deepest gratitude to Ben Poole for helpful suggestions, guidance, and contributions. We also thank George Kopanas, Sander Dieleman, Matthew Burruss, Matthew Levine, Peter Hedman, Songyou Peng, Rundi Wu, Alex Trevithick, Daniel Duckworth, Hadi Alzayer, David Charatan, Jiapeng Tang, and Akshay Krishnan for valuable discussions and insights. Finally, we extend our gratitude to Shlomi Fruchter, Kevin Murphy, Mohammad Babaeizadeh, Han Zhang, and Amir Hertz for training the base text-to-image latent diffusion model. The website template is borrowed from CAT3D and CAT4D.