**New Method Uses Pre-trained Image-Text Diffusion Models and a GAN to Generate High-Quality, Stylized 3D Avatars**
This paper presents a novel method for creating high-quality, stylized 3D avatars using pre-trained image-text diffusion models and a Generative Adversarial Network (GAN). The diffusion models supply broad prior knowledge of appearance and geometry, which the method uses to generate multi-view training images of avatars in various styles; during this data generation, poses extracted from existing 3D models guide the image synthesis. Because the generated images are only loosely aligned with the conditioning poses, the authors use view-specific prompts during generation and design a coarse-to-fine discriminator for GAN training. To increase avatar diversity, they further explore attribute-related prompts. Finally, a latent diffusion model trained within StyleGAN's style space enables avatar generation conditioned on image inputs. The method outperforms prior approaches in both the visual quality and the diversity of the produced avatars.
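As a rough illustration of the data-generation step, the sketch below shows how pose-guided, view-specific image synthesis might look using an off-the-shelf pose-conditioned ControlNet pipeline from the `diffusers` library. The model IDs, the prompt templates, and the pre-rendered pose map are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch: pose-guided multi-view data generation with view-specific
# prompts. Model IDs and prompt templates are assumptions for illustration.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Pose-conditioned ControlNet on top of a pre-trained text-to-image model.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# View-specific prompt templates: naming the camera view in the prompt
# nudges the diffusion model to agree with the rendered pose map.
VIEW_TEMPLATES = {
    "front": "front view of the head of a {style} character",
    "side": "side view of the head of a {style} character",
    "back": "back view of the head of a {style} character",
}

def generate_view(pose_map, view, style, seed=0):
    """Generate one avatar image whose pose follows `pose_map`.

    `pose_map` is a PIL image rendered from an existing 3D model at the
    camera angle matching `view` (rendering code not shown).
    """
    prompt = VIEW_TEMPLATES[view].format(style=style)
    generator = torch.Generator("cuda").manual_seed(seed)
    return pipe(
        prompt,
        image=pose_map,
        num_inference_steps=30,
        generator=generator,
    ).images[0]
```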
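The coarse-to-fine discriminator is only named here, not specified. One plausible realization, sketched below under that assumption, is a multi-scale patch discriminator in which poorly pose-aligned samples supervise only the coarse (downsampled) levels while well-aligned samples are judged up to full resolution; the level assignment and network sizes are illustrative.

```python
# Hedged sketch of a coarse-to-fine discriminator: misaligned samples get
# coarse (low-resolution) supervision only. Sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchDiscriminator(nn.Module):
    """A small patch discriminator; returns one logit per image."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch * 2, 1, 3, stride=1, padding=1),
        )

    def forward(self, x):
        return self.net(x).mean(dim=(1, 2, 3))  # average the patch logits

class CoarseToFineDiscriminator(nn.Module):
    """One discriminator per resolution level; level 0 is the coarsest."""
    def __init__(self, num_levels=3):
        super().__init__()
        self.levels = nn.ModuleList(
            PatchDiscriminator() for _ in range(num_levels)
        )

    def forward(self, img, finest_level):
        # Poorly pose-aligned samples pass a small `finest_level`, so they
        # are judged only on coarse structure.
        logits = []
        for level in range(finest_level + 1):
            scale = 2 ** (len(self.levels) - 1 - level)
            x = F.avg_pool2d(img, scale) if scale > 1 else img
            logits.append(self.levels[level](x))
        return logits
```

During training, each sample's `finest_level` would come from some measure of agreement between the conditioning pose and the generated image; how that score is computed is not specified here.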
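For the image-conditioned stage, one way to picture a latent diffusion model in StyleGAN's style space is a small denoiser over style vectors trained with a standard DDPM epsilon-prediction objective, conditioned on an image embedding. The architecture, the 512-dimensional sizes, and the choice of image encoder are assumptions for illustration, not the paper's design.

```python
# Hedged sketch: diffusion over StyleGAN style vectors, conditioned on an
# image embedding (e.g., from a frozen image encoder; choice assumed).
import torch
import torch.nn as nn

class StyleDenoiser(nn.Module):
    """Predicts the noise added to a style vector `w` (dimensions assumed)."""
    def __init__(self, w_dim=512, cond_dim=512, hidden=1024):
        super().__init__()
        self.time_embed = nn.Sequential(
            nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, hidden)
        )
        self.net = nn.Sequential(
            nn.Linear(w_dim + cond_dim + hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, w_dim),
        )

    def forward(self, w_noisy, t, image_embed):
        temb = self.time_embed(t.float().unsqueeze(-1))
        return self.net(torch.cat([w_noisy, image_embed, temb], dim=-1))

def diffusion_loss(model, w, image_embed, alphas_cumprod):
    """One DDPM-style training step: noise a clean style vector at a random
    timestep and regress the added noise."""
    t = torch.randint(0, len(alphas_cumprod), (w.size(0),), device=w.device)
    eps = torch.randn_like(w)
    a = alphas_cumprod[t].unsqueeze(-1)
    w_noisy = a.sqrt() * w + (1 - a).sqrt() * eps
    return nn.functional.mse_loss(model(w_noisy, t, image_embed), eps)
```

At inference, sampling this model from noise, conditioned on a reference image's embedding, would yield a style vector that the trained generator decodes into an avatar.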
=> https://huggingface.co/papers/2305.19012