Capability
Multi-Model Video Generation With Unified Interface
20 artifacts provide this capability.
Top Matches
via “video generation with 3d unet and temporal consistency”
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
Unique: Uses Unet3D, built from 3D convolutions and temporal attention, to generate videos while sharing its architecture with the image-generation U-Net. This enables transfer learning from image models and flexible handling of frame counts.
vs others: Extends the cascading diffusion architecture to the temporal domain with 3D convolutions rather than relying on separate video models, enabling a unified text-to-image-to-video pipeline with shared conditioning mechanisms.
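
To make the "temporal attention" and "flexible frame count" points concrete, here is a minimal sketch of attention applied along the time axis for a single spatial location. It is an illustration only, not the library's Unet3D code: the function name `temporal_attention` is hypothetical, and it assumes queries, keys, and values are the raw frame features, whereas a real model would use learned projections and multiple heads. It uses only the Python standard library.

```python
import math


def temporal_attention(frames):
    """Self-attention across the time axis for one spatial location.

    frames: list of T feature vectors (each a list of floats).
    Returns T vectors; each is a softmax-weighted average of all frames.
    Assumption: queries, keys and values are the raw features
    (no learned projections), purely for illustration.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for query in frames:
        # Scaled dot-product score between this frame and every frame.
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in frames]
        # Numerically stable softmax over the time axis.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the frame features.
        out.append(
            [sum(w * value[i] for w, value in zip(weights, frames)) for i in range(dim)]
        )
    return out


# An image can be treated as a one-frame video; a clip has several frames.
image = [[0.5, -1.0, 2.0]]
video = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

image_out = temporal_attention(image)
video_out = temporal_attention(video)
```

Because the softmax runs over however many frames are supplied, the same function accepts any frame count. With a single frame the only attention weight is 1, so the input passes through unchanged, which is one way to see how an image path and a video path can share layers.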