Quantized Attention achieves a 2-5x speedup over FlashAttention and a 3-11x speedup over xformers, without losing end-to-end accuracy across language, image, and video models.
Updated Jul 21, 2025 - Cuda
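As a rough illustration of where that speedup comes from: attention kernels like this quantize Q and K to INT8 so the QK^T matmul can run on fast low-precision tensor cores, then dequantize before the softmax. The sketch below is a hedged, conceptual PyTorch emulation under assumed per-tensor symmetric quantization; the function names are illustrative and it is not this repository's actual CUDA kernel or API.

```python
import torch

def quantize_int8(x: torch.Tensor):
    """Per-tensor symmetric INT8 quantization (illustrative, not the repo's scheme)."""
    scale = x.abs().amax().clamp(min=1e-8) / 127.0
    q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def quantized_attention(q, k, v):
    # Quantize Q and K to INT8; real kernels run the QK^T matmul on
    # INT8 tensor cores, which is the source of the speedup. Here the
    # low-precision matmul is emulated in float for portability.
    q_i8, q_scale = quantize_int8(q)
    k_i8, k_scale = quantize_int8(k)
    scores = q_i8.float() @ k_i8.float().transpose(-2, -1)
    # Dequantize and apply the usual 1/sqrt(d) attention scaling.
    scores = scores * (q_scale * k_scale) / q.shape[-1] ** 0.5
    attn = torch.softmax(scores, dim=-1)
    return attn @ v  # V stays in full precision in this sketch

# Toy shapes: (batch, heads, seq_len, head_dim)
q, k, v = (torch.randn(1, 8, 128, 64) for _ in range(3))
print(quantized_attention(q, k, v).shape)  # torch.Size([1, 8, 128, 64])
```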
[CVPR-2025] The official code of HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation