
The Eagerly Awaited Wan 2.2 Launches Soon: A Day of Community Madness

Unlike closed-source alternatives, you get complete access to the source code and model weights, and you can run it on your own hardware. Discover how the open-source community uses Wan2.2 to create professional cinematic videos with MoE architecture and advanced motion control. Images are created with aesthetic-data fine-tuning for lighting, composition, and color, so they work seamlessly with Wan2.2 video models.

Run Speech-to-Video Generation

  • In the two-stage workflow, each expert model uses its own KSampler, with the total sampling steps split evenly between the high-noise and low-noise models.
  • This repository supports the Wan2.2-S2V-14B Speech-to-Video model, generating video at both 480P and 720P resolutions.
  • Create high-quality audio and video content easily with the open-source Wan 2.2 video generation model.
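A minimal command-line sketch of a speech-to-video run follows. Only --num_clip (and the memory flags discussed later on this page) are quoted from the documentation here; the generate.py entry point, task name, and the --size, --ckpt_dir, --prompt, --image, and --audio arguments are assumptions modeled on the repository's conventions.

```bash
# Hedged sketch of a Wan2.2-S2V-14B speech-to-video run; adjust paths,
# resolution, and prompt to your setup.
python generate.py --task s2v-14B --size 1280*720 \
  --ckpt_dir ./Wan2.2-S2V-14B \
  --prompt "A singer performs on a dimly lit stage" \
  --image ./reference.jpg \
  --audio ./vocal.wav \
  --num_clip 1   # documented flag: generate one clip for a quick preview
```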

To enable more efficient deployment, Wan2.2 also explores a high-compression design. In addition to the 27B MoE models, a 5B dense model, TI2V-5B, is released. With an additional patchification layer, the total compression ratio of TI2V-5B reaches $4\times32\times32$ (temporal × height × width).
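To make that ratio concrete, here is a short worked example, assuming the ratio factors as temporal × height × width and the usual $4k+1$ frame convention:

$$121 \times 704 \times 1280 \;\text{(frames} \times \text{pixels)} \;\longrightarrow\; \left\lceil \tfrac{121}{4} \right\rceil \times \tfrac{704}{32} \times \tfrac{1280}{32} = 31 \times 22 \times 40,$$

so a roughly five-second 704×1280 clip is compressed to a latent grid of about 31 × 22 × 40 before the diffusion model ever sees it.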

  • 💡The --num_clip parameter controls the number of video clips generated, useful for quick previews with shorter generation time.
  • Prepare visuals with cinematic aesthetic data and fine-grained control over lighting and composition.
  • 💡If you encounter OOM (Out-of-Memory) issues, you can use the --offload_model True, --convert_model_dtype, and --t5_cpu options to reduce GPU memory usage, as shown in the sketch after this list.
  • Transform your ideas into stunning videos with Wan 2 AI’s cutting-edge video generation technology.
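The memory tips above translate directly into a command line. Below is a low-VRAM sketch for the 27B MoE text-to-video model; only the three memory flags are quoted from this page, while the entry point, task name, and paths are assumed conventions from the repository.

```bash
# --offload_model True  : offload model weights between forward passes
# --convert_model_dtype : cast model parameters to a lighter dtype
# --t5_cpu              : keep the T5 text encoder on the CPU
python generate.py --task t2v-A14B --size 1280*720 \
  --ckpt_dir ./Wan2.2-T2V-A14B \
  --offload_model True --convert_model_dtype --t5_cpu \
  --prompt "A cinematic dolly shot through a rain-soaked neon alley"
```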

Instareal is a new foundational LoRA engineered to deliver an exceptional level of photorealism for Wan 2.2. It can be used as a standalone realism engine or stacked with our other models for advanced control. Download the models from GitHub, try our online demo, or access ready-to-use deployments on Hugging Face. Comprehensive documentation and community support are available to help you get started. Wan2.2 is trained on 65.6% more images than previous versions, enhancing generalization across motions, semantics, and aesthetics. Wan2.2 is fully open-source, with no licensing fees for most use cases.

Download Wan2.2 on GitHub, create cinematic videos at 720P, and join a global community of developers and creators pushing the boundaries of AI video generation. 💡The --pose_video parameter enables pose-driven generation, allowing the model to follow specific pose sequences while generating videos synchronized with audio input. Wan2.2 builds on the foundation of Wan2.1 with notable improvements in generation quality and model capability. This upgrade is driven by a series of key technical innovations, mainly the Mixture-of-Experts (MoE) architecture, upgraded training data, and high-compression video generation.
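As a sketch, pose-driven generation would add the documented --pose_video flag to an otherwise ordinary speech-to-video invocation; every argument below except --pose_video itself is an assumed convention.

```bash
python generate.py --task s2v-14B --size 1280*720 \
  --ckpt_dir ./Wan2.2-S2V-14B \
  --image ./reference.jpg \
  --audio ./vocal.wav \
  --pose_video ./pose_sequence.mp4   # follow this pose sequence while syncing to the audio
```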

Model Download

This repository supports the following Wan2.2 models, each generating video at both 480P and 720P resolutions:

  • Wan2.2-T2V-A14B: Text-to-Video
  • Wan2.2-I2V-A14B: Image-to-Video
  • Wan2.2-S2V-14B: Speech-to-Video

Optimize images for seamless integration with Wan2.2’s T2V and I2V models for consistent high-quality video output.
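One plausible way to fetch the weights, assuming the checkpoints are published on Hugging Face under a Wan-AI organization with matching repository names:

```bash
pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.2-T2V-A14B --local-dir ./Wan2.2-T2V-A14B
huggingface-cli download Wan-AI/Wan2.2-I2V-A14B --local-dir ./Wan2.2-I2V-A14B
huggingface-cli download Wan-AI/Wan2.2-S2V-14B  --local-dir ./Wan2.2-S2V-14B
```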

To facilitate implementation, we will start with a basic version of the inference process that skips the prompt extension step. The TI2V-5B model is optimized to run on a single consumer-grade GPU such as the RTX 4090.
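A basic inference sketch for that setup, skipping prompt extension and using the documented memory flags so the 5B model fits on a single card; the entry point, task name, and resolution are assumptions.

```bash
python generate.py --task ti2v-5B --size 1280*704 \
  --ckpt_dir ./Wan2.2-TI2V-5B \
  --offload_model True --convert_model_dtype --t5_cpu \
  --prompt "Two anthropomorphic cats boxing on a spotlit stage"
```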

How Does Wan 2.2 Differ from Wan 2.1?

Wan2.2 introduces Mixture-of-Experts architecture that separates the denoising process across timesteps with specialized expert models. This enlarges model capacity while maintaining computational efficiency. Transform your ideas into cinematic masterpieces using Wan2.2’s advanced MoE architecture. Create stunning 720P videos with precise prompt following and sweeping motion control. Perfect for filmmakers and content creators seeking professional results. 💡The model can generate videos from audio input combined with reference image and optional text prompt.
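One compact way to picture the split, writing $t_{\text{moe}}$ for the switch point (an illustrative symbol, not the repository's notation): the high-noise expert handles the early, noisy timesteps and the low-noise expert refines the later ones,

$$\hat{v}_\theta(x_t, t) = \begin{cases} \text{expert}_{\text{high}}(x_t, t), & t \ge t_{\text{moe}},\\ \text{expert}_{\text{low}}(x_t, t), & t < t_{\text{moe}}. \end{cases}$$

Only one expert is active at any given timestep, which is why total capacity grows without a matching increase in per-step compute.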

Wan2.2-I2V-A14B: Image-to-Video Model

Create images specifically designed for seamless video integration and animation. Wan2.2 introduces Mixture-of-Experts architecture into video diffusion models, enlarging model capacity while maintaining computational efficiency. Achieve professional cinematic narratives through precise command of shot language.

Commercial licensing options are available for enterprise solutions requiring additional support and features. 💡If you’re using Wan-Animate, we do not recommend using LoRA models trained on Wan2.2, since weight changes during training may lead to unexpected behavior. 💡If you are running on a GPU with at least 80GB of VRAM, you can remove the --offload_model True, --convert_model_dtype, and --t5_cpu options to speed up execution; if you encounter OOM (Out-of-Memory) issues, add them back to reduce GPU memory usage. From the creators of Instagirl, Instareal is the next step in our pursuit of perfect photorealism: a specialized foundational LoRA for Wan 2.2, engineered to serve as a high-fidelity realism engine for your creations.
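Conversely, on an 80GB-class GPU the same run can simply drop the three memory flags, as the tip above suggests; everything here besides those removed flags is an assumed convention.

```bash
python generate.py --task t2v-A14B --size 1280*720 \
  --ckpt_dir ./Wan2.2-T2V-A14B \
  --prompt "A cinematic dolly shot through a rain-soaked neon alley"
```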

Wan2.2 (MoE), our final version, achieves the lowest validation loss of the compared settings, indicating that its generated video distribution is closest to the ground truth and that it exhibits superior convergence. Experience Wan2.2, the world’s first open-source MoE video generation model, and create professional cinematic videos from text or images at 720P resolution.

Wan2.2 incorporates specially curated aesthetic data with precise labels for lighting, composition, and color for controllable cinematic style. The models in this repository are licensed under the Apache 2.0 License. We claim no rights over your generated content, granting you the freedom to use it while ensuring that your usage complies with the provisions of this license.


Current state-of-the-art (SOTA) methods for audio-driven character animation demonstrate promising performance for scenarios primarily involving speech and singing, yet they fall short of film-level character animation. To address this long-standing challenge, we propose an audio-driven model, which we refer to as Wan-S2V, built upon Wan. Our model achieves significantly enhanced expressiveness and fidelity in cinematic contexts compared to existing approaches. We conducted extensive experiments, benchmarking our method against cutting-edge models such as Hunyuan-Avatar and Omnihuman.

The experimental results consistently demonstrate that our approach significantly outperforms these existing solutions. Additionally, we explore the versatility of our method through its applications in long-form video generation and precise video lip-sync editing. To validate the effectiveness of the MoE architecture, four settings are compared based on their validation loss curves.