zeroscope v2 xl Watermark-free, ModelScope-based video model generating high-quality video at 1024x576 (16:9), to be used with the text2video extension for automatic1111
StoryDiffusion Consistent Long-Range Image and Video Generation
Open-Sora Open implementation approach for video generation
CogVideo SOTA video generation and consistency, generating 6 seconds of video at 8 fps and 720x480 using 18-36GB VRAM
Pyramid-Flow is a highly efficient autoregressive video generation method that leverages flow matching for improved computational efficiency, capable of generating high-quality 10-second videos at 768p resolution and 24 FPS, and supporting image-to-video generation.
HunyuanVideo Tencent's open-weight video-generation model
mochi-1 State-of-the-art video generation model by Genmo with high-fidelity motion and strong prompt adherence
Wan2.1 is an open large-scale video generative model that excels in multiple tasks, including text-to-video and video editing, while achieving SOTA performance on consumer-grade GPUs
SkyReels-V2 An advanced video generation model available in 1.3B and 14B variants, capable of producing unlimited-duration videos for both text-to-video and image-to-video tasks, and reported to outperform leading models like HunyuanVideo-13B and Wan2.1-14B
Magi-1 Autoregressive video generation model that enables unlimited-duration video creation with precise control over timing and dynamics, supporting text-to-video, image-to-video, and video-to-video tasks while leading the Physics-IQ Benchmark
Segment and Track Anything, code. A framework combining the Segment Anything Model (SAM) and the DeAOT tracking model to enable precise, multimodal object tracking in video, demonstrating superior performance in benchmarks
Track Anything, code. Extends the Segment Anything Model (SAM) to achieve high-performance, interactive tracking and segmentation in videos with minimal human intervention, addressing SAM's limitations in consistent video segmentation
MAGVIT Single model for multiple video synthesis tasks, outperforming existing methods in quality and inference time, code and models, paper
FastSAM Fast Segment Anything, a CNN that achieves performance comparable to the SAM method at 50× higher run-time speed.
SAM-PT Extending SAM to zero-shot video segmentation with point-based tracking, paper
DEVA Tracking Anything with Decoupled Video Segmentation, paper
Cutie Putting the Object Back into Video Object Segmentation, paper
Instant-ngp Train NeRFs in under 5 seconds on Windows/Linux with GPU support
NeRFstudio A collaboration-friendly studio for NeRFs that simplifies creating, training, and testing NeRFs and supports a web-based visualizer, benchmarks, and pipelines.
Threestudio A framework for 3D content creation from text prompts, single images, and few-shot images, or from a single image created with text2image and then lifted to 3D
Zero-1-to-3 Zero-shot One Image to 3D Object for novel view synthesis and 3D reconstruction
localrf NeRFs for reconstructing large-scale stabilized scenes from shaky videos, paper, project page
gaussian-splatting reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering", paper
4d-gaussian-splatting Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting, paper