PhotoMaker Rapid identity customization within seconds and with no additional LoRA training, preserving identity with high fidelity while keeping text controllability; can also serve as an adapter for other models
StableCascade Successor to Stable Diffusion by Stability AI, using a smaller latent space for faster inference and better image quality
ConsistentID Portrait Generation with Multimodal Fine-Grained Identity Preservation
Flux Black Forest Labs, founded by former Stability AI staff, built the SOTA text-to-image models Flux and Flux schnell, a 12B parameter transformer capable of rendering text and following complex prompts; Flux schnell is released under the Apache 2.0 license
Lumina-mGPT multimodal autoregressive LLMs capable of generating flexible and photorealistic images from text descriptions
stable-dreamfusion A PyTorch implementation of the text-to-3D model Dreamfusion using the Stable Diffusion text-to-2D model
image to 3d:
Wonder3D A cross-domain diffusion model for 3D reconstruction from a single image
DreamCraft3D Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Spann3R is a transformer-based model for dense 3D reconstruction from images, with spatial memory to track and predict 3D structures and capable of real-time processing
stable-diffusion.cpp CPU inference of Stable Diffusion in pure C/C++ based on ggml, with 16/32-bit float and 4/5/8-bit quantization, AVX/AVX2/AVX512 support, SD1.x and SD2.x models, and txt2img/img2img modes
FaceFusion Next generation face swapper and enhancer
StabilityMatrix is a portable package manager and UI for Stable Diffusion front ends such as Forge, SD.Next and ComfyUI, with bundled Git and Python dependencies and features like syntax highlighting, workspace management, and model browsing
OneDiff is a PyTorch-based acceleration library for diffusion models, offering out-of-the-box speedups and support for a broad range of models and NVIDIA GPUs
sd-webui-EasyPhoto / easyphoto plugin for generating AI portraits; trains a digital doppelganger from 5-10 photos with a quick LoRA fine-tune, paper
StableTuner Windows GUI for fine-tuning / DreamBooth training of Stable Diffusion models (abandoned)
SimpleTuner Fine-tuning for Stable Diffusion, PixArt and Flux with LoRA and full U-Net training, multi-GPU support, and DeepSpeed
x-flux LoRA and ControlNet training scripts for the Flux model by Black Forest Labs, using DeepSpeed
ORCa converts glossy objects into radiance-field cameras, enabling depth estimation and novel-view synthesis, project, code
cocktail Mixing Multi-Modality Controls for Text-Conditional Image Generation, project, code
SnapFusion Fast text-to-image diffusion on mobile phones in 2 seconds
Objaverse-xl dataset of 10 million annotated high quality 3D objects, hf
LightGlue Local Feature Matching at Light Speed, a lightweight feature matcher with high accuracy and blazing fast inference. It takes as input a set of keypoints and descriptors for each image and returns the indices of corresponding points
ml-mgie Guiding Instruction-based Image Editing via Multimodal Large Language Models
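The LightGlue entry above describes a matcher that takes keypoints and descriptors for each image and returns the indices of corresponding points. As a rough illustration of that input/output contract only, here is a minimal sketch that uses plain mutual nearest-neighbour matching on descriptors. This is an assumption made for illustration, not LightGlue's learned matcher or its API; the function names `nearest` and `mutual_matches` and the toy descriptors are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest(query, candidates):
    """Return the index of the candidate descriptor closest to `query`."""
    return min(range(len(candidates)), key=lambda j: dist(query, candidates[j]))


def mutual_matches(desc_a, desc_b):
    """Return (i, j) index pairs where desc_a[i] and desc_b[j] are each
    other's nearest neighbour. Hypothetical stand-in for a learned matcher."""
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    # Keep a pair only if the nearest-neighbour relation holds in both directions.
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy 2-D descriptors (real descriptors are much higher dimensional).
desc_a = [(0.0, 1.0), (1.0, 0.0), (5.0, 4.0)]
desc_b = [(0.9, 0.1), (0.1, 0.9)]

matches = mutual_matches(desc_a, desc_b)
for i, j in matches:
    print(f"image A point {i} <-> image B point {j}")
```

The result is a list of index pairs into the two keypoint sets; the third descriptor in `desc_a` has no mutual partner and is left unmatched. A learned matcher replaces the nearest-neighbour rule with a trained model but returns correspondences in a similar index-based form.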