<h1 align="center">LocalAI</h1>
💡 Get help - ❓ FAQ 💭 Discussions 💬 Discord 📖 Documentation website

💻 Quickstart 🖼️ Models 🚀 Roadmap 🌍 Explorer 🛫 Examples
LocalAI is the free, Open Source OpenAI alternative. LocalAI acts as a drop-in replacement REST API that's compatible with the OpenAI (and Elevenlabs, Anthropic, ...) API specifications for local AI inferencing. It allows you to run LLMs, generate images and audio (and more) locally or on-prem with consumer-grade hardware, supporting multiple model families. It does not require a GPU. It is created and maintained by Ettore Di Giacinto.
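To illustrate the drop-in compatibility, here is a minimal sketch of a request against LocalAI's OpenAI-compatible chat endpoint. It assumes a server running on the default port 8080 and an already-installed model named `llama-3.2-1b-instruct:q4_k_m` (see the Quickstart below):

```bash
# Query LocalAI exactly as you would the OpenAI API, just with a local base URL.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-1b-instruct:q4_k_m",
    "messages": [{"role": "user", "content": "How are you?"}]
  }'
```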
🚀 LocalAI is now part of a comprehensive suite of AI tools designed to work together:
| [LocalAGI](https://github.com/mudler/LocalAGI) | [LocalRecall](https://github.com/mudler/LocalRecall) |
|---|---|
| A powerful Local AI agent management platform that serves as a drop-in replacement for OpenAI's Responses API, enhanced with advanced agentic capabilities. | A RESTful API and knowledge base management system that provides persistent memory and storage capabilities for AI agents. |
Screenshots of the WebUI: Talk Interface, Generate Audio, Models Overview, Generate Images, Chat Interface, Home, Login, Swarm.
Run the installer script:
```bash
# Basic installation
curl https://localai.io/install.sh | sh
```
For more installation options, see Installer Options.
Note: the macOS DMGs are not signed by Apple, so they are quarantined by Gatekeeper. See https://github.com/mudler/LocalAI/issues/6268 for a workaround; the fix is tracked in https://github.com/mudler/LocalAI/issues/6244.
Or run with docker:
💡 Docker Run vs Docker Start

`docker run` creates and starts a new container; if a container with the same name already exists, the command fails. `docker start` starts an existing container that was previously created with `docker run`. If you've already run LocalAI before and want to start it again, use:

```bash
docker start -i local-ai
```
```bash
# Create and start a new container (CPU image)
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
```
```bash
# CUDA 12.0
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12
# CUDA 11.7
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-11
# NVIDIA Jetson (L4T) ARM64
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64
```
```bash
# AMD GPU (ROCm)
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas
# Intel GPU (SYCL)
docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel
# Vulkan
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan
```
```bash
# CPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
# NVIDIA CUDA 12 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12
# NVIDIA CUDA 11 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-11
# Intel GPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel
# AMD GPU version
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas
```
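Once any of these containers is running, a quick sanity check is to list the models the instance exposes through the OpenAI-compatible models endpoint (assuming the default port mapping above; the AIO images ship with a pre-configured set of models):

```bash
# Confirm the API is reachable and see which models are available
curl http://localhost:8080/v1/models
```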
For more information about the AIO images and pre-downloaded models, see Container Documentation.
To load models:
```bash
# From the model gallery (see available models with `local-ai models list`, in the WebUI from the model tab, or visiting https://models.localai.io)
local-ai run llama-3.2-1b-instruct:q4_k_m

# Start LocalAI with the phi-2 model directly from huggingface
local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf

# Install and run a model from the Ollama OCI registry
local-ai run ollama://gemma:2b

# Run a model from a configuration file
local-ai run https://gist.githubusercontent.com/.../phi-2.yaml

# Install and run a model from a standard OCI registry (e.g., Docker Hub)
local-ai run oci://localai/phi-2:latest
```
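The gallery can also be driven from the CLI without starting an interactive session; a short sketch, assuming the `models` subcommands referenced above are available in your build:

```bash
# Browse installable models, then fetch one without running it
local-ai models list
local-ai models install llama-3.2-1b-instruct:q4_k_m
```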
⚡ Automatic Backend Detection: When you install models from the gallery or YAML files, LocalAI automatically detects your system's GPU capabilities (NVIDIA, AMD, Intel) and downloads the appropriate backend. For advanced configuration options, see GPU Acceleration.
For more information, see 💻 Getting started
New backends are available in the gallery (look for the `development` suffix): https://github.com/mudler/LocalAI/pull/6049 https://github.com/mudler/LocalAI/pull/6119 https://github.com/mudler/LocalAI/pull/6121 https://github.com/mudler/LocalAI/pull/6060

Roadmap items: List of issues
LocalAI supports a comprehensive range of AI backends, from text generation (llama.cpp, transformers, vllm, 📖 and more) to audio transcription (whisper.cpp), with multiple acceleration options:
**Text generation**

| Backend | Description | Acceleration Support |
|---|---|---|
| llama.cpp | LLM inference in C/C++ | CUDA 11/12, ROCm, Intel SYCL, Vulkan, Metal, CPU |
| vLLM | Fast LLM inference with PagedAttention | CUDA 12, ROCm, Intel |
| transformers | HuggingFace transformers framework | CUDA 11/12, ROCm, Intel, CPU |
| exllama2 | GPTQ inference library | CUDA 12 |
| MLX | Apple Silicon LLM inference | Metal (M1/M2/M3+) |
| MLX-VLM | Apple Silicon Vision-Language Models | Metal (M1/M2/M3+) |
**Audio & speech**

| Backend | Description | Acceleration Support |
|---|---|---|
| whisper.cpp | OpenAI Whisper in C/C++ | CUDA 12, ROCm, Intel SYCL, Vulkan, CPU |
| faster-whisper | Fast Whisper with CTranslate2 | CUDA 12, ROCm, Intel, CPU |
| bark | Text-to-audio generation | CUDA 12, ROCm, Intel |
| bark-cpp | C++ implementation of Bark | CUDA, Metal, CPU |
| coqui | Advanced TTS with 1100+ languages | CUDA 12, ROCm, Intel, CPU |
| kokoro | Lightweight TTS model | CUDA 12, ROCm, Intel, CPU |
| chatterbox | Production-grade TTS | CUDA 11/12, CPU |
| piper | Fast neural TTS system | CPU |
| kitten-tts | Kitten TTS models | CPU |
| silero-vad | Voice Activity Detection | CPU |
| neutts | Text-to-speech with voice cloning | CUDA 12, ROCm, CPU |
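As a quick illustration of how the speech backends are exercised, here is a minimal sketch using the OpenAI-compatible transcription endpoint; the file name and the `whisper-1` model name are placeholders, and a whisper model must already be installed:

```bash
# Transcribe a local audio file via the OpenAI-compatible endpoint
curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F model="whisper-1" \
  -F file="@./sample.wav"
```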
**Image & video generation**

| Backend | Description | Acceleration Support |
|---|---|---|
| stablediffusion.cpp | Stable Diffusion in C/C++ | CUDA 12, Intel SYCL, Vulkan, CPU |
| diffusers | HuggingFace diffusion models | CUDA 11/12, ROCm, Intel, Metal, CPU |
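These image backends are driven through the OpenAI-compatible images endpoint; a minimal sketch, assuming a Stable Diffusion model is already installed and the server is on the default port:

```bash
# Generate an image via the OpenAI-compatible endpoint
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a cute baby sea otter",
    "size": "256x256"
  }'
```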
**Specialized AI tasks**

| Backend | Description | Acceleration Support |
|---|---|---|
| rfdetr | Real-time object detection | CUDA 12, Intel, CPU |
| rerankers | Document reranking API | CUDA 11/12, ROCm, Intel, CPU |
| local-store | Vector database | CPU |
| huggingface | HuggingFace API integration | API-based |
**Acceleration support summary**

| Acceleration Type | Supported Backends | Hardware Support |
|---|---|---|
| NVIDIA CUDA 11 | llama.cpp, whisper, stablediffusion, diffusers, rerankers, bark, chatterbox | Nvidia hardware |
| NVIDIA CUDA 12 | All CUDA-compatible backends | Nvidia hardware |
| AMD ROCm | llama.cpp, whisper, vllm, transformers, diffusers, rerankers, coqui, kokoro, bark, neutts | AMD Graphics |
| Intel oneAPI | llama.cpp, whisper, stablediffusion, vllm, transformers, diffusers, rfdetr, rerankers, exllama2, coqui, kokoro, bark | Intel Arc, Intel iGPUs |
| Apple Metal | llama.cpp, whisper, diffusers, MLX, MLX-VLM, bark-cpp | Apple M1/M2/M3+ |
| Vulkan | llama.cpp, whisper, stablediffusion | Cross-platform GPUs |
| NVIDIA Jetson | llama.cpp, whisper, stablediffusion, diffusers, rfdetr | ARM64 embedded AI |
| CPU Optimized | All backends | AVX/AVX2/AVX512, quantization support |
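Before pulling a CUDA-enabled image, it can help to confirm that Docker can actually see the GPU. A common sanity check (the CUDA base image tag here is just an example, not something LocalAI requires):

```bash
# Verify the NVIDIA container runtime works before using CUDA images
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```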
Build and deploy custom containers:
WebUIs:
Agentic Libraries:
MCPs:
Model galleries:
Voice:
Other:
If you utilize this repository or its data in a downstream project, please consider citing it with:
```
@misc{localai,
  author = {Ettore Di Giacinto},
  title = {LocalAI: The free, Open source OpenAI alternative},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/go-skynet/LocalAI}},
}
```
Do you find LocalAI useful?
Support the project by becoming a backer or sponsor. Your logo will show up here with a link to your website.
A huge thank you to our generous sponsors who support this project by covering CI expenses, and to everyone on our Sponsor list:
LocalAI is a community-driven project created by Ettore Di Giacinto.
MIT - Author Ettore Di Giacinto [email protected]
LocalAI couldn't have been built without the help of great software already available from the community. Thank you!
This is a community project, a special thanks to our contributors! 🤗