Due to projects like [Explore the LLMs](https://llm.extractum.io/) specializing in model indexing, the custom list has been removed.
| Model | Link | Description | Date added |
|---|---|---|---|
| BitNet b1.58 2B4T | https://huggingface.co/microsoft/bitnet-b1.58-2B-4T | the first native 1-bit LLM at the 2-billion-parameter scale, achieving performance comparable to full-precision models of similar size with lower computational cost (memory, energy, latency) | 2025-04-25 |
| OpenHands-LM | https://huggingface.co/all-hands | openhands 1.5b, 7b and 32b coding models with verified strong performance on SWE-Bench using the OpenHands Coding-Agent | 2025-04-25 |
| OpenThinker2-32B | https://huggingface.co/open-thoughts/OpenThinker2-32B | fine-tuned version of Qwen2.5-32B-Instruct on the OpenThoughts2-1M dataset with increased quality compared to base model | 2025-04-25 |
| Cogito-V1 | https://huggingface.co/collections/deepcogito/cogito-v1-preview-67eb105721081abe4ce2ee53 | Cogito model family with Qwen and Llama fine-tunes using Iterated Distillation and Amplification (IDA) to improve coding, STEM and instruction-following (IF) quality compared to their base models | 2025-04-25 |
| DeepCoder-14B-Preview | https://huggingface.co/agentica-org/DeepCoder-14B-Preview | code reasoning LLM fine-tuned from DeepSeek-R1-Distilled-Qwen-14B using RL to scale up to long context lengths using DeepScaleR and GRPO+ | 2025-04-25 |
| ZR1-1.5B | https://huggingface.co/Zyphra/ZR1-1.5B | Fine Tuned DeepSeek-R1-Distill-Qwen-1.5B trained extensively on both verified coding and mathematics problems with reinforcement learning | 2025-04-25 |
| Skywork-OR1 | https://huggingface.co/Skywork/Skywork-OR1-32B-Preview | 7B, 7B-math and 32B reasoning models with open-sourced weights, training data and training code | 2025-04-25 |
| GLM-4 | https://huggingface.co/collections/THUDM/glm-4-0414-67f3cbcb34dd9d252707cb2e | three 32B models in general, reasoning and deep-reasoning variants, as well as a 9B SLM | 2025-04-25 |
| MAI-DS-R1 | https://huggingface.co/microsoft/MAI-DS-R1 | post-trained DeepSeek-R1 reasoning model by Microsoft AI that enhances responsiveness on blocked topics while maintaining strong reasoning capabilities | 2025-04-25 |
| Gemma 3 QAT | https://huggingface.co/google/gemma-3-27b-it-qat-q4_0-gguf | Quantization Aware Training models from Google, regaining bf16 quality in int4 quants and slashing memory footprint | 2025-04-25 |
| Sky-T1-7B-Mini | https://huggingface.co/NovaSky-AI/Sky-T1-mini | Trained with simple RL applied on DeepSeek-R1-Distill-Qwen-7B model, achieving close to OpenAI o1-mini performance on math benchmarks | 2025-02-21 |
| OmniParser-v2 | https://huggingface.co/microsoft/OmniParser-v2.0 | A VLM converting screenshots of Phone and Desktop UIs into structured list of interactable elements for Computer-Use | 2025-02-21 |
| R1-1776 | https://huggingface.co/perplexity-ai/r1-1776 | Deepseek-R1 671B Param model with removed Chinese Communist Party Censorship | 2025-02-21 |
| Step-Audio-Chat | https://huggingface.co/stepfun-ai/Step-Audio-Chat | Multimodal Large Language Model with 130B parameters for speech recognition, semantic understanding, dialogue management, voice cloning, and speech generation | 2025-02-21 |
| Qwen2.5-VL | https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct | 3B, 7B and 72B Vision Text Multimodal Model with support for bounding boxes, structured output, OCR for tables, forms etc, long video understanding, agentic computer and phone use, visual and text understanding | 2025-02-21 |
| Arcee-Maestro-7B | https://huggingface.co/arcee-ai/Arcee-Maestro-7B-Preview | RL-trained reasoning model based on DeepSeek-R1-Distill-Qwen-7B with further GRPO training for reasoning, math and coding | 2025-02-21 |
| Arcee-Blitz | https://huggingface.co/arcee-ai/Arcee-Blitz | Mistral-Small-24B-Instruct base distilled with DeepSeek-R1 for fast and efficient reasoning with 32k context | 2025-02-21 |
| OpenThinker-32B | https://huggingface.co/open-thoughts/OpenThinker-32B | fine-tuned reasoning model of Qwen/Qwen2.5-32B-Instruct on the DeepSeek-R1 distilled OpenThoughts-114k dataset | 2025-02-21 |
| MiniCPM-o | https://huggingface.co/openbmb/MiniCPM-o-2_6 | GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone | 2025-02-21 |
| DeepSeek-R1 | https://huggingface.co/deepseek-ai/DeepSeek-R1 | groundbreaking reasoning model from DeepSeek trained with a novel method that reduces RLHF effort, with distilled variants of various sizes | 2025-02-01 |
| Sky-T1 | https://huggingface.co/NovaSky-AI/Sky-T1-32B-Preview | UC Berkeley's reasoning model with 32B parameters | 2025-01-15 |
| QwQ | https://huggingface.co/Qwen/QwQ-32B-Preview | Qwen's reasoning model with 32B parameters | 2025-01-15 |
| Moxin LLM | https://huggingface.co/moxin-org/moxin-llm-7b | fully open data, open training 7B base and chat fine-tuned models | 2024-12-20 |
| Bamba-9b | https://huggingface.co/blog/bamba | hybrid Mamba2 model by IBM, Princeton, CMU and UIUC trained on open data with 2.5x throughput, available for vLLM, TRL, llama.cpp and transformers | 2024-12-20 |
| Command R7B | https://huggingface.co/CohereForAI/c4ai-command-r7b-12-2024 | open weights research 7B model with reasoning, summarization, question answering, coding, tool use and RAG capabilities | 2024-12-20 |
| DeepSeek-V2.5-1210-236B | https://huggingface.co/deepseek-ai/DeepSeek-V2.5-1210 | the 1210 update of the original V2.5 with math, coding and reasoning improvements | 2024-12-20 |
| QwQ-32b | https://qwenlm.github.io/blog/qwq-32b-preview/ | Apache 2 licensed LLM from Alibaba Cloud's Qwen team, inspired by OpenAI's o1 reasoning model for test time compute via reasoning tokens to improve performance | 2024-12-02 |
| Sparse-Llama-3.1-8B-2of4 | https://neuralmagic.com/blog/24-sparse-llama-smaller-models-for-efficient-gpu-inference/ | 2:4 Sparse Llama: Smaller Models for Efficient GPU Inference | 2024-12-02 |
| CursorCore | https://huggingface.co/collections/TechxGenus/cursorcore-series-6706618c38598468866b60e2 | Coding LLMs for use within CursorCore and CursorWeb | |
| ichigo | https://huggingface.co/homebrewltd | an open research project extending text-based llama3 to have native "listening" ability, using an early fusion technique, with improved multiturn capabilities and refusal to process inaudible queries | |
| Zamba2 | https://www.zyphra.com/post/zamba2-7b | a 7B SOTA SLM for running on-device, with 25% faster time-to-first-token and 20% higher tokens-per-second than comparable architectures, using Mamba2 blocks interleaved with shared attention blocks and a LoRA-adapted shared MLP block | |
| reader-lm | https://jina.ai/news/reader-lm-small-language-models-for-cleaning-and-converting-html-to-markdown | Jina AI's LLM to convert HTML to Markdown, making heuristics, cleanup and content identification an LLM task | |
| Pixtral | https://huggingface.co/mistralai/Pixtral-12B-2409 | 12B LLM with a 400M vision encoder for multi modal image and text inference and 128k sequence length by Mistral | |
| llama-3.2 | https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/ | small and medium-sized vision LLMs (11B and 90B) and text-only 1B and 3B models by Meta | |
| gemma2 2b | https://huggingface.co/bartowski/gemma-2-2b-it-GGUF | 2B small language model by Google achieving SOTA performance among sub-3B models on LLM Leaderboard 2 | |
| DeepSeekCoderv2 | https://github.com/deepseek-ai/DeepSeek-Coder-V2?tab=readme-ov-file#2-model-downloads | 16b and 236b mixture of experts coding models with 128k context length | |
| codegemma | https://huggingface.co/google/codegemma-7b | google's coding models from 2b base, 7b base and 7b instruct | |
| codeqwen1.5 | https://huggingface.co/Qwen/CodeQwen1.5-7B | base and chat models with 7B parameters and good quality | |
| Qwen2 | https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f | English and Chinese models in 0.5B, 1.5B, 7B and 72B sizes with strong performance and 128k context windows for the 7B and 72B models | |
| Phi | https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3 | Microsoft's small language and vision models with small and medium parameter sizes, short and long context lengths and great performance | |
| Yi-1.5 | https://huggingface.co/01-ai/Yi-1.5-34B-Chat | model focusing on multilingual text understanding, available in 9B and 34B variants | |
| InternLM2.5 | https://huggingface.co/internlm/internlm2_5-7b-chat | 7B base and chat models focusing on reasoning, math and tool use, with a 1M context window | |
| Mistral-Large | https://huggingface.co/mistralai/Mistral-Large-Instruct-2407 | a 123B sized model beating llama-3.1 and gpt-4o in several categories with a focus on multilinguality, coding, agentic tasks and reasoning. | |
| Llama-3.1 | https://ai.meta.com/blog/meta-llama-3-1/ | Meta's most advanced model family, providing 8B, 70B and 405B base and instruction-tuned models with a 128k context window and quality on par with current SOTA closed-source models | |
| Nuextract | https://huggingface.co/numind/NuExtract | a structured-extraction model based on phi-3-mini; it is instructed with a JSON template that the model fills in from the provided unstructured text | |
| Mistral Nemo | https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407 | a 12B model by Mistral and NVIDIA with a 128k context window, offered as instruct and base models | |
| CodeGeeX4 | https://huggingface.co/THUDM/codegeex4-all-9b | 9B multilingual code generation model for chat and instruct with a 128k context length | |
| Mamba-Codestral | https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1 | coding model by Mistral based on the Mamba2 architecture, performing on par with SOTA transformer-based code models | |
| Aya-23 | https://huggingface.co/CohereForAI/aya-23-35B | 8B and 35B instruction-tuned multilingual models focusing on 23 languages | |
| Mistral-7b-instruct-v0.3 | https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3 | with function calling, new tokenizer and 32k max context | |
| CodeStral-22B | https://huggingface.co/mistralai/Codestral-22B-v0.1 | Coding model trained on 80+ languages with instruct and Fill in the Middle tasks, 32k max context | |
| Yuan2-M32 | https://huggingface.co/IEITYuan/Yuan2-M32-hf | Mixture of Experts with Attention Router: 32 experts, 2 active, 40B total parameters, 3.7B active, and a max length of 16K | |
| DeepSeek-V2 | https://github.com/deepseek-ai/DeepSeek-V2#2-model-downloads | strong, economical and efficient Mixture-of-Experts language model with 21B activated parameters | |
| Granite | https://huggingface.co/ibm-granite | family of Code Models from IBM with 3b, 8b, 20b, 34b, base and instruct models for code completion and chat | |
| GemMoE | https://huggingface.co/Crystalcareai/GemMoE-Base-Random | An 8x8 Mixture Of Experts based on Gemma | |
| wavecoder-ultra-6.7b | https://huggingface.co/microsoft/wavecoder-ultra-6.7b | covering four general code-related tasks: code generation, code summary, code translation, and code repair | |
| Mixtral-8x22B-Instruct-v0.1 | https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1 | an instruct fine-tuned version of the Mixtral-8x22B-v0.1 | |
| WizardLM-2-8x22B | https://huggingface.co/alpindale/WizardLM-2-8x22B | Microsoft's WizardLM 2 8x22B beating gpt-4-0314 on MT-Bench | |
| WizardLM-2-7B | https://huggingface.co/microsoft/WizardLM-2-7B | Microsoft's WizardLM 2 7B; a 70B release is coming up | |
| aiXcoder | https://huggingface.co/aiXcoder/aixcoder-7b-base | 7B Code LLM for code completion, comprehension, generation | |
| Mixtral-8x22B-v0.1 | https://huggingface.co/v2ray/Mixtral-8x22B-v0.1 | Sparse MoE model with 176B total and 44B active parameters, 65k context size | |
| grok-1 | https://huggingface.co/xai-org/grok-1 | 314b MoE model by xAI | |
| DBRX | https://huggingface.co/databricks/dbrx-base | base and instruct MoE models from databricks with 132B total parameters and a larger number of smaller experts supporting RoPE and 32K context size | |
| command-r-plus | https://huggingface.co/CohereForAI/c4ai-command-r-plus | a 104B model with highly advanced capabilities including RAG and tool use for English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese | |
| StarCoder2 | https://huggingface.co/bigcode/starcoder2-15b | 15B, 7B and 3B code completion models trained on The Stack v2 | |
| command-r | https://www.maginative.com/article/cohere-launches-command-r-scalable-ai-model-for-enterprise-rag-and-tool-use/ | 35B model optimized for retrieval-augmented generation (RAG) and tool use, supporting Embed and Rerank methodology | |
| AI21 Jamba | https://huggingface.co/ai21labs/Jamba-v0.1 | production-grade Mamba-based hybrid SSM-Transformer model licensed under Apache 2.0 with 256K context and a 52B MoE with 12B active parameters | |
| Smaug-72B | https://huggingface.co/abacusai/Smaug-72B-v0.1 | based on Qwen-72B and MoMo-72B-Lora, then fine-tuned by Abacus.AI; the best-performing open LLM on the HF leaderboard as of Feb 2024 | |
| SLIM Model Family | https://huggingface.co/llmware | Small Specialized Function-Calling Models for Multi-Step Automation, focused on enterprise RAG workflows | |
| aya-101 | https://huggingface.co/CohereForAI/aya-101 | 13B fine-tuned open-access multilingual LLM from Cohere For AI | |
| seamlessM4T v2 | https://huggingface.co/docs/transformers/en/model_doc/seamless_m4t_v2 | Multimodal Audio and Text Translation between many languages | |
| SeaLLM | https://huggingface.co/SeaLLMs/SeaLLM-7B-v2 | multilingual LLM for Southeast Asian (SEA) languages 🇬🇧 🇨🇳 🇻🇳 🇮🇩 🇹🇭 🇲🇾 🇰🇭 🇱🇦 🇲🇲 🇵🇭 | |
| meditron | https://github.com/epfLLM/meditron | 7B and 70B Llama2-based LLMs fine-tuned and adapted for the medical domain | |
| Mixtral of experts | https://mistral.ai/news/mixtral-of-experts/ | A high quality Sparse Mixture-of-Experts. | |
| Poro | https://huggingface.co/LumiOpen/Poro-34B | SiloGen model checkpoints of a family of multilingual open-source LLMs covering all official European languages and code | |
| deepseek-coder | https://github.com/deepseek-ai/DeepSeek-Coder | code language models, trained on 2T tokens, 87% code 13% English / Chinese, up to 33B with 16K context size achieving SOTA performance on coding benchmarks | |
| openchat | https://github.com/imoneoi/openchat | Advancing Open-source Language Models with Mixed-Quality Data | |
| llmware RAG models | https://huggingface.co/llmware | small LLMs and sentence transformer embedding models specifically fine-tuned for RAG workflows | |
| HelixNet | https://huggingface.co/migtissera/HelixNet | Mixture of Experts built from 3 Mistral-7B models with LoRA; HelixNet-LMoE is an optimized version | |
| Mistral-7B-german-assistant-v3 | https://huggingface.co/flozi00/Mistral-7B-german-assistant-v3 | fine-tuned for German instructions and conversations in Alpaca style ("### Assistant:", "### User:"), trained with a context length of 8k tokens; the dataset used is deduplicated and cleaned, contains no code, and focuses on instruction following and conversational tasks | |
| WizardMath-70B-V1.0 | https://huggingface.co/WizardLM/WizardMath-70B-V1.0 | SOTA Mathematical Reasoning | |
| leo-hessianai-13b-chat-bilingual | https://huggingface.co/LeoLM/leo-hessianai-13b-chat-bilingual | chat fine-tune of the llama-2-13b-based leo-hessianai-13b base model | |
| em_german_leo_mistral | https://huggingface.co/jphme/em_german_leo_mistral | fine-tune of the Mistral-based LeoLM with German instructions | |
| SauerkrautLM-13B-v1 | https://huggingface.co/VAGOsolutions/SauerkrautLM-13b-v1 | llama-2 13b fine-tuned on a mix of German data augmentation and translations; SauerkrautLM-7b-v1-mistral is the German SauerkrautLM-7b fine-tuned using QLoRA on one A100 80GB with Axolotl | |
| CodeShell | https://github.com/WisdomShell/codeshell/blob/main/README_EN.md | code LLM with 7B parameters trained on 500B tokens with a context length of 8k, outperforming CodeLlama and StarCoder on HumanEval | |
| sqlcoder | https://github.com/defog-ai/sqlcoder | 15B parameter model that outperforms gpt-3.5-turbo for natural language to SQL generation tasks | |
| ChatGLM2-6B | https://github.com/THUDM/ChatGLM2-6B | v2 of the GLM 6B open bilingual EN/CN model | |
| baichuan-7b | https://github.com/baichuan-inc/baichuan-7B | Baichuan Intelligent Technology developed baichuan-7B, an open-source language model with 7 billion parameters trained on 1.2 trillion tokens. Supporting Chinese and English, it achieves top performance on authoritative benchmarks (C-EVAL, MMLU) | |
| salesforce/CodeT5 | https://github.com/salesforce/codet5 | code assistant; Salesforce has released CodeT5+ in 16B and other model sizes | |
| VPGTrans | https://vpgtrans.github.io/ | transfers a Visual Prompt Generator across LLMs; the VL-Vicuna model is a novel VL-LLM | |
| replit-code | https://huggingface.co/replit/ | focused on code completion, trained on a subset of the Stack Dedup v1.2 dataset | |
| Visual-med-alpaca | https://github.com/cambridgeltl/visual-med-alpaca | llama-7b fine-tuned with self-instruct data for the biomedical domain; models are gated behind a request form | |
| Multimodal-GPT | https://github.com/open-mmlab/Multimodal-GPT | multi-modal visual/language chatbot, using llama with custom LoRA weights and openflamingo-9B. | |
| mPLUG-Owl | https://github.com/X-PLUG/mPLUG-Owl | Multimodal finetuned model for visual/language tasks | |
| MOSS by Fudan University | https://github.com/OpenLMLab/MOSS | a 16B Chinese/English custom foundation model with additional variants fine-tuned with SFT and for plugin usage | |
| BigCode | https://huggingface.co/bigcode | open scientific collaboration to train a coding LLM | |
| CodeGeeX 13B | https://huggingface.co/spaces/THUDM/CodeGeeX | multilingual code generation model | |
| RWKV | https://github.com/BlinkDL/RWKV-LM | parallelizable RNN with Transformer-level LLM performance | |
| GeoV/GeoV-9b | https://huggingface.co/GeoV/GeoV-9b | 9B parameter model, in-progress training to 300B tokens (33:1) | |
| LAION OpenFlamingo | https://github.com/mlfoundations/open_flamingo | multimodal model and training architecture | |
| Cerebras GPT-13b | https://huggingface.co/cerebras | open 13B GPT-style model by Cerebras | |
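
Most rows above point to standard Hugging Face repositories, so as a rough usage note (not part of the original list) the sketch below shows how a text-only entry from the table might be loaded with the `transformers` library. The chosen repo id, dtype and prompt are illustrative assumptions; multimodal or custom-architecture entries need their model-specific classes instead.

```python
# Minimal sketch, assuming the `transformers` and `torch` packages are installed
# and hardware with enough memory is available. The repo id below is one example
# row from the table; most text-only entries can be swapped in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mistralai/Mistral-7B-Instruct-v0.3"  # illustrative choice from the table

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # assumption: hardware with bf16 support
    device_map="auto",
)

# Build a chat-formatted prompt and generate a short completion.
messages = [{"role": "user", "content": "Briefly explain what a Mixture-of-Experts LLM is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```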