# Tabby Model Specification
Tabby organizes models within a directory. This document explains the contents a model directory needs to support model serving.
A minimal Tabby model directory should include the following contents:
```
tabby.json
ggml/model-00001-of-00001.gguf
```
### tabby.json
This file provides meta information about the model. An example file appears as follows:
```json
{
"prompt_template": "<[PRE[SUF[MID>]]][PRE[SUF[MID>]]][PRE[SUF[MID>]]][PRE[SUF[MID>]]]{prefix}<[SUF[MID>]][SUF[MID>]][SUF[MID>]][PRE[SUF[MID>]]]{suffix}<[MID>][MID>][MID>][PRE[SUF[MID>]]]",
"chat_template": "<s[PRE[SUF[MID>]]]{% for message in messages %}{% if message['role'] == 'user' %}{{ '[[MID>]N[SUF[MID>]]T] ' + message['content'] + ' [/[MID>]N[SUF[MID>]]T]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + '</s[PRE[SUF[MID>]]] ' }}{% endif %}{% endfor %}",
}
```
The **prompt_template** field is optional. When present, it is assumed that the model supports [FIM inference](https://arxiv.org/abs/2207.14255).
One example for the **prompt_template** is `<PRE>{prefix}<SUF>{suffix}<MID>`. In this format, `{prefix}` and `{suffix}` will be replaced with their corresponding values, and the entire prompt will be fed into the LLM.
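As a rough sketch of how this substitution might work (the helper below is illustrative, not Tabby's actual implementation), the placeholders are simply replaced with the code surrounding the cursor:

```python
# Illustrative only: a hypothetical helper showing how a FIM prompt_template
# could be filled in before the prompt is sent to the model.
def build_fim_prompt(template: str, prefix: str, suffix: str) -> str:
    """Substitute {prefix} and {suffix} into the configured prompt_template."""
    return template.replace("{prefix}", prefix).replace("{suffix}", suffix)

template = "<PRE>{prefix}<SUF>{suffix}<MID>"
prompt = build_fim_prompt(template, "def add(a, b):\n    return ", "\n")
# prompt == "<PRE>def add(a, b):\n    return <SUF>\n<MID>"
```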
The **chat_template** field is optional. When it is present, it is assumed that the model supports an instruct/chat-style interaction, and can be passed to `--chat-model`.
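The **chat_template** uses Jinja-style syntax. As a hedged illustration (rendering with Python's Jinja2 rather than Tabby's own template engine), the example template above could be applied to a message list like this:

```python
# Illustrative only: rendering the example chat_template with Jinja2.
from jinja2 import Template

chat_template = (
    "<s>{% for message in messages %}{% if message['role'] == 'user' %}"
    "{{ '[INST] ' + message['content'] + ' [/INST]' }}"
    "{% elif message['role'] == 'assistant' %}"
    "{{ message['content'] + '</s> ' }}{% endif %}{% endfor %}"
)

messages = [
    {"role": "user", "content": "Write hello world in Rust."},
    {"role": "assistant", "content": 'fn main() { println!("Hello, world!"); }'},
]

print(Template(chat_template).render(messages=messages))
# <s>[INST] Write hello world in Rust. [/INST]fn main() { println!("Hello, world!"); }</s>
```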
### ggml/
This directory contains binary files used by the [llama.cpp](https://github.com/ggml-org/llama.cpp) inference engine.
Tabby utilizes GGML for inference on `cpu`, `cuda` and `metal` devices.
Tabby saves GGUF model files in the format `model-{index}-of-{count}.gguf`, following the llama.cpp naming convention.
Please note that the index is 1-based; by default, Tabby names a single-file model `model-00001-of-00001.gguf`.
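For illustration only (the helper name below is hypothetical), the shard filenames can be reproduced with zero-padded, 1-based indices:

```python
# Illustrative only: building GGUF shard names in llama.cpp's convention.
def gguf_shard_name(index: int, count: int) -> str:
    """Return a shard filename with 1-based, zero-padded index and count."""
    return f"model-{index:05d}-of-{count:05d}.gguf"

assert gguf_shard_name(1, 1) == "model-00001-of-00001.gguf"  # single-file model
assert gguf_shard_name(2, 3) == "model-00002-of-00003.gguf"  # second of three shards
```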
For more details about GGUF models, please refer to the instructions in llama.cpp.