Train task-specific small language models (SLMs) using the Distil Labs CLI. Helps with data preparation, model training, and deployment.
Train specialized small language models (SLMs) using the Distil Labs platform. The platform uses knowledge distillation to create models up to 70x smaller than large models while maintaining comparable accuracy.
What you can help with depends on the environment:
| Environment | Capabilities |
|---|---|
| Claude Code | Full end-to-end workflow: task selection, data preparation, running CLI commands, training, and deployment |
| Claude Browser | Task selection and data preparation only: help users choose the right task type and create job_description.json, config.yaml, train.csv, test.csv files. User runs CLI commands themselves. |
Install the CLI and authenticate:
```
# Install
curl -fsSL https://cli-assets.distillabs.ai/install.sh | sh

# Authenticate (if not already logged in)
distil login
```
Other auth commands: `distil signup` (create account), `distil whoami` (check user), `distil logout`
Step 1: Create a Model
Register a new model to track your experiment:
```
distil model create my-model-name
# Returns: Model ID (use this for all subsequent commands)
```
List all models with `distil model list`.
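A minimal sketch of this step in a shell, assuming you copy the printed ID by hand (the exact output format of `distil model create` is not reproduced here). The `MODEL_ID` variable is only a convenience reused in later sketches:

```bash
# Create the model; the CLI prints the new model ID
distil model create my-model-name

# Paste the returned ID into a variable so later commands can reuse it
MODEL_ID="<paste-the-returned-id-here>"

# Confirm the model appears in your list
distil model list
```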
Step 2: Task Selection
Choosing the right task type is crucial. Help the user by asking what they need the model to do:
| If the user needs to... | Choose |
|---|---|
| Solve problems by returning text answers (QA or text transformations) | Question Answering |
| Assign text to categories from a fixed set | Classification |
| Generate structured tool/API calls from natural language | Tool Calling |
| Answer questions given context (requires an existing knowledge database) | Open Book QA (RAG) |
| Answer questions from knowledge learned during training | Closed Book QA |
Question Answering — The most general task type. Solves problems by returning text answers. Use for question answering, text transformations, or any task that takes text input and produces text output.
Classification — Assigns text to one category from a fixed set. Use when you need deterministic categorization, not open-ended generation.
Tool Calling — Maps natural language to structured function calls with correct parameters. Use when routing user requests to backend APIs/services.
Open Book QA (RAG) — Answers questions using provided context passages. Only use this if you already have a well-structured knowledge database with context chunks. The model expects retrieved context to be provided at inference time.
Closed Book QA — Answers questions from knowledge learned during training. The user provides a knowledge database and the model learns the knowledge from it during training—no context needed at inference.
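To make the differences concrete, here is a hedged sketch of what a single training row could look like for three of the task types. The rows are lifted from the browser examples later in this document and are illustrative only; the exact column requirements are defined by each task's data guide.

```bash
# Classification: the answer is one label from the fixed set
cat > classification-row.csv <<'EOF'
question,answer
"Broke after two days, waste of money",negative
EOF

# Open Book QA (RAG): a context column sits between question and answer
cat > rag-row.csv <<'EOF'
question,context,answer
"How do I reset my password?","Password Reset: Click forgot password...","Click 'Forgot Password' on the login page..."
EOF

# Tool Calling: the answer is a JSON function call (inner quotes doubled for CSV)
cat > tool-calling-row.csv <<'EOF'
question,answer
"Where is my order #12345?","{""name"": ""get_order_status"", ""parameters"": {""order_id"": ""12345""}}"
EOF
```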
Step 3: Data Preparation
IMPORTANT: Before creating any files, you MUST read the data preparation guide for the selected task type. Each task has specific requirements for file formats and content.
Data guides referenced elsewhere in this skill include `data-classification.md` (Classification), `data-qa-rag.md` (Open Book QA / RAG), and `data-tool-calling.md` (Tool Calling); each task type has a corresponding guide, so open the one matching the task selected in Step 2.
After reading the appropriate guide, help the user prepare these files:
| File | Required | Description |
|---|---|---|
| job_description.json | Yes | Task objectives and configuration |
| train.csv | Yes | 20+ labeled (question, answer) pairs |
| test.csv | Yes | Held-out evaluation set |
| config.yaml | Yes | Task type, student model, and teacher model (see config.md for options) |
| Domain documents (see the task's data guide) | No | Domain text for synthetic data generation |
Note on config.yaml: Always ask the user which student model they want to train and which teacher model to use. See `config.md` for the full list of available models. If the user is unsure, recommend Llama-3.2-1B-Instruct as the student and openai.gpt-oss-120b as the teacher.
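As a starting point, here is a hedged sketch that scaffolds the required files for a classification task. The field names (`task_description`, `classes_description`, `task`, `student_model_name`) mirror the browser example later in this document; the real requirements, including how to specify the teacher model, are in the task's data guide and `config.md`, so verify there before uploading.

```bash
mkdir -p my-data-folder

# Task objectives (classification variant shown; other task types use different fields)
cat > my-data-folder/job_description.json <<'EOF'
{
  "task_description": "Classify product reviews by sentiment",
  "classes_description": {
    "positive": "Reviews expressing satisfaction, praise, or recommendation",
    "negative": "Reviews expressing dissatisfaction, complaints, or warnings",
    "neutral": "Reviews that are balanced or purely factual"
  }
}
EOF

# Task type and student model; see config.md for the teacher-model setting and all options
cat > my-data-folder/config.yaml <<'EOF'
task: classification
student_model_name: Llama-3.2-1B-Instruct
EOF

# 20+ labeled rows belong in train.csv, with a held-out set in test.csv
cat > my-data-folder/train.csv <<'EOF'
question,answer
"This product exceeded my expectations!",positive
"Broke after two days, waste of money",negative
EOF
```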
Step 4: Upload Data
```
distil model upload-data <model-id> --data ./my-data-folder
```
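The folder passed to `--data` should contain the Step 3 files; a quick sanity check before uploading (the folder name is arbitrary, and the optional domain-document file is omitted here):

```bash
# Confirm the required files are present
ls ./my-data-folder
# job_description.json  config.yaml  train.csv  test.csv
```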
Step 5: Teacher Evaluation
Before training, validate whether a large language model can solve your task. This serves as a feasibility check for the task and the data you provided, and as the benchmark your SLM will later be compared against.

```
distil model run-teacher-evaluation <model-id>
distil model teacher-evaluation <model-id>   # Check status/results
```

For details on the evaluation metrics (LLM-as-a-Judge, Exact-Match, ROUGE-L, tool_call_equivalence, etc.) and help interpreting the results, see `metrics.md`.
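Evaluation runs asynchronously, so you may want to poll until it finishes. A sketch, assuming the status command can simply be re-run and that its output contains one of the job statuses listed in Step 6 once it reaches a terminal state:

```bash
# Re-check every 5 minutes until the teacher evaluation succeeds or fails
while true; do
  status="$(distil model teacher-evaluation "$MODEL_ID")"
  echo "$status"
  echo "$status" | grep -qE 'JOB_SUCCEEDED|JOB_FAILED' && break
  sleep 300
done
```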
Step 6: Model Training
Train your SLM using knowledge distillation:
```
distil model run-training <model-id>
distil model training <model-id>   # Check status
```
Training takes several hours. Statuses:
JOB_PENDING, JOB_RUNNING, JOB_SUCCEEDED, JOB_FAILED
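If you want to wait on training from a script, the same polling pattern from Step 5 works here; a sketch assuming the status names above appear in the command's output:

```bash
# Poll until training reaches a terminal state, echoing the status as it goes
until distil model training "$MODEL_ID" | tee /dev/stderr | grep -qE 'JOB_SUCCEEDED|JOB_FAILED'; do
  sleep 600   # training takes hours, so check infrequently
done
```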
When training completes, compare the SLM's metrics against the teacher's metrics; for help interpreting the results, see `metrics.md`. If SLM performance is below expectations, revisit the training data (add more or higher-quality examples, or supply domain documents for synthetic data generation) and consider a larger student model.
Step 7: Download and Deploy
```
distil model download <model-id>
```
For local deployment with Ollama or vLLM, read `deployment.md`.
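A hedged end-to-end sketch of this step with Ollama, following the command sequence in Example 2 below. The `model/Modelfile` path comes from that example; adjust it to wherever `distil model download` actually places the files, and see `deployment.md` for vLLM.

```bash
# Fetch the trained model artifacts
distil model download "$MODEL_ID"

# Register and run the model locally with Ollama
ollama create my-slm -f model/Modelfile
ollama run my-slm
```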
```
# List all models
distil model list

# Show specific model details
distil model show <model-id>

# Download uploaded data files
distil model download-data <model-id>

# JSON output for scripting
distil model list --output json
```
Command aliases: `distil model` = `distil models` = `distil m`
Student Models (what you train): Llama 3.2 (1B, 3B), Llama 3.1 8B, SmolLM2 (135M, 1.7B), Gemma 3 (270M, 1B, 4B), Qwen3 (0.6B, 1.7B, 4B, 8B), IBM Granite 3.1/3.3 8B
Teacher Models (used for distillation): DeepSeek R1, V3.1, Qwen3 (235B, 480B), Llama 3.1 405B, 3.3 70B, GPT OSS (20B, 120B)
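When the user picks a student from this list, the choice goes into `config.yaml` via the `student_model_name` field shown in the browser example below. A sketch; the exact spelling of each model identifier (and the teacher-model setting) is defined in `config.md`, so treat the value here as a placeholder.

```bash
# Example: switch the student to a larger model from the list above
# ("Qwen3-4B" is an assumed identifier; confirm the exact name in config.md)
cat > my-data-folder/config.yaml <<'EOF'
task: classification
student_model_name: Qwen3-4B
EOF
```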
Check model status: `distil model show <model-id>`

Training failed: review the status output from `distil model training <model-id>`, then download your uploaded files with `distil model download-data <model-id>` and check them against the task's data guide (see Example 3 below).

Authentication issues: log out and back in:

```
distil logout
distil login
```
In Claude Code, you can run CLI commands directly.
Example 1: Train a classification model
```
User: Help me train a model to classify customer support tickets into categories

Claude: I'll help you train a classification model. Let me start by creating a model:
> distil model create support-ticket-classifier

Now let's prepare your data. I'll read `data-classification.md` for the format...
[Claude creates job_description.json, config.yaml, and helps format train.csv/test.csv]

Data files ready. Uploading:
> distil model upload-data abc123 --data ./support-classifier-data

Let's validate with teacher evaluation first:
> distil model run-teacher-evaluation abc123

[Checks status, then proceeds to training when ready]
> distil model run-training abc123
```
Example 2: Check training status and download
```
User: What's the status of my model abc123?

Claude: Let me check:
> distil model show abc123
> distil model training abc123

Your training completed successfully! Downloading the model:
> distil model download abc123

Model downloaded. To run it locally with Ollama:
> ollama create support-classifier -f model/Modelfile
> ollama run support-classifier
```
Example 3: Debug a failed training
```
User: My training failed, can you help?

Claude: Let me investigate:
> distil model show abc123
> distil model training abc123

I see the issue - teacher evaluation accuracy was low. Let me check your data:
> distil model download-data abc123

[Reviews data files and suggests improvements]
```
In the browser, Claude cannot run CLI commands but can help prepare all data files. The user then runs commands themselves.
Example 1: Prepare classification data
```
User: I want to build a sentiment classifier for product reviews

Claude: I'll help you prepare the data files. Based on `data-classification.md`, you need:

1. **job_description.json:**
   {
     "task_description": "Classify product reviews by sentiment",
     "classes_description": {
       "positive": "Reviews expressing satisfaction, praise, or recommendation",
       "negative": "Reviews expressing dissatisfaction, complaints, or warnings",
       "neutral": "Reviews that are balanced or purely factual"
     }
   }

2. **config.yaml:**
   task: classification
   student_model_name: Llama-3.2-1B-Instruct
   ...

3. **train.csv format:**
   question,answer
   "This product exceeded my expectations!",positive
   "Broke after two days, waste of money",negative
   ...

Once you have 20+ examples in train.csv and test.csv, run:
$ distil model create sentiment-classifier
$ distil model upload-data <model-id> --data ./my-data
$ distil model run-teacher-evaluation <model-id>
```
Example 2: Prepare RAG/QA data
```
User: I need a model that answers questions about our product documentation

Claude: For RAG-style QA, I'll prepare files per `data-qa-rag.md`:

1. **job_description.json:**
   {
     "task_description": "Answer customer questions using provided documentation excerpts"
   }

2. **config.yaml:**
   task: question-answering-open-book
   ...

3. **train.csv format** (note the context column):
   question,context,answer
   "How do I reset my password?","Password Reset: Click forgot password...","Click 'Forgot Password' on the login page..."
   ...

[Provides complete file templates]

Save these files and run the CLI commands to train.
```
Example 3: Prepare tool calling data
```
User: I want a model that can call our internal APIs based on user requests

Claude: For tool calling, I'll prepare files per `data-tool-calling.md`:

1. **job_description.json** with your tool schemas:
   {
     "task_description": "Select the correct API endpoint based on user request",
     "tools": [
       {
         "type": "function",
         "function": {
           "name": "get_order_status",
           "description": "Look up order status by order ID",
           "parameters": {...}
         }
       },
       ...
     ]
   }

   Note: Tool calling requires Llama3 family models.

2. **train.csv:**
   question,answer
   "Where is my order #12345?","{""name"": ""get_order_status"", ""parameters"": {""order_id"": ""12345""}}"
   ...

[Provides complete templates]
```