A step-by-step guide to getting started with Instructor for structured outputs from LLMs
This guide will walk you through the basics of using Instructor to extract structured data from language models. By the end, you'll understand how to:

- Define output structures with Pydantic models
- Validate responses and automatically retry when validation fails
- Extract nested data structures
- Stream partial results as they are generated
- Switch between LLM providers
First, install Instructor:
```bash
pip install instructor
```
To use a specific provider, install the appropriate extras:
```bash
# For OpenAI (included by default)
pip install instructor

# For Anthropic
pip install "instructor[anthropic]"

# For other providers
pip install "instructor[google-genai]"   # For Google/Gemini
pip install "instructor[vertexai]"       # For Vertex AI
pip install "instructor[cohere]"         # For Cohere
pip install "instructor[litellm]"        # For LiteLLM (multiple providers)
pip install "instructor[mistralai]"      # For Mistral
pip install "instructor[xai]"            # For xAI
```
Set your API keys as environment variables:
```bash
# For OpenAI
export OPENAI_API_KEY=your_openai_api_key

# For Anthropic
export ANTHROPIC_API_KEY=your_anthropic_api_key

# For other providers, set relevant API keys
```
Let's start with a simple example using OpenAI:
```python
import instructor
from pydantic import BaseModel

# Define your output structure
class UserInfo(BaseModel):
    name: str
    age: int

# Create an instructor client with from_provider
client = instructor.from_provider("openai/gpt-5-nano")

# Extract structured data
user_info = client.create(
    response_model=UserInfo,
    messages=[
        {"role": "user", "content": "John Doe is 30 years old."}
    ],
)

print(f"Name: {user_info.name}, Age: {user_info.age}")
# Output: Name: John Doe, Age: 30
```
This example demonstrates the core workflow:
- Create a client for your chosen model with `from_provider`
- Define your output structure as a Pydantic model and pass it via the `response_model` parameter
- Get back a validated instance of that model instead of raw text

Instructor leverages Pydantic's validation to ensure your data meets requirements:
```python
from pydantic import BaseModel, Field, field_validator

class User(BaseModel):
    name: str
    age: int = Field(gt=0, lt=120)  # Age must be between 0 and 120

    @field_validator('name')
    def name_must_have_space(cls, v):
        if ' ' not in v:
            raise ValueError('Name must include first and last name')
        return v

# This will make the LLM retry if validation fails
user = client.create(
    response_model=User,
    messages=[
        {"role": "user", "content": "Extract: Tom is 25 years old."}
    ],
)
```
Instructor works seamlessly with nested Pydantic models:
```python
from pydantic import BaseModel
from typing import List

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class Person(BaseModel):
    name: str
    age: int
    addresses: List[Address]

person = client.create(
    response_model=Person,
    messages=[
        {"role": "user", "content": """
            Extract: John Smith is 35 years old.
            He has homes at 123 Main St, Springfield, IL 62704
            and 456 Oak Ave, Chicago, IL 60601.
        """}
    ],
)
```
For larger responses or better user experience, use streaming:
```python
from instructor import Partial

# Stream the response as it's being generated
stream = client.create_partial(
    response_model=Person,
    messages=[
        {"role": "user", "content": "Extract a detailed person profile for John Smith, 35, who lives in Chicago and Springfield."}
    ],
)

for partial in stream:
    # This will incrementally show the response being built
    print(partial)
```
Instructor supports multiple LLM providers. Here's how to use Anthropic:
```python
import instructor
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

# Create an instructor client with from_provider
client = instructor.from_provider("anthropic/claude-3-opus-20240229")

user_info = client.create(
    response_model=UserInfo,
    messages=[
        {"role": "user", "content": "John Doe is 30 years old."}
    ],
)

print(f"Name: {user_info.name}, Age: {user_info.age}")
```
**What's the difference between start-here.md and getting-started.md?**
Start Here provides a conceptual overview of Instructor, while this guide is a hands-on, step-by-step tutorial.

**Which provider should I start with?**
OpenAI is the most popular choice for beginners due to reliability and wide support. Once comfortable, you can explore Anthropic Claude, Google Gemini, or open-source models.
**Do I need to know Pydantic to use Instructor?**
Basic knowledge helps, but you can start with simple models. Instructor works with any Pydantic BaseModel. Learn more advanced features as you need them.
**Can I use Instructor asynchronously?**
Yes! Use `async_client=True` when creating your client: `client = instructor.from_provider("openai/gpt-4o", async_client=True)`, then use `await client.create()`.
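Here is a minimal sketch of the async flow, reusing the model string from above and the same `UserInfo` model as the earlier examples (assumes your API key is already set):

```python
import asyncio

import instructor
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

# async_client=True returns an async client, so create() must be awaited
client = instructor.from_provider("openai/gpt-4o", async_client=True)

async def main() -> None:
    user_info = await client.create(
        response_model=UserInfo,
        messages=[{"role": "user", "content": "John Doe is 30 years old."}],
    )
    print(f"Name: {user_info.name}, Age: {user_info.age}")

asyncio.run(main())
```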
**What happens if the output fails validation?**
Instructor automatically retries with validation feedback. You can configure retry behavior with the `max_retries` parameter. See retry mechanisms for details.
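For example, here is a short sketch capping retries on the `User` model from the validation section above (the limit of 3 is just an illustrative value):

```python
# Each failed validation sends the error messages back to the LLM and retries,
# up to max_retries attempts before raising an exception.
user = client.create(
    response_model=User,
    messages=[
        {"role": "user", "content": "Extract: Tom is 25 years old."}
    ],
    max_retries=3,
)
```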
Now that you've mastered the basics, here are some next steps:
Using older patterns? If you're using `instructor.patch()` or provider-specific functions like `from_openai()`, check out the Migration Guide to modernize your code.
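As a rough before-and-after sketch (the older calls are shown commented out; see the Migration Guide for the exact mapping in your setup):

```python
import instructor
from openai import OpenAI

# Older patterns: wrap a provider SDK client directly
# client = instructor.patch(OpenAI())
# client = instructor.from_openai(OpenAI())

# Current pattern: one call, with provider and model in a single string
client = instructor.from_provider("openai/gpt-4o")
```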
New to Instructor? Start with Start Here for a conceptual overview.
For more detailed information on any topic, visit the Concepts section.
If you have questions or need help, join our Discord community or check the GitHub repository.