A beginner-friendly introduction to using Instructor for structured outputs from LLMs
Welcome! This guide will help you understand what Instructor does and how to start using it in your projects, even if you're new to working with language models.
Instructor is a Python library that helps you get structured, predictable data from language models like GPT-4 and Claude. It's like giving the LLM a form to fill out instead of letting it respond however it wants.
Here's how Instructor fits into your application:
```mermaid
flowchart LR
    A[Your Application] --> B[Instructor]
    B --> C[LLM Provider]
    C --> B
    B --> A
    style B fill:#e2f0fb,stroke:#b8daff,color:#004085
```
Without Instructor, getting structured data from LLMs can be challenging: responses come back as free-form text, you have to write your own parsing and validation code, and a single malformed reply can break your application.

Instructor solves these problems by letting you describe the output you want as a Pydantic model, validating the LLM's response against that model, and automatically retrying the request when validation fails.
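To see why this matters, here is a rough sketch of the manual approach using the raw OpenAI SDK; the prompt wording and error handling are illustrative assumptions rather than anything from Instructor:

```python
import json

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Return JSON with keys name, age, city for: "
            "John is 30 years old and lives in New York.",
        }
    ],
)

raw = response.choices[0].message.content
try:
    data = json.loads(raw)   # hope the reply is valid JSON
    age = int(data["age"])   # hope the keys and types are what you expect
except (json.JSONDecodeError, KeyError, TypeError, ValueError):
    # at this point you need your own retry / re-prompt logic
    ...
```

Every step after the API call is boilerplate you have to maintain yourself; Instructor's job is to make it disappear.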
Let's see Instructor in action with a basic example:
```python
import instructor
from pydantic import BaseModel

# Define the structure you want
class Person(BaseModel):
    name: str
    age: int
    city: str

# Connect to the LLM with Instructor
client = instructor.from_provider("openai/gpt-4o-mini")

# Extract structured data
person = client.create(
    response_model=Person,
    messages=[
        {"role": "user", "content": "Extract: John is 30 years old and lives in New York."}
    ],
)

# Now you have a structured object
print(f"Name: {person.name}")  # Name: John
print(f"Age: {person.age}")    # Age: 30
print(f"City: {person.city}")  # City: New York
```
That's it! Instructor handled all the complexity of getting the LLM to format the data correctly.
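Because the result is a regular Pydantic model, you can also attach validation rules and let Instructor re-ask the model when they fail. The sketch below reuses the `client` from the example above and assumes Instructor's `max_retries` option:

```python
from pydantic import BaseModel, field_validator

class Person(BaseModel):
    name: str
    age: int
    city: str

    @field_validator("age")
    @classmethod
    def age_is_reasonable(cls, value: int) -> int:
        # reject obviously wrong extractions so the request can be retried
        if value < 0 or value > 150:
            raise ValueError("age must be between 0 and 150")
        return value

person = client.create(
    response_model=Person,
    messages=[
        {"role": "user", "content": "Extract: John is 30 years old and lives in New York."}
    ],
    max_retries=2,  # assumed option: re-send the request with the validation error attached
)
```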
Ready to get started? Follow our step-by-step guide →
Here are the main concepts you need to know:
Response models define the structure you want the LLM to return. They are built using Pydantic, which is a data validation library.
```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(description="The user's full name")
    age: int = Field(description="The user's age in years")

# The descriptions help the LLM understand what to extract
```
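Response models can be as rich as any other Pydantic model: you can nest models and add constraints, and Instructor will ask the LLM to fill in the whole structure. The fields below are purely illustrative:

```python
from pydantic import BaseModel, Field

class Address(BaseModel):
    city: str
    country: str

class User(BaseModel):
    name: str = Field(description="The user's full name")
    age: int = Field(ge=0, description="The user's age in years")
    address: Address = Field(description="Where the user lives")

# Nested models work the same way as flat ones:
# the LLM is asked to return data matching the full schema.
```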
The `from_provider` function connects Instructor to your LLM provider. It automatically handles provider-specific configuration:
```python
# For OpenAI
client = instructor.from_provider("openai/gpt-4o-mini")

# For Anthropic
client = instructor.from_provider("anthropic/claude-3-5-haiku-latest")

# For Google Gemini
client = instructor.from_provider("google/gemini-3-flash")
```
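If your application is asynchronous, the same pattern works with an async client. This is a sketch that assumes `from_provider` accepts an `async_client=True` flag returning a client whose `create` call is awaitable:

```python
import asyncio

import instructor
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

async def main() -> None:
    # assumed flag: ask from_provider for an async client
    client = instructor.from_provider("openai/gpt-4o-mini", async_client=True)
    person = await client.create(
        response_model=Person,
        messages=[{"role": "user", "content": "Extract: Ada is 36 years old."}],
    )
    print(person)

asyncio.run(main())
```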
Modes control how Instructor gets structured data from the LLM. Different providers support different modes, and Instructor automatically selects the best one. You can also specify a mode manually if needed.
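If you do want to pin a mode yourself, it can typically be passed when creating the client. The example below is a sketch that assumes `from_provider` accepts a `mode` argument and that JSON mode is available for your provider:

```python
import instructor

# assumed keyword: force JSON mode instead of letting Instructor choose
client = instructor.from_provider(
    "openai/gpt-4o-mini",
    mode=instructor.Mode.JSON,
)
```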
Learn more about client setup →
People use Instructor for a wide range of tasks, such as extracting structured records from free-form text, classifying content into a fixed set of categories, and turning messy documents into clean data for downstream systems.
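As one concrete example, classification only requires constraining a field to a fixed set of labels. The categories and ticket text below are made up for illustration:

```python
from typing import Literal

import instructor
from pydantic import BaseModel

class TicketLabel(BaseModel):
    category: Literal["billing", "bug", "feature_request", "other"]
    summary: str

client = instructor.from_provider("openai/gpt-4o-mini")

label = client.create(
    response_model=TicketLabel,
    messages=[
        {"role": "user", "content": "Classify this ticket: 'I was charged twice this month.'"}
    ],
)
print(label.category)  # e.g. "billing"
```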
Now that you understand the basics, a good next step is to install Instructor with `pip install instructor`, work through the step-by-step guide linked above, and start defining response models for your own data.
**Do I need to know Pydantic?** While knowing Pydantic helps, you don't need to be an expert. The basic patterns shown above will get you started, and you can learn more advanced features as you need them.
**Which provider should I start with?** OpenAI is the most popular choice for beginners because of its reliability and wide support. As you grow more comfortable, you can explore other providers like Anthropic Claude, Gemini, or open-source models.
**Is Instructor hard to learn?** No! If you're familiar with Python classes and working with APIs, you'll find Instructor straightforward. The core concepts are simple, and you can gradually explore advanced features.
**How is Instructor different from larger LLM frameworks?** Instructor focuses specifically on structured outputs with a simple, clean API. Unlike larger frameworks that try to do everything, Instructor does one thing very well: getting structured data from LLMs.
If you get stuck, revisit the examples above, check the official documentation, or search the project's GitHub issues and discussions for similar questions.
Welcome aboard, and happy extracting!