Common questions and answers about using Instructor
This page answers common questions about using Instructor with various LLM providers.
Instructor is a library that makes it easy to get structured data from Large Language Models (LLMs). It uses Pydantic to define output schemas and provides a consistent interface across different LLM providers.
Instructor "patches" LLM clients to add a `response_model` parameter that accepts a Pydantic model. When you make a request, Instructor converts the model into a schema the provider understands, sends the request, validates the response against the model, and retries automatically if validation fails.
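As a rough sketch of that first step, the schema Instructor sends to the provider is derived directly from your Pydantic model (the model and fields below are illustrative, not from Instructor itself):

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# The JSON schema sent to the provider comes straight from
# the Pydantic model definition:
schema = User.model_json_schema()
print(sorted(schema["properties"]))  # -> ['age', 'name']
```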
Instructor supports many providers, including OpenAI, Anthropic, Google (Gemini), Mistral, and Cohere. See the Integrations section for the complete list.
Instructor supports generic modes across providers:

- `Mode.TOOLS` - Tool/function calling when supported
- `Mode.JSON` - JSON generation for providers that support it (GenAI)
- `Mode.JSON_SCHEMA` - JSON schema enforcement (OpenAI, Mistral, Cohere)
- `Mode.MD_JSON` - JSON embedded in markdown
- `Mode.PARALLEL_TOOLS` - Parallel tool calls where supported

The optimal mode depends on your provider and use case. See Patching for details.
Basic installation:
```shell
pip install instructor
```
For specific providers:
```shell
pip install "instructor[anthropic]"            # For Anthropic
pip install "instructor[google-generativeai]"  # For Google/Gemini
```
How you set API keys depends on your provider. Common environment variables include:

- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `GOOGLE_API_KEY`

Each provider has specific requirements documented in its integration guide.
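For example, on macOS/Linux the keys are typically exported as environment variables before running your script (the values below are placeholders, not real keys):

```shell
export OPENAI_API_KEY="sk-..."         # placeholder value
export ANTHROPIC_API_KEY="sk-ant-..."  # placeholder value
```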
Common reasons include an overly complex or deeply nested schema, ambiguous prompt instructions, and model limitations. Try simplifying your schema or providing clearer instructions in your prompt.
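Validation failures ultimately come from Pydantic. This minimal sketch (the model and title-case rule are hypothetical) shows the kind of constraint a model's output can violate, which is what triggers Instructor's retry logic:

```python
from pydantic import BaseModel, ValidationError, field_validator

class User(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def name_must_be_title_case(cls, v: str) -> str:
        if v != v.title():
            raise ValueError("name must be title case")
        return v

User(name="Ada Lovelace")  # passes validation

try:
    # Fails validation; Instructor would resend the request
    # along with this error message so the model can correct itself.
    User(name="ada lovelace")
except ValidationError as e:
    print(e.errors()[0]["msg"])
```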
Instructor automatically retries when validation fails. You can customize this behavior:
```python
from tenacity import stop_after_attempt

result = client.create(
    response_model=MyModel,
    max_retries=stop_after_attempt(5),  # Retry up to 5 times
    messages=[...],
)
```
Yes, use `create_with_completion`:

```python
result, completion = client.create_with_completion(
    response_model=MyModel,
    messages=[...],
)
```

`result` is your Pydantic model, and `completion` is the raw response.
Use `create_partial` to receive partial updates as the response is generated:

```python
stream = client.create_partial(
    response_model=MyModel,
    messages=[...],
)

for partial in stream:
    print(partial)  # Partial model with fields filled in as they're generated
```
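To illustrate what the stream yields, partial results behave like the model with not-yet-generated fields left unset. The update sequence below is simulated (hypothetical data, no API call), not real streaming output:

```python
from typing import Optional
from pydantic import BaseModel

class Article(BaseModel):
    title: Optional[str] = None
    summary: Optional[str] = None

# Simulated partial updates, mimicking what create_partial
# yields as tokens arrive:
for update in [{"title": "Hello"}, {"title": "Hello", "summary": "A greeting"}]:
    partial = Article(**update)
    print(partial)
```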
For simple schemas, `MD_JSON` or `JSON` mode can be a good fit. For retry behavior, Instructor uses the `tenacity` library, which you can configure:
```python
from openai import RateLimitError
from tenacity import Retrying, retry_if_exception_type, stop_after_attempt, wait_exponential

result = client.create(
    response_model=MyModel,
    max_retries=Retrying(
        retry=retry_if_exception_type(RateLimitError),
        wait=wait_exponential(multiplier=1, min=4, max=60),
        stop=stop_after_attempt(3),
    ),
    messages=[...],
)
```
Instructor works seamlessly with FastAPI:

```python
from fastapi import FastAPI
from pydantic import BaseModel
import instructor

app = FastAPI()
client = instructor.from_provider("openai/gpt-5-nano")

class UserInfo(BaseModel):
    name: str
    age: int

@app.post("/extract")
async def extract_user_info(text: str) -> UserInfo:
    return client.create(
        response_model=UserInfo,
        messages=[{"role": "user", "content": text}],
    )
```
Use the async client:

```python
import asyncio
import instructor

client = instructor.from_provider("openai/gpt-5-nano", async_client=True)

async def extract_data():
    result = await client.create(
        response_model=MyModel,
        messages=[...],
    )
    return result

asyncio.run(extract_data())
```
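For running several extractions concurrently, the standard `asyncio.gather` pattern applies. This sketch substitutes a stand-in coroutine for `client.create` so it runs without an API key (the `extract` helper is hypothetical):

```python
import asyncio

async def extract(text: str) -> dict:
    # Stand-in for `await client.create(...)`; a real call would hit the API.
    await asyncio.sleep(0)
    return {"text": text}

async def main() -> list:
    # Fan out several extractions concurrently:
    return await asyncio.gather(*(extract(t) for t in ["alpha", "beta"]))

results = asyncio.run(main())
print(results)  # -> [{'text': 'alpha'}, {'text': 'beta'}]
```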