<h1 align="center">
<a href="https://prompts.chat">
This is a project that uses Stagehand V3, a browser automation framework with AI-powered `act`, `extract`, `observe`, and `agent` methods.
The main class can be imported as `Stagehand` from `@browserbasehq/stagehand`.
Key Classes:
- `Stagehand`: Main orchestrator class providing `act`, `extract`, `observe`, and `agent` methods
- `context`: A `V3Context` object that manages browser contexts and pages
- `page`: Individual page objects accessed via `stagehand.context.pages()[i]` or created with `stagehand.context.newPage()`

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({
  env: "LOCAL", // or "BROWSERBASE"
  verbose: 2, // 0, 1, or 2
  model: "openai/gpt-4.1-mini", // or any supported model
});

await stagehand.init();

// Access the browser context and pages
const page = stagehand.context.pages()[0];
const context = stagehand.context;

// Create new pages if needed
const page2 = await stagehand.context.newPage();
```
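Since `env` accepts either `"LOCAL"` or `"BROWSERBASE"`, one option is to derive it from the environment instead of hard-coding it. The sketch below is a hypothetical helper, not part of Stagehand; it assumes that a Browserbase session needs both `BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID` to be set, and it falls back to `"LOCAL"` otherwise.

```typescript
// Hypothetical helper (not part of Stagehand): pick the `env` value
// from a map of environment variables.
type StagehandEnv = "LOCAL" | "BROWSERBASE";

function resolveEnv(vars: Record<string, string | undefined>): StagehandEnv {
  // Assumption: both variables are required for a Browserbase session.
  const hasBrowserbaseCreds =
    Boolean(vars["BROWSERBASE_API_KEY"]) &&
    Boolean(vars["BROWSERBASE_PROJECT_ID"]);

  return hasBrowserbaseCreds ? "BROWSERBASE" : "LOCAL";
}
```

The result could then be passed to the constructor, for example `new Stagehand({ env: resolveEnv(process.env) })`.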
Actions are called on the `stagehand` instance (not the page). Use atomic, specific instructions:
```typescript
// Act on the current active page
await stagehand.act("click the sign in button");

// Act on a specific page (when you need to target a page that isn't currently active)
await stagehand.act("click the sign in button", { page: page2 });
```
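Important: Keep each `act` instruction atomic and specific. Each call should describe one concrete interaction, such as "click the sign in button", rather than a multi-step goal.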
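A rough pre-flight check can catch instructions that bundle several steps before they reach `act`. The helper below is a hypothetical heuristic, not a Stagehand feature: it only looks for connective words, so it will misfire on literal text such as "type 'salt and pepper' into the search box".

```typescript
// Heuristic only: markers that usually signal more than one step.
const MULTI_STEP_MARKERS = [" and ", " then ", " after that ", ";"];

// Returns true when the instruction contains none of the markers above.
function looksAtomic(instruction: string): boolean {
  const normalized = " " + instruction.toLowerCase() + " ";
  return !MULTI_STEP_MARKERS.some(
    (marker) => normalized.indexOf(marker) !== -1,
  );
}
```

An instruction that fails the check is a candidate for splitting into separate `act` calls or for handing to an agent instead.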
Cache the results of `observe` to avoid unexpected DOM changes:
```typescript
const instruction = "Click the sign in button";

// Get candidate actions
const actions = await stagehand.observe(instruction);

// Execute the first action
await stagehand.act(actions[0]);
```
To target a specific page:
```typescript
const actions = await stagehand.observe("select blue as the favorite color", {
  page: page2,
});
await stagehand.act(actions[0], { page: page2 });
```
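One way to keep observed actions around between runs is a small in-memory cache keyed by page URL and instruction. This is a minimal sketch under stated assumptions: `ActionCache` and `ObservedAction` are hypothetical names, and the `selector` and `description` fields are assumed to be present on the objects that `observe` returns.

```typescript
// Hypothetical shape: only the fields this sketch relies on.
interface ObservedAction {
  selector: string;
  description: string;
}

class ActionCache {
  private readonly entries: Record<string, ObservedAction> = {};

  // Normalize the instruction so trivial spacing/case differences still hit.
  private key(pageUrl: string, instruction: string): string {
    return pageUrl + "::" + instruction.trim().toLowerCase();
  }

  get(pageUrl: string, instruction: string): ObservedAction | undefined {
    const k = this.key(pageUrl, instruction);
    return Object.prototype.hasOwnProperty.call(this.entries, k)
      ? this.entries[k]
      : undefined;
  }

  set(pageUrl: string, instruction: string, action: ObservedAction): void {
    this.entries[this.key(pageUrl, instruction)] = action;
  }
}
```

On a cache miss, the caller would run `stagehand.observe(instruction)`, store the first result, and pass it to `stagehand.act`; on a hit, the stored action can be passed to `act` directly.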
Extract data from pages using natural language instructions. The `extract` method is called on the `stagehand` instance.
```typescript
import { z } from "zod/v3";

// Extract with explicit schema
const data = await stagehand.extract(
  "extract all apartment listings with prices and addresses",
  z.object({
    listings: z.array(
      z.object({
        price: z.string(),
        address: z.string(),
      }),
    ),
  }),
);

console.log(data.listings);
```
```typescript
// Extract returns a default object with 'extraction' field
const result = await stagehand.extract("extract the sign in button text");

console.log(result);
// Output: { extraction: "Sign in" }

// Or destructure directly
const { extraction } = await stagehand.extract(
  "extract the sign in button text",
);

console.log(extraction); // "Sign in"
```
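When the schema-less result is passed around as an untyped value, a type guard can narrow it before use. The sketch below assumes the `{ extraction: string }` shape shown in the output above; `isDefaultExtraction` and `extractionText` are hypothetical helper names.

```typescript
// Assumed shape of a schema-less extract result.
interface DefaultExtraction {
  extraction: string;
}

// Type guard: true only for objects with a string `extraction` field.
function isDefaultExtraction(value: unknown): value is DefaultExtraction {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { extraction?: unknown }).extraction === "string"
  );
}

// Returns the trimmed extraction text, or the fallback for any other shape.
function extractionText(value: unknown, fallback: string = ""): string {
  return isDefaultExtraction(value) ? value.extraction.trim() : fallback;
}
```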
Extract data from a specific element using a selector:
```typescript
const reason = await stagehand.extract(
  "extract the reason why script injection fails",
  z.string(),
  { selector: "/html/body/div[2]/div[3]/iframe/html/body/p[2]" },
);
```
When extracting links or URLs, use `z.string().url()`:
```typescript
const { links } = await stagehand.extract(
  "extract all navigation links",
  z.object({
    links: z.array(z.string().url()),
  }),
);
```
```typescript
// Extract from a specific page (when you need to target a page that isn't currently active)
const data = await stagehand.extract(
  "extract the placeholder text on the name field",
  { page: page2 },
);
```
Plan actions before executing them. `observe` returns an array of candidate actions:
```typescript
// Get candidate actions on the current active page
const [action] = await stagehand.observe("Click the sign in button");

// Execute the action
await stagehand.act(action);
```
Observing on a specific page:
```typescript
// Target a specific page (when you need to target a page that isn't currently active)
const actions = await stagehand.observe("find the next page button", {
  page: page2,
});
await stagehand.act(actions[0], { page: page2 });
```
Use the `agent` method to autonomously execute complex, multi-step tasks.
```typescript
const page = stagehand.context.pages()[0];
await page.goto("https://www.google.com");

const agent = stagehand.agent({
  model: "google/gemini-2.0-flash",
  executionModel: "google/gemini-2.0-flash",
});

const result = await agent.execute({
  instruction: "Search for the stock price of NVDA",
  maxSteps: 20,
});

console.log(result.message);
```
For more advanced scenarios using computer-use models:
```typescript
const agent = stagehand.agent({
  mode: "cua", // Enable Computer Use Agent mode
  model: "anthropic/claude-sonnet-4-20250514",
  // or "google/gemini-2.5-computer-use-preview-10-2025"
  systemPrompt: `You are a helpful assistant that can use a web browser.
    Do not ask follow up questions, the user will trust your judgement.`,
});

await agent.execute({
  instruction: "Apply for a library card at the San Francisco Public Library",
  maxSteps: 30,
});
```
```typescript
const agent = stagehand.agent({
  model: {
    modelName: "google/gemini-2.5-computer-use-preview-10-2025",
    apiKey: process.env.GEMINI_API_KEY,
  },
  systemPrompt: `You are a helpful assistant.`,
});
```
```typescript
const agent = stagehand.agent({
  integrations: [`https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`],
  systemPrompt: `You have access to the Exa search tool.`,
});
```
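Interpolating `process.env.EXA_API_KEY` directly produces the literal string `undefined` in the URL when the variable is unset. A small guard can fail fast instead. The helper below is a hypothetical sketch, not part of Stagehand or Exa; it reuses the endpoint shown above and assumes the key should be URL-encoded.

```typescript
// Hypothetical helper: build the Exa MCP URL, failing fast on a missing key.
function buildExaMcpUrl(apiKey: string | undefined): string {
  if (!apiKey || apiKey.trim() === "") {
    throw new Error("EXA_API_KEY is not set");
  }
  // Encode the key so characters such as "&" or "#" cannot break the query string.
  return (
    "https://mcp.exa.ai/mcp?exaApiKey=" + encodeURIComponent(apiKey.trim())
  );
}
```

The result can be passed as `integrations: [buildExaMcpUrl(process.env.EXA_API_KEY)]`.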
Hybrid mode uses both DOM-based and coordinate-based tools (`act`, `click`, `type`, `dragAndDrop`) for visual interactions. This requires `experimental: true` and models that support reliable coordinate-based actions.
Recommended models for hybrid mode:
- `google/gemini-3-flash-preview`
- `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-sonnet-4-5-20250929`, `anthropic/claude-haiku-4-5-20251001`

```typescript
const stagehand = new Stagehand({
  env: "LOCAL",
  experimental: true, // Required for hybrid mode
});
await stagehand.init();

const agent = stagehand.agent({
  mode: "hybrid",
  model: "google/gemini-3-flash-preview",
});

await agent.execute({
  instruction: "Click the submit button and fill the form",
  maxSteps: 20,
  highlightCursor: true, // Enabled by default in hybrid mode
});
```
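The two hybrid-mode requirements (the `experimental` flag and a grounding-capable model) can be checked before an agent is constructed. The sketch below is a hypothetical helper that treats the recommended models listed above as an allow-list and falls back to `"dom"` otherwise; the list is copied from this document and is not an exhaustive statement of what Stagehand supports.

```typescript
type AgentMode = "dom" | "hybrid";

// Models recommended for hybrid mode in this document.
const HYBRID_MODELS: readonly string[] = [
  "google/gemini-3-flash-preview",
  "anthropic/claude-sonnet-4-20250514",
  "anthropic/claude-sonnet-4-5-20250929",
  "anthropic/claude-haiku-4-5-20251001",
];

// Use "hybrid" only when both requirements hold; otherwise stay DOM-only.
function pickAgentMode(model: string, experimental: boolean): AgentMode {
  const supportsHybrid = HYBRID_MODELS.indexOf(model) !== -1;
  return experimental && supportsHybrid ? "hybrid" : "dom";
}
```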
Agent modes:

- `"dom"` (default): Uses DOM-based tools (`act`, `fillForm`) - works with any model
- `"hybrid"`: Uses both DOM-based and coordinate-based tools (`act`, `click`, `type`, `dragAndDrop`) - requires grounding-capable models
- `"cua"`: Uses Computer Use Agent providers

Target specific elements across shadow DOM and iframes:
```typescript
await page
  .deepLocator("/html/body/div[2]/div[3]/iframe/html/body/p")
  .highlight({
    durationMs: 5000,
    contentColor: { r: 255, g: 0, b: 0 },
  });
```
```typescript
const page1 = stagehand.context.pages()[0];
await page1.goto("https://example.com");

const page2 = await stagehand.context.newPage();
await page2.goto("https://example2.com");

// Act/extract/observe operate on the current active page by default
// Pass { page } option to target a specific page
await stagehand.act("click button", { page: page1 });
await stagehand.extract("get title", { page: page2 });
```
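With several pages open, looking a page up by its URL can be more robust than relying on its index in `pages()`. The sketch below is a hypothetical helper; it assumes page objects expose a synchronous `url()` method, which is declared here as a minimal structural type rather than imported from Stagehand.

```typescript
// Minimal structural type; assumes pages expose a synchronous url() method.
interface PageLike {
  url(): string;
}

// Returns the first page whose URL contains the given fragment, if any.
function findPageByUrl<P extends PageLike>(
  pages: P[],
  fragment: string,
): P | undefined {
  for (const page of pages) {
    if (page.url().indexOf(fragment) !== -1) {
      return page;
    }
  }
  return undefined;
}
```

The returned page can then be passed through the `{ page }` option, for example `findPageByUrl(stagehand.context.pages(), "example2.com")`.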