You are working with Scrapybara, a TypeScript SDK for deploying and managing remote desktop instances for AI agents. Use this guide to properly interact with the SDK.
CORE SDK USAGE:
- Initialize client: import { ScrapybaraClient } from "scrapybara"; const client = new ScrapybaraClient({ apiKey: "KEY" });
- Instance lifecycle:
const instance = await client.startUbuntu({ timeoutHours: 1 });
await instance.pause(); // Pause to save resources
await instance.resume({ timeoutHours: 1 }); // Resume work
await instance.stop(); // Terminate and clean up
- Instance types:
const ubuntuInstance = client.startUbuntu(); // supports bash, computer, edit, browser
const browserInstance = client.startBrowser(); // supports computer, browser
const windowsInstance = client.startWindows(); // supports computer
TYPE IMPORTS:
- Core types:
import { ScrapybaraClient, UbuntuInstance, BrowserInstance, WindowsInstance } from "scrapybara";
- Tool types:
import { bashTool, computerTool, editTool } from "scrapybara/tools";
- Model types:
import { anthropic } from "scrapybara/anthropic";
- Message types:
import { z } from "zod";
- Error types:
import { ScrapybaraError } from "scrapybara";
- Request/Response types:
import { Scrapybara } from "scrapybara"; // Namespace containing all request/response types
CORE INSTANCE OPERATIONS:
- Screenshots: const base64Image = await instance.screenshot().base64Image;
- Bash commands: await instance.bash({ command: "ls -la" });
- Mouse control: await instance.computer({ action: "move_mouse", coordinates: [x, y] });
- Click actions: await instance.computer({ action: "click_mouse", button: "right", coordinates: [x, y] });
- Drag actions: await instance.computer({ action: "drag_mouse", path: [[x1, y1], [x2, y2]] });
- Scroll actions: await instance.computer({ action: "scroll", coordinates: [x, y], delta_x: 0, delta_y: 0 });
- Key actions: await instance.computer({ action: "press_key", keys: ["a", "b", "c"] });
- Type actions: await instance.computer({ action: "type_text", text: "Hello world" });
- Wait actions: await instance.computer({ action: "wait", duration: 3 });
- Get cursor position: await instance.computer({ action: "get_cursor_position" });
- File operations: await instance.file.read({ path: "/path/file" }), await instance.file.write({ path: "/path/file", content: "data" });
ACT SDK (Primary Focus):
- Purpose: Enables building computer use agents with unified tools and model interfaces
- Core components:
- Model: Handles LLM integration (currently Anthropic)
import { anthropic } from "scrapybara/anthropic";
const model = anthropic(); // Or model = anthropic({ apiKey: "KEY" }) for own key
- Tools: Interface for computer interactions
- bashTool: Run shell commands
- computerTool: Mouse/keyboard control
- editTool: File operations
const tools = [
bashTool(instance),
computerTool(instance),
editTool(instance),
];
- Prompt:
- system: system prompt, recommend to use UBUNTU_SYSTEM_PROMPT, BROWSER_SYSTEM_PROMPT, WINDOWS_SYSTEM_PROMPT
- prompt: simple user prompt
- messages: list of messages
- Only include either prompt or messages, not both
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(),
tools,
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Task",
onStep: handleStep
});
MESSAGE HANDLING:
- Response Structure: Messages are structured with roles (user/assistant/tool) and typed content
- Content Types:
- TextPart: Simple text content
{ type: "text", text: "content" }
- ImagePart: Base64 or URL images
{ type: "image", image: "base64...", mimeType: "image/png" }
- ReasoningPart: Model reasoning content
{
type: "reasoning",
id: "id",
reasoning: "reasoning",
signature: "signature",
instructions: "instructions"
}
- ToolCallPart: Tool invocations
{
type: "tool-call",
toolCallId: "id",
toolName: "bash",
args: { command: "ls" }
}
- ToolResultPart: Tool execution results
{
type: "tool-result",
toolCallId: "id",
toolName: "bash",
result: "output",
isError: false
}
STEP HANDLING:
const handleStep = (step: Step) => {
console.log(
Text: ${step.text}
);
if (step.toolCalls) {
for (const call of step.toolCalls) {
console.log(
Tool: ${call.toolName}
);
}
}
if (step.toolResults) {
for (const result of step.toolResults) {
console.log(
Result: ${result.result}
);
}
}
console.log(
Tokens: ${step.usage?.totalTokens ?? 'N/A'}
);
};
STRUCTURED OUTPUT:
Use the schema parameter to define a desired structured output. The response's output field will contain the validated typed data returned by the model.
const schema = z.object({
posts: z.array(z.object({
title: z.string(),
url: z.string(),
points: z.number(),
})),
});
const { output } = await client.act({
model: anthropic(),
tools,
schema,
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Get the top 10 posts on Hacker News",
});
const posts = output.posts;
TOKEN USAGE:
- Track token usage through TokenUsage objects
- Fields: promptTokens, completionTokens, totalTokens
- Available in both Step and ActResponse objects
EXAMPLE:
import { ScrapybaraClient } from "scrapybara";
import { anthropic } from "scrapybara/anthropic";
import { UBUNTU_SYSTEM_PROMPT } from "scrapybara/prompts";
import { bashTool, computerTool, editTool } from "scrapybara/tools";
const client = new ScrapybaraClient();
const instance = await client.startUbuntu();
await instance.browser.start();
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(),
tools: [
bashTool(instance),
computerTool(instance),
editTool(instance),
],
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Go to the YC website and fetch the HTML",
onStep: (step) => console.log(
${step}\n
),
});
await instance.browser.stop();
await instance.stop();
EXECUTION PATTERNS:
- Basic agent execution:
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(),
tools,
system: "System context here",
prompt: "Task description"
});
- Browser automation:
const cdpUrl = await instance.browser.start().cdpUrl;
const authStateId = await instance.browser.saveAuth({ name: "default" }).authStateId; // Save auth
await instance.browser.authenticate({ authStateId }); // Reuse auth
- File management:
await instance.file.write({ path: "/tmp/data.txt", content: "content" });
const content = await instance.file.read({ path: "/tmp/data.txt" }).content;
- Environment variables:
await instance.env.set({ API_KEY: "value" });
const vars = await instance.env.get().variables;
await instance.env.delete(["VAR_NAME"]);
ERROR HANDLING:
import { ApiError } from "scrapybara/core";
try {
await client.startUbuntu();
} catch (e) {
if (e instanceof ApiError) {
console.error(
Error ${e.statusCode}: ${e.body}
);
}
}
IMPORTANT GUIDELINES:
- Always stop instances after use to prevent unnecessary billing
- Use async/await for all operations as they are asynchronous
- Handle API errors with try/catch blocks
- Default timeout is 60s; customize with timeout parameter or requestOptions
- Instance auto-terminates after 1 hour by default
- For browser operations, always start browser before browserTool usage
- Prefer bash commands over GUI interactions for launching applications