Sandbox Capabilities
Advanced sandbox features — browser pairing, AI agent integration, human handoff, task completion reports, document processing, and storage options.
Sandboxes support optional capabilities that extend them beyond basic code execution. These are configured in the capabilities field when creating a sandbox.
Browser Pairing
A sandbox can be paired with a browser session, giving you both a code environment and a live browser in a single coordinated workspace.
Browser modes
| Mode | Description | Required Plan |
|---|---|---|
| `none` | No browser (default) | All plans |
| `paired-banata-browser` | A dedicated Banata browser session is created alongside the sandbox | Pro and above |
| `local-chromium` | A local Chromium instance runs inside the sandbox container and is previewed as the real in-sandbox desktop/browser session | All plans |
Creating a sandbox with a paired browser
```json
{
  "runtime": "bun",
  "size": "standard",
  "capabilities": {
    "browser": {
      "mode": "local-chromium",
      "persistentProfile": true,
      "streamPreview": true,
      "humanInLoop": true,
      "viewport": {
        "width": 1280,
        "height": 800
      }
    }
  }
}
```

| Field | Type | Default | Description | Required Plan |
|---|---|---|---|---|
| `mode` | string | `"none"` | Browser mode (see table above) | — |
| `recording` | boolean | `false` | Record the browser session as MP4 | Pro and above |
| `persistentProfile` | boolean | `false` | Persist browser cookies/state across sessions | — |
| `streamPreview` | boolean | Auto | Enable live browser viewport streaming. In practice you normally leave this on for browser-enabled sandboxes. | — |
| `humanInLoop` | boolean | `false` | Enable human-in-the-loop control of the browser | — |
| `viewport.width` | number | Worker default | Requested browser viewport width in pixels | — |
| `viewport.height` | number | Worker default | Requested browser viewport height in pixels | — |
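The browser options can also be assembled and checked in code before the create request is sent. The following is a minimal Python sketch, not part of any Banata SDK: the helper name `browser_capability` and its validation are assumptions for illustration, derived only from the two tables above. Options left unset are omitted so the documented defaults apply.

```python
import json

# Modes taken from the "Browser modes" table.
BROWSER_MODES = {"none", "paired-banata-browser", "local-chromium"}


def browser_capability(mode, *, recording=None, persistent_profile=None,
                       stream_preview=None, human_in_loop=None, viewport=None):
    """Build the capabilities.browser object (hypothetical helper)."""
    if mode not in BROWSER_MODES:
        raise ValueError(f"unknown browser mode: {mode!r}")
    browser = {"mode": mode}
    optional = {
        "recording": recording,
        "persistentProfile": persistent_profile,
        "streamPreview": stream_preview,
        "humanInLoop": human_in_loop,
    }
    # Only send options that were set explicitly.
    browser.update({k: v for k, v in optional.items() if v is not None})
    if viewport is not None:
        width, height = viewport
        browser["viewport"] = {"width": width, "height": height}
    return browser


payload = {
    "runtime": "bun",
    "size": "standard",
    "capabilities": {
        "browser": browser_capability(
            "local-chromium",
            persistent_profile=True,
            human_in_loop=True,
            viewport=(1280, 800),
        )
    },
}
print(json.dumps(payload, indent=2))
```

Rejecting an unknown mode locally gives a clearer error than waiting for the API to refuse the request.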
When the sandbox is ready, the session response includes pairedBrowser with connection details:
```json
{
  "pairedBrowser": {
    "sessionId": "browser-session-id",
    "cdpUrl": "wss://...",
    "previewUrl": "https://...",
    "recording": true,
    "persistentProfile": true,
    "controlMode": "shared"
  }
}
```

Browser preview
When browser preview is enabled, get the browser preview metadata:
```bash
curl "https://api.boxes.banata.dev/v1/sandboxes/browser-preview?id=SANDBOX_ID" \
  -H "Authorization: Bearer br_live_..."
```

The response includes WebSocket URLs for real-time browser viewport streaming. This lets you build UIs that show what the browser is doing. For native sandbox browsers (`local-chromium`), the preview is the actual in-sandbox desktop/browser session exposed through noVNC.
If you want to hand a human a simple browser-only page, the SDK also derives a standalone viewer URL on your app host. That viewer page only renders the browser and handoff state and can be opened outside the dashboard.
Controlling the browser
Switch control between AI automation and human interaction:
```bash
curl -X POST "https://api.boxes.banata.dev/v1/sandboxes/browser-preview/control" \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "id": "SANDBOX_ID",
    "mode": "human",
    "controller": "user@example.com",
    "leaseMs": 300000
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | Sandbox session ID |
| `mode` | `"ai"` \| `"human"` \| `"shared"` | Yes | Who controls the browser |
| `controller` | string | No | Identifier for the current controller |
| `leaseMs` | number | No | How long the controller has exclusive access (ms) |
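The same control call can be prepared from Python using only the standard library. This is a sketch under stated assumptions: `control_request` is a hypothetical helper, and the endpoint, headers, and field names are copied from the curl example and table above. It builds the request without sending it.

```python
import json
import urllib.request

API_BASE = "https://api.boxes.banata.dev"
CONTROL_MODES = {"ai", "human", "shared"}


def control_request(api_key, sandbox_id, mode, controller=None, lease_ms=None):
    """Build (but do not send) a browser control request (hypothetical helper)."""
    if mode not in CONTROL_MODES:
        raise ValueError(f"mode must be one of {sorted(CONTROL_MODES)}")
    body = {"id": sandbox_id, "mode": mode}
    if controller is not None:
        body["controller"] = controller
    if lease_ms is not None:
        body["leaseMs"] = lease_ms
    return urllib.request.Request(
        f"{API_BASE}/v1/sandboxes/browser-preview/control",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = control_request("br_live_...", "SANDBOX_ID", "human",
                      controller="user@example.com", lease_ms=300_000)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect and test.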
OpenCode Integration
OpenCode is an AI agent that can generate and execute code inside the sandbox. It turns your sandbox into a prompt-driven development environment.
Enabling OpenCode
```json
{
  "runtime": "bun",
  "size": "standard",
  "capabilities": {
    "opencode": {
      "enabled": true,
      "defaultAgent": "build",
      "allowPromptApi": true
    }
  }
}
```

| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `false` | Enable OpenCode |
| `defaultAgent` | `"build"` \| `"plan"` | — | Default agent type for prompts |
| `allowPromptApi` | boolean | — | Allow sending prompts via the API. If `false`, OpenCode works internally but cannot receive prompts via `/v1/sandboxes/opencode/prompt`. |
Sending prompts
```bash
curl -X POST "https://api.boxes.banata.dev/v1/sandboxes/opencode/prompt" \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "id": "SANDBOX_ID",
    "prompt": "Create a TypeScript script that fetches the top 10 Hacker News stories and saves them as JSON."
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | Sandbox session ID |
| `prompt` | string | Yes | The instruction for the AI agent |
| `agent` | `"build"` \| `"plan"` | No | Agent type (uses `defaultAgent` if omitted) |
| `sessionId` | string | No | OpenCode session ID for conversation continuity |
| `noReply` | boolean | No | If `true`, send the prompt without waiting for a response |
OpenCode reads the current workspace state, generates code, writes files, and runs commands to fulfill the prompt.
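To illustrate how the optional prompt fields fit together, here is a small Python sketch that builds prompt bodies, including a follow-up that reuses a session ID. The helper `prompt_body` and the empty-prompt check are assumptions for illustration; only the field names come from the table above, and `SESSION_ID` is a placeholder.

```python
import json

AGENTS = {"build", "plan"}


def prompt_body(sandbox_id, prompt, *, agent=None, session_id=None, no_reply=None):
    """Return the JSON body for the OpenCode prompt endpoint (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")  # local sanity check, an assumption
    if agent is not None and agent not in AGENTS:
        raise ValueError(f"agent must be one of {sorted(AGENTS)}")
    body = {"id": sandbox_id, "prompt": prompt}
    optional = {"agent": agent, "sessionId": session_id, "noReply": no_reply}
    # Omitted fields fall back to server-side behavior such as defaultAgent.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


first = prompt_body("SANDBOX_ID", "Plan a scraper for the pricing page.", agent="plan")
follow_up = prompt_body("SANDBOX_ID", "Now implement it.", session_id="SESSION_ID")
print(json.dumps(follow_up))
```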
If you are building your own chat UI, treat OpenCode as a server-side agent runtime:
- `GET /v1/sandboxes/opencode/state` for current session/runtime status
- `GET /v1/sandboxes/opencode/messages` for message history
- `GET /v1/sandboxes/opencode/events` for live SSE streaming
- `POST /v1/sandboxes/opencode/prompt` or `/prompt-async` to submit work
This is the same model the hosted dashboard uses.
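A chat UI that consumes the events endpoint needs to decode server-sent events. This page does not document the event names or payload shapes, so the Python sketch below handles only the generic SSE framing (`event:` and `data:` fields, comments, and blank-line dispatch). The sample lines are invented purely for illustration.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Invented sample stream; real event names and payloads may differ.
sample = [
    "event: status\n",
    'data: {"state": "working"}\n',
    "\n",
    ": keep-alive\n",
    "data: hello\n",
    "data: world\n",
    "\n",
]
events = list(parse_sse(sample))
print(events)
```

In a real client the `lines` iterable would come from a streaming HTTP response decoded as UTF-8.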
Checking OpenCode status
```bash
curl "https://api.boxes.banata.dev/v1/sandboxes/runtime?id=SANDBOX_ID" \
  -H "Authorization: Bearer br_live_..."
```

The response includes the current state of OpenCode (`idle`, `working`, `complete`) along with any outputs.
Error responses
| Status | Body | Cause |
|---|---|---|
| 403 | OpenCode prompt API is disabled for this sandbox | allowPromptApi is false |
| 409 | OpenCode is not enabled | capabilities.opencode.enabled is not true |
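Client code can turn these two statuses into distinct exceptions so callers can react differently to each. The class and function names in this Python sketch are hypothetical, and the mapping relies only on the status codes in the table above. Other causes of a 403 or 409 are not covered on this page.

```python
class OpenCodeError(Exception):
    """Base error for non-2xx OpenCode responses (hypothetical)."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


class PromptApiDisabled(OpenCodeError):
    """403: allowPromptApi is false for this sandbox."""


class OpenCodeNotEnabled(OpenCodeError):
    """409: capabilities.opencode.enabled is not true."""


_ERRORS = {403: PromptApiDisabled, 409: OpenCodeNotEnabled}


def check_opencode_response(status, body_text=""):
    """Raise a specific exception for the documented error statuses."""
    if 200 <= status < 300:
        return
    raise _ERRORS.get(status, OpenCodeError)(status, body_text)


try:
    check_opencode_response(409, "OpenCode is not enabled")
except OpenCodeNotEnabled as exc:
    print("enable capabilities.opencode first:", exc)
```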
Human Handoff
Automated workflows sometimes encounter situations that require human judgment — MFA prompts, ambiguous UI elements, or CAPTCHA challenges. The human handoff system lets the sandbox request human assistance and pause automation until the human completes the task.
Requesting handoff
```bash
curl -X POST "https://api.boxes.banata.dev/v1/sandboxes/handoff/request" \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "id": "SANDBOX_ID",
    "reason": "mfa",
    "message": "Please enter the 6-digit code from your authenticator app."
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | Sandbox session ID |
| `reason` | string | Yes | One of: `mfa`, `captcha`, `approval`, `login_failed`, `ambiguous_ui`, `file_download`, `custom` |
| `message` | string | Yes | Human-readable description of what's needed |
| `requestedBy` | string | No | Who initiated: `opencode`, `worker`, `sdk`, `dashboard`. Defaults to `sdk`. |
| `resumePrompt` | string | No | Prompt to automatically execute when the human returns control |
| `expiresInMs` | number | No | When the handoff request expires (ms) |
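Because `reason` and `requestedBy` accept fixed sets of values, it can help to validate them before calling the API. The following Python sketch uses a hypothetical helper, `handoff_request_body`. The allowed values are copied from the table above, and the resume prompt text is an invented example.

```python
HANDOFF_REASONS = {
    "mfa", "captcha", "approval", "login_failed",
    "ambiguous_ui", "file_download", "custom",
}
REQUESTERS = {"opencode", "worker", "sdk", "dashboard"}


def handoff_request_body(sandbox_id, reason, message, *, requested_by=None,
                         resume_prompt=None, expires_in_ms=None):
    """Build the body for the handoff request endpoint (hypothetical helper)."""
    if reason not in HANDOFF_REASONS:
        raise ValueError(f"unsupported reason: {reason!r}")
    if requested_by is not None and requested_by not in REQUESTERS:
        raise ValueError(f"unsupported requestedBy: {requested_by!r}")
    body = {"id": sandbox_id, "reason": reason, "message": message}
    optional = {
        "requestedBy": requested_by,
        "resumePrompt": resume_prompt,
        "expiresInMs": expires_in_ms,
    }
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = handoff_request_body(
    "SANDBOX_ID",
    "mfa",
    "Please enter the 6-digit code from your authenticator app.",
    resume_prompt="Continue the login flow and open the billing page.",
    expires_in_ms=300_000,
)
print(body)
```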
Accepting handoff
A human sees the request through your application's UI and takes control:
```bash
curl -X POST "https://api.boxes.banata.dev/v1/sandboxes/handoff/accept" \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "id": "SANDBOX_ID",
    "controller": "user@example.com",
    "leaseMs": 300000
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | Sandbox session ID |
| `controller` | string | Yes | Identifier for the human taking control |
| `leaseMs` | number | No | How long the human has control (ms) |
Completing handoff
After the human finishes, they return control to automation:
```bash
curl -X POST "https://api.boxes.banata.dev/v1/sandboxes/handoff/complete" \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "id": "SANDBOX_ID",
    "note": "MFA code entered successfully",
    "returnControlTo": "ai",
    "runResumePrompt": true
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | Sandbox session ID |
| `controller` | string | No | Controller identifier |
| `note` | string | No | Completion note |
| `returnControlTo` | `"ai"` \| `"shared"` | No | Who gets control after completion |
| `runResumePrompt` | boolean | No | Execute the `resumePrompt` from the original request |
Checking handoff state
```bash
curl "https://api.boxes.banata.dev/v1/sandboxes/handoff?id=SANDBOX_ID" \
  -H "Authorization: Bearer br_live_..."
```

Returns the current handoff state including:

```json
{
  "humanHandoff": {
    "requestId": "...",
    "status": "pending",
    "reason": "mfa",
    "message": "Please enter the 6-digit code from your authenticator app.",
    "requestedBy": "sdk",
    "requestedAt": 1772078705353,
    "expiresAt": 1772079005353
  }
}
```

Handoff statuses: `pending` → `accepted` → `completed` (or `cancelled` / `expired`).
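Automation that is waiting on a human can poll the handoff state until it reaches one of the final statuses. The Python sketch below assumes that `completed`, `cancelled`, and `expired` are terminal, which the status flow above suggests but does not state outright. The function name, polling interval, and poll limit are illustrative choices, and `fetch_status` stands in for a call to the handoff endpoint.

```python
import time

# Assumed terminal statuses, inferred from the documented flow.
TERMINAL_STATUSES = {"completed", "cancelled", "expired"}


def wait_for_handoff(fetch_status, *, interval_s=2.0, max_polls=150, sleep=time.sleep):
    """Poll fetch_status() until the handoff reaches a terminal status."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_s)
    raise TimeoutError("handoff did not finish within the polling budget")


# Demo with a fake status source instead of a real API call.
fake_statuses = iter(["pending", "accepted", "completed"])
final = wait_for_handoff(lambda: next(fake_statuses), sleep=lambda _s: None)
print(final)
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real delays.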
Task Completion Reports
When the AI agent finishes a task, it can submit a structured completion report. The report contains a detailed step-by-step recipe — URLs visited, selectors used, values entered, wait times, and error workarounds — that can be saved as a browser skill and replayed by a cheaper model later.
How it works
OpenCode calls the banata-complete-task CLI tool (automatically installed in every sandbox with OpenCode enabled):
```bash
banata-complete-task \
  --result "Successfully scraped pricing data from 3 competitor sites" \
  --report "1. Navigate to https://example.com/pricing\n2. Wait for .pricing-table selector\n3. Extract text from .plan-name and .plan-price elements\n4. Click 'Show annual pricing' toggle (.billing-toggle)\n5. Re-extract prices for annual plans\n6. Save results to /workspace/output/pricing.json" \
  --url "https://example.com/pricing" \
  --steps 6 \
  --duration 45000
```

The report is automatically uploaded to R2 as a JSON artifact and synced to Convex.
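If a script running inside the sandbox needs to submit the report itself, the argument list can be built programmatically. This Python sketch only constructs and prints the command and does not run it. `complete_task_argv` is a hypothetical helper, the flags are the ones shown above, and treating `--url`, `--steps`, and `--duration` as optional is an assumption. It joins steps with real newline characters; whether the CLI expects those or literal `\n` sequences is not stated here.

```python
import shlex


def complete_task_argv(result, report, *, url=None, steps=None, duration_ms=None):
    """Build the banata-complete-task argument list (hypothetical helper)."""
    argv = ["banata-complete-task", "--result", result, "--report", report]
    if url is not None:
        argv += ["--url", url]
    if steps is not None:
        argv += ["--steps", str(steps)]
    if duration_ms is not None:
        argv += ["--duration", str(duration_ms)]
    return argv


step_list = [
    "Navigate to https://example.com/pricing",
    "Wait for .pricing-table selector",
]
report = "\n".join(f"{i}. {step}" for i, step in enumerate(step_list, start=1))
argv = complete_task_argv(
    "Scraped pricing data",
    report,
    url="https://example.com/pricing",
    steps=len(step_list),
    duration_ms=45000,
)
print(shlex.join(argv))
```

Passing `argv` as a list to `subprocess.run` avoids shell quoting problems with selectors and quotes inside the report text.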
Retrieving the last task report
```bash
curl "https://api.boxes.banata.dev/v1/sandboxes/runtime?id=SANDBOX_ID" \
  -H "Authorization: Bearer br_live_..."
```

The response includes a `lastTaskReport` field:

```json
{
  "lastTaskReport": {
    "result": "Successfully scraped pricing data from 3 competitor sites",
    "report": "1. Navigate to https://example.com/pricing\n2. Wait for .pricing-table selector\n3. ...",
    "url": "https://example.com/pricing",
    "stepsCompleted": 6,
    "durationMs": 45000,
    "completedAt": 1772078705353
  }
}
```

Report fields
| Field | Type | Required | Description |
|---|---|---|---|
| `result` | string | Yes | High-level summary of what was accomplished |
| `report` | string | Yes | Detailed step-by-step recipe: URLs visited, selectors used, values entered, wait times, error workarounds. This is the procedural recipe that gets auto-saved as a browser skill for future replay. |
| `url` | string | No | Final URL the browser was on when the task completed |
| `stepsCompleted` | number | No | Total number of steps executed |
| `actions` | string[] | No | List of individual actions taken |
| `durationMs` | number | No | Total duration in milliseconds |
| `usage` | object | No | LLM token usage metadata |
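On the client side, the report can be loaded into a typed structure so that optional fields are handled in one place. This Python sketch is illustrative: the `TaskReport` class is hypothetical, and its fields mirror the table above plus the `completedAt` value shown in the example response.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskReport:
    result: str
    report: str
    url: Optional[str] = None
    steps_completed: Optional[int] = None
    actions: list = field(default_factory=list)
    duration_ms: Optional[int] = None
    usage: Optional[dict] = None
    completed_at: Optional[int] = None

    @classmethod
    def from_runtime(cls, runtime):
        """Extract lastTaskReport from a runtime response dict, or return None."""
        raw = runtime.get("lastTaskReport")
        if raw is None:
            return None
        return cls(
            result=raw["result"],
            report=raw["report"],
            url=raw.get("url"),
            steps_completed=raw.get("stepsCompleted"),
            actions=list(raw.get("actions") or []),
            duration_ms=raw.get("durationMs"),
            usage=raw.get("usage"),
            completed_at=raw.get("completedAt"),
        )


response_text = """{"lastTaskReport": {
  "result": "Successfully scraped pricing data from 3 competitor sites",
  "report": "1. Navigate to https://example.com/pricing\\n2. Wait for .pricing-table selector",
  "url": "https://example.com/pricing",
  "stepsCompleted": 6,
  "durationMs": 45000,
  "completedAt": 1772078705353
}}"""
task_report = TaskReport.from_runtime(json.loads(response_text))
print(task_report.result, task_report.duration_ms)
```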
Replaying reports
Task reports are designed as replay recipes for cheaper models. The report field contains enough detail — selectors, values, timing, and workarounds — that a smaller model can follow the same procedure without needing to reason from scratch. Reports are persisted as R2 artifacts under {artifactPrefix}/task-reports/.
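Before handing a report to a smaller model, a replay harness may want to split it into individual steps. The Python sketch below assumes the report is a numbered list with one step per line, as in the examples on this page. The format is free text, so treat this as a best-effort parser rather than a guaranteed contract.

```python
import re

# Matches lines such as "3. Extract text from .plan-name".
STEP_PATTERN = re.compile(r"^[ \t]*(\d+)\.[ \t]+(.*)$", re.MULTILINE)


def split_steps(report):
    """Return [(step_number, instruction), ...] for numbered lines in a report."""
    return [(int(num), text.strip()) for num, text in STEP_PATTERN.findall(report)]


report_text = (
    "1. Navigate to https://example.com/pricing\n"
    "2. Wait for .pricing-table selector\n"
    "3. Extract text from .plan-name and .plan-price elements"
)
for number, instruction in split_steps(report_text):
    print(number, instruction)
```

Lines that do not start with a number and a period are ignored, so stray notes in a report do not break the step list.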
Document Processing
For sandboxes that need to work with office documents, enable the documents capability:
```json
{
  "runtime": "base",
  "size": "standard",
  "capabilities": {
    "documents": {
      "libreofficeHeadless": true
    }
  }
}
```

This pre-installs LibreOffice in headless mode, enabling document conversion, rendering, and manipulation:

```bash
# Convert DOCX to PDF
libreoffice --headless --convert-to pdf document.docx
# Convert spreadsheet to CSV
libreoffice --headless --convert-to csv spreadsheet.xlsx
```

Storage Options
Configure workspace persistence and artifact organization:
```json
{
  "capabilities": {
    "storage": {
      "workspace": "checkpointed",
      "artifactPrefix": "my-project/batch-1/"
    }
  }
}
```

| Field | Type | Default | Description |
|---|---|---|---|
| `workspace` | `"ephemeral"` \| `"checkpointed"` | `"ephemeral"` | Whether workspace can be checkpointed |
| `artifactPrefix` | string | — | Custom prefix for artifact storage paths |
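The storage options can be validated the same way as the other capabilities, and the prefix can be used to work out where task reports are likely to be stored. In this Python sketch both helpers are hypothetical. How the service normalizes a trailing slash in `artifactPrefix` is not documented here, so `task_reports_prefix` is only an assumption about the `{artifactPrefix}/task-reports/` layout.

```python
WORKSPACE_MODES = {"ephemeral", "checkpointed"}


def storage_capability(workspace="ephemeral", artifact_prefix=None):
    """Build the capabilities.storage object (hypothetical helper)."""
    if workspace not in WORKSPACE_MODES:
        raise ValueError(f"workspace must be one of {sorted(WORKSPACE_MODES)}")
    storage = {"workspace": workspace}
    if artifact_prefix is not None:
        storage["artifactPrefix"] = artifact_prefix
    return storage


def task_reports_prefix(artifact_prefix):
    """Assumed location of task reports under a given artifact prefix."""
    # Strip trailing slashes so the joined path never contains "//".
    return artifact_prefix.rstrip("/") + "/task-reports/"


storage = storage_capability("checkpointed", "my-project/batch-1/")
print(storage)
print(task_reports_prefix(storage["artifactPrefix"]))
```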
Combining Capabilities
Capabilities can be combined freely. A common pattern is a sandbox with browser pairing, OpenCode, and human handoff:
```json
{
  "runtime": "bun",
  "size": "standard",
  "capabilities": {
    "browser": {
      "mode": "local-chromium",
      "recording": true,
      "streamPreview": true,
      "humanInLoop": true,
      "viewport": {
        "width": 1280,
        "height": 800
      }
    },
    "opencode": {
      "enabled": true,
      "defaultAgent": "build",
      "allowPromptApi": true
    },
    "storage": {
      "workspace": "checkpointed"
    }
  }
}
```

This creates an AI-driven workspace that can browse the web, write code, execute it, request human help for tricky situations, save everything for later, and produce detailed task reports that can be replayed by cheaper models.
All sandbox examples here use the single public sandbox size: standard.
Next Steps
- Sandboxes — Overview and session management
- Sandbox Execution — Running commands and code
- Sandbox File System — File operations and persistence
- API Reference — Full endpoint documentation