Introduction
Understand what Banata is, what a sandbox contains, and how to think about agent workflows, browser work, human handoff, and durable outputs.
Banata gives your app a remote workspace that can do real work on your behalf.
That workspace is called a sandbox.
A sandbox can:
- run commands
- execute code
- read and write files
- use a live browser
- let an AI agent work across those tools
- let a person take over when needed
- return durable outputs back to your app
The important idea is that all of that happens in one place.
You are not stitching together one browser service, another file worker, and a separate agent session.
Your job runs inside one sandbox, and every step of the workflow shares the same workspace and state.
What problems Banata solves
Banata is useful when a job is bigger than one API call.
Examples:
- research a site, open supporting pages, write a report, and return a PDF
- sign in to a service, pause for a person to complete a verification step, then continue automatically
- gather files, transform them, and export them in a format your app needs
- let an AI agent work for a long time without forcing your app to keep one request open
- keep a real browser, real files, and real agent context together instead of rebuilding context every step
If your use case is only “run one command and return stdout”, Banata still works, but it is most valuable when a workflow spans:
- tools
- time
- browser state
- files
- people
The mental model
Think about Banata in this order:
- A sandbox is the job container.
- The browser is one capability inside that sandbox.
- The AI agent is another capability inside that same sandbox.
- Files in /workspace are the shared memory of the workflow.
- Artifacts are the durable outputs you hand back to your app.
- Webhooks are how your app learns that long-running work finished.
That mental model matters because it helps you choose the right API shape.
For example:
- use exec() when you already know the command you want
- use promptAsync() when the work is multi-step or may take a while
- use /workspace when the workflow should produce files another step will read later
- use a webhook when your own backend should react when the job finishes
What a sandbox includes
A sandbox is not only a shell session.
It is a working environment that can include:
- a writable workspace at /workspace
- an AI agent
- a live browser preview
- human handoff controls
- document tools
- checkpoints
- artifacts
- optional browser recording
- optional outbound proxy configuration on supported plans
From your app’s point of view, that means you can build workflows like:
- “open these pages, summarize them, save the result, convert it to PDF, and notify me when done”
- “drive the browser until login is needed, hand off to a person, then resume”
- “download a document, transform it, and return the final file to my system”
How to think about job boundaries
A good default is:
- one meaningful job
- one sandbox
Examples of a good boundary:
- one customer onboarding flow
- one research request
- one document generation task
- one browser automation run
That keeps the workflow easier to reason about because the browser state, generated files, and agent context all belong to the same job.
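One way to hold this boundary in application code is a small registry that gives each job exactly one sandbox and reuses it for every later step. This is a minimal sketch: `JobSandboxRegistry`, the string sandbox IDs, and the `createSandbox` callback are all hypothetical names, and the callback stands in for whatever call actually creates a sandbox.

```typescript
// Hypothetical helper: nothing here is part of Banata. The createSandbox
// callback stands in for whatever call really creates a sandbox.
class JobSandboxRegistry {
  private readonly sandboxByJob = new Map<string, string>();
  private readonly createSandbox: (jobId: string) => string;

  constructor(createSandbox: (jobId: string) => string) {
    this.createSandbox = createSandbox;
  }

  // Returns the job's sandbox ID, creating one only on first use.
  sandboxFor(jobId: string): string {
    const existing = this.sandboxByJob.get(jobId);
    if (existing !== undefined) {
      return existing;
    }
    const created = this.createSandbox(jobId);
    this.sandboxByJob.set(jobId, created);
    return created;
  }

  // Forgets the mapping once the job is done; returns false for unknown jobs.
  release(jobId: string): boolean {
    return this.sandboxByJob.delete(jobId);
  }
}
```

Every step of a job calls `sandboxFor()` with the same job ID, so browser state, files, and agent context stay together. A second job gets its own sandbox.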
Direct control vs agent-driven work
Banata supports both.
Direct control
Use direct methods when your app already knows exactly what should happen.
Examples:
- exec() to run a known command
- runCode() to execute a short snippet
- fs.write() to create a file
- navigatePreview() when you want to push the browser to a known URL
Direct methods are best when the workflow is deterministic.
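A deterministic job built from the direct methods can be sketched as one fixed sequence of calls. The method names exec(), fs.write(), and navigatePreview() come from the list above, but this page does not give their signatures, so the `DirectSandbox` interface, its parameter and return types, and the in-memory `FakeSandbox` are assumptions made for illustration, not the Banata SDK.

```typescript
// Hypothetical shapes: the method names come from the text above, but every
// signature and return type here is an assumption, not the real Banata SDK.
interface ExecResult {
  exitCode: number;
  stdout: string;
}

interface DirectSandbox {
  exec(command: string): Promise<ExecResult>;
  fs: { write(path: string, content: string): Promise<void> };
  navigatePreview(url: string): Promise<void>;
}

// In-memory stand-in that records each call so the sketch runs without a network.
class FakeSandbox implements DirectSandbox {
  readonly calls: string[] = [];
  readonly files = new Map<string, string>();

  fs = {
    write: async (path: string, content: string): Promise<void> => {
      this.calls.push(`fs.write ${path}`);
      this.files.set(path, content);
    },
  };

  async exec(command: string): Promise<ExecResult> {
    this.calls.push(`exec ${command}`);
    return { exitCode: 0, stdout: "" };
  }

  async navigatePreview(url: string): Promise<void> {
    this.calls.push(`navigatePreview ${url}`);
  }
}

// A deterministic job: the app already knows every step, so no agent is involved.
async function prepareReportInputs(sandbox: DirectSandbox): Promise<void> {
  await sandbox.fs.write("/workspace/input.txt", "hello");
  const result = await sandbox.exec("wc -c /workspace/input.txt");
  if (result.exitCode !== 0) {
    throw new Error(`command failed with exit code ${result.exitCode}`);
  }
  await sandbox.navigatePreview("https://example.com");
}
```

Because `prepareReportInputs()` depends only on the interface, the same function can run against the fake in tests and against a real sandbox object in production, provided the real object is adapted to these assumed signatures.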
Agent-driven work
Use the AI agent when the job needs decision-making or multi-step reasoning.
Examples:
- research across several pages
- compare content from different sources
- read files, then produce a new document
- use the browser, then save a result to /workspace
The best default for real product usage is promptAsync() because:
- it returns quickly
- it gives you a taskId
- it works well with webhooks
- it fits long-running jobs better than one open request
How long-running work fits
This is one of the most important concepts in the product.
You do not need to keep one HTTP request open while the AI agent works.
A good production flow is:
- create or launch a sandbox
- call promptAsync()
- store the returned taskId
- pass your own metadata with the task
- receive completion through a webhook
- fetch files or artifacts when the work is done
That is the normal Banata pattern for jobs that may run for minutes or longer.
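From your backend's point of view, the flow above might look like the sketch below. It assumes that promptAsync() accepts an object with a prompt and metadata and resolves to an object containing taskId, and that the webhook delivers a payload with a taskId, a status, and your metadata. None of those shapes are confirmed by this page; `PromptRequest`, `WebhookPayload`, the status values, and the `taskStates` store are hypothetical.

```typescript
// Assumed shapes: the text names promptAsync() and taskId, but the argument
// object, the webhook payload fields, and the status values are hypothetical.
interface PromptRequest {
  prompt: string;
  metadata: Record<string, string>;
}

interface AgentSandbox {
  promptAsync(request: PromptRequest): Promise<{ taskId: string }>;
}

interface WebhookPayload {
  taskId: string;
  status: "completed" | "failed";
  metadata: Record<string, string>;
}

type TaskState = "running" | "completed" | "failed";

// Maps each taskId to its state so the webhook handler can find the task later.
// A real backend would use a database; a Map keeps the sketch self-contained.
const taskStates = new Map<string, TaskState>();

// Start the task, attach your own metadata, and remember the returned taskId.
// No request stays open while the agent works.
async function startResearch(sandbox: AgentSandbox, orderId: string): Promise<string> {
  const { taskId } = await sandbox.promptAsync({
    prompt: "Research the site and save a report to /workspace/report.md",
    metadata: { orderId },
  });
  taskStates.set(taskId, "running");
  return taskId;
}

// Called by your HTTP layer when a completion webhook arrives.
// Returns false for unknown or already-finished tasks, so an unexpected
// or repeated delivery does not overwrite a final state.
function handleWebhook(payload: WebhookPayload): boolean {
  if (taskStates.get(payload.taskId) !== "running") {
    return false;
  }
  taskStates.set(payload.taskId, payload.status);
  return true;
}
```

When `handleWebhook()` returns true, your app can go on to fetch files or artifacts for that task. The metadata you passed in is what lets you map the task back to your own records.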
Where results live
Results usually appear in one of two places:
/workspace
Use /workspace for working files:
- notes
- reports
- downloaded inputs
- intermediate outputs
- files the agent should keep editing
artifacts
Use artifacts for outputs your app should download or keep as durable results:
- converted documents
- workspace packages
- browser recordings
- checkpoint-related outputs
The common pattern is:
- the agent writes working files into /workspace
- your app reads small files directly if needed
- your app uses artifact download URLs for larger or durable outputs
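The choice between reading a file directly and using an artifact download URL can be written as a small decision function. This is a sketch under stated assumptions: the `OutputInfo` descriptor, the `durable` flag, and the one-mebibyte threshold are hypothetical and chosen only to make the rule concrete. This page does not define what counts as a "small" file.

```typescript
// Hypothetical descriptor for one output of a finished job. The field names
// and the size threshold are assumptions chosen for this sketch.
interface OutputInfo {
  path: string; // for example "/workspace/notes.md"
  sizeBytes: number;
  durable: boolean; // true when your app should keep this as a durable result
}

type FetchPlan =
  | { kind: "read-file"; path: string }
  | { kind: "artifact-download"; path: string };

// Arbitrary example threshold, not a Banata limit.
const SMALL_FILE_LIMIT_BYTES = 1024 * 1024;

// Small working files are read directly; larger or durable outputs go
// through an artifact download.
function planFetch(output: OutputInfo): FetchPlan {
  if (output.durable || output.sizeBytes > SMALL_FILE_LIMIT_BYTES) {
    return { kind: "artifact-download", path: output.path };
  }
  return { kind: "read-file", path: output.path };
}
```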
Human handoff
Not every step should be automated.
Banata includes handoff because real workflows often need:
- login
- approval
- verification
- manual inspection
The important part is that handoff does not create a separate environment.
The person and the AI agent work on the same browser and the same sandbox.
When the person is done, control can go back to the agent and the workflow can continue from that exact state.
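The key property of handoff, that control changes while the sandbox stays the same, can be modeled as a two-state machine. The sketch below illustrates the idea only: `HandoffSession`, `requestHandoff()`, and `resumeAgent()` are hypothetical names, and this page does not describe Banata's actual handoff controls.

```typescript
// Illustrative model only: these names are hypothetical, not Banata APIs.
type Controller = "agent" | "human";

interface HandoffSession {
  readonly sandboxId: string; // stays the same across every handoff
  controller: Controller;
  reason?: string; // why a person was asked to step in
}

// The agent pauses and a person takes over the same sandbox.
function requestHandoff(session: HandoffSession, reason: string): HandoffSession {
  if (session.controller === "human") {
    throw new Error("a person already has control");
  }
  return { sandboxId: session.sandboxId, controller: "human", reason };
}

// The person is done and the agent continues in the same sandbox.
function resumeAgent(session: HandoffSession): HandoffSession {
  if (session.controller !== "human") {
    throw new Error("the agent already has control");
  }
  return { sandboxId: session.sandboxId, controller: "agent" };
}
```

Both transitions copy `sandboxId` unchanged, which mirrors the point above: the person and the agent share one browser and one workspace, so nothing has to be rebuilt when control returns.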
The best first way to use Banata
If you are new to the product, start with this pattern:
- use launch() to create a sandbox and wait for readiness
- send one promptAsync() task
- have the agent save a result into /workspace
- inspect the preview URL while it works
- read the file back or download an artifact
- end the sandbox when finished
- read the file back or download an artifact
- end the sandbox when finished
That single flow teaches almost every core concept:
- readiness
- agent tasks
- browser work
- files
- outputs
- cleanup
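The first-run pattern can be sketched end to end as follows. Only launch(), promptAsync(), the preview URL, and /workspace are named on this page. The `FirstRunClient` and `FirstRunSandbox` interfaces, the `previewUrl` property, `fs.read()`, `end()`, and especially `waitForTask()` (a stand-in for however your app learns the task finished, such as a webhook) are assumptions for illustration, and `createFakeClient()` is an in-memory fake so the sketch runs without a real sandbox.

```typescript
// Hypothetical client surface. Apart from launch() and promptAsync(), every
// member name and signature below is an assumption, not the Banata SDK.
interface FirstRunSandbox {
  previewUrl: string;
  promptAsync(request: { prompt: string }): Promise<{ taskId: string }>;
  waitForTask(taskId: string): Promise<void>; // assumption: stands in for a webhook
  fs: { read(path: string): Promise<string> };
  end(): Promise<void>;
}

interface FirstRunClient {
  launch(): Promise<FirstRunSandbox>;
}

interface FirstRunResult {
  previewUrl: string;
  summary: string;
}

async function firstRun(client: FirstRunClient): Promise<FirstRunResult> {
  const sandbox = await client.launch(); // readiness
  try {
    const { taskId } = await sandbox.promptAsync({
      prompt: "Summarize https://example.com into /workspace/summary.md",
    }); // agent task and browser work
    await sandbox.waitForTask(taskId);
    const summary = await sandbox.fs.read("/workspace/summary.md"); // files and outputs
    return { previewUrl: sandbox.previewUrl, summary };
  } finally {
    await sandbox.end(); // cleanup runs even if a step above throws
  }
}

// In-memory fake that logs each call; the "agent" just writes a fixed file.
function createFakeClient(log: string[]): FirstRunClient {
  const files = new Map<string, string>();
  return {
    launch: async (): Promise<FirstRunSandbox> => {
      log.push("launch");
      return {
        previewUrl: "https://preview.invalid/sandbox-1",
        promptAsync: async () => {
          log.push("promptAsync");
          files.set("/workspace/summary.md", "fake summary");
          return { taskId: "task_1" };
        },
        waitForTask: async () => {
          log.push("waitForTask");
        },
        fs: {
          read: async (path: string) => {
            log.push("fs.read");
            const content = files.get(path);
            if (content === undefined) {
              throw new Error(`no such file: ${path}`);
            }
            return content;
          },
        },
        end: async () => {
          log.push("end");
        },
      };
    },
  };
}
```

Putting `end()` in a `finally` block is a deliberate choice in this sketch: the sandbox is cleaned up whether the task succeeds or fails.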