The Browser API
Built for AI Agents

Control real browsers through a simple REST API. Get structured page data, stable element refs, and change diffs instead of raw HTML.

Stability detection built in

Fraction of the payload size

Diffs after every action

Start Free Trial Read the Docs

No credit card required. 1 hour of free runtime included.

See the API in Action

Six steps to show the full lifecycle: create a session, observe the page, fill forms, extract data, scroll, and screenshot.

POST /v1/sessions

{
  "url": "https://app.example.com/login",
  "viewport": {
    "width": 1280,
    "height": 720
  },
  "auto_dismiss_blockers": true
}

201 Created

{
  "session_id": "ses_abc123def456",
  "expires_at": "2026-03-22T14:05:00.000Z",
  "request_id": "req_8f3a2bc1d4e5",
  "completed": 0,
  "page": {
    "url": "https://app.example.com/login",
    "title": "Log In",
    "stable": true,
    "markdown": {
      "content": "# Log In\n\nWelcome back..."
    },
    "interactive_elements": [
      { "ref": "e1", "tag": "input",
        "label": "Email", "in": "form", "near": "Log In", "form": "f1" },
      { "ref": "e2", "tag": "input",
        "label": "Password", "in": "form", "near": "Log In", "form": "f1" },
      { "ref": "e3", "tag": "button",
        "label": "Sign In", "in": "form", "near": "Log In", "form": "f1" }
    ],
    "forms": [
      { "ref": "f1", "id": "login", "action": "/login", "method": "POST", "fields": ["e1", "e2", "e3"] }
    ]
  },
  "media": [],
  "extraction": null,
  "blockers_dismissed": ["cookie_consent"],
  "error": null
}

Without Browserbeam

Launch Puppeteer, set viewport, navigate, wait for networkidle, detect and dismiss cookie banner (2-3 extra actions), call page.content(), parse 15,000+ character HTML with cheerio, manually extract form fields.

~25 lines of code. A wall of raw HTML for your LLM to parse.

With Browserbeam

One POST request. Navigate, auto-dismiss the cookie banner, return markdown content, element refs, and form structures. The page is ready for your agent to read and act on.

1 API call. Markdown + refs + forms. Compact and LLM-ready.

POST /v1/sessions/ses_abc123/act

{
  "steps": [
    {
      "observe": {
        "scope": "#main-content",
        "format": "markdown",
        "include_links": true
      }
    }
  ]
}

200 OK

{
  "session_id": "ses_abc123def456",
  "expires_at": "2026-03-22T14:05:00.000Z",
  "request_id": "req_c7d41e8f2a09",
  "completed": 1,
  "page": {
    "url": "https://app.example.com/login",
    "stable": true,
    "markdown": {
      "content": "# Log In\n\nWelcome back.\nEnter your credentials.",
      "length": 48
    },
    "interactive_elements": [
      { "ref": "e1", "tag": "input",
        "label": "Email", "in": "form", "form": "f1" },
      { "ref": "e2", "tag": "input",
        "label": "Password", "in": "form", "form": "f1" },
      { "ref": "e3", "tag": "button",
        "label": "Sign In", "in": "form", "form": "f1" }
    ],
    "changes": null
  },
  "media": [],
  "extraction": null,
  "error": null
}

Without Browserbeam

Call page.content() for the full DOM (15,000+ characters), then use cheerio or regex to extract just the section you need. Convert HTML to markdown yourself. No way to detect what changed since your last read.

A bloated payload for a single page read. No scoping. No diff.

With Browserbeam

Scope to a CSS selector, get back clean markdown and element refs for just that section. The changes field shows what shifted since the last observation, so your agent never re-reads stale content.

Scoped and compact. Markdown built in. Diff tracking automatic.

POST /v1/sessions/ses_abc123/act

{
  "steps": [
    {
      "fill_form": {
        "fields": {
          "Email": "user@example.com",
          "Password": "secret123"
        },
        "submit": true
      }
    }
  ]
}

200 OK

{
  "session_id": "ses_abc123def456",
  "expires_at": "2026-03-22T14:05:00.000Z",
  "request_id": "req_e2b8c5f19d34",
  "completed": 1,
  "page": {
    "url": "https://app.example.com/dashboard",
    "title": "Dashboard",
    "stable": true,
    "changes": {
      "content_changed": true,
      "elements_added": [
        { "ref": "e1", "tag": "a" },
        { "ref": "e2", "tag": "a" }
      ]
    },
    "interactive_elements": [
      { "ref": "e1", "tag": "a",
        "label": "Settings", "in": "nav" },
      { "ref": "e2", "tag": "a",
        "label": "Logout", "in": "nav" }
    ]
  },
  "media": [],
  "extraction": null,
  "error": null
}

Without Browserbeam

Find email input by CSS selector (breaks if markup changes), page.type() the value, find password input, type again, find submit button by selector, page.click(), waitForNavigation(), then re-scrape the entire page to see results.

~15 lines. 3 fragile selectors. A full page re-read just to see what changed.

With Browserbeam

One step matches fields by label, fills both inputs, finds the submit button, clicks it, and waits for navigation. The response includes the new page state with a diff showing what changed.

1 API call. 0 selectors. Change diff included automatically.

POST /v1/sessions/ses_abc123/act

{
  "steps": [
    {
      "extract": {
        "products": [
          {
            "_parent": ".product-card",
            "name": "h2 >> text",
            "price": ".price >> text",
            "url": "a >> href"
          }
        ]
      }
    }
  ]
}

200 OK

{
  "session_id": "ses_abc123def456",
  "request_id": "req_a1d9f3c72b60",
  "completed": 1,
  "extraction": {
    "products": [
      {
        "name": "Wireless Headphones",
        "price": "$79.99",
        "url": "/products/wireless-headphones"
      },
      {
        "name": "USB-C Hub",
        "price": "$34.99",
        "url": "/products/usb-c-hub"
      }
    ]
  },
  "media": [],
  "error": null
}

Without Browserbeam

Write page.evaluate() with querySelectorAll, map each element to an object, handle null checks for missing fields, JSON.stringify the result, parse it back in Node.js. Selectors break when the site redesigns.

~20 lines of in-page JavaScript. Fragile. No schema validation.

With Browserbeam

Declare the shape of the data you want. The extract step handles the DOM traversal and returns clean, structured JSON. Tables are auto-parsed. Supports CSS selectors and XPath.

1 declarative schema. Structured JSON response.

POST /v1/sessions/ses_abc123/act

{
  "steps": [
    {
      "scroll_collect": {
        "max_text_length": 50000,
        "max_scrolls": 30
      }
    }
  ]
}

200 OK

{
  "session_id": "ses_abc123def456",
  "request_id": "req_f4a82d1c0e57",
  "completed": 1,
  "page": {
    "url": "https://news.example.com/feed",
    "title": "Latest News",
    "stable": true,
    "markdown": {
      "content": "# Latest News\n\n## Story 1...",
      "length": 32840
    },
    "interactive_elements": [
      { "ref": "e1", "label": "Load More", "in": "main" }
    ],
    "scroll": {
      "y": 14200,
      "height": 14200,
      "percent": 100
    }
  },
  "media": [],
  "extraction": null,
  "error": null
}

Without Browserbeam

Write a scroll loop: scroll down, wait for lazy content to load, check if you've reached the bottom, repeat. Handle race conditions with loading spinners and infinite scroll triggers. Collect content at each position, deduplicate.

~35 lines. Fragile timing. Content deduplication is your problem.

With Browserbeam

One step. Scrolls through the entire page, waits for lazy-loaded content at each position, deduplicates, and returns a single unified markdown observation. Handles infinite scroll, loading spinners, and content gates.

1 step. Full page content in one response. Up to 50,000 characters.

POST /v1/sessions/ses_abc123/act

{
  "steps": [
    {
      "screenshot": {
        "full_page": true,
        "format": "png"
      }
    },
    {
      "close": {}
    }
  ]
}

200 OK

{
  "session_id": "ses_abc123def456",
  "request_id": "req_d5c83e7f1b92",
  "completed": 2,
  "media": [
    {
      "type": "screenshot",
      "format": "png",
      "data": "iVBORw0KGgo..."
    }
  ],
  "extraction": null,
  "page": null,
  "error": null
}

Without Browserbeam

page.screenshot() to a temp file, read the file into a buffer, base64-encode it, browser.close(), handle cleanup errors if the process crashed. You manage Chrome process lifecycle yourself.

~12 lines. Must manage file I/O and process cleanup.

With Browserbeam

Screenshot and close in a single call. Base64 image data returned inline. The close step destroys the session and stops the billing clock. No cleanup code needed.

1 API call, 2 steps. Session cleaned up automatically.

An Intelligence Layer for the Browser

Nine capabilities that sit between your agent and the page, so the LLM spends tokens on the task, not on browser overhead.

Stability Detection

Every response includes a stability signal that tells your agent when the page is fully loaded and ready. No more guessing wait times or burning tokens on premature reads.

Element Registry

Interactive elements get short, stable refs like e1, e2, e3. Your agent clicks by ref instead of constructing fragile CSS selectors.

Diff Tracking

After each action, the API returns only what changed: elements added, removed, or modified. Your agent reads a 30-token diff instead of re-parsing the entire page.

Blocker Dismissal

Cookie banners, newsletter popups, and chat widgets are detected and dismissed automatically. Your agent never wastes actions on interruptions irrelevant to the task.

Semantic Extraction

Pages are compressed into a structured, token-efficient representation: interactive elements, headings, and visible text. Thousands of DOM nodes become a compact JSON object.

Error Enrichment

When an action fails, you get context, not just "element not found." The API tells you if an overlay is blocking the target, if a CAPTCHA appeared, and what to do next.

JavaScript Execution

Run custom JavaScript on any page when built-in steps aren't enough. Your agent writes a JS snippet and the API executes it in the browser context, returning the result as structured data.

Cookie Injection

Inject cookies at session creation to skip login flows entirely. Your agent authenticates once, saves the cookies, and resumes authenticated sessions instantly.

Smart Wait Conditions

Wait for JavaScript expressions to become truthy, not just DOM selectors. Your agent handles complex SPAs where visibility depends on framework state, not raw DOM presence.

What Will You Build?

One API, many possibilities. From autonomous agents to data pipelines, Browserbeam gives your code a browser it can see through.

Give your AI agent a real browser it can see and control. Every response includes markdown, stable element refs, and optional context—landmark, nearby heading, and parent form—plus forms grouped with their field refs.

Request

POST /v1/sessions

{
  "url": "https://app.example.com/login",
  "viewport": {
    "width": 1280,
    "height": 720
  },
  "auto_dismiss_blockers": true
}

Response

{
  "session_id": "ses_abc123def456",
  "expires_at": "2026-03-22T14:05:00.000Z",
  "request_id": "req_8f3a2bc1d4e5",
  "page": {
    "url": "https://app.example.com/login",
    "title": "Log In",
    "stable": true,
    "markdown": {
      "content": "# Log In\n\nWelcome back..."
    },
    "interactive_elements": [
      { "ref": "e1", "tag": "input", "label": "Email", "in": "form", "near": "Log In", "form": "f1" },
      { "ref": "e2", "tag": "input", "label": "Password", "in": "form", "near": "Log In", "form": "f1" },
      { "ref": "e3", "tag": "button", "label": "Sign In", "in": "form", "near": "Log In", "form": "f1" }
    ],
    "forms": [
      { "ref": "f1", "id": "login", "action": "/login", "method": "POST", "fields": ["e1", "e2", "e3"] }
    ]
  },
  "media": [],
  "extraction": null,
  "blockers_dismissed": ["cookie_consent"],
  "error": null
}

Request

POST /v1/sessions

{
  "url": "https://store.example.com",
  "steps": [{
    "extract": {
      "products": [{
        "_parent": ".product-card",
        "name": "h2 >> text",
        "price": ".price >> text",
        "url": "a >> href"
      }]
    }
  }, {
    "close": {}
  }]
}

Response

{
  "session_id": "ses_def789abc012",
  "request_id": "req_b4e2c7d81f30",
  "completed": 2,
  "extraction": {
    "products": [
      {
        "name": "Wireless Headphones",
        "price": "$79.99",
        "url": "/products/wireless-headphones"
      },
      {
        "name": "USB-C Hub",
        "price": "$34.99",
        "url": "/products/usb-c-hub"
      },
      {
        "name": "Mechanical Keyboard",
        "price": "$129.00",
        "url": "/products/mech-keyboard"
      }
    ]
  },
  "media": [],
  "page": null,
  "error": null
}

Request

POST /v1/sessions/:id/act

{
  "steps": [{
    "observe": {
      "scope": "#checkout-form",
      "format": "markdown",
      "include_links": true
    }
  }]
}

Response

{
  "session_id": "ses_abc123def456",
  "request_id": "req_91c4a8e72d05",
  "completed": 1,
  "page": {
    "url": ".../checkout",
    "stable": true,
    "markdown": {
      "content": "## Checkout\n\nShipping...",
      "length": 284
    },
    "interactive_elements": [
      { "ref": "e1", "tag": "input", "label": "Address", "in": "form", "near": "Checkout", "form": "f1" },
      { "ref": "e2", "tag": "select", "label": "Country", "in": "form", "near": "Checkout", "form": "f1" },
      { "ref": "e3", "tag": "button", "label": "Pay Now", "in": "form", "near": "Checkout", "form": "f1" }
    ],
    "forms": [
      { "ref": "f1", "id": "checkout-form", "action": "/checkout", "method": "POST", "fields": ["e1", "e2", "e3"] }
    ],
    "changes": null
  },
  "media": [],
  "extraction": null,
  "error": null
}

Request

POST /v1/sessions

{
  "url": "https://app.example.com/login",
  "steps": [
    { "fill_form": {
        "fields": {
          "Email": "user@example.com",
          "Password": "secret123"
        },
        "submit": true
    }},
    { "wait": { "ms": 2000 }},
    { "screenshot": {
        "full_page": true
    }},
    { "close": {} }
  ]
}

Response

{
  "session_id": "ses_ghi012jkl345",
  "request_id": "req_73b9e4f02c18",
  "completed": 4,
  "page": {
    "url": ".../dashboard",
    "title": "Dashboard",
    "stable": true,
    "changes": {
      "content_changed": true,
      "elements_added": [
        { "ref": "e1" },
        { "ref": "e2" }
      ]
    },
    "interactive_elements": [
      { "ref": "e1", "tag": "a", "label": "Settings", "in": "nav", "near": "Dashboard" },
      { "ref": "e2", "tag": "a", "label": "Logout", "in": "nav", "near": "Dashboard" }
    ],
    "forms": []
  },
  "media": [{
    "type": "screenshot",
    "format": "png",
    "data": "iVBORw0KGgo..."
  }],
  "extraction": null,
  "error": null
}

Drop Into Your Stack in Minutes

Official SDKs for the languages you already use, plus an MCP server that turns Browserbeam into tools your AI coding assistant can call.

Python SDK

pip install, full type hints, sync and async clients.

TypeScript SDK

npm install, full TypeScript types, ESM and CJS builds.

Ruby SDK

gem install, block-based sessions, Struct-based types.

MCP Server

Use as tools in Cursor, Claude Desktop, and Windsurf.

Browserbeam is a REST API. Any language that can make HTTP requests can use it.

Browse All Integrations

Simple Runtime Billing

Pay for the time your sessions are open. Start with a 1-hour free trial. No credit card needed.

Starter

For individuals and side projects

$29 / month

100 hours runtime / month
5 concurrent sessions
15 min per session
LLM-optimized browser API
Stability detection & diff tracking

Get Started

Best Value

Pro

For teams and production use

$99 / month

500 hours runtime / month
50 concurrent sessions
30 min per session
LLM-optimized browser API
Stability detection & diff tracking

Get Started

Scale

For agencies and high-volume use

$199 / month

1,500 hours runtime / month
100 concurrent sessions
1 hour per session
LLM-optimized browser API
Stability detection & diff tracking

Get Started

Billed monthly by runtime. Cancel anytime.
Prices exclude VAT where applicable.

Frequently Asked Questions

How does the browser API work?

Browserbeam provides a REST API that gives your LLM or automation code control over a real browser. Create a session, navigate to any URL, observe the page as structured JSON, interact with elements using short refs, and get change diffs after every action. You send HTTP requests. We handle the browser infrastructure.

How is this different from Browserless or Browserbase?

Those services give you hosted Playwright or Puppeteer. You still get raw HTML, guess at wait times, and parse the DOM yourself. Browserbeam adds an intelligence layer on top: stability detection, semantic extraction, element refs, diff tracking, and blocker dismissal. Your LLM gets structured, compact data instead of a wall of HTML.

What can I build with Browserbeam?

LLM-powered web agents, automated form filling, web scraping, QA testing, screenshot and PDF generation, and any workflow where AI needs to see and interact with web pages. Browserbeam handles the browser so your code stays simple.

How does pricing work?

You pay for runtime: the wall-clock time your browser sessions are open. No credits, no units, no bandwidth metering. The free trial gives you 1 hour of total runtime to test the full API. Paid plans include monthly runtime allowances so you never get cut off mid-session.

Is my data private and secure?

Every session runs in an isolated browser context with separate cookies, storage, and cache. No CDP port is exposed. Sessions are destroyed when closed and we never store page content beyond the session lifetime. API keys are hashed. Your usage data is accessible only to you.

Do I need to manage Playwright or Puppeteer?

No. Browserbeam is a REST API. You send JSON, you get JSON back. No browser libraries to install, no Chrome binaries to maintain, no WebSocket connections to manage. If your code can make HTTP requests, it can use Browserbeam.

The Browser API Built for AI Agents

See the API in Action

An Intelligence Layer for the Browser

Stability Detection

Element Registry

Diff Tracking

Blocker Dismissal

Semantic Extraction

Error Enrichment

JavaScript Execution

Cookie Injection

Smart Wait Conditions

What Will You Build?

Drop Into Your Stack in Minutes

Python SDK

TypeScript SDK

Ruby SDK

MCP Server

Simple Runtime Billing

Starter

Pro

Scale

Frequently Asked Questions

How does the browser API work?

How is this different from Browserless or Browserbase?

What can I build with Browserbeam?

How does pricing work?

Is my data private and secure?

Do I need to manage Playwright or Puppeteer?

Give your AI agent a faster, leaner browser

The Browser API
Built for AI Agents