
YouTube has 2.5 billion monthly active users, and the YouTube Data API caps you at 10,000 quota units per day. A single search request costs 100 units. A single video details request costs 1 unit but returns no comments, no transcripts, and no related videos. If you're building a dataset of 50,000 videos with transcripts for NLP training, the official API would take months of careful quota management. Web scraping gets you there in hours.
The catch: YouTube runs one of the most aggressive headless browser detection systems on the web. Datacenter proxies get blocked. Default User-Agent strings trigger a "Please update your browser" wall. And the site renders entirely through JavaScript custom elements, so static HTTP requests return empty shells.
This guide walks through scraping YouTube with Browserbeam's cloud browser API, which handles JavaScript rendering, residential proxies, and cookie consent automatically. We'll cover everything from basic video metadata to full transcript extraction:
- A scraper that returns video metadata (title, views, likes, duration, upload date, channel) from any YouTube video page
- Structured JSON extraction from YouTube's ytInitialPlayerResponse object
- Channel scraping: subscriber count, handle, and a full video list with views and upload dates
- Transcript extraction using the interactive "Show transcript" panel
- Why YouTube blocks default headless browsers (and the specific fix)
- YouTube Data API vs web scraping: quota math, available data, and when to use each
- CSV and JSON export for building research datasets
TL;DR: Use residential proxies and a custom Chrome User-Agent for YouTube. Datacenter IPs and default headless User-Agents both trigger blocks. Use observe for rich markdown, execute_js for structured JSON data. Transcripts require an interactive flow: expand the description, click "Show transcript," then read the panel. For video metadata via execute_js, block images, fonts, media, stylesheets, and scripts to minimize bandwidth -- the data lives in an inline script that executes regardless. For channel pages and transcripts, keep scripts and stylesheets enabled since YouTube's SPA framework must render the DOM.
Don't have an API key yet? Create a free Browserbeam account - you get 5,000 credits, no credit card required.
Quick Start: Scrape a YouTube Video
Let's start with the simplest case. Create a session on any YouTube video, then call observe to read the page as structured markdown. Two API calls. You get the title, view count, likes, channel name, subscriber count, description, and a list of related videos.
YouTube requires three specific settings:
- Residential proxy (datacenter IPs get blocked)
- Custom User-Agent matching a real Chrome browser
- Resource blocking for images, fonts, and media (saves bandwidth, speeds up loading)
# Step 1: Create session with residential proxy + custom UA
SESSION_ID=$(curl -s -X POST https://api.browserbeam.com/v1/sessions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
"proxy": { "kind": "residential", "country": "us" },
"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
"block_resources": ["image", "font", "media"],
"auto_dismiss_blockers": true
}' | jq -r '.session_id')
# Step 2: Observe + close
curl -s -X POST "https://api.browserbeam.com/v1/sessions/$SESSION_ID/act" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"steps": [{"observe": {}}, {"close": {}}]}' \
| jq '.page.markdown.content'
from browserbeam import Browserbeam
client = Browserbeam(api_key="YOUR_API_KEY")
session = client.sessions.create(
url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
proxy={"kind": "residential", "country": "us"},
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources=["image", "font", "media"],
auto_dismiss_blockers=True,
)
session.observe()
print(session.page.markdown.content)
session.close()
import Browserbeam from "@browserbeam/sdk";
const client = new Browserbeam({ apiKey: "YOUR_API_KEY" });
const session = await client.sessions.create({
url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
proxy: { kind: "residential", country: "us" },
user_agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources: ["image", "font", "media"],
auto_dismiss_blockers: true,
});
await session.observe();
console.log(session.page.markdown.content);
await session.close();
require "browserbeam"
client = Browserbeam::Client.new(api_key: "YOUR_API_KEY")
session = client.sessions.create(
url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
proxy: { kind: "residential", country: "us" },
user_agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources: ["image", "font", "media"],
auto_dismiss_blockers: true,
)
session.observe
puts session.page.markdown.content
session.close
The observe response includes the video title, view count, like count, full description, channel info, related videos, and comment count in clean markdown. No HTML parsing required.
What Data Can You Extract from YouTube?
YouTube pages contain more structured data than most websites. Here's what's available from each page type:
| Page Type | Available Fields | Extraction Method |
|---|---|---|
| Video page | Title, views, likes, duration, upload date, description, channel name, subscriber count, tags, comment count, related videos | observe for markdown, execute_js for ytInitialPlayerResponse |
| Channel page | Channel name, handle, subscriber count, video list (title, views, upload date, duration, URL) | observe for video list, execute_js for structured data |
| Search results | Video titles, channels, view counts, durations, URLs | observe for markdown list |
| Playlist | Playlist title, video count, video list with titles and channels | observe for markdown |
Video pages also expose a ytInitialPlayerResponse JavaScript object that contains structured metadata: videoDetails (title, view count, channel, keywords, duration) and microformat (category, publish date, description). This is the most reliable extraction source because it's populated during the initial page load and doesn't depend on YouTube's DOM structure.
Why YouTube Requires a Custom User-Agent
Most scraping guides focus on IP-based blocking. YouTube is different. The primary detection mechanism is browser fingerprinting through the User-Agent string.
During our API validation, we tested four configurations:
| Configuration | Proxy | User-Agent | Result |
|---|---|---|---|
| Default | Datacenter | Default headless | "Please update your browser" |
| Residential only | Residential | Default headless | "Please update your browser" |
| UA only | Datacenter | Chrome 131 | "Please update your browser" |
| Residential + UA | Residential | Chrome 131 | Full page render |
Both conditions are required: residential proxy AND a modern Chrome User-Agent. The detection happens before the page renders, so you can't work around it with JavaScript. The User-Agent string Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 works reliably.
There's one more gotcha with resource blocking. You should block image, font, and media resources to reduce bandwidth and speed up page loads. But do not block stylesheet on channel pages. YouTube's video grid uses CSS-dependent custom elements (ytd-rich-item-renderer), and blocking stylesheets prevents the video cards from rendering.
| Resource Type | Block on Video Pages? | Block on Channel Pages? |
|---|---|---|
| image | Yes | Yes |
| font | Yes | Yes |
| media | Yes | Yes |
| stylesheet | Safe to block | Do NOT block |
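One way to keep these rules straight in a scraper is a small helper that returns the right `block_resources` list for each page type. This is a sketch of our own (`block_list` is not part of any SDK), encoding the table above:

```python
def block_list(page_type: str) -> list[str]:
    """Return a block_resources list that is safe for the given YouTube page type."""
    base = ["image", "font", "media"]
    if page_type == "video_metadata":
        # ytInitialPlayerResponse is set by an inline script, so external
        # scripts and stylesheets can be dropped too (see Step 2 below).
        return base + ["stylesheet", "script"]
    # Channel pages and transcript flows need CSS and JS so the
    # ytd-rich-item-renderer grid and transcript panel actually render.
    return base
```

Pass the result straight into `sessions.create(block_resources=block_list("channel"), ...)`.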
Scraping YouTube Video Pages
Video pages are the richest data source on YouTube. The observe endpoint returns the full page as markdown, including the title, view count, likes, description, and related videos. For structured data, execute_js can read the ytInitialPlayerResponse object embedded in every video page.
Step 1: Create Session and Observe
The create call already returns page markdown. But for YouTube, we recommend calling observe separately to get a fresh read after any cookie consent dialogs are dismissed.
Step 2: Extract Structured Data via ytInitialPlayerResponse
Every YouTube video page populates a global ytInitialPlayerResponse object with the full video metadata. This gives you clean, structured data that's more reliable than parsing the DOM. The videoDetails object contains the title, view count, channel name, duration, and keywords. The microformat object adds the category, publish date, and full description.
Since ytInitialPlayerResponse is set by an inline <script> tag in the HTML, we can block external scripts and stylesheets too. This drops proxy bandwidth from ~15 MB to ~1.5 MB per request -- YouTube's main JavaScript bundle alone is 9.7 MB.
# Create session
SESSION_ID=$(curl -s -X POST https://api.browserbeam.com/v1/sessions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
"proxy": { "kind": "residential", "country": "us" },
"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
"block_resources": ["image", "font", "media", "stylesheet", "script"],
"auto_dismiss_blockers": true
}' | jq -r '.session_id')
# Extract video metadata from ytInitialPlayerResponse
curl -s -X POST "https://api.browserbeam.com/v1/sessions/$SESSION_ID/act" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"steps": [{
"execute_js": {
"code": "const vd = ytInitialPlayerResponse?.videoDetails; const mf = ytInitialPlayerResponse?.microformat?.playerMicroformatRenderer; return { title: vd?.title, viewCount: vd?.viewCount, channel: vd?.author, channelId: vd?.channelId, lengthSeconds: vd?.lengthSeconds, keywords: vd?.keywords, description: vd?.shortDescription, category: mf?.category, publishDate: mf?.publishDate, thumbnail: mf?.thumbnail?.thumbnails?.[0]?.url };",
"result_key": "video"
}
}, {"close": {}}]
}' | jq '.extraction.video'
from browserbeam import Browserbeam
import json
client = Browserbeam(api_key="YOUR_API_KEY")
session = client.sessions.create(
url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
proxy={"kind": "residential", "country": "us"},
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources=["image", "font", "media", "stylesheet", "script"],
auto_dismiss_blockers=True,
)
js_code = """
const vd = ytInitialPlayerResponse?.videoDetails;
const mf = ytInitialPlayerResponse?.microformat?.playerMicroformatRenderer;
return {
title: vd?.title,
viewCount: vd?.viewCount,
channel: vd?.author,
channelId: vd?.channelId,
lengthSeconds: vd?.lengthSeconds,
keywords: vd?.keywords,
description: vd?.shortDescription,
category: mf?.category,
publishDate: mf?.publishDate,
thumbnail: mf?.thumbnail?.thumbnails?.[0]?.url
};
"""
session.execute_js(js_code, result_key="video")
video = session.extraction["video"]
print(json.dumps(video, indent=2))
session.close()
import Browserbeam from "@browserbeam/sdk";
const client = new Browserbeam({ apiKey: "YOUR_API_KEY" });
const session = await client.sessions.create({
url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
proxy: { kind: "residential", country: "us" },
user_agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources: ["image", "font", "media", "stylesheet", "script"],
auto_dismiss_blockers: true,
});
const jsCode = `
const vd = ytInitialPlayerResponse?.videoDetails;
const mf = ytInitialPlayerResponse?.microformat?.playerMicroformatRenderer;
return {
title: vd?.title,
viewCount: vd?.viewCount,
channel: vd?.author,
channelId: vd?.channelId,
lengthSeconds: vd?.lengthSeconds,
keywords: vd?.keywords,
description: vd?.shortDescription,
category: mf?.category,
publishDate: mf?.publishDate,
thumbnail: mf?.thumbnail?.thumbnails?.[0]?.url
};
`;
await session.executeJs({ code: jsCode, result_key: "video" });
console.log(JSON.stringify(session.extraction.video, null, 2));
await session.close();
require "browserbeam"
require "json"
client = Browserbeam::Client.new(api_key: "YOUR_API_KEY")
session = client.sessions.create(
url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
proxy: { kind: "residential", country: "us" },
user_agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources: ["image", "font", "media", "stylesheet", "script"],
auto_dismiss_blockers: true,
)
js_code = <<~JS
const vd = ytInitialPlayerResponse?.videoDetails;
const mf = ytInitialPlayerResponse?.microformat?.playerMicroformatRenderer;
return {
title: vd?.title,
viewCount: vd?.viewCount,
channel: vd?.author,
channelId: vd?.channelId,
lengthSeconds: vd?.lengthSeconds,
keywords: vd?.keywords,
description: vd?.shortDescription,
category: mf?.category,
publishDate: mf?.publishDate,
thumbnail: mf?.thumbnail?.thumbnails?.[0]?.url
};
JS
session.execute_js(js_code, result_key: "video")
puts JSON.pretty_generate(session.extraction["video"])
session.close
The response looks like this:
{
"title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
"viewCount": "1767841059",
"channel": "Rick Astley",
"channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
"lengthSeconds": "213",
"keywords": ["rick astley", "Never Gonna Give You Up", "nggyu", "never gonna give you up lyrics", "rick rolled"],
"description": "The official video for \"Never Gonna Give You Up\" by Rick Astley...",
"category": "Music",
"publishDate": "2009-10-24T23:57:33-07:00",
"thumbnail": "https://i.ytimg.com/vi/dQw4w9WgXcQ/sddefault.jpg"
}
The lengthSeconds gives you the video duration in seconds (213 = 3 minutes 33 seconds). The keywords array contains the tags the creator set. The category comes from YouTube's content classification system.
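If you need human-readable durations, lengthSeconds converts with straightforward integer math. A small helper of our own (not an SDK method):

```python
def format_duration(length_seconds: str) -> str:
    """Convert YouTube's lengthSeconds string to M:SS or H:MM:SS."""
    total = int(length_seconds)
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"
```

So `format_duration("213")` returns `"3:33"`, matching the sample response above.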
Scraping YouTube Channel Pages
Channel pages list all of a creator's videos with titles, view counts, upload dates, and durations. Navigate to /@handle/videos to get the video grid.
Channel pages need a slightly different configuration. Do not block stylesheets, or the video grid won't render.
# Create session on channel videos page
SESSION_ID=$(curl -s -X POST https://api.browserbeam.com/v1/sessions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/@RickAstleyYT/videos",
"proxy": { "kind": "residential", "country": "us" },
"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
"block_resources": ["image", "font", "media"],
"auto_dismiss_blockers": true
}' | jq -r '.session_id')
# Extract structured channel data via innerText parsing
curl -s -X POST "https://api.browserbeam.com/v1/sessions/$SESSION_ID/act" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"steps": [{
"execute_js": {
"code": "const items = document.querySelectorAll(\"ytd-rich-item-renderer\"); const videos = []; for (const el of items) { const link = el.querySelector(\"a[href*=\\\"/watch\\\"]\"); const titleEl = el.querySelector(\"h3\"); const title = titleEl ? (titleEl.innerText || \"\").trim() : null; const inner = el.innerText || \"\"; const vm = inner.match(/([0-9][0-9.,]*[KMB]?) views/); const tm = inner.match(/(\\d+ (?:seconds?|minutes?|hours?|days?|weeks?|months?|years?) ago)/); const dm = inner.match(/^(\\d+:\\d+)/m); videos.push({ title, url: link ? link.href : null, views: vm ? vm[0] : null, uploaded: tm ? tm[1] : null, duration: dm ? dm[1] : null }); } const header = document.querySelector(\"#page-header\"); const headerText = header?.innerText || \"\"; const nameEl = header?.querySelector(\"yt-dynamic-text-view-model\"); const channelName = nameEl?.innerText?.trim() || \"\"; const hm = headerText.match(/@[\\\\w-]+/); const handle = hm ? hm[0] : \"\"; const sm = headerText.match(/([\\\\d.]+[KMB]?) subscribers/); const subscribers = sm ? sm[1] + \" subscribers\" : \"\"; return { channel: channelName, handle, subscribers, videoCount: videos.length, videos: videos.slice(0, 10) };",
"result_key": "channel"
}
}, {"close": {}}]
}' | jq '.extraction.channel'
from browserbeam import Browserbeam
import json
client = Browserbeam(api_key="YOUR_API_KEY")
session = client.sessions.create(
url="https://www.youtube.com/@RickAstleyYT/videos",
proxy={"kind": "residential", "country": "us"},
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources=["image", "font", "media"],
auto_dismiss_blockers=True,
)
js_code = """
const items = document.querySelectorAll('ytd-rich-item-renderer');
const videos = [];
for (const el of items) {
const link = el.querySelector('a[href*="/watch"]');
const titleEl = el.querySelector('h3');
const title = titleEl ? (titleEl.innerText || '').trim() : null;
const inner = el.innerText || '';
const vm = inner.match(/([0-9][0-9.,]*[KMB]?) views/);
const tm = inner.match(/(\\d+ (?:seconds?|minutes?|hours?|days?|weeks?|months?|years?) ago)/);
const dm = inner.match(/^(\\d+:\\d+)/m);
videos.push({
title,
url: link ? link.href : null,
views: vm ? vm[0] : null,
uploaded: tm ? tm[1] : null,
duration: dm ? dm[1] : null
});
}
const header = document.querySelector('#page-header');
const headerText = header?.innerText || '';
const nameEl = header?.querySelector('yt-dynamic-text-view-model');
const channelName = nameEl?.innerText?.trim() || '';
const hm = headerText.match(/@[\\w-]+/);
const handle = hm ? hm[0] : '';
const sm = headerText.match(/([\\d.]+[KMB]?) subscribers/);
const subscribers = sm ? sm[1] + ' subscribers' : '';
return {
channel: channelName,
handle,
subscribers,
videoCount: videos.length,
videos: videos.slice(0, 10)
};
"""
session.execute_js(js_code, result_key="channel")
print(json.dumps(session.extraction["channel"], indent=2))
session.close()
import Browserbeam from "@browserbeam/sdk";
const client = new Browserbeam({ apiKey: "YOUR_API_KEY" });
const session = await client.sessions.create({
url: "https://www.youtube.com/@RickAstleyYT/videos",
proxy: { kind: "residential", country: "us" },
user_agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources: ["image", "font", "media"],
auto_dismiss_blockers: true,
});
const jsCode = `
const items = document.querySelectorAll('ytd-rich-item-renderer');
const videos = [];
for (const el of items) {
const link = el.querySelector('a[href*="/watch"]');
const titleEl = el.querySelector('h3');
const title = titleEl ? (titleEl.innerText || '').trim() : null;
const inner = el.innerText || '';
const vm = inner.match(/([0-9][0-9.,]*[KMB]?) views/);
const tm = inner.match(/(\\d+ (?:seconds?|minutes?|hours?|days?|weeks?|months?|years?) ago)/);
const dm = inner.match(/^(\\d+:\\d+)/m);
videos.push({
title,
url: link ? link.href : null,
views: vm ? vm[0] : null,
uploaded: tm ? tm[1] : null,
duration: dm ? dm[1] : null
});
}
const header = document.querySelector('#page-header');
const headerText = header?.innerText || '';
const nameEl = header?.querySelector('yt-dynamic-text-view-model');
const channelName = nameEl?.innerText?.trim() || '';
const hm = headerText.match(/@[\\w-]+/);
const handle = hm ? hm[0] : '';
const sm = headerText.match(/([\\d.]+[KMB]?) subscribers/);
const subscribers = sm ? sm[1] + ' subscribers' : '';
return {
channel: channelName,
handle,
subscribers,
videoCount: videos.length,
videos: videos.slice(0, 10)
};
`;
await session.executeJs({ code: jsCode, result_key: "channel" });
console.log(JSON.stringify(session.extraction.channel, null, 2));
await session.close();
require "browserbeam"
require "json"
client = Browserbeam::Client.new(api_key: "YOUR_API_KEY")
session = client.sessions.create(
url: "https://www.youtube.com/@RickAstleyYT/videos",
proxy: { kind: "residential", country: "us" },
user_agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources: ["image", "font", "media"],
auto_dismiss_blockers: true,
)
js_code = <<~JS
const items = document.querySelectorAll('ytd-rich-item-renderer');
const videos = [];
for (const el of items) {
const link = el.querySelector('a[href*="/watch"]');
const titleEl = el.querySelector('h3');
const title = titleEl ? (titleEl.innerText || '').trim() : null;
const inner = el.innerText || '';
const vm = inner.match(/([0-9][0-9.,]*[KMB]?) views/);
const tm = inner.match(/(\\d+ (?:seconds?|minutes?|hours?|days?|weeks?|months?|years?) ago)/);
const dm = inner.match(/^(\\d+:\\d+)/m);
videos.push({
title,
url: link ? link.href : null,
views: vm ? vm[0] : null,
uploaded: tm ? tm[1] : null,
duration: dm ? dm[1] : null
});
}
const header = document.querySelector('#page-header');
const headerText = header?.innerText || '';
const nameEl = header?.querySelector('yt-dynamic-text-view-model');
const channelName = nameEl?.innerText?.trim() || '';
const hm = headerText.match(/@[\\w-]+/);
const handle = hm ? hm[0] : '';
const sm = headerText.match(/([\\d.]+[KMB]?) subscribers/);
const subscribers = sm ? sm[1] + ' subscribers' : '';
return {
channel: channelName,
handle,
subscribers,
videoCount: videos.length,
videos: videos.slice(0, 10)
};
JS
session.execute_js(js_code, result_key: "channel")
puts JSON.pretty_generate(session.extraction["channel"])
session.close
The innerText parsing approach is necessary because YouTube's custom elements (ytd-rich-item-renderer) don't always populate standard textContent on their child elements. Parsing the raw innerText with regex and extracting views, upload dates, and durations is more reliable than querying individual span elements. For channel metadata (name, handle, subscribers), we parse the #page-header element's text content.
Loading More Videos
YouTube lazy-loads videos as you scroll. The initial page shows roughly 30 videos. To get more, use execute_js to scroll the page, wait for new content, then re-extract:
for i in range(3):
session.execute_js("window.scrollTo(0, document.body.scrollHeight)")
session.wait(ms=2000)
session.execute_js(js_code, result_key="channel")
all_videos = session.extraction["channel"]["videos"]
Each scroll loads approximately 30 more videos. Three scrolls give you around 120 videos from any channel.
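Because each re-extraction returns the full grid rendered so far, successive passes overlap. A small merge step (our own helper, `merge_videos`, not an SDK function) dedupes by URL while preserving order:

```python
def merge_videos(batches):
    """Merge per-scroll extraction results, deduping videos by URL."""
    seen, merged = set(), []
    for batch in batches:
        for video in batch:
            url = video.get("url")
            if url and url not in seen:
                seen.add(url)
                merged.append(video)
    return merged
```

Collect `session.extraction["channel"]["videos"]` after each scroll into a list of batches, then merge once at the end.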
Extracting YouTube Transcripts
Transcript extraction is the highest-value feature for data science and NLP use cases. "YouTube video transcript" gets 22,200 monthly searches, making it one of the most sought-after pieces of YouTube data.
YouTube's transcript is loaded dynamically through an engagement panel. The extraction requires an interactive flow: expand the video description, click the "Show transcript" button, wait for the panel to populate, then read the timestamped segments.
from browserbeam import Browserbeam
import json
client = Browserbeam(api_key="YOUR_API_KEY")
session = client.sessions.create(
url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
proxy={"kind": "residential", "country": "us"},
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources=["image", "font", "media"],
auto_dismiss_blockers=True,
)
# Step 1: Expand the video description
session.execute_js("""
const btn = document.querySelector('ytd-text-inline-expander #expand');
if (btn) { btn.scrollIntoView({behavior: 'instant', block: 'center'}); btn.click(); }
""")
session.wait(ms=1000)
# Step 2: Click "Show transcript"
session.click(text="Show transcript")
session.wait(ms=2000)
# Step 3: Extract transcript segments
session.execute_js("""
const segments = document.querySelectorAll('macro-markers-panel-item-view-model');
const lines = [];
segments.forEach(s => {
const inner = s.innerText?.trim() || '';
const m = inner.match(/^(\\d+:\\d+)/);
if (m) {
const rest = inner.replace(/^\\d+:\\d+\\s*/, '')
.replace(/^[\\d\\s,minutesecondhor]+/, '').trim();
lines.push({ time: m[1], text: rest });
}
});
return { count: lines.length, transcript: lines };
""", result_key="transcript")
transcript = session.extraction["transcript"]
print(f"Found {transcript['count']} segments")
for line in transcript["transcript"][:5]:
print(f"[{line['time']}] {line['text']}")
session.close()
Sample output:
{
"count": 24,
"transcript": [
{ "time": "0:01", "text": "[♪♪♪]" },
{ "time": "0:18", "text": "♪ We're no strangers to love ♪ ♪ You know the rules and so do I ♪" },
{ "time": "0:27", "text": "♪ A full commitment's what I'm thinking of ♪ ♪ You wouldn't get this from any other guy ♪" },
{ "time": "0:35", "text": "♪ I just wanna tell you how I'm feeling ♪ ♪ Gotta make you understand ♪" },
{ "time": "0:43", "text": "♪ Never gonna give you up ♪ ♪ Never gonna let you down ♪" }
]
}
Not all videos have transcripts. Auto-generated captions are available on most English-language videos, but some creators disable them. You can check availability by looking for the ytInitialPlayerResponse.captions.playerCaptionsTracklistRenderer.captionTracks array in the page's JavaScript context. If it's empty or missing, that video has no transcript.
# Check transcript availability without opening the panel
session.execute_js("""
try {
const tracks = ytInitialPlayerResponse
?.captions
?.playerCaptionsTracklistRenderer
?.captionTracks;
if (!tracks || tracks.length === 0) return { available: false };
return {
available: true,
languages: tracks.map(t => ({
code: t.languageCode,
name: t.name?.simpleText,
kind: t.kind || 'manual'
}))
};
} catch(e) { return { available: false, error: e.message }; }
""", result_key="captions")
Saving and Processing Your Data
Once you've scraped video metadata and transcripts, you'll want to save the data for analysis. Here's a Python script that scrapes multiple videos and exports to both CSV and JSON:
from browserbeam import Browserbeam
import json
import csv
client = Browserbeam(api_key="YOUR_API_KEY")
video_urls = [
"https://www.youtube.com/watch?v=dQw4w9WgXcQ",
"https://www.youtube.com/watch?v=9bZkp7q19f0",
"https://www.youtube.com/watch?v=kJQP7kiw5Fk",
]
extract_code = """
const vd = ytInitialPlayerResponse?.videoDetails;
const mf = ytInitialPlayerResponse?.microformat?.playerMicroformatRenderer;
return {
title: vd?.title,
viewCount: vd?.viewCount,
channel: vd?.author,
lengthSeconds: vd?.lengthSeconds,
category: mf?.category,
publishDate: mf?.publishDate
};
"""
results = []
for url in video_urls:
session = client.sessions.create(
url=url,
proxy={"kind": "residential", "country": "us"},
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
block_resources=["image", "font", "media"],
auto_dismiss_blockers=True,
)
session.execute_js(extract_code, result_key="video")
video = session.extraction.get("video")
if video:
video["url"] = url
results.append(video)
session.close()
# Save as JSON
with open("youtube_videos.json", "w") as f:
json.dump(results, f, indent=2)
# Save as CSV
if results:
with open("youtube_videos.csv", "w", newline="") as f:
writer = csv.DictWriter(f, fieldnames=results[0].keys())
writer.writeheader()
writer.writerows(results)
print(f"Saved {len(results)} videos to youtube_videos.json and youtube_videos.csv")
For larger datasets, add a 1-2 second delay between requests to avoid rate limiting. Browserbeam handles session management and proxy rotation automatically, so you don't need to manage a pool of browser instances.
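A thin retry wrapper keeps transient failures from killing a long run. This is a sketch under our own naming (`scrape_with_retry` and `scrape_fn` are placeholders for whatever function creates a session and returns extracted data, not Browserbeam APIs):

```python
import random
import time

def scrape_with_retry(scrape_fn, url, retries=3, base_delay=1.5):
    """Call scrape_fn(url); on failure, retry with jittered exponential backoff."""
    for attempt in range(retries):
        try:
            return scrape_fn(url)
        except Exception:
            if attempt == retries - 1:
                raise
            # Back off: base, 2x base, 4x base... plus up to 100% jitter,
            # so parallel workers don't retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Combine this with a fixed 1-2 second pause between successful requests in the calling loop.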
YouTube Data API vs Scraping
The YouTube Data API is the official way to access YouTube data. It's well-documented, returns clean JSON, and is free within quota limits. But those limits are restrictive for research and data science use cases.
| Factor | YouTube Data API | Web Scraping (Browserbeam) |
|---|---|---|
| Daily quota | 10,000 units (1 search = 100 units) | No quota limit |
| Transcripts | Not available | Full timestamped transcripts |
| Comments | Available (costs 1 unit per page of 20) | Available via scroll + observe |
| Video metadata | title, description, views, likes, tags | All API fields + rendered page content |
| Channel videos | Paginated (50/page, costs 1 unit) | 30+ per scroll, unlimited scrolling |
| Authentication | API key required (free) | No YouTube auth needed |
| Rate limiting | Strict (quota resets daily) | Standard web rate limiting |
| Cost | Free (within quota) | Browserbeam credit cost per session |
| Reliability | Official, stable API | Depends on YouTube frontend structure |
| Languages | Any (REST API) | Python, TypeScript, Ruby, cURL |
When to use the Data API: You need fewer than 100 searches per day, don't need transcripts, and want maximum reliability. The API is the right choice for small to medium projects where quota isn't a constraint.
When to scrape: You need transcripts, you're hitting quota limits, or you need data the API doesn't expose (rendered page content, related videos sidebar, comment replies). For NLP research requiring 10,000+ transcripts, scraping is the only practical option.
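The quota math from the intro is worth making concrete. Back-of-envelope, using the unit costs above and assuming search.list returns up to 50 results per page (a Data API detail, not something this guide's API touches):

```python
import math

DAILY_QUOTA = 10_000        # units per day
SEARCH_COST = 100           # one search.list call
RESULTS_PER_SEARCH = 50     # max results per search page
DETAILS_COST = 1            # one videos.list call (up to 50 IDs per call)

target_videos = 50_000
search_calls = math.ceil(target_videos / RESULTS_PER_SEARCH)
detail_calls = math.ceil(target_videos / 50)
total_units = search_calls * SEARCH_COST + detail_calls * DETAILS_COST
min_days = math.ceil(total_units / DAILY_QUOTA)
print(total_units, min_days)  # prints: 101000 11
```

Eleven days is the theoretical floor just for discovery and metadata, before comments, with zero failed requests, and with transcripts unavailable through the API at all.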
Third option: yt-dlp. The open-source yt-dlp tool extracts video metadata and subtitles from the command line. It's excellent for one-off tasks and CLI workflows. For programmatic access at scale with proxy management, Browserbeam is more practical.
DIY Scraping vs Browserbeam API
If you've scraped YouTube before, you know the pain points. Here's how a DIY Playwright setup compares to Browserbeam:
| Concern | DIY (Playwright/Selenium) | Browserbeam |
|---|---|---|
| Browser management | Install, update, manage headless Chrome | Managed cloud browsers |
| Proxy rotation | Buy proxies, implement rotation logic | Built-in residential proxies |
| User-Agent spoofing | Manual header management | One parameter: user_agent |
| Cookie consent | Write custom dismiss logic per site | auto_dismiss_blockers: true |
| Resource blocking | Intercept requests manually | block_resources: ["image", "font", "media"] |
| Scroll handling | Write scroll loops with timing | execute_js + wait |
| Scaling | Manage browser pools, memory, crashes | API calls, no infrastructure |
| Transcript extraction | Build click + wait + extract flow | Same flow, but no browser to manage |
The same YouTube scraper in Playwright requires roughly 80 lines of Python (browser launch, context setup, proxy config, stealth plugin, navigation, waiting, extraction, cleanup). The Browserbeam version is 15 lines.
# Playwright equivalent (for comparison)
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
browser = p.chromium.launch(
headless=True,
proxy={"server": "http://your-proxy:8080", "username": "user", "password": "pass"}
)
context = browser.new_context(
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
viewport={"width": 1280, "height": 720}
)
page = context.new_page()
page.route("**/*.{png,jpg,gif,svg,woff,woff2,mp4,webm}", lambda route: route.abort())
page.goto("https://www.youtube.com/watch?v=dQw4w9WgXcQ", wait_until="networkidle")
page.wait_for_timeout(3000)
# Handle cookie consent manually
try:
page.click("button:has-text('Accept all')", timeout=3000)
    except Exception:
pass
# Extract LD+JSON
data = page.evaluate("""() => {
const scripts = document.querySelectorAll('script[type="application/ld+json"]');
for (const s of scripts) {
const d = JSON.parse(s.textContent);
if (d['@type'] === 'VideoObject') return d;
}
}""")
print(data)
browser.close()
The Playwright version needs you to manage browser installation, proxy credentials, route interception for resource blocking, manual cookie consent handling, and cleanup. Browserbeam handles all of that with configuration parameters.
Use Cases
Sentiment Analysis on Product Reviews
Scrape comments from product review videos to build a sentiment classifier. Extract the video transcript for the reviewer's opinion, then pair it with comment sentiment to gauge audience agreement.
```python
# Scrape a product review video + transcript for sentiment analysis
session = client.sessions.create(
    url="https://www.youtube.com/watch?v=PRODUCT_REVIEW_ID",
    proxy={"kind": "residential", "country": "us"},
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    block_resources=["image", "font", "media"],
    auto_dismiss_blockers=True,
)
session.execute_js(extract_code, result_key="video")
# ... extract transcript using the interactive flow above ...
# Feed transcript + metadata into your NLP pipeline
session.close()
```
Content Research Across a Niche
Extract video metadata from the top channels in your niche. Compare upload frequency, average view counts, title patterns, and topics to identify content gaps.
Transcript Corpus for NLP Training
Build a training dataset from video transcripts in a specific domain (cooking, fitness, technology). Loop through channel pages to collect video URLs, then extract transcripts from each. A single channel with 500 videos gives you hundreds of hours of transcribed speech.
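A minimal sketch of that collection loop. `fetch_transcript` is a hypothetical callable standing in for the interactive transcript flow shown earlier; the only logic here is the polite per-request delay and skipping videos whose transcripts are unavailable:

```python
import time

def collect_transcripts(video_urls, fetch_transcript, delay_seconds=2.0):
    """Fetch transcripts for a list of video URLs with a delay between
    requests. `fetch_transcript` is any callable that takes a URL and
    returns the transcript text, or None when captions are disabled."""
    corpus = {}
    for i, url in enumerate(video_urls):
        text = fetch_transcript(url)
        if text:
            corpus[url] = text
        # Sleep between requests, but not after the last one
        if i < len(video_urls) - 1:
            time.sleep(delay_seconds)
    return corpus
```

Swapping in a real fetcher (Browserbeam session, yt-dlp, or anything else) is the only change needed to turn this into a working pipeline.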
Common Mistakes When Scraping YouTube
1. Using the default headless User-Agent
YouTube checks the User-Agent string before rendering any content. The default Playwright/Puppeteer User-Agent contains HeadlessChrome, which triggers an immediate block. Always set a custom User-Agent matching a real Chrome release.
2. Blocking stylesheets on channel pages
Video pages render fine without stylesheets. Channel pages don't. The ytd-rich-item-renderer grid needs CSS to populate its video cards. Block image, font, and media on channel pages, but leave stylesheet alone.
3. Using datacenter proxies
Datacenter IP ranges are well-known and blocked by YouTube regardless of User-Agent. Residential proxies are required for consistent access.
4. Calling execute_js before the page stabilizes
The ytInitialPlayerResponse object is populated during the initial page load. If you call execute_js too early, the object might be incomplete. Either use the observe call first (which waits for page stability) or add a wait step before extracting data.
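If you'd rather poll than guess a fixed wait, a small helper like this retries extraction until the result looks complete. This is a generic sketch, not Browserbeam API: `extract` would wrap your execute_js call, and `is_complete` is whatever completeness check fits your data (e.g. "does videoDetails.viewCount exist yet?"):

```python
import time

def poll_until(extract, is_complete, attempts=5, delay_seconds=1.0):
    """Call `extract` repeatedly until `is_complete` accepts its result.
    Returns the first complete result, or the last attempt's result if
    the page never stabilizes within `attempts` tries."""
    result = None
    for _ in range(attempts):
        result = extract()
        if is_complete(result):
            return result
        time.sleep(delay_seconds)
    return result
```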
5. Not handling cookie consent
YouTube shows a cookie consent banner for EU visitors. Without auto_dismiss_blockers: true, the banner covers page content and blocks interaction. Always enable this setting.
Frequently Asked Questions
Is it legal to scrape YouTube?
YouTube's Terms of Service prohibit automated access, so scraping carries contractual risk even when no law is broken. Scraping publicly available metadata (titles, view counts, descriptions) for research occupies a legal gray area similar to web indexing: in hiQ Labs v. LinkedIn, the Ninth Circuit held that scraping public data likely does not violate the Computer Fraud and Abuse Act, though hiQ ultimately lost on its contract claims. Do not scrape private or authenticated content, and respect rate limits.
Does YouTube block scrapers?
Yes. YouTube uses browser fingerprinting (User-Agent detection), IP reputation scoring, and behavioral analysis. Datacenter proxies and default headless User-Agents are blocked. Residential proxies with a real Chrome User-Agent bypass these checks reliably.
How to get a YouTube video transcript?
Create a Browserbeam session on the video page, expand the description, click "Show transcript," wait for the panel to load, then extract the timestamped segments. You can also check ytInitialPlayerResponse.captions to see which languages are available before attempting extraction.
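Once the panel is open, the last step is turning its text into timestamped segments. A sketch of that parsing step, under the assumption that the panel's innerText alternates timestamp lines and caption lines (verify the shape against a live page before relying on it):

```python
import re

# Matches 0:05, 12:34, and 1:23:45 style timestamps
TIMESTAMP = re.compile(r"^(?:\d+:)?\d{1,2}:\d{2}$")

def parse_transcript_panel(panel_text):
    """Turn the transcript panel's innerText (alternating timestamp and
    caption lines) into a list of (timestamp, text) segments."""
    segments = []
    current_ts = None
    for line in panel_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line):
            current_ts = line
        elif current_ts is not None:
            segments.append((current_ts, line))
            current_ts = None
    return segments
```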
Can you scrape YouTube with Python?
Yes. Use the Browserbeam Python SDK (pip install browserbeam) with residential proxies and a custom User-Agent. The SDK handles browser management, proxy rotation, and session cleanup. See the code examples throughout this guide.
YouTube API vs web scraping: which should I use?
Use the YouTube Data API for small to medium projects (under 100 daily searches) that don't need transcripts. Use web scraping when you need transcripts, are hitting API quota limits, or need data the API doesn't provide. For CLI one-off jobs, yt-dlp is another option.
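The quota math from the introduction as a back-of-the-envelope function. It assumes the default 10,000-unit daily quota, 100 units per search.list call, 50 results per search page, and 1 unit per videos.list call; it counts metadata discovery only, since transcripts for arbitrary videos aren't available through the API at any quota cost:

```python
import math

DAILY_QUOTA = 10_000       # default YouTube Data API daily quota
SEARCH_COST = 100          # units per search.list call
DETAILS_COST = 1           # units per videos.list call
RESULTS_PER_SEARCH = 50    # maxResults cap per search page

def days_needed(video_count):
    """Rough days of API quota needed to discover `video_count` videos
    via search and fetch details for each one."""
    search_calls = math.ceil(video_count / RESULTS_PER_SEARCH)
    total_units = search_calls * SEARCH_COST + video_count * DETAILS_COST
    return math.ceil(total_units / DAILY_QUOTA)
```

For the 50,000-video dataset from the introduction this lands at about two weeks of quota before a single transcript is collected, which is the core of the API-vs-scraping tradeoff.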
Can I scrape YouTube transcripts?
Yes. Transcripts are available on most YouTube videos through the "Show transcript" engagement panel. Auto-generated captions exist for most English-language content. Some creators disable captions, so check availability via ytInitialPlayerResponse.captions before attempting extraction.
How to scrape YouTube without getting blocked?
Two requirements: residential proxies and a modern Chrome User-Agent string. Set proxy: { kind: "residential" } and user_agent to a current Chrome version string. Enable auto_dismiss_blockers for cookie consent. Block image, font, and media resources to reduce detection surface.
What data does YouTube's ytInitialPlayerResponse contain?
Every video page populates a ytInitialPlayerResponse JavaScript object with videoDetails (title, view count, channel name, channel ID, duration in seconds, keywords, short description) and microformat (category, publish date, upload date, full description, available countries). This is the most stable extraction source because it's populated server-side during page load.
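Because the object is assigned in an inline script, it can even be recovered from raw HTML without executing any JavaScript. A sketch using `json.JSONDecoder.raw_decode`, which parses the object literal starting at its opening brace and ignores whatever follows it:

```python
import json
import re

def extract_player_response(html):
    """Pull the ytInitialPlayerResponse object out of a video page's
    HTML. Returns the parsed dict, or None if the assignment isn't
    present in the markup."""
    match = re.search(r"ytInitialPlayerResponse\s*=\s*", html)
    if not match:
        return None
    data, _ = json.JSONDecoder().raw_decode(html, match.end())
    return data
```

This complements the execute_js approach rather than replacing it: fetching raw HTML avoids a browser entirely, but only the in-browser route benefits from Browserbeam's proxy and User-Agent handling.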
Conclusion
We covered four extraction patterns in this guide: observe for rich markdown, execute_js with ytInitialPlayerResponse for structured video data, innerText parsing for channel video grids, and the interactive transcript flow. Each pattern handles a different YouTube data type, and they all share the same foundation: residential proxies, a custom User-Agent, and selective resource blocking.
Try swapping the video URL with any other YouTube video. The same ytInitialPlayerResponse extraction code returns structured data from any video page. For bulk scraping, start with a channel page to collect video URLs, then loop through individual videos with a short delay between requests.
For the complete API reference, check the Browserbeam documentation. The IMDb scraping guide covers similar React challenges and LD+JSON extraction patterns. If you're building multi-site scrapers, the web scraping agent tutorial shows how to chain different sites into a single workflow. The data extraction guide explains Browserbeam's structured extraction in depth.