How to Scrape YouTube: Videos and Transcripts

May 01, 2026 · 10 min read

[Image: YouTube video page showing video title, view count, likes, channel info, and description metadata]

YouTube has 2.5 billion monthly active users, and the YouTube Data API caps you at 10,000 quota units per day. A single search request costs 100 units. A single video details request costs 1 unit but returns no comments, no transcripts, and no related videos. If you're building a dataset of 50,000 videos with transcripts for NLP training, the official API would take months of careful quota management. Web scraping gets you there in hours.

The catch: YouTube runs one of the most aggressive headless browser detection systems on the web. Datacenter proxies get blocked. Default User-Agent strings trigger a "Please update your browser" wall. And the site renders entirely through JavaScript custom elements, so static HTTP requests return empty shells.

This guide walks through scraping YouTube with Browserbeam's cloud browser API, which handles JavaScript rendering, residential proxies, and cookie consent automatically. We'll cover everything from basic video metadata to full transcript extraction:

  • A scraper that returns video metadata (title, views, likes, duration, upload date, channel) from any YouTube video page
  • Structured JSON extraction from YouTube's ytInitialPlayerResponse object
  • Channel scraping: subscriber count, handle, and a full video list with views and upload dates
  • Transcript extraction using the interactive "Show transcript" panel
  • Why YouTube blocks default headless browsers (and the specific fix)
  • YouTube Data API vs web scraping: quota math, available data, and when to use each
  • CSV and JSON export for building research datasets

TL;DR: Use residential proxies and a custom Chrome User-Agent for YouTube. Datacenter IPs and default headless User-Agents both trigger blocks. Use observe for rich markdown, execute_js for structured JSON data. Transcripts require an interactive flow: expand the description, click "Show transcript," then read the panel. For video metadata via execute_js, block images, fonts, media, stylesheets, and scripts to minimize bandwidth -- the data lives in an inline script that executes regardless. For channel pages and transcripts, keep scripts and stylesheets enabled since YouTube's SPA framework must render the DOM.


Don't have an API key yet? Create a free Browserbeam account - you get 5,000 credits, no credit card required.

Quick Start: Scrape a YouTube Video

Let's start with the simplest case. Create a session on any YouTube video, then call observe to read the page as structured markdown. Two API calls. You get the title, view count, likes, channel name, subscriber count, description, and a list of related videos.

YouTube requires three specific settings:

  1. Residential proxy (datacenter IPs get blocked)
  2. Custom User-Agent matching a real Chrome browser
  3. Resource blocking for images, fonts, and media (saves bandwidth, speeds up loading)

# Step 1: Create session with residential proxy + custom UA
SESSION_ID=$(curl -s -X POST https://api.browserbeam.com/v1/sessions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "proxy": { "kind": "residential", "country": "us" },
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    "block_resources": ["image", "font", "media"],
    "auto_dismiss_blockers": true
  }' | jq -r '.session_id')

# Step 2: Observe + close
curl -s -X POST "https://api.browserbeam.com/v1/sessions/$SESSION_ID/act" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"steps": [{"observe": {}}, {"close": {}}]}' \
  | jq '.page.markdown.content'

The observe response includes the video title, view count, like count, full description, channel info, related videos, and comment count in clean markdown. No HTML parsing required.
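
If you prefer Python over raw curl, the same two-call flow maps onto the SDK used later in this guide. A minimal sketch, assuming the SDK exposes an observe() method that mirrors the REST observe step (the later examples only show execute_js, click, and wait):

from browserbeam import Browserbeam

client = Browserbeam(api_key="YOUR_API_KEY")
session = client.sessions.create(
    url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    proxy={"kind": "residential", "country": "us"},
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    block_resources=["image", "font", "media"],
    auto_dismiss_blockers=True,
)

# observe() is assumed here to mirror the REST observe step and return the page object
page = session.observe()
print(page["markdown"]["content"])
session.close()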

What Data Can You Extract from YouTube?

YouTube pages contain more structured data than most websites. Here's what's available from each page type:

| Page Type | Available Fields | Extraction Method |
| --- | --- | --- |
| Video page | Title, views, likes, duration, upload date, description, channel name, subscriber count, tags, comment count, related videos | observe for markdown, execute_js for ytInitialPlayerResponse |
| Channel page | Channel name, handle, subscriber count, video list (title, views, upload date, duration, URL) | observe for video list, execute_js for structured data |
| Search results | Video titles, channels, view counts, durations, URLs | observe for markdown list |
| Playlist | Playlist title, video count, video list with titles and channels | observe for markdown |

Video pages also expose a ytInitialPlayerResponse JavaScript object that contains structured metadata: videoDetails (title, view count, channel, keywords, duration) and microformat (category, publish date, description). This is the most reliable extraction source because it's populated during the initial page load and doesn't depend on YouTube's DOM structure.

Why YouTube Requires a Custom User-Agent

Most scraping guides focus on IP-based blocking. YouTube is different. The primary detection mechanism is browser fingerprinting through the User-Agent string.

During our API validation, we tested four configurations:

| Configuration | Proxy | User-Agent | Result |
| --- | --- | --- | --- |
| Default | Datacenter | Default headless | "Please update your browser" |
| Residential only | Residential | Default headless | "Please update your browser" |
| UA only | Datacenter | Chrome 131 | "Please update your browser" |
| Residential + UA | Residential | Chrome 131 | Full page render |

Both conditions are required: residential proxy AND a modern Chrome User-Agent. The detection happens before the page renders, so you can't work around it with JavaScript. The User-Agent string Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 works reliably.

There's one more gotcha with resource blocking. You should block image, font, and media resources to reduce bandwidth and speed up page loads. But do not block stylesheet on channel pages. YouTube's video grid uses CSS-dependent custom elements (ytd-rich-item-renderer), and blocking stylesheets prevents the video cards from rendering.

| Resource Type | Block on Video Pages? | Block on Channel Pages? |
| --- | --- | --- |
| image | Yes | Yes |
| font | Yes | Yes |
| media | Yes | Yes |
| stylesheet | Safe to block | Do NOT block |
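
One way to keep this straight in code is a small helper that picks the block list per page type -- a sketch, not part of the Browserbeam API:

def block_list(page_type):
    """Pick block_resources for a YouTube page type (sketch, not a Browserbeam API)."""
    blocked = ["image", "font", "media"]
    if page_type == "video_metadata_only":
        # ytInitialPlayerResponse is set by an inline script, so external scripts
        # and stylesheets can be blocked too (covered in the next section)
        blocked += ["stylesheet", "script"]
    # channel pages and transcript flows need stylesheets (and scripts) to render
    return blocked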

Scraping YouTube Video Pages

Video pages are the richest data source on YouTube. The observe endpoint returns the full page as markdown, including the title, view count, likes, description, and related videos. For structured data, execute_js can read the ytInitialPlayerResponse object that every video page populates during load.

Step 1: Create Session and Observe

The create call already returns page markdown. But for YouTube, we recommend calling observe separately to get a fresh read after any cookie consent dialogs are dismissed.

Step 2: Extract Structured Data via ytInitialPlayerResponse

Every YouTube video page populates a global ytInitialPlayerResponse object with the full video metadata. This gives you clean, structured data that's more reliable than parsing the DOM. The videoDetails object contains the title, view count, channel name, duration, and keywords. The microformat object adds the category, publish date, and full description.

Since ytInitialPlayerResponse is set by an inline <script> tag in the HTML, we can block external scripts and stylesheets too. This drops proxy bandwidth from ~15 MB to ~1.5 MB per request -- YouTube's main JavaScript bundle alone is 9.7 MB.

# Create session
SESSION_ID=$(curl -s -X POST https://api.browserbeam.com/v1/sessions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "proxy": { "kind": "residential", "country": "us" },
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    "block_resources": ["image", "font", "media", "stylesheet", "script"],
    "auto_dismiss_blockers": true
  }' | jq -r '.session_id')

# Extract video metadata from ytInitialPlayerResponse
curl -s -X POST "https://api.browserbeam.com/v1/sessions/$SESSION_ID/act" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "steps": [{
      "execute_js": {
        "code": "const vd = ytInitialPlayerResponse?.videoDetails; const mf = ytInitialPlayerResponse?.microformat?.playerMicroformatRenderer; return { title: vd?.title, viewCount: vd?.viewCount, channel: vd?.author, channelId: vd?.channelId, lengthSeconds: vd?.lengthSeconds, keywords: vd?.keywords, description: vd?.shortDescription, category: mf?.category, publishDate: mf?.publishDate, thumbnail: mf?.thumbnail?.thumbnails?.[0]?.url };",
        "result_key": "video"
      }
    }, {"close": {}}]
  }' | jq '.extraction.video'

The response looks like this:

{
  "title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
  "viewCount": "1767841059",
  "channel": "Rick Astley",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "lengthSeconds": "213",
  "keywords": ["rick astley", "Never Gonna Give You Up", "nggyu", "never gonna give you up lyrics", "rick rolled"],
  "description": "The official video for \"Never Gonna Give You Up\" by Rick Astley...",
  "category": "Music",
  "publishDate": "2009-10-24T23:57:33-07:00",
  "thumbnail": "https://i.ytimg.com/vi/dQw4w9WgXcQ/sddefault.jpg"
}

The lengthSeconds gives you the video duration in seconds (213 = 3 minutes 33 seconds). The keywords array contains the tags the creator set. The category comes from YouTube's content classification system.
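
Since viewCount and lengthSeconds come back as strings, a couple of lines of post-processing turn them into numbers and a readable duration:

video = {"viewCount": "1767841059", "lengthSeconds": "213"}  # fields from the response above

views = int(video["viewCount"])
minutes, seconds = divmod(int(video["lengthSeconds"]), 60)
print(f"{views:,} views, {minutes}:{seconds:02d} long")  # 1,767,841,059 views, 3:33 long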

Scraping YouTube Channel Pages

Channel pages list all of a creator's videos with titles, view counts, upload dates, and durations. Navigate to /@handle/videos to get the video grid.

Channel pages need a slightly different configuration. Do not block stylesheets, or the video grid won't render.

# Create session on channel videos page
SESSION_ID=$(curl -s -X POST https://api.browserbeam.com/v1/sessions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/@RickAstleyYT/videos",
    "proxy": { "kind": "residential", "country": "us" },
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    "block_resources": ["image", "font", "media"],
    "auto_dismiss_blockers": true
  }' | jq -r '.session_id')

# Extract structured channel data via innerText parsing
curl -s -X POST "https://api.browserbeam.com/v1/sessions/$SESSION_ID/act" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "steps": [{
      "execute_js": {
        "code": "const items = document.querySelectorAll(\"ytd-rich-item-renderer\"); const videos = []; for (const el of items) { const link = el.querySelector(\"a[href*=\\\"/watch\\\"]\"); const titleEl = el.querySelector(\"h3\"); const title = titleEl ? (titleEl.innerText || \"\").trim() : null; const inner = el.innerText || \"\"; const vm = inner.match(/([0-9][0-9.,]*[KMB]?) views/); const tm = inner.match(/(\\d+ (?:seconds?|minutes?|hours?|days?|weeks?|months?|years?) ago)/); const dm = inner.match(/^(\\d+:\\d+)/m); videos.push({ title, url: link ? link.href : null, views: vm ? vm[0] : null, uploaded: tm ? tm[1] : null, duration: dm ? dm[1] : null }); } const header = document.querySelector(\"#page-header\"); const headerText = header?.innerText || \"\"; const nameEl = header?.querySelector(\"yt-dynamic-text-view-model\"); const channelName = nameEl?.innerText?.trim() || \"\"; const hm = headerText.match(/@[\\\\w-]+/); const handle = hm ? hm[0] : \"\"; const sm = headerText.match(/([\\\\d.]+[KMB]?) subscribers/); const subscribers = sm ? sm[1] + \" subscribers\" : \"\"; return { channel: channelName, handle, subscribers, videoCount: videos.length, videos: videos.slice(0, 10) };",
        "result_key": "channel"
      }
    }, {"close": {}}]
  }' | jq '.extraction.channel'

The innerText parsing approach is necessary because YouTube's custom elements (ytd-rich-item-renderer) don't always populate standard textContent on their child elements. Parsing the raw innerText with regex and extracting views, upload dates, and durations is more reliable than querying individual span elements. For channel metadata (name, handle, subscribers), we parse the #page-header element's text content.
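
If you'd rather drive this from Python (the next section builds on it), here's a sketch using the SDK with a simplified version of the extraction JavaScript -- title and URL only -- stored in js_code; swap in the full innerText-parsing code above to also get views, upload dates, and durations:

from browserbeam import Browserbeam

client = Browserbeam(api_key="YOUR_API_KEY")
session = client.sessions.create(
    url="https://www.youtube.com/@RickAstleyYT/videos",
    proxy={"kind": "residential", "country": "us"},
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    block_resources=["image", "font", "media"],  # keep stylesheets enabled on channel pages
    auto_dismiss_blockers=True,
)

# Simplified extraction (title + URL only); see the curl example above for the full version
js_code = """
const items = document.querySelectorAll('ytd-rich-item-renderer');
const videos = [];
for (const el of items) {
  const link = el.querySelector('a[href*="/watch"]');
  const titleEl = el.querySelector('h3');
  videos.push({
    title: titleEl ? titleEl.innerText.trim() : null,
    url: link ? link.href : null
  });
}
return { videoCount: videos.length, videos };
"""

session.execute_js(js_code, result_key="channel")
print(session.extraction["channel"]["videoCount"], "videos on the first page")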

Loading More Videos

YouTube lazy-loads videos as you scroll. The initial page shows roughly 30 videos. To get more, use execute_js to scroll the page, wait for new content, then re-extract:

# Scroll three times to trigger YouTube's lazy loading, waiting for new videos each time
for i in range(3):
    session.execute_js("window.scrollTo(0, document.body.scrollHeight)")
    session.wait(ms=2000)

# Re-run the channel extraction (js_code holds the extraction JavaScript from above)
session.execute_js(js_code, result_key="channel")
all_videos = session.extraction["channel"]["videos"]

Each scroll loads approximately 30 more videos. Three scrolls gives you around 120 videos from any channel.

Extracting YouTube Transcripts

Transcript extraction is the highest-value feature for data science and NLP use cases. "YouTube video transcript" gets 22,200 monthly searches, making it one of the most sought-after pieces of YouTube data.

YouTube's transcript is loaded dynamically through an engagement panel. The extraction requires an interactive flow: expand the video description, click the "Show transcript" button, wait for the panel to populate, then read the timestamped segments.

from browserbeam import Browserbeam
import json

client = Browserbeam(api_key="YOUR_API_KEY")
session = client.sessions.create(
    url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    proxy={"kind": "residential", "country": "us"},
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    block_resources=["image", "font", "media"],
    auto_dismiss_blockers=True,
)

# Step 1: Expand the video description
session.execute_js("""
  const btn = document.querySelector('ytd-text-inline-expander #expand');
  if (btn) { btn.scrollIntoView({behavior: 'instant', block: 'center'}); btn.click(); }
""")
session.wait(ms=1000)

# Step 2: Click "Show transcript"
session.click(text="Show transcript")
session.wait(ms=2000)

# Step 3: Extract transcript segments
session.execute_js("""
  const segments = document.querySelectorAll('macro-markers-panel-item-view-model');
  const lines = [];
  segments.forEach(s => {
    const inner = s.innerText?.trim() || '';
    const m = inner.match(/^(\\d+:\\d+)/);
    if (m) {
      const rest = inner.replace(/^\\d+:\\d+\\s*/, '')
        .replace(/^[\\d\\s,minutesecondhor]+/, '').trim();
      lines.push({ time: m[1], text: rest });
    }
  });
  return { count: lines.length, transcript: lines };
""", result_key="transcript")

transcript = session.extraction["transcript"]
print(f"Found {transcript['count']} segments")
for line in transcript["transcript"][:5]:
    print(f"[{line['time']}] {line['text']}")

session.close()

Sample output:

{
  "count": 24,
  "transcript": [
    { "time": "0:01", "text": "[♪♪♪]" },
    { "time": "0:18", "text": "♪ We're no strangers to love ♪ ♪ You know the rules and so do I ♪" },
    { "time": "0:27", "text": "♪ A full commitment's what I'm thinking of ♪ ♪ You wouldn't get this from any other guy ♪" },
    { "time": "0:35", "text": "♪ I just wanna tell you how I'm feeling ♪ ♪ Gotta make you understand ♪" },
    { "time": "0:43", "text": "♪ Never gonna give you up ♪ ♪ Never gonna let you down ♪" }
  ]
}

Not all videos have transcripts. Auto-generated captions are available on most English-language videos, but some creators disable them. You can check availability by looking for the ytInitialPlayerResponse.captions.playerCaptionsTracklistRenderer.captionTracks array in the page's JavaScript context. If it's empty or missing, that video has no transcript.

# Check transcript availability without opening the panel
session.execute_js("""
  try {
    const tracks = ytInitialPlayerResponse
      ?.captions
      ?.playerCaptionsTracklistRenderer
      ?.captionTracks;
    if (!tracks || tracks.length === 0) return { available: false };
    return {
      available: true,
      languages: tracks.map(t => ({
        code: t.languageCode,
        name: t.name?.simpleText,
        kind: t.kind || 'manual'
      }))
    };
  } catch(e) { return { available: false, error: e.message }; }
""", result_key="captions")

Saving and Processing Your Data

Once you've scraped video metadata and transcripts, you'll want to save the data for analysis. Here's a Python script that scrapes multiple videos and exports to both CSV and JSON:

from browserbeam import Browserbeam
import json
import csv

client = Browserbeam(api_key="YOUR_API_KEY")

video_urls = [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=9bZkp7q19f0",
    "https://www.youtube.com/watch?v=kJQP7kiw5Fk",
]

extract_code = """
const vd = ytInitialPlayerResponse?.videoDetails;
const mf = ytInitialPlayerResponse?.microformat?.playerMicroformatRenderer;
return {
  title: vd?.title,
  viewCount: vd?.viewCount,
  channel: vd?.author,
  lengthSeconds: vd?.lengthSeconds,
  category: mf?.category,
  publishDate: mf?.publishDate
};
"""

results = []
for url in video_urls:
    session = client.sessions.create(
        url=url,
        proxy={"kind": "residential", "country": "us"},
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
        block_resources=["image", "font", "media"],
        auto_dismiss_blockers=True,
    )
    session.execute_js(extract_code, result_key="video")
    video = session.extraction.get("video")
    if video:
        video["url"] = url
        results.append(video)
    session.close()

# Save as JSON
with open("youtube_videos.json", "w") as f:
    json.dump(results, f, indent=2)

# Save as CSV
if results:
    with open("youtube_videos.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=results[0].keys())
        writer.writeheader()
        writer.writerows(results)

print(f"Saved {len(results)} videos to youtube_videos.json and youtube_videos.csv")

For larger datasets, add a 1-2 second delay between requests to avoid rate limiting. Browserbeam handles session management and proxy rotation automatically, so you don't need to manage a pool of browser instances.
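
The delay can be a plain time.sleep at the end of the loop body -- for example, in the script above:

import time

for url in video_urls:
    # ... create session, extract, append to results, close (as above) ...
    time.sleep(1.5)  # 1-2 second pause between videos to avoid rate limiting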

YouTube Data API vs Scraping

The YouTube Data API is the official way to access YouTube data. It's well-documented, returns clean JSON, and is free within quota limits. But those limits are restrictive for research and data science use cases.

| Factor | YouTube Data API | Web Scraping (Browserbeam) |
| --- | --- | --- |
| Daily quota | 10,000 units (1 search = 100 units) | No quota limit |
| Transcripts | Not available | Full timestamped transcripts |
| Comments | Available (costs 1 unit per page of 20) | Available via scroll + observe |
| Video metadata | Title, description, views, likes, tags | All API fields + rendered page content |
| Channel videos | Paginated (50/page, costs 1 unit) | 30+ per scroll, unlimited scrolling |
| Authentication | API key required (free) | No YouTube auth needed |
| Rate limiting | Strict (quota resets daily) | Standard web rate limiting |
| Cost | Free (within quota) | Browserbeam credit cost per session |
| Reliability | Official, stable API | Depends on YouTube frontend structure |
| Languages | Any (REST API) | Python, TypeScript, Ruby, cURL |

When to use the Data API: You need fewer than 100 searches per day, don't need transcripts, and want maximum reliability. The API is the right choice for small to medium projects where quota isn't a constraint.

When to scrape: You need transcripts, you're hitting quota limits, or you need data the API doesn't expose (rendered page content, related videos sidebar, comment replies). For NLP research requiring 10,000+ transcripts, scraping is the only practical option.

Third option: yt-dlp. The open-source yt-dlp tool extracts video metadata and subtitles from the command line. It's excellent for one-off tasks and CLI workflows. For programmatic access at scale with proxy management, Browserbeam is more practical.

DIY Scraping vs Browserbeam API

If you've scraped YouTube before, you know the pain points. Here's how a DIY Playwright setup compares to Browserbeam:

| Concern | DIY (Playwright/Selenium) | Browserbeam |
| --- | --- | --- |
| Browser management | Install, update, manage headless Chrome | Managed cloud browsers |
| Proxy rotation | Buy proxies, implement rotation logic | Built-in residential proxies |
| User-Agent spoofing | Manual header management | One parameter: user_agent |
| Cookie consent | Write custom dismiss logic per site | auto_dismiss_blockers: true |
| Resource blocking | Intercept requests manually | block_resources: ["image", "font", "media"] |
| Scroll handling | Write scroll loops with timing | execute_js + wait |
| Scaling | Manage browser pools, memory, crashes | API calls, no infrastructure |
| Transcript extraction | Build click + wait + extract flow | Same flow, but no browser to manage |

The same YouTube scraper in Playwright requires roughly 80 lines of Python (browser launch, context setup, proxy config, stealth plugin, navigation, waiting, extraction, cleanup). The Browserbeam version is 15 lines.

# Playwright equivalent (for comparison)
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        headless=True,
        proxy={"server": "http://your-proxy:8080", "username": "user", "password": "pass"}
    )
    context = browser.new_context(
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
        viewport={"width": 1280, "height": 720}
    )
    page = context.new_page()
    page.route("**/*.{png,jpg,gif,svg,woff,woff2,mp4,webm}", lambda route: route.abort())

    page.goto("https://www.youtube.com/watch?v=dQw4w9WgXcQ", wait_until="networkidle")
    page.wait_for_timeout(3000)

    # Handle cookie consent manually (the banner only appears in some regions)
    try:
        page.click("button:has-text('Accept all')", timeout=3000)
    except Exception:
        pass

    # Extract LD+JSON
    data = page.evaluate("""() => {
        const scripts = document.querySelectorAll('script[type="application/ld+json"]');
        for (const s of scripts) {
            const d = JSON.parse(s.textContent);
            if (d['@type'] === 'VideoObject') return d;
        }
    }""")

    print(data)
    browser.close()

The Playwright version needs you to manage browser installation, proxy credentials, route interception for resource blocking, manual cookie consent handling, and cleanup. Browserbeam handles all of that with configuration parameters.

Use Cases

Sentiment Analysis on Product Reviews

Scrape comments from product review videos to build a sentiment classifier. Extract the video transcript for the reviewer's opinion, then pair it with comment sentiment to gauge audience agreement.

# Scrape a product review video + transcript for sentiment analysis
session = client.sessions.create(
    url="https://www.youtube.com/watch?v=PRODUCT_REVIEW_ID",
    proxy={"kind": "residential", "country": "us"},
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    block_resources=["image", "font", "media"],
    auto_dismiss_blockers=True,
)
session.execute_js(extract_code, result_key="video")
# ... extract transcript using the interactive flow above ...
# Feed transcript + metadata into your NLP pipeline
session.close()

Content Research Across a Niche

Extract video metadata from the top channels in your niche. Compare upload frequency, average view counts, title patterns, and topics to identify content gaps.
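
As a small illustration -- a sketch assuming you already have the channel extraction result from earlier, with view strings like "1.2M views" -- you can normalize the view counts and compare averages across channels:

import re

def parse_views(views):
    """Turn '1.2M views' / '87K views' / '1,034 views' into an integer (sketch)."""
    m = re.match(r"([\d.,]+)\s*([KMB]?)", views or "")
    if not m:
        return 0
    number = float(m.group(1).replace(",", ""))
    return int(number * {"K": 1e3, "M": 1e6, "B": 1e9}.get(m.group(2), 1))

videos = channel_data["videos"]  # e.g. session.extraction["channel"]["videos"] from earlier
counts = [parse_views(v["views"]) for v in videos if v["views"]]
print("Videos:", len(counts), "| Average views:", sum(counts) // max(len(counts), 1))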

Transcript Corpus for NLP Training

Build a training dataset from video transcripts in a specific domain (cooking, fitness, technology). Loop through channel pages to collect video URLs, then extract transcripts from each. A single channel with 500 videos gives you hundreds of hours of transcribed speech.
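
A corpus builder is essentially the channel scraper plus the transcript flow in a loop. A sketch, where channel_videos is the video list from the channel extraction and get_transcript is a hypothetical helper wrapping the expand-description / "Show transcript" steps shown earlier:

import json
import time

corpus = []
for video in channel_videos:                     # video list from the channel extraction above
    transcript = get_transcript(video["url"])    # hypothetical helper wrapping the earlier flow
    if transcript:
        corpus.append({"url": video["url"], "title": video["title"], "transcript": transcript})
    time.sleep(1.5)                              # polite delay between videos

with open("transcript_corpus.json", "w") as f:
    json.dump(corpus, f, indent=2)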

Common Mistakes When Scraping YouTube

1. Using the default headless User-Agent

YouTube checks the User-Agent string before rendering any content. The default Playwright/Puppeteer User-Agent contains HeadlessChrome, which triggers an immediate block. Always set a custom User-Agent matching a real Chrome release.

2. Blocking stylesheets on channel pages

Video pages render fine without stylesheets. Channel pages don't. The ytd-rich-item-renderer grid needs CSS to populate its video cards. Block image, font, and media on channel pages, but leave stylesheet alone.

3. Using datacenter proxies

Datacenter IP ranges are well-known and blocked by YouTube regardless of User-Agent. Residential proxies are required for consistent access.

4. Calling execute_js before the page stabilizes

The ytInitialPlayerResponse object is populated during the initial page load. If you call execute_js too early, the object might be incomplete. Either use the observe call first (which waits for page stability) or add a wait step before extracting data.
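
If you skip observe, a simple guard is to wait briefly and confirm the object exists before extracting -- a sketch reusing the session and extract_code from the earlier Python examples:

session.wait(ms=2000)  # give the inline script time to run
session.execute_js(
    "return typeof ytInitialPlayerResponse !== 'undefined' && !!ytInitialPlayerResponse.videoDetails;",
    result_key="ready",
)
if session.extraction.get("ready"):
    session.execute_js(extract_code, result_key="video")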

5. Not dismissing the cookie consent banner

YouTube shows a cookie consent banner for EU visitors. Without auto_dismiss_blockers: true, the banner covers page content and blocks interaction. Always enable this setting.

Frequently Asked Questions

Is it legal to scrape YouTube?

YouTube's Terms of Service prohibit automated access. However, scraping publicly available metadata (titles, view counts, descriptions) for research purposes falls into a legal gray area similar to web indexing. Courts have generally protected scraping of public data (see hiQ Labs v. LinkedIn). Do not scrape private or authenticated content, and respect rate limits.

Does YouTube block scrapers?

Yes. YouTube uses browser fingerprinting (User-Agent detection), IP reputation scoring, and behavioral analysis. Datacenter proxies and default headless User-Agents are blocked. Residential proxies with a real Chrome User-Agent bypass these checks reliably.

How to get a YouTube video transcript?

Create a Browserbeam session on the video page, expand the description, click "Show transcript," wait for the panel to load, then extract the timestamped segments. You can also check ytInitialPlayerResponse.captions to see which languages are available before attempting extraction.

Can you scrape YouTube with Python?

Yes. Use the Browserbeam Python SDK (pip install browserbeam) with residential proxies and a custom User-Agent. The SDK handles browser management, proxy rotation, and session cleanup. See the code examples throughout this guide.

YouTube API vs web scraping: which should I use?

Use the YouTube Data API for small to medium projects (under 100 daily searches) that don't need transcripts. Use web scraping when you need transcripts, are hitting API quota limits, or need data the API doesn't provide. For CLI one-off jobs, yt-dlp is another option.

Can I scrape YouTube transcripts?

Yes. Transcripts are available on most YouTube videos through the "Show transcript" engagement panel. Auto-generated captions exist for most English-language content. Some creators disable captions, so check availability via ytInitialPlayerResponse.captions before attempting extraction.

How to scrape YouTube without getting blocked?

Two requirements: residential proxies and a modern Chrome User-Agent string. Set proxy: { kind: "residential" } and user_agent to a current Chrome version string. Enable auto_dismiss_blockers for cookie consent. Block image, font, and media resources to reduce detection surface.

What data does YouTube's ytInitialPlayerResponse contain?

Every video page populates a ytInitialPlayerResponse JavaScript object with videoDetails (title, view count, channel name, channel ID, duration in seconds, keywords, short description) and microformat (category, publish date, upload date, full description, available countries). This is the most stable extraction source because it's populated server-side during page load.

Conclusion

We covered four extraction patterns in this guide: observe for rich markdown, execute_js with ytInitialPlayerResponse for structured video data, innerText parsing for channel video grids, and the interactive transcript flow. Each pattern handles a different YouTube data type, and they all share the same foundation: residential proxies, a custom User-Agent, and selective resource blocking.

Try swapping the video URL with any other YouTube video. The same ytInitialPlayerResponse extraction code returns structured data from any video page. For bulk scraping, start with a channel page to collect video URLs, then loop through individual videos with a short delay between requests.

For the complete API reference, check the Browserbeam documentation. The IMDb scraping guide covers similar React challenges and LD+JSON extraction patterns. If you're building multi-site scrapers, the web scraping agent tutorial shows how to chain different sites into a single workflow. The data extraction guide explains Browserbeam's structured extraction in depth.
