Scrape Amazon, LinkedIn & 8 More Sites With One Tool

CrawlForge Team
Engineering Team
May 27, 2026
8 min read

Quick Answer

scrape_template is a CrawlForge tool with pre-built, maintained schemas for ten popular sites: Amazon, LinkedIn, GitHub, YouTube, Reddit, Hacker News, Stack Overflow, npm, Product Hunt, and Twitter/X. One call returns structured JSON. No CSS selectors required. Costs 1 credit per scrape.

Half the scraping requests we see at CrawlForge target the same ten sites: Amazon, LinkedIn, GitHub, YouTube, Reddit, Hacker News, Stack Overflow, npm, Product Hunt, and Twitter/X. We got tired of watching people write the same CSS selectors over and over -- and watching those selectors break the next time a site updated its layout. So we did the work once and packaged it as scrape_template. Now you pay 1 credit and get structured JSON.

Table of Contents

  • What Is scrape_template?
  • The 10 Supported Sites
  • Quick Start: Scrape an Amazon Product
  • LinkedIn Profiles (With Legal Notes)
  • GitHub Repos for AI Training Data
  • The Other Seven Templates
  • scrape_template vs scrape_structured vs extract_with_llm
  • Limitations

What Is scrape_template?

scrape_template is a single CrawlForge tool with ten pre-built site schemas. You pick the template, pass a URL, and get back structured JSON matching that site's natural shape. No CSS selectors. No HTML parsing. No schema definition.

The trade-off: you only get the ten sites we maintain. If you need something else, use scrape_structured (CSS-first) or extract_with_llm (LLM-first). For the common "I want product data from Amazon" kind of request, scrape_template is the shortest path.

It costs 1 credit per scrape -- the same as a basic fetch_url -- because we have already done the schema work upstream.

The 10 Supported Sites

| Template | Returns | Best for | Example URL pattern |
|---|---|---|---|
| amazon | Title, price, rating, review count, images, ASIN, availability | Price monitoring, product research | /dp/<ASIN> |
| linkedin | Name, headline, current role, experience, skills, education | Lead enrichment | /in/<handle> |
| github | Stars, forks, languages, README, license, topics, last commit | Repo analysis, AI training data | /<owner>/<repo> |
| youtube | Title, channel, views, duration, transcript, description | Content research | /watch?v=<id> |
| reddit | Post title, score, top comments, subreddit, awards | Community signals | /r/<sub>/comments/<id> |
| hackernews | Title, points, URL, author, comments tree | Tech trend tracking | /item?id=<id> |
| stackoverflow | Question, accepted answer, vote counts, tags | Developer Q&A mining | /questions/<id> |
| npm | Package metadata, weekly downloads, versions, maintainers | Dependency analysis | /package/<name> |
| producthunt | Product, tagline, upvotes, makers, hunter | Launch monitoring | /posts/<slug> |
| tweet | Text, author, engagement, replies, quote tweets | Social listening | /<user>/status/<id> |
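
If you are feeding mixed URLs into a pipeline, the URL patterns in the table can be used to pick a template name before you spend a credit. The sketch below is one way to do that in Python. The host names and regular expressions are our own assumptions built from the "Example URL pattern" column (regional Amazon domains, for instance, are not covered), and `pick_template` is a hypothetical helper, not part of CrawlForge.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# (template name, allowed hosts, pattern matched against path plus query string).
# Hosts and regexes are illustrative assumptions derived from the table above.
ROUTES = [
    ("amazon", ("amazon.com",), re.compile(r"/dp/[A-Z0-9]{10}")),
    ("linkedin", ("linkedin.com",), re.compile(r"^/in/[^/?]+")),
    ("github", ("github.com",), re.compile(r"^/[^/?]+/[^/?]+/?$")),
    ("youtube", ("youtube.com",), re.compile(r"^/watch\?(.*&)?v=[\w-]+")),
    ("reddit", ("reddit.com",), re.compile(r"^/r/[^/]+/comments/\w+")),
    ("hackernews", ("news.ycombinator.com",), re.compile(r"^/item\?id=\d+")),
    ("stackoverflow", ("stackoverflow.com",), re.compile(r"^/questions/\d+")),
    ("npm", ("npmjs.com",), re.compile(r"^/package/.+")),
    ("producthunt", ("producthunt.com",), re.compile(r"^/posts/[^/?]+")),
    ("tweet", ("twitter.com", "x.com"), re.compile(r"^/[^/]+/status/\d+")),
]


def pick_template(url: str) -> Optional[str]:
    """Return the template name whose host and URL pattern match, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Include the query string so patterns like /watch?v=<id> can match.
    target = parsed.path + ("?" + parsed.query if parsed.query else "")
    for template, hosts, pattern in ROUTES:
        host_ok = any(host == h or host.endswith("." + h) for h in hosts)
        if host_ok and pattern.search(target):
            return template
    return None
```

A URL that returns `None` here is a candidate for scrape_structured or extract_with_llm instead.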

Quick Start: Scrape an Amazon Product

From an MCP client like Claude Code:

"Use scrape_template with the amazon template to get the current price and rating for ASIN B0CHX1W1XY."

Claude picks the tool, formats the call, and returns the data. One credit.
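
For readers wiring this up without an LLM in the loop, here is a rough sketch of what the underlying MCP request might look like. The `tools/call` envelope follows the MCP JSON-RPC format; the argument names `template` and `url`, and the amazon.com product URL, are assumptions on our part, so check the tool's published input schema before relying on them.

```python
import json


def build_tool_call(template: str, url: str, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` JSON-RPC request for scrape_template."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "scrape_template",
            # Assumed argument names; the real input schema may differ.
            "arguments": {"template": template, "url": url},
        },
    }


asin = "B0CHX1W1XY"  # ASIN taken from the prompt example above
request = build_tool_call("amazon", f"https://www.amazon.com/dp/{asin}")
print(json.dumps(request, indent=2))
```

An MCP client library normally builds this envelope for you; the sketch only shows how little the caller has to supply.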

LinkedIn Profiles (With Legal Notes)

A note on LinkedIn scraping. LinkedIn's terms of service restrict automated access. In hiQ Labs v. LinkedIn (9th Circuit, 2022), the court held that scraping publicly accessible profile data likely does not violate the Computer Fraud and Abuse Act. That ruling did not resolve contract claims under LinkedIn's ToS, so commercial use, login-required scraping, and aggressive frequency can still trigger legal action and ToS bans. Use scrape_template linkedin for public, low-frequency, non-resold data only.

GitHub Repos for AI Training Data

This template is heavily used for AI training-data pipelines -- pulling READMEs at scale across thousands of repos. Pair it with batch_scrape to process a CSV of repo URLs.
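
Before handing a CSV to batch_scrape, it usually pays to validate and de-duplicate the URLs so you are not spending credits on junk rows. The sketch below assumes a CSV with a `url` column and an arbitrary batch size of 50. Both are our assumptions, as are the helper names, and the actual batch_scrape call is left out because its parameters are not covered here.

```python
import csv
import io
import re
from typing import Iterator, List

# Matches https://github.com/<owner>/<repo>, per the github template's URL pattern.
REPO_URL = re.compile(r"^https://github\.com/[\w.-]+/[\w.-]+/?$")


def read_repo_urls(csv_file) -> List[str]:
    """Return de-duplicated, well-formed repo URLs from a CSV with a `url` column."""
    seen, urls = set(), []
    for row in csv.DictReader(csv_file):
        url = (row.get("url") or "").strip()
        if REPO_URL.match(url) and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder data standing in for a real repo list.
sample = io.StringIO(
    "url\n"
    "https://github.com/example-org/repo-one\n"
    "not-a-url\n"
    "https://github.com/example-org/repo-one\n"
    "https://github.com/example-org/repo-two\n"
)
urls = read_repo_urls(sample)
batches = list(chunked(urls, size=50))
```

Each entry in `batches` is then a list of URLs you can pass to one batch call.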

The Other Seven Templates

  • YouTube: title, channel, views, and the full transcript when available
  • Reddit: post plus comment tree
  • Hacker News: structured story with comments tree
  • Stack Overflow: question, accepted answer, top alternatives
  • npm: package metadata plus weekly downloads
  • Product Hunt: product, makers, upvotes
  • Twitter/X: single tweet with engagement and replies

All return JSON. All cost 1 credit. All maintained centrally -- when LinkedIn or Amazon updates their layout, we update the template.

scrape_template vs scrape_structured vs extract_with_llm

A decision tree:

Is your target one of the 10 supported sites?
  Yes -> use scrape_template (1 credit, maintained for you)
  No  -> Do you know the CSS selectors, and are they stable?
    Yes -> use scrape_structured (2 credits, you maintain selectors)
    No  -> use extract_with_llm (3 credits, schema-based, layout-resilient)
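
The same decision tree can be written as a small function, which is handy if you want to route jobs and estimate spend up front. This is a sketch: `choose_tool` and `ToolChoice` are hypothetical names, and the credit costs are the per-scrape figures quoted in this post.

```python
from typing import NamedTuple


class ToolChoice(NamedTuple):
    tool: str
    credits: int  # credits per scrape, as quoted in this post


def choose_tool(template_available: bool, selectors_stable: bool) -> ToolChoice:
    """Mirror the decision tree: template first, then selectors, then LLM."""
    if template_available:
        return ToolChoice("scrape_template", 1)
    if selectors_stable:
        return ToolChoice("scrape_structured", 2)
    return ToolChoice("extract_with_llm", 3)


def estimate_credits(choice: ToolChoice, pages: int) -> int:
    """Total credits for scraping `pages` pages with the chosen tool."""
    return choice.credits * pages


choice = choose_tool(template_available=False, selectors_stable=True)
total = estimate_credits(choice, pages=500)
```

Here a site without a template but with stable selectors routes to scrape_structured, and 500 pages come to 1,000 credits.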

Quick comparison:

| | scrape_template | scrape_structured | extract_with_llm |
|---|---|---|---|
| Credits | 1 | 2 | 3 |
| Coverage | 10 specific sites | Any site you can write selectors for | Any site |
| Maintenance | We maintain | You maintain | LLM adapts |
| Speed | Fast (cached schemas) | Fast | Slower (LLM call) |
| Best for | Popular sites, high volume | Specific known structure | Unknown or shifting structure |

Limitations

  • Only 10 sites. If you need Etsy, eBay, TikTok, or others, you are waiting on the roadmap or rolling your own with scrape_structured / extract_with_llm. Request templates on Discord.
  • Public data only. No template requires login. Profiles set to private, gated repos, and protected tweets will return what is publicly visible only.
  • Layout changes happen. When a site ships a redesign, we usually have the template patched within 24 hours.
  • Rate limits apply. Heavy-volume LinkedIn or Amazon scraping should pair scrape_template with stealth_mode (5 credits) and respect each site's robots.txt.
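
On the robots.txt point: Python's standard library can evaluate a robots.txt file before you queue a URL. The sketch below parses the rules from a string so it runs offline. The user agent name and the sample rules are placeholders, and fetching the real file from each site is left to you.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder rules; a real pipeline would download the site's own robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

blocked = allowed_by_robots(SAMPLE_ROBOTS, "example-bot", "https://example.com/private/page")
permitted = allowed_by_robots(SAMPLE_ROBOTS, "example-bot", "https://example.com/public/page")
```

Filtering the URL list this way before scraping keeps the check on your side rather than relying on any tool to do it for you.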

Ready to skip the selectors? Start free with 1,000 credits -- enough for 1,000 template scrapes. New here? Read the v4.2.2 launch post for context, or the e-commerce extraction guide for a real-world workflow built around these templates.

Tags

scrape-template, Amazon, LinkedIn, GitHub, use-cases, pre-built-scrapers

Frequently Asked Questions

What sites does scrape_template support?

Ten sites in v4.2.2: Amazon, LinkedIn, GitHub, YouTube, Reddit, Hacker News, Stack Overflow, npm, Product Hunt, and Twitter/X. Each has a pre-built schema returning the fields you would normally want (product price/rating, profile name/role, repo stars/README, video transcript, etc.). More templates are coming in v4.3.

Is scraping LinkedIn legal?

In hiQ Labs v. LinkedIn (9th Circuit, 2022), the court held that scraping publicly accessible profile data likely does not violate the Computer Fraud and Abuse Act. That ruling did not resolve contract claims: LinkedIn's ToS restricts automated access, and aggressive scraping or commercial resale can still trigger legal action and bans. Use scrape_template linkedin for public, low-frequency, non-resold use cases. Consult a lawyer if you are scraping at scale or for commercial products.

Can I add a custom template?

Not directly today, but we accept template requests on Discord and prioritize by demand. Sites with significant request volume (Etsy, eBay, TikTok, Instagram, Google Maps) are on the roadmap for v4.3. For one-off custom work, use scrape_structured (CSS selectors) or extract_with_llm (schema-driven).

What is the difference between scrape_template and scrape_structured?

scrape_template is for ten specific sites where we already maintain the schema -- you just pick the template name. scrape_structured is general-purpose: you provide CSS selectors for any site, and CrawlForge runs them. Template is faster and cheaper (1 credit vs 2) when your target is one of the ten supported sites.

How fresh are the scrape_template schemas?

We monitor each supported site for layout changes and typically ship a template patch within 24 hours of any breaking change. Updates are transparent to your code -- you keep calling the same template name and the data shape stays the same. If you notice a regression, report it on Discord or GitHub.

What happens if a supported site changes its layout?

Calls keep returning JSON in the documented shape, even if the underlying selectors needed to change. We absorb the maintenance burden so you do not have to. If a layout change is severe enough to temporarily break a field, we mark that field nullable in the response until the patch is live (usually within 24 hours).
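
Because a field can come back null while a patch is pending, downstream code should check for missing values rather than assume every field is populated. A minimal sketch, assuming the response is a flat JSON object. The key names `title`, `price`, and `rating` are guesses based on the Amazon row of the table, not a documented schema.

```python
from typing import Any, Dict, List


def missing_fields(record: Dict[str, Any], required: List[str]) -> List[str]:
    """Return the required field names that are absent or null in the record."""
    return [name for name in required if record.get(name) is None]


# Hypothetical response with one field temporarily nulled out.
record = {"title": "Example product", "price": None, "rating": 4.6}
gaps = missing_fields(record, ["title", "price", "rating"])

if gaps:
    # Skip or retry later instead of writing a null into a price history.
    print(f"Incomplete record, missing: {', '.join(gaps)}")
```

Treating null as "retry later" rather than "zero" keeps a temporary layout break from polluting stored data.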
