Web Scraping From the CLI: The CrawlForge CLI Guide

CrawlForge Team
Engineering Team
May 21, 2026
10 min read

Quick Answer

The CrawlForge CLI (@crawlforge/cli) is a terminal-first wrapper around all 23 CrawlForge tools. It works without an MCP client, outputs JSON for shell pipelines, and installs in 30 seconds with `npm install -g @crawlforge/cli`. Use it for cron jobs, CI/CD steps, one-off research, and any workflow where you would otherwise reach for curl plus a custom parser.

Most AI tools love to be agents. The CrawlForge CLI is built for the opposite: scriptable, terminal-first, predictable. You install it, set an environment variable, and every one of CrawlForge's 23 tools becomes a shell command. JSON in, JSON out. Pipe to jq, schedule with cron, run in CI -- it works the same way everywhere.

Table of Contents

  • What Is the CrawlForge CLI?
  • Install in 30 Seconds
  • The 15 Commands at a Glance
  • Your First Scrape
  • Piping JSON Output to jq
  • Scheduling With Cron
  • CLI vs MCP vs Raw API
  • Three Real-World Workflows
  • Global Flags Reference
  • What It Costs

What Is the CrawlForge CLI?

The CrawlForge CLI is a standalone npm package (@crawlforge/cli) that exposes all 23 CrawlForge tools as terminal commands. It is not a wrapper around an MCP server, and it does not need a long-running process. You type crawlforge scrape <url>, it makes an HTTPS call to CrawlForge's API, and prints JSON to stdout. That is the entire story.

It exists because half the scraping work people do is not agent-shaped. Cron jobs, CI steps, one-off research, ad-hoc pulls from a shell -- those want plain old commands, not a JSON-RPC handshake.

Install in 30 Seconds

Bash

That is it. No config file, no auth flow, no service to start. If you do not have an API key yet, grab one at crawlforge.dev/signup -- you get 1,000 free credits on signup.
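
For scripts that depend on this setup, a small preflight check can fail fast when something is missing. The following is a minimal sketch rather than anything shipped with the CLI: `preflight` is a hypothetical helper name, and the only details taken from this guide are the package name and the CRAWLFORGE_API_KEY variable.

```shell
# Setup commands named in this guide (the key value is a placeholder):
#   npm install -g @crawlforge/cli
#   export CRAWLFORGE_API_KEY="your-key-here"

# preflight: hypothetical helper, not part of the CLI.
# Returns 1 if the crawlforge command is missing, 2 if the key is unset.
preflight() {
  if ! command -v crawlforge >/dev/null 2>&1; then
    echo "crawlforge not found on PATH; run: npm install -g @crawlforge/cli" >&2
    return 1
  fi
  if [ -z "${CRAWLFORGE_API_KEY:-}" ]; then
    echo "CRAWLFORGE_API_KEY is not set" >&2
    return 2
  fi
  echo "ok"
}
```

Calling `preflight || exit 1` at the top of a cron or CI script stops the run before any credits are spent.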

To make the env var permanent on macOS or Linux:

Bash
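
A minimal sketch of one way to script this step, assuming your shell reads a profile file such as ~/.zshrc or ~/.bashrc. `persist_api_key` is a hypothetical helper; it appends the export line only if one is not already present, so running it twice does not duplicate the entry.

```shell
# persist_api_key: hypothetical helper, not part of the CLI.
# $1 = profile file (e.g. "$HOME/.zshrc"), $2 = API key
persist_api_key() {
  profile="$1"
  key="$2"
  if grep -q '^export CRAWLFORGE_API_KEY=' "$profile" 2>/dev/null; then
    echo "already set in $profile"
  else
    printf 'export CRAWLFORGE_API_KEY="%s"\n' "$key" >> "$profile"
    echo "added to $profile"
  fi
}

# Usage (the key value is a placeholder):
#   persist_api_key "$HOME/.zshrc" "your-key-here"
#   . "$HOME/.zshrc"    # reload the profile in the current shell
```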

On Windows (PowerShell):

Powershell

The 15 Commands at a Glance

Every command maps to one or more CrawlForge tools:

| Command | Primary tool | Credits | Example |
|---|---|---|---|
| scrape | fetch_url, extract_content | 1-2 | crawlforge scrape https://example.com |
| search | search_web | 5 | crawlforge search "MCP servers 2026" |
| crawl | crawl_deep | 4 | crawlforge crawl https://docs.example.com --depth 3 |
| map | map_site | 2 | crawlforge map https://example.com |
| extract | extract_with_llm | 3 | crawlforge extract <url> --schema schema.json |
| track | track_changes | 3 | crawlforge track <url> --baseline |
| analyze | analyze_content | 3 | crawlforge analyze <url> |
| research | deep_research | 10 | crawlforge research "AI agents in 2026" |
| stealth | stealth_mode | 5 | crawlforge stealth <url> |
| batch | batch_scrape | 5 | crawlforge batch urls.txt |
| actions | scrape_with_actions | 5 | crawlforge actions <url> --steps steps.json |
| localize | localization | 2 | crawlforge localize <url> --country DE |
| llmstxt | generate_llms_txt | 5 | crawlforge llmstxt https://example.com |
| template | scrape_template | 1 | crawlforge template amazon --url <url> |
| monitor | track_changes | 3 | crawlforge monitor <url> --interval 24h |

Your First Scrape

The simplest possible call:

Bash

What comes back is the page's main content as JSON:

Json

Want just the URLs? Pipe to jq:

Bash

Want it in a file?

Bash

Piping JSON Output to jq

This is the workflow that makes the CLI worth installing. Everything outputs JSON, and JSON pipes into anything.

Get the top 10 HN story titles:

Bash

Search the web and extract URLs:

Bash

Scrape a page and count words:

Bash

Batch scrape, then filter for error responses:

Bash

The pattern: --json gives you machine-readable output, then jq slices and dices.
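
To make the pattern concrete, here is a hedged sketch that runs jq against a made-up payload. The field names (`title`, `links`, `url`) are assumptions for illustration, not the CLI's documented schema, so inspect real output before relying on them. `link_urls` and `link_count` are hypothetical helper names.

```shell
# Hypothetical payload: "title", "links", and "url" are assumed field names,
# not the CLI's documented schema.
sample_json='{"title":"Example Domain","links":[{"url":"https://example.com/a"},{"url":"https://example.com/b"}]}'

# Print one link URL per line from a JSON document on stdin.
link_urls() {
  jq -r '.links[].url'
}

# Count the links instead of listing them.
link_count() {
  jq '.links | length'
}

# Try it offline against the sample:
#   printf '%s\n' "$sample_json" | link_urls
# With the CLI, the same filter would sit at the end of the pipe:
#   crawlforge scrape https://example.com --json | link_urls
```

Keeping each filter in a small function means the same jq expression can be reused across scripts and tested without spending credits.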

Scheduling With Cron

A daily check on a competitor's pricing page:

Bash

A nightly research run:

Bash

A weekly llms.txt regeneration for your own site:

Bash
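
Cron runs jobs with a minimal environment and does not load your interactive shell profile, so scheduled jobs are often easier to manage through a small wrapper script. The sketch below is one possible shape, not an official recipe: `track_pricing`, the script path, and the snapshot directory are hypothetical, while `track`, `--json`, and `--output` come from the command table and global flags in this guide.

```shell
# Example crontab entries (paths are placeholders):
#   CRAWLFORGE_API_KEY=your-key-here
#   0 6 * * * /home/you/bin/track-pricing.sh >> /home/you/logs/track.log 2>&1

# track_pricing: hypothetical wrapper that writes one dated snapshot per run.
# $1 = URL to track, $2 = directory for snapshots
track_pricing() {
  url="$1"
  outdir="$2"
  stamp=$(date +%Y-%m-%d)
  outfile="$outdir/$stamp.json"
  mkdir -p "$outdir" || return 1
  # --json and --output are the global flags listed in this guide.
  crawlforge track "$url" --json --output "$outfile" || return 1
  echo "$outfile"
}

# Inside track-pricing.sh you might call:
#   track_pricing "https://competitor.example/pricing" "$HOME/snapshots/competitor"
```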

In CI? Use the same commands in your GitHub Actions YAML. The CLI checks CRAWLFORGE_API_KEY first, so just set it as a repository secret.

Yaml

CLI vs MCP vs Raw API: When to Use Each

| Workflow | Use the CLI | Use MCP | Use Raw API |
|---|---|---|---|
| One-off scrape from your terminal | yes | no | no |
| Cron job or CI step | yes | no | only if you need to |
| Claude / Cursor / Windsurf agent | no | yes | no |
| Embedded in a Node/Python service | no | only if MCP-shaped | yes |
| Long-running background worker | no | no | yes |
| Quick exploration of an unfamiliar site | yes | maybe | no |

Rule of thumb: if a human is typing the command, use the CLI. If an LLM is selecting the tool, use MCP. If a server is calling it in a loop, use the raw API.

Three Real-World Workflows

1. Competitive Pricing Monitor

A shell script that runs daily, scrapes three competitor pricing pages, diffs against yesterday's snapshot, and posts to Slack if anything changed.

Bash

Cost: ~9 credits per day (3 competitors × 3 credits for track).
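
A rough sketch of the compare-and-alert step, under stated assumptions. This guide does not show what `track` prints, so the sketch simply saves each run's output and compares it byte-for-byte with the previous run; that would be noisy if the output contains timestamps. `snapshot_changed`, `check_competitor`, and `SLACK_WEBHOOK_URL` are hypothetical names.

```shell
# snapshot_changed: succeeds only when a previous snapshot exists and differs.
snapshot_changed() {
  prev="$1"
  cur="$2"
  [ -f "$prev" ] || return 1   # first run: nothing to compare against
  ! cmp -s "$prev" "$cur"
}

# check_competitor: hypothetical helper. $1 = label, $2 = URL, $3 = snapshot dir
check_competitor() {
  name="$1"
  url="$2"
  dir="$3"
  mkdir -p "$dir" || return 1
  old="$dir/$name.previous.json"
  new="$dir/$name.latest.json"
  crawlforge track "$url" --json > "$new" || return 1
  if snapshot_changed "$old" "$new"; then
    echo "$name changed"
  fi
  mv "$new" "$old"
}

# Posting to a Slack incoming webhook (the URL is yours to supply):
#   msg=$(check_competitor acme "https://acme.example/pricing" "$HOME/snapshots")
#   [ -n "$msg" ] && curl -s -X POST -H 'Content-Type: application/json' \
#     --data "{\"text\":\"$msg\"}" "$SLACK_WEBHOOK_URL"
```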

2. Lead Enrichment From a CSV

Read a CSV of company domains, scrape each homepage for contact info, write enriched data back.

Bash

Cost: 1 credit per company.
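
The loop at the heart of this workflow might look like the sketch below. It assumes a CSV with a header row and the domain in the first column, and it writes one JSON file per domain plus a status line; `enrich_domains` and the file layout are hypothetical, and pulling contact fields out of each JSON file is left to a later jq step because the output schema is not shown here.

```shell
# enrich_domains: hypothetical helper. $1 = input CSV, $2 = output directory
enrich_domains() {
  in_csv="$1"
  out_dir="$2"
  mkdir -p "$out_dir" || return 1
  # Skip the header row, strip Windows line endings, read the first column.
  tail -n +2 "$in_csv" | tr -d '\r' | while IFS=, read -r domain _rest; do
    [ -n "$domain" ] || continue
    # </dev/null keeps the CLI from consuming the loop's input.
    if crawlforge scrape "https://$domain" --json > "$out_dir/$domain.json" </dev/null; then
      echo "$domain,ok"
    else
      echo "$domain,failed"
    fi
  done
}

# Usage:
#   enrich_domains companies.csv enriched/ > status.csv
```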

3. Research Report Pipeline

A weekly Sunday cron that runs a research query, summarizes the result, and emails it to the team.

Bash

Cost: 13 credits per run (10 for research, 3 for analyze).

Global Flags Reference

These work on every command:

  • --json -- machine-readable output (default for piping; use --pretty for human-readable JSON)
  • --output <file> -- write to file instead of stdout
  • --timeout <ms> -- override default 30s timeout
  • --verbose -- print debug info to stderr
  • --api-key <key> -- override the env var
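
In unattended runs, these flags pair well with a small retry wrapper. The sketch below assumes the CLI exits with a non-zero status on failure, which this guide does not state explicitly; `with_retry` and `RETRY_DELAY` are hypothetical names, not CLI features.

```shell
# with_retry: run a command up to N times, pausing between attempts.
# $1 = maximum attempts, remaining arguments = the command to run
with_retry() {
  max_attempts="$1"
  shift
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "${RETRY_DELAY:-5}"   # seconds between attempts
    attempt=$(( attempt + 1 ))
  done
}

# Usage, with a 60-second timeout per attempt:
#   with_retry 3 crawlforge scrape https://example.com --json --timeout 60000
```

Keep the attempt count low: each retried call may be billed again.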

What It Costs

The CLI itself is free. You pay only for the underlying tool calls, billed against your existing credit balance. No extra subscription, no per-invocation fee. A daily cron that runs track against three URLs and research once a week costs roughly 310 credits per month (3 URLs × 3 credits × 30 days, plus 10 credits × about 4 weekly runs) -- still within the 1,000 free credits you get on signup.
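
A quick way to sanity-check an estimate like this is shell arithmetic. The sketch assumes the per-call prices from the command table (3 credits for track, 10 for research), a 30-day month, and four weekly runs; `monthly_credits` is a hypothetical helper.

```shell
# monthly_credits: $1 = URLs tracked per day, $2 = days in the month,
#                  $3 = research runs in the month
monthly_credits() {
  track_cost=3       # credits per track call (from the command table)
  research_cost=10   # credits per research call (from the command table)
  echo $(( $1 * track_cost * $2 + $3 * research_cost ))
}

monthly_credits 3 30 4   # prints 310
```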


Ready to install? Get your free API key at crawlforge.dev/signup and run npm install -g @crawlforge/cli. New here? Read the v4.2.2 launch announcement for everything new, or the original MCP quickstart for the MCP version instead.

Tags

CLI, web-scraping, tutorial, terminal, automation, scripting

About the Author

CrawlForge Team

Engineering Team

Building the most comprehensive web scraping MCP server. We create tools that help developers extract, analyze, and transform web data for AI applications.

Frequently Asked Questions

Is the CrawlForge CLI free?

The CLI package itself is free and open. You pay only for the underlying tool calls billed against your normal CrawlForge credit balance, the same as you would from MCP or the raw API. There is no extra per-invocation fee.

Do I need a CrawlForge API key to use the CLI?

Yes. The CLI reads the CRAWLFORGE_API_KEY environment variable on every call. Get a free key at crawlforge.dev/signup (no credit card required) and set it once in your shell profile.

Can I use the CrawlForge CLI in CI/CD pipelines?

Yes -- this is one of its primary use cases. Install via "npm install -g @crawlforge/cli" in your CI runner, set CRAWLFORGE_API_KEY as a repository secret, and run any command. It works the same in GitHub Actions, GitLab CI, CircleCI, and Jenkins.

How is the CrawlForge CLI different from curl?

curl gives you raw HTML. The CrawlForge CLI returns structured JSON: cleaned content, extracted metadata, links, headings, and tool-specific fields like search results, research summaries, or template-scraped product data. It also handles anti-bot defenses, stealth mode, and browser automation -- all things curl cannot do.

Does the CLI support all 23 CrawlForge tools?

Yes. The 15 commands cover all 23 tools (some commands expose multiple tools via flags). For example, "crawlforge extract" maps to extract_with_llm by default and extract_structured with the --css flag.

Can the CrawlForge CLI output structured data for parsing?

Yes -- pass --json on any command and the output is clean JSON suitable for piping into jq or any JSON-aware tool. Use --pretty for human-readable formatting, or --output <file> to write directly to disk.
