CrawlForge
AI Tool · 3 credits

extract_structured

Give the extractor a JSON Schema and a natural-language prompt. The LLM reads the page and returns data matching your schema. When no LLM provider is configured, it falls back to CSS selector extraction using your selectorHints.

Use Cases

Schema-First Product Extraction

Define the fields you want once; the LLM maps any e-commerce site to your schema.

Resume & Document Parsing

Extract candidate names, skills, and work history directly into a typed object.

Knowledge Graph Seeding

Extract entities and relationships from articles into structured JSON for graph loaders.

Endpoint

POST /api/v1/tools/extract_structured
Auth Required
2 req/s on Free plan
3 credits
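
A client that loops over many URLs may want to pace itself to the 2 req/s Free-plan limit listed above. The following is a minimal sketch of client-side pacing, not part of the API: the `Throttle` class and its parameters are hypothetical names, and how the server reacts to bursts is not documented here.

```python
import time


class Throttle:
    """Spaces successive calls at least 1/rate_per_sec seconds apart."""

    def __init__(self, rate_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and self._next_allowed > now:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


# Free plan: 2 requests per second, i.e. one request every 0.5 s.
free_plan = Throttle(rate_per_sec=2)
```

Call `free_plan.wait()` immediately before each request; the first call returns at once and later calls sleep only as long as needed.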

Parameters

Name · Type · Required · Default · Description

url · string · Required · Default: -
URL to extract data from
Example: https://example.com/product/123

schema · object · Required · Default: -
JSON Schema describing the data to extract
Example: {"type":"object","properties":{"title":{"type":"string"},"price":{"type":"number"}},"required":["title"]}

prompt · string · Optional · Default: -
Natural-language instructions guiding the LLM extraction
Example: Extract the product name, current price, and whether it is in stock

llmConfig · object · Optional · Default: -
Optional LLM provider configuration (provider, apiKey). Omit to use CSS selector fallback.
Example: {"provider": "openai", "apiKey": "sk-..."}

selectorHints · object · Optional · Default: -
CSS selector hints to guide extraction (also used by selector fallback)
Example: {"title": "h1.product-title", "price": ".price"}

fallbackToSelectors · boolean · Optional · Default: true
Fall back to CSS selector extraction when LLM is unavailable
Example: true

LLM vs. selector fallback: Provide llmConfig to use LLM-powered extraction. Without it, the tool uses selectorHints for deterministic CSS extraction, which requires no LLM key. Either way, the call costs the same flat 3 credits.
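
The two modes differ only in which optional fields the request body carries. The sketch below builds both bodies from the parameter table; `build_payload` is a hypothetical helper written for illustration, not part of any CrawlForge SDK, and the `apiKey` value is the placeholder from the table.

```python
def build_payload(url, schema, prompt=None, llm_config=None,
                  selector_hints=None, fallback_to_selectors=True):
    """Assemble an extract_structured request body, omitting unset optional fields."""
    payload = {
        "url": url,
        "schema": schema,
        "fallbackToSelectors": fallback_to_selectors,
    }
    if prompt is not None:
        payload["prompt"] = prompt
    if llm_config is not None:
        payload["llmConfig"] = llm_config
    if selector_hints is not None:
        payload["selectorHints"] = selector_hints
    return payload


SCHEMA = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "price": {"type": "number"}},
    "required": ["title"],
}

# LLM-powered extraction: llmConfig is present.
llm_request = build_payload(
    "https://example.com/product/123",
    SCHEMA,
    prompt="Extract the product name, current price, and whether it is in stock",
    llm_config={"provider": "openai", "apiKey": "sk-..."},
)

# Selector fallback: no llmConfig, only CSS hints.
selector_request = build_payload(
    "https://example.com/product/123",
    SCHEMA,
    selector_hints={"title": "h1.product-title", "price": ".price"},
)
```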

Request Examples

cURL — LLM extraction

terminal (Bash)

TypeScript — selector fallback

extractStructured.ts (TypeScript)

Python

extract_structured.py (Python)
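
A minimal sketch of a Python call using only the standard library. Several details are assumptions, since this page does not state them: `BASE_URL` is a placeholder host, the `Authorization: Bearer` header is a guess at the auth scheme behind "Auth Required", and `build_request` and `extract_structured` are hypothetical helper names. Only the path and the body fields come from the reference above.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder, replace with the real API host
ENDPOINT = "/api/v1/tools/extract_structured"  # path from the Endpoint section


def build_request(api_key, payload, base_url=BASE_URL):
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        base_url + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; check your dashboard for the real scheme.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_structured(api_key, payload, base_url=BASE_URL, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(api_key, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = {
    "url": "https://example.com/product/123",
    "schema": {
        "type": "object",
        "properties": {"title": {"type": "string"}, "price": {"type": "number"}},
        "required": ["title"],
    },
    "prompt": "Extract the product name, current price, and whether it is in stock",
    "llmConfig": {"provider": "openai", "apiKey": "sk-..."},
}

# Uncomment once BASE_URL and the API key are real:
# result = extract_structured("YOUR_API_KEY", payload)
# print(result["data"]["extracted"])
```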

Response Example

200 OK · 1.2s
{
  "success": true,
  "data": {
    "url": "https://example.com/product/123",
    "extracted": {
      "title": "Premium Wireless Headphones",
      "price": 299.99,
      "in_stock": true
    },
    "extraction_method": "llm",
    "schema_fields": 3,
    "required_fields": 2,
    "llm_provider": "openai",
    "confidence": 0.92
  },
  "credits_used": 3,
  "credits_remaining": 997,
  "processing_time": 1240
}

Field Descriptions
data.extracted: Matches the JSON Schema you provided
data.extraction_method: "llm" when provider configured, "selector_fallback" otherwise
data.confidence: Extractor confidence (LLM confidence or selector match rate)
credits_used: Flat 3 credits per call
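
One way to act on these fields is sketched below in Python. The `summarize` helper, the 0.8 review threshold, and the three-field schema are assumptions made for illustration; the response dictionary mirrors the example above.

```python
def summarize(response, schema, min_confidence=0.8):
    """Report which extractor ran and whether the result deserves a manual look."""
    if not response.get("success"):
        raise ValueError("extraction failed")
    data = response["data"]
    extracted = data["extracted"]
    # Required schema keys that the extractor did not return.
    missing = [key for key in schema.get("required", []) if key not in extracted]
    low_confidence = data.get("confidence", 0.0) < min_confidence
    return {
        "used_llm": data["extraction_method"] == "llm",
        "missing_required": missing,
        "needs_review": bool(missing) or low_confidence,
    }


# Hypothetical schema consistent with schema_fields=3 and required_fields=2.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["title", "price"],
}

response = {
    "success": True,
    "data": {
        "extracted": {"title": "Premium Wireless Headphones", "price": 299.99, "in_stock": True},
        "extraction_method": "llm",
        "confidence": 0.92,
    },
    "credits_used": 3,
}

summary = summarize(response, schema)
```

This checks only the presence of required keys; full JSON Schema validation would need a dedicated validator.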

Credit Cost

3 credits per request
Flat 3 credits whether the call uses the LLM or the selector fallback.
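
Because the cost is flat, budgeting reduces to integer division. A small illustrative sketch follows; the helper names are hypothetical, and the balance of 997 is taken from credits_remaining in the response example.

```python
CREDITS_PER_CALL = 3  # flat cost stated above


def calls_affordable(balance):
    """Number of extract_structured calls a credit balance covers."""
    return balance // CREDITS_PER_CALL


def balance_after(balance, calls):
    """Credits left after a number of calls; raises if the balance is too small."""
    cost = calls * CREDITS_PER_CALL
    if cost > balance:
        raise ValueError("not enough credits")
    return balance - cost


remaining_calls = calls_affordable(997)          # 332
leftover = balance_after(997, remaining_calls)   # 1
```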

Tip: Pair with scrape_structured (2 credits, CSS-only) when you already have stable selectors and don't need LLM flexibility.

Related Tools

scrape_structured
CSS selector extraction (2 credits)
analyze_content
Sentiment, entities, and topic analysis (3 credits)