Templates
Reusable scraping, crawling, and search recipes with variables and custom logic.
Introduction
Templates are reusable configurations for scraping, crawling, or searching. Instead of repeating the same options in every API call, you define them once in a template (or obtain a template from the AnyCrawl Template Store) and reference it by template_id.
Benefits:
- Simplicity: Call APIs with just template_id plus minimal inputs
- Consistency: Standardize behavior across your team or projects
- Safety: Templates can restrict allowed domains and expose only necessary variables
- Power: Optional custom handlers for advanced transformations
Supported types:
- scrape: single-page extraction via /v1/scrape
- crawl: multi-page crawling via /v1/crawl
- search: search engine results via /v1/search
Template Marketplace
Browse ready-to-use templates at anycrawl.dev/template.
How to use:
- Browse the marketplace and find a template that fits your needs
- Copy the template_id from the template detail page
- Call the API with that template_id and the required inputs
Example:
curl -X POST "https://api.anycrawl.dev/v1/scrape" \
-H "Authorization: Bearer <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{
"template_id": "content-extractor",
"url": "https://example.com"
}'
Using Templates in API Calls
Request Parameters
When using template_id, only minimal fields are allowed:
| Endpoint | Required Field | Optional Fields |
|---|---|---|
| /v1/scrape | template_id | url, variables |
| /v1/crawl | template_id | url, variables |
| /v1/search | template_id | query, variables |
Important notes:
- url or query may be optional if the template predefines them. Check the template description.
- variables passes dynamic inputs the template expects (see the Variables section below).
- Other fields (like engine, formats, timeout, etc.) come from the template and cannot be overridden.
- Providing disallowed fields returns a 400 validation error.
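The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not an official SDK: the helper names `findDisallowedFields` and `buildTemplateRequest` are hypothetical, and the allowed-field lists are simply copied from the table in this section.

```javascript
// Allowed top-level fields per endpoint, copied from the table above.
const ALLOWED_FIELDS = {
  "/v1/scrape": ["template_id", "url", "variables"],
  "/v1/crawl": ["template_id", "url", "variables"],
  "/v1/search": ["template_id", "query", "variables"],
};

// Returns the names of top-level fields that are not permitted for this endpoint.
function findDisallowedFields(endpoint, body) {
  const allowed = ALLOWED_FIELDS[endpoint];
  if (!allowed) throw new Error(`Unknown endpoint: ${endpoint}`);
  return Object.keys(body).filter((key) => !allowed.includes(key));
}

// Builds the arguments for fetch(); throws before any network call if the body is invalid.
// (Hypothetical helper, not part of any AnyCrawl client library.)
function buildTemplateRequest(endpoint, body, apiKey) {
  if (!body.template_id) throw new Error("template_id is required");
  const extra = findDisallowedFields(endpoint, body);
  if (extra.length > 0) {
    throw new Error(`Fields not allowed with template_id: ${extra.join(", ")}`);
  }
  return {
    url: `https://api.anycrawl.dev${endpoint}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`. Failing early on the client avoids spending a round trip on a request that would come back as a 400.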
Scrape with a Template
curl -X POST "https://api.anycrawl.dev/v1/scrape" \
-H "Authorization: Bearer <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{
"template_id": "my-scrape-template",
"url": "https://example.com",
"variables": { "category": "tech" }
}'
Crawl with a Template
curl -X POST "https://api.anycrawl.dev/v1/crawl" \
-H "Authorization: Bearer <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{
"template_id": "my-crawl-template",
"url": "https://docs.example.com",
"variables": { "maxPages": 50 }
}'
Search with a Template
curl -X POST "https://api.anycrawl.dev/v1/search" \
-H "Authorization: Bearer <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{
"template_id": "my-search-template",
"query": "machine learning tutorials",
"variables": { "lang": "en" }
}'
Variables
Templates can declare variables to accept dynamic inputs at call time.
- Each variable has a type: string, number, boolean, or url
- Variables can be required or optional with defaultValue
- Check the template description to see what variables it expects
Example request with variables:
{
"template_id": "blog-scraper",
"url": "https://example.com/blog/post-123",
"variables": {
"author": "john-doe",
"includeComments": true,
"maxComments": 50
}
}
If you omit a required variable or provide the wrong type, you'll get a 400 validation error.
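To make the variable rules concrete, here is a rough sketch of how required flags, default values, and the four types could be checked before calling the API. The declaration shape (`{ name: { type, required, defaultValue } }`) and the function `resolveVariables` are assumptions for illustration; the server's actual validation may differ, especially in what it accepts as a url.

```javascript
// Treats a value as a url if the built-in URL parser accepts it (an assumption
// about what the "url" type means).
function isValidUrl(value) {
  if (typeof value !== "string") return false;
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

const TYPE_CHECKS = {
  string: (v) => typeof v === "string",
  number: (v) => typeof v === "number" && Number.isFinite(v),
  boolean: (v) => typeof v === "boolean",
  url: isValidUrl,
};

// Hypothetical declaration shape: { name: { type, required, defaultValue } }.
// Returns the values with defaults applied plus a list of readable issues.
function resolveVariables(declarations, provided = {}) {
  const values = {};
  const issues = [];
  for (const [name, decl] of Object.entries(declarations)) {
    if (provided[name] === undefined) {
      if (decl.required) issues.push(`Missing required variable '${name}'`);
      else if (decl.defaultValue !== undefined) values[name] = decl.defaultValue;
      continue;
    }
    const check = TYPE_CHECKS[decl.type];
    if (!check || !check(provided[name])) {
      issues.push(`Variable '${name}' must be of type ${decl.type}`);
    } else {
      values[name] = provided[name];
    }
  }
  return { values, issues };
}
```

An empty `issues` array means the input would pass this local check; any entry corresponds to a case the API would be expected to reject with a 400.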
Response Format
Template responses follow the same format as standard API calls:
{
"success": true,
"data": {
"url": "https://example.com",
"markdown": "# Page Title\n\nContent...",
"metadata": { ... },
// Additional fields from custom handlers (if any)
"extractedData": { ... }
}
}
Templates with custom handlers may add extra fields to the response.
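Because handler output sits next to the standard fields inside data, a caller may want to separate the two. The sketch below assumes the standard fields are the three shown in the example above (url, markdown, metadata); the real set depends on the formats the template requests, and `splitTemplateResult` is a hypothetical helper.

```javascript
// Fields shown in the example response; adjust to the formats your template returns.
const STANDARD_FIELDS = ["url", "markdown", "metadata"];

// Splits response.data into standard fields and fields added by custom handlers.
function splitTemplateResult(response) {
  if (!response || response.success !== true) {
    throw new Error("Request did not succeed");
  }
  const standard = {};
  const extras = {};
  for (const [key, value] of Object.entries(response.data ?? {})) {
    const target = STANDARD_FIELDS.includes(key) ? standard : extras;
    target[key] = value;
  }
  return { standard, extras };
}
```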
Error Handling
Common errors when using templates:
| Error | HTTP Status | Description |
|---|---|---|
| Template not found | 404 | template_id doesn't exist or you lack access |
| Validation error | 400 | Missing required variables or wrong types |
| Domain restriction violation | 403 | URL not allowed by template's domain policy |
| Invalid fields | 400 | Extra top-level fields not permitted with templates |
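The table above maps naturally onto a small client-side helper. This is a sketch under assumptions: the `kind` labels and hint texts are invented for illustration, and `invalidFields` assumes the error body has the shape of the example error response in this section.

```javascript
// Maps the HTTP status of a failed template call to the cases in the table above.
// The `kind` labels are hypothetical, chosen only for this example.
function describeTemplateError(status) {
  switch (status) {
    case 400:
      return { kind: "validation", hint: "Check variable names and types, and remove extra top-level fields." };
    case 403:
      return { kind: "domain_restricted", hint: "The URL is outside the template's allowed domains." };
    case 404:
      return { kind: "template_not_found", hint: "The template_id does not exist or you lack access." };
    default:
      return { kind: "unknown", hint: `Unexpected status ${status}.` };
  }
}

// Collects the names of rejected fields from a validation error body.
function invalidFields(errorBody) {
  const issues = errorBody?.data?.issues ?? [];
  return issues
    .filter((issue) => issue.code === "invalid_field")
    .map((issue) => issue.field);
}
```

None of these three statuses is worth retrying unchanged: each one points to something the caller has to fix in the request or the template choice.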
Example error response:
{
"success": false,
"error": "Validation error",
"message": "When using template_id, only template_id, url, variables are allowed. Invalid fields: engine, formats",
"data": {
"type": "validation_error",
"issues": [
{
"field": "engine",
"message": "Field 'engine' is not allowed when using template_id",
"code": "invalid_field"
}
],
"status": "failed"
}
}
Best Practices
For API Callers
- Always check the template description for required variables and allowed domains
- Use marketplace templates when available to save time
- Handle 404 errors (template may have been deleted or archived)
- Don't try to override template settings (engine, formats, etc.); the request will be rejected with a 400 validation error
For Template Authors
- Keep templates focused on a single use case
- Document all variables clearly with descriptions
- Use domain restrictions to prevent misuse
- Set appropriate pricing based on complexity
- Test templates thoroughly before publishing
Creating Templates (Advanced)
If you're creating your own templates, you can configure:
Domain Restrictions
Limit where your template can be used:
{
"allowedDomains": {
"type": "glob",
"patterns": ["*.example.com", "docs.mysite.com"]
}
}
- type: "exact" (exact match) or "glob" (pattern matching)
- patterns: array of allowed domains or patterns
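As an illustration of the two matching types, the sketch below checks a URL's hostname against an allowedDomains object. It assumes that matching is done on the hostname only, is case-insensitive, and that `*` in a glob matches any run of characters; AnyCrawl's actual glob semantics are not specified here and may differ (for example, in whether `*.example.com` also covers example.com itself).

```javascript
// Converts a glob such as "*.example.com" into an anchored, case-insensitive RegExp.
function globToRegExp(glob) {
  // Escape every regex metacharacter except "*", then expand "*" to ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`, "i");
}

// Returns true if the URL's hostname is permitted by the allowedDomains policy.
function isUrlAllowed(url, allowedDomains) {
  const host = new URL(url).hostname.toLowerCase();
  const { type, patterns } = allowedDomains;
  if (type === "exact") {
    return patterns.some((p) => p.toLowerCase() === host);
  }
  if (type === "glob") {
    return patterns.some((p) => globToRegExp(p).test(host));
  }
  throw new Error(`Unsupported allowedDomains type: ${type}`);
}
```

Matching on the parsed hostname rather than the raw string keeps a pattern from being satisfied by text in the path or query string.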
Pricing
Set credit cost per call:
{
"pricing": {
"perCall": 10,
"currency": "credits"
}
}
Custom Handlers
Write JavaScript/TypeScript code to:
- requestHandler: Post-process scrape results and add custom fields
- failedRequestHandler: Handle failures with custom retry logic
- queryTransform (search only): Transform queries before searching
- urlTransform (scrape/crawl only): Transform URLs before processing
Both transforms support:
- Template mode with placeholders (query: {{query}}, url: {{url}})
- Append mode with prefix and suffix
- Optional regexExtract to pre-extract a substring before applying the mode
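The order of operations described above (optional regexExtract first, then the mode) can be simulated locally. This is a sketch, not the platform's implementation: the mode name "template", the `template` field holding the placeholder string, and the fallback to the original input when the regex does not match are all assumptions.

```javascript
// Simulates a queryTransform/urlTransform config.
// `placeholder` is "query" or "url", matching {{query}} and {{url}}.
function applyTransform(input, config, placeholder) {
  if (!config || !config.enabled) return input;
  let value = input;
  if (config.regexExtract) {
    const { pattern, flags = "", group = 0 } = config.regexExtract;
    const match = new RegExp(pattern, flags).exec(input);
    // Assumption: keep the original input when the pattern does not match.
    if (match && match[group] !== undefined) value = match[group];
  }
  if (config.mode === "template") {
    // Assumption: the placeholder string lives in a `template` field.
    return config.template.split(`{{${placeholder}}}`).join(value);
  }
  if (config.mode === "append") {
    return `${config.prefix ?? ""}${value}${config.suffix ?? ""}`;
  }
  return value;
}
```

For instance, a search template could use template mode with "site:example.com {{query}}" to scope every query to one site, while append mode suits cases where only a fixed prefix or suffix is needed.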
Example regex extraction for TikTok profiles:
{
"customHandlers": {
"urlTransform": {
"enabled": true,
"mode": "append",
"prefix": "",
"suffix": "",
"regexExtract": {
"pattern": "^(https?:\\/\\/www\\.tiktok\\.com\\/@[^\\/?#]+)",
"flags": "i",
"group": 1
}
}
}
}
This extracts https://www.tiktok.com/@piperrockelle from inputs like:
- https://www.tiktok.com/@piperrockelle?abb=ccc
- https://www.tiktok.com/@piperrockelle
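The pattern can be tried locally before it goes into a template. The snippet below copies the pattern, flag, and capture group from the config above; `extractProfileUrl` is a hypothetical helper, and the last two sample inputs are made up to show a deeper path and a non-matching host.

```javascript
// Pattern and flags copied from the urlTransform config above (JSON escaping kept as-is).
const profilePattern = new RegExp("^(https?:\\/\\/www\\.tiktok\\.com\\/@[^\\/?#]+)", "i");

// Returns capture group 1 (the profile URL) or null when the input does not match.
function extractProfileUrl(input) {
  const match = profilePattern.exec(input);
  return match ? match[1] : null;
}

const samples = [
  "https://www.tiktok.com/@piperrockelle?abb=ccc",
  "https://www.tiktok.com/@piperrockelle",
  "https://www.tiktok.com/@piperrockelle/video/123", // hypothetical extra input
  "https://example.com/@piperrockelle", // different host, should not match
];
for (const sample of samples) {
  console.log(sample, "->", extractProfileUrl(sample));
}
```

Since the character class stops at `/`, `?`, and `#`, anything after the handle (path segments, query string, fragment) is dropped.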
Example requestHandler:
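// Extract structured data from the page context
const title = context.data.title;
// Fall back to an empty string so the word count still works if markdown is missing
const content = context.data.markdown || "";
// Drop empty tokens so blank or padded content is not counted as a word
const words = content.trim().split(/\s+/).filter(Boolean);
return {
  extractedTitle: title,
  wordCount: words.length,
  // calculateMetric is a placeholder for a helper you define yourself
  customMetric: calculateMetric(content),
};
Security Model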
- Non-trusted templates: Run in a hardened VM sandbox with strict limitations
- Trusted templates: Can use async functions with controlled browser page access
Only templates reviewed and approved by AnyCrawl can be marked as trusted.
FAQ
Can I override template settings like engine or formats?
No. Templates are designed to be immutable configurations. You can only provide url/query and variables.
What happens if I use a template from the marketplace?
Marketplace templates are publicly available. You pay the credits defined by the template author.
Can templates see my API key?
No. Templates run in isolated sandboxes and have no access to your credentials.
How do I create my own templates?
Visit the AnyCrawl playground to create and test templates. Once published, they can be used via API.