Olostep Web Scraper
Automate web search, scraping and crawling on Apify with Olostep, the API to search, extract and structure web data.
Olostep Web Scraper - Apify Actor
Official Apify Actor for Olostep, an API to search, scrape, crawl and structure web data. Extract content from any website in multiple formats (Markdown, HTML, JSON, or plain text), with support for single-page scraping, batch processing, website crawling, and URL mapping.
Overview
This Actor integrates Olostep's web search, scraping and crawling capabilities into the Apify platform, allowing you to:
- Scrape single websites - Extract content from any URL
- Batch process URLs - Scrape up to 100,000 URLs in parallel
- Crawl websites - Automatically discover and scrape linked pages
- Map websites - Extract all URLs from a website for structure analysis
- AI-powered Answers - Ask natural-language questions and get structured JSON answers with sources
Features
- ✅ Multiple output formats (Markdown, HTML, JSON, Text)
- ✅ JavaScript rendering support with configurable wait times
- ✅ Country-specific scraping
- ✅ Specialized parsers for popular websites (Amazon, LinkedIn, etc.)
- ✅ Batch processing for large-scale data extraction
- ✅ Website crawling with link following
- ✅ URL mapping and discovery
- ✅ Integration-ready design for Apify workflows
Input
The Actor accepts the following input parameters:
Required Fields
- operation (string): Operation type - scrape, batch, crawl, or map (Example 5 also uses answers)
- apiKey (string): Your Olostep API key from olostep.com/dashboard
Operation-Specific Fields
Scrape Operation
- url_to_scrape (string, required): URL to scrape
- formats (string): Output format - html, markdown, json, or text (default: markdown)
- country (string): Country code for location-specific scraping (e.g., US, GB, CA)
- wait_before_scraping (integer): Wait time in milliseconds for JavaScript rendering
- parser (string): Parser ID for specialized extraction (e.g., @olostep/amazon-product)
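The scrape fields above can be assembled and sanity-checked before a run is started. The following is a minimal Python sketch that relies only on the field names and allowed values listed in this README; `build_scrape_input` is a hypothetical helper and not part of any Olostep or Apify SDK.

```python
import json

# Output formats listed in this README for the scrape operation
ALLOWED_FORMATS = {"html", "markdown", "json", "text"}


def build_scrape_input(api_key, url, formats="markdown", country=None,
                       wait_before_scraping=None, parser=None):
    """Build an Actor input dict for the scrape operation (hypothetical helper)."""
    if formats not in ALLOWED_FORMATS:
        raise ValueError(f"formats must be one of {sorted(ALLOWED_FORMATS)}")
    if wait_before_scraping is not None and wait_before_scraping < 0:
        raise ValueError("wait_before_scraping must be >= 0 milliseconds")

    run_input = {
        "operation": "scrape",
        "apiKey": api_key,
        "url_to_scrape": url,
        "formats": formats,
    }
    # Optional fields are included only when a value was supplied
    optional = {
        "country": country,
        "wait_before_scraping": wait_before_scraping,
        "parser": parser,
    }
    run_input.update({k: v for k, v in optional.items() if v is not None})
    return run_input


if __name__ == "__main__":
    example = build_scrape_input("your-api-key", "https://example.com", country="US")
    print(json.dumps(example, indent=2))
```

The resulting dict can be pasted into the Actor's input editor as JSON or passed as the run input by whichever client you use to start the Actor.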
Batch Operation
- batch_array (string, required): JSON array of objects with url and optional custom_id
  - Example: [{"url":"https://example.com","custom_id":"site1"}]
- formats (string): Output format
- country (string): Country code
- wait_before_scraping (integer): Wait time in milliseconds
- parser (string): Parser ID
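Because batch_array is a string that contains JSON rather than a native array, it is easy to get the quoting wrong by hand. A small Python sketch of one way to generate it follows; `build_batch_array` and the `site1`, `site2`, ... naming of custom_id are illustrative assumptions, and the 100,000 limit is taken from the Overview.

```python
import json

MAX_BATCH_URLS = 100_000  # upper bound quoted in the Overview


def build_batch_array(urls):
    """Serialize URLs into the JSON string expected by batch_array (hypothetical helper)."""
    urls = list(urls)
    if not urls:
        raise ValueError("at least one URL is required")
    if len(urls) > MAX_BATCH_URLS:
        raise ValueError(f"a batch may contain at most {MAX_BATCH_URLS} URLs")

    # custom_id lets each result be matched back to its input URL
    items = [{"url": u, "custom_id": f"site{i}"} for i, u in enumerate(urls, start=1)]
    # Compact separators reproduce the style of the example above
    return json.dumps(items, separators=(",", ":"))


if __name__ == "__main__":
    print(build_batch_array(["https://example.com"]))
```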
Crawl Operation
- start_url (string, required): Starting URL for the crawl
- max_pages (integer): Maximum number of pages to crawl (default: 10)
- follow_links (boolean): Whether to follow links (default: true)
- formats (string): Output format
- country (string): Country code
- parser (string): Parser ID
Map Operation
- website_url (string, required): Website URL to extract links from
- search_query (string): Optional search query to filter URLs
- top_n (integer): Limit the number of URLs returned
- include_patterns (string): Glob patterns to include (e.g., /blog/**)
- exclude_patterns (string): Glob patterns to exclude (e.g., /admin/**)
Output
The Actor outputs data to the default dataset. Output format varies by operation:
Scrape Output
{
  "id": "scrape_abc123",
  "url": "https://example.com",
  "status": "completed",
  "formats": "markdown",
  "markdown_content": "# Example Content\n\n...",
  "html_content": "<h1>Example Content</h1>...",
  "json_content": "{...}",
  "text_content": "Example Content...",
  "markdown_hosted_url": "https://...",
  "page_metadata": "{...}"
}
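When reading a scrape item back from the dataset, the content field to use depends on the requested format. The Python sketch below picks the matching field and decodes the string-encoded values. It assumes, based only on the sample above, that json_content and page_metadata arrive as JSON-encoded strings; `extract_content` and `parse_metadata` are hypothetical helpers.

```python
import json

# Maps the "formats" value to the output field shown in the sample item
CONTENT_FIELDS = {
    "markdown": "markdown_content",
    "html": "html_content",
    "json": "json_content",
    "text": "text_content",
}


def extract_content(item):
    """Return the content matching the item's format, decoding JSON if needed."""
    fmt = item.get("formats", "markdown")
    content = item.get(CONTENT_FIELDS[fmt])
    # Assumption: json_content is a JSON-encoded string, as the sample suggests
    if fmt == "json" and isinstance(content, str):
        content = json.loads(content)
    return content


def parse_metadata(item):
    """Decode page_metadata into a dict, or return {} when it is absent."""
    raw = item.get("page_metadata")
    if not raw:
        return {}
    return json.loads(raw) if isinstance(raw, str) else raw
```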
Batch Output
{
  "batch_id": "batch_xyz789",
  "status": "processing",
  "total_urls": 100,
  "formats": "markdown",
  "urls": [{"custom_id": "site1", "url": "https://example.com"}]
}
Crawl Output
{
  "crawl_id": "crawl_def456",
  "status": "in_progress",
  "start_url": "https://example.com",
  "max_pages": 10,
  "follow_links": true,
  "formats": "markdown"
}
Map Output
{
  "map_id": "map_ghi789",
  "website_url": "https://example.com",
  "total_urls": 150,
  "urls": ["https://example.com/page1", "https://example.com/page2", ...]
}
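For the structure analysis mentioned in the Overview, the urls list of a map item can be grouped by top-level path segment. This is an illustrative Python sketch using only the standard library; `summarize_sections` is a hypothetical helper, and the item shape is assumed to match the sample above.

```python
from collections import Counter
from urllib.parse import urlparse


def summarize_sections(map_item):
    """Count mapped URLs per top-level path segment (e.g. /blog, /docs)."""
    counts = Counter()
    for url in map_item.get("urls", []):
        path = urlparse(url).path.strip("/")
        # URLs with an empty path are counted under the site root "/"
        section = ("/" + path.split("/")[0]) if path else "/"
        counts[section] += 1
    return counts


if __name__ == "__main__":
    sample = {"urls": ["https://example.com/blog/a", "https://example.com/blog/b",
                       "https://example.com/about"]}
    for section, n in summarize_sections(sample).most_common():
        print(section, n)
```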
Usage Examples
Example 1: Scrape a Single Website
{
  "operation": "scrape",
  "apiKey": "your-api-key",
  "url_to_scrape": "https://example.com",
  "formats": "markdown",
  "country": "US"
}
Example 2: Batch Scrape Multiple URLs
{
  "operation": "batch",
  "apiKey": "your-api-key",
  "batch_array": "[{\"url\":\"https://example.com\",\"custom_id\":\"site1\"},{\"url\":\"https://test.com\",\"custom_id\":\"site2\"}]",
  "formats": "json",
  "parser": "@olostep/amazon-product"
}
Example 3: Crawl a Website
{
  "operation": "crawl",
  "apiKey": "your-api-key",
  "start_url": "https://example.com",
  "max_pages": 50,
  "follow_links": true,
  "formats": "markdown"
}
Example 4: Map a Website
{
  "operation": "map",
  "apiKey": "your-api-key",
  "website_url": "https://example.com",
  "include_patterns": "/blog/**",
  "top_n": 100
}
Example 5: AI-powered Answers
{
  "operation": "answers",
  "apiKey": "your-api-key",
  "task": "What is the latest funding round of Olostep? Provide company, round, date, amount.",
  "json": "{\"company\":\"\",\"round\":\"\",\"date\":\"\",\"amount\":\"\"}"
}
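In Example 5 the json field is itself a JSON-encoded string describing the shape of the desired answer. A short Python sketch of generating it from a list of field names follows; `build_answers_input` is a hypothetical helper, and the empty-string template values simply mirror the example above.

```python
import json


def build_answers_input(api_key, task, fields):
    """Build an answers input whose json field is an empty-valued template."""
    # Each requested field maps to an empty string, as in Example 5
    template = {name: "" for name in fields}
    return {
        "operation": "answers",
        "apiKey": api_key,
        "task": task,
        # The template is passed as a string, not as a nested object
        "json": json.dumps(template, separators=(",", ":")),
    }


if __name__ == "__main__":
    run_input = build_answers_input(
        "your-api-key",
        "What is the latest funding round of Olostep?",
        ["company", "round", "date", "amount"],
    )
    print(json.dumps(run_input, indent=2))
```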
Integration with Other Actors
This Actor is designed to work seamlessly with other Apify Actors:
- Input from other Actors: Use the payload field to receive data from triggering actors
- Output to other Actors: Output data is stored in the default dataset, accessible by other actors
- Workflow Integration: Chain multiple actors together for complex data extraction workflows
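One chained workflow is to map a site first and then batch-scrape the URLs it returns. The Python sketch below converts a map output item into a batch input, assuming the item shapes shown in the Output section; `map_output_to_batch_input` and the `page1`, `page2`, ... custom_id values are illustrative, and how the two runs are actually triggered is left to your Apify setup.

```python
import json


def map_output_to_batch_input(map_item, api_key, formats="markdown", limit=None):
    """Turn a map output item into an input dict for the batch operation."""
    urls = map_item.get("urls", [])
    if limit is not None:
        urls = urls[:limit]
    if not urls:
        raise ValueError("map output contains no URLs to scrape")

    batch = [{"url": u, "custom_id": f"page{i}"} for i, u in enumerate(urls, start=1)]
    return {
        "operation": "batch",
        "apiKey": api_key,
        "batch_array": json.dumps(batch),  # batch_array is a JSON string
        "formats": formats,
    }
```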
Specialized Parsers
Olostep provides pre-built parsers for popular websites:
- @olostep/amazon-product - Amazon product pages
- @olostep/linkedin-profile - LinkedIn profiles
- @olostep/linkedin-company - LinkedIn company pages
- @olostep/google-search - Google search results
- @olostep/google-maps - Google Maps listings
- @olostep/instagram-profile - Instagram profiles
Error Handling
The Actor handles common errors:
- 401 Unauthorized: Invalid API key
- 429 Too Many Requests: Rate limit exceeded
- 500 Internal Server Error: Olostep service error
- Network Errors: Connection issues with detailed error messages
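Callers may want to treat these errors differently: a 401 will not succeed on retry, while a 429 or 500 often will after a pause. The following Python sketch shows retry with exponential backoff under that assumption. How the Actor actually surfaces status codes to your code is not specified here, so `OlostepActorError` and `call_with_retry` are hypothetical names used only for illustration.

```python
import time

# Status codes from the list above that are usually worth retrying
RETRYABLE = {429, 500}


class OlostepActorError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status_code, message=""):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


def call_with_retry(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying retryable errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except OlostepActorError as err:
            # 401 and other non-retryable errors, or the final attempt, propagate
            if err.status_code not in RETRYABLE or attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the backoff schedule can be tested without real waiting.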
Pricing
Olostep charges based on API usage, independent of Apify:
- Scrapes: Pay per scrape
- Batches: Pay per URL in batch
- Crawls: Pay per page crawled
- Maps: Pay per map operation
Check current pricing at olostep.com/pricing.
Support
- Documentation: docs.olostep.com
- Support: olostep.com/support
- API Dashboard: olostep.com/dashboard
License
MIT License
Ready to scrape the web? Get your API key from olostep.com/dashboard and start extracting data today!