CatchAll
Pricing
$0.01 / 1,000 valid_records
Submit a CatchAll job, poll until completion, and retrieve all valid records. Results are saved to the Dataset and Key-Value Store.
Developer: Newscatcher-CatchAll
Last modified: 21 days ago
CatchAll — structured web research
CatchAll transforms plain-text questions into structured, validated datasets extracted from billions of web pages. Enter a query like "Series B funding rounds for SaaS startups" and receive structured JSON records with company names, deal sizes, dates, and source citations — no scraping logic required.
CatchAll is not a traditional web scraper. It searches NewsCatcher's proprietary index of 2+ billion web pages, clusters related pages into real-world events, validates relevance, and extracts structured data — all in a single run.
What can CatchAll do?
- Find specific events at scale — acquisitions, funding rounds, product launches, regulatory approvals, executive changes, and more
- Return structured JSON — each record includes extracted fields, confidence scores, and source citations
- Handle the full job lifecycle — this Actor submits a job, polls until completion, and retrieves all results automatically
- Save results to Apify storage — records are stored in both a Dataset and a Key-value store for easy export
CatchAll pairs well with the Apify platform. Schedule recurring runs, chain with other Actors using integrations, export results via API, or send data to external services through webhooks.
How to use CatchAll
- Go to the CatchAll Actor page and click Try for free.
- Enter your CatchAll API key (get one at platform.newscatcherapi.com).
- Type a plain-text query describing what you want to find.
- Optionally adjust the record limit, or add custom validators and enrichments as JSON.
- Click Save & Start. The Actor submits the job, polls for status, and retrieves results when complete.
- Open the Output tab to review records, or go to Storage to download the dataset as JSON, CSV, or Excel.
A typical run takes 10–15 minutes depending on query complexity and the number of web pages processed.
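The same run can also be started without the Console. As a minimal sketch, the snippet below builds (but does not send) the HTTP request that Apify's public REST API accepts for starting an Actor run. The Actor ID `your-username~catchall` and both credentials are placeholders, not values taken from this page; substitute the ID shown on the Actor's API tab.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, apify_token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that starts an Actor run with the given input."""
    url = f"{APIFY_BASE}/acts/{actor_id}/runs"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {apify_token}",
        },
    )


request = build_run_request(
    actor_id="your-username~catchall",  # placeholder Actor ID (assumption)
    apify_token="YOUR_APIFY_TOKEN",     # placeholder Apify token
    run_input={
        "apiKey": "YOUR_CATCHALL_API_KEY",
        "query": "AI company acquisitions",
    },
)

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       run = json.load(response)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the example safe to run without credentials and makes the payload easy to inspect.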
Input
| Field | Type | Required | Description |
|---|---|---|---|
| `apiKey` | String | Yes | Your CatchAll API key |
| `query` | String | Yes | Plain-text question describing what to find |
| `context` | String | No | Additional guidance to focus extraction |
| `limit` | Integer | No | Maximum number of records to return (default: 50, minimum: 11) |
| `validatorsJson` | String | No | JSON array of validator objects. Example: `[{"name":"is_acquisition","description":"...","type":"boolean"}]` |
| `enrichmentsJson` | String | No | JSON array of enrichment objects. Example: `[{"name":"acquirer_company","description":"...","type":"company"}]` |
| `pollIntervalSeconds` | Integer | No | How often to check job status, in seconds (default: 60) |
| `timeoutMinutes` | Integer | No | Stop polling after this many minutes (default: 30) |
| `pageSize` | Integer | No | Records to fetch per page when pulling results (default: 100) |
If you leave validatorsJson and enrichmentsJson empty (or as []), CatchAll generates them automatically based on your query.
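To illustrate how `pollIntervalSeconds` and `timeoutMinutes` interact, here is a rough sketch of a polling loop. It is not the Actor's actual implementation: the status strings (`submitted`, `running`, `completed`, `failed`) and the `get_status` callable are hypothetical stand-ins for whatever the CatchAll API returns.

```python
import time

# Hypothetical status names; the real API may use different strings.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(get_status, poll_interval_seconds=60, timeout_minutes=30,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns a terminal status or the timeout passes."""
    deadline = clock() + timeout_minutes * 60
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_minutes} minutes")
        sleep(poll_interval_seconds)


# Demo with canned statuses; the recorded "sleep" makes the example finish instantly.
canned = iter(["submitted", "running", "completed"])
waits = []
final_status = poll_until_done(lambda: next(canned), sleep=waits.append)
print(final_status, waits)
```

With the defaults, a job is checked once a minute and abandoned after 30 minutes, so a typical 10–15 minute job needs roughly 10 to 15 status checks.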
Input example
{
  "apiKey": "YOUR_CATCHALL_API_KEY",
  "query": "AI company acquisitions",
  "context": "Focus on deal size and acquiring company details",
  "limit": 10
}
Input example with custom enrichments
{
  "apiKey": "YOUR_CATCHALL_API_KEY",
  "query": "AI company acquisitions",
  "context": "Focus on deal size and acquiring company details",
  "limit": 10,
  "validatorsJson": "[{\"name\":\"is_acquisition\",\"description\":\"true if the page describes a company acquisition\",\"type\":\"boolean\"}]",
  "enrichmentsJson": "[{\"name\":\"acquiring_company\",\"description\":\"Name of the acquiring company\",\"type\":\"company\"},{\"name\":\"deal_value\",\"description\":\"Deal value in USD\",\"type\":\"number\"}]"
}
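Because `validatorsJson` and `enrichmentsJson` are strings that contain JSON, hand-writing the escaped quotes is error-prone. One way to avoid that, sketched below, is to define the validators and enrichments as ordinary Python objects and serialize them with `json.dumps`. The helper name `build_input` is illustrative only and not part of any CatchAll or Apify library.

```python
import json

validators = [
    {
        "name": "is_acquisition",
        "description": "true if the page describes a company acquisition",
        "type": "boolean",
    }
]
enrichments = [
    {"name": "acquiring_company", "description": "Name of the acquiring company", "type": "company"},
    {"name": "deal_value", "description": "Deal value in USD", "type": "number"},
]


def build_input(api_key, query, validators=None, enrichments=None, **extra):
    """Assemble Actor input, encoding validator/enrichment lists as JSON strings."""
    run_input = {"apiKey": api_key, "query": query, **extra}
    if validators:
        run_input["validatorsJson"] = json.dumps(validators)
    if enrichments:
        run_input["enrichmentsJson"] = json.dumps(enrichments)
    return run_input


run_input = build_input(
    "YOUR_CATCHALL_API_KEY",
    "AI company acquisitions",
    validators=validators,
    enrichments=enrichments,
    context="Focus on deal size and acquiring company details",
)
print(json.dumps(run_input, indent=2))
```

Leaving either list out omits the corresponding field, which matches the documented behavior of letting CatchAll generate validators and enrichments automatically.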
Output
Each record in the output dataset contains:
| Field | Description |
|---|---|
| `record_id` | Unique identifier for the record |
| `record_title` | Short title summarizing the event |
| `enrichment` | Structured data extracted from web pages (dynamic fields) |
| `enrichment.enrichment_confidence` | Overall confidence score: low, medium, or high |
| `citations` | Array of source documents with title, URL, and publication date |
The enrichment object uses dynamic schemas — field names are generated based on your query. For example, a funding query might return funding_amount, investee_company, and funding_date. If you need consistent field names across runs, define custom enrichments in enrichmentsJson.
Output example
{
  "record_id": "6983973854314692457",
  "record_title": "VulnCheck Raises $25M Series B Funding",
  "enrichment": {
    "enrichment_confidence": "high",
    "funding_amount": 25000000,
    "funding_currency": "USD",
    "funding_date": "2026-02-17",
    "investee_company": {
      "source_text": "VulnCheck",
      "confidence": 0.99,
      "metadata": {
        "name": "VulnCheck",
        "domain_url": "vulncheck.com",
        "domain_url_confidence": "high"
      }
    },
    "investor_company": {
      "source_text": "Sorenson Capital",
      "confidence": 0.99,
      "metadata": {
        "name": "Sorenson Capital",
        "domain_url": null,
        "domain_url_confidence": null
      }
    }
  },
  "citations": [
    {
      "title": "Exclusive: VulnCheck raises $25M funding to help companies patch software bugs",
      "link": "https://example.com/article",
      "published_date": "2026-02-17T10:00:00Z"
    }
  ]
}
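Records like the one above are nested, which is awkward for spreadsheets. The following sketch flattens a record into a single-level dictionary, assuming the shape shown in the output example: company-type fields are objects with `source_text` and `metadata.name`, and each citation has a `link`. Field names vary by query, so the function treats every enrichment key generically. The name `flatten_record` is illustrative.

```python
def flatten_record(record: dict) -> dict:
    """Reduce a CatchAll record to one level: scalars stay, companies become names."""
    enrichment = record.get("enrichment", {})
    flat = {
        "record_id": record.get("record_id"),
        "record_title": record.get("record_title"),
        "confidence": enrichment.get("enrichment_confidence"),
    }
    for name, value in enrichment.items():
        if name == "enrichment_confidence":
            continue
        if isinstance(value, dict):
            # Prefer the resolved company name, fall back to the raw source text.
            value = (value.get("metadata") or {}).get("name") or value.get("source_text")
        flat[name] = value
    flat["sources"] = [c.get("link") for c in record.get("citations", [])]
    return flat


# Trimmed copy of the output example above.
sample = {
    "record_id": "6983973854314692457",
    "record_title": "VulnCheck Raises $25M Series B Funding",
    "enrichment": {
        "enrichment_confidence": "high",
        "funding_amount": 25000000,
        "investee_company": {
            "source_text": "VulnCheck",
            "confidence": 0.99,
            "metadata": {"name": "VulnCheck", "domain_url": "vulncheck.com"},
        },
    },
    "citations": [{"title": "...", "link": "https://example.com/article"}],
}

flat = flatten_record(sample)
print(flat)
```

Filtering on `flat["confidence"] == "high"` afterwards is a simple way to keep only the records CatchAll is most sure about.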
Tips for effective queries
- Be specific about what you're looking for. "Series B funding rounds for SaaS startups" works better than "startup funding."
- Use the journalist test. If a journalist would write a news article about it, CatchAll can find it.
- Target single entities or related entities. "Apple OR Google acquisitions in healthcare" is effective. Mixing unrelated topics in one query reduces accuracy.
- Add context to guide extraction. Use the `context` field to specify what data points matter most.
- Start with a small limit for testing. You can expand results later with the CatchAll Continue Actor without reprocessing.
How much does it cost?
This Actor is free to use on Apify — you only pay for Apify platform usage (compute units). However, each run consumes credits from your CatchAll API plan. Check your plan limits at platform.newscatcherapi.com.
Other CatchAll Actors
CatchAll also offers utility Actors for building custom workflows. Each maps to a single API endpoint:
| Actor | Description |
|---|---|
| CatchAll Initialize | Get suggested validators, enrichments, and date ranges before submitting |
| CatchAll Create Job | Submit a job without polling or fetching results |
| CatchAll Get Job Status | Check current job status and step progress |
| CatchAll Pull Results | Retrieve all records from a completed job |
| CatchAll Early Results | Get partial results before a job completes |
| CatchAll Continue | Expand a job to process more records |
| CatchAll Create Monitor | Schedule recurring jobs |
| CatchAll Update Monitor | Update a monitor's webhook configuration |
| CatchAll Start/Stop Monitor | Pause or resume a monitor |
Chain these Actors using Apify's built-in integrations and webhooks to build automated data pipelines.
FAQ
How long does a run take?
A typical CatchAll job processes 50,000+ web pages and takes 10–15 minutes. The Actor polls the API automatically until the job completes or the timeout is reached (default: 30 minutes).
What are dynamic schemas?
CatchAll generates response schemas dynamically for each job. Field names in the enrichment object can vary between runs, even with the same query. To get consistent field names, define custom enrichments in enrichmentsJson. Learn more in the dynamic schemas guide.
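When relying on auto-generated schemas, it can help to check which enrichment fields actually appear across a batch of records before building anything on top of them. The sketch below is one possible approach and not a CatchAll feature: it reports, for each field name, the fraction of records that contain it. The sample records are invented for illustration.

```python
from collections import Counter

ALWAYS_PRESENT = {"enrichment_confidence"}  # documented on every record, so skipped


def field_coverage(records: list) -> dict:
    """Map each enrichment field name to the share of records that include it."""
    counts = Counter()
    for record in records:
        counts.update(set(record.get("enrichment", {})) - ALWAYS_PRESENT)
    total = len(records)
    if total == 0:
        return {}
    return {name: counts[name] / total for name in sorted(counts)}


# Invented records from two hypothetical runs with differing field names.
records = [
    {"enrichment": {"enrichment_confidence": "high", "funding_amount": 25000000}},
    {"enrichment": {"enrichment_confidence": "low", "deal_value": 1000000}},
]
coverage = field_coverage(records)
print(coverage)
```

Fields with coverage well below 1.0 are candidates for pinning down with custom enrichments in `enrichmentsJson`.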
Can I get more results after a run completes?
Yes. Use the CatchAll Continue Actor with the same jobId to process additional records without restarting the job.
Where can I get help?
- CatchAll documentation
- Write effective queries
- Open an issue on the Actor's Issues tab in Apify Console


