CatchAll

Pricing

$0.01 / 1,000 valid_records

Submit a CatchAll job, poll until completion, and retrieve all valid records. Results are saved to the Dataset and Key-Value Store.


Developer

Newscatcher-CatchAll

Maintained by Community

CatchAll — structured web research

CatchAll transforms plain-text questions into structured, validated datasets extracted from billions of web pages. Enter a query like "Series B funding rounds for SaaS startups" and receive structured JSON records with company names, deal sizes, dates, and source citations — no scraping logic required.

CatchAll is not a traditional web scraper. It searches NewsCatcher's proprietary index of 2+ billion web pages, clusters related pages into real-world events, validates relevance, and extracts structured data — all in a single run.

What can CatchAll do?

  • Find specific events at scale — acquisitions, funding rounds, product launches, regulatory approvals, executive changes, and more
  • Return structured JSON — each record includes extracted fields, confidence scores, and source citations
  • Handle the full job lifecycle — this Actor submits a job, polls until completion, and retrieves all results automatically
  • Save results to Apify storage — records are stored in both a Dataset and a Key-value store for easy export

CatchAll pairs well with the Apify platform. Schedule recurring runs, chain with other Actors using integrations, export results via API, or send data to external services through webhooks.

How to use CatchAll

  1. Go to the CatchAll Actor page and click Try for free.
  2. Enter your CatchAll API key (get one at platform.newscatcherapi.com).
  3. Type a plain-text query describing what you want to find.
  4. Optionally adjust the record limit, or add custom validators and enrichments as JSON.
  5. Click Save & Start. The Actor submits the job, polls for status, and retrieves results when complete.
  6. Open the Output tab to review records, or go to Storage to download the dataset as JSON, CSV, or Excel.

A typical run takes 10–15 minutes depending on query complexity and the number of web pages processed.
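
The same run can also be started programmatically rather than through the Console. The sketch below only builds the HTTP request for Apify's "run Actor" endpoint using the standard library; it does not send anything. The Actor ID `username~catchall` and the token value are hypothetical placeholders, so copy the real ones from the Actor page and your Apify account.

```python
import json
import urllib.request

APIFY_API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, apify_token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an Actor run."""
    url = f"{APIFY_API_BASE}/acts/{actor_id}/runs"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {apify_token}",
            "Content-Type": "application/json",
        },
    )


request = build_run_request(
    actor_id="username~catchall",  # hypothetical placeholder, not the real Actor ID
    apify_token="YOUR_APIFY_TOKEN",
    run_input={
        "apiKey": "YOUR_CATCHALL_API_KEY",
        "query": "AI company acquisitions",
        "limit": 10,
    },
)
print(request.get_method(), request.full_url)
# To actually start the run, pass the request to urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending CatchAll credits.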

Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | String | Yes | Your CatchAll API key |
| query | String | Yes | Plain-text question describing what to find |
| context | String | No | Additional guidance to focus extraction |
| limit | Integer | No | Maximum number of records to return (default: 50, minimum: 10) |
| validatorsJson | String | No | JSON array of validator objects. Example: [{"name":"is_acquisition","description":"...","type":"boolean"}] |
| enrichmentsJson | String | No | JSON array of enrichment objects. Example: [{"name":"acquirer_company","description":"...","type":"company"}] |
| pollIntervalSeconds | Integer | No | How often to check job status, in seconds (default: 60) |
| timeoutMinutes | Integer | No | Stop polling after this many minutes (default: 30) |
| pageSize | Integer | No | Records to fetch per page when pulling results (default: 100) |

If you leave validatorsJson and enrichmentsJson empty (or as []), CatchAll generates them automatically based on your query.

Input example

{
  "apiKey": "YOUR_CATCHALL_API_KEY",
  "query": "AI company acquisitions",
  "context": "Focus on deal size and acquiring company details",
  "limit": 10
}

Input example with custom validators and enrichments

{
  "apiKey": "YOUR_CATCHALL_API_KEY",
  "query": "AI company acquisitions",
  "context": "Focus on deal size and acquiring company details",
  "limit": 10,
  "validatorsJson": "[{\"name\":\"is_acquisition\",\"description\":\"true if the page describes a company acquisition\",\"type\":\"boolean\"}]",
  "enrichmentsJson": "[{\"name\":\"acquiring_company\",\"description\":\"Name of the acquiring company\",\"type\":\"company\"},{\"name\":\"deal_value\",\"description\":\"Deal value in USD\",\"type\":\"number\"}]"
}
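
Because validatorsJson and enrichmentsJson are strings that contain JSON, hand-escaping the quotes is error-prone. A minimal sketch of one way to avoid that: keep validators and enrichments as ordinary Python lists and serialize them with `json.dumps`. The helper name `build_actor_input` is hypothetical and not part of the Actor.

```python
import json


def build_actor_input(api_key, query, validators=None, enrichments=None, **options):
    """Assemble Actor input; validators and enrichments become JSON strings."""
    actor_input = {"apiKey": api_key, "query": query, **options}
    # Leave the fields out when empty so CatchAll can generate them automatically.
    if validators:
        actor_input["validatorsJson"] = json.dumps(validators)
    if enrichments:
        actor_input["enrichmentsJson"] = json.dumps(enrichments)
    return actor_input


actor_input = build_actor_input(
    api_key="YOUR_CATCHALL_API_KEY",
    query="AI company acquisitions",
    validators=[
        {
            "name": "is_acquisition",
            "description": "true if the page describes a company acquisition",
            "type": "boolean",
        }
    ],
    enrichments=[
        {"name": "acquiring_company", "description": "Name of the acquiring company", "type": "company"},
        {"name": "deal_value", "description": "Deal value in USD", "type": "number"},
    ],
    context="Focus on deal size and acquiring company details",
    limit=10,
)
print(json.dumps(actor_input, indent=2))
```

The printed object has the same shape as the example above, with the escaping handled by the serializer.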

Output

Each record in the output dataset contains:

| Field | Description |
| --- | --- |
| record_id | Unique identifier for the record |
| record_title | Short title summarizing the event |
| enrichment | Structured data extracted from web pages (dynamic fields) |
| enrichment.enrichment_confidence | Overall confidence score: low, medium, or high |
| citations | Array of source documents with title, URL, and publication date |

The enrichment object uses dynamic schemas — field names are generated based on your query. For example, a funding query might return funding_amount, investee_company, and funding_date. If you need consistent field names across runs, define custom enrichments in enrichmentsJson.

Output example

{
  "record_id": "6983973854314692457",
  "record_title": "VulnCheck Raises $25M Series B Funding",
  "enrichment": {
    "enrichment_confidence": "high",
    "funding_amount": 25000000,
    "funding_currency": "USD",
    "funding_date": "2026-02-17",
    "investee_company": {
      "source_text": "VulnCheck",
      "confidence": 0.99,
      "metadata": {
        "name": "VulnCheck",
        "domain_url": "vulncheck.com",
        "domain_url_confidence": "high"
      }
    },
    "investor_company": {
      "source_text": "Sorenson Capital",
      "confidence": 0.99,
      "metadata": {
        "name": "Sorenson Capital",
        "domain_url": null,
        "domain_url_confidence": null
      }
    }
  },
  "citations": [
    {
      "title": "Exclusive: VulnCheck raises $25M funding to help companies patch software bugs",
      "link": "https://example.com/article",
      "published_date": "2026-02-17T10:00:00Z"
    }
  ]
}
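
Nested records like the one above are awkward to load into a spreadsheet. The following is a rough sketch of flattening one record into a single row, assuming that nested enrichment values follow the company-style shape shown in the example (`source_text`, `confidence`, `metadata.name`). The function name `flatten_record` is hypothetical, and the sample record is an abbreviated copy of the output example.

```python
def flatten_record(record: dict) -> dict:
    """Collapse one CatchAll record into a flat dict of column -> value."""
    row = {
        "record_id": record.get("record_id"),
        "record_title": record.get("record_title"),
    }
    for name, value in record.get("enrichment", {}).items():
        if isinstance(value, dict):
            # Company-style field: prefer the resolved name, fall back to source text.
            metadata = value.get("metadata") or {}
            row[name] = metadata.get("name") or value.get("source_text")
            row[f"{name}_confidence"] = value.get("confidence")
        else:
            row[name] = value
    row["source_links"] = [c.get("link") for c in record.get("citations", [])]
    return row


sample_record = {
    "record_id": "6983973854314692457",
    "record_title": "VulnCheck Raises $25M Series B Funding",
    "enrichment": {
        "enrichment_confidence": "high",
        "funding_amount": 25000000,
        "investee_company": {
            "source_text": "VulnCheck",
            "confidence": 0.99,
            "metadata": {"name": "VulnCheck", "domain_url": "vulncheck.com", "domain_url_confidence": "high"},
        },
    },
    "citations": [
        {
            "title": "Exclusive: VulnCheck raises $25M funding to help companies patch software bugs",
            "link": "https://example.com/article",
            "published_date": "2026-02-17T10:00:00Z",
        }
    ],
}

row = flatten_record(sample_record)
print(row)
```

Since enrichment field names are dynamic, the columns produced this way can differ between runs unless custom enrichments are defined.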

Tips for effective queries

  • Be specific about what you're looking for. "Series B funding rounds for SaaS startups" works better than "startup funding."
  • Use the journalist test. If a journalist would write a news article about it, CatchAll can find it.
  • Target single entities or related entities. "Apple OR Google acquisitions in healthcare" is effective. Mixing unrelated topics in one query reduces accuracy.
  • Add context to guide extraction. Use the context field to specify what data points matter most.
  • Start with a small limit for testing. You can expand results later with the CatchAll Continue Actor without reprocessing.

How much does it cost?

This Actor is free to use on Apify — you only pay for Apify platform usage (compute units). However, each run consumes credits from your CatchAll API plan. Check your plan limits at platform.newscatcherapi.com.

Other CatchAll Actors

CatchAll also offers utility Actors for building custom workflows. Each maps to a single API endpoint:

| Actor | Description |
| --- | --- |
| CatchAll Initialize | Get suggested validators, enrichments, and date ranges before submitting |
| CatchAll Create Job | Submit a job without polling or fetching results |
| CatchAll Get Job Status | Check current job status and step progress |
| CatchAll Pull Results | Retrieve all records from a completed job |
| CatchAll Early Results | Get partial results before a job completes |
| CatchAll Continue | Expand a job to process more records |
| CatchAll Create Monitor | Schedule recurring jobs |
| CatchAll Update Monitor | Update a monitor's webhook configuration |
| CatchAll Start/Stop Monitor | Pause or resume a monitor |

Chain these Actors using Apify's built-in integrations and webhooks to build automated data pipelines.

FAQ

How long does a run take? A typical CatchAll job processes 50,000+ web pages and takes 10–15 minutes. The Actor polls the API automatically until the job completes or the timeout is reached (default: 30 minutes).
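
To illustrate how a poll interval and a timeout interact, here is a minimal sketch of a generic polling loop. It is not the Actor's actual implementation: the status names `completed` and `failed` are assumptions, and `get_status` stands in for whatever call checks the job. The sleep and clock functions are injectable so the demo runs instantly.

```python
import time


def poll_until_done(get_status, poll_interval_seconds=60, timeout_minutes=30,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it reports a terminal state or the timeout passes."""
    deadline = clock() + timeout_minutes * 60
    while True:
        status = get_status()
        if status in ("completed", "failed"):  # assumed terminal status names
            return status
        # Give up if the next check would land after the deadline.
        if clock() + poll_interval_seconds > deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_minutes} minutes")
        sleep(poll_interval_seconds)


# Demo with a fake job and a fake clock so nothing actually sleeps.
fake_statuses = iter(["submitted", "running", "running", "completed"])
fake_now = [0.0]


def fake_sleep(seconds):
    fake_now[0] += seconds


result = poll_until_done(
    get_status=lambda: next(fake_statuses),
    sleep=fake_sleep,
    clock=lambda: fake_now[0],
)
print(result, "after", fake_now[0], "simulated seconds")
```

With the defaults (60-second interval, 30-minute timeout), a job is checked at most about 30 times before the loop gives up.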

What are dynamic schemas? CatchAll generates response schemas dynamically for each job. Field names in the enrichment object can vary between runs, even with the same query. To get consistent field names, define custom enrichments in enrichmentsJson. Learn more in the dynamic schemas guide.

Can I get more results after a run completes? Yes. Use the CatchAll Continue Actor with the same jobId to process additional records without restarting the job.

Where can I get help?