Evidence-First Website Facts Extractor

Pricing

from $5.00 / 1,000 useful fact pack results

Extract source-linked pricing, FAQ, feature, policy, docs, company, or custom fact packs from public websites for AI agents and research workflows.

Rating

0.0 (0)

Developer

jack su (maintained by Community)

Actor stats

  • 0 bookmarked
  • 2 total users
  • 1 monthly active user
  • Last modified 4 days ago

Extract compact, source-linked fact packs from public websites for AI agents, sales research, SEO audits, competitive research, RAG preparation, and due diligence workflows.

This Actor is intentionally evidence-first. It returns direct website snippets with source URLs and diagnostics. It does not use a private API, infer facts without evidence, enrich private individuals, or act as a broad unbounded AI web crawler.

Input

{
  "urls": ["https://apify.com"],
  "factPack": "features",
  "customTerms": [],
  "maxPagesPerSite": 8,
  "maxFactsPerSite": 20,
  "requestTimeoutSecs": 20
}

Supported factPack values:

  • pricing
  • faq
  • features
  • policies
  • docs
  • company
  • custom

For custom, provide customTerms such as ["SOC 2", "API", "enterprise"].
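
As a rough illustration, the input rules above could be checked before starting a run. This is a minimal sketch, not part of the Actor: the helper name build_input is hypothetical, and the assumption that a custom pack without customTerms should be refused locally is ours, not documented behavior.

```python
# Supported factPack values, copied from the list above.
SUPPORTED_FACT_PACKS = {
    "pricing", "faq", "features", "policies", "docs", "company", "custom",
}


def build_input(urls, fact_pack="features", custom_terms=None,
                max_pages_per_site=8, max_facts_per_site=20,
                request_timeout_secs=20):
    """Assemble an input object with the same keys as the example input.

    Hypothetical helper: the Actor itself may validate differently.
    """
    if fact_pack not in SUPPORTED_FACT_PACKS:
        raise ValueError(f"unsupported factPack: {fact_pack!r}")
    terms = list(custom_terms or [])
    # Assumption: a custom pack is only meaningful with at least one term.
    if fact_pack == "custom" and not terms:
        raise ValueError("factPack 'custom' needs at least one customTerms entry")
    return {
        "urls": list(urls),
        "factPack": fact_pack,
        "customTerms": terms,
        "maxPagesPerSite": max_pages_per_site,
        "maxFactsPerSite": max_facts_per_site,
        "requestTimeoutSecs": request_timeout_secs,
    }
```

For example, build_input(["https://apify.com"], "custom", ["SOC 2", "API", "enterprise"]) returns a dictionary that can be serialized to JSON and passed as the run input.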

Output

Each website produces one fact-pack record with:

  • direct facts and evidence snippets;
  • source URLs for every fact;
  • the sanitized starting URL plus the site origin used as the crawl boundary;
  • matched terms;
  • pages scanned;
  • confidence and completeness scores;
  • missing fields and diagnostics;
  • uncharged error records for blocked or failed sites.

Pricing

  • $0.00005 when a run starts.
  • $0.005 for each useful evidence-complete website fact pack.
  • Failed, weak partial, duplicate, and empty records do not trigger the useful-fact-pack-result charge.
  • There is no apify-default-dataset-item charge, so writing an error record does not create a result fee.
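
The two charges above add up in a simple way, which the following sketch makes explicit. It assumes that only the run-start fee and the per-result fee apply (no other platform costs are modeled), and the function name estimate_run_cost is hypothetical.

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.00005")   # charged once when a run starts
USEFUL_PACK_FEE = Decimal("0.005")   # charged per useful fact pack result


def estimate_run_cost(useful_packs: int) -> Decimal:
    """Estimate the charge in USD for one run.

    Failed, weak partial, duplicate, and empty records are not counted,
    so only pass the number of useful fact packs.
    """
    if useful_packs < 0:
        raise ValueError("useful_packs must be non-negative")
    return RUN_START_FEE + USEFUL_PACK_FEE * useful_packs
```

Under these assumptions, a single run that yields 1,000 useful fact packs costs $5.00005, which matches the listed price of $5.00 per 1,000 results plus the start fee.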

Safety

  • Public HTTP and HTTPS pages only.
  • URLs with credentials, query parameters, or fragments are rejected.
  • Clean path URLs such as /pricing, /docs, or /privacy are preserved as starting pages so users can target a specific fact page.
  • Account, invite, reset, unsubscribe, or token-like paths are rejected and redacted to the site origin.
  • Local and private-network addresses are blocked.
  • Redirects must stay on the requested website.
  • Each HTML page is limited to 3 MB.
  • Crawling is bounded by maxPagesPerSite.
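
To see which starting URLs are likely to be accepted, the URL rules above can be approximated locally. The sketch below is an assumption-laden illustration, not the Actor's implementation: check_start_url and SENSITIVE_PATH_HINTS are hypothetical names, the path check is a coarse substring match, and host names that resolve to private addresses through DNS are not detected.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical hint list based on the path types named in the rules above.
SENSITIVE_PATH_HINTS = ("account", "invite", "reset", "unsubscribe", "token")


def check_start_url(url: str) -> str:
    """Return the URL if it passes the sketched checks, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("only http and https URLs are accepted")
    if parts.username or parts.password:
        raise ValueError("URLs with credentials are rejected")
    if parts.query or parts.fragment:
        raise ValueError("URLs with query parameters or fragments are rejected")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    if host == "localhost" or host.endswith(".localhost"):
        raise ValueError("local addresses are blocked")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # a DNS name rather than an IP literal; not resolved here
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        raise ValueError("local and private-network addresses are blocked")
    path = parts.path.lower()
    if any(hint in path for hint in SENSITIVE_PATH_HINTS):
        raise ValueError("account or token-like paths are rejected")
    return url
```

A clean path URL such as https://example.com/pricing passes this check, while https://example.com/?a=1 or http://10.0.0.5/docs raises ValueError.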