SaaS Pricing Intelligence Extractor

Pricing

Pay per usage

Deterministic, SSRF-guarded extraction of SaaS pricing tiers from any public pricing-page URL. Returns structured plans (name, price, billing period, features) plus all detected price strings. Pure code, no AI and no paid API by default. Optional AI enrichment with your own key.

Rating

0.0 (0)

Developer

Ahmed Moussa

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

Deterministic extraction of SaaS pricing tiers from any public pricing-page URL.

What it does

Give it a pricing page; it returns structured plans — plan name, price, billing period and feature bullets — plus every price string detected on the page.

Why it is safe & cheap

  • Pure code, no AI by default. Pricing is parsed with deterministic regex + structural HTML heuristics. No LLM or paid API is called on a normal run, so there is $0 idle cost and no per-run cost other than Apify compute.
  • SSRF-guarded. Reuses OMEGA's proven DataPulse fetch core: private / loopback / link-local / reserved IPs are blocked (fail-closed), a domain blocklist is enforced, and every redirect hop is re-validated.
  • Bounded. Hard caps on connect/read timeouts, response size (2 MB) and redirect count keep a run from hanging on a slow or oversized response.
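The guard and the size cap described in the bullets above can be illustrated with a minimal Python sketch. This is not the actor's DataPulse code: the names is_blocked_ip, read_capped and MAX_BODY_BYTES are hypothetical, and only the address classes and the 2 MB figure come from the description. A real guard would also resolve the hostname, apply the domain blocklist, and repeat the check on every redirect hop.

```python
import io
import ipaddress

# Assumption: 2 MB expressed as 2 * 1024 * 1024 bytes.
MAX_BODY_BYTES = 2 * 1024 * 1024


def is_blocked_ip(ip_text: str) -> bool:
    """Return True for addresses an SSRF guard should refuse."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        # Fail closed: anything that does not parse is treated as unsafe.
        return True
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def read_capped(stream, limit: int = MAX_BODY_BYTES) -> bytes:
    """Read at most `limit` bytes and raise if the body is larger."""
    # Ask for one extra byte so an oversized body can be detected.
    body = stream.read(limit + 1)
    if len(body) > limit:
        raise ValueError("response exceeds size cap")
    return body
```

For example, is_blocked_ip("169.254.169.254") rejects the link-local cloud metadata address, while is_blocked_ip("8.8.8.8") lets a public address through. The stream argument can be any object with a read(n) method, such as io.BytesIO in tests.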

Input

{
  "url": "https://www.example.com/pricing",
  "urls": ["https://other.com/pricing"],
  "llm_api_key": "<optional: YOUR OpenRouter key>",
  "llm_model": "deepseek/deepseek-chat"
}

url takes a single public pricing-page URL and urls takes a list of them. llm_api_key is optional; if supplied, it enables AI enrichment using your own key, and llm_model selects the model for that enrichment. The actor never uses any built-in AI key.

Output (one dataset item per URL)

{
  "url": "https://www.example.com/pricing",
  "status": "completed",
  "plans": [
    {"name": "Starter", "price": "$19", "period": "month", "features": ["10 projects", "Email support"]},
    {"name": "Pro", "price": "$49", "period": "month", "features": ["Unlimited projects", "Priority support"]}
  ],
  "raw_prices": ["$19", "$49", "Custom"],
  "parse_confidence": "high",
  "method": "deterministic_code",
  "extracted_at": "2026-06-23T00:00:00+00:00"
}

status is one of completed, blocked (security gate, SSRF guard, blocklist, or auth wall) or failed. parse_confidence is a code-owned heuristic based on the number of plans found: high for two or more, low for exactly one, none for zero. When you supply your own key, an extra llm_plans field is added; the deterministic plans always stand on their own.
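The confidence rule stated above is simple enough to write out. The following Python sketch re-creates the thresholds from the description; the function names parse_confidence and status_counts are hypothetical helpers, not part of the actor, and the "unknown" fallback is an assumption for items missing a status field.

```python
from collections import Counter


def parse_confidence(plans: list) -> str:
    """Map the number of extracted plans to the documented labels."""
    count = len(plans)
    if count >= 2:
        return "high"
    if count == 1:
        return "low"
    return "none"


def status_counts(items: list) -> Counter:
    """Tally dataset items by status (completed, blocked, failed)."""
    # Assumption: items without a status are counted as "unknown".
    return Counter(item.get("status", "unknown") for item in items)
```

Run against the sample item above, parse_confidence(item["plans"]) gives "high" because two plans were found.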

Use cases

  • Competitive pricing monitoring — track competitor plans and price changes.
  • Build a pricing-intelligence dataset across many SaaS vendors.
  • Normalize messy pricing pages into structured plans for analysis or dashboards.

How it works (deterministic, code-only)

The page is fetched through the SSRF-guarded DataPulse core, then plan cards, price strings and feature bullets are extracted with deterministic regex + structural HTML heuristics. A code-owned parse_confidence reflects how many clean plans were found. AI enrichment is optional and only runs with your key.
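To give a feel for the regex half of this pipeline, here is a minimal Python sketch that pulls price strings and a billing period out of plain text. The patterns PRICE_RE and PERIOD_RE and both function names are assumptions made for illustration; the actor's actual patterns are not published here, and the structural HTML heuristics that group prices into plan cards are not shown.

```python
import re

# Assumption: prices are a currency symbol plus digits, or the words Free/Custom.
PRICE_RE = re.compile(r"[$€£]\s?\d+(?:,\d{3})*(?:\.\d{1,2})?|\b(?:Free|Custom)\b")
# Assumption: periods appear as "/month", "/mo", "per year", "/yr" and similar.
PERIOD_RE = re.compile(r"(?:/|per)\s*(month|mo|year|yr)\b", re.IGNORECASE)


def find_raw_prices(text: str) -> list:
    """Return each distinct price string in order of first appearance."""
    found = []
    for match in PRICE_RE.finditer(text):
        token = match.group(0)
        if token not in found:
            found.append(token)
    return found


def detect_period(text: str):
    """Return "month" or "year" when a billing period is present, else None."""
    match = PERIOD_RE.search(text)
    if match is None:
        return None
    unit = match.group(1).lower()
    return "month" if unit in ("month", "mo") else "year"
```

Because the same input always produces the same matches, results are repeatable across runs. The trade-off is that a keyword pattern like this one will also pick up incidental uses of "Free" in marketing copy, which is where structural heuristics would need to filter.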

Limitations (honest)

Pricing pages that render entirely client-side (heavy JS, no server-side HTML) may expose fewer tiers; the actor still returns all detected raw_prices and an honest low/none confidence rather than fabricating plans.
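On the consumer side, one way to act on this limitation is to queue low-confidence results for a second look. The Python sketch below assumes items shaped like the sample output; needs_review and review_queue are hypothetical helper names, not actor features.

```python
def needs_review(item: dict) -> bool:
    """Flag pages that loaded but yielded fewer than two clean plans."""
    return (
        item.get("status") == "completed"
        and item.get("parse_confidence") in ("low", "none")
    )


def review_queue(items: list) -> list:
    """Collect the URLs of flagged items, preserving dataset order."""
    return [item["url"] for item in items if needs_review(item)]
```

Blocked and failed items are deliberately left out of the queue, since they did not reach the parsing step and call for a different follow-up.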