SaaS Competitive Intelligence

Pricing

from $250.00 / 1,000 competitors analyzed

Automatically monitor and analyze competitor SaaS websites to extract pricing plans, job openings, team size, tech stack, and social media presence. Enter a list of competitor URLs and get structured competitive intelligence data back in JSON, CSV, or Excel format, with no manual research required.

Developer

Ryan Clinton

Maintained by Community

Deterministic competitive decision infrastructure for SaaS teams. This actor continuously monitors competitor websites and converts weak public signals (pricing changes, enterprise expansion, AI adoption, hiring spikes, GTM shifts, and strategic drift) into structured operational decisions.

This is not a generic web scraper: it is a deterministic competitive-monitoring engine that outputs operational decisions, escalation recommendations, and structured strategic signals. It functions as a programmable competitive-intelligence platform for SaaS teams that need deterministic monitoring, structured signals, and automation-native workflows without enterprise-dashboard overhead.

Unlike generic web scrapers or AI research agents, the output is deterministic, explainable, schema-stable, automation-ready, longitudinally comparable, and designed for routing, escalation, and monitoring workflows. Every classification is auditable. Every score is reproducible. No LLM calls. No black-box scoring. No hallucinations.

TL;DR

This actor:

  • detects competitor strategic movement across pricing / hiring / enterprise / AI / GTM
  • classifies pricing model, GTM motion, enterprise readiness, AI maturity, archetype, market position
  • scores competitive threats deterministically via weighted signals + per-class confidence
  • detects strategic drift between runs (PLG → sales-led, no-AI → AI-native, SMB → enterprise)
  • suppresses noisy signals with documented reasons (alert-fatigue resistance)
  • escalates meaningful changes via persona-shaped routing (sales-enablement / executive-monitoring / product-strategy / vc-monitoring / revops-routing)
  • emits schema-stable enum outputs for Slack / Zapier / Make / n8n / Dify / agent tool calls
  • requires no LLM: every score, event, and classification is reproducible from the same inputs

Pay-per-event: $0.25 per competitor analyzed. Built on CheerioCrawler for fast, cheap, deterministic runs.

Flagship capabilities

Strategic drift detection

Detects when competitors change direction across watchlist runs:

  • PLG → sales-led
  • SMB → enterprise
  • non-AI → AI-native
  • pricing-transparent → pricing-opaque

Competitive event detection

Identifies operationally significant inferred events:

  • enterprise expansion
  • pricing restructuring
  • AI-product launches
  • sales-team buildouts
  • aggressive growth phase
  • pricing-transparency loss

Decision infrastructure

Prioritises human attention through deterministic rules:

  • escalationRecommendation: the single field downstream automation branches on
  • competitiveMateriality: persona-shaped weighting
  • alertQuality: novelty + urgency + noiseRisk
  • suppressedSignals: flagged-as-likely-noise with reasons
  • operationalConfidence.safeToAutomate: production-automation gate
  • decisionBundles: events grouped under unifying themes

Longitudinal monitoring

Watchlist mode tracks per-competitor:

  • threatVelocity (numeric rate of change)
  • focusShift (what they're emphasising right now)
  • temporalSignals (trend / momentum / volatility / re-escalation)
  • changeFlags (14 stable enums for routing)
  • changeDensity (cohort activity pulse)

Fully auditable output

Every score, event, and classification ships with:

  • evidence objects (type + value + source + confidence)
  • provenanceGraph (derived classifications → source evidence)
  • scoringTrace (per-rule weights + contributions)
  • perClassConfidence (per-signal-class 0-1 score)
  • deterministic reasoning: same inputs always produce the same output

Core concepts: the 5-layer architecture

This actor is built around five deterministic layers:

Layer | Purpose
Extraction | Collect structured competitor data from public pages
Intelligence | Classify pricing, GTM motion, enterprise maturity, AI adoption, archetype, market position
Operational | Detect changes, strategic drift, pressure zones, competitive events, counter-moves
Decision infrastructure | Prioritise human attention via escalation, suppression, materiality, persona routing
Temporal | Compare competitors longitudinally across watchlist runs

Every layer is independently consumable: sales teams branch on materiality + persona; ops teams branch on actionability + escalation; analysts read the full decision trace.

What makes this different?

Tool type | What you get
Generic scraper | Raw HTML; analysis is your problem
AI research agent | Narrative summaries, low reproducibility, hallucination risk
Enterprise CI platform (Crayon / Klue / Kompyte / Contify) | Expensive dashboards + manual analyst workflows + per-seat licensing
This actor | Deterministic competitive decision infrastructure: auditable, explainable, schema-stable, automation-ready, pay-per-event

Lightweight alternative to Crayon, Klue, Kompyte, and Contify

Compared to enterprise CI platforms like Crayon or Klue, this actor is lightweight, programmable, automation-native, and designed for teams that prefer structured outputs over analyst dashboards. There is no per-seat license, no implementation services, and no dashboard maintenance. Pay-per-event pricing scales linearly with how much you actually monitor: $0.25 per competitor analysed.

For startups, RevOps, sales-enablement teams, and AI-agent workflows that need a Crayon alternative or a Klue alternative without the enterprise-dashboard overhead, this actor delivers the same competitor-monitoring outcomes (pricing changes, hiring spikes, AI adoption, enterprise expansion, GTM shifts) as schema-stable JSON that drops straight into Slack alerts, Zapier rules, Make scenarios, n8n flows, Dify if/else nodes, and agent tool calls.

Built for teams that want competitive monitoring and escalation infrastructure without maintaining analyst dashboards or battlecard systems. If your battlecard process is "the freshest competitor data wins," this actor is the data-supply layer underneath it, not a replacement for the battlecard tool itself.

The premium positioning: structured competitive signals that route automatically without an LLM rewriting layer in between.

Use this actor if you need to…

  • detect competitor pricing changes and pricing-model shifts
  • monitor enterprise expansion (SSO / SCIM / SOC 2 / HIPAA / audit logs)
  • track AI adoption across SaaS competitors (OpenAI / Anthropic / Cohere / vector DBs)
  • identify GTM motion shifts (PLG / sales-led / hybrid)
  • route competitive alerts into Slack / Zapier / Make / n8n
  • detect strategic drift over time (PLG → sales-led, no-AI → AI-native)
  • prioritise which competitors deserve human review
  • suppress noisy competitive signals with documented reasons
  • benchmark SaaS competitors against a cohort (peer percentiles)
  • monitor PLG-to-enterprise transitions
  • build competitive intelligence workflows without LLMs
  • feed Dify / agent tool-calls with structured competitive decisions
  • detect emerging competitors before mainstream visibility
  • audit every classification back to its source evidence

Key definitions

  • Threat score: weighted competitive signal intensity, 0-100. Sum of weighted contributions from detected signals.
  • Threat level: categorical band derived from threat score: critical (≥60), alert (≥40), monitor (≥20), watch (<20).
  • Competitive materiality: how much a signal matters to a specific persona. 0-100, persona-shaped weights.
  • Actionability score: how worth your time is acting NOW. Composite of evidence strength + magnitude + freshness + strategic importance.
  • Strategic drift: directional competitor identity change between watchlist runs (PLG → sales-led, no-AI → AI-native, SMB → enterprise).
  • Pressure zones: areas where a competitor is actively investing (enterprise-readiness / ai-products / developer-platform / pricing-aggression / sales-buildout / product-velocity / international-expansion).
  • Operational confidence: boolean gate indicating whether downstream automation can safely act without human review.
  • Decision infrastructure: deterministic routing + escalation + suppression logic built on competitive signals.
  • Evidence object: every signal ships with { type, value, source, confidence } so any classification can be traced back to the source page and detection method.
  • Provenance graph: explicit map from each derived classification to the source-evidence strings that produced it.
  • Suppressed signal: a signal that fired but the actor flagged as likely noise, with a documented reason.
  • Watchlist: named, isolated cross-run state. Each watchlist has its own history; same name = same comparison baseline.
  • Persona: analyst-role preset that shapes escalation thresholds, audience tags, suppressed event types, and materiality weights. Different from mode (scan depth) and outputProfile (verbosity).
  • Schema-stable enums: output enum values are additive across minor versions; never renamed or repurposed within a major version.

Designed for

This actor is designed for:

  • continuous competitor monitoring at SaaS-cohort scale (3-200 competitors per run)
  • deterministic competitive intelligence workflows that cannot tolerate hallucinations
  • executive escalation pipelines (Slack / email / PagerDuty)
  • RevOps routing into CRM and outreach automation
  • sales-enablement battle-card freshness
  • AI-agent tool calls that branch on stable enums, not prose
  • longitudinal SaaS benchmarking across watchlist runs
  • competitive-intelligence teams replacing manual analyst workflows
  • product-strategy teams tracking AI capability shifts and enterprise-tier moves
  • VC and investor monitoring of growth + AI + upmarket signals

Design principles

This actor is intentionally:

  • deterministic over probabilistic: same inputs always produce the same output; no LLM, no randomness, no hidden tuning.
  • auditable over opaque: every score, classification, and event ships with structured evidence + provenance + scoring trace.
  • composable over monolithic: five independent layers (extraction / intelligence / operational / decision-infrastructure / temporal); pick the layers your workflow needs.
  • schema-stable over narrative-heavy: additive enum vocabulary, documented contract, every record carries schemaVersion.
  • operational over presentational: output is built for routing, escalation, and automation pipelines; not for dashboards alone.
  • longitudinal over snapshot-only: watchlist mode is a first-class capability, not an afterthought.
  • suppression-aware over noise-blind: the actor flags signals as likely noise with documented reasons rather than dumping every change into the alert queue.
  • honest about scope: see "What this actor does NOT do"; sibling actors handle the work that's deliberately out of scope here.

The goal is reliable competitive monitoring infrastructure, not AI-generated prose, a dashboard, or a research report. Infrastructure means deterministic, recurring, automation-native, schema-contracted output that downstream systems can trust.

The actor behaves more like reliability-focused infrastructure than an AI assistant: explicit contracts, stable enums, deterministic outputs, auditable state transitions, bounded failure modes, and reproducible scoring. Same inputs always produce the same output. Watchlists isolate state per name. Schema versions are pinned per record. Every signal carries evidence and confidence. Every classification carries provenance.

What it detects

  • Pricing & monetisation: extracts plan names, prices, billing periods, feature lists; classifies billing model (free-only / freemium / flat-tier / seat-based / usage-based / enterprise-only / custom-quote); detects PLG vs sales-led vs hybrid GTM motion from CTA copy + pricing structure.
  • Hiring intelligence: open positions, classified hiring velocity (none / minimal / moderate / aggressive / unknown), AI / ML / sales-team hiring detection, per-role categorisation (engineering / sales / customerSuccess / marketing / design / product / data / operations / people / finance / legal / executive) with engineeringToSalesRatio. Reads ATS-hosted careers via Greenhouse / Lever / Ashby / SmartRecruiters public APIs; detects and tags Workday / Phenom / iCIMS / Jobvite / Eightfold tenants as atsHostedUnsupported (their authoritative counts live behind tenant-scoped auth, so they are surfaced as hiringVelocity: 'unknown' rather than misleading zeros). For JS-rendered careers SPAs (jobs.* subdomains, Workday/Phenom shells, React/Next/Gatsby careers pages) Cheerio extraction escalates to a Playwright render so the settled DOM, not the empty SPA shell, is what gets parsed.
  • Enterprise readiness: deterministic scoring of SSO, SCIM, SOC 2, HIPAA, ISO 27001, GDPR, audit logs, RBAC, admin APIs, uptime SLA, security / trust / compliance pages into one tier (none / developing / serious / enterprise-grade).
  • AI adoption: detects integrated AI vendors (OpenAI, Anthropic, Cohere, Mistral, Pinecone, Weaviate, LangChain, LlamaIndex, etc.), AI-product keywords, AI-engineering hiring; classifies maturity (none / experimenting / integrated / advanced).
  • Tech stack: fingerprints 30+ technologies across analytics, marketing, support, payments, error tracking, frameworks, CMS, and infrastructure.
  • Team & social footprint: team size estimation, company description, links to Twitter / LinkedIn / Facebook / Instagram / YouTube / GitHub / TikTok.
  • Change tracking: when run with a watchlistName, every record carries changeFlags (PRICING_CHANGED, HIRING_INCREASED, TECH_STACK_ADDED, THREAT_LEVEL_RAISED, etc.) and temporalSignals (trend, momentum, volatility, re-escalation) against the prior run.

Why use it

Manual competitor research is tedious and unreliable. Visiting pricing pages, comparing career listings, counting team members, and inferring strategic direction across dozens of sites is days of analyst work, and the output is stale the moment it's written. This actor does the extraction, scoring, and change-detection in under two minutes for 5 competitors and returns deterministic competitive intelligence ready for dashboards, alert pipelines, exec emails, or AI-agent tool calls. Schedule it weekly with a watchlist and it becomes competitive monitoring infrastructure: recurring, comparable, auditable, automation-native.

Signal taxonomy

Every competitor record is scored against the signal classes listed in the table that follows. Each class is composable into the top-level threatScore and weighted into the threatLevel enum.

Class | What it captures | Stable enums emitted
Hiring intelligence | Open positions, per-role category counts, eng-to-sales ratio, hiring velocity, ATS-hosted board detection (Greenhouse / Lever / Ashby / SmartRecruiters / Workday / Phenom / iCIMS / Jobvite / Eightfold), AI / ML / sales-team hiring | careers.hiringVelocity: none / minimal / moderate / aggressive / unknown; careers.atsProvider: greenhouse / lever / ashby / smartrecruiters / workday / phenom / icims / jobvite / eightfold
Pricing transparency | Public plan structure, plan count, free-tier / contact-sales presence, billing period | derived competitiveSignals[] codes: PUBLIC_PRICING, OPAQUE_PRICING
Pricing classification | Billing model + complexity + price ceiling/floor | pricingClassification.billingModel: free-only / freemium / flat-tier / seat-based / usage-based / enterprise-only / custom-quote / unknown
Feature matrix | Inverted plans[].features[]: which plans include each feature, which features are universal, which are enterprise-only | featureMatrix.features[].universal / enterpriseOnly booleans
Enterprise readiness | SSO, SCIM, SOC 2, HIPAA, ISO 27001, GDPR, audit logs, RBAC, admin API, SLA, security/trust/compliance pages | enterpriseSignals.tier: none / developing / serious / enterprise-grade
AI adoption | Integrated LLM vendors, AI keywords, AI-engineering hiring | aiSignals.maturity: none / experimenting / integrated / advanced
GTM motion | Self-serve vs sales-led CTAs, sales-team hiring, pricing structure | gtmSignals.motion: plg / sales-led / hybrid / unknown
Tech-stack maturity | 30+ technologies across analytics, support, payments, frameworks, infrastructure | derived signal codes: MATURE_TECH_STACK, GROWTH_TECH_STACK, DEEP_ANALYTICS, PERFORMANCE_TOOLING, SUPPORT_INVESTMENT
Team & social footprint | Team-size estimate, company description, social platforms | derived signal codes: LARGE_TEAM, GROWING_TEAM, SOCIAL_FOOTPRINT_BROAD, DEV_BRAND_PRESENCE

Stable-enum tokens are documented and additive across minor versions, so automation pipelines can branch on them without parsing prose.

Scoring framework

Per competitor:

  • threatScore (0–100): sum of weighted signal contributions, capped at 100. Weights live in src/decision-engine.ts so reviewers can audit them.
  • threatLevel enum: critical (≥60), alert (≥40), monitor (≥20), watch (<20). This is the routing primitive downstream automation should branch on.
  • confidence: block with score (0–1), level (high / medium / low / very-low), and a components[] breakdown (pages-scraped, signal-coverage, data-completeness, rendering-clarity).
  • scoringTrace[]: per-rule { rule, weight, contribution } so the score is reproducible.
  • summary, whyThisMatters, nextBestAction: plain-English fields paste-ready into Slack, exec emails, or AI-agent prompts. No LLM rewriting required.
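
The scoring rules above lend themselves to a hand check. The following is a minimal sketch, not the actor's own code: it assumes scoringTrace[] entries look like { rule, weight, contribution } as described, and the weights for MATURE_TECH_STACK and DEEP_ANALYTICS are made-up illustration values.

```python
from typing import Dict, List


def recompute_threat_score(scoring_trace: List[Dict]) -> float:
    """Sum the per-rule contributions and cap the total at 100."""
    total = sum(rule["contribution"] for rule in scoring_trace)
    return min(100, total)


def threat_level(score: float) -> str:
    """Map a 0-100 score onto the documented bands."""
    if score >= 60:
        return "critical"
    if score >= 40:
        return "alert"
    if score >= 20:
        return "monitor"
    return "watch"


# Hypothetical trace; only the AGGRESSIVE_HIRING weight (25) appears in the docs.
trace = [
    {"rule": "AGGRESSIVE_HIRING", "weight": 25, "contribution": 25},
    {"rule": "MATURE_TECH_STACK", "weight": 10, "contribution": 10},
    {"rule": "DEEP_ANALYTICS", "weight": 8, "contribution": 8},
]
score = recompute_threat_score(trace)
level = threat_level(score)
print(score, level)
```

A reviewer could compare the recomputed value against the record's threatScore and flag any mismatch as a reproducibility problem.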

Run-level summary record carries oneLine (paste-ready takeaway), decisionCards[] (top-threat / re-escalation / pricing-change / hiring-spike), trustSummary (level + reason for non-technical readers), portfolio rollup, marketBenchmarks (cross-competitor aggregation), and marketNarrative[] (deterministic plain-English observations like "AI capability adoption is widespread - 60% of competitors are integrated or advanced."). Benchmarks and narrative fire only when sample size ≥ 2.

Why you can trust this output

The actor is engineered for auditability. Every output field is reproducible:

  • No LLM calls anywhere. All scoring, classification, signal detection, archetype assignment, market-position inference, narrative generation, and trajectory analysis are deterministic threshold logic over extracted data.
  • Every signal carries an evidence object. Each competitiveSignals[].evidence shows { type, value, source, confidence }: what fired, what page it came from, how confidently the match held.
  • Scoring trace is exposed. scoringTrace[] lists every rule that contributed to threatScore, with weights and contributions. Reviewers can recompute the score by hand from the trace.
  • Signal weights live in source. src/decision-engine.ts:SIGNAL_WEIGHTS and src/signals.ts:ENTERPRISE_WEIGHTS are plain constants, with no hidden tuning and no opaque models.
  • Confidence is decomposed. Top-level confidence carries 4 components (pages-scraped, signal-coverage, data-completeness, rendering-clarity). perClassConfidence further splits enterprise / AI / GTM / pricing so you can trust one block independently of the others.
  • Failure surfaces are first-class. Partial scrapes get scrapeError + failureType. Bot-protection gets a named vendor. JS-rendering gates produce failureType: 'js-required' instead of empty fields.
  • No silent drops. Records that hit failure modes still appear in the dataset with a structured failure record (sorted last, ranked at the bottom).
  • Cross-run state is named-KV scoped. Watchlist history lives in saas-competitive-intel-history-<watchlistName>, isolated from the default KV store and other watchlists. Same input + same prior state = same output.
  • Stable enums are additive across minor versions. This covers threatLevel, archetype, gtmSignals.motion, pricingClassification.billingModel, failureType, changeFlags, competitiveSignals[].code, recordType, etc. Values never get renamed or repurposed within a major version. New values may be added.

Schema contract

  • schemaVersion is on every record (currently 2.0.0).
  • Stable fields (additive-only across minor versions): recordType, eventId, threatLevel, threatScore, archetype, gtmSignals.motion, pricingClassification.billingModel, enterpriseSignals.tier, aiSignals.maturity, failureType, changeFlags, competitiveSignals[].code, marketPosition.segment, marketPosition.estimatedAcvBand.
  • Stable enum vocabulary: values can be added in minor versions, never renamed or repurposed. Major-version bumps document any breaking changes in CHANGELOG.md.
  • Decision-mode-friendly fields for downstream automation: threatLevel, threatReasons[], decision-equivalent enum (actionRequired derivable from threatLevel + changeFlags), failureType, changeCategories.*.

Intelligence layer

Beyond raw signals, every record carries derived intelligence: pure synthesis from data already extracted, computed deterministically, no LLM calls:

  • archetype (stable enum): developer-first-plg / plg-saas / mid-market-sales-led / enterprise-sales-led / hybrid-plg-and-enterprise / ai-native / community-led / unclassified
  • marketPosition: segment (smb / mid-market / enterprise / consumer-developer / unknown), estimatedAcvBand (sub-1k / 1k-5k / 5k-25k / 25k-100k / 100k-plus / unknown), buyerPersonas[], plus a reasoning string showing the inputs.
  • competitiveEvents[]: inferred business events from signal + change combinations (upmarket-expansion, ai-product-launch-likely, pricing-restructure, enterprise-tier-introduced, sales-team-buildout, aggressive-growth-phase, pricing-transparency-loss). Each event carries severity, confidence, and evidence[].
  • trajectory: movement direction (accelerating / rising / stable / unchanged / declining / new / unknown) across threatScore / hiring / enterpriseMotion / aiAdoption. Requires at least one prior watchlist run.
  • emergingCompetitor: detected + confidence + reasons[]. Fires on the small-team + aggressive-hiring + AI-positioning + developer-PLG combination that often precedes a breakout product launch.
  • peerBenchmarks: within-cohort percentiles for threatScore / enterpriseTier / aiMaturity / techStackSize / hiringVolume. Emits only when cohort size ≥ 3.
  • gtmNumericScores: plgScore + salesLedScore (each 0-100, independent axes) alongside the gtmSignals.motion enum. Useful when a competitor sits midway between PLG and sales-led.
  • perClassConfidence: per-signal-class confidence (enterprise / ai / gtm / pricing, each 0-1). Different from the global confidence; it lets you trust one block while distrusting another.
  • threatReasons[]: flat array of stable signal codes (AGGRESSIVE_HIRING / MATURE_TECH_STACK / DEEP_ANALYTICS / …). Automation-friendly subset of competitiveSignals[].code for downstream routing.
  • changeCategories (watchlist mode only): boolean rollup of changeFlags into pricing / hiring / enterprise / ai / tech / team / threatLevel for one-glance routing.
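
To make the changeCategories rollup concrete, here is a small sketch of how a consumer might reproduce it from changeFlags. The flag-to-category mapping is an assumption for illustration: only four flag names are documented here, and the actor's real mapping is not published in this text.

```python
from typing import Dict, List

CATEGORIES = ["pricing", "hiring", "enterprise", "ai", "tech", "team", "threatLevel"]

# Assumed mapping; covers only the flag names quoted in this document.
FLAG_TO_CATEGORY = {
    "PRICING_CHANGED": "pricing",
    "HIRING_INCREASED": "hiring",
    "TECH_STACK_ADDED": "tech",
    "THREAT_LEVEL_RAISED": "threatLevel",
}


def rollup_change_categories(change_flags: List[str]) -> Dict[str, bool]:
    """Collapse a list of change flags into one boolean per category."""
    categories = {name: False for name in CATEGORIES}
    for flag in change_flags:
        category = FLAG_TO_CATEGORY.get(flag)
        if category is not None:  # unknown flags are ignored, not errors
            categories[category] = True
    return categories


rollup = rollup_change_categories(["PRICING_CHANGED", "THREAT_LEVEL_RAISED"])
print(rollup)
```

Ignoring unknown flags keeps the consumer forward-compatible with the additive enum policy: a new flag in a minor version does not break routing.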

Operational layer (v3)

The intelligence layer tells you what a competitor looks like. The operational layer tells you what's changing, why it matters, how urgent it is, and what to do next, without an analyst and without an LLM.

  • strategicDrift: detects archetype change between watchlist runs. Kinds: plg-to-sales-led / plg-to-hybrid / smb-to-enterprise / no-ai-to-ai-integrated / no-ai-to-ai-native / consumer-to-business / archetype-change / none. Carries from, to, confidence, and reasoning[]. Requires at least one prior watchlist snapshot; first-run records carry detected: false.
  • pressureZones[]: where the competitor is investing (enterprise-readiness / ai-products / developer-platform / pricing-aggression / sales-buildout / product-velocity / international-expansion). Stable enum array; downstream automation routes by zone.
  • counterMoves[]: deterministic recommended responses keyed off detected competitiveEvents + strategicDrift. Each: trigger, action, owner (product / sales / marketing / engineering / exec), urgency. Static playbook lookup, no LLM.
  • threatVelocity: direction (accelerating / rising / stable / declining / unknown) + rate (0-1) + runsObserved. Numeric speed of change in threatScore.
  • surpriseSignals[]: within-cohort deviation (unusually-high-threat / cohort-pricing-outlier-low / sole-ai-leader / sole-enterprise-grade / aggressive-vs-cohort-stability). Requires cohort ≥ 3.
  • executiveSignals[]: top-5 compressed observations per record, plain-English, paste-ready into Slack / exec emails.
  • competitiveDna[]: persistent multi-tag identity fingerprint (developer-first / enterprise-ready / ai-native / plg-saas / sales-led / open-source-presence / fast-moving / mature / community-led / pricing-transparent / pricing-opaque). Stable enum.
  • confidenceConflicts[]: flags where a high-confidence claim is undersupported by evidence (enterpriseSignals.tier=enterprise-grade with perClassConfidence.enterprise=0.3, etc.). Distinct from low confidence: these are internal inconsistencies the actor flags against itself.
  • provenanceGraph: maps each derived classification back to the source evidence that produced it, e.g. archetype:ai-native → ['gtm=plg', 'ai=advanced', 'github presence', ...]. Auditors can trace any conclusion to its inputs.
  • actionabilityScore: 0-100 + components (evidenceStrength / magnitude / freshness / strategicImportance) + reasoning. Distinct axis from threatScore: threat is "how strong is the signal," actionability is "how worth your time is acting NOW."
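
The confidenceConflicts idea (a strong classification backed by weak per-class confidence) can be illustrated with a short consumer-side check. This is a sketch under assumptions: the 0.5 cutoff and the choice of which top-tier values count as "strong claims" are hypothetical, not the actor's documented rules.

```python
from typing import Dict, List

# (record path, strongest value, perClassConfidence key); an assumed rule set.
STRONG_CLAIMS = [
    (("enterpriseSignals", "tier"), "enterprise-grade", "enterprise"),
    (("aiSignals", "maturity"), "advanced", "ai"),
]


def find_confidence_conflicts(record: Dict, min_confidence: float = 0.5) -> List[str]:
    """Return descriptions of strong claims whose class confidence is low."""
    conflicts = []
    per_class = record.get("perClassConfidence", {})
    for (block, field), strong_value, conf_key in STRONG_CLAIMS:
        claimed = record.get(block, {}).get(field)
        confidence = per_class.get(conf_key)
        if claimed == strong_value and confidence is not None and confidence < min_confidence:
            conflicts.append(f"{block}.{field}={claimed} with confidence {confidence}")
    return conflicts


record = {
    "enterpriseSignals": {"tier": "enterprise-grade"},
    "aiSignals": {"maturity": "advanced"},
    "perClassConfidence": {"enterprise": 0.3, "ai": 0.9},
}
conflicts = find_confidence_conflicts(record)
print(conflicts)
```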

Run-level narrativeCollisions[] (summary record) detects when ≥2 competitors converge on the same positioning theme (AI-native positioning, Enterprise-grade compliance race, Developer-first PLG GTM), with the cohort members named.

Decision-infrastructure layer (v4)

The v3 operational layer answers "what's happening?". The v4 decision-infrastructure layer answers the harder question: "what deserves a human's time RIGHT NOW vs what should be silently filtered?", again without an analyst and without an LLM.

Persona-shaped routing

Set persona to one of sales-enablement / executive-monitoring / product-strategy / vc-monitoring / revops-routing / generic (default generic). Each persona changes:

  • which event types auto-escalate to immediate-review
  • which event types get suppressed (downweighted in materiality)
  • materiality weight pack (drift / events / threat / cohortSurprise)
  • audience tags routed to (sales / marketing / exec / product / engineering / revops / investor)
  • the actionability threshold below which records get ignoreRecommendation: true

Different from mode (which controls scan depth). persona controls attention shaping on the resulting data.
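
One way to picture a persona's materiality weight pack is as a weighted sum over the four documented components. The sketch below assumes each component is scored 0-100 and that weights sum to 1; the weight values themselves are invented for illustration and are not the actor's real packs.

```python
from typing import Dict

# Hypothetical weight packs keyed by persona; real values are not documented here.
WEIGHT_PACKS = {
    "generic": {"drift": 0.25, "events": 0.25, "threat": 0.25, "cohortSurprise": 0.25},
    "sales-enablement": {"drift": 0.2, "events": 0.4, "threat": 0.3, "cohortSurprise": 0.1},
}


def materiality(components: Dict[str, float], persona: str = "generic") -> float:
    """Weighted sum of component scores, clamped to the 0-100 range."""
    weights = WEIGHT_PACKS[persona]
    score = sum(weights[name] * components[name] for name in weights)
    return round(min(100.0, max(0.0, score)), 1)


components = {"drift": 80, "events": 60, "threat": 40, "cohortSurprise": 20}
generic_score = materiality(components)
sales_score = materiality(components, "sales-enablement")
print(generic_score, sales_score)
```

The same record scores differently per persona, which is the point of separating materiality from threatScore.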

Per-record fields

  • escalationRecommendation: level (immediate-review / weekly-review / monthly-review / archive / ignore) + sla (24h / 48h / 7d / 30d / no-action) + reason + audience[] + ignoreRecommendation. The single field downstream automation should branch on.
  • competitiveMateriality: 0-100 + per-component breakdown (drift / events / threat / cohortSurprise) + reasoning. Different from threatScore: threat is "how strong is the signal", materiality is "how much does this matter to ME, given my persona's weight pack".
  • alertQuality: novelty + urgency + noiseRisk (each 0-1). For alert-fatigue-resistant downstream pipelines.
  • suppressedSignals[]: signals that fired but the actor flags as likely noise, such as within-volatility hiring change, mature-stack tech churn, or social-link discovery artefact. Each entry: signal + reason. Surfaces what was filtered AND why.
  • riskHorizon: window (immediate / 30-90d / 90-180d / 180d+ / unknown) + confidence + reasoning. Pricing changes hit deal economics now; AI-product launches hit narrative within 1-3 months; upmarket expansion builds over 3-6 months.
  • detectionStability: per-signal-class stable / unstable / unknown rating. Unstable inferences should NOT auto-trigger destructive automation.
  • operationalConfidence: safeToAutomate boolean + numeric confidence + blockers[]. Distinct from per-class confidence; this is the production-automation gate.
  • decisionBundles[]: events grouped under unifying themes (Enterprise expansion, AI capability shift, Pricing strategy shift, Growth phase). Each: theme + priority + events[] + recommendedAction. Reduces analyst load from per-event to per-theme review.
  • focusShift: what the competitor is emphasizing RIGHT NOW (vs strategic identity drift). Watchlist diff over pressureZones, with from / to / reasoning.
  • surpriseIndex (0-100): single scalar combining cohort surprises + drift + velocity. Quick-filter primitive.
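
Since escalationRecommendation is described as the field to branch on, with operationalConfidence.safeToAutomate as the automation gate, a downstream router might look roughly like this. The destination names ("auto-alert", "human-review", "digest", "drop") are hypothetical labels for this sketch, not values the actor emits.

```python
from typing import Dict


def route(record: Dict) -> str:
    """Pick a destination from the escalation level and the automation gate."""
    escalation = record["escalationRecommendation"]
    level = escalation["level"]

    if escalation.get("ignoreRecommendation") or level in ("ignore", "archive"):
        return "drop"
    if level == "immediate-review":
        # Only fire automation when the actor says it is safe to do so.
        safe = record.get("operationalConfidence", {}).get("safeToAutomate", False)
        return "auto-alert" if safe else "human-review"
    return "digest"  # weekly-review / monthly-review


urgent_safe = {
    "escalationRecommendation": {"level": "immediate-review", "sla": "24h"},
    "operationalConfidence": {"safeToAutomate": True},
}
urgent_unsafe = {
    "escalationRecommendation": {"level": "immediate-review", "sla": "24h"},
    "operationalConfidence": {"safeToAutomate": False, "blockers": ["low-coverage"]},
}
print(route(urgent_safe), route(urgent_unsafe))
```

Defaulting safeToAutomate to False when the block is missing keeps the router conservative: absent evidence of safety, a human sees the alert first.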

Run-level (summary record)

  • changeDensity: totalChanges + competitorsWithChanges + topChangedDomains[]. Cohort activity pulse.
  • immediateReviewQueue: pre-filtered list of competitors whose escalationRecommendation.level === 'immediate-review'. Drop straight into a Slack alert / pipeline review without filtering.

How the layers compose

Layer | Per-record question | Field
Extraction | What does the site say? | pricing / careers / team / techStack / homepage
Intelligence | What kind of competitor is this? | archetype / marketPosition / threatLevel / threatScore / competitiveSignals
Operational (v3) | What's happening + why does it matter? | competitiveEvents / strategicDrift / pressureZones / counterMoves / actionabilityScore
Decision-infrastructure (v4) | Should a human look at this RIGHT NOW? | escalationRecommendation / competitiveMateriality / alertQuality / suppressedSignals / decisionBundles
Temporal | What changed since last run? | changeFlags / temporalSignals / focusShift / threatVelocity

Evidence objects

Every competitiveSignals[].evidence is now a structured object, not a plain string:

{
  "code": "AGGRESSIVE_HIRING",
  "weight": 25,
  "evidence": {
    "type": "count-threshold",
    "value": "42 open positions - aggressive expansion phase.",
    "source": "careers-page",
    "confidence": 0.9
  }
}

This means every signal is auditable end-to-end: what fired, what page it came from, how confidently the detector matched, and the human-readable line that explains it. No black-box scoring, no opaque heuristics.
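
As a consumer-side illustration of auditing evidence objects, the sketch below parses a signal shaped like the example above, checks that the four evidence keys are present, and keeps only signals above a confidence cutoff. The 0.8 cutoff and the helper names are assumptions for this example.

```python
import json
from typing import Dict, List

RAW_SIGNAL = """
{
  "code": "AGGRESSIVE_HIRING",
  "weight": 25,
  "evidence": {
    "type": "count-threshold",
    "value": "42 open positions - aggressive expansion phase.",
    "source": "careers-page",
    "confidence": 0.9
  }
}
"""

EVIDENCE_KEYS = {"type", "value", "source", "confidence"}


def is_well_formed(signal: Dict) -> bool:
    """True when the signal has a code and a complete evidence object."""
    evidence = signal.get("evidence")
    return "code" in signal and isinstance(evidence, dict) and EVIDENCE_KEYS <= set(evidence)


def strong_signal_codes(signals: List[Dict], min_confidence: float = 0.8) -> List[str]:
    """Codes of well-formed signals whose evidence confidence meets the cutoff."""
    return [
        s["code"]
        for s in signals
        if is_well_formed(s) and s["evidence"]["confidence"] >= min_confidence
    ]


signal = json.loads(RAW_SIGNAL)
print(strong_signal_codes([signal]))
```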

Subpage discovery

Subpages (pricing / careers / about) are discovered in two stages:

  1. Anchor-text + URL-pattern matching on the homepage HTML (~80% hit rate on SaaS sites).
  2. sitemap.xml fallback when anchor matching missed a requested page: fetches /sitemap.xml, /sitemap_index.xml, /sitemap-index.xml with an 8s timeout, parses <loc> entries, and filters by URL pattern. Sitemap-index files are followed one level deep (top 3 child sitemaps).

Each record carries discoverySources: { homepage: [...], sitemap: [...] } so you can see which path found which page.
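
The two-stage discovery can be sketched with the standard library alone. This is an illustration of the general technique, not the actor's implementation: the URL/anchor patterns are guesses, network fetching and the sitemap-index step are omitted, and the inputs are inline strings.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from typing import Dict, List

# Assumed patterns; the actor's real matching rules are not documented here.
PATTERNS = {
    "pricing": re.compile(r"pric|plans", re.I),
    "careers": re.compile(r"career|jobs", re.I),
    "about": re.compile(r"about|team|company", re.I),
}


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[tuple] = []
        self._href = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def from_homepage(html: str, page: str) -> List[str]:
    """Stage 1: match anchor text or href against the page pattern."""
    collector = AnchorCollector()
    collector.feed(html)
    pattern = PATTERNS[page]
    return [href for href, text in collector.links if pattern.search(href) or pattern.search(text)]


def from_sitemap(xml_text: str, page: str) -> List[str]:
    """Stage 2: filter <loc> entries (namespace-agnostic) by URL pattern."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.iter() if el.tag.endswith("loc") and el.text]
    return [url for url in locs if PATTERNS[page].search(url)]


def discover(html: str, sitemap_xml: str, pages: List[str]) -> Dict[str, List[str]]:
    """Use the sitemap only for pages the homepage anchors missed."""
    sources: Dict[str, List[str]] = {"homepage": [], "sitemap": []}
    for page in pages:
        hits = from_homepage(html, page)
        if hits:
            sources["homepage"].extend(hits)
        else:
            sources["sitemap"].extend(from_sitemap(sitemap_xml, page))
    return sources


HOMEPAGE_HTML = '<a href="/pricing">Pricing</a><a href="/blog">Blog</a>'
SITEMAP_XML = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/careers</loc></url>"
    "<url><loc>https://example.com/blog</loc></url>"
    "</urlset>"
)
result = discover(HOMEPAGE_HTML, SITEMAP_XML, ["pricing", "careers"])
print(result)
```

The returned dictionary mirrors the shape of discoverySources, so it is easy to see which stage found which page.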

Execution modes

Pick a mode by user job, not by feature toggle. Explicit field overrides (checkPricing, checkCareers, etc.) always win over the preset.

Mode | Job | What it scans | Auto-enables watchlist
quick | One-off snapshot | Homepage + tech stack only (~10s/competitor) | no
standard (default) | Default sweep | Homepage + pricing + careers | no
deep | Full intelligence dive | Homepage + pricing + careers + team | no
monitoring | Weekly competitor tracking | Homepage + pricing + careers, watchlist on | yes
raw | Power-user, no opinionated routing | All scans, no preset overrides | no
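
The precedence rule (explicit fields always win over the mode preset) can be sketched as a layered merge. The preset values below are read off the table; the "watchlist" key is a hypothetical name, and treating raw as "apply no preset, fall back to input defaults" is an assumption about an ambiguous row.

```python
from typing import Dict, Optional

INPUT_DEFAULTS = {"checkPricing": True, "checkCareers": True, "checkTeam": False, "watchlist": False}

MODE_PRESETS = {
    "quick": {"checkPricing": False, "checkCareers": False, "checkTeam": False},
    "standard": {"checkPricing": True, "checkCareers": True, "checkTeam": False},
    "deep": {"checkPricing": True, "checkCareers": True, "checkTeam": True},
    "monitoring": {"checkPricing": True, "checkCareers": True, "watchlist": True},
    "raw": {},  # assumption: no preset is applied in raw mode
}


def resolve_settings(mode: str = "standard", explicit: Optional[Dict[str, bool]] = None) -> Dict[str, bool]:
    """Layer defaults, then the mode preset, then explicit overrides (highest priority)."""
    settings = dict(INPUT_DEFAULTS)
    settings.update(MODE_PRESETS[mode])
    settings.update(explicit or {})
    return settings


quick_with_pricing = resolve_settings("quick", {"checkPricing": True})
monitoring = resolve_settings("monitoring")
print(quick_with_pricing, monitoring)
```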

Output profile

Filter the output shape per record:

  • minimal: only the routing primitives (threatLevel, threatScore, summary, changeFlags, identifiers). Fastest for if/else automation.
  • standard (default): full record including signal blocks and reasoning fields.
  • full: alias for standard.
  • llm: minimal core + whyThisMatters / competitiveSignals / temporalSignals / pipelineState / actorGraph / improvementSuggestions / coverage / dataGaps. Designed for AI-agent consumers.

How to Use

  1. Enter one or more competitor website URLs (e.g., https://notion.so, https://slack.com, https://linear.app)
  2. Choose which intelligence to gather: pricing, careers, team info, or all three
  3. Click Start and wait for the run to finish (typically under 2 minutes for 5 competitors)
  4. Download results from the Dataset tab in JSON, CSV, or Excel format

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| competitorUrls | String[] | Yes | (none) | List of competitor website URLs (e.g., https://notion.so). Domains without https:// are normalized automatically. |
| checkPricing | Boolean | No | true | Find and extract pricing page data including plan names, prices, billing periods, and features. |
| checkCareers | Boolean | No | true | Find and extract career page data including job listings, total openings, and hiring velocity. |
| checkTeam | Boolean | No | false | Find and extract team/about page data including estimated team size and company description. |
| maxPagesPerSite | Integer | No | 10 | Maximum pages to crawl per competitor website (1–50). Most useful data comes from 3–4 pages. |
| maxResults | Integer | No | 50 | Maximum number of competitor websites to process per run (1–200). |

Input Examples

Quick competitor comparison (pricing and careers):

{
"competitorUrls": ["https://notion.so", "https://slack.com", "https://linear.app"],
"checkPricing": true,
"checkCareers": true,
"checkTeam": false
}

Full analysis (all modules enabled):

{
"competitorUrls": ["https://stripe.com", "https://square.com", "https://paypal.com"],
"checkPricing": true,
"checkCareers": true,
"checkTeam": true,
"maxPagesPerSite": 15
}

Weekly monitoring with persona-shaped escalation (the operational shape):

{
"competitorUrls": ["https://notion.so", "https://slack.com", "https://linear.app", "https://airtable.com", "https://clickup.com"],
"mode": "monitoring",
"watchlistName": "productivity-saas-weekly",
"persona": "executive-monitoring",
"outputProfile": "minimal"
}

This is the recommended shape for scheduled runs. monitoring mode auto-enables watchlist + change tracking. persona: "executive-monitoring" collapses output to drift / AI launches / upmarket expansion only, so exec dashboards stay quiet on routine churn. outputProfile: "minimal" keeps Slack alerts compact.

Tech stack audit only (homepage scan):

{
"competitorUrls": ["https://vercel.com", "https://netlify.com", "https://render.com"],
"mode": "quick",
"maxPagesPerSite": 1
}

Input Tips

  • Even without enabling pricing/careers/team, the homepage scan extracts company name, meta description, social media links, full tech stack, enterprise signals, AI signals, and GTM motion.
  • Keep maxPagesPerSite low (3–5) for faster runs. Most useful data comes from homepage + pricing + careers + about.
  • URLs without https:// are normalized automatically, so notion.so works just as well as https://notion.so.
  • For weekly monitoring at scale, set mode: "monitoring" + watchlistName: "<list-name>". Each watchlist has its own isolated history.
  • Pair persona with outputProfile: "minimal" to drop Slack-routable records straight into automation pipelines without per-team filtering.
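The URL normalization mentioned in the tips above might look something like this sketch. The function name is hypothetical and the exact rules the actor applies are not documented here; the sketch only assumes "add https:// when no scheme is present" and "report the domain without www.", both of which this README states.

```python
from urllib.parse import urlparse


def normalize_competitor_url(raw: str) -> tuple[str, str]:
    """Return (url, domain): add https:// when missing and drop a leading www. from the domain."""
    candidate = raw.strip()
    if not candidate.lower().startswith(("http://", "https://")):
        candidate = "https://" + candidate
    host = (urlparse(candidate).hostname or "").lower()
    domain = host[4:] if host.startswith("www.") else host
    return candidate, domain


print(normalize_competitor_url("notion.so"))
print(normalize_competitor_url("https://www.notion.so/pricing"))
```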

Output

The actor returns one result object per competitor website:

{
"url": "https://notion.so",
"domain": "notion.so",
"companyName": "Notion",
"scrapedAt": "2025-11-15T14:32:07.123Z",
"homepage": {
"title": "Notion - Your connected workspace for wiki, docs & projects",
"description": "A new tool that blends your everyday work apps into one.",
"ogImage": "https://www.notion.so/images/meta/default.png",
"socialLinks": {
"twitter": "https://x.com/NotionHQ",
"linkedin": "https://www.linkedin.com/company/notionhq/",
"youtube": "https://www.youtube.com/@NotionHQ"
}
},
"pricing": {
"found": true,
"pricingUrl": "https://www.notion.so/pricing",
"plans": [
{
"name": "Free",
"price": "$0",
"period": "/mo",
"features": ["Collaborative workspace", "7 day page history"]
},
{
"name": "Plus",
"price": "$10",
"period": "/mo",
"features": ["Unlimited blocks", "30 day page history"]
}
],
"rawPricingText": null
},
"careers": {
"found": true,
"careersUrl": "https://www.notion.so/careers",
"totalOpenings": 42,
"jobTitles": ["Senior Software Engineer", "Product Designer"],
"hiringVelocity": "aggressive"
},
"team": {
"found": true,
"teamUrl": "https://www.notion.so/about",
"teamSize": 500,
"companyDescription": "Notion is a connected workspace..."
},
"techStack": ["Cloudflare", "Google Analytics", "Intercom", "Next.js", "Stripe"],
"enterpriseSignals": {
"score": 67,
"tier": "enterprise-grade",
"detected": ["SSO / SAML", "SCIM provisioning", "SOC 2", "audit logs", "role-based access control", "enterprise plan / contact-sales tier"],
"capabilities": ["sso", "scim", "soc2", "audit-logs", "rbac", "enterprise-page"]
},
"aiSignals": {
"detected": true,
"maturity": "integrated",
"vendors": ["OpenAI"],
"keywords": ["ai", "assistant"],
"aiHiringDetected": true
},
"gtmSignals": {
"motion": "hybrid",
"confidence": 0.7,
"plgIndicators": ["free signup CTA", "self-serve language"],
"salesLedIndicators": ["contact-sales / demo-request CTA", "enterprise sales roles"],
"reasoning": "Both PLG (2 signals) and sales-led (2 signals) indicators present."
},
"threatLevel": "alert",
"threatScore": 47,
"summary": "Notion - ALERT. 42 open positions - aggressive expansion phase; 8 pricing plans published - transparent pricing strategy.",
"whyThisMatters": "Several growth signals detected β€” worth a closer review or weekly tracking.",
"nextBestAction": "Add to weekly watchlist; pair with company-deep-research for funding/SEC context.",
"metadata": {
"pagesScraped": 4,
"scrapeDurationMs": 8432
}
}

Operational output (the high-value shape)

Beyond raw extraction, the actor returns deterministic operational decisions. This is the shape downstream automation should consume:

{
"domain": "notion.so",
"threatLevel": "alert",
"threatScore": 47,
"archetype": "hybrid-plg-and-enterprise",
"marketPosition": {
"segment": "enterprise",
"estimatedAcvBand": "25k-100k"
},
"strategicDrift": {
"detected": true,
"kind": "plg-to-hybrid",
"from": "plg-saas",
"to": "hybrid-plg-and-enterprise",
"confidence": 0.8,
"reasoning": ["Archetype shift: plg-saas → hybrid-plg-and-enterprise."]
},
"competitiveEvents": [
{ "type": "enterprise-tier-introduced", "severity": "high", "confidence": 0.7 },
{ "type": "sales-team-buildout", "severity": "medium", "confidence": 0.65 }
],
"pressureZones": ["enterprise-readiness", "sales-buildout"],
"escalationRecommendation": {
"level": "immediate-review",
"sla": "48h",
"reason": "Strategic drift detected (plg-to-hybrid, conf 0.8).",
"audience": ["exec", "product"],
"ignoreRecommendation": false
},
"competitiveMateriality": {
"score": 72,
"reasoning": ["Strategic drift detected (plg-to-hybrid, conf 0.8).", "2 high/critical events."]
},
"alertQuality": { "novelty": 0.65, "urgency": 0.78, "noiseRisk": 0.12 },
"operationalConfidence": { "safeToAutomate": true, "confidence": 0.81, "blockers": [] },
"actionabilityScore": { "score": 78, "reasoning": "High actionability - strong evidence (0.82), magnitude 47, accelerating." }
}

This is the shape that drops into Slack alerts, Zapier rules, n8n flows, Dify if/else nodes, and agent tool calls without an LLM rewriting layer. Branch on escalationRecommendation.level, operationalConfidence.safeToAutomate, archetype, strategicDrift.kind, or pressureZones[].
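A downstream consumer could branch on those fields along the lines of the sketch below. The destination names ("drop", "auto-alert", "human-review", "weekly-digest") and the function itself are hypothetical; only the field names and the "immediate-review" level come from the example record above.

```python
from typing import Any


def route_record(record: dict[str, Any]) -> str:
    """Pick a destination from the escalation level and the automation-safety flag."""
    escalation = record.get("escalationRecommendation", {})
    confidence = record.get("operationalConfidence", {})

    if escalation.get("ignoreRecommendation"):
        return "drop"
    if escalation.get("level") == "immediate-review":
        # Only fire automatically when the record says automation is safe.
        return "auto-alert" if confidence.get("safeToAutomate") else "human-review"
    return "weekly-digest"


sample_record = {
    "domain": "notion.so",
    "escalationRecommendation": {
        "level": "immediate-review",
        "sla": "48h",
        "audience": ["exec", "product"],
        "ignoreRecommendation": False,
    },
    "operationalConfidence": {"safeToAutomate": True, "confidence": 0.81, "blockers": []},
}

print(route_record(sample_record))
```

Because the branch conditions are plain enum and boolean comparisons, the same logic translates directly into Zapier filters or n8n switch nodes.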

Output Fields

| Field | Type | Description |
| --- | --- | --- |
| url | String | The competitor URL that was analyzed |
| domain | String | Normalized domain (without www.) |
| companyName | String | Detected company name (from og:site_name or <title>) |
| scrapedAt | String | ISO 8601 timestamp of the scrape |
| homepage.title | String | Website <title> tag content |
| homepage.description | String | Meta description or og:description |
| homepage.ogImage | String / null | Open Graph image URL |
| homepage.socialLinks | Object | Social media URLs found in page links (twitter, linkedin, facebook, instagram, youtube, github, tiktok) |
| pricing.found | Boolean | Whether a pricing page was found and extracted |
| pricing.pricingUrl | String / null | URL of the pricing page |
| pricing.plans[] | Array | Extracted pricing plans, each with name, price, period, features[] |
| pricing.rawPricingText | String / null | Raw text from the pricing page (up to 3,000 chars) |
| careers.found | Boolean | Whether a careers page was found |
| careers.careersUrl | String / null | URL of the careers page |
| careers.totalOpenings | Integer | Number of open positions detected |
| careers.jobTitles | String[] | Up to 50 unique job titles found |
| careers.hiringVelocity | String | none (0), minimal (1–5), moderate (6–20), or aggressive (21+) |
| team.found | Boolean | Whether a team/about page was found |
| team.teamUrl | String / null | URL of the team/about page |
| team.teamSize | Integer / null | Estimated team size (from member cards or text patterns) |
| team.companyDescription | String / null | Company description from meta tags or first paragraph |
| techStack | String[] | Detected technologies, sorted alphabetically |
| metadata.pagesScraped | Integer | Number of pages crawled for this competitor |
| metadata.scrapeDurationMs | Integer | Total scrape time in milliseconds |

Use Cases

  • Product managers: track how competitors change pricing tiers and feature bundles over time; spot strategic drift early via watchlist mode.
  • Venture capital analysts: gauge growth signals by monitoring hiring velocity, team size, and AI adoption across portfolio companies and competitors. Persona vc-monitoring filters to growth + AI + upmarket signals only.
  • Marketing teams: understand which analytics, CRM, marketing, and AI tools competitors use to inform tooling decisions and competitive positioning.
  • Sales / sales-enablement teams: prepare for competitive deals with structured pricing + feature-matrix + GTM-motion intelligence. Persona sales-enablement prioritises pricing changes and enterprise-tier additions.
  • RevOps teams: route competitive alerts into Slack / Zapier / Make automation pipelines using persona: "revops-routing" + outputProfile: "minimal". Branch on escalationRecommendation.level and changeFlags.
  • Executive monitoring: persona: "executive-monitoring" filters output to only material strategic drift, AI-product launches, and upmarket expansion. Quiet on routine churn.
  • Startup founders: benchmark pricing against the market with marketBenchmarks and peerBenchmarks (within-cohort percentiles); track when competitors expand engineering or sales teams.
  • Competitive-intelligence teams: replace manual analyst workflows with deterministic competitive intelligence; pair with Slack alerts for the immediate-review queue.
  • AI-agent tool calls: outputProfile: "llm" returns minimal core + reasoning fields. Schema-stable enums for branching without prose-parsing.

Automation patterns

This actor is built automation-native. Output is schema-stable, enum-rich, and decision-ready. Drop it into:

  • Slack alerts: summary.immediateReviewQueue[] is a pre-filtered list of competitors needing review. escalationRecommendation.audience[] names the team to route to.
  • Zapier / Make scenarios: branch on threatLevel, escalationRecommendation.level, operationalConfidence.safeToAutomate, changeFlags, archetype, strategicDrift.kind. No prose parsing.
  • n8n flows: multi-step routing on decisionBundles[].theme + pressureZones[]. Each bundle carries a deterministic recommendedAction.
  • Dify if/else nodes: branch on stable enums (threatLevel, recordType, failureType, changeFlags). See the dedicated Dify section below.
  • Agent tool calls (OpenAI / Anthropic / LangChain / LlamaIndex): outputProfile: "llm" returns the reasoning fields agents need without bloating the context.
  • PagerDuty / OpsGenie incident routing: branch on escalationRecommendation.sla (24h / 48h / 7d / 30d / no-action).
  • Jira / Linear / GitHub Issues backlog: decisionBundles[] map cleanly to ticket themes; counterMoves[].owner names the team and urgency.
  • Webhooks: Actor.addWebhook integration; receive each completed run at any HTTP endpoint.
  • Google Sheets / Airtable: flat outputProfile: "minimal" rows drop straight into spreadsheets.
  • CRM (HubSpot / Salesforce / Pipedrive): pair with ryanclinton/hubspot-lead-pusher or ryanclinton/salesforce-lead-pusher to write competitive context onto account records.

The actor is automation-ready out of the box. Schema-stable enums, deterministic outputs, and the provenance graph mean every routing rule you write today still works after the actor's next update.

How to Use the API

Python

import time

import requests

API_TOKEN = "YOUR_APIFY_TOKEN"

# Start the actor run.
run = requests.post(
    "https://api.apify.com/v2/acts/ryanclinton~saas-competitive-intel/runs",
    params={"token": API_TOKEN},
    json={
        "competitorUrls": ["https://notion.so", "https://slack.com", "https://linear.app"],
        "checkPricing": True,
        "checkCareers": True,
        "checkTeam": True,
    },
    timeout=30,
).json()
run_id = run["data"]["id"]

# Poll until the run reaches a terminal state (including timeouts).
while True:
    status = requests.get(
        f"https://api.apify.com/v2/actor-runs/{run_id}",
        params={"token": API_TOKEN},
        timeout=10,
    ).json()
    if status["data"]["status"] in ("SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"):
        break
    time.sleep(5)

# Fetch the dataset items produced by the run.
dataset_id = status["data"]["defaultDatasetId"]
items = requests.get(
    f"https://api.apify.com/v2/datasets/{dataset_id}/items",
    params={"token": API_TOKEN},
    timeout=30,
).json()

# Print a short summary per competitor.
for item in items:
    plans = ", ".join(f"{p['name']}: {p['price']}" for p in item["pricing"]["plans"])
    print(f"{item['companyName']}: {plans or 'No pricing found'}")
    print(f" Hiring: {item['careers']['hiringVelocity']} ({item['careers']['totalOpenings']} openings)")
    print(f" Tech: {', '.join(item['techStack'][:5])}")

JavaScript

const response = await fetch(
  "https://api.apify.com/v2/acts/ryanclinton~saas-competitive-intel/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      competitorUrls: ["https://notion.so", "https://slack.com"],
      checkPricing: true,
      checkCareers: true,
    }),
  }
);
const results = await response.json();
results.forEach((r) => {
  console.log(`${r.companyName}: ${r.pricing.plans.length} plans, ${r.careers.totalOpenings} openings`);
});

cURL

curl -X POST "https://api.apify.com/v2/acts/ryanclinton~saas-competitive-intel/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"competitorUrls": ["https://notion.so", "https://slack.com"],
"checkPricing": true,
"checkCareers": true
}'

How It Works

Input (competitor URLs, module toggles)
  │
  ▼
URL Normalization
  • Add https:// if missing, extract domain
  • Initialize result objects per domain
  │
  ▼
CheerioCrawler (concurrency: 5)
  ├── HOMEPAGE handler
  │     • Extract <title>, og:*, meta description
  │     • Detect company name (og:site_name > title)
  │     • Extract social media links (7 platforms)
  │     • Detect tech stack (30+ regex patterns)
  │     • Discover pricing/careers/team page links
  │       via URL patterns + anchor text matching
  │     • Queue discovered subpages for crawling
  ├── PRICING handler
  │     Strategy 1: CSS class selectors
  │       [class*="price"], [class*="plan"]
  │       → extract plan name, $price, /period, feature lists
  │     Strategy 2: Fallback regex scan
  │       Scan all elements for $X.XX patterns + nearby headings
  │       Deduplicate by price+name
  ├── CAREERS handler
  │     Strategy 1: Job CSS selectors
  │       [class*="job"], a[href*="/jobs/"]
  │     Strategy 2: Role keyword matching
  │       engineer, designer, manager, etc.
  │     Strategy 3: Count text patterns
  │       "42 open positions"
  │     Hiring velocity classification:
  │       0=none, 1-5=minimal, 6-20=moderate, 21+=aggressive
  └── TEAM handler
        Count team member cards via CSS
        Fallback: "200+ employees" text
        Extract company description from meta tags or first paragraph
  │
  ▼
Dataset (one row per competitor)

Page Discovery

The actor finds subpages by scanning all internal links on the homepage for URL patterns and anchor text:

| Page Type | URL Patterns | Anchor Text Keywords |
| --- | --- | --- |
| Pricing | /pricing, /plans, /packages | pricing, plans, packages |
| Careers | /careers, /jobs, /hiring, /open-roles | careers, jobs, hiring, work-with-us, join-us, openings |
| Team | /about, /team, /company, /people | about, team, company, people, our-team, about-us |

Only same-domain links are followed.
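The matching rules above could be approximated as in the sketch below. It is an illustration under assumptions: links are given as (href, anchor text) pairs, anchor text is lowercased and hyphen-joined before keyword matching, the first match per page type wins, and a leading www. is ignored when comparing domains. None of these details are confirmed by the actor's source.

```python
from urllib.parse import urljoin, urlparse

# (URL path prefixes, anchor-text keywords) per page type, from the table above.
DISCOVERY_RULES = {
    "pricing": (("/pricing", "/plans", "/packages"), ("pricing", "plans", "packages")),
    "careers": (
        ("/careers", "/jobs", "/hiring", "/open-roles"),
        ("careers", "jobs", "hiring", "work-with-us", "join-us", "openings"),
    ),
    "team": (
        ("/about", "/team", "/company", "/people"),
        ("about", "team", "company", "people", "our-team", "about-us"),
    ),
}


def _host(url: str) -> str:
    """Lowercased hostname with any leading www. removed."""
    return (urlparse(url).hostname or "").lower().removeprefix("www.")


def discover_subpages(base_url: str, links: list[tuple[str, str]]) -> dict[str, str]:
    """Return the first same-domain link matching each page type."""
    found: dict[str, str] = {}
    for href, text in links:
        absolute = urljoin(base_url, href)
        if _host(absolute) != _host(base_url):
            continue  # only same-domain links are followed
        path = urlparse(absolute).path.lower()
        label = "-".join(text.lower().split())
        for page, (prefixes, keywords) in DISCOVERY_RULES.items():
            if page in found:
                continue
            if any(path.startswith(p) for p in prefixes) or any(k in label for k in keywords):
                found[page] = absolute
                break
    return found


SAMPLE_LINKS = [
    ("/pricing", "Pricing"),
    ("https://jobs.example.org/acme", "Careers"),
    ("/join", "Join us"),
    ("/about-us", "Our story"),
]

print(discover_subpages("https://example.com", SAMPLE_LINKS))
```

The external careers link is skipped here, which mirrors the same-domain rule and also hints at why externally hosted job boards can go undetected.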

Tech Stack Detection

The actor identifies 30+ technologies by matching regex patterns against script sources, link hrefs, and raw HTML:

| Category | Technologies Detected |
| --- | --- |
| Analytics | Google Analytics, Google Tag Manager, Segment, Mixpanel, Amplitude, Heap, Hotjar, FullStory |
| Marketing | HubSpot, Marketo, Pardot, Clearbit |
| Support | Intercom, Drift, Zendesk, Crisp, Freshdesk |
| Payments | Stripe |
| Error Tracking | Sentry, Datadog |
| Frameworks | React, Vue.js, Angular, Next.js, Nuxt.js |
| CMS/Platforms | WordPress, Webflow, Shopify |
| Infrastructure | Cloudflare, Salesforce |
| Feature Flags | LaunchDarkly, Optimizely |
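Regex-based detection of this kind can be sketched as below. The four signatures are illustrative guesses at common public fingerprints, not the actor's actual patterns, and the function name is hypothetical; the alphabetical sort matches the techStack field description.

```python
import re

# Illustrative signatures only; the actor's real patterns are not published here.
TECH_SIGNATURES = {
    "Google Tag Manager": re.compile(r"googletagmanager\.com", re.I),
    "Intercom": re.compile(r"widget\.intercom\.io", re.I),
    "Next.js": re.compile(r"/_next/static/", re.I),
    "Stripe": re.compile(r"js\.stripe\.com", re.I),
}


def detect_tech_stack(html: str) -> list[str]:
    """Return the names of all signatures found in the raw HTML, sorted alphabetically."""
    return sorted(name for name, pattern in TECH_SIGNATURES.items() if pattern.search(html))


sample_html = (
    '<script src="https://js.stripe.com/v3/"></script>'
    '<script src="/_next/static/chunks/main.js"></script>'
)
print(detect_tech_stack(sample_html))
```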

Hiring Velocity Classification

| Velocity | Open Positions | Signal |
| --- | --- | --- |
| none | 0 | Not actively hiring |
| minimal | 1–5 | Selective hiring or backfills |
| moderate | 6–20 | Steady growth phase |
| aggressive | 21+ | Rapid expansion (often signals funding round or new product) |
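The thresholds in this table map onto a simple function. The sketch below uses a hypothetical function name and treats any non-positive count as "none", which is an assumption; the boundaries themselves are the ones documented above.

```python
def classify_hiring_velocity(total_openings: int) -> str:
    """Map an open-position count onto the documented velocity buckets."""
    if total_openings <= 0:
        return "none"
    if total_openings <= 5:
        return "minimal"
    if total_openings <= 20:
        return "moderate"
    return "aggressive"


for count in (0, 3, 12, 42):
    print(count, classify_hiring_velocity(count))
```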

How Much Does It Cost?

This actor uses CheerioCrawler (server-side HTML, no browser rendering), making it extremely lightweight.

| Scenario | Competitors | Estimated Cost |
| --- | --- | --- |
| Quick comparison | 3 competitors | ~$0.005 |
| Weekly monitoring | 10 competitors | ~$0.02 |
| Market landscape | 25 competitors | ~$0.05 |
| Large-scale audit | 50 competitors | ~$0.10 |

The Apify Free plan includes $5 of platform credit per month, which covers daily monitoring of 10 competitors for an entire month.

Tips

  • Start with homepages: even without enabling pricing/careers/team, the homepage scan extracts company name, description, social links, and full tech stack.
  • Run on a schedule: set up a weekly or monthly schedule to track competitor changes over time. Connect to Google Sheets or a webhook to build a time-series dashboard.
  • Keep maxPagesPerSite low: the default of 10 is generous. Most useful data comes from 3–4 pages. Lower values speed up execution.
  • Watch hiring velocity: a competitor shifting from "minimal" to "aggressive" often signals a new product launch, funding round, or market expansion 3–6 months before it becomes public.
  • Combine with WHOIS data: pair with domain lookup tools to get registration dates and hosting details alongside competitive intelligence.
  • Use rawPricingText for custom parsing: if the auto-extracted pricing plans miss details, the rawPricingText field contains up to 3,000 characters of raw pricing page text you can parse yourself.
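For the rawPricingText tip, a custom parser might start from something like the sketch below. The regular expression, the set of recognised billing periods, and the function name are assumptions for illustration; real pricing pages will usually need site-specific handling.

```python
import re
from typing import Optional

# Dollar amount with optional thousands separators, cents, and a trailing billing period.
PRICE_RE = re.compile(
    r"\$(\d[\d,]*(?:\.\d{1,2})?)(?:\s*/\s*(month|mo|year|yr))?",
    re.IGNORECASE,
)


def extract_prices(raw_pricing_text: Optional[str]) -> list[tuple[float, Optional[str]]]:
    """Return (amount, period) pairs found in the raw pricing text; period may be None."""
    if not raw_pricing_text:
        return []
    results = []
    for match in PRICE_RE.finditer(raw_pricing_text):
        amount = float(match.group(1).replace(",", ""))
        period = match.group(2).lower() if match.group(2) else None
        results.append((amount, period))
    return results


print(extract_prices("Free $0 Plus $10/mo Business $1,200.50 / year"))
```

Handling the null case first matters because rawPricingText is documented as String / null.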

Limitations

  • Server-rendered HTML only: uses CheerioCrawler, which processes server-side HTML. If a competitor's pricing page relies heavily on client-side JavaScript rendering, pricing plans may not be fully extracted. Most SaaS companies serve pricing content in initial HTML for SEO.
  • Pricing extraction is heuristic: the actor looks for $X.XX patterns and CSS classes containing "price", "plan", or "tier". Custom pricing structures, enterprise-only pricing, or heavily abstracted layouts may not be detected.
  • Job title quality varies: job listing extraction depends on consistent HTML structure. Some career pages use embedded iframes (Greenhouse, Lever) that CheerioCrawler cannot access.
  • Team size is estimated: counted from team member cards or text patterns like "200+ employees". Companies that don't display team info on their website will return null.
  • No login/authentication: the actor only accesses publicly visible pages. Gated content behind login walls is not accessible.
  • Social media links are from anchor tags only: if social links are rendered via JavaScript or embedded in images, they may not be detected.
  • 30+ tech stack patterns, not exhaustive: the detection covers major tools but may miss niche or recently launched technologies.
  • One brand name per competitor: the company name is auto-detected from og:site_name or <title>. Multi-brand companies may show the parent brand only.

Responsible Use

  • Only scrape publicly available pages: this actor accesses the same pages any visitor would see in a browser.
  • Respect robots.txt: the CheerioCrawler respects robots.txt directives by default.
  • Use reasonable crawl depths: keep maxPagesPerSite at 10 or below to avoid excessive requests to competitor servers.
  • Competitive intelligence, not harassment: use the data for legitimate business analysis, not to spam or attack competitors.
  • Comply with applicable laws: for GDPR, CCPA, or jurisdiction-specific questions, consult legal counsel. See Apify's guide on web scraping legality.

FAQ

How does the actor find pricing, career, and team pages? It crawls the competitor's homepage and scans all internal links for URL patterns and anchor text matching keywords like "pricing", "plans", "careers", "jobs", "about", and "team".

Can it extract pricing from JavaScript-rendered pages? This actor uses CheerioCrawler (server-side HTML). If a competitor's pricing relies on client-side JavaScript, plans may not be fully extracted. Most SaaS companies serve pricing in initial HTML for SEO.

What tech stack tools does it detect? 30+ technologies across analytics, marketing, support, payments, error tracking, frameworks, CMS, and infrastructure. See the tech stack detection table above.

How accurate is the hiring velocity classification? Based on the number of open positions found: "none" (0), "minimal" (1–5), "moderate" (6–20), "aggressive" (21+). Reflects what is publicly listed at the time of the scrape.

Can I monitor competitors automatically on a schedule? Yes. Set up a cron schedule on Apify to run daily, weekly, or monthly. Combine with Zapier or webhooks for automated alerts.

Why is team size sometimes null? Team size is estimated from team member cards or text patterns. If the page uses a non-standard layout or doesn't disclose team size, the value will be null.

Can I track more than 50 competitors? Yes, set maxResults up to 200. Larger batches take proportionally longer to complete.

Use in Dify

Drop this actor into Dify workflows via the Apify plugin's Run Actor node. Each competitor comes back scored, classified, and paired with a recommendation as structured JSON: a threatLevel of critical / alert / monitor / watch, plus the changeFlags enums (PRICING_CHANGED, HIRING_INCREASED, TECH_STACK_ADDED, THREAT_LEVEL_RAISED) that your downstream if/else node branches on. Generic web scrapers pointed at the same competitor sites return raw HTML; this returns decisions.

  • Actor ID: ryanclinton/saas-competitive-intel
  • Sample input (weekly competitor monitoring with watchlist diff):
{
"competitorUrls": ["https://notion.so", "https://slack.com", "https://linear.app"],
"mode": "monitoring",
"watchlistName": "productivity-saas-weekly"
}

Dify branching example

A Dify if/else node can route per competitor on the threatLevel enum:

| Branch condition | Action |
| --- | --- |
| threatLevel == "critical" | Notify product team in Slack with summary + whyThisMatters |
| threatLevel == "alert" | Add to weekly review queue |
| changeFlags contains "PRICING_CHANGED" | Diff fieldDiffs.pricing.planCount and post to #pricing channel |
| changeFlags contains "HIRING_INCREASED" | Trigger company-deep-research for funding/SEC context |
| temporalSignals.reEscalated == true | Auto-page on-call competitive analyst |
| failureType != null | Route to "needs human review" queue |

The nextBestAction, whyThisMatters, and summary fields are usable verbatim in Slack messages, exec emails, or LLM agent prompts, with no LLM rewriting required. The decision-mode profile (outputProfile: "minimal") drops everything except the routing primitives, which is the fastest shape for if/else evaluation.

Opt-in modes that pair well with Dify:

  • mode: "monitoring" auto-enables watchlist + emits temporalSignals and changeFlags per record.
  • outputProfile: "llm" keeps reasoning fields (whyThisMatters, improvementSuggestions) for AI-agent consumers.

Integrations

  • Zapier: trigger alerts when competitors change pricing or post new jobs.
  • Make (Integromat): build multi-step competitive analysis workflows.
  • Google Sheets: export competitor data into a tracking spreadsheet.
  • Webhooks: receive results at any HTTP endpoint for custom processing.
  • Apify API: call programmatically from any language for custom dashboards.
  • Slack / Email: configure Apify notifications for run completion alerts.

What this actor does NOT do

Honest scope fence: for the needs listed below, use the sibling actor instead.

| Need | Use this instead |
| --- | --- |
| Deep multi-page tech stack fingerprinting (50+ technologies, version detection) | Website Tech Stack Detector |
| Wikipedia, SEC filings, GitHub org data, DNS/WHOIS, funding context | Company Deep Research |
| Decision-makers, verified emails, buying committees | Website Lead Intelligence |
| Real-time stock-ticker / market data / Bloomberg-grade financial feeds | not in scope; this is public-web SaaS intelligence, not financial data |
| Shipment-level commercial data (Panjiva / ImportGenius style) | not in scope |
| Brand-protection / typo-squatting detection | Brand Protection Monitor |
| Keyword / SERP-position tracking against competitors | SERP Rank Tracker |
| Domain registration / hosting / DNS history | WHOIS Domain Lookup |
| LLM-rewritten "competitive narrative" prose | not in scope; this is deterministic, no LLM calls |
| JavaScript-rendered SPAs that gate pricing behind a __next shell | flagged in rendering.jsRenderingRequired + failureType: 'js-required'; out of scope for the CheerioCrawler engine |
| Product velocity / blog cadence / release-notes parsing | overlaps website-tech-stack-detector per the suite-cohesion contract; not in scope here |
| Cross-cohort market evolution (AI commoditisation, enterprise-feature saturation across many runs of different competitor sets) | requires multi-cohort fleet aggregation; out of scope for a per-run actor |
| Funding-likely / launch-likely event prediction | public-web signals can suggest a growth phase but cannot predict funding events, so the actor does not attempt it; use the aggressive-growth-phase competitiveEvent + pressureZones as the deterministic stand-in |
| Per-field signal volatility (stability score over many runs) | meaningful only from watchlist run #3 onward; the actor emits temporalSignals + threatVelocity for now, to be revisited once the median user has 3+ runs of history |
| Signal entropy / noise-density scoring | requires ≥3 runs of history to be meaningful; alertQuality.noiseRisk is emitted instead for single- and two-run cases |
| Competitive heatmaps as a separate field | use pressureZones[] + perClassConfidence + competitiveDna[] together; they cover the same analytical surface without duplicating data |

The actor stays lightweight (256–1024MB / 60s request handler / Cheerio-only). The signal layer is composed deterministically from extracted data β€” no LLMs, no external APIs, no licensed feeds.

| Actor | What it does | Use with SaaS Competitive Intel |
| --- | --- | --- |
| Website Tech Stack Detector | Deep technology analysis | More thorough tech stack fingerprinting |
| Website Contact Scraper | Extract contact details | Get emails and phone numbers from competitor sites |
| Brand Protection Monitor | Brand threat monitoring | Check if competitors are typosquatting your brand |
| E-Commerce Price Monitor | Product price tracking | Track e-commerce product prices alongside SaaS pricing |
| Company Deep Research Agent | Comprehensive company research | Get Wikipedia, SEC, GitHub, and DNS data on competitors |
| SERP Rank Tracker | Keyword ranking tracking | See how competitors rank for shared keywords |
| WHOIS Domain Lookup | Domain registration details | Get registration dates and hosting info for competitor domains |