CA Data Broker Registry Evidence Pack

Pricing

from $250.00 / 1,000 broker evidence packs

Normalize the CPPA Data Broker Registry and create public-page evidence packs for privacy policy, opt-out/delete, CCPA/GPC, noindex, broken-page, request-metric, and diff monitoring signals.

Developer

Dongwook Kim

Maintained by Community

This Apify Actor is a public-page-only validation build around the California Privacy Protection Agency (CPPA) Data Broker Registry.

It fetches the CPPA registry CSV, normalizes broker rows, scans each broker's public website and CA rights URL, and emits one broker-evidence-pack dataset item per scanned broker.

What v1 Scans

  • CPPA registry row normalization as registry-row-normalized
  • Primary broker website and CA consumer-rights URL
  • Privacy policy, opt-out, delete, and CCPA links
  • CCPA and Global Privacy Control mentions
  • noindex evidence pages
  • Broken pages
  • Request-metric disclosure signals
  • Compact per-page evidence metadata: status, title, noindex, extracted links, CCPA/GPC/request-metric mentions, privacy-contact presence, and content hash
  • Optional diff rows against the latest saved snapshot
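As a rough illustration of the per-page signals listed above, the sketch below derives a few of them from raw HTML with regular expressions. The PageSignals shape, its field names, and extractSignals are hypothetical and are not taken from the Actor's code; a real scanner would more likely use an HTML parser, and the content hash is omitted here.

```typescript
// Hypothetical shape; the Actor's real field names may differ.
interface PageSignals {
  noindex: boolean;
  mentionsCcpa: boolean;
  mentionsGpc: boolean;
  links: string[];
}

function extractSignals(html: string): PageSignals {
  // Inspect each <meta> tag and flag a robots directive that contains "noindex".
  const metaTags: string[] = html.match(/<meta\b[^>]*>/gi) ?? [];
  const noindex = metaTags.some(
    (tag) => /name\s*=\s*["']?robots/i.test(tag) && /noindex/i.test(tag),
  );

  // Collect quoted href values from anchor tags.
  const links: string[] = [];
  const hrefPattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const href = match[1];
    if (href !== undefined) links.push(href);
  }

  return {
    noindex,
    mentionsCcpa: /\bCCPA\b|California Consumer Privacy Act/i.test(html),
    mentionsGpc: /\bGPC\b|Global Privacy Control/i.test(html),
    links,
  };
}
```

Keyword matching like this only signals that a term appears on the page; it says nothing about whether the broker actually honors the mechanism.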

Explicitly Out Of Scope

  • No consumer-subject lookup
  • No consumer PII collection
  • No opt-out or deletion request submission
  • No login-gated or private-page crawling

Local Commands

npm install
npm test
npm run build
npx --yes apify-cli@1.6.2 validate-schema

Run a small local sample:

npm run start:dev

Use the Apify input to set mode, maxBrokers, includePageScan, and includeDomains for validation runs.

includeDomains accepts bare domains or HTTP(S) URLs. It matches exact multi-label domains and their subdomains, so example.com also includes www.example.com. A filter list that contains only invalid entries matches no brokers.

registry_snapshot emits normalized registry rows only; evidence_pack emits broker evidence packs.
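The includeDomains matching rule can be sketched as follows. This is a minimal illustration, not the Actor's implementation: normalizeDomain and matchesIncludeDomains are hypothetical names, and treating an empty filter list as "include everything" is an assumption the README does not state.

```typescript
// Reduce a bare domain or HTTP(S) URL to a lowercase hostname, or null if invalid.
function normalizeDomain(value: string): string | null {
  const trimmed = value.trim().toLowerCase();
  // Drop an http:// or https:// prefix; reject any other leading scheme.
  const withoutScheme = trimmed.replace(/^https?:\/\//, "");
  if (/^[a-z][a-z0-9+.-]*:\/\//.test(withoutScheme)) return null;
  // Keep only the host: cut at the first path, query, fragment, or port delimiter.
  const host = withoutScheme.split(/[\/?#:]/)[0] ?? "";
  // Require at least two labels made of letters, digits, or inner hyphens.
  const labels = host.split(".");
  const validLabel = /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/;
  if (labels.length < 2 || !labels.every((label) => validLabel.test(label))) {
    return null;
  }
  return host;
}

function matchesIncludeDomains(brokerUrl: string, includeDomains: string[]): boolean {
  // Assumption: with no filter configured, every broker is included.
  if (includeDomains.length === 0) return true;
  const filters = includeDomains
    .map((entry) => normalizeDomain(entry))
    .filter((domain): domain is string => domain !== null);
  // A filter list with only invalid entries matches no brokers.
  if (filters.length === 0) return false;
  const host = normalizeDomain(brokerUrl);
  if (host === null) return false;
  // Match the exact domain or any subdomain of it.
  return filters.some((domain) => host === domain || host.endsWith("." + domain));
}
```

The subdomain check compares against "." plus the filter domain, so example.com includes www.example.com but not notexample.com.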

Diff Monitoring

diff_since_last_run compares the current evidence packs with the latest saved evidence-pack snapshot, stores the aggregate counts in CHANGE_DIFF_SUMMARY, and emits change-diff dataset rows only for added, removed, and changed brokers. Unchanged brokers are counted in the summary but are not emitted as per-broker diff rows.
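The comparison described above can be sketched as a keyed diff between two snapshots. The snapshot shape (a broker key mapped to a content hash), the row fields, and diffSnapshots are assumptions made for illustration; the Actor's real snapshot format is not documented here.

```typescript
// Hypothetical snapshot shape: broker key -> content hash of its evidence pack.
type Snapshot = Record<string, string>;

interface DiffRow {
  brokerKey: string;
  change: "added" | "removed" | "changed";
}

interface DiffSummary {
  added: number;
  removed: number;
  changed: number;
  unchanged: number;
}

function diffSnapshots(
  previous: Snapshot,
  current: Snapshot,
): { rows: DiffRow[]; summary: DiffSummary } {
  const rows: DiffRow[] = [];
  const summary: DiffSummary = { added: 0, removed: 0, changed: 0, unchanged: 0 };

  for (const brokerKey of Object.keys(current)) {
    const before = previous[brokerKey];
    if (before === undefined) {
      rows.push({ brokerKey, change: "added" });
      summary.added += 1;
    } else if (before !== current[brokerKey]) {
      rows.push({ brokerKey, change: "changed" });
      summary.changed += 1;
    } else {
      // Unchanged brokers are counted but produce no per-broker row.
      summary.unchanged += 1;
    }
  }

  for (const brokerKey of Object.keys(previous)) {
    if (current[brokerKey] === undefined) {
      rows.push({ brokerKey, change: "removed" });
      summary.removed += 1;
    }
  }

  return { rows, summary };
}
```

In this sketch the summary object plays the role of CHANGE_DIFF_SUMMARY and the rows array corresponds to the change-diff dataset rows.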

PPE Events

Configured validation pricing:

  • registry-row-normalized: registry snapshot event, $0.01
  • broker-evidence-pack: primary evidence event, $0.25
  • change-diff: diff monitoring event, $0.08

The Actor charges each event (registry-row-normalized, the non-configurable broker-evidence-pack, and change-diff) before emitting the matching dataset row. For evidence-pack runs, the primary event is charged before page scans start, so Apify can enforce the user's PPE max-charge limit without the Actor running uncharged broker scans. Snapshots and run summaries are written only for emitted items.
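The charge-before-scan ordering can be illustrated with a small budget loop. This is a sketch under stated assumptions: chargeThenScan, maxChargeCents, and the scan callback are hypothetical, and the budget is tracked locally in integer cents, whereas the real Actor relies on Apify's pay-per-event charging to enforce the limit.

```typescript
// broker-evidence-pack price from the list above: $0.25, kept in integer cents.
const EVIDENCE_PACK_PRICE_CENTS = 25;

interface ChargePlan {
  scanned: string[];
  skipped: string[];
  chargedCents: number;
}

function chargeThenScan(
  brokers: string[],
  maxChargeCents: number,
  scan: (broker: string) => void,
): ChargePlan {
  const plan: ChargePlan = { scanned: [], skipped: [], chargedCents: 0 };
  for (const broker of brokers) {
    // Stop spending once the next charge would exceed the user's limit.
    if (plan.chargedCents + EVIDENCE_PACK_PRICE_CENTS > maxChargeCents) {
      plan.skipped.push(broker);
      continue;
    }
    // Charge first, then scan, so no scan work happens without a charge.
    plan.chargedCents += EVIDENCE_PACK_PRICE_CENTS;
    scan(broker);
    plan.scanned.push(broker);
  }
  return plan;
}
```

With a limit of 60 cents and three brokers, this loop charges and scans two brokers (50 cents) and skips the third without fetching any of its pages.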

Raw contact email values are not emitted in public dataset rows. The Actor keeps signals.hasPrivacyEmail as a presence signal for triage.
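A minimal sketch of that presence-only behavior is shown below. The toPublicSignals helper and its email pattern are hypothetical; only the signals.hasPrivacyEmail field name comes from this README, and the Actor may detect privacy contacts differently.

```typescript
interface PublicRow {
  signals: { hasPrivacyEmail: boolean };
}

function toPublicSignals(pageText: string): PublicRow {
  // Deliberately loose email pattern: it only needs to detect presence.
  const emailPattern = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i;
  // Only the boolean leaves this function; the matched address is never stored.
  return { signals: { hasPrivacyEmail: emailPattern.test(pageText) } };
}
```

Because the function returns only a boolean, a serialized row cannot leak the address that triggered the signal.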