Privacy Compliance Analyzer V.1
Pricing
from $30.00 / 1,000 results
Educational tool that analyzes website privacy compliance across 8 global frameworks (GDPR, CCPA/CPRA, LGPD, PIPEDA, APPI, PIPA, PDPB, ePrivacy). It detects dark patterns, analyzes cookies, validates security headers, and generates detailed reports with actionable recommendations. Not legal advice.
Developer: ANIRBAN ROY
🔒 Privacy Compliance Scanner
Multi-framework privacy compliance analyzer for GDPR, CCPA/CPRA, LGPD, PIPEDA, and 4 more global regulations
Identify privacy compliance gaps, detect 100+ tracking technologies, analyze dark patterns, and generate professional reports before regulators find issues that could cost up to €20M or 4% of global annual turnover (GDPR) or up to $7,500 per intentional violation (CCPA).
📋 Table of Contents
- What This Actor Does
- Key Features
- Quick Start
- Input Configuration
- Output & Reports
- Use Cases
- Pricing
- Technical Details
- Limitations
- FAQ
- Acknowledgments
- Legal Disclaimer
🎯 What This Actor Does
Analyzes website privacy compliance across 8 global frameworks by:
- 📄 Scanning privacy policies - GDPR, CCPA/CPRA, LGPD, PIPEDA, APPI, PIPA, PDPB, ePrivacy
- 🔍 Detecting 100+ privacy technologies - Analytics, ad pixels, session recording, consent tools, data brokers
- 🚨 Identifying dark patterns - Pre-ticked boxes, complex rejection, visual manipulation
- 🔐 Validating security - SSL/TLS, security headers, cookie flags, DNS records
- 📖 Assessing readability - GDPR Article 12 "plain language" requirement
- 📊 Benchmarking - Compare against 9 industry standards with actionable recommendations
Output: Professional HTML reports, CSV exports, and JSON data with compliance scores, risk levels, and prioritized action items.
✨ Key Features
🌍 Multi-Framework Compliance (8 Global Regulations)
| Region | Framework | Key Requirements Checked |
|---|---|---|
| 🇪🇺 EU | GDPR | Lawful basis, data subject rights, DPO contact, data retention |
| 🇺🇸 California | CCPA/CPRA | "Do Not Sell/Share" links, consumer rights, sensitive data |
| 🇧🇷 Brazil | LGPD | Data controller identification, legal basis, subject rights |
| 🇨🇦 Canada | PIPEDA | Consent mechanisms, access rights, privacy officer contact |
| 🇯🇵 Japan | APPI | Purpose of use, third-party provision, disclosure |
| 🇰🇷 South Korea | PIPA | Consent, collection/use, third-party provision |
| 🇮🇳 India | PDPB | Consent mechanism, data principal rights, fiduciary |
| 🇪🇺 EU | ePrivacy | Cookie consent, IAB TCF, third-party scripts |
🔍 Privacy Technology Detection
Automatically detects and risk-assesses privacy technologies:
📊 Analytics & Product Analytics
🎥 Session Recording & Heatmaps 🚨 CRITICAL RISK
📢 Advertising, Pixels & Retargeting ⚠️ HIGH RISK
🍪 Consent & Cookie Management ✅ GOOD
📦 Customer Data Platforms (CDP) ⚠️ HIGH RISK
🏷️ Tag Management Systems
📧 Marketing Automation & Email
💬 CRM, Support & Live Chat
🔬 A/B Testing & Optimization
📊 Data Brokers & Third-Party Data 🚨 CRITICAL RISK
🎯 Affiliate & Attribution
💳 Payment & Checkout Tracking
🔍 Fingerprinting & Device Detection ⚠️ HIGH RISK
Risk Assessment:
- 🚨 CRITICAL - Session recording, data brokers (complete user behavior capture)
- ⚠️ HIGH - Advertising pixels, CDPs, fingerprinting (cross-site tracking)
- 🔵 MEDIUM - Analytics, CRMs (behavior tracking)
- ✅ LOW - Privacy-respectful tools (Plausible, Fathom, Matomo)
- 🟢 GOOD - Consent management (reduces risk)
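The risk tiers above can be read as a lookup from technology category to risk level. The sketch below is only an illustration: the category keys, the `MEDIUM` default for unknown categories, and the "worst level wins" aggregation are all assumptions, and the actor's real aggregation is not documented here.

```python
# Hypothetical category keys; the levels mirror the risk list above.
CATEGORY_RISK = {
    "session_recording": "CRITICAL",
    "data_broker": "CRITICAL",
    "advertising": "HIGH",
    "cdp": "HIGH",
    "fingerprinting": "HIGH",
    "analytics": "MEDIUM",
    "crm": "MEDIUM",
    "privacy_friendly_analytics": "LOW",
    "consent_management": "GOOD",
}

# Ordering used to pick the worst level; GOOD ranks lowest because it reduces risk.
SEVERITY = ["GOOD", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def overall_risk(categories):
    """Return the most severe risk level among the detected categories."""
    # Assumption: categories missing from the table are treated as MEDIUM.
    levels = [CATEGORY_RISK.get(c, "MEDIUM") for c in categories]
    if not levels:
        return "LOW"
    return max(levels, key=SEVERITY.index)
```

For example, `overall_risk(["analytics", "advertising"])` returns `"HIGH"` under these assumptions.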
🚨 Dark Pattern Detection
Identifies 8 types of deceptive consent patterns that can conflict with GDPR consent rules (Articles 4(11) and 7):
| Dark Pattern | Score Penalty | GDPR Reference |
|---|---|---|
| ❌ Pre-ticked consent boxes | -15 pts | Recital 32 - Pre-ticked boxes are not valid consent |
| ❌ No easy reject option | -20 pts | Article 7(3) - Withdrawal must be easy |
| ❌ Complex rejection (3+ clicks) | -15 pts | Article 7(3) - As easy to withdraw as to give |
| ❌ Missing granular consent | -10 pts | Recital 43 - Separate consent per purpose |
| ❌ Visual manipulation | -10 pts | Article 4(11) - Freely given |
| ❌ Forced consent language | -15 pts | Article 7(4) - Not conditional |
| ❌ No "Reject All" button | -10 pts | Article 7(3) - Easy withdrawal |
| ❌ Not mobile-responsive | -5 pts | Accessibility issue |
Example violations:
- "You must accept cookies to continue" → Forced consent
- Large green "Accept All" vs tiny gray "Reject" → Visual manipulation
- "Accept" on page 1, "Reject" hidden in settings → Complex rejection
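To make the point penalties concrete, here is a minimal sketch of how they could be applied to a consent-banner score. The pattern keys, the starting value of 100, and the floor at 0 are assumptions for illustration; only the point values come from the table above.

```python
# Point values from the dark pattern table; the keys are hypothetical names.
PENALTIES = {
    "pre_ticked_boxes": 15,
    "no_easy_reject": 20,
    "complex_rejection": 15,
    "missing_granular_consent": 10,
    "visual_manipulation": 10,
    "forced_consent_language": 15,
    "no_reject_all_button": 10,
    "not_mobile_responsive": 5,
}


def consent_score(detected, base=100):
    """Subtract each detected pattern's penalty once, never going below 0."""
    total = sum(PENALTIES[p] for p in set(detected))
    return max(0, base - total)
```

A banner with pre-ticked boxes and visual manipulation would lose 25 points in this sketch, so `consent_score(["pre_ticked_boxes", "visual_manipulation"])` gives 75.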
🔐 Technical Security Analysis
SSL/TLS Certificate Check
- ✅ Certificate validity & expiration
- ✅ TLS version (1.2/1.3 required)
- ✅ Trusted CA verification
- ✅ Domain matching
- ⚠️ Self-signed detection
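As an illustration of the expiration check, the sketch below computes the days left on a certificate from the `notAfter` field of the dictionary that Python's `ssl.SSLSocket.getpeercert()` returns. The function name and the idea of passing the current time explicitly are assumptions; the actor itself runs on Node.js and may do this differently.

```python
import ssl


def days_until_expiry(cert, now_epoch):
    """Return whole days until the certificate expires (negative if expired).

    `cert` is a dict shaped like the result of ssl.SSLSocket.getpeercert(),
    e.g. {"notAfter": "Jan  5 09:34:43 2018 GMT"}.
    """
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    return int((expires - now_epoch) // 86400)
```

In a live check, `now_epoch` would typically be `time.time()`, and the certificate would come from a socket wrapped with `ssl.create_default_context()`, which also enforces trusted-CA and hostname checks.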
Security Headers
- HSTS - Force HTTPS, prevent downgrade attacks
- CSP - Content Security Policy, XSS protection
- X-Frame-Options - Clickjacking protection
- X-Content-Type-Options - MIME sniffing protection
- Referrer-Policy - Referrer leakage prevention
- Permissions-Policy - Feature access control
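A header audit of this kind boils down to a case-insensitive presence check. The following is a rough sketch, assuming the response headers are already available as a plain dictionary; the function name is hypothetical.

```python
# Standard HTTP header names for the six checks listed above.
REQUIRED_HEADERS = [
    "Strict-Transport-Security",  # HSTS
    "Content-Security-Policy",    # CSP
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Permissions-Policy",
]


def missing_security_headers(headers):
    """Return the expected headers that are absent (names compared case-insensitively)."""
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]
```

This only tests presence. Checking header values (for example, a weak CSP) would need additional rules.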
Cookie Security
- HttpOnly - Prevents JavaScript access (XSS protection)
- Secure - HTTPS-only transmission
- SameSite - CSRF protection (Strict/Lax/None)
- Expiry - Detects excessive retention (>2 years)
- Third-party tracking - Cross-domain cookies
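The cookie flag checks can be sketched with Python's standard `http.cookies` parser. This is an assumption-laden illustration: the function name and issue strings are made up, and the retention check only looks at `Max-Age`, not `Expires`.

```python
from http.cookies import SimpleCookie

TWO_YEARS_SECONDS = 2 * 365 * 24 * 60 * 60


def audit_set_cookie(header_value):
    """Map each cookie name in a Set-Cookie value to a list of flag issues."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    findings = {}
    for name, morsel in cookie.items():
        issues = []
        if not morsel["httponly"]:
            issues.append("missing HttpOnly")
        if not morsel["secure"]:
            issues.append("missing Secure")
        if not morsel["samesite"]:
            issues.append("missing SameSite")
        max_age = morsel["max-age"]
        # Flag retention longer than the 2-year threshold mentioned above.
        if max_age.isdigit() and int(max_age) > TWO_YEARS_SECONDS:
            issues.append("retention over 2 years")
        findings[name] = issues
    return findings
```

A well-configured session cookie such as `sid=abc; HttpOnly; Secure; SameSite=Lax; Max-Age=3600` produces an empty issue list.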
DNS & Email Security (Optional)
- SPF - Email spoofing prevention
- DMARC - Phishing protection
- MX Records - Email provider detection
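Once the TXT records have been fetched, the SPF and DMARC checks are simple string inspections. The sketch below assumes the caller has already looked up the TXT records for the domain and for `_dmarc.<domain>`; the function name and return shape are hypothetical.

```python
def summarize_email_security(txt_records, dmarc_records):
    """Report whether SPF and DMARC records exist and which DMARC policy is set."""
    spf = next((r for r in txt_records if r.lower().startswith("v=spf1")), None)
    dmarc = next((r for r in dmarc_records if r.lower().startswith("v=dmarc1")), None)
    policy = None
    if dmarc:
        # DMARC records are semicolon-separated tag=value pairs; "p" is the policy.
        for tag in dmarc.split(";"):
            key, _, value = tag.strip().partition("=")
            if key.lower() == "p":
                policy = value.strip().lower()
    return {
        "has_spf": spf is not None,
        "has_dmarc": dmarc is not None,
        "dmarc_policy": policy,
    }
```

A `dmarc_policy` of `none` means the domain only monitors spoofing, while `quarantine` or `reject` means it acts on it.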
WHOIS Lookup (Optional)
- Domain registrant (potential data controller)
- Domain age & trust signals
- Privacy protection status
📖 Readability Analysis
Checks policy text against the GDPR Article 12 "plain language" requirement. GDPR sets no numeric thresholds; the targets below are this tool's heuristics:
| Metric | Description | Target |
|---|---|---|
| Flesch Score | 0-100 (higher = easier) | ≥50 |
| Grade Level | 5th grade to College | ≤10th grade |
| Avg Sentence Length | Words per sentence | ≤20 words |
| Complex Words | 13+ characters | <15% of total |
| Reading Time | Minutes to read | <15 minutes |
| Word Count | Total words | <3,500 optimal |
GDPR Article 12(1): "The controller shall take appropriate measures to provide any information... in a concise, transparent, intelligible and easily accessible form, using clear and plain language."
Example:
- ❌ "We may utilize your personal data for the purposes of effectuating transactional operations" (Grade: College)
- ✅ "We use your information to process orders" (Grade: 6th)
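For readers curious how a Flesch score is derived, the sketch below applies the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words). The syllable counter is a crude vowel-group heuristic chosen for brevity, so absolute scores will differ from the actor's; only the relative ordering is meaningful here.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of vowels (a rough heuristic, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Compute Flesch Reading Ease; higher values mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )
```

Run on the two example sentences above, the plain-language version scores clearly higher than the legalistic one, which is the behavior the readability check relies on.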
🏢 Industry Benchmarking
Compare against 9 industry standards:
| Industry | Avg Score | Top 10% | Top 25% | Common Issues |
|---|---|---|---|---|
| 🛒 E-commerce | 68/100 | 85+ | 78+ | Missing CCPA opt-out, weak cookie disclosure |
| 💻 SaaS | 72/100 | 90+ | 82+ | Missing data retention, no DPO contact |
| 🏥 Healthcare | 78/100 | 92+ | 86+ | Missing HIPAA details, weak breach notification |
| 💰 Finance | 75/100 | 91+ | 84+ | Missing financial privacy notice, weak security |
| 🎓 Education | 70/100 | 87+ | 80+ | Missing FERPA details, no parental consent |
| 📰 Media | 65/100 | 83+ | 76+ | Excessive ad tracking, weak cookie consent |
| ✈️ Travel | 66/100 | 83+ | 76+ | Missing cancellation policy, weak payment security |
| 🎮 Gaming | 64/100 | 80+ | 73+ | Missing child safety, weak in-game purchase disclosure |
| 🌐 General | 67/100 | 80+ | 73+ | Generic issues vary by site |
Output includes:
- Your score vs industry average
- Percentile ranking (e.g., "Top 15%")
- Gap to top 10% threshold
- Performance rating (Excellent/Above Avg/Average/Below Avg/Poor)
- Actionable recommendation
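The benchmark comparison is straightforward arithmetic against the table. Below is a minimal sketch using three rows from it; the tier labels and function name are assumptions, and the actor's own five-level performance rating may use different cut-offs.

```python
# Values copied from the benchmark table (subset shown for brevity).
BENCHMARKS = {
    "ecommerce": {"avg": 68, "top10": 85, "top25": 78},
    "saas": {"avg": 72, "top10": 90, "top25": 82},
    "general": {"avg": 67, "top10": 80, "top25": 73},
}


def compare_to_industry(score, industry="general"):
    """Compare a compliance score with the industry average and top thresholds."""
    b = BENCHMARKS[industry]
    if score >= b["top10"]:
        tier = "Top 10%"
    elif score >= b["top25"]:
        tier = "Top 25%"
    elif score >= b["avg"]:
        tier = "Above average"
    else:
        tier = "Below average"
    return {
        "vs_average": score - b["avg"],
        "gap_to_top10": max(0, b["top10"] - score),
        "tier": tier,
    }
```

An e-commerce site scoring 62 sits 6 points below the average and 23 points short of the top 10% threshold.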
🤖 Optional AI-Powered Analysis
Rule-Based (Default): ✅ Fast, reliable, no external dependencies, FREE
AI-Powered (Optional): 🧠 Deep policy interpretation, nuanced analysis
| Provider | Model | Best For | Cost/Analysis | API Key |
|---|---|---|---|---|
| OpenAI | GPT-4o-mini | General accuracy | ~$0.01-0.03 | Get Key |
| Anthropic | Claude 3.5 Sonnet | Legal text | ~$0.01-0.02 | Get Key |
| xAI | Grok Beta | Fast & cheap | ~$0.005-0.015 | Get Key |
Requirements:
- Your own API key (not provided)
- Explicit `aiDataProcessingConsent: true` checkbox
- PUBLIC websites only (privacy policy content sent to AI provider)
⚠️ Never use AI for: Internal sites, staging environments, confidential data, customer info, medical/financial records
🚀 Quick Start
1️⃣ Basic Scan (No AI)
{
  "startUrls": [{"url": "https://yourwebsite.com"}],
  "targetRegion": "EU",
  "industryType": "ecommerce",
  "enableTechnologyDetection": true,
  "performSSLCheck": true,
  "exportFormats": ["html", "csv", "json"],
  "acceptedTerms": true,
  "legalAcknowledgment": true
}
2️⃣ Bulk Competitor Analysis
{
  "startUrls": [
    {"url": "https://competitor1.com"},
    {"url": "https://competitor2.com"},
    {"url": "https://competitor3.com"},
    {"url": "https://competitor4.com"},
    {"url": "https://competitor5.com"}
  ],
  "targetRegion": "CA",
  "industryType": "saas",
  "concurrency": 3,
  "exportFormats": ["csv", "json"],
  "acceptedTerms": true,
  "legalAcknowledgment": true
}
3️⃣ Deep AI-Powered Analysis
{
  "startUrls": [{"url": "https://yourwebsite.com"}],
  "aiProvider": "openai",
  "aiApiKey": "sk-proj-...",
  "aiDataProcessingConsent": true,
  "targetRegion": "EU",
  "enableTechnicalAnalysis": true,
  "performSSLCheck": true,
  "performWhoisLookup": true,
  "performDNSAnalysis": true,
  "exportFormats": ["html", "json"],
  "acceptedTerms": true,
  "legalAcknowledgment": true
}
4️⃣ With Notifications (Discord/Slack/Email)
{
  "startUrls": [{"url": "https://yourwebsite.com"}],
  "enableNotifications": true,
  "notificationType": "discord",
  "discordWebhookUrl": "https://discord.com/api/webhooks/...",
  "acceptedTerms": true,
  "legalAcknowledgment": true
}
⚙️ Input Configuration
Required Fields
| Field | Type | Description | Example |
|---|---|---|---|
| `startUrls` | Array | URLs to scan (1-100) | `[{"url": "https://example.com"}]` |
| `acceptedTerms` | Boolean | Accept Terms of Use | `true` |
| `legalAcknowledgment` | Boolean | Acknowledge limitations | `true` |
Target Configuration
| Field | Type | Default | Options | Description |
|---|---|---|---|---|
| `targetRegion` | String | `"EU"` | EU, CA, BR, JP, KR, IN, US, Global | Primary compliance region |
| `industryType` | String | `"general"` | ecommerce, saas, healthcare, finance, education, media, travel, gaming, general | Industry for benchmarking |
| `domainEnforcementLevel` | String | `"strict"` | none, minimal, moderate, strict | Domain restriction level |
Domain Enforcement Levels:
- `"none"` - Scan any domain (you assume all responsibility)
- `"minimal"` - Block social media only (Facebook, LinkedIn, etc.)
- `"moderate"` - Block social + business intel (Apollo.io)
- `"strict"` - Block all prohibited domains (default, safest)
AI Configuration (Optional)
| Field | Type | Default | Description |
|---|---|---|---|
| `aiProvider` | String | `"none"` | `"openai"`, `"claude"`, `"grok"`, or `"none"` |
| `aiApiKey` | String | `null` | Your AI provider API key |
| `aiDataProcessingConsent` | Boolean | `false` | Explicit consent to send data to AI |
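Before starting a run, it can help to check an input object against the required fields and the AI consent rule described above. The sketch below is a hypothetical client-side pre-check, not the actor's own validation; the function name and error messages are made up.

```python
def validate_input(cfg):
    """Return a list of problems found in an actor input dict (empty if none)."""
    errors = []
    urls = cfg.get("startUrls") or []
    if not 1 <= len(urls) <= 100:
        errors.append("startUrls must contain 1-100 entries")
    for flag in ("acceptedTerms", "legalAcknowledgment"):
        if cfg.get(flag) is not True:
            errors.append(f"{flag} must be true")
    provider = cfg.get("aiProvider", "none")
    if provider != "none":
        # AI analysis needs the user's own key plus explicit consent.
        if not cfg.get("aiApiKey"):
            errors.append("aiApiKey is required when aiProvider is set")
        if cfg.get("aiDataProcessingConsent") is not True:
            errors.append("aiDataProcessingConsent must be true when aiProvider is set")
    return errors
```

An empty list means the input passes these basic checks; the actor may still reject it for other reasons, such as prohibited domains.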
Feature Toggles
| Field | Type | Default | Description |
|---|---|---|---|
| `enableTechnologyDetection` | Boolean | `true` | Detect 100+ privacy technologies |
| `enableTechnicalAnalysis` | Boolean | `true` | SSL, WHOIS, DNS checks |
| `performSSLCheck` | Boolean | `true` | SSL/TLS certificate validation |
| `performWhoisLookup` | Boolean | `true` | Domain registrant lookup |
| `performDNSAnalysis` | Boolean | `false` | SPF, DMARC, MX records |
| `lowMemoryMode` | Boolean | `false` | Reduce memory usage (slower) |
Performance Settings
| Field | Type | Default | Range | Description |
|---|---|---|---|---|
| `concurrency` | Number | 2 | 1-10 | Parallel scans (memory dependent) |
| `maxPages` | Number | 10 | 1-100 | Max pages to crawl per domain |
Export Options
| Field | Type | Default | Options | Description |
|---|---|---|---|---|
| `exportFormats` | Array | `["html", "csv", "json"]` | html, csv, json | Output formats |
Notifications (Optional)
| Field | Type | Description |
|---|---|---|
| `enableNotifications` | Boolean | Enable scan completion notifications |
| `notificationType` | String | `"discord"`, `"telegram"`, `"slack"`, `"email"`, or `"none"` |
Discord:
{
  "notificationType": "discord",
  "discordWebhookUrl": "https://discord.com/api/webhooks/..."
}
Telegram:
{
  "notificationType": "telegram",
  "telegramBotToken": "123456:ABC-DEF...",
  "telegramChatId": "123456789"
}
Slack:
{
  "notificationType": "slack",
  "slackWebhookUrl": "https://hooks.slack.com/services/..."
}
Email (SendGrid):
{
  "notificationType": "email",
  "emailRecipient": "you@company.com",
  "emailSendGridApiKey": "SG.xxx",
  "emailFromAddress": "scanner@yourcompany.com"
}
📤 Output & Reports
Output Files (Key-Value Store)
| File | Format | Description | Size |
|---|---|---|---|
| `OUTPUT` | HTML | Visual report | ~50-200 KB |
| `OUTPUT_CSV` | CSV | Spreadsheet-friendly data | ~5-20 KB |
| `RESULTS` | JSON | Complete machine-readable data | ~10-50 KB |
| `SUMMARY` | JSON | High-level statistics | ~1 KB |
HTML Report Contents
📊 Executive Summary
- Average compliance score
- Risk distribution (Critical/High/Medium/Low)
- Total websites scanned
- Analysis method (Rule-based or AI)
📋 Results Table
- Domain, Industry, Score, Risk Level
- Technology count & risk level
- Framework compliance (GDPR, CCPA, etc.)
- Security score
🔍 Detailed Analysis (Per Domain)
- Compliance Summary: Score breakdown, confidence level, framework status
- Technology Stack: Detected tools by category, privacy risk assessment
- Privacy-Risky Tools: Categorized by risk level (Critical/High/Medium)
- Dark Patterns: Consent banner issues
- CCPA Compliance: Do Not Sell/Share link status
- Security Analysis: SSL, headers, cookies
- Readability: Flesch score, grade level, GDPR compliance
- Industry Benchmark: Your score vs industry average, percentile
- Quick Wins: Prioritized recommendations (effort vs impact)
⚠️ Legal Disclaimer
CSV Export Columns
Domain,URL,Compliance Score,Risk Level,GDPR Status,CCPA Status,Privacy Policy Found,Security Score,Scanned At
example.com,https://example.com,62,MEDIUM,Compliant,Partial,Yes,45,2025-12-27T14:30:00.000Z
JSON Output Structure
{
  "metadata": {
    "generatedAt": "2025-12-27T14:30:00.000Z",
    "scannerVersion": "1.0.0",
    "totalScanned": 5,
    "summary": {
      "successful": 5,
      "failed": 0,
      "avgCompliance": 68,
      "riskDistribution": {"critical": 0, "high": 1, "medium": 2, "low": 2}
    }
  },
  "results": [
    {
      "domain": "example.com",
      "url": "https://example.com",
      "complianceScore": 62,
      "riskLevel": "MEDIUM",
      "scoreConfidence": "High",
      "frameworks": {
        "GDPR": "Compliant",
        "CCPA": "Partial",
        "CPRA": "Non-Compliant"
      },
      "technologyAnalysis": {
        "total_technologies": 12,
        "overall_privacy_risk": "MEDIUM",
        "critical_privacy_tools": [],
        "high_risk_tools": ["Facebook Pixel", "Google Ads"],
        "medium_risk_tools": ["Google Analytics", "HubSpot"],
        "has_consent_management": true
      },
      "readabilityAnalysis": {
        "score": 52,
        "grade": "Fairly Difficult",
        "gdprCompliant": true,
        "readingTime": 12
      },
      "scoreBreakdown": {
        "components": [
          {"name": "Privacy Policy", "score": 25, "max": 40},
          {"name": "Security", "score": 7, "max": 10}
        ],
        "quickWins": [
          {
            "action": "Add 'Do Not Share' link",
            "pts": "+8",
            "effort": "30min",
            "howTo": "Add next to 'Do Not Sell' link"
          }
        ]
      }
    }
  ]
}
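The JSON structure above is easy to post-process. As a sketch, the snippet below flags domains that score under a threshold or use critical or high-risk tools. The field names come from the sample output; the function name, the threshold of 70, and the trimmed `SAMPLE` string are assumptions for illustration.

```python
import json

# Trimmed copy of the sample output, keeping only the fields used below.
SAMPLE = """{"results": [{"domain": "example.com", "complianceScore": 62,
"riskLevel": "MEDIUM", "technologyAnalysis": {"critical_privacy_tools": [],
"high_risk_tools": ["Facebook Pixel", "Google Ads"]}}]}"""


def risky_domains(results_json, max_score=70):
    """List (domain, score, risky tools) for results needing attention."""
    data = json.loads(results_json)
    flagged = []
    for r in data.get("results", []):
        tech = r.get("technologyAnalysis", {})
        tools = tech.get("critical_privacy_tools", []) + tech.get("high_risk_tools", [])
        if r.get("complianceScore", 0) < max_score or tools:
            flagged.append((r["domain"], r.get("complianceScore"), tools))
    return flagged
```

In practice the input would be the `RESULTS` record downloaded from the run's key-value store rather than an inline string.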
🎯 Use Cases
👨💼 Legal & Compliance Teams
- Pre-audit preparation - Identify issues before formal audit
- Multi-jurisdiction tracking - Monitor compliance across regions
- Vendor assessment - Evaluate third-party privacy practices
- Competitive intelligence - Benchmark against competitors
- M&A due diligence - Privacy assessment for acquisitions
🛡️ Privacy Officers (DPOs)
- Technology inventory - GDPR Article 30 Records of Processing
- Consent validation - Dark pattern detection
- Risk assessment - Prioritize remediation efforts
- Readability audits - GDPR Article 12 compliance
- DPIA support - Data Protection Impact Assessments
👨💻 Web Developers
- Pre-launch checks - Catch issues before going live
- Security validation - SSL, headers, cookies
- Tag audits - Inventory tracking technologies
- Cookie compliance - Validate consent implementation
🏢 Agencies & Consultants
- Client audits - Privacy compliance service offering
- Sales prospecting - Identify leads with compliance gaps
- Upsell opportunities - Show value of privacy improvements
- Competitive analysis - Client vs competitor comparison
🔬 Researchers & Academics
- Privacy trends - Large-scale compliance studies
- Technology adoption - Tracking tool prevalence
- Dark pattern research - Deceptive design patterns
- Industry benchmarking - Sector-specific analysis
💰 Pricing
| Scans | Free Plan | Starter Plan | Scale Plan | Business Plan |
|---|---|---|---|---|
| 10 | $0.50 | $0.40 | $0.35 | $0.30 |
| 100 | $5.00 | $4.00 | $3.50 | $3.00 |
| 1,000 | $50.00 | $40.00 | $35.00 | $30.00 |
What counts as an event?
- ✅ Each URL that completes analysis (with or without privacy policy)
- ❌ Failed scans (bot blocks, timeouts) - NOT charged
- ❌ Prohibited domains (auto-skipped) - NOT charged
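The event fees in the pricing table scale linearly with the number of completed scans, which works out to $0.05, $0.04, $0.035, and $0.03 per scan for the four plans. A small sketch of that arithmetic follows; the plan keys and function name are hypothetical, and platform compute costs are not included.

```python
# Per-scan event fee derived from the pricing table (USD).
PRICE_PER_SCAN = {"free": 0.05, "starter": 0.04, "scale": 0.035, "business": 0.03}


def event_fee(successful_scans, plan="free"):
    """Estimate event fees; failed and auto-skipped scans are not billed."""
    return round(successful_scans * PRICE_PER_SCAN[plan], 2)
```

For instance, 1,000 completed scans on the Business plan come to $30.00 in event fees, matching the last row of the table.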
Platform Compute Costs
Separate from event fees, handled by Apify:
- Compute: ~$0.002-0.005 per scan (covered by the $5 free monthly credits)
- Total: Event fee + compute = ~$0.052-0.055 per scan (Free plan, based on the $0.05 per-scan event fee in the table above)
Cost Examples (event fees only, from the pricing table above)
Small Business (5 websites):
- Free plan: $0.25
- Business plan: $0.15
- vs professional audit: $2,500-10,000 (over 99.9% cheaper)
Agency (50 clients):
- Free plan: $2.50
- Business plan: $1.50
- vs manual audits: $25,000-100,000
Enterprise (1,000 sites/month):
- Free plan: $50.00
- Business plan: $30.00
- vs compliance team: $120,000-240,000/year salary
🛠️ Technical Details
System Requirements
| Tier | Memory | Concurrency | Speed | Features |
|---|---|---|---|---|
| Minimal | 512MB | 1 | 10-20s/site | Basic |
| Recommended | 1GB | 2 | 5-15s/site | Full |
| Optimal | 2GB+ | 3+ | 5-10s/site | Fast + AI |
Performance Benchmarks
| Metric | Value |
|---|---|
| Speed | 5-15 sec/website (rule-based) |
| Speed (AI) | 10-30 sec/website |
| Max URLs | 100 per run |
| Max Concurrency | 10 (memory dependent) |
| Timeout | 30s page load, 30s policy search |
| Memory Usage | 200-500 MB per concurrent scan |
Technology Stack
- Runtime: Node.js 22.x
- Framework: Crawlee 3.x (Apify SDK)
- Browser: Playwright + Chromium (headless)
- Libraries:
- Privacy: Custom analyzers
- AI: OpenAI SDK, Anthropic SDK
- Notifications: Discord/Telegram/Slack/SendGrid
- Charts: Chart.js (HTML reports)
- Output: HTML, CSV, JSON
Architecture
┌─────────────┐
│ Input URLs  │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────┐
│ Validation & Security Checks    │
│ - Domain restrictions           │
│ - SSRF protection               │
│ - Rate limiting                 │
└──────┬──────────────────────────┘
       │
       ▼
┌─────────────────────────────────┐
│ Parallel Crawling (Playwright)  │
│ - Load page                     │
│ - Bot detection check           │
│ - Memory monitoring             │
└──────┬──────────────────────────┘
       │
       ▼
┌─────────────────────────────────┐
│ Privacy Policy Discovery        │
│ - Link detection                │
│ - Common paths (/privacy, etc)  │
│ - Fallback strategies           │
└──────┬──────────────────────────┘
       │
       ▼
┌──────────────────────────────────┐
│ Parallel Analysis                │
│  ┌────────────────────────────┐  │
│  │ Policy Analysis            │  │
│  │ - Rule-based OR AI         │  │
│  │ - Framework compliance     │  │
│  └────────────────────────────┘  │
│  ┌────────────────────────────┐  │
│  │ Technology Detection       │  │
│  │ - 100+ tools               │  │
│  │ - Risk assessment          │  │
│  └────────────────────────────┘  │
│  ┌────────────────────────────┐  │
│  │ Security Analysis          │  │
│  │ - SSL/TLS, headers         │  │
│  │ - WHOIS, DNS (optional)    │  │
│  └────────────────────────────┘  │
│  ┌────────────────────────────┐  │
│  │ Dark Pattern Detection     │  │
│  │ - Consent banner analysis  │  │
│  └────────────────────────────┘  │
└──────┬───────────────────────────┘
       │
       ▼
┌─────────────────────────────────┐
│ Scoring & Risk Assessment       │
│ - Calculate compliance score    │
│ - Determine risk level          │
│ - Generate recommendations      │
└──────┬──────────────────────────┘
       │
       ▼
┌─────────────────────────────────┐
│ Report Generation               │
│ - HTML (visual)                 │
│ - CSV (spreadsheet)             │
│ - JSON (machine-readable)       │
└──────┬──────────────────────────┘
       │
       ▼
┌─────────────────────────────────┐
│ Optional Notifications          │
│ - Discord, Slack, Telegram      │
│ - Email (SendGrid)              │
└─────────────────────────────────┘
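The "SSRF protection" step in the validation stage is not described further. One common approach, sketched here purely as an assumption about what such a check might look like, is to reject non-HTTP schemes, `localhost`, and literal IP addresses in private or otherwise internal ranges.

```python
import ipaddress
from urllib.parse import urlparse


def is_safe_target(url):
    """Return False for URLs that point at obviously internal targets."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname.lower()
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A fuller check would also resolve the hostname
        # and test the resulting addresses; this sketch does not.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved)
```

The hostname branch is the main gap: a public name can still resolve to an internal address, so a production check would validate the resolved IPs as well.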
⚠️ Important Limitations
🎓 EDUCATIONAL TOOL - NOT LEGAL ADVICE
| ❌ What This Tool IS NOT | ✅ What This Tool CAN DO |
|---|---|
| ❌ Legal advice | ✅ Identify potential issues |
| ❌ Compliance certification | ✅ Detect technologies |
| ❌ Attorney-client relationship | ✅ Assess readability |
| ❌ Guarantee of compliance | ✅ Benchmark against industry |
| ❌ Legal determination | ✅ Suggest improvements |
| ❌ Substitute for lawyer | ✅ Generate reports |
Scores are estimates with ±15 point uncertainty:
- High score ≠ Legally compliant
- Low score ≠ Definitely non-compliant
- Professional legal review ALWAYS required
🚫 Prohibited Domains (Apify Terms of Service)
Cannot scan: Social media (YouTube, LinkedIn, Instagram, Facebook, Twitter/X, TikTok), Meta properties, Amazon, Google Search/Maps/Trends, Apollo.io
Can scan: Your websites, client sites (with permission), competitor public pages, business websites, e-commerce sites
Domain Enforcement: Configurable via domainEnforcementLevel (see Input Configuration)
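The four enforcement levels can be pictured as progressively larger blocklists. The sketch below is an assumption about how that might be modeled: the domain sets are illustrative, not the actor's actual lists, and blocking all of `google.com` is a simplification of "Google Search/Maps/Trends".

```python
from urllib.parse import urlparse

# Illustrative domain sets based on the prohibited list above.
SOCIAL = {"facebook.com", "instagram.com", "linkedin.com", "youtube.com",
          "twitter.com", "x.com", "tiktok.com"}
BUSINESS_INTEL = {"apollo.io"}
OTHER_PROHIBITED = {"amazon.com", "google.com"}

BLOCKLISTS = {
    "none": set(),
    "minimal": SOCIAL,
    "moderate": SOCIAL | BUSINESS_INTEL,
    "strict": SOCIAL | BUSINESS_INTEL | OTHER_PROHIBITED,
}


def is_blocked(url, level="strict"):
    """Return True if the URL's host matches a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKLISTS[level])
```

With this model, `https://apollo.io` passes at `"minimal"` but is blocked at `"moderate"` and `"strict"`.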
🤖 AI Analysis Restrictions
⚠️ ONLY use AI for PUBLIC websites
Never scan with AI:
- ❌ Internal/staging sites
- ❌ Pre-launch websites
- ❌ Confidential data
- ❌ Customer information
- ❌ Medical/health records
- ❌ Financial/payment data
- ❌ Employee data
- ❌ Legal privileged documents
Data sent to AI:
- Privacy policy text only (up to 50,000 characters)
- Processed by: OpenAI (USA), Anthropic (USA), or xAI (USA)
- Subject to provider Terms of Service
- May be used for training (check provider policies)
🔍 Technical Limitations
| Limitation | Impact | Workaround |
|---|---|---|
| Bot detection | May be blocked by Cloudflare, Akamai | Use residential proxies, reduce speed |
| JavaScript-heavy sites | May not detect dynamic content | Increase timeout, manual review |
| PDF policies | Cannot analyze PDF text | Download manually, use PDF-to-text |
| Paywalls | Cannot access gated content | Requires authentication (not supported) |
| Rate limiting | May trigger anti-scraping measures | Reduce concurrency, add delays |
| Memory constraints | May fail on low-memory tiers | Upgrade memory, enable lowMemoryMode |
❓ FAQ
General Questions
Q: Is this tool legally binding?
A: No. This is an automated educational tool. Results may be incorrect. Always consult qualified legal counsel before making compliance decisions.
Q: Can I use this for client audits?
A: Yes, as a preliminary assessment tool. Always include disclaimer that professional legal review is required.
Q: How accurate are the scores?
A: Scores have ±15 point uncertainty. They're useful for identifying potential issues and prioritizing remediation, not legal determinations.
Q: Can I scan my competitor's websites?
A: Yes, if they're publicly accessible. Respect their Terms of Service and robots.txt. Only scan public information.
Technical Questions
Q: Why is my scan failing?
A: Common causes:
- Bot detection - Website blocking automated access (Cloudflare, etc.)
- Timeout - Website too slow or unavailable
- No privacy policy - Policy not found (still generates partial report)
- Memory - Insufficient memory (upgrade tier or enable lowMemoryMode)
- Prohibited domain - Social media, Amazon, Google (change domainEnforcementLevel)
Q: How do I get CSV/JSON files?
A: After run completes:
- Go to "Storage" tab
- Click "Key-value store"
- Download `OUTPUT` (HTML), `OUTPUT_CSV` (CSV), `RESULTS` (JSON)
Q: Can I schedule recurring scans?
A: Yes! Use Apify Schedules:
- Save your input configuration
- Go to Actor → Schedules
- Set frequency (daily, weekly, monthly)
- Enable notifications to get alerts
Q: Why is Google/YouTube blocked?
A: Apify Terms of Service prohibit scraping certain platforms. Use domainEnforcementLevel: "none" at your own risk.
AI Questions
Q: Do I need AI?
A: No. Rule-based analysis is fast, reliable, and free. AI provides deeper interpretation but costs $0.01-0.03 per scan.
Q: Which AI provider is best?
A:
- OpenAI GPT-4o-mini: Best general accuracy, $0.01-0.03/scan
- Anthropic Claude 3.5 Sonnet: Best for legal text, $0.01-0.02/scan
- xAI Grok Beta: Fastest/cheapest, $0.005-0.015/scan
Q: Is my data used for AI training?
A: Check your AI provider's policy:
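- OpenAI: API data is not used for training by default (confirm against OpenAI's current API data usage policy)
- Anthropic: API data is not used for training by default (confirm against Anthropic's current terms)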
- xAI: Check xAI terms
Q: Can I use AI for internal sites?
A: No. AI sends content to external providers. Only use for PUBLIC websites.
Pricing Questions
Q: How much does it cost?
A: $0.03-0.05 per successful scan (plan dependent, per the pricing table) + ~$0.002 compute. First 3 scans FREE.
Q: What if a scan fails?
A: Failed scans are NOT charged. You only pay for successful analyses.
Q: Are prohibited domains charged?
A: No. Auto-skipped domains don't count as events.
Q: Is compute cost separate?
A: Yes, but negligible (~$0.002/scan). Included in Apify's $5 free monthly credits.
🙏 Acknowledgments
This actor was built with the following technologies and services:
Core Infrastructure
- Apify Platform - Web scraping & automation infrastructure
- Crawlee - Web crawling framework
- Playwright - Browser automation
AI Providers (Optional)
- OpenAI - GPT-4 for policy analysis
- Anthropic - Claude for legal text interpretation
- xAI - Grok for fast analysis
Libraries & Tools
- Chart.js - Data visualization
- PQueue - Rate limiting
- Papaparse - CSV parsing
- Node.js Built-ins - DNS, HTTPS, child_process
Privacy Frameworks
- GDPR - GDPR.eu
- CCPA/CPRA - California AG Office
- LGPD - LGPD Brazil
- PIPEDA - Privacy Commissioner of Canada
Notification Services
- Discord, Telegram, Slack, SendGrid - Scan completion notifications
Inspiration & Research
- Privacy regulations from EDPB, FTC, CNIL, ICO
- Dark pattern research from Dark Patterns Tip Line, Princeton WebTAP
- Technology detection inspired by Wappalyzer, BuiltWith
Special Thanks:
- Apify Team - For the incredible platform and Challenge 2025
- Privacy community - Researchers, advocates, and legal experts
- Open-source contributors - Making this project possible
📄 License & Legal
License
Apache License 2.0
Copyright 2025 Anirban Roy
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
**Free for commercial and personal use.**

---

## ⚖️ LEGAL DISCLAIMER

**⚠️ CRITICAL - READ BEFORE USE**

By using this tool, you **ACKNOWLEDGE AND AGREE** to the following:

### 1. NOT LEGAL ADVICE

This tool does **NOT** provide legal advice, legal opinions, or legal compliance determinations. Results are **automated estimates** that may be incorrect, incomplete, or misleading.

**NO attorney-client relationship** is created by using this tool.

### 2. SCORES ARE ESTIMATES

Compliance scores are **automated estimates** with significant uncertainty (±10-15 points).

- **High score ≠ Legally compliant**
- **Low score ≠ Definitely non-compliant**
- **Professional legal review ALWAYS required**

### 3. TOOL LIMITATIONS

This automated tool may:

- ❌ Miss critical compliance issues
- ❌ Generate false positives or false negatives
- ❌ Misinterpret privacy policies or legal requirements
- ❌ Fail to detect privacy violations
- ❌ Provide outdated or incorrect information
- ❌ Cannot assess internal business processes

### 4. NO WARRANTY - NO LIABILITY

This tool is provided **"AS-IS"** with **NO warranties** of any kind.

**We are NOT liable for:**

- Regulatory fines or penalties
- Legal costs or attorney fees
- Business losses or damages
- Privacy violations or data breaches
- Any other damages resulting from use of this tool

**Maximum Liability:** Limited to amounts paid (typically $0-1).

### 5. YOU MUST CONSULT QUALIFIED LEGAL COUNSEL

**Before making ANY compliance decisions:**

1. ✅ Hire a licensed privacy attorney
2. ✅ Conduct a professional legal audit
3. ✅ Review with qualified legal counsel
4. ✅ Verify all findings independently

**DO NOT** rely solely on this tool for compliance.

### 6. YOUR RESPONSIBILITIES

By using this tool, **YOU agree to:**

1. Use results for informational/educational purposes ONLY
2. Consult qualified legal counsel for compliance decisions
3. Verify all findings independently
4. Not rely on this tool as legal advice
5. Hold creators harmless from any damages or losses
6. Only scan public websites with legitimate research interest
7. Respect website Terms of Service
8. Comply with all applicable laws and regulations

### 7. AI DATA PROCESSING (If Enabled)

If AI analysis is enabled, website content will be sent to:

- **OpenAI (USA)** - Subject to OpenAI Terms & Privacy Policy
- **Anthropic (USA)** - Subject to Anthropic Terms & Privacy Policy
- **xAI (USA)** - Subject to xAI Terms & Privacy Policy

⚠️ **Only scan PUBLIC websites with AI enabled**

⚠️ **Do NOT scan** internal/staging/confidential sites

### 8. INTENDED USE

**✅ Appropriate uses:**

- Learning about privacy frameworks
- Academic research on privacy compliance
- Preliminary informal assessment
- Identifying topics to discuss with legal counsel
- Technology inventory for internal purposes

**❌ Inappropriate uses:**

- Making compliance decisions without legal review
- Using as evidence of compliance
- Using in legal proceedings
- Scanning websites without legitimate interest
- Violating website Terms of Service
- Scanning confidential or non-public information

---

**BY USING THIS TOOL, YOU ACKNOWLEDGE YOU HAVE READ AND AGREE TO ALL TERMS ABOVE.**

**IF YOU DO NOT AGREE, DO NOT USE THIS TOOL.**

---

## 📞 Support & Contact

**Questions?** Contact via [Apify Platform](https://console.apify.com)

**Bug Reports?** Create an issue on GitHub or Apify platform

**Feature Requests?** We'd love to hear from you!

**Author:** [Anirban Roy](https://github.com/ar48code-dev)

**Version:** 1.0.0

**Built for:** Apify Challenge 2025

**Last Updated:** December 2025

---

**🔒 Privacy Compliance Scanner** - Made with ❤️ for privacy compliance education

*Helping businesses understand privacy compliance, one scan at a time.*