Lovable Sites Scraper — Find & Enrich .lovable.app Apps
Discover every public site built with Lovable.dev (the AI app builder by GPT Engineer). This actor enumerates live `*.lovable.app` subdomains from multiple public sources, then enriches each URL with HTTP metadata — title, description, Open Graph tags, favicon, canonical URL and custom-domain detection — so you can turn the raw list into a searchable, filterable dataset of AI-built apps.
Perfect for lead generation, competitive intelligence, market research on AI-built products, design inspiration and agency prospecting. If you're selling to founders who ship with AI, this is your radar.
Why this actor exists
Lovable.dev is one of the hottest AI app builders on the market — thousands of founders, indie hackers and agencies use it daily to ship full-stack apps in minutes. Every published Lovable project gets a forced subdomain on `*.lovable.app`, and optionally a custom domain on top. That forced subdomain is the reason we can enumerate the entire public surface of Lovable: if someone shipped it and hit Publish, it's discoverable.
But there's no official directory. No search engine. No public API. If you want to know:
- Which agencies are shipping client work on Lovable?
- What SaaS niches are being built with AI right now?
- Who just bought a custom domain (signal: they're serious, have budget)?
- What landing pages are converting in your market?
- Which Lovable sites are dead vs. live vs. placeholder?
…you had to scrape it yourself. Until now.
This actor does the heavy lifting: multi-source subdomain enumeration, concurrent HTTP enrichment, dead-site filtering, custom-domain detection, and keyword search across the whole result set. You get a clean, structured dataset ready to import into your CRM, your BI tool, or your cold-outreach workflow.
What you get — output fields
Each row in the output dataset contains:
| Field | Type | Description |
|---|---|---|
| `subdomain` | string | The full `xxx.lovable.app` hostname |
| `url` | string | Canonical `https://` URL |
| `status` | integer | HTTP status code returned (200, 404, 500…) |
| `isLive` | boolean | `true` if the site responds 200 and is NOT the Lovable placeholder page |
| `isDefault` | boolean | `true` if the response is Lovable's "project not found" / "not deployed yet" page |
| `title` | string | `<title>` tag, unescaped and trimmed to 300 chars |
| `description` | string | `<meta name="description">`, trimmed to 600 chars |
| `ogTitle` | string | `<meta property="og:title">` |
| `ogDescription` | string | `<meta property="og:description">` |
| `ogImage` | string | `<meta property="og:image">` URL — great for thumbnails |
| `ogUrl` | string | `<meta property="og:url">` |
| `canonical` | string | `<link rel="canonical">` href |
| `favicon` | string | `<link rel="icon">` URL (absolute) |
| `customDomain` | string | Detected custom domain (from canonical / og:url hostname) — empty if none |
| `hasCustomDomain` | boolean | `true` if `customDomain` is set — key lead-gen filter |
| `contentLength` | integer | Response body size in bytes |
| `scrapedAt` | string | ISO-8601 UTC timestamp when the row was enriched |
The `hasCustomDomain` field is the money field for lead-gen: a Lovable user who went through the effort of wiring up DNS is far more likely to be a paying, serious customer than someone with a placeholder.
How it works
1) Discovery — multi-source subdomain enumeration
The actor queries up to three public sources in parallel:
- crt.sh — Certificate Transparency logs. Every SSL certificate issued for `*.lovable.app` is logged here. The Lovable platform mostly uses a wildcard cert, so CT coverage is partial, but many projects get their own cert issued.
- hackertarget.com — Free passive DNS / host search. Covers a large chunk of the surface with recent data.
- rapiddns.io — Aggregated subdomain database. Pulls from passive DNS, CT and DNS brute-force.
By default all three are enabled (`sources: ["crtsh", "hackertarget", "rapiddns"]`). You can disable any of them via the input. Combining sources gives the broadest coverage — each individual source has blind spots.
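To make the CT-log leg concrete, here's a minimal standalone sketch of the kind of query crt.sh supports (illustrative only, not this actor's internal code — crt.sh's public JSON output exposes a `name_value` field per certificate):

```python
import requests

# Query crt.sh's public JSON endpoint for certs matching *.lovable.app
# (%25 is the URL-encoded % wildcard).
resp = requests.get("https://crt.sh/?q=%25.lovable.app&output=json", timeout=60)
resp.raise_for_status()

subdomains = set()
for cert in resp.json():
    # name_value may hold several newline-separated hostnames per cert
    for name in cert["name_value"].splitlines():
        name = name.strip().lower()
        if name.endswith(".lovable.app") and not name.startswith("*"):
            subdomains.add(name)

print(f"{len(subdomains)} unique subdomains found in CT logs")
```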
2) Cleaning
Raw results are deduplicated, wildcard entries (`*.lovable.app`) are dropped, and `www.<project>.lovable.app` duplicates of the bare form are collapsed by default (toggle with `includeWww: true`).
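As a sketch, the assumed logic of this cleaning pass looks something like the following (the function name and signature are hypothetical, not the actor's code):

```python
def clean(candidates: set[str], include_www: bool = False) -> set[str]:
    """Dedupe (via set), drop wildcards, optionally collapse www.* duplicates."""
    cleaned = {c for c in candidates if not c.startswith("*")}  # drop *.lovable.app
    if not include_www:
        # drop www.<project>.lovable.app when the bare form is also present
        cleaned = {c for c in cleaned
                   if not (c.startswith("www.") and c[len("www."):] in cleaned)}
    return cleaned
```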
3) Enrichment (optional, on by default)
For each surviving candidate, the actor fires a concurrent HTTPS GET with:
- 10-second timeout
- Redirect following (so we catch sites that `301` to a custom domain)
- Desktop-class User-Agent
- Configurable concurrency (`concurrency: 15` by default, up to 50)
The HTML is parsed with lightweight regex (no Playwright / no heavy browser) — fast and cheap. We extract the title, meta tags, OG tags, canonical URL and favicon. We also detect the Lovable placeholder / "not deployed" page and set `isLive=false` for those, so your dataset isn't polluted with dead projects.
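For illustration, regex-based metadata extraction in this style looks roughly like the sketch below (a simplification, not the actor's actual parser — it assumes the `name`/`property` attribute comes before `content`, which covers most real pages):

```python
import re

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.I | re.S)
# Assumes name/property precedes content; a production parser handles more variants.
META_RE = re.compile(
    r'<meta[^>]+(?:name|property)=["\'](?P<key>[^"\']+)["\'][^>]*'
    r'content=["\'](?P<val>[^"\']*)["\']',
    re.I,
)

def extract_meta(html: str) -> dict:
    meta = {m.group("key").lower(): m.group("val") for m in META_RE.finditer(html)}
    title = TITLE_RE.search(html)
    return {
        "title": (title.group(1).strip()[:300] if title else ""),
        "description": meta.get("description", "")[:600],
        "ogTitle": meta.get("og:title", ""),
        "ogImage": meta.get("og:image", ""),
    }
```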
4) Filtering
Two filters run at the end:
- `onlyLive: true` (default) — drop dead / placeholder sites
- `searchQuery` — case-insensitive match across subdomain + title + description + OG tags + customDomain. Great for niching down: `"crypto"`, `"saas"`, `"real estate"`, `"restaurant"`, `"ai"`.
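Assuming a plain case-insensitive substring match (the exact semantics aren't documented), the filter's shape is roughly:

```python
def matches(row: dict, query: str) -> bool:
    # Case-insensitive substring match across the text-bearing output fields.
    q = query.lower()
    fields = ("subdomain", "title", "description",
              "ogTitle", "ogDescription", "customDomain")
    return any(q in (row.get(f) or "").lower() for f in fields)
```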
5) Charging
The actor bills per result pushed ($0.002 per site), not per candidate discovered. If `onlyLive=true` and 80% of candidates are dead, you only pay for the 20% that landed in your dataset.
Input parameters
| Field | Type | Default | Description |
|---|---|---|---|
| `maxSites` | integer | `100` | Maximum sites to return. Range: 1–5000 |
| `enrichHtml` | boolean | `true` | Fetch each site and extract metadata. Turn off for pure subdomain enumeration (cheapest) |
| `onlyLive` | boolean | `true` | Skip 4xx/5xx/timeouts/placeholder pages |
| `includeWww` | boolean | `false` | Keep `www.*` variants even when the bare form is also found |
| `searchQuery` | string | `""` | Keyword filter (case-insensitive) across subdomain + title + description + OG + customDomain |
| `concurrency` | integer | `15` | Parallel HTTP requests during enrichment (1–50) |
| `sources` | array | all 3 | Any combination of `crtsh`, `hackertarget`, `rapiddns` |
Sample inputs
Lead-gen: find Lovable sites with custom domains (filter client-side):
{"maxSites": 2000,"enrichHtml": true,"onlyLive": true,"concurrency": 20}
Run → filter dataset where hasCustomDomain = true → that's your qualified lead list.
Niche research: find crypto apps built on Lovable:
{"maxSites": 500,"searchQuery": "crypto","enrichHtml": true,"onlyLive": true}
Pure enumeration (fastest / cheapest):
{"maxSites": 5000,"enrichHtml": false}
No HTTP enrichment — you get a clean list of subdomains in seconds. Good for feeding into your own downstream pipeline.
Competitive intel on a specific vertical:
{"maxSites": 1000,"searchQuery": "saas","sources": ["crtsh", "hackertarget", "rapiddns"],"onlyLive": true,"concurrency": 25}
Use cases
1. Lead generation for AI dev agencies
You build custom apps for clients. Your ICP is founders who've already validated with a no-code AI tool but hit the ceiling. Scrape Lovable → filter `hasCustomDomain=true` → enrich with the custom-domain owner (Clearbit / Apollo) → send a personalized cold email: "Saw you shipped acme.com on Lovable — once you need real auth + Stripe + multi-tenant, here's what we do."
Conversion rates on this kind of cold outreach are typically 3–8× higher than generic lists because you're qualifying on intent + budget + tech stack all at once.
2. Competitive intelligence for SaaS founders
You're building a SaaS. You want to know what's being shipped in your space right now — not six months ago when Crunchbase got around to indexing it. Filter `searchQuery` by your vertical keyword, inspect titles + descriptions, collect patterns. Which pain points are recurring? Who's pricing what? Which are gaining traction (check the custom domain → check DNS age → check backlinks)?
3. Market research on AI-built products
Investors, analysts, journalists: Lovable is one of the primary funnels where AI-generated software becomes production software. Enumerating this surface gives you a weekly pulse on what's being built with AI, what verticals are hot, what's dying in the graveyard of placeholder pages.
4. Design and UX inspiration
Need examples of how AI builders design their landing pages? Filter by vertical, pull the `ogImage` field, build a mood board of 500 AI-generated homepages in an afternoon.
5. Agency prospecting for Lovable itself
If you work at Lovable or at a tool in its ecosystem (Vercel, Supabase, Clerk, Stripe), this is your list of current users. Segment by custom domain vs. placeholder, prioritize the serious ones, reach out with case studies and integration guides.
6. Historical tracking / weekly deltas
Schedule the actor to run weekly. Diff the datasets. What got published this week? What died? What graduated from placeholder to custom domain? That's a trend chart nobody else has.
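A hedged sketch of that weekly diff, assuming two dataset exports saved as JSON files (the file names are hypothetical):

```python
import json

# Load two weekly exports of this actor's dataset, keyed by subdomain.
last_week = {r["subdomain"]: r for r in json.load(open("run-week-1.json"))}
this_week = {r["subdomain"]: r for r in json.load(open("run-week-2.json"))}

new = this_week.keys() - last_week.keys()   # published since the last run
gone = last_week.keys() - this_week.keys()  # died / unpublished
graduated = [s for s in this_week.keys() & last_week.keys()
             if this_week[s]["hasCustomDomain"]
             and not last_week[s]["hasCustomDomain"]]  # placeholder -> custom domain

print(f"new: {len(new)}, died: {len(gone)}, graduated: {len(graduated)}")
```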
Code examples
Python — Apify client
```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

run = client.actor("makework36/lovable-sites-scraper").call(run_input={
    "maxSites": 500,
    "enrichHtml": True,
    "onlyLive": True,
    "searchQuery": "saas",
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["subdomain"], "→", item["title"], "| custom:", item.get("customDomain") or "-")
```
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: '<YOUR_APIFY_TOKEN>' });

const run = await client.actor('makework36/lovable-sites-scraper').call({
    maxSites: 1000,
    enrichHtml: true,
    onlyLive: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
const withCustomDomain = items.filter(i => i.hasCustomDomain);
console.log(`${withCustomDomain.length} qualified leads out of ${items.length} live sites`);
```
cURL (sync run-and-wait)
```bash
curl -X POST "https://api.apify.com/v2/acts/makework36~lovable-sites-scraper/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"maxSites": 200, "searchQuery": "crypto", "enrichHtml": true}'
```
PHP
$url = "https://api.apify.com/v2/acts/makework36~lovable-sites-scraper/run-sync-get-dataset-items?token=$token";$payload = json_encode(["maxSites" => 500, "onlyLive" => true]);$ch = curl_init($url);curl_setopt_array($ch, [CURLOPT_POST => true,CURLOPT_POSTFIELDS => $payload,CURLOPT_HTTPHEADER => ["Content-Type: application/json"],CURLOPT_RETURNTRANSFER => true,]);$data = json_decode(curl_exec($ch), true);
Zapier / Make / n8n
Apify has native connectors for all three. Drop this actor in as a scheduled trigger (weekly / daily), pipe the dataset to Google Sheets, Airtable, Notion, HubSpot or your warehouse of choice. Typical setup: weekly run → filter `hasCustomDomain=true` → push new rows to your CRM as leads → enrich with Clearbit → trigger outreach sequence.
Pricing
$0.002 per site returned.
Billing model: pay-per-result. You only pay for rows that end up in your dataset — not for candidates that were dead, placeholders, or filtered out by your `searchQuery`.
Indicative totals:
| Run size | Cost |
|---|---|
| 100 sites | $0.20 |
| 500 sites | $1.00 |
| 1,000 sites | $2.00 |
| 5,000 sites | $10.00 |
The actor has no monthly subscription, no minimum spend, and no proxy / compute overhead charges — the $0.002/site is fully loaded.
Compare to building this yourself:
- Hiring a scraping dev: 2–5 days at $500/day = $1,000–$2,500
- Scraping stack (proxy + compute + maintenance): $50–$200/month ongoing
- Keeping it alive as sources change: ongoing engineering time
A one-off $10 run replaces weeks of work.
FAQ
**How fresh is the data?** Every run hits the sources live — no stale cache. crt.sh updates within minutes of a new cert. hackertarget and rapiddns refresh passive DNS daily to weekly. If a site was published this morning and has its own cert, it'll show up.
**Can this find sites that haven't been published / are private?** No. By design, this actor only sees what's publicly reachable. If a Lovable project was never shared publicly, it's invisible to all three sources.
**What about sites with only custom domains and no lovable.app hostname?**
Lovable currently forces a `*.lovable.app` subdomain on every published project — the custom domain is added on top, not instead. So every live Lovable site has both. We discover via the forced `*.lovable.app` subdomain and surface the custom domain in the `customDomain` field.
**Why are some results missing a title / description?** Some Lovable projects are single-page apps rendered client-side with minimal SSR. The HTML we fetch is the shell; the actual content loads from JS. In those cases we extract what's in the shell and leave the rest empty. If you need rendered content, pipe the URLs into a Playwright-based enrichment actor as a second pass.
**How do I filter for very recent sites?** crt.sh returns certificate issuance dates — a future version of this actor may surface them. For now, run the actor weekly and diff against your previous dataset to find what's new.
**Can I get phone / email of site owners?**
Not directly — this actor surfaces what's public on the site. For contact enrichment, combine the output with Clearbit, Apollo, Hunter.io or a WHOIS lookup on the `customDomain` field.
**Does this work for other AI app builders (Bolt, v0, Replit)?**
This one is Lovable-specific. If you need bolt.new, v0.app or Replit public-surface scraping, check our other actors or request one.
**What's the difference between `isLive` and `status = 200`?**
`status = 200` just means the server responded. `isLive = true` also means the response wasn't Lovable's default "project not found / not deployed yet" page. Many dead Lovable URLs return 200 plus a placeholder, which is worse than a 404 because you'd waste outreach on them. `isLive` cleans that up.
**Why is `enrichHtml=true` so much slower?**
Because it actually fetches each URL. 1,000 sites at `concurrency: 15` takes roughly 2–4 minutes. Turn it off if you only need the subdomain list.
**Can I re-run this on just a single subdomain to re-check it?**
Not as a focused use case of this actor — it's designed for bulk discovery. For point queries, just curl the URL yourself.
Troubleshooting
**Run finishes with very few results.**
Check: (1) `maxSites` isn't too low, (2) at least two sources are enabled, (3) `onlyLive=true` may be too aggressive if Lovable had an outage — try `onlyLive=false` to see raw reachability, (4) if `searchQuery` is set, loosen it.
**HTTP 429 / rate-limit warnings.**
Drop `concurrency` from 15 to 5–8. Sources occasionally tighten their free tiers.
**crt.sh returns errors.** crt.sh is sometimes flaky under load (502 Bad Gateway, timeouts). The actor logs a warning and continues with the other sources — you just get narrower coverage on that run. Re-run later.
**`customDomain` is empty on a site I know has one.**
Custom-domain detection relies on `<link rel="canonical">` or `<meta property="og:url">`. If the Lovable project doesn't set either to the custom domain, we can't detect it from the HTML alone. A v2 of this actor may add DNS / HTTP redirect-chain detection.
**Some subdomains return `contentLength: 0`.**
This means the server returned an empty body (rare — usually a 3xx → 2xx redirect chain ending at an empty page, or a HEAD-like response). Treat those as dead.
**I want `og:image` URLs absolutized.**
They're returned as-is from the HTML. If a Lovable site uses a relative `og:image`, you'll get the relative path. Resolve it against the `url` field, e.g. with Python's `urljoin` — see the sketch below.
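A one-liner using the Python standard library (the example row is made up for illustration):

```python
from urllib.parse import urljoin

row = {"url": "https://myapp.lovable.app", "ogImage": "/og.png"}  # illustrative row
print(urljoin(row["url"], row["ogImage"]))  # -> https://myapp.lovable.app/og.png
```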
Related actors from the same author
- Airbnb Scraper — Listings, Prices, Photos & Hosts API — full Airbnb enumeration for travel / lead-gen
- Airbnb Market Analytics — ADR, RevPAR & Occupancy — short-term rental market metrics
- Airbnb MCP Server — Claude, Cursor & AI Agents — conversational Airbnb search for LLM agents
- VRBO Scraper — vacation-rental competitive data
- Skyscanner scrapers — flight + hotel discovery
Check my Apify profile (makework36) for the full catalog of 70+ production scrapers.
Changelog
1.0 — 2026-04-21 — Initial release. Multi-source discovery (crt.sh + hackertarget + rapiddns), HTTP metadata enrichment, custom-domain detection, `onlyLive` + `searchQuery` filters, pay-per-result billing at $0.002/site.
Legal / ethics note
This actor only reads publicly available information: Certificate Transparency logs (a legal requirement for every issued SSL cert, mandated by browsers), passive DNS databases (publicly queried), and live HTTP GETs to publicly-published URLs. No authentication bypass, no private data, no rate-limit evasion on Lovable's own infrastructure. If you use the output for cold outreach, follow CAN-SPAM / GDPR / whatever jurisdiction applies to your recipient list.
Built by makework36. Questions, feature requests, or bug reports → open an issue on the actor page or DM on Apify.