Apify Creator Scraper · Bulk Profile & Actor Data Export
Pricing
from $6.99 / 1,000 results
Extract every public Apify creator profile in one run. Get name, username, published actors, and usage stats — ready to export as a lead list or enrich in your CRM
Rating: 0.0 (0)
Developer: Corentin Robert
Actor stats: 0 bookmarked, 2 total users, 1 monthly active user
Last modified: 2 days ago
Apify Creator Scraper · Bulk Store profiles & actor signals
Export public Apify Store creator profiles at scale — one dataset row per creator with identity, optional public email, social links, usage stats, and a server-rendered sample of published Actors. Use it for lead research, partner mapping, competitive snapshots, or CRM enrichment, then export JSON or CSV from the Apify Dataset.
The Actor reads only what is already public on apify.com profile pages (HTML + embedded Next.js data). It is not an official Apify product or API. Respect Apify’s Terms of Service and fair use — do not overload the site or use the data in ways that violate privacy or platform rules.
Table of contents
- Why use this?
- How it works (under the hood)
- Quick start on Apify
- Dataset views & key fields
- Understanding email and publicActors
- Failures, speed, and tuning
- Input
- Output (dataset & stores)
- Example inputs
- Logs during a run
- Limits
- Support & contact
- Developers: local run, tests, deploy
Why use this?
| Without this Actor | With this Actor |
|---|---|
| Open each apify.com/{username} profile by hand | Thousands of profiles in one run (sitemap or your own URL list) |
| Copy-paste name, bio, and stats into a sheet | Structured rows — usernames, links, stats, timestamps |
| Guess who left a public email on their profile | email column when the creator chose to show it — filter non-empty in export |
Who it’s for
- SDRs / growth — build lists of creators who publish on the Store and surface public contact hints (email, website, LinkedIn, X).
- Partnerships & BD — map actor counts and activity-style stats without manual browsing.
- Research & analytics — snapshot public profile metadata and SSR actor samples for analysis you’re allowed to do.
Reality check: email is often empty. Most profiles do not expose an email. When present, it is only the address the user made public on Apify — not a guess or enrichment. publicActors is a partial list from the HTML (see below), not the full catalog behind “Load more” in the browser.
How it works (under the hood)
- Input: either no profile list (empty input) or an array `profileUrls`, one `https://apify.com/{username}` per line (scheme optional). URLs are normalized and deduplicated.
- URL discovery: if `profileUrls` is empty, the Actor fetches the official sitemap `https://apify.com/sitemap/users.xml`, parses `<loc>` entries, keeps only single-segment profile URLs, applies an internal cap (see Limits), then scrapes each page.
- Fetch: each profile page is requested with `got-scraping` (browser-like headers). Multiple pages run in parallel with configurable concurrency (see Failures, speed, and tuning).
- Parse: public data is extracted from embedded Flight / RSC-style JSON in the HTML (`user`, `actorsTotalUsers`, `actors` array when present). No browser automation: static HTTP + string parse only.
- Output: each profile becomes one dataset item (success or error row). A `RUN_LOG` key in the default key-value store lists short failure lines for troubleshooting. The Output tab can link dataset URLs (including curated views) when `output_schema.json` is configured.
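The input and URL discovery steps can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual source: the helper names (`normalizeProfileUrl`, `dedupeProfileUrls`, `profileUrlsFromSitemap`) and the exact validation rules are assumptions, and the regex-based `<loc>` extraction is a simplification that is only adequate for a flat sitemap.

```javascript
// Hypothetical helpers: names and rules are assumptions, not the Actor's real code.
const PROFILE_HOST = "apify.com";

// Turn "apify.com/zdroj" or "https://apify.com/zdroj/" into "https://apify.com/zdroj".
// Returns null for anything that is not a single-segment apify.com path.
function normalizeProfileUrl(raw) {
  const trimmed = String(raw).trim();
  if (!trimmed) return null;
  // The scheme is optional in the input, so add one before parsing.
  const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
  let url;
  try {
    url = new URL(withScheme);
  } catch {
    return null;
  }
  if (url.hostname.replace(/^www\./, "") !== PROFILE_HOST) return null;
  const segments = url.pathname.split("/").filter(Boolean);
  if (segments.length !== 1) return null;
  return `https://${PROFILE_HOST}/${segments[0]}`;
}

// Normalize a list and drop duplicates while keeping first-seen order.
function dedupeProfileUrls(urls) {
  const seen = new Set();
  for (const raw of urls) {
    const normalized = normalizeProfileUrl(raw);
    if (normalized) seen.add(normalized);
  }
  return [...seen];
}

// Extract <loc> values from sitemap XML, keep profile-shaped URLs, apply a cap.
function profileUrlsFromSitemap(xml, cap) {
  const locs = [...xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map((m) => m[1]);
  return dedupeProfileUrls(locs).slice(0, cap);
}

console.log(dedupeProfileUrls(["https://apify.com/apify", "apify.com/apify", "apify.com/zdroj"]));
```

Normalizing before deduplicating means `apify.com/apify` and `https://apify.com/apify/` collapse into one entry, so each profile is fetched once.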
Quick start on Apify
- Open the Actor in the Apify Console.
- Full sitemap run: leave Profile URLs empty (or use `{}` in JSON input).
- Targeted run: paste one profile URL per line under Profile URLs.
- Click Start.
- Open Dataset → tab Creators (lead-ready) for a compact table, or Profile detail (all fields) for every column including `publicActors`. Export JSON or CSV as needed.
For CLI runs: apify login is optional for this Actor (no Apify Proxy in input); use apify run --input-file=input.json so local input is not empty after a clean checkout.
Dataset views & key fields
The Console exposes two views (column presets). Raw items always contain the full successful payload; views only change which columns you see first.
Creators (lead-ready) — ?view=results
Optimized for scanning and export: contact and identity first, no bulky publicActors JSON in the grid.
| Field | Why it matters |
|---|---|
| profileUrl | Canonical public profile link. |
| username | Store username. |
| name | Display name. |
| email | Public profile email only; often null. |
| websiteUrl, linkedinUrl, githubUsername, twitterUsername, discordUserId | Other public links / handles. |
| pictureUrl, bio | Quick context for outreach or filtering. |
| actorsTotalUsers, publicActorsSampleCount | Scale of published actors vs. how many names appear in this row’s sample. |
| statsActiveUsers30Days, statsRunSuccessRate, statsIssueResponseTimeDays | Usage-style stats when present in the payload. |
| scrapedAt | When the row was written. |
| httpOk, error | Set on failure rows (see Output). |
Profile detail (all fields) — ?view=profileDetail
Same lead block at the front, then IDs, readme, SEO meta (pageTitle, metaDescription, canonicalUrl), and the full publicActors array (sample from HTML).
Understanding email and publicActors
- `email`: mapped from Apify’s `profile.publicEmail`. If the creator did not enable a public email on their profile, the value is `null`. Filter on "`email` is not empty" in Sheets or SQL when you only want rows with a visible address.
- `publicActors`: parsed from server-rendered HTML. It often contains fewer actors than `actorsTotalUsers` because the rest loads in the client after page load. Treat `actorsTotalUsers` as the total signal and `publicActors` as a capped sample for context (names, titles, categories, etc.).
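If you post-process an exported JSON file in Node rather than in Sheets or SQL, the same filtering might look like the sketch below. The rows are made-up sample data using the field names described above, and the helper names (`withPublicEmail`, `sampleSize`) are hypothetical.

```javascript
// Made-up rows using field names from this README; values are illustrative only.
const sampleRows = [
  { username: "alice", email: "alice@example.com", actorsTotalUsers: 1200, publicActors: [{ name: "a1" }, { name: "a2" }] },
  { username: "bob", email: null, actorsTotalUsers: 40, publicActors: [] },
  { username: "carol", email: "  ", actorsTotalUsers: 0 },
];

// Keep only rows whose public email is a non-blank string.
function withPublicEmail(rows) {
  return rows.filter((row) => typeof row.email === "string" && row.email.trim() !== "");
}

// Count the server-rendered sample defensively: the array may be missing on some rows.
function sampleSize(row) {
  return Array.isArray(row.publicActors) ? row.publicActors.length : 0;
}

console.log(withPublicEmail(sampleRows).map((row) => row.username));
console.log(sampleRows.map((row) => ({ username: row.username, sampled: sampleSize(row) })));
```

Treating blank strings like `null` is a defensive choice here; the README only states that a missing public email is `null`.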
Failures, speed, and tuning
Defaults are tuned for throughput while staying reasonable toward apify.com: 12 parallel fetches and no fixed delay between requests.
If you see many timeouts, 429s, or blocked responses:
- Open `src/main.js` and lower `MAX_CONCURRENCY` (e.g. from `12` to `6` or `4`).
- Optionally set a small `DELAY_MS` between requests (e.g. `50`–`150`) to reduce burst load.
These are source constants (not Console input fields). Rebuild / apify push after changes.
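To show how a concurrency cap and a per-request delay interact, here is a minimal worker-pool sketch. Only the constant names `MAX_CONCURRENCY` and `DELAY_MS` come from this README; the `runPool` function, its options, and the values shown are assumptions for illustration, and the Actor's real scheduling code may differ.

```javascript
// Illustrative values only; the README's default is 12 parallel fetches and no delay.
const MAX_CONCURRENCY = 4;
const DELAY_MS = 100;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `worker` over `items` with at most `concurrency` calls in flight.
// Each lane pauses `delayMs` after finishing an item before taking the next one.
async function runPool(items, worker, { concurrency = MAX_CONCURRENCY, delayMs = DELAY_MS } = {}) {
  const results = new Array(items.length);
  let next = 0; // shared cursor; safe because JavaScript runs lanes on one thread

  async function lane() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = await worker(items[index], index);
      } catch (err) {
        // Record the failure instead of aborting the whole run.
        results[index] = { error: String(err && err.message ? err.message : err) };
      }
      if (delayMs > 0) await sleep(delayMs);
    }
  }

  const lanes = Array.from({ length: Math.min(concurrency, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}

// Demo with a fake worker; a real run would fetch each profile URL instead.
runPool(["a", "b", "c"], async (name) => name.toUpperCase(), { concurrency: 2, delayMs: 10 })
  .then((results) => console.log(results));
```

Lowering `concurrency` reduces how many requests overlap, while `delayMs` spaces out requests within each lane, so the two settings reduce burst load in different ways.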
Input
| Field | Required | Description |
|---|---|---|
| profileUrls | No | String list: one https://apify.com/{username} per line (apify.com/user also works). Leave empty to use the full users sitemap (capped per run — see Limits). |
Output (dataset & stores)
Dataset — one object per profile URL processed:
- Success rows: profile fields from embedded JSON + meta + stats + `publicActors` sample (see Dataset views & key fields).
- Failure rows: at minimum `profileUrl`, `scrapedAt`, `error`, `httpOk: false`; `pageTitle` may be present when HTML was partially readable.
Key-value store — key RUN_LOG: short text log (recent failures and progress-style lines).
Apify Output tab — links may include: raw dataset items, Creators (lead-ready) view, Profile detail (all fields) view, run metadata, and RUN_LOG — per output_schema.json.
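When working with exported items, one way to separate success rows from failure rows and prepare a targeted re-run is sketched below. It assumes failure rows carry `httpOk: false` and/or a non-empty `error`, as described above; the helper names (`splitRows`, `retryInput`) and the sample items are hypothetical.

```javascript
// Split dataset items into successes and failures.
// Assumption: a row is a failure if httpOk is false or an error message is set.
function splitRows(items) {
  const ok = [];
  const failed = [];
  for (const item of items) {
    const isFailure = item.httpOk === false || Boolean(item.error);
    (isFailure ? failed : ok).push(item);
  }
  return { ok, failed };
}

// Build an input object for a targeted re-run of the failed profiles.
function retryInput(failedRows) {
  return { profileUrls: failedRows.map((row) => row.profileUrl) };
}

// Made-up items for illustration.
const items = [
  { profileUrl: "https://apify.com/apify", username: "apify", scrapedAt: "2024-01-01T00:00:00.000Z" },
  { profileUrl: "https://apify.com/zdroj", scrapedAt: "2024-01-01T00:00:01.000Z", error: "timeout", httpOk: false },
];

const { ok, failed } = splitRows(items);
console.log(`${ok.length} ok, ${failed.length} failed`);
console.log(JSON.stringify(retryInput(failed)));
```

The resulting object has the same shape as the targeted-profiles input, so it can be saved as `input.json` for a follow-up run.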
Example inputs
Full sitemap run (default cap applies)
{}
Targeted profiles
{"profileUrls": ["https://apify.com/apify","apify.com/zdroj"]}
Logs during a run
- Progress: periodic INFO lines with counts, success/failure totals, and an ETA while the run is active.
- Warnings: failed profiles log a short reason (full detail is in the dataset row’s `error` field).
The dataset is the source of truth; logs are for monitoring.
Limits
- Sitemap mode: up to 50,000 profile URLs per run after filtering (internal cap in code; sitemap may grow over time).
- Public pages only — private or restricted profiles may yield error rows or missing payloads.
- Layout changes — if Apify changes how profile data is embedded in HTML, parsing may need an Actor update.
Support & contact
Questions or issues? Contact Corentin Robert.
Developers: local run, tests, deploy
npm install
npm run apify:run   # apify run --input-file=input.json
npm test            # offline unit tests (sitemap + profile parse)
apify push
Bump .actor/actor.json version (MAJOR.MINOR or patch per your convention) when you publish a new build.