Apify Creator Scraper · Bulk Profile & Actor Data Export

Pricing

from $6.99 / 1,000 results

Extract every public Apify creator profile in one run. Get name, username, published actors, and usage stats — ready to export as a lead list or enrich in your CRM

Rating: 0.0 (0)
Developer: Corentin Robert
Maintained by: Community
Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 2 days ago

Apify Creator Scraper · Bulk Store profiles & actor signals

Export public Apify Store creator profiles at scale — one dataset row per creator with identity, optional public email, social links, usage stats, and a server-rendered sample of published Actors. Use it for lead research, partner mapping, competitive snapshots, or CRM enrichment, then export JSON or CSV from the Apify Dataset.

The Actor reads only what is already public on apify.com profile pages (HTML + embedded Next.js data). It is not an official Apify product or API. Respect Apify’s Terms of Service and fair use — do not overload the site or use the data in ways that violate privacy or platform rules.


Why use this?

Without this Actor | With this Actor
Open each apify.com/{username} profile by hand | Thousands of profiles in one run (sitemap or your own URL list)
Copy-paste name, bio, and stats into a sheet | Structured rows: usernames, links, stats, timestamps
Guess who left a public email on their profile | email column when the creator chose to show it; filter non-empty in export

Who it’s for

  • SDRs / growth — build lists of creators who publish on the Store and surface public contact hints (email, website, LinkedIn, X).
  • Partnerships & BD — map actor counts and activity-style stats without manual browsing.
  • Research & analytics — snapshot public profile metadata and SSR actor samples for analysis you’re allowed to do.

Reality check: email is often empty. Most profiles do not expose an email. When present, it is only the address the user made public on Apify — not a guess or enrichment. publicActors is a partial list from the HTML (see below), not the full catalog behind “Load more” in the browser.


How it works (under the hood)

  1. Input — Either no profile list (empty input) or an array profileUrls — one https://apify.com/{username} per line (scheme optional). URLs are normalized and deduplicated.
  2. URL discovery — If profileUrls is empty, the Actor fetches the official sitemap https://apify.com/sitemap/users.xml, parses <loc> entries, keeps only single-segment profile URLs, applies an internal cap (see Limits), then scrapes each page.
  3. Fetch — Each profile page is requested with got-scraping (browser-like headers). Multiple pages run in parallel with configurable concurrency (see Failures, speed, and tuning).
  4. Parse — Public data is extracted from embedded Flight / RSC-style JSON in the HTML (user, actorsTotalUsers, actors array when present). No browser automation — static HTTP + string parse only.
  5. Output — Each profile becomes one dataset item (success or error row). A RUN_LOG key in the default key-value store lists short failure lines for troubleshooting. The Output tab can link dataset URLs (including curated views) when output_schema.json is configured.
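
Steps 1 and 2 above (normalize, deduplicate, keep only single-segment profile URLs, apply a cap) can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual source: the helper names, the `www.` handling, and the regex-based `<loc>` extraction are assumptions made for the example.

```javascript
// Sketch only: helper names and SITEMAP_CAP are illustrative assumptions,
// not identifiers taken from the Actor's code.
const SITEMAP_CAP = 50000; // mirrors the per-run cap described under Limits

// Turn "apify.com/user" or "https://apify.com/user/" into a canonical URL.
// Returns null when the input is not a single-segment apify.com URL.
function normalizeProfileUrl(raw) {
  const trimmed = String(raw).trim();
  if (!trimmed) return null;
  // The scheme is optional in the input, so add one before parsing.
  const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
  let url;
  try {
    url = new URL(withScheme);
  } catch {
    return null; // not parseable as a URL
  }
  const host = url.hostname.toLowerCase().replace(/^www\./, '');
  if (host !== 'apify.com') return null;
  const segments = url.pathname.split('/').filter(Boolean);
  if (segments.length !== 1) return null; // e.g. /user/actor-name is not a profile
  return `https://apify.com/${segments[0]}`;
}

// Normalize every entry and drop invalid ones and duplicates, keeping first-seen order.
function dedupeProfileUrls(rawUrls) {
  const seen = new Set();
  for (const raw of rawUrls) {
    const normalized = normalizeProfileUrl(raw);
    if (normalized) seen.add(normalized);
  }
  return [...seen];
}

// Pull <loc> values out of sitemap XML, filter them, and apply the cap.
function profileUrlsFromSitemap(xml, cap = SITEMAP_CAP) {
  const locs = [...xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map((m) => m[1]);
  return dedupeProfileUrls(locs).slice(0, cap);
}
```

A regex is enough here because only the `<loc>` text is needed; a real implementation might prefer an XML parser.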

Quick start on Apify

  1. Open the Actor in the Apify Console.
  2. Full sitemap run: leave Profile URLs empty (or use {} in JSON input).
  3. Targeted run: paste one profile URL per line under Profile URLs.
  4. Click Start.
  5. Open Dataset → tab Creators (lead-ready) for a compact table, or Profile detail (all fields) for every column including publicActors. Export JSON or CSV as needed.

For CLI runs, apify login is optional because this Actor takes no Apify Proxy input. Use apify run --input-file=input.json so that the local input is not empty after a clean checkout (an empty input triggers a full sitemap run).


Dataset views & key fields

The Console exposes two views (column presets). Raw items always contain the full successful payload; views only change which columns you see first.

Creators (lead-ready) — ?view=results

Optimized for scanning and export: contact and identity first, no bulky publicActors JSON in the grid.

Field | Why it matters
profileUrl | Canonical public profile link.
username | Store username.
name | Display name.
email | Public profile email only; often null.
websiteUrl, linkedinUrl, githubUsername, twitterUsername, discordUserId | Other public links / handles.
pictureUrl, bio | Quick context for outreach or filtering.
actorsTotalUsers, publicActorsSampleCount | Scale of published actors vs. how many names appear in this row’s sample.
statsActiveUsers30Days, statsRunSuccessRate, statsIssueResponseTimeDays | Usage-style stats when present in the payload.
scrapedAt | When the row was written.
httpOk, error | Set on failure rows (see Output).

Profile detail (all fields) — ?view=profileDetail

Same lead block at the front, then IDs, readme, SEO meta (pageTitle, metaDescription, canonicalUrl), and the full publicActors array (sample from HTML).


Understanding email and publicActors

  • email: mapped from Apify’s profile.publicEmail. If the creator did not enable a public email on their profile, the value is null. To keep only rows with a visible address, filter on "email is not empty" in Sheets or SQL.
  • publicActors: parsed from server-rendered HTML. It often contains fewer actors than actorsTotalUsers because the rest loads in the client after page load. Treat actorsTotalUsers as the total signal and publicActors as a capped sample for context (names, titles, categories, etc.).
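
If you post-process an exported JSON file in code rather than in Sheets or SQL, the same "email is not empty" filter might look like the sketch below. The sample rows are invented for illustration, and `rowsWithPublicEmail` is a hypothetical helper name.

```javascript
// Sketch: keep only rows whose email is a non-empty string.
// null, undefined, and '' are all treated as "no public email".
function rowsWithPublicEmail(items) {
  return items.filter(
    (item) => typeof item.email === 'string' && item.email.trim() !== '',
  );
}

// Hypothetical rows standing in for items exported from the dataset.
const items = [
  { username: 'alice', email: 'alice@example.com' },
  { username: 'bob', email: null },
  { username: 'carol', email: '' },
];

const leads = rowsWithPublicEmail(items);
console.log(leads.map((row) => row.username)); // [ 'alice' ]
```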

Failures, speed, and tuning

Defaults are tuned for throughput while staying reasonable toward apify.com: 12 parallel fetches and no fixed delay between requests.

If you see many timeouts, 429s, or blocked responses:

  1. Open src/main.js and lower MAX_CONCURRENCY (e.g. from 12 to 6 or 4).
  2. Optionally set a small DELAY_MS between requests (e.g. 50 to 150 ms) to reduce burst load.

These are source constants, not Console input fields, so rebuild and run apify push after changing them.
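
To show how the two constants interact, here is a minimal sketch of a bounded worker pool with an optional pause after each request. It is not the Actor's implementation: `runWithConcurrency` and `fetchProfile` are hypothetical names, and the values shown are the lowered settings suggested above rather than the defaults.

```javascript
// Sketch only: MAX_CONCURRENCY and DELAY_MS mirror the constants named above;
// everything else is an assumption made for illustration.
const MAX_CONCURRENCY = 6;
const DELAY_MS = 100;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs fetchProfile(url) for every URL with at most `concurrency` calls in flight.
// A thrown error becomes a failure-style row instead of aborting the whole run.
async function runWithConcurrency(
  urls,
  fetchProfile,
  { concurrency = MAX_CONCURRENCY, delayMs = DELAY_MS } = {},
) {
  const results = new Array(urls.length);
  let next = 0; // shared cursor; safe because JS runs one callback at a time

  async function worker() {
    while (next < urls.length) {
      const index = next++;
      try {
        results[index] = await fetchProfile(urls[index]);
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        results[index] = { profileUrl: urls[index], httpOk: false, error: message };
      }
      // The pause is per worker: it spaces out each worker's own requests.
      if (delayMs > 0) await sleep(delayMs);
    }
  }

  const workerCount = Math.min(concurrency, urls.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Lowering `concurrency` caps how many requests overlap, while `delayMs` thins out each worker's request stream; the two can be tuned independently.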


Input

Field | Required | Description
profileUrls | No | String list: one https://apify.com/{username} per line (apify.com/user also works). Leave empty to use the full users sitemap (capped per run; see Limits).

Output (dataset & stores)

Dataset — one object per profile URL processed:

  • Success rows — profile fields from embedded JSON + meta + stats + publicActors sample (see Dataset views & key fields).
  • Failure rows — at minimum profileUrl, scrapedAt, error, httpOk: false; pageTitle may be present when HTML was partially readable.
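
Because success and failure rows share one dataset, downstream code usually needs to separate them first. The sketch below is one possible way to do that, assuming a row counts as failed when httpOk is false or an error value is present; `splitRows` and `retryInput` are hypothetical helper names.

```javascript
// Sketch: partition dataset items into success and failure rows.
// Assumption: a failure row has httpOk === false and/or a non-empty error field.
function splitRows(items) {
  const success = [];
  const failed = [];
  for (const item of items) {
    const isFailure = item.httpOk === false || Boolean(item.error);
    (isFailure ? failed : success).push(item);
  }
  return { success, failed };
}

// Build an input object for a targeted re-run of the profiles that failed.
function retryInput(failedRows) {
  return { profileUrls: failedRows.map((row) => row.profileUrl) };
}
```

The object returned by `retryInput` has the same shape as the targeted-run input, so failed profiles can be retried without re-running the full sitemap.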

Key-value store — key RUN_LOG: short text log (recent failures and progress-style lines).

Apify Output tab — links may include: raw dataset items, Creators (lead-ready) view, Profile detail (all fields) view, run metadata, and RUN_LOG — per output_schema.json.


Example inputs

Full sitemap run (default cap applies)

{}

Targeted profiles

{
  "profileUrls": [
    "https://apify.com/apify",
    "apify.com/zdroj"
  ]
}

Logs during a run

  • Progress — Periodic INFO lines with counts, success/failure totals, and an ETA while the run is active.
  • Warnings — Failed profiles log a short reason (full detail is in the dataset row’s error field).

The dataset is the source of truth; logs are for monitoring.


Limits

  • Sitemap mode: up to 50,000 profile URLs per run after filtering (internal cap in code; sitemap may grow over time).
  • Public pages only — private or restricted profiles may yield error rows or missing payloads.
  • Layout changes — if Apify changes how profile data is embedded in HTML, parsing may need an Actor update.

Support & contact

Questions or issues? Contact the developer, Corentin Robert.


Developers: local run, tests, deploy

npm install
npm run apify:run # apify run --input-file=input.json
npm test # offline unit tests (sitemap + profile parse)
apify push

Bump .actor/actor.json version (MAJOR.MINOR or patch per your convention) when you publish a new build.