Social Profiles — Bio + Followers + Posts in CSV, No Limits

Get YOUR social profile data in CSV/JSON — bio, followers, posts, engagement across multiple platforms in bulk. 14+ real runs worldwide. Built for competitor tracking + influencer research + lead enrichment. No rate limits. Custom pipeline — spinov001@gmail.com · Tips: t.me/scraping_ai

Social Profile Scraper — Unified Bio/Follower/Avatar Extraction Across 12 Platforms (Auto-Detect, Batch, JSON)

Turn YOUR list of mixed social URLs (GitHub + LinkedIn + Twitter/X + YouTube + Reddit + Medium + Dev.to + Bluesky + Mastodon + Instagram + TikTok + Threads) into one clean JSON table — display name, bio, avatar, follower count, username — without writing 12 different scrapers or hitting 12 different rate limits. Auto-detects platform from URL, falls back to Open Graph + JSON-LD + Twitter Cards when no native API is available, returns a flat row per profile even for broken/deleted URLs. Batch up to 1,000 URLs per run.
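
The Open Graph fallback is plain meta-tag parsing. For intuition, here is a minimal sketch of what such a fallback pulls from a public, unauthenticated page — illustrative only, not the actor's internal code (assumes requests and beautifulsoup4 are installed):

import requests
from bs4 import BeautifulSoup

def og_fallback(url: str) -> dict:
    """Extract Open Graph / Twitter Card meta from a public page."""
    html = requests.get(url, timeout=15, headers={"User-Agent": "Mozilla/5.0"}).text
    soup = BeautifulSoup(html, "html.parser")
    def meta(name: str):
        # OG tags use property=, Twitter Cards use name=
        tag = soup.find("meta", property=name) or soup.find("meta", attrs={"name": name})
        return tag.get("content") if tag else None
    return {
        "ogTitle": meta("og:title"),
        "ogDescription": meta("og:description"),
        "ogImage": meta("og:image"),
        "twitterCard": meta("twitter:card"),
        "pageTitle": soup.title.string if soup.title else None,
    }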

Who buys this actor

  • SDR / lead-gen teams enriching a list of prospect LinkedIn URLs with their GitHub/Dev.to/Medium activity — technical buyers reveal themselves through public dev presence.
  • VC / investor-research analysts pulling founder footprint across 5-7 platforms into one row per person for deal-review memos.
  • Talent / recruiting vetting candidate portfolios: one scrape returns GitHub follower count + Dev.to post count + LinkedIn headline, instead of 3 separate tools.
  • Influencer-marketing teams comparing the same creator's reach on Instagram vs TikTok vs YouTube vs Bluesky before signing a deal.
  • Brand / reputation teams monitoring the social graph of their own executives across platforms — early signal when an exec is active on Bluesky but quiet on Twitter.
  • CRM-enrichment pipelines filling the "social handles" section of a HubSpot / Salesforce contact record at import time.

Why this over the obvious alternatives

| Concern you have | How this actor handles it |
| --- | --- |
| "Clearbit / Apollo / ZoomInfo already do social enrichment." | They do — at $10K+/year seat licenses, with stale data (60-90 day refresh) and limited to business platforms (no Bluesky, no Mastodon, no Threads). This actor runs fresh on demand at Apify pay-per-result (PPR) pricing (~$0.001/profile at batch volume), including the newer platforms. |
| "Why not 12 different platform SDKs?" | Because 11 of them have different auth flows, 8 have strict rate limits, and 3 require OAuth app review. This actor abstracts all of it: feed a URL, get a row. No auth. No per-platform client code in your pipeline. |
| "Accuracy — what happens on platforms without public APIs (LinkedIn, Instagram)?" | We don't scrape authenticated pages or bypass login walls (ToS + detection risk). For LinkedIn/Instagram we rely on Open Graph + JSON-LD + meta tags returned by the public unauthenticated page. You get display name + bio + avatar + follower range (when LinkedIn exposes it) — not private fields. Full-detail LinkedIn requires their official Sales Navigator API. |
| "Follower counts are strings ('210K'), not numbers — why?" | Because that's what the OG meta actually returns for most platforms. A followersRaw field carries the parsed number when derivable (210000 from '210K', 1500000 from '1.5M'), and the original string stays in followers for display. |
| "Deleted / private / 404 profiles — does one bad URL kill the batch?" | No. Each URL gets its own record. Failed ones carry status: "error" plus an error reason (e.g. error: "not_found"); the rest of the batch completes normally. |
| "What about rate-limit bans on GitHub / Reddit?" | We throttle per platform: GitHub 1 req/sec anonymous, Reddit 1 req/2s public, LinkedIn 1 req/3s. For batches over 500 profiles on GitHub, supply your own PAT for a 5,000 req/hr ceiling. |
| "Can I resolve a handle like @ben to a canonical URL first?" | Yes — pass raw handles without a URL prefix; auto-detect walks known platforms (github.com/ben, dev.to/ben, etc.) in priority order. Ambiguous handles return multiple rows, one per matched platform (see the sketch below). |

Input

{
  "profileUrls": [
    "https://github.com/torvalds",
    "https://dev.to/ben",
    "https://bsky.app/profile/jay.bsky.team",
    "https://www.linkedin.com/in/patrick-collison/",
    "https://www.youtube.com/@veritasium",
    "https://twitter.com/paulg"
  ]
}
  • profileUrls (array, required) — 1 to 1,000 URLs, any mix of supported platforms.

Platforms supported (auto-detect): GitHub, Twitter/X, LinkedIn, Instagram, YouTube, TikTok, Reddit, Medium, Dev.to, Threads, Bluesky, Mastodon. New platforms added by request (≥5 paying users asking).
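
Conceptually, the auto-detect step is a hostname-to-platform lookup. A minimal sketch of the idea — the mapping below is illustrative, not the actor's internal table:

from urllib.parse import urlparse

# Illustrative hostname → platform map; the real table covers more aliases.
PLATFORMS = {
    "github.com": "github",
    "twitter.com": "twitter", "x.com": "twitter",
    "linkedin.com": "linkedin",
    "instagram.com": "instagram",
    "youtube.com": "youtube",
    "tiktok.com": "tiktok",
    "reddit.com": "reddit",
    "medium.com": "medium",
    "dev.to": "devto",
    "threads.net": "threads",
    "bsky.app": "bluesky",
}

def detect_platform(url: str) -> str:
    host = (urlparse(url).hostname or "").removeprefix("www.")
    # Mastodon has no fixed domain — any unmatched host could be an instance.
    return PLATFORMS.get(host, "mastodon-or-unknown")

print(detect_platform("https://www.youtube.com/@veritasium"))  # youtube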

Output schema (per URL)

{
  "url": "https://github.com/torvalds",
  "platform": "github",
  "username": "torvalds",
  "displayName": "Linus Torvalds",
  "bio": "Linux kernel developer",
  "avatar": "https://avatars.githubusercontent.com/u/1024025",
  "followers": "210K",
  "followersRaw": 210000,
  "following": "0",
  "followingRaw": 0,
  "postsCount": "35",
  "location": null,
  "website": null,
  "verified": false,
  "siteName": "GitHub",
  "meta": {
    "ogTitle": "torvalds (Linus Torvalds) · GitHub",
    "ogDescription": "Linux kernel developer",
    "ogImage": "https://...",
    "pageTitle": "torvalds (Linus Torvalds) · GitHub",
    "twitterCard": "summary"
  },
  "status": "success",
  "scrapedAt": "2026-04-23T02:30:00.000Z",
  "httpStatus": 200
}

~16 fields per profile when platform exposes them. Missing fields are null rather than omitted — downstream DataFrames stay schema-stable.
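
Because of that null-not-omitted guarantee, results load straight into a fixed-schema DataFrame with no per-run column surprises. A quick sketch (assumes pandas is installed; actor ID as used in the example below):

from apify_client import ApifyClient
import pandas as pd

client = ApifyClient("<YOUR_APIFY_TOKEN>")
run = client.actor("knotless_cadence/social-profile-scraper").call(
    run_input={"profileUrls": ["https://github.com/torvalds", "https://dev.to/ben"]}
)
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
df = pd.DataFrame(items)  # same columns every run — nulls, never missing keys
df["followersRaw"] = pd.to_numeric(df["followersRaw"], errors="coerce")
print(df[["platform", "username", "followersRaw"]].sort_values("followersRaw", ascending=False))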

Python copy-paste — enrich a LinkedIn export with GitHub signal

Given 500 LinkedIn URLs from a Sales-Nav export, scrape + cross-reference with GitHub handles guessed from "displayName + company" to identify which prospects are hands-on technical buyers (own code on GitHub).

from apify_client import ApifyClient
import re

client = ApifyClient("<YOUR_APIFY_TOKEN>")
linkedin_urls = [line.strip() for line in open("prospects_linkedin.txt")]

def slugify(name: str) -> str:
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

# Step 1: pull LinkedIn bios
run = client.actor("knotless_cadence/social-profile-scraper").call(run_input={
    "profileUrls": linkedin_urls,
})
li = {r["url"]: r for r in client.dataset(run["defaultDatasetId"]).iterate_items()}

# Step 2: guess GitHub handles from display names
gh_guesses = [
    f"https://github.com/{slugify(r['displayName'])}"
    for r in li.values() if r.get("displayName")
]
run2 = client.actor("knotless_cadence/social-profile-scraper").call(run_input={
    "profileUrls": gh_guesses,
})
# followersRaw can be null — "or 0" keeps the comparison safe
gh = [r for r in client.dataset(run2["defaultDatasetId"]).iterate_items()
      if r.get("status") == "success" and (r.get("followersRaw") or 0) > 50]

print(f"Technical-buyer signal: {len(gh)} of {len(linkedin_urls)} prospects "
      f"have an active GitHub account (>50 followers).")
for r in gh:
    print(f"  {r['displayName']} — {r['url']} ({r['followers']} followers)")

A 90-second enrichment that turns a cold LinkedIn list into a prioritized "hands-on technical buyers first" queue.

MCP / LLM-agent use

Wrap as a single tool for agents that need "get everything public about this person/URL":

tools = [{
    "name": "lookup_social_profile",
    "description": "Given a social profile URL on any major platform, return display name, bio, avatar, follower count.",
    "input_schema": {
        "type": "object",
        "properties": {"url": {"type": "string"}},
        "required": ["url"],
    },
}]

Agent then enriches names mentioned in conversation ("tell me about @paulg") into canonical profile data for its reasoning step.
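
A minimal handler behind that tool can simply proxy to the actor — a sketch, assuming one-URL-per-call latency is acceptable for your agent loop (the fallback error dict is our own convention, not actor output):

from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

def lookup_social_profile(url: str) -> dict:
    """Tool handler: one profile URL in, one flat profile record out."""
    run = client.actor("knotless_cadence/social-profile-scraper").call(
        run_input={"profileUrls": [url]}
    )
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    return items[0] if items else {"status": "error", "error": "no_result"}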

Frequent questions

1. "Will LinkedIn throw a captcha and tank the whole batch?" LinkedIn shows login wall on many unauth requests. We serve status: "blocked" without retry-storming their servers (which would escalate). Workaround: run LinkedIn separately with residential proxy (useApifyProxy: true, group RESIDENTIAL); for large LinkedIn volumes, LinkedIn's own API is the only sanctioned path.

2. "Does it parse Bluesky custom-domain handles (alice.dev instead of alice.bsky.social)?" Yes — both https://bsky.app/profile/alice.dev and https://alice.dev with proper _atproto DNS record resolve. DID resolution is transparent.

3. "How fresh is the data?" Scraped live per run. No cache. If the profile page changes between runs, next run reflects that. For change-tracking, snapshot to your DB and diff.

4. "Mastodon has 1000+ servers. Do I need to list the server for each URL?" Pass the full URL (https://fosstodon.org/@user) and the actor dispatches to the correct instance. Without instance, we can't guess — Mastodon has no global username registry.

5. "Follower count on TikTok is '1.2M' — how do I sort 1.2M vs 900K correctly?" Use followersRaw (integer). We parse standard suffix notation (K, M, B) into integer values. followers stays the string for display.

6. "I need email addresses too. Are you going to add that?" No — different problem, different actor. See Email Extractor Pro, which pulls emails from websites and combines cleanly in a multi-step pipeline (social profile → website → email).

Lead-enrichment toolkit (companion actors)

| Step | Tool | Purpose |
| --- | --- | --- |
| 1 | Google Maps Scraper Pro | Find businesses by category + location |
| 2 | Social Profile Scraper (this) | Get social presence for each result |
| 3 | Email Extractor Pro | Pull contact email from website |
| 4 | Website Tech-Stack Detector | Qualify by tech stack (Next.js, Shopify, etc.) |
| 5 | Trustpilot Review Scraper | Bonus: review-volume signal of activity |

All 5 are chainable in a single Apify pipeline via the run input-from-dataset pattern; one hop of the chain is sketched below.
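
A sketch of one hop — feeding step 2's output into step 3. The email-extractor actor ID and its startUrls input field are placeholder assumptions; check the companion actor's page for its real ID and schema:

from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Step 2: social profiles for your URL list
run2 = client.actor("knotless_cadence/social-profile-scraper").call(
    run_input={"profileUrls": ["https://github.com/torvalds"]}
)
profiles = list(client.dataset(run2["defaultDatasetId"]).iterate_items())

# Step 3: push each profile's website into the email extractor.
# "knotless_cadence/email-extractor-pro" and "startUrls" are assumed names — verify first.
websites = [r["website"] for r in profiles if r.get("website")]
run3 = client.actor("knotless_cadence/email-extractor-pro").call(
    run_input={"startUrls": [{"url": u} for u in websites]}
)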


Part of 78 data-extraction actors by knotless_cadence on Apify.

Platform missing, edge case not covered, or a lead-gen pipeline recipe you want built? Email spinov001@gmail.com or open an issue on the actor page.