Meta Ad Library Scraper — Facebook & Instagram Ads

Pricing

from $25.00 / 1,000 results

Extract Facebook and Meta Ad Library transparency data: search by keyword, Page ID, or full URL. Sort by total_impressions or most_recent. Returns structured creatives, spend/impression estimates, timing, distribution. Supports 100+ languages. No cookies. Built for competitive intel.

Rating: 5.0 (1)

Developer: Scrapeify (Maintained by Community)

Actor stats: 2 bookmarked, 22 total users, 14 monthly active users, last modified 18 days ago

Meta Ad Library Scraper — Extract Facebook & Instagram Ads by Keyword, Page ID, or URL

Extract structured Facebook and Meta Ad Library transparency records at scale — no cookies required, no API credentials needed. The Scrapeify Meta Ad Scraper accepts a keyword, numeric Page ID, or full Ad Library URL with optional sort order, paginates through Meta's GraphQL API, and delivers one Dataset row per ad with full creative content, performance estimates, timing, distribution, status flags, and a rich run summary.

Supports 100+ languages including Arabic, Chinese, Japanese, Korean, and Russian. Built for competitive intelligence teams, political researchers, creative strategists, and AI pipelines that need programmatic access to Meta's ad transparency data.


Features

| Capability | Detail |
| --- | --- |
| Three seed modes | keyword (brand/topic search), numeric pageId, or full adLibraryUrl |
| Sort options | total_impressions (high to low) or most_recent (relevancy monthly grouped) |
| No cookies required | Uses Meta's public GraphQL API without authentication |
| Unlimited results | No hard cap; paginate until maxResults or exhaustion |
| Full nested schema | metadata, ad_content, timing, performance, distribution, status, additional_info |
| 100+ language support | Universal language handling for global ad research |
| Proxy support | Configurable via PROXY_URL environment variable |
| 300-second watchdog | Scrape timeout with typed error objects |
| Run summary | Aggregated spend, impressions, platforms, timing in OUTPUT.summary |
| Schema documentation | OUTPUT.data.schema for field-group documentation and codegen |
| URL flexibility | Accepts both facebook.com/ads/library and meta.com/ads/library URLs |

Use Cases

Competitive Creative Intelligence

Pull all active and archived ads for any brand or category keyword. Analyze creative formats (carousel, video, single image), CTA patterns, offer types, and messaging angles. Track competitor creative refreshes and campaign launches by monitoring new ad_archive_id values.

Political & Issue Ad Monitoring

Research political advertising transparently. Filter by country, date range, and status within custom adLibraryUrl inputs. Analyze spend patterns, platform distribution, and targeted country reach for political campaigns.

E-Commerce & DTC Brand Research

Track direct-to-consumer creative strategies at scale. Identify which product-led formats, promotional structures, and urgency CTAs dominate high-spend categories across Facebook, Instagram, and Messenger.

Spend Intelligence & Media Planning

Pull impression-weighted ad rankings using sortBy: total_impressions. Identify which brands are investing most heavily in your category. Build directional spend benchmarks from Meta's publicly disclosed transparency estimates.
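
The ranking step described above can be sketched as a small aggregation over Dataset rows. This is a minimal illustration, assuming the rows are already loaded as a list of dicts shaped like the Dataset Row example in this README; the helper name `impression_leaderboard` is hypothetical, and missing or null impression estimates are counted as zero.

```python
from collections import defaultdict


def impression_leaderboard(ads, top_n=10):
    """Sum estimated impressions per page_name and rank pages high to low.

    Hypothetical helper: `ads` is a list of Dataset rows (nested dicts).
    """
    totals = defaultdict(int)
    for ad in ads:
        page = (ad.get("metadata") or {}).get("page_name") or "unknown"
        # Impression estimates may be null; treat those as zero.
        impressions = (ad.get("performance") or {}).get("impressions") or 0
        totals[page] += impressions
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Because the figures are transparency estimates, the resulting order is best read as a directional ranking rather than an exact spend comparison.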

AI Creative Analysis

Feed structured ad_content.body and title fields into LLM classifiers to cluster themes, score messaging quality, detect prohibited claims, categorize offer types, and generate creative briefs at scale.

Automated Competitive Monitoring

Schedule nightly runs for tracked brands and keywords. Alert on new campaign launches, budget increases, or creative pivots using ad_archive_id diff logic.

RAG & Knowledge Base Construction

Index ad creative text with temporal metadata in vector databases. Enable queries like "what messaging did Brand X use for summer promotions?" using semantic retrieval with timestamped ad records.

Market Research & Category Analysis

Map the advertiser landscape for any keyword: who is advertising, at what volume, on which platforms, and with what creative approaches. Export for BI analysis, client reporting, and strategy briefings.


Why Choose This Actor

  • No cookies, no credentials — direct GraphQL API access to Meta's public transparency data
  • Flexible seeds — keyword OSINT, known Page ID dossiers, and custom URL presets in one actor
  • Impression-ranked results — sort by total_impressions for spend-weighted competitive analysis
  • Universal language support — 100+ languages for global ad research workflows
  • Production schema — OUTPUT.data.schema documents field families for TypeScript/Python codegen
  • Consistent envelope — same nested structure as the Scrapeify Instagram and WhatsApp actors for unified warehousing

Quick Start

  1. Open the Scrapeify Meta Ad Scraper on Apify Console.
  2. Choose exactly one seed: enter a keyword, a numeric pageId, or paste a full adLibraryUrl.
  3. Set maxResults (start with 50 to validate filters).
  4. Optionally set sortBy (total_impressions for spend-weighted ranking, most_recent for freshness).
  5. After completion: export Dataset as JSONL, or read OUTPUT.summary for aggregates.

Tip: To find a brand's numeric Page ID, use the Scrapeify Brand Finder actor first.


Input Schema

{
  "keyword": "nike",
  "maxResults": 120,
  "sortBy": "total_impressions"
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| keyword | string | One-of | Brand name or topic search (e.g. nike, fintech app). Default: nike. |
| pageId | string | One-of | Numeric Facebook Page ID. Use Brand Finder to resolve from brand name. |
| adLibraryUrl | string | One-of | Full Ad Library URL. Must contain facebook.com/ads/library or meta.com/ads/library. |
| maxResults | integer | Yes | Ads to collect. Minimum 1. No hard upper limit; paginate to exhaustion. Default: 50. |
| sortBy | enum | No | total_impressions (default) or most_recent. |

Exactly one of keyword, pageId, or adLibraryUrl should be active per run.
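
A client-side pre-check can catch the most common input mistakes before a run is started. The sketch below mirrors the rules in the table above (one seed, numeric pageId, Ad Library domain in the URL, maxResults of at least 1, known sortBy value). The function name `validate_run_input` is hypothetical and this is not the actor's own validation code.

```python
import re

SEED_FIELDS = ("keyword", "pageId", "adLibraryUrl")
SORT_OPTIONS = ("total_impressions", "most_recent")
AD_LIBRARY_HOSTS = ("facebook.com/ads/library", "meta.com/ads/library")


def validate_run_input(run_input):
    """Return a list of problems found in run_input (empty list = looks valid).

    Hypothetical pre-check based on the Input Schema table, not the actor's code.
    """
    problems = []
    seeds = [f for f in SEED_FIELDS if run_input.get(f)]
    if len(seeds) != 1:
        problems.append(f"expected exactly one seed, got {len(seeds)}: {seeds}")
    page_id = run_input.get("pageId")
    if page_id and not re.fullmatch(r"\d+", str(page_id)):
        problems.append("pageId must be numeric")
    url = run_input.get("adLibraryUrl")
    if url and not any(host in url for host in AD_LIBRARY_HOSTS):
        problems.append("adLibraryUrl must contain an Ad Library domain")
    max_results = run_input.get("maxResults")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(max_results, int) or isinstance(max_results, bool) or max_results < 1:
        problems.append("maxResults must be an integer >= 1")
    if run_input.get("sortBy", "total_impressions") not in SORT_OPTIONS:
        problems.append(f"sortBy must be one of {SORT_OPTIONS}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to show all issues at once in a form or CI check.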


Output Schema

Dataset Row (one row per ad)

{
  "metadata": {
    "scraped_at": "2026-05-07T04:00:00.000Z",
    "ad_archive_id": "123456789012345",
    "ad_id": "987654321",
    "page_id": "1234567890",
    "page_name": "Example Brand",
    "page_like_count": 2400000,
    "page_profile_uri": "https://www.facebook.com/examplebrand",
    "page_categories": ["Retail", "Shopping & Retail"]
  },
  "ad_content": {
    "body": "Summer sale — 40% off sitewide. Free shipping on orders over $50.",
    "title": "Shop the Summer Sale",
    "link_url": "https://www.example.com/summer-sale",
    "link_description": "Ends July 31.",
    "cta_text": "Shop Now",
    "cta_type": "SHOP_NOW",
    "cards": [],
    "images": ["https://scontent.xx.fbcdn.net/v/t45.1600-4/..."],
    "videos": []
  },
  "timing": {
    "start_date": "2026-05-01",
    "end_date": null,
    "total_active_time": null
  },
  "performance": {
    "spend": 50000,
    "currency": "USD",
    "impressions": 3100000,
    "impressions_index": 85,
    "reach_estimate": null
  },
  "distribution": {
    "publisher_platform": ["FACEBOOK", "INSTAGRAM"],
    "targeted_or_reached_countries": ["US", "CA", "GB", "AU"],
    "political_countries": []
  },
  "status": {
    "is_active": true,
    "is_aaa_eligible": false,
    "is_reshared": false,
    "has_user_reported": false,
    "contains_sensitive_content": false
  },
  "additional_info": {
    "categories": [],
    "archive_types": [],
    "collation_count": 1,
    "collation_id": null,
    "display_format": "IMAGE"
  }
}
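
For CSV export or a columnar warehouse, the nested row above often needs to become a single-level record. The following is one possible sketch; the helper name `flatten_ad` and the `group.field` key convention are assumptions for illustration, and list values such as `images` or `publisher_platform` are kept as lists.

```python
def flatten_ad(ad, sep="."):
    """Flatten one nested ad row into a single-level dict of 'group.field' keys.

    Hypothetical helper: only the first nesting level is flattened.
    """
    flat = {}
    for group, fields in ad.items():
        if isinstance(fields, dict):
            for name, value in fields.items():
                flat[f"{group}{sep}{name}"] = value
        else:
            # Keep any unexpected top-level scalar or list as-is.
            flat[group] = fields
    return flat
```

Keeping the group prefix in each key avoids collisions between similarly named fields in different groups.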

Run Summary (OUTPUT key in default KV store)

{
  "summary": {
    "totalAds": 120,
    "searchType": "keyword_unordered",
    "identifier": "nike",
    "adLibraryUrl": "https://www.facebook.com/ads/library/?country=ALL&q=nike&search_type=keyword_unordered&sort_data[mode]=total_impressions",
    "sortBy": "total_impressions",
    "keyword": "nike",
    "pageId": null,
    "urlParams": {
      "country": "ALL",
      "ad_type": "all",
      "active_status": "all"
    },
    "totalSpend": 125000.50,
    "totalImpressions": 9800000,
    "uniquePlatforms": ["FACEBOOK", "INSTAGRAM", "MESSENGER"],
    "scrapedAt": "2026-05-07T04:00:00.000Z",
    "pagesScraped": 14,
    "executionTimeSeconds": 42.3,
    "status": "SUCCESS",
    "error": null
  },
  "data": {
    "ads": ["...array of full ad objects..."],
    "schema": {
      "metadata": ["scraped_at", "ad_archive_id", "ad_id", "page_id", "page_name", "page_like_count", "page_profile_uri", "page_categories"],
      "ad_content": ["body", "title", "link_url", "link_description", "cta_text", "cta_type", "cards", "images", "videos"],
      "timing": ["start_date", "end_date", "total_active_time"],
      "performance": ["spend", "currency", "impressions", "impressions_index", "reach_estimate"],
      "distribution": ["publisher_platform", "targeted_or_reached_countries", "political_countries"],
      "status": ["is_active", "is_aaa_eligible", "is_reshared", "has_user_reported", "contains_sensitive_content"],
      "additional_info": ["categories", "archive_types", "collation_count", "collation_id", "display_format"]
    }
  }
}

| Field | Type | Description |
| --- | --- | --- |
| summary.totalAds | integer | Total ads collected in this run |
| summary.searchType | string | keyword_unordered, page, or url_based |
| summary.totalSpend | number | Sum of spend estimates (mixed currencies possible) |
| summary.totalImpressions | number | Sum of impression estimates |
| summary.uniquePlatforms | array | Distinct publisher platforms in collected ads |
| summary.pagesScraped | integer | GraphQL pagination pages fetched |
| summary.executionTimeSeconds | number | Total scrape duration |
| summary.status | string | SUCCESS, ERROR, NO_RESULTS, or TIMEOUT |
| summary.error | object/null | Typed error: type, message, details, suggestion |
| data.schema | object | Field group documentation for codegen and schema drift detection |

API Examples

cURL

curl "https://api.apify.com/v2/acts/scrapeify~meta-facebook-ad-scrapper-using-ad-library-url-premium/runs?token=$APIFY_TOKEN" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "pageId": "15087023444",
    "maxResults": 80,
    "sortBy": "most_recent"
  }'

Python

import os

from apify_client import ApifyClient

client = ApifyClient(os.environ["APIFY_TOKEN"])

# Search by keyword
run = client.actor("scrapeify/meta-ad-library-scraper").call(
    run_input={"keyword": "fintech app", "maxResults": 200, "sortBy": "total_impressions"}
)

# Or by Page ID (this starts a separate run; `run` now refers to it)
run = client.actor("scrapeify/meta-ad-library-scraper").call(
    run_input={"pageId": "15087023444", "maxResults": 400}
)

# Print the ad copy and impression estimate for each collected ad
for ad in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(ad["ad_content"]["body"], ad["performance"].get("impressions"))

JavaScript / Node.js

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Search by full Ad Library URL (with custom geo/date filters)
const run = await client.actor("scrapeify/meta-ad-library-scraper").call({
  adLibraryUrl: "https://www.facebook.com/ads/library/?country=ALL&q=travel&search_type=keyword_unordered",
  maxResults: 150,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Collected ${items.length} ads across ${new Set(items.flatMap(a => a.distribution?.publisher_platform || [])).size} platforms`);

Integration Examples

ChatGPT / Custom GPT Actions

Register the Apify run endpoint as a Custom GPT action. Return structured ad JSON — body, title, spend, impressions, platforms — for the model to analyze creative strategies, summarize competitor messaging, or build category benchmark tables.

Claude Tool Use

# Assumes `client` is the ApifyClient from the Python example above and
# `tool` is the tool decorator provided by the agent framework in use.
@tool
def search_meta_ads(keyword: str, max_results: int = 100, sort_by: str = "total_impressions") -> list:
    """Search Meta Ad Library for ad creatives by keyword. Returns structured ad data."""
    run = client.actor("scrapeify/meta-ad-library-scraper").call(
        run_input={"keyword": keyword, "maxResults": max_results, "sortBy": sort_by}
    )
    return client.dataset(run["defaultDatasetId"]).list_items().items

Gemini

Pass 500+ ad body + title pairs into Gemini's long-context window for thematic clustering, category labeling, and messaging trend identification at scale.

LangChain

from langchain.tools import tool

# Assumes `client` is the ApifyClient from the Python example above.
@tool
def get_competitor_ads(brand_name: str, max_ads: int = 200) -> list:
    """Retrieve Meta Ad Library creatives for a brand. Sorted by total impressions."""
    run = client.actor("scrapeify/meta-ad-library-scraper").call(
        run_input={"keyword": brand_name, "maxResults": max_ads, "sortBy": "total_impressions"}
    )
    return client.dataset(run["defaultDatasetId"]).list_items().items

Chain: search_meta_ads → analyze_creative_themes → generate_brief

CrewAI

# BrandResearchAgent: resolves Page IDs using Brand Finder
# AdAnalystAgent: pulls creatives using Meta Ad Scraper
# CreativeStrategistAgent: generates briefs from structured ad data
# ComplianceAgent: flags potential policy violations in ad copy

AutoGen

Multi-agent workflow: UserProxyAgent specifies competitive research scope; DataAgent pulls ad creative data; AnalysisAgent clusters by theme; ReportAgent summarizes with spend-weighted ranking.

n8n / Make.com / Zapier

Cron trigger → Apify run → compare new ad_archive_id values against last run → push new ad launches to Slack or email alert.

RAG Systems

Chunk ad_content.body with metadata (page_name, start_date, publisher_platform, impressions). Index in Pinecone, Weaviate, or Qdrant. Enable queries like "what discount messaging do fitness brands use on Facebook?"

Vector Databases

Embed body + title as semantic vectors. Store page_id, impressions, is_active, publisher_platform, and start_date as filterable metadata for faceted creative research.
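
As a rough sketch of the record shape described above, the helper below turns one Dataset row into an embedding input string plus filterable metadata. The name `to_vector_record` and the output keys (`id`, `text`, `metadata`) are assumptions for illustration; the actual upsert call depends on the vector database client and is left out.

```python
def to_vector_record(ad):
    """Build an embedding input string and filterable metadata from one ad row.

    Hypothetical helper: output keys are illustrative, not tied to any vector DB.
    """
    content = ad.get("ad_content") or {}
    meta = ad.get("metadata") or {}
    # Join title and body, skipping whichever is missing or empty.
    text = "\n".join(p for p in (content.get("title"), content.get("body")) if p)
    return {
        "id": meta.get("ad_archive_id"),
        "text": text,
        "metadata": {
            "page_id": meta.get("page_id"),
            "page_name": meta.get("page_name"),
            "impressions": (ad.get("performance") or {}).get("impressions"),
            "is_active": (ad.get("status") or {}).get("is_active"),
            "publisher_platform": (ad.get("distribution") or {}).get("publisher_platform") or [],
            "start_date": (ad.get("timing") or {}).get("start_date"),
        },
    }
```

Using `ad_archive_id` as the record ID means re-indexing the same ad from a later run overwrites the earlier vector instead of duplicating it, provided the store upserts by ID.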


Frequently Asked Questions

1. Do I need cookies or a Facebook account? No. The actor uses Meta's public GraphQL API endpoint without any authentication.

2. Can I use both facebook.com and meta.com Ad Library URLs? Yes — both URL patterns are accepted for the adLibraryUrl input.

3. What is the difference between sortBy options? total_impressions ranks ads by estimated impression volume (high to low) — best for spend-weighted competitive analysis. most_recent uses Meta's relevancy monthly grouped view — better for tracking campaign freshness.

4. How do I find a brand's numeric Page ID? Use the Scrapeify Brand Finder actor — enter the brand name and receive deduplicated Page ID candidates.

5. Is there a maximum number of ads per run? No enforced cap — the actor paginates until maxResults is reached or the Ad Library returns no further results.

6. Are spend and impressions figures exact? No — Meta's Ad Library publishes transparency estimates, not audited financial figures. Treat as directional signals.

7. What is NO_RESULTS status? The Ad Library returned no matching ads for the given filter combination. Not a credential failure — try broadening the search keyword, country scope, or active status filter.

8. How do I scrape Instagram-only ads? Use the dedicated Instagram Ad Scraper which enforces the Instagram platform filter at URL construction.

9. What does sortBy ignored mean? A sort_data parameter embedded in a pasted URL may conflict with the sortBy input, so the requested sort order may not be applied. Inspect the final adLibraryUrl in OUTPUT.summary to verify sort propagation.

10. How do I deduplicate across runs? Key on metadata.ad_archive_id — the stable unique identifier for each ad in the Meta Ad Library.

11. Can I filter by country or date in the URL? Yes — encode geographic and date filters in a custom adLibraryUrl pasted from the browser's Ad Library address bar.

12. Are carousel ads returned as one row or multiple? One Dataset row per ad. Carousel cards are nested in ad_content.cards[].

13. What does display_format contain? The ad creative format: IMAGE, VIDEO, CAROUSEL, DPA, etc. from additional_info.display_format.

14. Does it support non-Latin brand names? Yes — 100+ languages supported including Arabic, Chinese, Japanese, Korean, and Russian. UTF-8 throughout.

15. What causes TIMEOUT status? Very deep pagination for a large maxResults may hit the 300-second watchdog. Reduce maxResults and rerun, or shard the work across multiple actor runs.

16. Are inactive/archived ads included? Depends on URL parameters. Pass active_status=all in adLibraryUrl to include both active and archived creatives.

17. Can I extract video creative URLs? Yes — ad_content.videos[] contains URLs when video assets are present. Fetching media is your responsibility.

18. What does collation_count mean? Number of ad variations Meta groups under one creative ID. Usually 1 for individual ads; higher for A/B test variants.

19. Is there a webhook for run completion? Yes — configure Apify webhooks on RUN.SUCCEEDED to automatically pull the Dataset after completion.

20. How do I handle FX-heterogeneous spend aggregates? Check performance.currency per ad. Normalize to a common currency before summing spend across a dataset.

21. Can I scrape political advertising? Yes — political ad data is available in Meta's transparency library. Observe legal and ethical obligations in your jurisdiction.

22. What is impressions_index? A normalized impression index (0–100) from Meta's API. Useful for relative ranking when absolute numbers aren't available.

23. How do I set up a nightly competitive monitoring pipeline? Schedule Apify runs → compare ad_archive_id sets against yesterday's run → route new IDs to Slack alerts → push full records to your data warehouse.

24. Does the actor work for WhatsApp-specific ads? For WhatsApp CTA ads, use the dedicated WhatsApp Ad Scraper which enforces the WhatsApp platform filter.

25. Are there data retention or compliance obligations? Meta's Ad Library terms of service apply. Verify data retention policies for your use case — especially for political ad data and PII in page names.
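
Questions 11 and 16 above describe encoding filters in a custom adLibraryUrl. The sketch below assembles such a URL from the query parameters that appear in the Run Summary example (country, q, search_type, sort_data[mode], ad_type, active_status). The helper name `build_ad_library_url` is hypothetical, and date-range parameters are deliberately left out because their names are not documented here; copying the URL from the browser remains the safer route for those.

```python
from urllib.parse import urlencode


def build_ad_library_url(keyword, country="ALL", active_status="all",
                         ad_type="all", sort_mode=None):
    """Assemble a keyword-search Ad Library URL.

    Hypothetical helper: parameter names are taken from the Run Summary
    example in this README, not from an official Meta specification.
    """
    params = {
        "active_status": active_status,
        "ad_type": ad_type,
        "country": country,
        "q": keyword,
        "search_type": "keyword_unordered",
    }
    if sort_mode:
        params["sort_data[mode]"] = sort_mode
    # safe="[]" keeps the brackets in sort_data[mode] unescaped.
    return "https://www.facebook.com/ads/library/?" + urlencode(params, safe="[]")
```

For example, `build_ad_library_url("fintech app", country="US")` yields a URL that can be passed as the adLibraryUrl input.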


Best Practices

  • Start with small maxResults to validate geographic and media filters before scaling
  • Log adLibraryUrl from OUTPUT.summary when debugging sort or filter behavior — inspect the final constructed URL
  • Use exponential backoff at orchestration level on TIMEOUT errors
  • Store OUTPUT + Dataset IDs together for reproducibility audits
  • Deduplicate on ad_archive_id in your warehouse to prevent duplicate rows from overlapping runs
  • Archive adLibraryUrl (from OUTPUT.summary) with each analytical snapshot for reproducibility
  • Normalize currencies before aggregation — performance.currency varies per ad row
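
The currency practice above can be sketched in two steps: first total spend per currency, then convert with rates supplied by the caller. Both helper names are hypothetical, and the exchange rates in any real pipeline would come from a source of your choosing; none are provided by the actor.

```python
from collections import defaultdict


def spend_by_currency(ads):
    """Sum spend estimates per currency code, skipping rows without spend or currency."""
    totals = defaultdict(float)
    for ad in ads:
        perf = ad.get("performance") or {}
        spend, currency = perf.get("spend"), perf.get("currency")
        if spend is None or not currency:
            continue
        totals[currency] += spend
    return dict(totals)


def convert_total(ads, rates):
    """Convert per-currency totals into one currency.

    `rates` maps currency code -> units of the target currency per one unit
    of that currency (caller-supplied assumption). Raises KeyError if a
    currency present in the data has no rate.
    """
    return sum(amount * rates[cur] for cur, amount in spend_by_currency(ads).items())
```

Raising on a missing rate is a deliberate choice here: silently skipping a currency would understate the total.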

Performance & Scalability

| Factor | Guidance |
| --- | --- |
| Throughput | Pagination-bound; each GraphQL page returns ~25 ads |
| Timeout risk | Very large maxResults may hit 300s; shard by keyword or date range |
| Horizontal scaling | Multiple parallel actor runs per brand shard or geographic segment |
| Memory | Consistent JSON rows; predictable at any scale |
| Large datasets | Prefer Dataset export over KV-embedded arrays for runs returning 1000+ ads |
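
For planning purposes, the guidance above can be turned into a rough page estimate and a set of per-keyword run inputs. This is a back-of-the-envelope sketch: it takes the ~25 ads per page figure at face value (actual page sizes may differ), and both helper names are hypothetical.

```python
import math

ADS_PER_PAGE = 25  # approximate figure from the table above; real pages may vary


def estimate_pages(max_results, ads_per_page=ADS_PER_PAGE):
    """Rough number of pagination pages needed to collect max_results ads."""
    return math.ceil(max_results / ads_per_page)


def shard_inputs(keywords, max_results_per_shard, sort_by="total_impressions"):
    """Build one run input per keyword so shards can run in parallel."""
    return [
        {"keyword": kw, "maxResults": max_results_per_shard, "sortBy": sort_by}
        for kw in keywords
    ]
```

Each dict returned by `shard_inputs` has the same shape as the Input Schema example, so it can be passed as the run input of a separate actor run.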

AI & Automation Workflows

Creative clustering pipeline: Embed body + title → cluster by cosine similarity → label theme groups (discount, social proof, urgency, education, product) → generate category-level creative briefs.

Spend intelligence: Pull total_impressions-sorted ads for top 20 competitors → aggregate by page_name → build spend-proxy leaderboard → update BI dashboard weekly.

Compliance screening: Extract ad_content.body for all active ads in a regulated vertical → run policy-rule LLM classifier → flag potential violations → queue for legal review.

Campaign launch detection: Schedule daily runs → diff ad_archive_id sets → alert on new clusters of IDs indicating budget scaling or new campaign launches.
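
The diff step in the campaign launch detection workflow can be sketched as a set comparison on ad_archive_id. The helper name `new_ad_ids` is hypothetical, and both arguments are assumed to be lists of Dataset rows from two runs of the same query.

```python
def new_ad_ids(previous_ads, current_ads):
    """Return ad_archive_id values present in current_ads but not in previous_ads.

    Hypothetical helper: order follows current_ads; duplicates are dropped.
    """
    previous = {ad["metadata"]["ad_archive_id"] for ad in previous_ads}
    seen = set()
    fresh = []
    for ad in current_ads:
        ad_id = ad["metadata"]["ad_archive_id"]
        if ad_id not in previous and ad_id not in seen:
            seen.add(ad_id)
            fresh.append(ad_id)
    return fresh
```

An alerting job could treat a non-empty result as a signal worth routing to Slack or email, with the threshold for a "cluster" of new IDs left to the operator.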


Error Handling

| Scenario | Behavior |
| --- | --- |
| Multiple seeds active | Validation expects single coherent seed; leave others empty |
| Non-numeric pageId | Validation error with descriptive message |
| Invalid adLibraryUrl hostname | Validation error; must contain Ad Library domain |
| NO_RESULTS | Clean completion; summary.status = NO_RESULTS, not a failure |
| TIMEOUT | Reduce maxResults; check pagesScraped for progress |
| SCRAPING_ERROR | Typed error with details and suggestion; check proxy health |
| DATA_STORAGE_ERROR | Dataset write failure; partial results may be preserved |

Error types surfaced: INVALID_INPUT, INVALID_URL, SCRAPING_ERROR, TIMEOUT, DATA_STORAGE_ERROR, UNKNOWN_ERROR — each with details and suggestion strings.
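
An orchestrator can map the summary status and error type to a next step. The sketch below is one possible policy, not the actor's behavior: it treats SUCCESS and NO_RESULTS as finished, retries TIMEOUT and SCRAPING_ERROR with exponential backoff, and fails on everything else. The function name `next_action`, the retry limits, and the choice to retry SCRAPING_ERROR are all assumptions.

```python
def next_action(summary, attempt, max_attempts=4, base_delay=30.0):
    """Decide what to do after a run, given OUTPUT.summary.

    Returns (action, delay_seconds) where action is "done", "retry" or "fail".
    Hypothetical policy; `attempt` is the zero-based count of retries so far.
    """
    status = summary.get("status")
    if status in ("SUCCESS", "NO_RESULTS"):
        # NO_RESULTS is a clean completion, not a failure.
        return ("done", 0.0)
    error_type = (summary.get("error") or {}).get("type")
    retryable = status == "TIMEOUT" or error_type in ("TIMEOUT", "SCRAPING_ERROR")
    if retryable and attempt < max_attempts:
        # Exponential backoff: 30s, 60s, 120s, ... with the default base_delay.
        return ("retry", base_delay * 2 ** attempt)
    return ("fail", 0.0)
```

Input errors such as INVALID_INPUT or INVALID_URL are not retried because the same input would fail again; for TIMEOUT, lowering maxResults before the retry follows the guidance in the table above.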


Trust & Reliability

Scrapeify operates this actor with explicit validation before network work, summary-first outputs for dashboards, and comprehensive schema documentation in OUTPUT.data.schema — designed for teams building production competitive intelligence and creative analytics pipelines.


Explore the full Scrapeify suite — chain these actors together for end-to-end automation pipelines:

| Actor | What it does |
| --- | --- |
| Amazon Scraper | ASINs, prices, sponsored flags across 23 marketplaces |
| Instagram Ad Library Scraper | Instagram-only ads from Meta Ad Library |
| WhatsApp Ad Scraper | Click-to-WhatsApp ad creatives |
| YouTube Video Downloader | Videos & audio to Apify Key-Value Store |
| Meta Brand & Page ID Finder | Resolve brand names to numeric Page IDs |
| Google Maps Scraper | Local business leads, reviews, emails, contacts |
| Google News Scraper | Headlines, sources, article URLs (up to 2K) |

Meta and Facebook are trademarks of Meta Platforms, Inc. This actor is not affiliated with or endorsed by Meta.