Facebook Photos/Images Scraper premium

Instantly extract high-quality Facebook images and rich metadata: production-ready, secure, and plug-and-play. Save time, scale safely with proxy support, and integrate outputs into your workflows. Trusted for high-volume runs. Run now in Apify Console and get results in seconds.

Pricing

$8.75/month + usage

Rating

0.0

(0)

Developer

Neuro Scraper

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

21 days ago

Last modified

🌟 Facebook Image Scraper & Downloader — Metadata Edition


One-liner

Instantly extract images and rich metadata from public Facebook pages & posts — plug-and-play Apify Actor that runs in seconds and stores images + metadata securely.

Production-ready · privacy-safe · enterprise-friendly — run in Console with zero setup.


📖 What this Actor does

This Actor fetches a Facebook post or page URL, discovers high-quality image assets and rich metadata (titles, publish times, authors, inferred dimensions), stores image bytes in Key-Value storage and writes structured metadata to your Actor output (Dataset / KV). It is designed for fast, reliable runs with minimal configuration.

Outcome-focused: get downloadable images plus clean JSON metadata ready for analytics, cataloguing, or ingestion into downstream systems.


💡 Use cases / When to use

  • Archive public post images & captions for research, reporting, or media monitoring.
  • Build an image dataset with standardized metadata for ML or content pipelines.
  • Quickly fetch featured images and thumbnails for a list of public Facebook post URLs.
  • Power automated media catalogs and dashboards without running a browser.

⚡ Quick Start — Console (one-click)

  1. Go to the Actor in Apify Console.
  2. Paste one or more Facebook post or page URLs into the Input field (see example below).
  3. Click Run → results appear in seconds in the Dataset and Key-Value store.

Hero tip: Use the preserve_downloads option to keep a local copy of the images in the Actor output if desired.


⚙️ Quick Start — CLI & API

CLI (one-liner)

$ apify run --input-file input.example.json

Python — apify-client (compact)

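from apify_client import ApifyClient

# Authenticate with your API token (keep it out of source control).
client = ApifyClient('<APIFY_TOKEN>')

# start() returns as soon as the run is created; use call() instead to wait for it to finish.
run = client.actor('your-username/facebook-image-scraper').start(run_input={
    'startUrls': [{'url': 'https://www.facebook.com/example/posts/1234567890'}],
    'maxConcurrency': 5,
})
print('Run started:', run['id'])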

Replace <APIFY_TOKEN> with your secret. Store tokens in Apify Console secrets — never paste them into input fields.


📝 Inputs (fields & schema)

Paste the following JSON example into the Actor Input panel in Console. This is the minimal example; startUrls accepts either an array of { "url": "..." } objects or a single URL string.

Console JSON input example (also saved as input.example.json):

{
"startUrls": [
{ "url": "https://www.facebook.com/example/posts/1234567890" }
],
"maxConcurrency": 5,
"verify_tls": true,
"preserve_downloads": false
}

Fields (summary)

  • startUrls — list of objects { "url": "..." } or a single URL string. Required.
  • maxConcurrency — integer, how many pages to process concurrently. Default: 5.
  • verify_tls — boolean, whether to verify TLS certificates. Default: true.
  • preserve_downloads — boolean, store downloaded files into output/ in addition to Key-Value. Default: false.
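
The field rules above can be checked locally before a run is started. The following is a minimal sketch, not the Actor's own validation code: the function name normalize_input and the DEFAULTS dictionary are hypothetical, and the defaults are simply copied from the summary above.

```python
from typing import Any

# Defaults copied from the field summary; the Actor's real defaults may differ.
DEFAULTS = {"maxConcurrency": 5, "verify_tls": True, "preserve_downloads": False}


def normalize_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the input with startUrls as a list of URL strings
    and any missing optional fields filled in from DEFAULTS."""
    start = raw.get("startUrls")
    if isinstance(start, str):
        # A single URL string is accepted as shorthand for a one-item list.
        start = [start]
    if not start:
        raise ValueError("startUrls is required")

    urls = []
    for entry in start:
        # Each entry is either {"url": "..."} or a bare URL string.
        url = entry.get("url") if isinstance(entry, dict) else entry
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            raise ValueError(f"invalid start URL: {entry!r}")
        urls.append(url)

    return {**DEFAULTS, **raw, "startUrls": urls}
```

Calling normalize_input({"startUrls": "https://www.facebook.com/example/posts/1234567890"}) yields a one-element URL list with maxConcurrency set to 5, while an empty input raises ValueError, mirroring the "No startUrls provided" error listed under troubleshooting.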

⚙️ Configuration (friendly table)

| 🔑 Name | 📝 Type | ❓ Required | ⚙️ Default | 📌 Example | 🧠 Notes |
| --- | --- | --- | --- | --- | --- |
| startUrls | array/string | ✅ Yes | None | [{"url":"https://..."}] | One or many Facebook URLs (post/page). |
| maxConcurrency | integer | ⚙️ Optional | 5 | 10 | Increase for faster runs; watch rate limits. |
| verify_tls | boolean | ⚙️ Optional | true | false | Disable for environments with custom certs (not recommended). |
| preserve_downloads | boolean | ⚙️ Optional | false | true | Stores image bytes locally in run output folder. |
| proxyConfig | object | ⚙️ Optional | {} | {"useApifyProxy": true} | Use Apify Proxy or custom proxies for scale. TODO: confirm advanced proxy rotation. |

Console setup hint: Paste a URL into startUrls → Click Run. For bulk runs, provide an array of URLs.


📄 Outputs (Dataset / KV examples)

What you get:

  • Dataset: One JSON record per discovered image containing page metadata and image metadata.
  • Key-Value Store (KV): Binary image records (named keys) and per-image metadata JSON stored as separate KV records.
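
Both outputs can be read back with apify-client once a run has finished. The sketch below pairs each Dataset record with the image bytes stored under its kv_key. It assumes the images are written to the run's default Key-Value store and that records nest the key under image.kv_key, as in the example record in this README; collect_results is a hypothetical helper name.

```python
def collect_results(client, run: dict) -> list[dict]:
    """Pair each Dataset record with the image bytes stored under its kv_key.

    `client` is an apify_client.ApifyClient (or any object with the same
    dataset() / key_value_store() methods); `run` is the run record
    returned by call() or start().
    """
    dataset = client.dataset(run["defaultDatasetId"])
    store = client.key_value_store(run["defaultKeyValueStoreId"])

    results = []
    for item in dataset.iterate_items():
        key = item.get("image", {}).get("kv_key")
        # get_record_as_bytes returns None when the key does not exist.
        record = store.get_record_as_bytes(key) if key else None
        results.append({"item": item, "bytes": record["value"] if record else None})
    return results


# Usage (requires network access and a valid token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<APIFY_TOKEN>")
#   run = client.actor("your-username/facebook-image-scraper").call(run_input={...})
#   results = collect_results(client, run)
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub in tests, without touching the network.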

Example output JSON (single image record)

[
{
"scraped_at": "2025-12-19T18:28:19.795929Z",
"page": {
"page_url": "https://web.facebook.com/share/p/17cZow1ZhN/",
"title": "স্বপ্নের শ্রীমঙ্গল",
"description": "এই ছবির দিকে সারাদিন তাকিয়ে থাকা যায়।",
"site_name": "",
"type": "video.other",
"canonical": "https://web.facebook.com/SopnerSrimangal/posts/%E0%A6%8F%E0%A6%87-%E0%A6%9B%E0%A6%AC%E0%A6%BF%E0%A6%B0-%E0%A6%A6%E0%A6%BF%E0%A6%95%E0%A7%87-%E0%A6%B8%E0%A6%BE%E0%A6%B0%E0%A6%BE%E0%A6%A6%E0%A6%BF%E0%A6%A8-%E0%A6%A4%E0%A6%BE%E0%A6%95%E0%A6%BF%E0%A7%9F%E0%A7%87-%E0%A6%A5%E0%A6%BE%E0%A6%95%E0%A6%BE-%E0%A6%AF%E0%A6%BE%E0%A7%9F/880943704459640/",
"published_time": "",
"modified_time": "",
"author": "This browser is not supported",
"og_image": "https://scontent-iad3-2.xx.fbcdn.net/v/t39.30808-6/594996458_880943671126310_2017080706398143877_n.jpg?stp=cp0_dst-jpg_e15_fr_q65_tt6&cstp=mx1536x2048&ctp=p600x600&_nc_cat=100&ccb=1-7&_nc_sid=b96d88&_nc_ohc=8V8IvVtlgCgQ7kNvwEtjXBZ&_nc_oc=AdlxFojSn0b22sh_zTxhc4c84jo3NBmMZJTsP1W503Vr7Epn6w2lLg4HDPAPfihJwao&_nc_ad=z-m&_nc_cid=0&_nc_zt=23&_nc_rml=0&_nc_ht=scontent-iad3-2.xx&_nc_gid=ipwL1QJVM6IrfnY8Y9rdGQ&oh=00_AflVdL2idu73SpYjOiGXhz7U8MfdvZFb0ZN4YmzbM6Zd_Q&oe=694B727E",
"json_ld": [],
"post_id": "409962623085609"
},
"image": {
"url": "https://scontent-iad3-2.xx.fbcdn.net/v/t39.30808-6/594996458_880943671126310_2017080706398143877_n.jpg?stp=cp0_dst-jpg_e15_fr_q65_tt6&cstp=mx1536x2048&ctp=p600x600&_nc_cat=100&ccb=1-7&_nc_sid=b96d88&_nc_ohc=8V8IvVtlgCgQ7kNvwEtjXBZ&_nc_oc=AdlxFojSn0b22sh_zTxhc4c84jo3NBmMZJTsP1W503Vr7Epn6w2lLg4HDPAPfihJwao&_nc_ad=z-m&_nc_cid=0&_nc_zt=23&_nc_rml=0&_nc_ht=scontent-iad3-2.xx&_nc_gid=ipwL1QJVM6IrfnY8Y9rdGQ&oh=00_AflVdL2idu73SpYjOiGXhz7U8MfdvZFb0ZN4YmzbM6Zd_Q&oe=694B727E",
"origin_tag": "meta:og:image",
"src_attr": {},
"chosen_ext": "jpg",
"content_type": "image/jpeg",
"content_length": 29888,
"width": 600,
"height": 800,
"kv_key": "IMAGE_1.jpg",
"local_path": "/usr/src/app/output/IMAGE_1.jpg",
"kv_direct_url": "https://api.apify.com/v2/key-value-stores/~/records/IMAGE_1.jpg",
"downloaded_at": "2025-12-19T18:28:19.795963Z",
"index_in_run": 1,
"meta_kv_key": "IMAGE_1_meta.json",
"meta_kv_direct_url": "https://api.apify.com/v2/key-value-stores/~/records/IMAGE_1_meta.json"
}
}
]

NOTE: kv_direct_url appears only when the Actor runs on the Apify platform and the KV record exists. TODO: confirm final Dataset vs KV output structure — inferred from Actor code but not explicitly documented.
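
For analytics or cataloguing, the nested record above is often easier to handle as flat rows. The following sketch converts records to CSV using only the standard library. The column selection and the helper names flatten and to_csv are illustrative choices, and the field names are taken from the example record, whose structure is itself still marked as unconfirmed.

```python
import csv
import io

# Columns chosen for illustration; extend as needed.
COLUMNS = ["page_url", "title", "image_url", "content_type", "width", "height", "kv_key"]


def flatten(record: dict) -> dict:
    """Pick a few page and image fields out of one nested output record."""
    page = record.get("page", {})
    image = record.get("image", {})
    return {
        "page_url": page.get("page_url", ""),
        "title": page.get("title", ""),
        "image_url": image.get("url", ""),
        "content_type": image.get("content_type", ""),
        "width": image.get("width"),
        "height": image.get("height"),
        "kv_key": image.get("kv_key", ""),
    }


def to_csv(records: list[dict]) -> str:
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(flatten(r) for r in records)
    return buf.getvalue()
```

Missing fields become empty cells rather than errors, which helps when a page exposes no dimensions or publish time.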


🔑 Environment Variables & Secrets

  • APIFY_TOKEN: your Apify API token, used by apify-client and the API; keep it in Console secrets. Placeholder: <APIFY_TOKEN>
  • HTTP_PROXY, HTTPS_PROXY: custom proxy URLs, if you are not using Apify Proxy. Placeholder: <PROXY_USER:PASS@HOST:PORT>

Security note: Always store tokens and proxy credentials in Apify Console Secrets, not plain input fields.


▶️ How to Run (Console, CLI, API)

Console (recommended)

  1. Open the Actor in Apify Console.
  2. Paste input.example.json content into the Input panel.
  3. Enable Apify Proxy (optional) → Click Run.
  4. Inspect Dataset and Key-Value store when run finishes.

CLI

$ apify run --input-file input.example.json

API (quick)

curl -X POST "https://api.apify.com/v2/acts/your-user~facebook-image-scraper/runs?token=<APIFY_TOKEN>" \
-H "Content-Type: application/json" \
-d @input.example.json
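
The same request can be built in Python with the standard library. This sketch sends the token in an Authorization: Bearer header rather than the query string, which keeps it out of URLs and access logs. build_run_request is a hypothetical helper name and the Actor ID is the same placeholder used in the curl example.

```python
import json
import urllib.request


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an Actor run."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Usage (performs a real network call):
#   req = build_run_request("your-user~facebook-image-scraper", "<APIFY_TOKEN>", {...})
#   with urllib.request.urlopen(req) as resp:
#       run = json.load(resp)["data"]
```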

⏰ Scheduling & Webhooks

  • Schedule: Use the Apify Console scheduler to run periodically (hourly/daily). Select the Actor → Schedule → set cadence.
  • Webhooks: Configure a webhook in the run settings to receive a POST to your endpoint when runs finish. Typical payload includes run.id and output links.
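
On the receiving side, a webhook handler usually only needs a few fields from the POST body. The sketch below assumes Apify's default payload template, in which the run object arrives under resource and the event name under eventType; a custom payload template would need different keys. summarize_webhook is a hypothetical helper name.

```python
def summarize_webhook(payload: dict) -> dict:
    """Extract the run ID, status, and storage IDs from a webhook payload.

    Assumes the default payload template: the run object is under
    "resource" and "eventData" carries actorRunId as a fallback.
    """
    resource = payload.get("resource") or {}
    event_data = payload.get("eventData") or {}
    return {
        "event": payload.get("eventType"),
        "run_id": resource.get("id") or event_data.get("actorRunId"),
        "status": resource.get("status"),
        "dataset_id": resource.get("defaultDatasetId"),
        "kv_store_id": resource.get("defaultKeyValueStoreId"),
    }
```

The returned dataset_id and kv_store_id are what a downstream job would use to fetch the records and images for that run.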

🕾️ Logs & Troubleshooting

Where to look:

  • Open the Actor run → Logs panel in Console. Logs show step-by-step progress and warnings.

Common issues & quick fixes:

  • No startUrls provided — ensure startUrls is present in input.
  • TLS/SSL errors — toggle verify_tls to false only if you trust the network; prefer adding proper CA bundles.
  • No images found — check the URL is a public post/page (private content cannot be accessed).
  • KV writes failing — ensure Actor has storage permissions and you're not exceeding storage quotas.

🔒 Permissions & Storage Notes

  • Storage used: Dataset and Key-Value store. Files may also be saved in the run output/ folder when preserve_downloads is enabled.
  • Privacy: This Actor only reads publicly available pages. Do not use it to access private content or perform automated actions that violate the target site's terms.
  • Data retention: Manage storage retention in Console to avoid unexpected costs.

🔟 Changelog / Versioning

  • 1.0.0 — Initial production-ready release: image discovery, metadata extraction, KV storage, Dataset output.

Future changelog entries will be listed here in semantic-version format.


🖌 Notes / TODOs

  • TODO: confirm final Dataset vs KV output structure — inferred from code but not explicit.
  • TODO: add annotated demo GIF and Console screenshot for the top of this README (improves conversions).
  • TODO: consider advanced proxy rotation presets for large-scale scraping.

🌍 Proxy Configuration

This Actor performs network requests — configure proxies when running at scale or to avoid IP rate limits.

Enable Apify Proxy (Console):

  • In the Actor run configuration, toggle Use Apify Proxy.

Custom proxy examples (set as secrets or env vars):

HTTP_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
HTTPS_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"

Reminder: Store credentials in Console Secrets, never in plain input.
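
Proxy credentials that contain characters such as @ or : will break the URL unless they are percent-encoded. A minimal sketch for building a safe proxy URL follows; build_proxy_url is a hypothetical helper, and the host and port in the usage line are placeholders.

```python
from urllib.parse import quote


def build_proxy_url(user: str, password: str, host: str, port: int, scheme: str = "http") -> str:
    """Return a proxy URL with the username and password percent-encoded."""
    # safe="" also encodes "/", "@" and ":" so they cannot be mistaken for URL delimiters.
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"
```

For example, build_proxy_url("user", "p@ss:word", "proxy.example.com", 8000) returns "http://user:p%40ss%3Aword@proxy.example.com:8000", which can then be stored as the HTTP_PROXY or HTTPS_PROXY secret.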

TODO: Consider proxy rotation for large-scale scraping.


📚 References (official Apify docs)

  • How to create an Actor README — Apify Academy.
  • Actor input schema & validation — Apify Platform docs.
  • Apify CLI & API reference — Apify docs.

(Links and exact docs are provided in the CONFIG.md for Console-friendly viewing.)


🤔 Behavior inferred from main.py

  • The Actor performs network fetches of public pages and extracts images and metadata.
  • Primary input is startUrls (array or single URL). maxConcurrency and preserve_downloads are supported.
  • Output includes per-image metadata records and binary image files stored in KV.
  • The Actor attempts to infer image dimensions and stores kv_key references for images.

✅ Why use this Actor (short benefits)

  • Plug-and-play: run in Console with no code changes.
  • Fast: parallel processing with configurable concurrency.
  • Reliable: stores both binary images and structured metadata for easy integration.
  • Privacy-safe: reads only public pages and recommends secure secrets management.

Run this Actor on Apify Console — get instant image & metadata exports in seconds.
