NASA APOD Scraper

Pricing

Pay per event

Fetch NASA's Astronomy Picture of the Day for any date or date range via the NASA APOD API — title, explanation, image / video URL, HD URL, copyright, media type, date — export to JSON or CSV. A NASA APOD API wrapper; no key needed for small volumes.

Developer

DevilScrapes

Maintained by Community

🎯 What this scrapes

NASA's Astronomy Picture of the Day (apod.nasa.gov) has been publishing one space image or video per day since 1995 — over 10,000 beautifully captioned entries spanning galaxies, nebulae, rocket launches, and solar phenomena. This Actor wraps the official NASA APOD API (api.nasa.gov/planetary/apod), supports single-date, date-range, and random-pick modes, and delivers one clean structured row per entry.

The archive is free to access but bulk historical exports still need throttling, retry logic, and clean schema handling across 30 years of format variation (video days, absent copyright fields, HD URL gaps). We absorb all of that so you get consistent rows every time.

Target use cases span the full spectrum: indie developers shipping a daily space photo widget, researchers building a multimodal RAG dataset with captions, educators building astronomy lesson apps, and data artists exporting the NASA images dataset for print-on-demand posters.

🔥 Features

  • 🛡️ Browser fingerprint rotation: curl-cffi impersonates real Chrome / Firefox / Safari TLS handshakes so the target sees a browser, not a Python script.
  • 🌐 Residential proxy rotation via Apify Proxy — fresh session and exit IP whenever the target pushes back.
  • 🔁 Retries with exponential backoff on 408 / 429 / 5xx — up to 5 attempts per request, Retry-After honoured.
  • 🧱 Rate-limit-aware pacing — when the API throttles us, we slow down and keep going rather than crashing out.
  • 🧊 Clean, typed dataset rows — Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
  • 💰 Pay-Per-Event pricing — you only pay for results that land in your dataset. No data, no charge.
  • 🎞️ Video-day handling: the media_type field distinguishes images from YouTube / Vimeo embeds; thumbnail_url is included when NASA provides it.
  • 📅 Full archive support — date range mode lets you pull the entire astronomy image archive in one run; earliest entry is 1995-06-16.
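
The retry behaviour in the list above (408 / 429 / 5xx, up to 5 attempts, Retry-After honoured) can be sketched as a small delay calculator. This is a minimal illustration, not the Actor's actual code: the function names, the 1-second base delay, the 60-second cap, and the jitter range are all assumptions.

```python
import random
from typing import Optional

# Status codes the feature list names as retryable: 408, 429 and any 5xx.
RETRYABLE_STATUSES = {408, 429} | set(range(500, 600))
MAX_ATTEMPTS = 5  # "up to 5 attempts per request"


def is_retryable(status: int) -> bool:
    """Return True when a response status is worth retrying."""
    return status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based).

    A server-supplied Retry-After value takes priority. Otherwise the
    ceiling doubles with each attempt, is capped, and the actual delay is
    drawn at random from the upper half of that ceiling so that parallel
    clients do not retry in lockstep. `base` and `cap` are assumed values.
    """
    if retry_after is not None:
        return max(0.0, retry_after)
    ceiling = min(cap, base * 2 ** (attempt - 1))
    return random.uniform(ceiling / 2, ceiling)
```

A caller would loop at most MAX_ATTEMPTS times, sleep for backoff_delay(attempt, retry_after) after each retryable response, and give up on any other error status.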

💡 Use cases

  • Multimodal RAG dataset — bulk-export the APOD archive as image-caption pairs to build or evaluate a vision-LLM retrieval pipeline.
  • Daily space photo widget — run on a schedule, pull today's entry, push to your iOS / Android / desktop wallpaper service or Telegram bot.
  • Astronomy lesson app — backfill 10+ years of entries into a CMS, pair captions with AI-generated quiz questions.
  • Generative art seeds — feed APOD images and captions into a multimodal model or Stable Diffusion workflow.
  • Print-on-demand posters: harvest HD image URLs from the NASA images dataset and generate high-resolution wall art.
  • Newsletter automation — scheduled daily pull delivers title, explanation, and image URL to your email or Slack.

⚙️ How to use it

  1. Click Try for free at the top of the page.
  2. Choose a mode: Today, Single date, Date range, or Random N.
  3. Optionally supply your own NASA API key (free at api.nasa.gov) to lift the shared-key rate cap.
  4. Click Start. Output streams into the run's dataset in real time.
  5. Export from Storage → Dataset as JSON, CSV, or Excel — or pull via the Apify REST API.

For bulk astronomy image archive exports, use Date range mode with startDate and endDate. The run will page through the range, retry on throttles, and deliver one row per day.
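
For the REST route mentioned in step 5, a run can also be started over HTTP. The sketch below builds a request against Apify's run-sync-get-dataset-items endpoint using only the standard library. The Actor ID and token are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that runs an Actor and returns its dataset items."""
    url = f"{APIFY_BASE}/acts/{actor_id}/run-sync-get-dataset-items?format=json"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_rows(request: urllib.request.Request) -> list:
    """Send the request and decode the JSON array of dataset rows."""
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


# Placeholder Actor ID and token: replace both with real values before sending.
request = build_run_request(
    "YOUR_USERNAME~YOUR_ACTOR_NAME",
    "YOUR_APIFY_TOKEN",
    {"mode": "range", "startDate": "2024-01-01", "endDate": "2024-01-31"},
)
```

Calling fetch_rows(request) sends the request. The synchronous endpoint suits short runs only, since Apify limits how long it waits for a run to finish. For multi-year ranges, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.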

📥 Input

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| mode | string | no | today | today, single_date, range, or random. |
| date | string | no | | ISO date (YYYY-MM-DD). Used when mode=single_date. Earliest: 1995-06-16. |
| startDate | string | no | | ISO date (YYYY-MM-DD). Range start, inclusive. |
| endDate | string | no | | ISO date (YYYY-MM-DD). Range end, inclusive. |
| count | integer | no | 5 | How many random APODs. Used when mode=random. |
| apiKey | string | no | | Your personal NASA API key. Without one, DEMO_KEY is used (30 req/hour, 50 req/day). |
| thumbsForVideos | boolean | no | true | Adds thumbnail_url for video APODs (YouTube / Vimeo). |
| proxyConfiguration | object | no | {"useApifyProxy": false} | Proxy settings. Increase resilience on long bulk runs by enabling Apify Proxy. |

Example input

{
  "mode": "range",
  "startDate": "2024-01-01",
  "endDate": "2024-01-31",
  "thumbsForVideos": true,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
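
Before starting a run it can help to check an input locally and estimate how many rows it should produce. The following is a client-side sketch based only on the input table above. It is not the Actor's own validation, and the function name is hypothetical.

```python
from datetime import date

EARLIEST = date(1995, 6, 16)  # earliest APOD entry per the input table


def expected_rows(run_input: dict) -> int:
    """Validate a run input and return the number of rows to expect.

    For range mode this is an upper bound: one row per calendar day,
    with both endpoints included.
    """
    mode = run_input.get("mode", "today")
    if mode == "today":
        return 1
    if mode == "single_date":
        day = date.fromisoformat(run_input["date"])
        if day < EARLIEST:
            raise ValueError(f"date {day} is before {EARLIEST}")
        return 1
    if mode == "random":
        count = run_input.get("count", 5)
        if count < 1:
            raise ValueError("count must be at least 1")
        return count
    if mode == "range":
        start = date.fromisoformat(run_input["startDate"])
        end = date.fromisoformat(run_input["endDate"])
        if start < EARLIEST:
            raise ValueError(f"startDate {start} is before {EARLIEST}")
        if end < start:
            raise ValueError("endDate must not be earlier than startDate")
        return (end - start).days + 1
    raise ValueError(f"unknown mode: {mode!r}")
```

For the example input above, expected_rows returns 31, one row for each day of January 2024.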

📤 Output

One dataset row per APOD entry.

| Field | Type | Notes |
| --- | --- | --- |
| date | string | APOD date (YYYY-MM-DD). |
| title | string | Title of the APOD entry. |
| explanation | string | Full curator-written description. |
| url | string | Primary image or video URL. |
| hdurl | string or null | HD image URL when available. |
| thumbnail_url | string or null | Video thumbnail when NASA provides it. |
| media_type | string | image or video. |
| copyright | string or null | Credit / copyright string. Absent for public-domain NASA imagery. |
| service_version | string or null | NASA API service version tag. |
| apod_url | string | Canonical APOD page URL on apod.nasa.gov. |
| scraped_at | string | ISO-8601 timestamp of when this row was recorded. |

Example output

{
  "date": "2026-05-15",
  "title": "Spiral Galaxy NGC 1232",
  "explanation": "One of the most photogenic galaxies in the southern sky...",
  "media_type": "image",
  "url": "https://apod.nasa.gov/apod/image/2605/NGC1232.jpg",
  "hdurl": "https://apod.nasa.gov/apod/image/2605/NGC1232_full.jpg",
  "copyright": "ESO; J. Spyromilio",
  "apod_url": "https://apod.nasa.gov/apod/ap260515.html",
  "scraped_at": "2026-05-15T09:12:34Z"
}
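
To show how a raw API response could map onto this row shape, here is a hedged sketch of a normaliser. It assumes the raw keys match the NASA API's field names and that apod_url follows the apYYMMDD.html pattern seen in the example above. The function names are hypothetical and the Actor's real implementation (which uses Pydantic) may differ.

```python
from datetime import date, datetime, timezone
from typing import Optional


def apod_page_url(iso_date: str) -> str:
    """Derive the apod.nasa.gov page URL from a YYYY-MM-DD date."""
    day = date.fromisoformat(iso_date)
    return f"https://apod.nasa.gov/apod/ap{day:%y%m%d}.html"


def normalise(raw: dict, now: Optional[datetime] = None) -> dict:
    """Map a raw APOD response onto the output row, filling gaps with None."""
    now = now or datetime.now(timezone.utc)
    credit = raw.get("copyright")
    return {
        "date": raw["date"],
        "title": raw["title"].strip(),
        "explanation": raw["explanation"].strip(),
        "url": raw["url"],
        "hdurl": raw.get("hdurl"),                  # missing on some entries
        "thumbnail_url": raw.get("thumbnail_url"),  # video entries only
        "media_type": raw["media_type"],
        "copyright": credit.strip() if credit else None,
        "service_version": raw.get("service_version"),
        "apod_url": apod_page_url(raw["date"]),
        "scraped_at": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Optional fields are read with dict.get so that a missing hdurl, thumbnail_url, or copyright yields null in the export instead of raising an error.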

💰 Pricing

Pay-Per-Event — you only pay when these events fire:

| Event | USD | What it means |
| --- | --- | --- |
| actor-start | $0.005 | One-off warm-up charge per run |
| result | $0.0015 | Per dataset item delivered |

1,000 results ≈ $1.50. No subscription, no minimum. New Apify accounts start with $5 of free credit — enough to pull several years of APOD entries before spending a cent.
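
The arithmetic behind these figures can be checked with a short calculator. This is a sketch using the two event prices from the table. The function names are hypothetical, and Decimal is used to avoid floating-point rounding in the totals.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.005")   # charged once per run
PER_RESULT = Decimal("0.0015")   # charged per dataset item


def run_cost(results: int, runs: int = 1) -> Decimal:
    """Total USD cost for `results` items delivered across `runs` runs."""
    return ACTOR_START * runs + PER_RESULT * results


def results_within(budget: Decimal) -> int:
    """Largest number of results a single run can deliver within `budget`."""
    if budget < ACTOR_START:
        return 0
    return int((budget - ACTOR_START) // PER_RESULT)
```

With these prices, run_cost(1000) comes to $1.505 including the start charge, and results_within(Decimal("5")) gives 3,330 rows, which is roughly nine years of daily entries.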

🚧 Limitations

  • The APOD archive contains one entry per day since 1995-06-16, giving roughly 11,000 total entries. It is not a million-image dataset, so treat it as a curated multimodal-RAG corpus or fine-tuning seed rather than a large-scale training set.
  • We surface image and HD URLs as-is from NASA's API. We do not proxy or cache images ourselves; you are responsible for downloading the files from the hosts those URLs point to.
  • thumbnail_url is available only for video entries where NASA includes it; not all video APODs carry a thumbnail.
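  • A small share of APOD entries feature imagery by external photographers who retain their own copyright. The copyright field is included on every row and is null when NASA's API supplies no credit (typically public-domain NASA imagery). Downstream use must respect any credit it contains.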

❓ FAQ

Do I need a NASA API key?

Not to get started. The DEMO_KEY lets you fetch a handful of entries to verify the data shape, within its limit of 30 requests per hour and 50 per day. For bulk astronomy image archive exports or scheduled daily runs, register a free personal key at api.nasa.gov. NASA lists a default limit of 1,000 requests per hour for personal keys, and a personal key avoids sharing quota with other users of the demo key.

What is the astronomy picture of the day API and how does this Actor wrap it?

NASA publishes a JSON endpoint at api.nasa.gov/planetary/apod — the official astronomy picture of the day API. This Actor calls that endpoint, handles pagination for date ranges, manages rate-limit retries, normalises the response schema across 30 years of format variation, and writes Pydantic-validated rows to an Apify dataset. You get a clean, queryable table without touching the raw API.

Can I use this as a multimodal RAG dataset?

Yes. Each row contains the title, the full curator-written explanation (your caption), and both a standard-resolution and HD image URL. That is a complete image-caption pair. Export as JSON, upload to HuggingFace, and you have an astronomy domain corpus ready for semantic search or vision-LLM evaluation.
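
As an illustration of the image-caption pairing described above, the sketch below turns exported rows into pair records. The output key names (id, image_url, title, caption, credit) are assumptions chosen for this example, not a format the Actor emits.

```python
def to_caption_pairs(rows: list) -> list:
    """Build image-caption records from dataset rows, skipping video entries."""
    pairs = []
    for row in rows:
        if row.get("media_type") != "image":
            continue  # videos have no image file to pair with a caption
        pairs.append({
            "id": row["date"],
            # Prefer the HD image and fall back to the standard URL.
            "image_url": row.get("hdurl") or row["url"],
            "title": row["title"],
            "caption": row["explanation"],
            "credit": row.get("copyright"),
        })
    return pairs
```

Keeping the credit alongside each pair makes it easier to respect third-party copyright when the corpus is shared.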

Why do some rows have media_type: video?

Roughly 10–15% of APOD entries are videos (rocket launches, solar animations, telescope time-lapses). The url field points to YouTube or Vimeo; thumbnail_url contains the video thumbnail when NASA provides it. Filter media_type == "image" if you need images-only.

What is the earliest APOD entry?

1995-06-16. Total archive size as of mid-2026 is approximately 11,000 entries.

Can I bulk-download the actual image files?

We provide the URL — pair this Actor with a generic file-download Actor or a curl loop to retrieve the image bytes. We intentionally do not bundle image download to keep the pricing predictable.
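
A download loop along those lines could look like the sketch below, which uses only the Python standard library. The helper names, the output directory, and the file-naming scheme are assumptions. Add pacing between requests for large batches so NASA's servers are not hit too quickly.

```python
import posixpath
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def filename_for(row: dict) -> str:
    """Name a file after the row's date plus the basename of its image URL."""
    source = row.get("hdurl") or row["url"]
    name = posixpath.basename(urlparse(source).path)
    return f'{row["date"]}_{name}'


def download_images(rows: list, out_dir: str = "apod_images") -> list:
    """Download image entries into out_dir, skipping videos and existing files."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for row in rows:
        if row.get("media_type") != "image":
            continue
        path = target / filename_for(row)
        if not path.exists():
            urllib.request.urlretrieve(row.get("hdurl") or row["url"], path)
        saved.append(path)
    return saved
```

Skipping files that already exist lets an interrupted batch resume without downloading everything again.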

💬 Your feedback

Spotted a bug, hit a rate-limit edge case, or need a new field? Open an issue on the Actor's Issues tab in Apify Console — we ship fixes weekly and read every report.