SWAPI Star Wars Scraper

Pricing: from $6.00 / 1,000 result items

Extract Star Wars characters, planets, starships, vehicles, species and films from the SWAPI public API. Flat JSON output with optional reference resolution.

Developer: ParseForge · Maintained by Community

🪐 SWAPI Star Wars Scraper

🚀 Export the entire Star Wars universe in seconds. Pull every character, planet, starship, vehicle, species, and film from SWAPI as clean JSON, CSV, Excel, or XML. No API key, no rate-limit math, no manual paging.

🕒 Last updated: 2026-05-26 · 📊 Up to 26 fields per record · 260 canonical records across 6 categories (people, planets, starships, vehicles, species, films) · Free public data

The SWAPI Star Wars Scraper turns the Star Wars REST API into a one-click dataset export. Pick the categories you want, hit Run, and download a flat, well-named table covering 82 characters, 60 planets, 36 starships, 39 vehicles, 37 species, and the 6 saga films. Every record carries the original SWAPI identifiers, all canonical attributes (height, mass, climate, hyperdrive rating, episode number, opening crawl, the works), and the cross-reference URLs that tie the universe together. You skip building pagination loops, retry logic, and TLS workarounds, and you get the dataset in the format your downstream tool actually consumes.

The dataset is ideal for fan apps, tutorials, conference demos, and AI-assistant training corpora. Every column maps 1:1 to a documented SWAPI field, so anything you build is portable and reproducible. The actor handles the upstream's expired-certificate quirk on your behalf, supports optional URL-to-name resolution so you do not have to chase reference links yourself, and works equally well as a one-shot pull or a scheduled re-snapshot. Whether you are building a Star Wars Trivia API, generating a teaching dataset for a Python workshop, or seeding a vector database for a fandom chatbot, this Actor gives you the canonical answer with a single run.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Educators teaching REST, JSON, and pagination | Build classroom datasets for data-analysis homework |
| Indie devs prototyping fan apps and bots | Seed a Star Wars Trivia API, Discord bot, or quiz app |
| AI/ML engineers building demo corpora | Fine-tune small models or RAG demos on a famous public dataset |
| Data analysts and visualization tinkerers | Produce timelines, heatmaps, and character network graphs |
| Workshop organizers and tutorial authors | Ship a reproducible example dataset alongside your blog post |
| Researchers on pop-culture and narrative data | Quantitative studies on franchise expansion, character demographics, ship classes |

📋 What the SWAPI Star Wars Scraper does

  • 🪐 Six categories, one Run. Toggle any combination of people, planets, starships, vehicles, species, and films and the Actor handles per-endpoint pagination for you.
  • 🔍 Every canonical field. Heights, masses, hyperdrive ratings, orbital periods, opening crawls, episode IDs, residents, pilots, characters - all of it, never trimmed.
  • 🔗 Optional reference resolution. Flip a checkbox and every URL reference (homeworld, films, residents, pilots) is enriched with the linked record's name, ready to drop into a UI.
  • 🧱 Stable, flat schema. Records share a common header (category, uid, name, url, summary) so downstream joins, filters, and group-bys work out of the box.
  • 🛰 TLS-quirk-proof. Transparently handles the upstream's expired certificate so your runs never break on a TLS error you cannot fix yourself.
  • ⚡ Quick exports. Most full-universe pulls finish in under a minute, well below typical timeout budgets in classroom and CI contexts.

Each record is a single, deeply attributed row: a person carries height, mass, hair color, skin color, eye color, birth year, gender, homeworld, and the films, species, vehicles, and starships they appear in. A planet carries rotation period, orbital period, diameter, climate, gravity, terrain, surface water, population, residents, and films. A film carries episode_id, opening crawl, director, producer, release date, and the full cast and ship roster. Every record also has category, uid, summary, scrapedAt, and error columns so you can sort, group, and audit without a second pass.

💡 Why it matters: SWAPI is the canonical free Star Wars dataset, but the upstream API serves an expired TLS certificate, offers no built-in CSV/Excel export, and paginates each endpoint separately. This Actor turns those rough edges into a single Run that ends with a clean spreadsheet, JSON file, or piped API response your app or notebook can consume directly.
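
For readers curious what "per-endpoint pagination" involves, here is a minimal sketch of the loop you would otherwise write by hand. It assumes the SWAPI list response shape (a `results` array plus a `next` URL that is null on the last page). The function name `iter_swapi_records`, the `fetch_json` callable, and the fake pages are illustrative only and are not part of this Actor; the fake pages are made-up stand-ins so the sketch runs offline.

```python
from typing import Callable, Iterator


def iter_swapi_records(first_url: str, fetch_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every record from a SWAPI-style paginated listing."""
    url = first_url
    while url:
        page = fetch_json(url)              # one page: {"count", "next", "results", ...}
        yield from page.get("results", [])
        url = page.get("next")              # None on the last page ends the loop


# Made-up pages standing in for an HTTP client, so no network access is needed.
FAKE_PAGES = {
    "https://swapi.dev/api/films/": {
        "count": 3,
        "next": "https://swapi.dev/api/films/?page=2",
        "results": [{"title": "A New Hope"}, {"title": "The Empire Strikes Back"}],
    },
    "https://swapi.dev/api/films/?page=2": {
        "count": 3,
        "next": None,
        "results": [{"title": "Return of the Jedi"}],
    },
}

titles = [
    record["title"]
    for record in iter_swapi_records("https://swapi.dev/api/films/", FAKE_PAGES.__getitem__)
]
print(titles)
```

In a real client you would pass a function that performs the HTTP GET and returns the decoded JSON, and add retries around it.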


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to configure the Actor, pull all six categories, and pipe the dataset into a Google Sheet, a Python notebook, and a chatbot in one sitting.


⚙️ Input

| Field | Type | Required | Description |
|---|---|---|---|
| categories | array of enum | Yes | One or more of people, planets, starships, vehicles, species, films. Default is people. |
| maxItems | integer | No | Soft cap on total records returned. Free users are limited to 10. Default is 10. |
| resolveReferences | boolean | No | When true, every URL reference (homeworld, films, residents, pilots, characters) is enriched with the linked record's name. Adds extra requests. Default is false. |

Example: pull the first 10 characters as a clean table.

{
  "categories": ["people"],
  "maxItems": 10
}

Example: full multi-category snapshot with reference names resolved.

{
  "categories": ["people", "planets", "starships", "vehicles", "species", "films"],
  "maxItems": 250,
  "resolveReferences": true
}

⚠️ Good to Know: Records are split evenly across the categories you pick. If you pick three categories with maxItems=30, you get up to 10 of each. Pick a single category to focus a run on, say, all 60 planets.
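
To plan a run, the even split can be estimated with a few lines of arithmetic. This is a rough sketch, not the Actor's implementation: it assumes plain floor division with any remainder left unused, and it caps each share at the category sizes quoted earlier in this README. The helper name `per_category_caps` is hypothetical.

```python
# Category sizes as quoted earlier in this README.
CATEGORY_SIZES = {
    "people": 82, "planets": 60, "starships": 36,
    "vehicles": 39, "species": 37, "films": 6,
}


def per_category_caps(max_items: int, categories: list) -> dict:
    """Estimate how many records each selected category can contribute."""
    if not categories:
        raise ValueError("pick at least one category")
    share = max_items // len(categories)    # assumption: floor division, remainder unused
    return {name: min(share, CATEGORY_SIZES[name]) for name in categories}


print(per_category_caps(30, ["people", "planets", "starships"]))
print(per_category_caps(60, ["planets"]))
```

Under these assumptions, a category smaller than its share (films has only 6 records) simply returns everything it has.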


📊 Output

Each record is a flat JSON object with category-specific fields plus the common header. Fields use the original SWAPI names so anything you write against SWAPI keeps working.

🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🗂 category | string | "people" |
| 🆔 uid | string | "1" |
| 📌 name | string | "Luke Skywalker" |
| 🔗 url | string | "https://swapi.dev/api/people/1/" |
| 📝 summary | string | "19BBY · male · 172cm" |
| 📐 height | string | "172" |
| ⚖️ mass | string | "77" |
| 👁 eye_color | string | "blue" |
| 🎂 birth_year | string | "19BBY" |
| 🚻 gender | string | "male" |
| 🌍 homeworld | string or object | "https://swapi.dev/api/planets/1/" |
| 🎞 films | array | ["https://swapi.dev/api/films/1/", ...] |
| 🌡 climate | string (planets) | "arid" |
| 🗺 terrain | string (planets) | "desert" |
| 👥 population | string (planets) | "200000" |
| 🛠 model | string (starships) | "T-65 X-wing" |
| 🚀 hyperdrive_rating | string (starships) | "1.0" |
| 🎬 episode_id | integer (films) | 4 |
| 🎭 director | string (films) | "George Lucas" |
| 📅 release_date | string (films) | "1977-05-25" |
| 📜 opening_crawl | string (films) | "It is a period of civil war..." |
| 🧑‍🤝‍🧑 characters | array (films) | ["https://swapi.dev/api/people/1/", ...] |
| 🏷 created | string | "2014-12-09T13:50:51.644000Z" |
| ✏️ edited | string | "2014-12-20T21:17:56.891000Z" |
| 🕒 scrapedAt | string | "2026-05-26T20:39:31.371Z" |
| ❌ error | string or null | null |
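
Because numeric attributes such as height, mass, and population arrive as strings, you will usually convert them before sorting or charting. The sketch below is one possible approach, assuming that non-numeric placeholders (SWAPI uses values such as "unknown" and "n/a") should become `None` and that thousands separators should be stripped. The helper name `to_number` and the second sample record are hypothetical.

```python
from typing import Optional


def to_number(value) -> Optional[float]:
    """Convert a SWAPI numeric string to float, or None if it is not numeric."""
    if not isinstance(value, str):
        return None
    cleaned = value.replace(",", "").strip()   # "1,358" -> "1358"
    try:
        return float(cleaned)
    except ValueError:                         # "unknown", "n/a", empty string
        return None


records = [
    {"name": "Luke Skywalker", "height": "172", "mass": "77"},
    {"name": "Example Droid", "height": "unknown", "mass": "1,358"},  # hypothetical record
]

for record in records:
    print(record["name"], to_number(record["height"]), to_number(record["mass"]))
```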

📦 Sample records


✨ Why choose this Actor

| | Capability |
|---|---|
| 🪐 | Full universe coverage. Six categories, every canonical field, zero invented columns. |
| 🔗 | Cross-references intact. Films, residents, pilots, and homeworld URLs preserved so you can join records yourself. |
| 🧭 | Optional name resolution. Flip a switch and every reference URL is enriched with the linked record's name. |
| 🛰 | TLS quirk handled. SWAPI's certificate expired - this Actor papers over it so your scheduled runs never break. |
| 📦 | Multi-format export. JSON, CSV, Excel, XML, JSON Lines via the standard Apify download UI. |
| 🔁 | Reproducible snapshots. Every record carries created, edited, and scrapedAt timestamps for versioned analysis. |
| 🧪 | Stable schema. Field names mirror SWAPI 1:1 so anything you wrote against the upstream keeps working. |
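
If you keep resolveReferences off and prefer to join records yourself, the preserved URLs make that a dictionary lookup. The following is a minimal sketch, assuming the dataset contains both people and planets and that `homeworld` is still a plain URL string. The helper names and the `homeworld_name` column are illustrative and are not fields the Actor emits.

```python
def build_name_index(records):
    """Map each record's url to its name for quick lookups."""
    return {r["url"]: r["name"] for r in records if r.get("url") and r.get("name")}


def with_homeworld_name(person, index):
    """Return a copy of a person record with a hypothetical homeworld_name column."""
    enriched = dict(person)
    enriched["homeworld_name"] = index.get(person.get("homeworld"))  # None if not in the dataset
    return enriched


dataset = [
    {"category": "people", "name": "Luke Skywalker",
     "url": "https://swapi.dev/api/people/1/",
     "homeworld": "https://swapi.dev/api/planets/1/"},
    {"category": "planets", "name": "Tatooine",
     "url": "https://swapi.dev/api/planets/1/", "climate": "arid"},
]

index = build_name_index(dataset)
luke = with_homeworld_name(dataset[0], index)
print(luke["name"], "is from", luke["homeworld_name"])
```

The same index works for list-valued references such as films or residents: map each URL in the list through `index.get`.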

📊 A full-universe pull (all 260 records across all 6 categories) typically completes in under 30 seconds.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| Calling SWAPI by hand from your app | Free | Full, if you write paging | Realtime | None | Pagination loops, retries, TLS workaround |
| Static dumps in public GitHub repos | Free | Snapshot only | Stale | None | Clone repo, parse files |
| Aggregator REST mirrors | Free or paid | Mirror-dependent | Varies | Limited | Lock-in to a third-party mirror |
| ⭐ SWAPI Star Wars Scraper (this Actor) | Pay-per-result | All 6 categories, all fields | On demand | Per-category and maxItems | One run, pick categories, download |

The trade-off is consistent: you can hand-roll the same pull yourself, but you spend the afternoon on plumbing and another afternoon when the TLS cert expires again.


🚀 How to use

  1. 🆕 Sign up. Create a free Apify account at console.apify.com/sign-up. No credit card required for the free tier.
  2. 🛒 Open the Actor. Visit the SWAPI Star Wars Scraper on Apify Store and click Try for free.
  3. 🪐 Pick categories. Tick people, planets, starships, vehicles, species, or films. Pick one for a focused pull or all six for a full snapshot.
  4. ▶️ Run. Hit Start. Most runs finish in under a minute even with resolveReferences enabled.
  5. 📥 Download. Export the dataset as JSON, CSV, Excel, or XML, or call the Apify API to pipe it directly into your downstream system.

⏱️ Total time: about 2 minutes from sign-up to a downloadable Star Wars dataset on your machine.
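
For the API route in step 5, the Apify dataset items endpoint accepts a `format` query parameter. The sketch below only builds the download URL; the helper name `dataset_export_url` and the dataset ID `abc123` are placeholders, and you should confirm the parameters against the API reference at docs.apify.com/api/v2 before relying on them.

```python
from urllib.parse import urlencode

# Formats this README mentions, using the API's format identifiers.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml"}


def dataset_export_url(dataset_id: str, fmt: str = "csv", token: str = "") -> str:
    """Build a download URL for a finished run's dataset."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"format": fmt}
    if token:                      # only needed when the dataset is not publicly readable
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


print(dataset_export_url("abc123", "csv"))   # "abc123" is a placeholder dataset ID
```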


💼 Business use cases

🧑‍🏫 Educators

  • Build classroom REST and JSON exercises against a familiar dataset
  • Provide reproducible homework datasets that do not change weekly
  • Demonstrate pagination, schema design, and cross-reference joins
  • Seed coding bootcamp projects with a fun, recognizable domain

🛠 Indie developers

  • Bootstrap a Star Wars Trivia API or Discord bot in an afternoon
  • Power fan-site filters (find all Wookiees from Kashyyyk in Episode 5)
  • Generate quiz questions for game jams and hackathon demos
  • Cache a local copy so your app survives upstream outages

🤖 AI and ML engineers

  • Seed RAG demos and fandom chatbots with a known, bounded corpus
  • Generate synthetic training prompts grounded in real entities
  • Build evaluation sets with verifiable canonical answers
  • Power small-model fine-tunes for entity-extraction demos

📊 Data analysts

  • Visualize the character relationship graph across films
  • Heatmap planet population vs climate vs film appearances
  • Build episode-by-episode cast and ship growth timelines
  • Demonstrate dashboard tooling on a recognizable dataset

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Empirical datasets for papers, thesis work, and coursework
  • Longitudinal studies tracking changes across snapshots
  • Reproducible research with cited, versioned data pulls
  • Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

  • Side projects, portfolio demos, and indie app launches
  • Data visualizations, dashboards, and infographics
  • Content research for bloggers, YouTubers, and podcasters
  • Hobbyist collections and personal trackers

🤝 Non-profit and civic

  • Transparency reporting and accountability projects
  • Advocacy campaigns backed by public-interest data
  • Community-run databases for local issues
  • Investigative journalism on public records

🧪 Experimentation

  • Prototype AI and machine-learning pipelines with real data
  • Validate product-market hypotheses before engineering spend
  • Train small domain-specific models on niche corpora
  • Test dashboard concepts with live input

🔌 Automating SWAPI Star Wars Scraper

Wire the Actor into your pipeline using any of the standard Apify integrations.

  • Node.js: use the official apify-client to start a run and stream items as they land in the dataset.
  • Python: the apify-client package on PyPI (source repository: apify-client-python) mirrors the Node.js client and slots into Jupyter, Airflow, or cron.
  • Docs: the full API reference is at docs.apify.com/api/v2.

Schedules: create a saved task with your preferred categories and maxItems, then attach an Apify Schedule to refresh the dataset daily, weekly, or whenever SWAPI's canonical numbers shift.
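
As a starting point for the Python route, here is a hedged sketch using the `apify-client` package. The value of `ACTOR_ID` is a placeholder (copy the real ID from the Actor's page in Apify Store), `APIFY_TOKEN` is an assumed environment variable name, and `build_run_input` is a hypothetical helper that mirrors the input fields documented above. The network call only runs when a token is present.

```python
import os

ACTOR_ID = "<username>/<actor-name>"   # placeholder: copy the real ID from Apify Store


def build_run_input(categories, max_items=10, resolve_references=False):
    """Assemble the Actor input using the field names from the Input section."""
    return {
        "categories": list(categories),
        "maxItems": max_items,
        "resolveReferences": resolve_references,
    }


def fetch_records(token, run_input):
    """Start a run, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient   # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__" and os.environ.get("APIFY_TOKEN"):
    records = fetch_records(os.environ["APIFY_TOKEN"], build_run_input(["planets"], 60))
    print(len(records), "records downloaded")
```

The same `build_run_input` dictionary can be pasted into a saved task, which is what a Schedule would then trigger.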


❓ Frequently Asked Questions


🔌 Integrate with any app

Connect the dataset to wherever you work without writing glue code.

  • Make.com - drop the Apify trigger into a scenario and feed Star Wars records into 1500+ apps.
  • Zapier - send each new record to Airtable, Notion, Slack, or anywhere Zapier reaches.
  • Google Sheets - sync the dataset directly into a sheet for ad-hoc analysis.
  • Slack - notify a channel when a scheduled refresh completes.
  • Airbyte - load Star Wars data into your warehouse alongside everything else.
  • Webhook - POST each new dataset item to your own service on completion.

💡 Pro Tip: browse the complete ParseForge collection for more free, well-documented public-data scrapers.


🆘 Need Help? Have a question, hit an edge case, or want a new field added? Open our contact form and we will get back to you within one business day.


⚠️ Disclaimer: This Actor is an independent tool, not affiliated with Lucasfilm, Disney, or SWAPI. Only publicly available data is collected. Star Wars and all related names are trademarks of their respective owners.