NASA Exoplanet Archive Scraper

Pricing

Pay per usage

Extract confirmed exoplanet data from NASA Exoplanet Archive TAP API. Filter by discovery method and year. Returns orbital period, mass, radius, equilibrium temp, and stellar data.

Developer

Compute Edge

Maintained by Community

Extract confirmed exoplanet data from the NASA Exoplanet Archive by querying its Table Access Protocol (TAP) API. This Actor retrieves discovery data, orbital and physical parameters, and system-level star and planet counts for more than 5,500 confirmed planets. No authentication is required.

What data can you scrape from NASA Exoplanet Archive?

| Field | Description |
| --- | --- |
| pl_name | Exoplanet name (e.g., "Kepler-1167 b") |
| hostname | Host star name (e.g., "Kepler-1167") |
| disc_year | Year of discovery (integer) |
| discoverymethod | Discovery method (Transit, Radial Velocity, Direct Imaging, etc.) |
| pl_orbper | Orbital period in days (float) |
| pl_bmasse | Planet mass in Earth masses (float) |
| pl_rade | Planet radius in Earth radii (float) |
| pl_eqt | Equilibrium temperature in Kelvin (integer) |
| sy_snum | Number of stars in the system (integer) |
| sy_pnum | Number of planets in the system (integer) |
| url | NASA Exoplanet Archive overview page URL |

Why use NASA Exoplanet Archive Scraper?

  • No authentication required — Direct access to NASA's public scientific data
  • Comprehensive dataset — 5,500+ confirmed exoplanets with standardized fields
  • Flexible filtering — Filter by discovery method, discovery year range, and result limit
  • Research-ready data — All fields follow NASA's standardized variable naming conventions
  • Fast API access — Direct fetch from TAP API with no browser overhead
  • Scientific accuracy — Data curated and maintained by NASA's Exoplanet Archive team
  • Perfect for analysis — Ideal input for exoplanet trend analysis, statistical research, or educational visualization

How to scrape NASA Exoplanet Archive data

  1. Go to the NASA Exoplanet Archive Scraper on Apify Store
  2. Click Try for free
  3. (Optional) Set Discovery Method (e.g., Transit, Radial Velocity, Direct Imaging) — leave empty to include all methods
  4. Set Start Year to filter planets discovered in this year or later (default: 1992, the year of the first confirmed exoplanet discovery)
  5. Set End Year to filter planets discovered in this year or earlier (default: 2026)
  6. Set Max Results to limit the number of exoplanets returned (default: 500, max: 10,000)
  7. Click Start and wait for the API to return results
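The same settings can be assembled in code before they are submitted as run input. The following is a minimal sketch, assuming the input keys shown in the input example below (`discoveryMethod`, `startYear`, `endYear`, `maxResults`) and the defaults and limits listed in the steps above. `build_run_input` is a hypothetical helper, not part of the Actor, and omitting `discoveryMethod` to mean "all methods" is an assumption.

```python
from typing import Optional

# Defaults and limits as stated in the steps above.
DEFAULT_START_YEAR = 1992
DEFAULT_END_YEAR = 2026
DEFAULT_MAX_RESULTS = 500
MAX_RESULTS_LIMIT = 10_000


def build_run_input(
    discovery_method: Optional[str] = None,
    start_year: int = DEFAULT_START_YEAR,
    end_year: int = DEFAULT_END_YEAR,
    max_results: int = DEFAULT_MAX_RESULTS,
) -> dict:
    """Build and sanity-check an input dict (hypothetical helper)."""
    if start_year > end_year:
        raise ValueError("startYear must not be later than endYear")
    if not 1 <= max_results <= MAX_RESULTS_LIMIT:
        raise ValueError(f"maxResults must be between 1 and {MAX_RESULTS_LIMIT}")

    run_input = {
        "startYear": start_year,
        "endYear": end_year,
        "maxResults": max_results,
    }
    # Assumption: leaving the key out is equivalent to an empty method filter.
    if discovery_method:
        run_input["discoveryMethod"] = discovery_method
    return run_input
```

The returned dict can be pasted into the Actor's JSON input editor or passed as the run input when starting the Actor through the Apify API.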

Input example

{
  "discoveryMethod": "Transit",
  "startYear": 2020,
  "endYear": 2023,
  "maxResults": 20
}

Expected: 20 exoplanets discovered via the Transit method between 2020 and 2023 (inclusive), sorted by discovery year (newest first).
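For readers curious how such a filter might look as a TAP request, here is a rough sketch that builds an ADQL query and a synchronous TAP URL. Several things here are assumptions rather than the Actor's confirmed behavior: the table name `pscomppars`, the use of `TOP` to cap results, and the helper names `build_adql` and `build_tap_url`. The Actor's actual query may differ.

```python
from typing import Optional
from urllib.parse import urlencode

# Synchronous TAP endpoint of the NASA Exoplanet Archive.
TAP_SYNC_URL = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"

# Columns documented in the field table of this README.
COLUMNS = [
    "pl_name", "hostname", "disc_year", "discoverymethod", "pl_orbper",
    "pl_bmasse", "pl_rade", "pl_eqt", "sy_snum", "sy_pnum",
]


def build_adql(
    discovery_method: Optional[str],
    start_year: int,
    end_year: int,
    max_results: int,
    table: str = "pscomppars",  # assumption: the table the Actor reads is not documented
) -> str:
    """Compose an ADQL query string (hypothetical helper)."""
    conditions = [f"disc_year BETWEEN {int(start_year)} AND {int(end_year)}"]
    if discovery_method:
        # Double any single quote so the value stays inside the string literal.
        escaped = discovery_method.replace("'", "''")
        conditions.append(f"discoverymethod = '{escaped}'")
    return (
        f"SELECT TOP {int(max_results)} {', '.join(COLUMNS)} "
        f"FROM {table} WHERE {' AND '.join(conditions)} "
        "ORDER BY disc_year DESC"
    )


def build_tap_url(adql: str) -> str:
    """Wrap an ADQL query in a TAP sync URL that requests JSON output."""
    return f"{TAP_SYNC_URL}?{urlencode({'query': adql, 'format': 'json'})}"
```

Calling `build_tap_url(build_adql("Transit", 2020, 2023, 20))` yields a URL that mirrors the input example above. Fetching it is left out so the sketch stays free of network calls.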

Output example

Each exoplanet returns a JSON object like this:

{
  "pl_name": "TOI-1695 b",
  "hostname": "TOI-1695",
  "disc_year": 2023,
  "discoverymethod": "Transit",
  "pl_orbper": 5.26,
  "pl_bmasse": 4.87,
  "pl_rade": 1.98,
  "pl_eqt": 874,
  "sy_snum": 1,
  "sy_pnum": 3,
  "url": "https://exoplanetarchive.ipac.caltech.edu/overview/TOI-1695%20b"
}

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.
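The `url` field in the output example appears to be the archive's overview path followed by the percent-encoded planet name. The sketch below reproduces that pattern. It is inferred from the single example above, and `overview_url` is a hypothetical helper name.

```python
from urllib.parse import quote

OVERVIEW_BASE = "https://exoplanetarchive.ipac.caltech.edu/overview/"


def overview_url(pl_name: str) -> str:
    """Percent-encode the planet name and append it to the overview path."""
    # safe="" also encodes "/" so the name stays a single path segment.
    return OVERVIEW_BASE + quote(pl_name, safe="")
```

This can be handy when rebuilding links for records exported to CSV or Excel without the `url` column.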

Data field reference

Planet Properties

  • pl_name: Exoplanet name (string)
  • hostname: Host star name (string)
  • disc_year: Discovery year (integer, 1992-present)
  • discoverymethod: How the planet was discovered (string, enum)
  • pl_orbper: Orbital period in days (float)
  • pl_bmasse: Planet mass in Earth masses (float)
  • pl_rade: Planet radius in Earth radii (float)
  • pl_eqt: Equilibrium temperature in Kelvin (integer)

System Properties

  • sy_snum: Number of stars in the system (integer)
  • sy_pnum: Number of planets in the system (integer)
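The field reference above maps naturally onto a typed record. The following is a minimal sketch of one way to load a dataset item into a dataclass. The `Exoplanet` class and `parse_exoplanet` function are hypothetical names, and treating unmeasured values as `None` is an assumption about how missing data arrives.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional


@dataclass(frozen=True)
class Exoplanet:
    pl_name: str
    hostname: str
    disc_year: int
    discoverymethod: str
    pl_orbper: Optional[float]  # days
    pl_bmasse: Optional[float]  # Earth masses
    pl_rade: Optional[float]    # Earth radii
    pl_eqt: Optional[int]       # Kelvin
    sy_snum: int
    sy_pnum: int
    url: Optional[str] = None


def _optional(value: Any, cast: Callable[[Any], Any]) -> Any:
    """Apply cast unless the value is missing (assumed to arrive as None)."""
    return None if value is None else cast(value)


def parse_exoplanet(item: Mapping[str, Any]) -> Exoplanet:
    """Convert one dataset item (a dict) into a typed record."""
    return Exoplanet(
        pl_name=str(item["pl_name"]),
        hostname=str(item["hostname"]),
        disc_year=int(item["disc_year"]),
        discoverymethod=str(item["discoverymethod"]),
        pl_orbper=_optional(item.get("pl_orbper"), float),
        pl_bmasse=_optional(item.get("pl_bmasse"), float),
        pl_rade=_optional(item.get("pl_rade"), float),
        pl_eqt=_optional(item.get("pl_eqt"), int),
        sy_snum=int(item["sy_snum"]),
        sy_pnum=int(item["sy_pnum"]),
        url=item.get("url"),
    )
```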

Discovery Methods Filter

Common discovery methods include:

  • Transit — Exoplanet transit (planet passes in front of star)
  • Radial Velocity — Doppler shift of host star
  • Direct Imaging — Direct observation of planet's light
  • Microlensing — Gravitational lensing effect
  • Timing Variations — Pulsar/eclipse timing anomalies
  • TTV — Transit Timing Variations
  • Other — Less common detection methods
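Once results are downloaded, a quick tally by method shows how a dataset is distributed across the categories above. A small sketch, assuming the items are dicts with a `discoverymethod` key as in the output example (`count_by_method` is a hypothetical helper):

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def count_by_method(items: Iterable[Mapping[str, Any]]) -> Counter:
    """Count dataset items per discovery method."""
    # Items without a method are grouped under "Unknown".
    return Counter(item.get("discoverymethod") or "Unknown" for item in items)
```

`count_by_method(items).most_common()` then lists the methods from most to least frequent.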

How much does it cost to scrape NASA Exoplanet Archive?

This Actor uses direct API fetch (no browser, no crawling), so compute costs are extremely low.

| Scenario | Results | Est. Compute | Est. Actor Fee |
| --- | --- | --- | --- |
| Quick sample (20 planets) | 20 | ~$0.001 | ~$0.02 |
| Medium query (500 planets) | 500 | ~$0.01 | ~$0.50 |
| Maximum extraction (10,000 planets) | 10,000 | ~$0.05 | ~$10.00 |

Actor pricing: $0.001 per result + minimal compute costs (usually under $0.01 per run). A typical extraction of 100 exoplanets costs under $0.15 total.
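The arithmetic behind these estimates can be sketched as follows, using the $0.001 per-result price stated above. The default compute figure is an assumption taken from the "usually under $0.01 per run" remark, and the function names are illustrative.

```python
PRICE_PER_RESULT_USD = 0.001  # per-result Actor fee stated above


def estimate_actor_fee(results: int) -> float:
    """Actor fee in USD for a given number of returned results."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return round(results * PRICE_PER_RESULT_USD, 3)


def estimate_total(results: int, compute_usd: float = 0.01) -> float:
    """Actor fee plus an assumed compute cost (default: $0.01 per run)."""
    return round(estimate_actor_fee(results) + compute_usd, 3)
```

For 100 results this gives a fee of $0.10 and a total of about $0.11, consistent with the "under $0.15" figure above.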

Tips for efficient scraping

  1. Use discovery method filter — Narrow your query to a specific detection method to reduce results and cost. For example, "Transit" returns ~3,500 planets; "Radial Velocity" returns ~900.
  2. Set realistic year ranges — Discovery rate varies by year. Between 2010-2015, ~1,000 planets were discovered. Recent years (2020+) show ~1,000+ per year.
  3. Batch multiple queries — If you need all exoplanets, run one Actor with discoveryMethod empty instead of separate runs per method.
  4. Schedule recurring updates — Set up a weekly or monthly schedule to track newly discovered exoplanets.
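With scheduled runs, newly listed planets can be found by comparing two snapshots. A minimal sketch, assuming each snapshot is a list of dataset items and that `pl_name` identifies a planet uniquely (`new_planets` is a hypothetical helper):

```python
from typing import Any, Dict, List, Sequence


def new_planets(
    previous: Sequence[Dict[str, Any]],
    current: Sequence[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return items in `current` whose pl_name is absent from `previous`."""
    known = {item["pl_name"] for item in previous}
    return [item for item in current if item["pl_name"] not in known]
```

Both snapshots should come from runs with the same filters. Otherwise planets outside the earlier run's filter would show up as new.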

FAQ & support

Q: How current is this data? A: The NASA Exoplanet Archive is updated as new discoveries are confirmed. This Actor queries the live API on every run, so results reflect the archive's contents at the time of the run.

Q: What if I need stellar properties (temperature, mass, radius)? A: The current Actor returns planet and system properties only. To get stellar data, you would need to query the archive's TAP service directly (its planetary systems tables include stellar columns such as st_teff, st_mass, and st_rad) and merge the results by hostname. Contact support if you need this feature.

Q: Can I filter by planet properties (e.g., mass > 10 Earth masses)? A: The current Actor supports year and method filtering. Custom SQL WHERE clauses could be added in a future version. Please open an issue on GitHub if this is important for your use case.
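Until such filters exist, the downloaded results can be filtered on the client side. A short sketch, assuming items are dicts shaped like the output example and that an unmeasured mass is `None` or absent (`filter_min_mass` is a hypothetical helper):

```python
from typing import Any, Dict, Iterable, List


def filter_min_mass(
    items: Iterable[Dict[str, Any]],
    min_mass_earth: float,
) -> List[Dict[str, Any]]:
    """Keep planets whose pl_bmasse exceeds the threshold (in Earth masses)."""
    return [
        item
        for item in items
        # Skip planets with no mass measurement instead of guessing.
        if item.get("pl_bmasse") is not None and item["pl_bmasse"] > min_mass_earth
    ]
```

Client-side filtering still pays the per-result fee for every item returned by the run, so narrowing by year or method first keeps costs down.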

Q: Is this legal to use? A: Yes. NASA's Exoplanet Archive data is public domain and explicitly designed for research, education, and analysis. No terms-of-service restrictions apply.

Q: Can I use this data commercially? A: Yes. NASA public data has no copyright restrictions. You may use, modify, and commercialize the data as needed.

Need a custom solution?

This Actor provides a flexible foundation for exoplanet research. If you need:

  • Custom TAP query syntax
  • Integration with other NASA datasets
  • Real-time webhooks on new discoveries
  • Scheduled extractions with versioning

Please contact support or open an issue on the Actor's GitHub repository.

Data attribution

Data sourced from the NASA Exoplanet Archive, a service of the NASA Exoplanet Science Institute at the California Institute of Technology, funded by NASA's Science Mission Directorate.

Citation: NASA Exoplanet Science Institute (2024). Exoplanet Archive. Retrieved from https://exoplanetarchive.ipac.caltech.edu

Getting started

To run the Actor locally with the Apify CLI, use the following command:

$ apify run

Deploy to Apify

Connect Git repository to Apify

If you've created a Git repository for the project, you can easily connect to Apify:

  1. Go to Actor creation page
  2. Click on Link Git Repository button

Push project on your local machine to Apify

You can also deploy the project on your local machine to Apify without the need for the Git repository.

  1. Log in to Apify. You will need to provide your Apify API Token to complete this action.

    $ apify login
  2. Deploy your Actor. This command will deploy and build the Actor on the Apify Platform. You can find your newly created Actor under Actors -> My Actors.

    $ apify push
