Pricing

from $0.00005 / actor start

Nasa Reports Scraper

Access NASA's Technical Reports Server (NTRS) with an automated scraper that collects scientific papers, conference proceedings, journal articles, and research reports. It provides structured metadata for researchers, scientists, and academics who need large-scale access to NASA's technical publications.

Developer

Akash Kumar Naik

Last modified: 2 months ago

NASA Technical Reports Scraper (NTRS)

Extract structured metadata from NASA Technical Reports Server (NTRS) search results, with optional citation API enrichment and optional PDF downloads.

What This Actor Does

This Actor searches ntrs.nasa.gov for a keyword query and stores the metadata of each matching report in the default dataset.

It is designed for research workflows where you need:

Repeatable NTRS collection for a query
Cleaned metadata fields (title, authors, dates, categories)
Optional PDF URL capture and optional PDF file download
Optional API enrichment for better field completeness

Why Use This Actor

Handles dynamic NTRS search pages with Playwright
Moves through pagination reliably (in-page "Next page" flow)
Deduplicates by citation ID during a run
Backfills missing fields from https://ntrs.nasa.gov/api/citations/{id} when enabled
Supports optional pay-per-event (PPE) charging attempts per scraped record
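The deduplication and backfill behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: `fetch` is a hypothetical callable that takes a citation API URL and returns a dict already mapped to the dataset field names (or `None`), and the helper names are assumptions.

```python
from typing import Callable, Iterable, Optional

CITATION_API = "https://ntrs.nasa.gov/api/citations/{id}"


def is_missing(value: object) -> bool:
    """Treat None, empty strings, and empty containers as missing."""
    return value is None or value == "" or value == [] or value == {}


def backfill(record: dict, api_record: dict) -> dict:
    """Return a copy of `record` with only its missing fields filled in."""
    merged = dict(record)
    for key, value in api_record.items():
        if is_missing(merged.get(key)) and not is_missing(value):
            merged[key] = value
    return merged


def dedupe_and_enrich(
    records: Iterable[dict],
    fetch: Optional[Callable[[str], Optional[dict]]] = None,
) -> list:
    """Skip repeated citation IDs; optionally backfill from the citation API.

    `fetch` is a hypothetical hook (assumption): it receives the API URL and
    returns a dict keyed by dataset field names, or None on failure.
    """
    seen = set()
    unique = []
    for record in records:
        citation_id = record.get("id")
        if citation_id in seen:
            continue  # duplicate within this run
        if citation_id:
            seen.add(citation_id)
            if fetch is not None:
                api_record = fetch(CITATION_API.format(id=citation_id)) or {}
                record = backfill(record, api_record)
        unique.append(record)
    return unique
```

Values that were already scraped are never overwritten in this sketch; the API response only fills gaps.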

Typical Use Cases

Literature review and bibliography building
Space-tech trend analysis across years/topics
Building downstream datasets for NLP or search indexing
Monitoring new reports for a recurring query

Input

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `searchQuery` | string | Yes | `"alien"` | Search phrase sent to NTRS |
| `maxResults` | integer | No | `100` | Maximum items to save. `0` means unlimited |
| `startPage` | integer | No | `1` | Start scraping from this search page |
| `downloadPDFs` | boolean | No | `true` | Download PDF binaries to key-value store when available |
| `proxyConfiguration` | object | No | `{ "useApifyProxy": false }` | Apify proxy settings |
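To check an input object before starting a run, something like the sketch below could apply the documented defaults and reject obviously invalid values. The function name and the exact validation rules are assumptions based on the field descriptions; the Actor's own input schema may behave differently.

```python
import copy

# Defaults for the optional fields, as documented for this Actor's input.
DEFAULTS = {
    "maxResults": 100,
    "startPage": 1,
    "downloadPDFs": True,
    "proxyConfiguration": {"useApifyProxy": False},
}


def normalize_input(raw: dict) -> dict:
    """Apply documented defaults and validate types (hypothetical helper)."""
    query = raw.get("searchQuery")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("searchQuery is required and must be a non-empty string")

    result = {**copy.deepcopy(DEFAULTS), **raw, "searchQuery": query.strip()}

    # maxResults may be 0 (unlimited); startPage is assumed to start at 1.
    for key, minimum in (("maxResults", 0), ("startPage", 1)):
        value = result[key]
        if isinstance(value, bool) or not isinstance(value, int) or value < minimum:
            raise ValueError(f"{key} must be an integer >= {minimum}")

    if not isinstance(result["downloadPDFs"], bool):
        raise ValueError("downloadPDFs must be a boolean")
    return result
```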

Example Input

Metadata-first run

{
  "searchQuery": "mars sample return",
  "maxResults": 50,
  "startPage": 1,
  "downloadPDFs": false
}

Include PDF downloads

{
  "searchQuery": "lunar habitat",
  "maxResults": 25,
  "downloadPDFs": true
}

Output

Dataset item

Each item includes fields such as:

title
url
id
documentId
abstract
documentType
authors
publicationDate
dateAcquired
subjectCategory
acquisitionSource
reportNumber
organization
distributionLimits
copyright
meetingInformation (when available)
pdfUrl (only if a PDF URL exists)
downloadedPdfPath (only if PDF was downloaded)

Example:

{
  "title": "Searching for Alien Life Having Unearthly Biochemistry",
  "url": "https://ntrs.nasa.gov/citations/20040015106",
  "id": "20040015106",
  "documentType": "Preprint (Draft being sent to journal)",
  "authors": "Jones, Harry\n(NASA Ames Research Center Moffett Field, CA, United States)",
  "publicationDate": "January 1, 2003",
  "organization": "NASA Ames Research Center",
  "pdfUrl": "https://ntrs.nasa.gov/api/citations/20040015106/downloads/20040015106.pdf?attachment=true"
}
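For downstream processing, the `publicationDate` and `authors` strings in the example above may need light parsing. The sketch below assumes dates follow the "Month D, YYYY" form shown and that each author name is followed by an affiliation line in parentheses; other records may use different layouts, so both helpers (hypothetical names) fall back gracefully.

```python
from datetime import datetime
from typing import Optional


def parse_publication_date(text: str) -> Optional[str]:
    """Convert 'January 1, 2003' to '2003-01-01'; return None if it does not match."""
    try:
        return datetime.strptime(text.strip(), "%B %d, %Y").date().isoformat()
    except ValueError:
        return None


def split_authors(text: str) -> list:
    """Split the newline-separated authors string into name/affiliation pairs.

    Assumption: a line wrapped in parentheses is the affiliation of the
    author named on the preceding line.
    """
    authors = []
    for line in (part.strip() for part in text.splitlines()):
        if not line:
            continue
        if line.startswith("(") and line.endswith(")") and authors:
            authors[-1]["affiliation"] = line[1:-1]
        else:
            authors.append({"name": line, "affiliation": None})
    return authors
```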

Key-value store `SUMMARY`

The Actor also writes a SUMMARY record with:

searchQuery
totalResults
totalPagesProcessed
lastPageVisited
usePPE
ppeEventName
ppeChargeAttempts
completed
timestamp
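A run can be sanity-checked by comparing the SUMMARY record with the dataset. The following is a small sketch under assumptions: `completed` is taken to be a boolean and `totalResults` the number of items saved, which the field names suggest but this README does not state explicitly.

```python
def summary_problems(summary: dict, dataset_item_count: int) -> list:
    """Return human-readable problems found in a SUMMARY record (hypothetical helper)."""
    problems = []
    if not summary.get("completed"):
        problems.append("run is not marked as completed")
    total = summary.get("totalResults")
    if total != dataset_item_count:
        problems.append(
            f"totalResults is {total!r} but the dataset has {dataset_item_count} items"
        )
    return problems
```

An empty list means the two checks passed; anything else is worth inspecting before the data is used downstream.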

How To Run

On Apify platform

Open Actor input.
Set searchQuery and any optional fields (maxResults, startPage, downloadPDFs, proxyConfiguration).
Run the Actor.
Read the results in the default dataset and the SUMMARY record in the key-value store.
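The same steps can also be scripted. The sketch below assumes the `apify-client` Python package and takes an already constructed client as a parameter; the Actor ID is not given in this README, so the value in the usage comment is a placeholder to replace.

```python
from typing import Any, Optional, Tuple


def run_and_collect(client: Any, actor_id: str, run_input: dict) -> Tuple[list, Optional[dict]]:
    """Run the Actor, then return (dataset items, SUMMARY value or None).

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")

    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    record = client.key_value_store(run["defaultKeyValueStoreId"]).get_record("SUMMARY")
    summary = record["value"] if record else None
    return items, summary


# Usage (assumes `pip install apify-client` and an APIFY_TOKEN environment variable):
#
#   import os
#   from apify_client import ApifyClient
#
#   client = ApifyClient(os.environ["APIFY_TOKEN"])
#   items, summary = run_and_collect(
#       client,
#       "<username>/<actor-name>",  # placeholder: use the Actor ID shown in Apify Console
#       {"searchQuery": "mars sample return", "maxResults": 50, "downloadPDFs": False},
#   )
```

Passing the client in keeps the function easy to exercise with a stub, without network access or an API token.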

Pricing Notes

End-user pricing is controlled in Apify Console publication settings.
This Actor always attempts one PPE charge per scraped report (event name: ntrs-report-scraped).
Billable PPE still requires the Actor to be configured for pay-per-event in Apify Console.

FAQ

Why are some fields empty?

Some NTRS records do not provide complete metadata. Even with API enrichment, fields that are missing in the source data stay empty.

Why is `pdfUrl` empty?

Many records do not have a downloadable PDF. The Actor keeps pdfUrl empty in that case.

Are duplicates removed?

Yes, duplicates are skipped by citation ID within a run.

Limitations

Data quality depends on NASA NTRS source records.
PDF availability is source-dependent.
Very large unlimited runs (maxResults = 0) can take a long time.

Legal

This project is not affiliated with or endorsed by NASA. Use this Actor in compliance with NASA NTRS terms and applicable laws.

Support

Apify docs: https://docs.apify.com
NTRS: https://ntrs.nasa.gov
