Pricing

from $10.00 / 1,000 results

Semantic Scholar Paper Scraper

Scrapes academic papers from the Semantic Scholar API. Access a vast corpus of scientific literature with citation counts, abstracts, and author information powered by the Allen Institute for AI.

Developer

Donny

Last modified

17 hours ago

What it does

This actor connects to the public Semantic Scholar API, fetches structured data matching your search criteria, and stores the results in a normalized dataset on the Apify platform. It handles pagination automatically, so you can collect large result sets without managing offsets or page limits yourself. Built-in error handling, request timeouts, and input validation help keep runs reliable.
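The pagination behavior described above can be pictured with a minimal sketch. It is not the actor's actual source: the `fetch_page` callable, the `page_size` default, and the offset-based paging scheme are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical page fetcher: takes (query, offset, limit) and returns raw records.
FetchPage = Callable[[str, int, int], List[Dict]]


def collect_results(
    query: str,
    max_results: int,
    fetch_page: FetchPage,
    page_size: int = 100,  # assumed page size, not taken from the actor
) -> List[Dict]:
    """Walk offsets page by page until max_results is reached or data runs out."""
    results: List[Dict] = []
    offset = 0
    while len(results) < max_results:
        # Never request more than is still needed.
        limit = min(page_size, max_results - len(results))
        page = fetch_page(query, offset, limit)
        if not page:
            break  # no more results available
        results.extend(page[:limit])
        offset += len(page)
    return results


def fake_fetch(query: str, offset: int, limit: int) -> List[Dict]:
    """Stand-in for a real API call: serves slices of a 250-item corpus."""
    corpus = [{"paperId": str(i)} for i in range(250)]
    return corpus[offset:offset + limit]


papers = collect_results("transformer neural network", 120, fake_fetch)
print(len(papers))  # 120
```

Passing the fetcher in as a parameter keeps the loop testable without network access; a real fetcher would also need the timeouts and error handling mentioned above.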

Why use this actor

Manually querying APIs and handling pagination, rate limits, and data normalization is tedious and error-prone. This actor automates the entire process. Simply provide your search parameters, set the maximum number of results you want, and let the actor handle the rest. The data is stored in a structured dataset that you can export as JSON, CSV, or Excel. You can integrate this actor into larger workflows using the Apify API, schedule it for recurring data collection, or trigger it from your own applications via webhooks.
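As a rough sketch of triggering a run from your own application, the snippet below builds a request against the Apify API's `run-sync-get-dataset-items` endpoint using only the standard library. The actor ID shown is a hypothetical placeholder (this page does not state the actor's ID), and the token is whatever API token your Apify account provides.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"
# Hypothetical actor ID in "username~actor-name" form; replace with the real one.
ACTOR_ID = "brave_paradise~semantic-scholar-paper-scraper"


def build_run_request(
    token: str, search_query: str, max_results: int = 100
) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the actor synchronously."""
    url = f"{APIFY_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    body = json.dumps(
        {"searchQuery": search_query, "maxResults": max_results}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def run_actor(
    token: str, search_query: str, max_results: int = 100, timeout: float = 300
) -> list:
    """Send the request and return the dataset items as parsed JSON."""
    request = build_run_request(token, search_query, max_results)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_actor("<YOUR_APIFY_TOKEN>", "graph neural networks", 50)` would start a run and wait for its results. For long runs, starting the run asynchronously and polling the dataset is likely a better fit than the synchronous endpoint.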

Input parameters

searchQuery (string, required): The search term to query Semantic Scholar. Default: "transformer neural network".
maxResults (integer, optional): Maximum number of results to return. Default: 100. Range: 1-1000.

All inputs are validated at startup with sensible defaults applied when values are missing. The actor will log warnings for any misconfigured options and continue with safe defaults rather than failing outright.
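The validate-then-fall-back behavior could look roughly like the sketch below. The function name `normalize_input` and the exact checks are assumptions for illustration; only the parameter names, defaults, and range come from the list above.

```python
import logging
from typing import Any, Dict

log = logging.getLogger("input-validation")

DEFAULT_QUERY = "transformer neural network"
DEFAULT_MAX_RESULTS = 100
MIN_RESULTS, MAX_RESULTS = 1, 1000


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a usable input dict, logging a warning for each value replaced."""
    query = raw.get("searchQuery")
    if not isinstance(query, str) or not query.strip():
        log.warning("searchQuery missing or invalid; using default %r", DEFAULT_QUERY)
        query = DEFAULT_QUERY

    max_results = raw.get("maxResults", DEFAULT_MAX_RESULTS)
    # bool is a subclass of int in Python, so reject it explicitly.
    is_int = isinstance(max_results, int) and not isinstance(max_results, bool)
    if not is_int or not MIN_RESULTS <= max_results <= MAX_RESULTS:
        log.warning(
            "maxResults %r is not an integer in %d-%d; using default %d",
            max_results, MIN_RESULTS, MAX_RESULTS, DEFAULT_MAX_RESULTS,
        )
        max_results = DEFAULT_MAX_RESULTS

    return {"searchQuery": query.strip(), "maxResults": max_results}


print(normalize_input({"searchQuery": "  graph neural networks ", "maxResults": 5000}))
# {'searchQuery': 'graph neural networks', 'maxResults': 100}
```

Falling back to the default for an out-of-range value, rather than clamping it to the nearest bound, is a design choice made here to mirror the "safe defaults" wording; the actor itself may behave differently.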

Output data

Each result in the dataset contains the following fields:

paperId: Semantic Scholar paper identifier
title: Paper title
abstract: Paper abstract
year: Publication year
citationCount: Number of citations
url: URL to the paper on Semantic Scholar
authors: Comma-separated list of author names

All string fields are null-checked for consistency: a missing or undefined value is stored as null rather than as an empty string.
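The mapping from a raw API record to the fields above might be sketched as follows. This assumes the raw record carries `authors` as a list of objects with a `name` key and that blank strings should also become null; both are assumptions about the upstream data, not confirmed details of this actor, and `normalize_paper` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional


def _text_or_none(value: Any) -> Optional[str]:
    """Return a stripped string, or None when the value is missing or blank."""
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None


def normalize_paper(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a raw paper record onto the output fields, using None for gaps."""
    # Assumed raw shape: "authors" is a list of {"name": ...} objects.
    names = [
        author["name"]
        for author in raw.get("authors") or []
        if isinstance(author, dict) and author.get("name")
    ]
    return {
        "paperId": _text_or_none(raw.get("paperId")),
        "title": _text_or_none(raw.get("title")),
        "abstract": _text_or_none(raw.get("abstract")),
        "year": raw.get("year"),
        "citationCount": raw.get("citationCount"),
        "url": _text_or_none(raw.get("url")),
        "authors": ", ".join(names) if names else None,
    }


record = normalize_paper({
    "paperId": "abc123def456",
    "title": "Attention Is All You Need",
    "year": 2017,
    "authors": [{"name": "Ashish Vaswani"}, {"name": "Noam Shazeer"}],
})
print(record["authors"])   # Ashish Vaswani, Noam Shazeer
print(record["abstract"])  # None
```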

Example output

{
    "paperId": "abc123def456",
    "title": "Attention Is All You Need",
    "abstract": "The dominant sequence transduction models...",
    "year": 2017,
    "citationCount": 95000,
    "url": "https://www.semanticscholar.org/paper/abc123",
    "authors": "Ashish Vaswani, Noam Shazeer"
}

Pricing

This actor is available on the Apify platform with usage-based pricing. Each run incurs a startup cost of approximately $0.005, plus roughly $0.01 per result collected (about $10.00 per 1,000 results). Actual costs depend on the number of results, API response times, and memory allocation. You can control costs by setting the maxResults parameter to limit the number of results collected per run. For high-volume use cases, consider running the actor on a schedule during off-peak hours to optimize platform resource usage.
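For budgeting, the approximate figures above translate into a simple estimate. The helper below is only a back-of-the-envelope sketch: `estimate_run_cost` is a hypothetical name, and it ignores the memory and response-time factors that also affect the real bill.

```python
def estimate_run_cost(
    results: int,
    start_fee: float = 0.005,   # approximate per-run startup cost in USD
    per_result: float = 0.01,   # approximate cost per result in USD
) -> float:
    """Estimate the cost of one run in USD from the approximate listed rates."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return round(start_fee + results * per_result, 3)


print(estimate_run_cost(100))   # 1.005
print(estimate_run_cost(1000))  # 10.005
```

At these rates a full 1,000-result run costs about $10, so lowering maxResults is the most direct way to cap spend.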

More scrapers from brave_paradise

Visit the brave_paradise profile on Apify to explore the full collection of specialized data scrapers and automation tools.
