Figshare Scraper

Pricing

from $3.00 / 1,000 results

Developer

Crawler Bros

Maintained by Community

Scrape research articles, datasets, code, figures, and more from Figshare using its public REST API — no authentication or proxy required.

What It Does

This actor extracts metadata and content information from Figshare, one of the world's largest open research data repositories. It supports full-text keyword search, direct article ID lookup, and institution-specific article browsing across all Figshare content types.

Key Features

  • No authentication required — uses the public Figshare API
  • Three modes: keyword search, fetch by ID, or browse by institution
  • Filter by content type: datasets, papers, theses, code, figures, presentations, and more
  • Full detail extraction: authors, categories, tags, license, files, DOI
  • HTML description stripping — descriptions are returned as clean plain text
  • Automatic pagination — retrieves up to 1,000 results per run
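One way to picture the automatic pagination feature is as a plan of page-sized requests that together cover the requested total. The sketch below is an assumption about how such a plan could be computed, not the actor's actual code. The helper name `page_plan` and the page size of 100 are illustrative; `page` and `page_size` mirror the paging parameters of the Figshare API.

```python
from typing import List, Tuple

MAX_ITEMS_LIMIT = 1000  # per-run ceiling stated in the feature list


def page_plan(max_items: int, page_size: int = 100) -> List[Tuple[int, int, int]]:
    """Return (page, page_size, keep) triples covering max_items results.

    page_size stays constant so page offsets line up; keep is how many
    items from that page count toward the total.
    """
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"max_items must be between 1 and {MAX_ITEMS_LIMIT}")
    if page_size < 1:
        raise ValueError("page_size must be positive")
    plan = []
    remaining = max_items
    page = 1
    while remaining > 0:
        keep = min(page_size, remaining)
        plan.append((page, page_size, keep))
        remaining -= keep
        page += 1
    return plan


print(page_plan(250))  # [(1, 100, 100), (2, 100, 100), (3, 100, 50)]
```

Keeping the page size fixed and trimming only the last page avoids skipping or repeating items when the total is not a multiple of the page size.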

Input Fields

| Field | Type | Description |
| --- | --- | --- |
| mode | Select | searchArticles (default), getById, or searchByInstitution |
| searchQuery | String | Keyword query (e.g. climate change, CRISPR) |
| itemType | Select | Filter by content type (see table below) |
| articleIds | Array | Figshare article IDs for getById mode |
| institutionId | String | Institution numeric ID for searchByInstitution mode |
| maxItems | Integer | Max results to return (1–1000, default 50) |
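The input fields above can be assembled and sanity-checked before a run. The following is a minimal sketch: `build_input` is a hypothetical helper, and two of its rules are assumptions rather than documented behavior, namely that getById requires articleIds and searchByInstitution requires institutionId, and that itemType is passed as a string.

```python
from typing import Any, Dict, Optional, Sequence

VALID_MODES = {"searchArticles", "getById", "searchByInstitution"}


def build_input(
    mode: str = "searchArticles",
    search_query: Optional[str] = None,
    item_type: str = "",
    article_ids: Optional[Sequence[Any]] = None,
    institution_id: Optional[str] = None,
    max_items: int = 50,
) -> Dict[str, Any]:
    """Build an actor input dict using the field names from the table."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")

    run_input: Dict[str, Any] = {"mode": mode, "maxItems": max_items}
    if search_query:
        run_input["searchQuery"] = search_query
    if item_type:  # empty string means all types
        run_input["itemType"] = item_type

    # Assumption: each non-default mode needs its own identifying field.
    if mode == "getById":
        if not article_ids:
            raise ValueError("getById mode needs articleIds")
        run_input["articleIds"] = list(article_ids)
    elif mode == "searchByInstitution":
        if not institution_id:
            raise ValueError("searchByInstitution mode needs institutionId")
        run_input["institutionId"] = str(institution_id)
    return run_input


print(build_input(search_query="CRISPR", item_type="3", max_items=100))
```

The resulting dict can be serialized to JSON and pasted into the actor's input editor or passed to whichever client library is used to start the run.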

Item Types

| Value | Type |
| --- | --- |
| (empty) | All types |
| 1 | Figure |
| 2 | Media |
| 3 | Dataset |
| 4 | Poster |
| 5 | Paper |
| 6 | Presentation |
| 7 | Thesis |
| 8 | Code |
| 9 | Preprint |
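To avoid hard-coding numeric values, the table above can be turned into a small lookup. This is a sketch only: `item_type_value` is a hypothetical helper, and passing the values as strings is an assumption about the actor's input schema.

```python
# Values and names copied from the Item Types table.
ITEM_TYPES = {
    "1": "Figure",
    "2": "Media",
    "3": "Dataset",
    "4": "Poster",
    "5": "Paper",
    "6": "Presentation",
    "7": "Thesis",
    "8": "Code",
    "9": "Preprint",
}


def item_type_value(name: str) -> str:
    """Map a type name (case-insensitive) to its itemType value.

    An empty name or "all types" maps to "", which selects every type.
    """
    wanted = name.strip().lower()
    if wanted in ("", "all", "all types"):
        return ""
    for value, label in ITEM_TYPES.items():
        if label.lower() == wanted:
            return value
    raise ValueError(f"unknown item type: {name!r}")


print(item_type_value("Code"))  # 8
```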

Output Fields

Each item in the dataset contains:

| Field | Type | Description |
| --- | --- | --- |
| articleId | Integer | Unique Figshare article ID |
| title | String | Article title |
| description | String | Plain-text abstract/description (HTML stripped) |
| doi | String | Digital Object Identifier |
| publishedDate | String | Publication date (ISO 8601) |
| modifiedDate | String | Last modified date (ISO 8601) |
| authors | Array | List of author full names |
| tags | Array | Keyword tags |
| categories | Array | Subject category names |
| license | String | License name (e.g. CC BY 4.0) |
| figshareUrl | String | Public URL on Figshare |
| sourceUrl | String | Same as figshareUrl |
| thumbUrl | String | Thumbnail image URL |
| downloadUrl | String | Direct download URL for the primary file |
| itemType | String | Content type name (e.g. dataset, paper) |
| citation | String | Full citation string |
| viewCount | Integer | Number of views (when available) |
| citationCount | Integer | Number of citations (when available) |
| recordType | String | Always "article" |
| scrapedAt | String | Timestamp when the record was scraped |

Example Output

{
  "articleId": 32513898,
  "title": "Perspectives of Allied Health Professionals on Digital Health Services in South Australia",
  "description": "This research project explores the perspectives of allied health professionals on digital health services in South Australia.",
  "doi": "10.25909/32513898.v1",
  "publishedDate": "2026-05-30T02:03:44Z",
  "modifiedDate": "2026-05-30T02:03:44Z",
  "authors": ["Muhammad Khan"],
  "tags": ["digital health activism"],
  "categories": ["Audiology", "Occupational therapy", "Physiotherapy", "Speech pathology"],
  "license": "CC BY 4.0",
  "figshareUrl": "https://adelaide.figshare.com/articles/poster/...",
  "sourceUrl": "https://adelaide.figshare.com/articles/poster/...",
  "thumbUrl": "https://s3-eu-west-1.amazonaws.com/ppreviews-adelaide-.../thumb.png",
  "downloadUrl": "https://ndownloader.figshare.com/files/65104542",
  "itemType": "poster",
  "citation": "Khan, Muhammad (2026). Perspectives of Allied Health...",
  "recordType": "article",
  "scrapedAt": "2026-05-30T10:00:00.000000+00:00"
}
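A record shaped like the example above can be reduced to a few typed fields for downstream work. The sketch below is illustrative: `parse_timestamp` and `summarize` are hypothetical helpers, and the sample dict is a trimmed copy of the example output. It treats viewCount and citationCount as optional, since the example record omits them.

```python
from datetime import datetime
from typing import Any, Dict


def parse_timestamp(value: str) -> datetime:
    """Parse the ISO 8601 timestamps used in the output records."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(item: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out a few fields, tolerating the optional statistics."""
    authors = item.get("authors") or []
    return {
        "id": item["articleId"],
        "doi": item.get("doi"),
        "published": parse_timestamp(item["publishedDate"]),
        "first_author": authors[0] if authors else None,
        "views": item.get("viewCount"),          # None when absent
        "citations": item.get("citationCount"),  # None when absent
    }


sample = {
    "articleId": 32513898,
    "doi": "10.25909/32513898.v1",
    "publishedDate": "2026-05-30T02:03:44Z",
    "authors": ["Muhammad Khan"],
    "scrapedAt": "2026-05-30T10:00:00.000000+00:00",
}
summary = summarize(sample)
print(summary["published"].isoformat())  # 2026-05-30T02:03:44+00:00
```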

Use Cases

  • Research discovery: Find datasets or papers by keyword across all disciplines
  • Data science: Collect open datasets for analysis and modeling
  • Literature reviews: Gather papers, theses, and preprints by topic
  • Institution profiling: Browse all public outputs from a specific university
  • Open science auditing: Track open data and code availability by subject
  • Citation analysis: Collect DOIs for downstream citation graph work

FAQs

Q: Does this require an API key? A: No. The Figshare public API is freely accessible without authentication.

Q: How many results can I get? A: Up to 1,000 per run (the maxItems ceiling), even though Figshare hosts millions of items.

Q: What does searchByInstitution mode do? A: It retrieves all public articles published by a specific institution on Figshare. You need the institution's numeric ID (e.g., University of Manchester = 2).

Q: Are descriptions HTML-free? A: Yes. The actor strips all HTML tags from descriptions, returning clean plain text.
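For readers curious what stripping HTML from a description involves, here is a minimal sketch using Python's standard `html.parser` module. It is not the actor's implementation. The `strip_html` helper and the choice of block-level tags that become spaces are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tags treated as word boundaries so adjacent paragraphs do not run together.
BLOCK_TAGS = {"p", "br", "div", "li", "ul", "ol", "tr",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextExtractor(HTMLParser):
    """Collect text nodes, inserting a space around block-level tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)  # decode entities like &amp;
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html_text: str) -> str:
    """Return plain text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html_text)
    parser.close()
    return " ".join("".join(parser.parts).split())


print(strip_html("<p>Data &amp; code for <b>CRISPR</b> screens.</p><p>See README.</p>"))
# Data & code for CRISPR screens. See README.
```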

Q: Why are viewCount and citationCount sometimes absent? A: The Figshare public API does not always return view and citation statistics in the article detail endpoint. These fields are included only when the data is available.

Q: Can I search for code repositories? A: Yes — set itemType to 8 (Code) and use any keyword in searchQuery.

Q: How fresh is the data? A: The Figshare API returns live data. New uploads appear within minutes of publication.