Pricing

$19.99/month + usage

arXiv Search Scraper 📚

Developed by

EasyApi

Extract comprehensive research paper data from arXiv search results. Get detailed metadata including titles, authors, abstracts, categories and more. Perfect for academic research monitoring, trend analysis and building paper databases. 🎓📚

Runs succeeded

>99%

Last modified

4 months ago

Scrape research papers, authors, and metadata from arXiv search results. Get detailed information about academic papers including titles, authors, abstracts, categories, submission dates and more.

Features ✨

🔍 Scrape papers from any arXiv search URL
📄 Extract comprehensive paper metadata including:
- Paper ID and PDF links
- Title and abstract
- Author names and profile URLs
- Research categories and classifications
- Submission dates and comments
⚡ Fast and efficient pagination handling
🔄 Support for multiple search URLs
⚙️ Configurable maximum items limit
🌐 Proxy support for reliable scraping

Use Cases 💡

Research trend analysis
Academic paper monitoring
Building paper databases
Author tracking
Category-based paper collection
Literature review automation

Input Parameters 🎛️

The actor accepts the following input parameters:

Field	Type	Description
searchUrls	Array	List of arXiv search URLs to scrape
maxItems	Integer	Maximum number of items to scrape (optional)
proxyConfiguration	Object	Proxy settings (optional)

Output 📊

The actor stores its results in a dataset. Each paper record has the following fields:

searchUrl: Source search URL
arxivId: Unique arXiv paper ID
pdfUrl: Direct link to PDF
categories: Research categories with codes and names
title: Paper title
authors: Author details including names and profile URLs
abstract: Full paper abstract
submissionDate: Paper submission date
comments: Additional paper comments
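
As a minimal sketch, the fields above could be mapped onto typed Python objects when post-processing a downloaded dataset. The class names (`Paper`, `Author`, `Category`) and the helper `paper_from_item` are hypothetical and not part of the actor. Treating `comments` and the category `name` as optional is an assumption, since the field list does not say which values are always present.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Category:
    code: str
    name: Optional[str] = None  # assumed optional; some entries may carry only a code


@dataclass
class Author:
    name: str
    url: str


@dataclass
class Paper:
    search_url: str
    arxiv_id: str
    pdf_url: str
    title: str
    abstract: str
    submission_date: str  # kept as the raw string from the dataset
    comments: Optional[str] = None  # assumed optional
    categories: list[Category] = field(default_factory=list)
    authors: list[Author] = field(default_factory=list)


def paper_from_item(item: dict) -> Paper:
    """Convert one dataset item (a dict keyed by the documented field names) to a Paper."""
    return Paper(
        search_url=item["searchUrl"],
        arxiv_id=item["arxivId"],
        pdf_url=item["pdfUrl"],
        title=item["title"],
        abstract=item["abstract"],
        submission_date=item["submissionDate"],
        comments=item.get("comments"),
        categories=[Category(c["code"], c.get("name")) for c in item.get("categories", [])],
        authors=[Author(a["name"], a["url"]) for a in item.get("authors", [])],
    )
```

Keeping the raw date string avoids guessing at the date format at this stage; parsing can happen later where the format is known.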

Example Usage 💻

Input Example

The following JSON input scrapes up to 60 papers from a single arXiv search for "ai":

{
    "searchUrls": [
        "https://arxiv.org/search/?query=ai&searchtype=all&source=header"
    ],
    "maxItems": 60
}
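
The input above can also be generated programmatically. The sketch below builds the same JSON from a list of search terms. The helper names `build_search_url` and `build_input` are hypothetical, and the URL parameters (`query`, `searchtype`, `source`) are simply copied from the example URL rather than taken from any documented arXiv API.

```python
import json
from urllib.parse import urlencode

ARXIV_SEARCH = "https://arxiv.org/search/"


def build_search_url(query: str, searchtype: str = "all") -> str:
    """Build an arXiv search URL shaped like the one in the input example."""
    params = {"query": query, "searchtype": searchtype, "source": "header"}
    return f"{ARXIV_SEARCH}?{urlencode(params)}"


def build_input(queries, max_items=None) -> dict:
    """Assemble the actor input; maxItems is omitted when no limit is given."""
    if not queries:
        raise ValueError("at least one search query is required")
    actor_input = {"searchUrls": [build_search_url(q) for q in queries]}
    if max_items is not None:
        if max_items <= 0:
            raise ValueError("max_items must be a positive integer")
        actor_input["maxItems"] = max_items
    return actor_input


if __name__ == "__main__":
    # Reproduces the input example for a single "ai" search limited to 60 items.
    print(json.dumps(build_input(["ai"], max_items=60), indent=4))
```

Passing several queries yields several entries in `searchUrls`, which matches the actor's support for multiple search URLs.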

Output sample

The results are stored in a dataset, which you can find in the Storage tab. You can download the data as JSON, JSONL, Excel spreadsheet, HTML table, CSV, or XML. Here is a JSON excerpt of the data returned for the input parameters above:

[
    {
        "searchUrl": "https://arxiv.org/search/?query=ai&searchtype=all&source=header",
        "arxivId": "arXiv:2502.21286",
        "pdfUrl": "https://arxiv.org/pdf/2502.21286",
        "categories": [
            {
                "code": "cs.CR",
                "name": "Cryptography and Security"
            },
            {
                "code": "cs.LG",
                "name": "Machine Learning"
            },
            {
                "code": "cs.NI",
                "name": "Networking and Internet Architecture"
            },
            {
                "code": "doi"
            },
            {
                "code": "10.1109/TNSM.2024.3376631"
            }
        ],
        "title": "Enabling AutoML for Zero-Touch Network Security: Use-Case Driven Analysis",
        "authors": [
            {
                "name": "Li Yang",
                "url": "https://arxiv.org/search/?searchtype=author&query=Yang%2C+L"
            },
            {
                "name": "Mirna El Rajab",
                "url": "https://arxiv.org/search/?searchtype=author&query=Rajab%2C+M+E"
            },
            {
                "name": "Abdallah Shami",
                "url": "https://arxiv.org/search/?searchtype=author&query=Shami%2C+A"
            },
            {
                "name": "Sami Muhaidat",
                "url": "https://arxiv.org/search/?searchtype=author&query=Muhaidat%2C+S"
            }
        ],
        "abstract": "Zero-Touch Networks (ZTNs) represent a state-of-the-art paradigm shift towards fully automated and intelligent network management, enabling the automation and intelligence required to manage the complexity, scale, and dynamic nature of next-generation (6G) networks. ZTNs leverage Artificial Intelligence (AI) and Machine Learning (ML) to enhance operational efficiency, support intelligent decision-making, and ensure effective resource allocation. However, the implementation of ZTNs is subject to security challenges that need to be resolved to achieve their full potential. In particular, two critical challenges arise: the need for human expertise in developing AI/ML-based security mechanisms, and the threat of adversarial attacks targeting AI/ML models. In this survey paper, we provide a comprehensive review of current security issues in ZTNs, emphasizing the need for advanced AI/ML-based security mechanisms that require minimal human intervention and protect AI/ML models themselves. Furthermore, we explore the potential of Automated ML (AutoML) technologies in developing robust security solutions for ZTNs. Through case studies, we illustrate practical approaches to securing ZTNs against both conventional and AI/ML-specific threats, including the development of autonomous intrusion detection systems and strategies to combat Adversarial ML (AML) attacks. The paper concludes with a discussion of the future research directions for the development of ZTN security approaches.\n        △ Less",
        "submissionDate": "28 February, 2025",
        "comments": "Published in IEEE Transactions on Network and Service Management (TNSM); Code is available at Github link: https://github.com/Western-OC2-Lab/AutoML-and-Adversarial-Attack-Defense-for-Zero-Touch-Network-Security"
    },
    ...
]
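
The sample record shows a few quirks that are worth normalizing before analysis: the abstract ends with a trailing "△ Less" label, the `categories` list contains `doi` entries without a `name`, and the date is a free-form string. The following is a minimal clean-up sketch based only on this one sample. The function names are hypothetical, the rule "entries without a `name` are not real categories" is an assumption, and the date parsing assumes an English locale.

```python
import re
from datetime import date, datetime


def clean_abstract(text: str) -> str:
    """Strip the trailing '△ Less' label and surrounding whitespace."""
    return re.sub(r"\s*△ Less\s*$", "", text).strip()


def split_categories(categories: list) -> tuple:
    """Separate named categories from stray entries and recover a DOI if present.

    Assumption: real categories always carry a 'name'; the DOI value is the
    nameless entry that directly follows the nameless 'doi' entry.
    """
    named = [c for c in categories if "name" in c]
    stray = [c["code"] for c in categories if "name" not in c]
    doi = None
    if "doi" in stray:
        idx = stray.index("doi")
        if idx + 1 < len(stray):
            doi = stray[idx + 1]
    return named, doi


def parse_submission_date(text: str) -> date:
    """Parse dates formatted like '28 February, 2025' (English month names)."""
    return datetime.strptime(text, "%d %B, %Y").date()


def clean_item(item: dict) -> dict:
    """Return a normalized copy of one dataset item; the input is not modified."""
    named, doi = split_categories(item.get("categories", []))
    cleaned = dict(item)
    cleaned["abstract"] = clean_abstract(item["abstract"])
    cleaned["categories"] = named
    cleaned["doi"] = doi
    cleaned["submissionDate"] = parse_submission_date(item["submissionDate"]).isoformat()
    return cleaned
```

Applied to the record above, `clean_item` keeps the three named categories, moves `10.1109/TNSM.2024.3376631` into a separate `doi` field, and converts the date to `2025-02-28`.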

🔬 Nature Search Results Scraper - Extract comprehensive research article data from Nature.com search results
📚 Goodreads Book Scraper - Extract comprehensive book data for literature research and analysis
📚 Goodreads Review Scraper - Extract detailed book reviews and ratings for academic literature analysis
📚 Udemy Course Scraper - Extract detailed course information for educational content research
📚 Udemy Course Reviews Scraper - Collect comprehensive course review data for educational analysis
📄 Article Content Extractor - Extract clean article content and metadata from any web page
🔍 Google Scholar Scraper - Collect scholarly results with flexible search options and citation filtering
🔍 AI-powered Search - Get AI-enhanced search summaries with references and optimization tips
📊 Text Sentiment Analysis - Analyze sentiment in research abstracts and academic content
📝 Text Summarization - Generate concise summaries of research papers and documents
🔍 PubMed Search Scraper - Extract research papers and academic articles from PubMed
📚 Substack Publications Scraper - Collect detailed academic newsletter and publication data
📚 Substack Posts Scraper - Extract comprehensive academic post and article content
🔍 Keyword Discovery Tool - Discover relevant academic keywords and research topics
🔍 Keyword Density Checker - Analyze keyword frequency in academic content
📚 Medium Posts Search Scraper - Extract detailed article data for research content analysis
