# Google Scholar Lite - Cheap Bulk Academic Papers API (`johnvc/google-scholar-lite-api`) Actor

Search Google Scholar for academic papers in bulk and export clean JSON: title, authors, journal, year, citation count, and PDF links. Fast bibliometric search for literature reviews, citation discovery, and research datasets. Pay per paper from $1.50 per 1,000, with no setup or per-run fee.

- **URL**: https://apify.com/johnvc/google-scholar-lite-api.md
- **Developed by:** [John](https://apify.com/johnvc) (community)
- **Categories:** Developer tools, AI, Integrations
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 4 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $0.01 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Google Scholar Lite - Cheap Bulk Academic Papers API

Search Google Scholar for academic papers in bulk and get clean, structured JSON: title, authors, journal, publication year, citation count, result snippet, and links to the paper and its PDF or HTML full text. Search many queries at once, filter by year range, and export thousands of papers. Pay per paper from **$1.50 per 1,000**, with no setup or per-run fee.

This is the **Lite** option: fast bibliometric search for literature reviews, citation discovery, and research datasets. It talks to a structured scholarly search API instead of driving a slow, breakable headless browser, so it is quick and reliable. If you need full PDF text extraction, author profiles (h-index, full publication lists), or citation-network walking, this is not the tool - see the comparison below.

### What you get

One clean row per paper:

- Title and result snippet
- `publicationInfo` line (authors, journal or venue, year)
- Publication `year`
- `citedBy` citation count
- `link` to the paper, plus `pdfUrl` / `htmlUrl` full-text links when available
- A stable result `id` and the `searchTerm` it was found for

### What you do NOT get

- Full PDF text extraction
- Author profile expansion (h-index, all-publications lists)
- Citation-network walking (who-cites-whom graphs)
- Semantic enrichment or de-duplication against external databases

Need those? Our more robust [Google Scholar API](https://apify.com/johnvc/google-scholar-api?fpr=9n7kx3) is the full-featured companion: it adds author profiles (h-index, full publication lists), citation lookups, and co-author network expansion. This Lite Actor is the fast, cheap complement for everyone who just needs the paper data in bulk.

### Use cases

- Build a literature-review shortlist across dozens of queries in one run
- Track citation counts for a topic or research group over a year range
- Assemble bibliometric datasets for analysis or AI training pipelines
- Discover the most-cited recent papers in a field, then follow the PDF links
- Monitor a research area by re-running the same queries on a schedule

### When to use this Actor

This Lite Actor is the bulk, low-cost option. When you need deeper research features, our more robust [Google Scholar API](https://apify.com/johnvc/google-scholar-api?fpr=9n7kx3) is the companion to reach for.

| Feature | This Actor (Lite) | [Google Scholar API](https://apify.com/johnvc/google-scholar-api?fpr=9n7kx3) (premium) |
|---|---|---|
| Best for | Bulk paper search, literature reviews, datasets | Deep author and citation research |
| Paper search with year filters | Yes | Yes |
| Author profiles, h-index, citation graphs | Not included | Included |
| Full PDF text | Not included | Often included |
| Pricing | Pay per paper, from $1.50 / 1,000 | See its store page |

Rule of thumb: **bulk paper discovery -> this Lite Actor; deep author and citation research -> the premium [Google Scholar API](https://apify.com/johnvc/google-scholar-api?fpr=9n7kx3).**

### Pricing

Pay-per-result: you are charged only for the **papers** returned. The per-paper price scales down with your Apify plan:

| Plan | Per paper | Per 1,000 papers |
|---|---|---|
| Free | $0.0015 | $1.50 |
| Bronze | $0.0013 | $1.30 |
| Silver | $0.0011 | $1.10 |
| Gold | $0.0009 | $0.90 |

No per-run fee, no setup fee, no monthly minimum. You only pay for the papers you receive.
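
The pricing table gives a quick way to bound spend before a run. The following is a minimal sketch, assuming every query returns its full `maxResultsPerSearch` quota (the real charge is lower whenever a query runs out of papers). `PRICE_PER_1000` and `max_run_cost` are illustrative names, not part of the Actor or of any Apify library.

```python
from decimal import Decimal

# Per-1,000-paper prices in USD, copied from the pricing table above.
PRICE_PER_1000 = {
    "Free": Decimal("1.50"),
    "Bronze": Decimal("1.30"),
    "Silver": Decimal("1.10"),
    "Gold": Decimal("0.90"),
}


def max_run_cost(num_search_terms: int, max_results_per_search: int, plan: str = "Free") -> Decimal:
    """Upper bound on the charge for one run, assuming every query fills its quota."""
    if plan not in PRICE_PER_1000:
        raise ValueError(f"unknown plan: {plan!r}")
    papers = num_search_terms * max_results_per_search
    return PRICE_PER_1000[plan] * papers / 1000


if __name__ == "__main__":
    # Two queries at 100 papers each on the Free plan.
    print(max_run_cost(2, 100, "Free"))
```

`Decimal` is used instead of `float` so that sub-cent prices add up without binary rounding noise.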

### Input

| Field | Type | Description |
|-------|------|-------------|
| `searchTerms` | array of strings | One or more search queries, e.g. `transformer attention mechanism`. Each is searched independently. Required. |
| `yearFrom` | integer | Optional earliest publication year, e.g. `2020`. |
| `yearTo` | integer | Optional latest publication year, e.g. `2026`. |
| `maxResultsPerSearch` | integer | Papers per search query. Minimum 10, maximum 500, default 100. |
| `language` | string | Optional two-letter language code, e.g. `en`, `es`, `de`. Default `en`. |

#### Example input

```json
{
  "searchTerms": ["transformer attention mechanism", "diffusion models"],
  "yearFrom": 2020,
  "yearTo": 2026,
  "language": "en",
  "maxResultsPerSearch": 100
}
````
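
If you assemble the input in code, it can help to check it locally before starting a run. The sketch below is one possible way to do that, not the Actor's own validation. The numeric bounds (10 to 500 results, years 1900 to 2100) are taken from the input schema in the OpenAPI specification further down; the `yearFrom <= yearTo` check and the helper name `build_input` are assumptions added for illustration.

```python
from typing import Optional


def build_input(
    search_terms: list[str],
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    max_results_per_search: int = 100,
    language: str = "en",
) -> dict:
    """Build an Actor input dict, rejecting values outside the documented bounds."""
    # Drop blank queries so we never send an empty search term.
    terms = [t.strip() for t in search_terms if t.strip()]
    if not terms:
        raise ValueError("searchTerms needs at least one non-empty query")
    if not 10 <= max_results_per_search <= 500:
        raise ValueError("maxResultsPerSearch must be between 10 and 500")
    for name, year in (("yearFrom", year_from), ("yearTo", year_to)):
        if year is not None and not 1900 <= year <= 2100:
            raise ValueError(f"{name} must be between 1900 and 2100")
    # Assumption: an inverted year range is a caller mistake, so fail early.
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("yearFrom must not be later than yearTo")

    payload = {
        "searchTerms": terms,
        "maxResultsPerSearch": max_results_per_search,
        "language": language,
    }
    # Optional fields are omitted entirely when unset, matching "leave blank".
    if year_from is not None:
        payload["yearFrom"] = year_from
    if year_to is not None:
        payload["yearTo"] = year_to
    return payload
```

Calling `build_input(["diffusion models"], year_from=2020)` returns a dict that can be passed as the Actor input; out-of-range values raise `ValueError` before any paper is billed.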

### Sample output

```json
{
  "searchTerm": "transformer attention mechanism",
  "position": 1,
  "title": "Transformer architecture and attention mechanisms in genome data analysis: a comprehensive review",
  "link": "https://www.mdpi.com/2079-7737/12/7/1033",
  "publicationInfo": "SR Choi, M Lee - Biology, 2023 - mdpi.com",
  "snippet": "... the transformer architecture and the attention mechanism in specific application of transformers and attention methods ...",
  "year": 2023,
  "citedBy": 293,
  "htmlUrl": "https://www.mdpi.com/2079-7737/12/7/1033",
  "id": "LY1VJ0g70YsJ"
}
```

Papers that expose a PDF include a `pdfUrl` field pointing at the full text.
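
Records in this shape are easy to rank once downloaded. As a rough sketch, the helper below sorts dataset items by `citedBy` and can keep only those that carry a `pdfUrl`. The function name `top_cited` is hypothetical, and treating a missing `citedBy` as zero is an assumption about how you want such records ordered.

```python
def top_cited(items: list[dict], limit: int = 10, require_pdf: bool = False) -> list[dict]:
    """Return the most-cited records, optionally only those with a pdfUrl."""
    pool = [it for it in items if it.get("pdfUrl")] if require_pdf else list(items)
    # Assumption: a record without citedBy counts as zero citations and sorts last.
    return sorted(pool, key=lambda it: it.get("citedBy") or 0, reverse=True)[:limit]


if __name__ == "__main__":
    # Made-up records that only mimic the field names of the sample output.
    demo = [
        {"title": "Paper A", "citedBy": 12, "pdfUrl": "https://example.org/a.pdf"},
        {"title": "Paper B", "citedBy": 293},
        {"title": "Paper C"},
    ]
    for paper in top_cited(demo, limit=2):
        print(paper["title"], paper.get("citedBy"))
```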

### How to get started

1. Open [Google Scholar Lite on the Apify Store](https://apify.com/johnvc/google-scholar-lite-api?fpr=9n7kx3).
2. Enter one or more `searchTerms` (add `yearFrom` / `yearTo` to focus the range).
3. Set `maxResultsPerSearch`, then run the Actor.
4. Export the dataset as JSON, CSV, or Excel, or pull it from the API.

Prefer code? See [johnvc's GitHub for setup guides and code examples](https://github.com/johnisanerd/ApifyPublicData).

### Run from the API

```bash
curl -X POST "https://api.apify.com/v2/acts/johnvc~google-scholar-lite-api/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchTerms":["transformer attention mechanism"],"yearFrom":2020,"yearTo":2026,"maxResultsPerSearch":50}'
```
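
The same call can be made from Python without extra dependencies. This is a minimal sketch using only the standard library, mirroring the `curl` command above. The helper names `build_request` and `fetch_papers` and the 300-second timeout are assumptions for illustration, and the code assumes the endpoint responds with a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "johnvc~google-scholar-lite-api"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_papers(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Run the Actor synchronously and return the parsed response body."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `fetch_papers("YOUR_APIFY_TOKEN", {"searchTerms": ["transformer attention mechanism"], "maxResultsPerSearch": 50})` starts a billed run, so try it with a small `maxResultsPerSearch` first.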

### 🔌 Use this API from Claude (MCP)

This Actor is compatible with the Model Context Protocol (MCP), so AI agents can call it as a tool. Add it through the hosted Apify MCP server using this Actor-specific URL:

https://mcp.apify.com/?tools=actors,docs,johnvc/google-scholar-lite-api

If you run agents from [Claude Code](https://claude.ai/referral/uIlpa7nPLg) (free trial) or [Claude Cowork](https://claude.ai/referral/uIlpa7nPLg) (free trial), add the Apify MCP server and call this Actor directly with requests like "find the 50 most-cited papers on diffusion models since 2021."

Setup walkthrough:

https://www.youtube.com/watch?v=jREWahDGhJM

Apify MCP integration docs: https://docs.apify.com/platform/integrations/mcp

#### MCP setup, step by step

Visual setup guides for each client (source and more assets: [ApifyPublicData on GitHub](https://github.com/johnisanerd/ApifyPublicData)):

**Claude Cowork Desktop**

![Install in Claude Cowork Desktop](https://raw.githubusercontent.com/johnisanerd/ApifyPublicData/main/assets/guides/install_mcp_into_claude_desktop.png)

**Claude Code**

![Install in Claude Code](https://raw.githubusercontent.com/johnisanerd/ApifyPublicData/main/assets/guides/install_mcp_into_claude_code.png)

**Claude (website)**

![Install in Claude website](https://raw.githubusercontent.com/johnisanerd/ApifyPublicData/main/assets/guides/install_mcp_into_claude_ai.png)

**Cursor**

![Install in Cursor](https://raw.githubusercontent.com/johnisanerd/ApifyPublicData/main/assets/guides/install_mcp_into_cursor.png)

**ChatGPT**

![Install in ChatGPT](https://raw.githubusercontent.com/johnisanerd/ApifyPublicData/main/assets/guides/install_mcp_into_ChatGPT.png)

### FAQ

**Does it return full PDF text?** No. It returns the paper's metadata plus a `pdfUrl` link when one is available. Follow the link to fetch the PDF yourself.

**Does it return author profiles or h-index?** No. For author profiles, citation lookups, and co-author expansion, use our premium [Google Scholar API](https://apify.com/johnvc/google-scholar-api?fpr=9n7kx3).

**How many papers per search?** Set `maxResultsPerSearch` (minimum 10, maximum 500, default 100). Results come in pages of about 10, and a search stops early when the topic runs out of papers, so you only pay for what exists.

**Why did I get fewer results than I asked for?** `maxResultsPerSearch` is a ceiling, not a guarantee. Niche or tightly year-filtered queries simply have fewer matching papers.

**Can I search many topics at once?** Yes. Pass multiple `searchTerms`; each is searched independently and tagged with its source term in the output.

**Can I filter by year?** Yes. Set `yearFrom` and/or `yearTo` to bound the publication years.
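
Because every record carries the `searchTerm` it was found for, a multi-query run can be split back into per-topic lists after download. A small sketch, assuming `items` is the list of dataset records already loaded into memory; `group_by_search_term` is an illustrative helper, not part of the Actor:

```python
from collections import defaultdict


def group_by_search_term(items: list[dict]) -> dict[str, list[dict]]:
    """Bucket dataset records by the searchTerm that produced them."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for item in items:
        # Records without a searchTerm (not expected) land under an empty key.
        groups[item.get("searchTerm", "")].append(item)
    return dict(groups)


if __name__ == "__main__":
    # Made-up records that only mimic the output field names.
    demo = [
        {"searchTerm": "diffusion models", "title": "Paper A"},
        {"searchTerm": "transformer attention mechanism", "title": "Paper B"},
        {"searchTerm": "diffusion models", "title": "Paper C"},
    ]
    for term, papers in group_by_search_term(demo).items():
        print(term, len(papers))
```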

Last Updated: 2026.06.02

# Actor input Schema

## `searchTerms` (type: `array`):

Provide one or more search queries, for example 'transformer attention mechanism' or 'CRISPR gene editing'. Each query is searched independently and billed per paper returned.

## `yearFrom` (type: `integer`):

Limit results to papers published in or after this year, for example 2020. Leave blank for no lower bound.

## `yearTo` (type: `integer`):

Limit results to papers published in or before this year, for example 2026. Leave blank for no upper bound.

## `maxResultsPerSearch` (type: `integer`):

Set how many papers to return per search query. Results come in pages of about 10; the Actor pulls just enough pages to reach this count, then stops early when a query runs out of papers. Minimum 10, maximum 500, default 100.

## `language` (type: `string`):

Set the two-letter interface language code for results, for example 'en', 'es', 'de'. Defaults to 'en'.

## Actor input object example

```json
{
  "searchTerms": [
    "transformer attention mechanism"
  ],
  "maxResultsPerSearch": 100,
  "language": "en"
}
```

# Actor output Schema

## `papers` (type: `string`):

All paper records stored in the default dataset, one item per unique paper.

# API

You can run this Actor programmatically using our API. Below are code examples for JavaScript, Python, and the CLI, as well as the MCP server setup and the OpenAPI specification.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchTerms": [
        "transformer attention mechanism"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("johnvc/google-scholar-lite-api").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "searchTerms": ["transformer attention mechanism"] }

# Run the Actor and wait for it to finish
run = client.actor("johnvc/google-scholar-lite-api").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchTerms": [
    "transformer attention mechanism"
  ]
}' |
apify call johnvc/google-scholar-lite-api --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=johnvc/google-scholar-lite-api",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Google Scholar Lite - Cheap Bulk Academic Papers API",
        "description": "Search Google Scholar for academic papers in bulk and export clean JSON: title, authors, journal, year, citation count, and PDF links. Fast bibliometric search for literature reviews, citation discovery, and research datasets. Pay per paper from $1.50 per 1,000, with no setup or per-run fee.",
        "version": "0.0",
        "x-build-id": "pnlzBHcZhYfh0gWvg"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/johnvc~google-scholar-lite-api/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-johnvc-google-scholar-lite-api",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/johnvc~google-scholar-lite-api/runs": {
            "post": {
                "operationId": "runs-sync-johnvc-google-scholar-lite-api",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/johnvc~google-scholar-lite-api/run-sync": {
            "post": {
                "operationId": "run-sync-johnvc-google-scholar-lite-api",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "searchTerms"
                ],
                "properties": {
                    "searchTerms": {
                        "title": "Search Terms",
                        "minItems": 1,
                        "type": "array",
                        "description": "Provide one or more search queries, for example 'transformer attention mechanism' or 'CRISPR gene editing'. Each query is searched independently and billed per paper returned.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "yearFrom": {
                        "title": "Year From",
                        "minimum": 1900,
                        "maximum": 2100,
                        "type": "integer",
                        "description": "Limit results to papers published in or after this year, for example 2020. Leave blank for no lower bound."
                    },
                    "yearTo": {
                        "title": "Year To",
                        "minimum": 1900,
                        "maximum": 2100,
                        "type": "integer",
                        "description": "Limit results to papers published in or before this year, for example 2026. Leave blank for no upper bound."
                    },
                    "maxResultsPerSearch": {
                        "title": "Maximum Results Per Search",
                        "minimum": 10,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Set how many papers to return per search query. Results come in pages of about 10; the Actor pulls just enough pages to reach this count, then stops early when a query runs out of papers. Default 100.",
                        "default": 100
                    },
                    "language": {
                        "title": "Language",
                        "type": "string",
                        "description": "Set the two-letter interface language code for results, for example 'en', 'es', 'de'. Defaults to 'en'.",
                        "default": "en"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
