# OpenStax Open Textbooks Scraper (`parseforge/openstax-textbooks-scraper`) Actor

Browse openly licensed OpenStax textbooks by subject or free-text query. Each record returns title, subject, edition, authors, license, ISBN, pages, language, available reading formats, and URL. Useful for OER catalogs, curriculum planning, and edtech content discovery.

- **URL**: https://apify.com/parseforge/openstax-textbooks-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Education, Automation, Integrations
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $7.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
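
As a rough worked example of the per-result pricing above, the sketch below estimates what a run might cost. It assumes results are the only billable event and uses the listed $7.50 per 1,000 results as its default rate; the rate for your plan may differ, and `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal, ROUND_HALF_UP

# Listed rate from the Pricing section. Assumption: your plan may pay a different rate.
PRICE_PER_1000_RESULTS = Decimal("7.50")


def estimate_cost_usd(result_count: int,
                      price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate the charge for a run that returns `result_count` records."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = price_per_1000 * Decimal(result_count) / Decimal(1000)
    # Round to four decimal places so small runs do not collapse to zero.
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for count in (10, 50, 1000):
        print(f"{count} results: ${estimate_cost_usd(count)}")
```

Under these assumptions, 1,000 results come to $7.50 and a 50-record run to $0.375.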

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 📚 OpenStax Textbooks Scraper

> 🚀 **Export OpenStax records in seconds. Pipe results straight into your spreadsheet, dashboard, or data warehouse.**

> 🕒 **Last updated:** 2026-06-05 · **📊 10 fields** per record · Public OpenStax data · Real-time updates

The OpenStax Textbooks Scraper turns the public OpenStax CMS endpoint into a clean, structured dataset of open educational textbooks. Every record carries title, subject, edition, authors, license, ISBN, page count, language, available download formats, and the public OpenStax URL.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| 🎓 Students | Find free textbooks for class. |
| 👩‍🏫 Educators | Build reading lists from open content. |
| 📚 Librarians | Track new OpenStax releases. |
| 🤖 EdTech builders | Power discovery features with open data. |

### 📋 What the OpenStax Textbooks Scraper does

- Fetches the public OpenStax feed at `https://openstax.org/apps/cms/api/v2/pages/`.
- Parses the response and flattens each record into one structured row.
- Casts numeric values to numbers and dates to ISO strings.
- Surfaces upstream errors as a clean `error` record instead of crashing.
- Pushes everything to the dataset, ready for instant download.

> 💡 **Why it matters:** OpenStax publishes the data, but the raw response is awkward to work with. This Actor normalizes everything into a flat schema that drops straight into pandas, BigQuery, or a Google Sheet.
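
The normalization steps listed above (flatten, cast, surface errors) could look roughly like the sketch below. It is illustrative only and is not the Actor's source code: the raw keys read from the upstream payload (`page_count`, `authors[].name`) and the helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def to_int(value: Any) -> Optional[int]:
    """Cast a numeric-looking value to int; return None when it cannot be parsed."""
    try:
        return int(str(value).strip())
    except (TypeError, ValueError):
        return None


def now_iso() -> str:
    """Current UTC time as an ISO string."""
    return datetime.now(timezone.utc).isoformat()


def flatten_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one raw record into a flat row (raw key names are hypothetical)."""
    authors = [
        author["name"]
        for author in raw.get("authors", [])
        if isinstance(author, dict) and author.get("name")
    ]
    return {
        "title": raw.get("title"),
        "authors": authors,
        "pages": to_int(raw.get("page_count")),
        "scrapedAt": now_iso(),
        "error": None,  # kept last, matching the output convention
    }


def error_record(message: str) -> Dict[str, Any]:
    """Represent an upstream failure as a row instead of raising."""
    return {"scrapedAt": now_iso(), "error": message}
```

Returning an error row rather than raising keeps a partial run usable: downstream code can filter on `error` instead of losing the whole dataset.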

### 🎬 Full Demo

_🚧 Coming soon._

### ⚙️ Input

See the Input tab in Apify Console for the full list of supported filters. Every filter is optional. `maxItems` controls how many records are returned.

**Example**
```json
{
  "maxItems": 50
}
````

> ⚠️ **Good to Know.** Free users are capped at 10 records per run as a preview. Paid users can pull up to 1,000,000 records.

### 📊 Output

Each record is a flat object. The `error` field is always last.

| Field | Type | Description |
|---|---|---|
| 📚 `title` | string | Textbook title. |
| 🏷️ `subject` | string | Subject area. |
| 📖 `edition` | string | Edition label. |
| ✍️ `authors` | array | List of author names. |
| ⚖️ `license` | string | Creative Commons license. |
| 🔢 `isbn` | string | ISBN identifier if available. |
| 📄 `pages` | number | Page count. |
| 🗣️ `language` | string | Primary language. |
| 📥 `downloadFormats` | array | Available download formats. |
| 🔗 `url` | string | Public OpenStax URL. |
| 🕒 `scrapedAt` | string | When this row was fetched. |
| ❌ `error` | string | Set if the upstream response was an error. |
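
For consumers of the dataset, the table above can be mirrored as a typed record, and error rows can be separated before analysis. This is a sketch under one assumption: that `error` is absent, null, or empty on successful rows. `TextbookRecord` and `split_records` are hypothetical names.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple, TypedDict


class TextbookRecord(TypedDict, total=False):
    """Shape of one dataset row, following the output table."""
    title: str
    subject: str
    edition: str
    authors: List[str]
    license: str
    isbn: Optional[str]
    pages: Optional[int]
    language: str
    downloadFormats: List[str]
    url: str
    scrapedAt: str
    error: Optional[str]


def split_records(
    items: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Return (successful rows, error rows) based on the `error` field."""
    ok: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for item in items:
        # Assumption: a truthy `error` value marks a failed upstream response.
        (failed if item.get("error") else ok).append(item)
    return ok, failed
```

Splitting first means that numeric work on `pages` or grouping by `subject` never has to special-case a row that only carries an error message.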

### ✨ Why choose this Actor

|  | Benefit |
|---|---|
| 🆓 | Works with the free Apify plan (10-record preview). |
| 🧹 | Clean camelCase keys (for example `downloadFormats`) ready for BI tools. |
| 🔢 | Auto-casts numeric and date fields. |
| 🛟 | Surfaces upstream errors as a clean record. |
| 💾 | Push to dataset and download in any supported format. |

### 📈 How it compares to alternatives

| Approach | Setup time | Clean keys | Numeric casting | Error handling |
|---|---|---|---|---|
| Roll your own fetch | 30 min + | No | No | No |
| **This Actor** | 5 sec, no install | Yes | Yes | Yes |

### 🚀 How to use

1. Click **Try for free**.
2. Adjust the input filters or leave defaults.
3. Click **Start**. Within seconds, your dataset is ready.

### 💼 Business use cases

**🎓 Course planning.** Pull every OpenStax title in a subject and pick the right edition for your syllabus.

**📚 Library catalogs.** Sync OpenStax metadata into your library system on a schedule.

**🤖 EdTech discovery.** Power search and recommendation features in your learning app.

**🌍 Translation projects.** Identify titles by language to coordinate volunteer translation efforts.
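
As one illustration of the translation use case, dataset rows can be grouped by their `language` field. The sketch below works on plain dictionaries and makes no assumption about how language values are spelled (a code or a full name); `titles_by_language` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def titles_by_language(items: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group textbook titles by language, skipping error rows and untitled rows."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for item in items:
        if item.get("error") or not item.get("title"):
            continue
        # Rows without a language value are collected under "unknown".
        groups[item.get("language") or "unknown"].append(item["title"])
    return {language: sorted(titles) for language, titles in groups.items()}
```

The function accepts any iterable of rows, so it can be fed the items returned by the API client or a downloaded JSON export.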

### 🔌 Automating OpenStax Textbooks Scraper

- **Make / Zapier.** Trigger this Actor on a schedule and push results to Airtable, Slack, or your CRM.
- **Cron schedule.** Apify's native scheduler runs this on whatever cadence you need.
- **Webhooks.** Get a POST to your endpoint the moment a run finishes.
- **Pipe to your warehouse.** Native Apify integrations move datasets straight into BigQuery, Snowflake, or Postgres.
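
For the webhook option above, a receiving endpoint mainly needs to pull the dataset ID out of the POST body. The sketch below assumes the webhook uses Apify's default payload template, in which `eventType` names the event and `resource` holds the run object; if you customize the payload template, adjust the keys. `dataset_id_from_webhook` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def dataset_id_from_webhook(payload: Dict[str, Any]) -> Optional[str]:
    """Return the run's default dataset ID for a succeeded run, else None.

    Assumes the default webhook payload shape (`eventType`, `resource`).
    """
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

Returning `None` for any other event lets the handler ignore failed or aborted runs without raising.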

### 🌟 Beyond business use cases

**🎓 Education.** Use real public data for classroom projects.

**🧪 Personal research.** Build your own dashboards and notebooks.

**🤝 Non-profit & open data.** Power public dashboards without writing client code.

**🧰 Tinkering & prototyping.** Spin up a fresh data feed in seconds.

### 🤖 Ask an AI assistant about this scraper

Pop this README into ChatGPT, Claude, or any AI assistant and ask it to map your specific workflow to the Actor's inputs.

### ❓ Frequently Asked Questions

**❓ Is the data free to use?** OpenStax publishes everything under Creative Commons licenses. Check each record's `license` field for specifics.

**❓ How fresh is the data?** Pulled live from the OpenStax CMS API on every run.

**❓ Can I filter by subject?** Yes. Set the `subject` input, which appears as a dropdown in the Input tab.

**❓ Are all formats listed?** Yes. The `downloadFormats` array surfaces every download option.

**❓ Does this need an API key?** No. The OpenStax API is fully public.

**❓ Can I schedule runs?** Yes, via Apify's native scheduler or Make / Zapier.

**❓ Will the schema change?** The core fields are stable.

**❓ Is this scraping or API?** API. OpenStax exposes a public CMS endpoint.

**❓ What if a field is null?** Some optional fields (ISBN, pages) are only set when OpenStax publishes them.

**❓ What output format can I download?** Every Apify-supported export format is available straight from the dataset UI.

### 🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST API or webhook endpoint.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [ParseForge Alpha Vantage Scraper](https://apify.com/parseforge) | Market data, FX, crypto. |
| [ParseForge OurAirports Scraper](https://apify.com/parseforge/ourairports-scraper) | Global airport database. |
| [ParseForge NBA Stats Scraper](https://apify.com/parseforge/nba-stats-scraper) | Player and team stats from NBA.com. |
| [ParseForge CurseForge Mods Scraper](https://apify.com/parseforge/curseforge-mods-scraper) | Public mod metadata. |

> 💡 **Pro Tip.** Browse the complete [ParseForge collection](https://apify.com/parseforge) for 900+ production-grade scrapers across business intelligence, real estate, e-commerce, sports, finance, and public records.

***

**Disclaimer.** This Actor scrapes only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law. [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp).

# Actor input Schema

## `subject` (type: `string`):

Filter textbooks by subject area.

## `query` (type: `string`):

Free-text search across textbook titles.

## `maxItems` (type: `integer`):

Free users: limited to 10 items (preview). Paid users: optional, up to 1,000,000 items.

## Actor input object example

```json
{
  "subject": "all",
  "maxItems": 10
}
```
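
Validating input locally can catch mistakes before a run is started. The sketch below checks the three inputs against the `subject` values and the `maxItems` bounds (1 to 1,000,000) given in the OpenAPI specification in this document; `build_input` itself is a hypothetical helper.

```python
from typing import Any, Dict, Optional

# Allowed `subject` values, copied from the input schema's enum.
SUBJECTS = (
    "all", "math", "science", "social-sciences", "humanities", "business",
    "college-success", "ap", "essentials", "high-school", "nursing",
)
MAX_ITEMS_LIMIT = 1_000_000


def build_input(subject: str = "all",
                query: Optional[str] = None,
                max_items: Optional[int] = None) -> Dict[str, Any]:
    """Build an Actor input object, rejecting values outside the schema."""
    if subject not in SUBJECTS:
        raise ValueError(f"Unknown subject {subject!r}; expected one of {SUBJECTS}")
    run_input: Dict[str, Any] = {"subject": subject}
    if query and query.strip():
        run_input["query"] = query.strip()
    if max_items is not None:
        is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
        if not is_int or not 1 <= max_items <= MAX_ITEMS_LIMIT:
            raise ValueError(f"maxItems must be an integer from 1 to {MAX_ITEMS_LIMIT}")
        run_input["maxItems"] = max_items
    return run_input
```

For example, `build_input("math", "calculus", 50)` produces `{"subject": "math", "query": "calculus", "maxItems": 50}`, which can be passed as the run input in the API examples.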

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/openstax-textbooks-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "maxItems": 10 }

# Run the Actor and wait for it to finish
run = client.actor("parseforge/openstax-textbooks-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
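
A slightly more defensive variant of the Python example is sketched below. It reads the token from an environment variable instead of hard-coding it and checks the run status before reading the dataset. The variable name `APIFY_TOKEN` and the helper names are assumptions made for this sketch; `apify_client` is imported inside the function so the module still loads where the package is not installed.

```python
import os
from typing import Any, Dict, List, Mapping, Optional

ACTOR_ID = "parseforge/openstax-textbooks-scraper"


def get_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the API token from the environment (APIFY_TOKEN is an assumed name)."""
    source = os.environ if env is None else env
    token = source.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set the APIFY_TOKEN environment variable first.")
    return token


def fetch_textbooks(run_input: Dict[str, Any], token: str) -> List[Dict[str, Any]]:
    """Run the Actor, verify that the run succeeded, and return its dataset items."""
    from apify_client import ApifyClient  # requires `pip install apify-client`

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if not run or run.get("status") != "SUCCEEDED":
        status = run.get("status") if run else "unknown"
        raise RuntimeError(f"Run did not succeed (status: {status}).")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A typical call would be `fetch_textbooks({"maxItems": 10}, get_token())`. Keeping the token out of source files avoids committing it to version control by accident.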

## CLI example

```bash
echo '{
  "maxItems": 10
}' |
apify call parseforge/openstax-textbooks-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/openstax-textbooks-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "OpenStax Open Textbooks Scraper",
        "description": "Browse OpenStax open license textbooks by subject or free text query. Each record returns title, subject, edition, authors, license, isbn, pages, language, available reading formats, and url. Useful for OER catalogs, curriculum planning, and edtech content discovery.",
        "version": "0.1",
        "x-build-id": "F3nBFnoWInSCzZK0x"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~openstax-textbooks-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-openstax-textbooks-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~openstax-textbooks-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-openstax-textbooks-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~openstax-textbooks-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-openstax-textbooks-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "subject": {
                        "title": "Subject",
                        "enum": [
                            "all",
                            "math",
                            "science",
                            "social-sciences",
                            "humanities",
                            "business",
                            "college-success",
                            "ap",
                            "essentials",
                            "high-school",
                            "nursing"
                        ],
                        "type": "string",
                        "description": "Filter textbooks by subject area.",
                        "default": "all"
                    },
                    "query": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Free-text search across textbook titles."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
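
For projects that do not use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library and follows the spec: a POST with a JSON body and the token as a query parameter. It assumes the response body is a JSON array of dataset items; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

BASE_URL = "https://api.apify.com/v2"
SYNC_PATH = "/acts/parseforge~openstax-textbooks-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request without sending it, which keeps it easy to inspect."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{BASE_URL}{SYNC_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: Dict[str, Any], timeout: float = 300.0) -> List[Dict[str, Any]]:
    """Send the request and decode the response (assumed to be a JSON array)."""
    request = build_sync_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because this call blocks until the run finishes, it suits small requests such as `run_sync(token, {"maxItems": 10})`; for long runs, the `/runs` path returns immediately with run information instead.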
