DLMF NIST Math Functions Scraper
Pricing
Pay per event
Scrapes the NIST Digital Library of Mathematical Functions (DLMF) for structured equation data: MathML, LaTeX source, constraints, and referenced functions — across all 36 chapters and hundreds of sections.
Developer
BowTiedRaccoon
Maintained by Community
Last modified: 3 days ago
Scrapes the NIST Digital Library of Mathematical Functions (DLMF) — the authoritative reference for special functions in mathematics and physics — to produce a structured, machine-readable corpus of numbered equations with MathML, LaTeX source, and associated metadata.
What it does
The actor performs a three-level hierarchical crawl:
- Index — discovers all 36 DLMF chapters from the homepage
- Chapter pages — discovers all sections within each chapter (typically 10–20 sections)
- Section pages — extracts every numbered equation, including MathML, LaTeX source, plain-text rendering, referenced symbols, and the canonical permalink
Across all 36 chapters the DLMF contains approximately 5,000–10,000 numbered equations. A full crawl completes in minutes at the default concurrency.
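The section-page step recovers LaTeX from the `alttext` attribute of each equation's MathML. The sketch below illustrates that idea with the Python standard library only. The function name `tex_from_mathml` and the sample markup are hypothetical; the sample is hand-written for illustration and is not copied from a DLMF page, and the actor's own extraction code may work differently.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def tex_from_mathml(mathml: str) -> Optional[str]:
    """Return the alttext of the first <math> element, or None if absent.

    Hypothetical helper: it matches on the local tag name so that both
    namespaced and bare <math> elements are accepted.
    """
    root = ET.fromstring(mathml)
    for element in root.iter():  # iter() yields the root element first
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "math":
            return element.get("alttext")
    return None


# Hand-written sample markup (an assumption, not real DLMF output).
sample = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML" '
    'alttext="a+b" display="block">'
    "<mi>a</mi><mo>+</mo><mi>b</mi></math>"
)

print(tex_from_mathml(sample))
```

Reading the attribute avoids converting presentation MathML back to LaTeX, which is why the `equation_tex` field can only be as complete as the `alttext` the page provides.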
Output fields
| Field | Description |
|---|---|
| chapter | Chapter number (integer, 1–36) |
| section | Section identifier, e.g. 1.2 |
| title | Section title, e.g. Elementary Algebra |
| equation_number | DLMF equation number, e.g. 1.2.1 |
| equation_mathml | Full MathML XML for the equation |
| equation_tex | LaTeX source recovered from MathML alttext attribute |
| equation_text | Unicode plain-text rendering of the equation |
| constraints | Constraint text associated with the equation (if any) |
| referenced_functions | Pipe-separated list of symbol/function names referenced |
| url | Canonical DLMF permalink, e.g. http://dlmf.nist.gov/1.2.E1 |
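The `chapter`, `section`, `equation_number`, and `url` fields appear to be related by a simple pattern: equation 1.2.1 sits in section 1.2 of chapter 1 and has the permalink ending in 1.2.E1. The following sketch encodes that relationship. It is inferred only from the examples in this README, so treat the three-part `chapter.section.index` shape as an assumption; the helper names are hypothetical and are not part of the actor.

```python
from typing import Tuple

BASE_URL = "http://dlmf.nist.gov"  # host as shown in the url field example


def split_equation_number(equation_number: str) -> Tuple[int, str, int]:
    """Split 'chapter.section.index' into (chapter, section id, index).

    Assumes exactly three dot-separated integer parts; anything else
    raises ValueError.
    """
    chapter, section, index = equation_number.split(".")
    return int(chapter), f"{int(chapter)}.{int(section)}", int(index)


def permalink(equation_number: str) -> str:
    """Build the permalink implied by the README's example record."""
    _, section, index = split_equation_number(equation_number)
    return f"{BASE_URL}/{section}.E{index}"


print(split_equation_number("1.2.1"))
print(permalink("1.2.1"))
```

A check like this can be handy for spotting records whose `url` and `equation_number` disagree after a crawl.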
Use cases
- Symbolic math / CAS training data — verified special-function formulas with LaTeX and MathML
- RAG / vector search corpora — ground-truth equation database for scientific-computing AI agents
- Formula search engines — structured index of equations by chapter/section with canonical IDs
- Verification datasets — NIST-authoritative identities for function evaluations
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| maxItems | integer | 10 | Maximum number of equations to scrape (0 = unlimited) |
| startChapter | integer | 1 | First chapter to crawl (1–36) |
| endChapter | integer | 36 | Last chapter to crawl (1–36, omit for all) |
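As an illustration of how these parameters fit together, here is a minimal sketch that assembles a run input and checks it against the ranges in the table. The function `build_run_input` is hypothetical, and the rule that `startChapter` must not exceed `endChapter` is an assumption; the actor may handle that case differently.

```python
import json
from typing import Dict

FIRST_CHAPTER = 1
LAST_CHAPTER = 36


def build_run_input(
    max_items: int = 10, start_chapter: int = 1, end_chapter: int = 36
) -> Dict[str, int]:
    """Return an input object using the documented parameter names and defaults."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    for name, value in (("startChapter", start_chapter), ("endChapter", end_chapter)):
        if not FIRST_CHAPTER <= value <= LAST_CHAPTER:
            raise ValueError(f"{name} must be between 1 and 36, got {value}")
    # Assumption: an inverted range is treated as a caller error.
    if start_chapter > end_chapter:
        raise ValueError("startChapter must not be greater than endChapter")
    return {
        "maxItems": max_items,
        "startChapter": start_chapter,
        "endChapter": end_chapter,
    }


# Crawl all of chapter 1 with no item limit.
print(json.dumps(build_run_input(max_items=0, start_chapter=1, end_chapter=1)))
```

Note that the default `maxItems` of 10 stops the run after ten equations, so a complete chapter or full-library crawl needs `maxItems` set to 0.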
Example output record
{
  "chapter": 1,
  "section": "1.2",
  "title": "Elementary Algebra",
  "equation_number": "1.2.1",
  "equation_tex": "\\genfrac{(}{)}{0.0pt}{}{n}{k}=\\frac{n!}{(n-k)!k!}",
  "equation_text": "(nk)=n!/(n−k)!k!",
  "constraints": "",
  "referenced_functions": "(mn): binomial coefficient | !: factorial (as in n!) | n: nonnegative integer",
  "url": "http://dlmf.nist.gov/1.2.E1"
}
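Because `referenced_functions` is a single pipe-separated string, downstream code usually needs to split it before use. The sketch below does that for the value in the example record. It assumes entries are separated by `" | "` and shaped like `symbol: description`, which matches this one example but is not guaranteed for every record; the function name is hypothetical.

```python
from typing import Dict


def parse_referenced_functions(value: str) -> Dict[str, str]:
    """Split a referenced_functions string into {symbol: description}.

    Assumes ' | ' between entries and ': ' between symbol and description.
    Entries without ': ' are kept with an empty description.
    """
    entries: Dict[str, str] = {}
    for item in value.split(" | "):
        if not item.strip():
            continue  # an empty field yields no entries
        symbol, _, description = item.partition(": ")
        entries[symbol.strip()] = description.strip()
    return entries


# Value copied from the example output record.
example = (
    "(mn): binomial coefficient | !: factorial (as in n!) | "
    "n: nonnegative integer"
)

for symbol, description in parse_referenced_functions(example).items():
    print(f"{symbol!r} -> {description}")
```

Splitting on the first `": "` only keeps descriptions that themselves contain a colon intact, at the cost of mis-parsing any symbol that contains one.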
Notes
- The DLMF is a US government publication (NIST). Content is in the public domain.
- No proxy required — dlmf.nist.gov is a clean US gov host with no anti-bot measures.
- Chapter 1 alone contains ~180 equations across 18 sections. A full 36-chapter run (with maxItems set to 0) yields ~5,000+ records.
- Sections containing only notation tables (no numbered equations) return 0 results — this is expected.