Patents to Markdown for RAG
Pricing
from $40.00 / 1,000 document/chunks
Developer
NexGenData
Maintained by Community
📑 Patents to Markdown for RAG
Convert patents (US/EP/WO) into clean, chunked Markdown for RAG and LLM pipelines via Google Patents — abstract, claims, description.
⚡ What you get
One row per chunk, with the fields source, url, title, chunkIndex, totalChunks, and markdown. The markdown is LLM-ready, and the source URL doubles as the citation.
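As a rough illustration, the row shape could be modeled in Python as follows. The field names come from the list above; the types, the `ChunkRow` name, and the `to_prompt_block` helper are assumptions made for this sketch and are not part of the actor. It also assumes chunkIndex is zero-based, as the sample output suggests.

```python
from typing import TypedDict


class ChunkRow(TypedDict):
    """One output row. Field names follow the listing; the types are assumed."""
    source: str
    url: str
    title: str
    chunkIndex: int
    totalChunks: int
    markdown: str


def to_prompt_block(row: ChunkRow) -> str:
    """Format a chunk for an LLM prompt, using the source URL as its citation.

    Hypothetical helper; assumes chunkIndex is zero-based.
    """
    position = f"{row['chunkIndex'] + 1}/{row['totalChunks']}"
    header = f"[{row['source']} | chunk {position} | {row['url']}]"
    return f"{header}\n{row['markdown']}"
```

Keeping the citation in a one-line header makes it easy for a downstream prompt to ask the model to quote the bracketed source when it answers.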
🎯 Use cases
1. RAG over this content
2. Vector-store ingestion
3. Searchable knowledge bases
4. Citation-tagged LLM data
🚀 Sample inputs
{ "items": ["US10000000B2", "US9876543B2"], "chunkWords": 800 }
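An input object of this shape could be assembled with a small helper. This is only a sketch: `build_input` is a hypothetical name, the validation rules are assumptions, and the default of 800 mirrors the sample rather than any documented default.

```python
def build_input(items: list[str], chunk_words: int = 800) -> dict:
    """Build an input object shaped like the sample (validation rules are assumed)."""
    # Strip whitespace, then drop blanks and duplicates while keeping order.
    unique: dict[str, None] = {}
    for item in items:
        cleaned = item.strip()
        if cleaned:
            unique.setdefault(cleaned, None)
    if not unique:
        raise ValueError("at least one patent ID is required")
    if chunk_words <= 0:
        raise ValueError("chunk_words must be positive")
    return {"items": list(unique), "chunkWords": chunk_words}
```

For example, `build_input(["US10000000B2", " US9876543B2 ", "US10000000B2"])` yields the same object as the sample above.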
📦 Sample output
{ "source": "US10000000B2", "title": "...", "chunkIndex": 0, "totalChunks": 8, "markdown": "# ...\n..." }
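Because each row carries chunkIndex and totalChunks, a consumer can regroup rows per patent and check that nothing is missing before indexing. A minimal sketch, assuming zero-based indices; `group_by_source` and `missing_chunks` are hypothetical helper names.

```python
from collections import defaultdict


def group_by_source(rows: list[dict]) -> dict[str, list[dict]]:
    """Group rows per patent and order each group by chunkIndex."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        grouped[row["source"]].append(row)
    return {
        source: sorted(chunks, key=lambda r: r["chunkIndex"])
        for source, chunks in grouped.items()
    }


def missing_chunks(chunks: list[dict]) -> list[int]:
    """Return the chunk indices absent from one patent's rows (assumes 0-based indices)."""
    if not chunks:
        return []
    expected = set(range(chunks[0]["totalChunks"]))
    present = {chunk["chunkIndex"] for chunk in chunks}
    return sorted(expected - present)
```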
🛠 How it works
1. Fetch each source.
2. Isolate the main document.
3. Convert the HTML to ATX Markdown.
4. Split into chunks of roughly chunkWords words.
5. Emit one row per chunk, with its citation.
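The chunking and row-emission steps can be sketched as below. This is not the actor's actual code: it is one plausible approach that packs paragraphs up to a word budget and falls back to fixed word windows for oversized paragraphs. The names `chunk_markdown` and `to_rows` are hypothetical.

```python
def chunk_markdown(markdown: str, chunk_words: int = 800) -> list[str]:
    """Pack paragraphs into chunks of at most chunk_words whitespace-separated words."""
    if chunk_words <= 0:
        raise ValueError("chunk_words must be positive")
    chunks: list[str] = []
    current: list[str] = []  # paragraphs in the chunk being built
    count = 0                # words accumulated in `current`
    for para in markdown.split("\n\n"):
        words = para.split()
        if not words:
            continue
        # Flush the open chunk when this paragraph would overshoot the budget.
        if current and count + len(words) > chunk_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        if len(words) > chunk_words:
            # Oversized paragraph: fall back to fixed-size word windows.
            for i in range(0, len(words), chunk_words):
                chunks.append(" ".join(words[i:i + chunk_words]))
            continue
        current.append(para.strip())
        count += len(words)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def to_rows(source: str, url: str, title: str, markdown: str,
            chunk_words: int = 800) -> list[dict]:
    """Emit one row per chunk, each carrying the source URL as its citation."""
    pieces = chunk_markdown(markdown, chunk_words)
    return [
        {"source": source, "url": url, "title": title,
         "chunkIndex": i, "totalChunks": len(pieces), "markdown": piece}
        for i, piece in enumerate(pieces)
    ]
```

Breaking at paragraph boundaries keeps headings and claims intact where possible, at the price of chunks that are somewhat shorter than the budget.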
💰 Pricing Example
Pay-per-event: $0.005 per run + $0.04 per document/chunk (document-record).
| Chunks | Cost |
|---|---|
| 100 | ~$4.00 |
| 500 | ~$20.00 |
| 2,000 | ~$80.00 |

Apify's $5 free credit covers ~124 chunks.
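
The figures above follow from the two per-event fees, and a small estimator makes the arithmetic explicit. The fee constants are taken from the pricing line; the function names are hypothetical, and the sketch assumes no other charges apply.

```python
from decimal import Decimal

RUN_FEE = Decimal("0.005")   # per run, from the pricing line above
CHUNK_FEE = Decimal("0.04")  # per document/chunk, from the pricing line above


def estimate_cost(chunks: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for a number of chunks spread over `runs` runs."""
    return RUN_FEE * runs + CHUNK_FEE * chunks


def chunks_within_budget(budget: Decimal, runs: int = 1) -> int:
    """Largest whole number of chunks that fits in `budget` after run fees."""
    remaining = budget - RUN_FEE * runs
    if remaining <= 0:
        return 0
    return int(remaining // CHUNK_FEE)
```

With a single run, `estimate_cost(100)` comes to $4.005, which the table rounds to ~$4.00, and `chunks_within_budget(Decimal("5"))` gives 124, matching the free-credit figure.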
⚖️ Legal & data sources
Fetches publicly accessible documents with an identified User-Agent. Output includes source URLs for attribution.
❓ FAQ
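- Citations? Yes. Every row includes its source URL.
- Chunk size? Set by chunkWords.
- Fresh? Yes. Documents are fetched live.
- Key? No key is required.
- Inputs? Public HTML pages.
- Dedup? Per run.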
🆘 Troubleshooting
- Empty markdown → JS-rendered/restricted page.
- Boilerplate → use the canonical URL.
- Huge → lower inputs/chunkWords.
- 404 → check the URL/ID.
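For the empty-markdown case, it can help to separate usable rows from the sources that need a second look before anything reaches a vector store. A minimal sketch; `split_empty` is a hypothetical helper, and treating whitespace-only markdown as empty is an assumption.

```python
def split_empty(rows: list[dict]) -> tuple[list[dict], list[str]]:
    """Separate usable rows from sources whose markdown came back empty."""
    usable: list[dict] = []
    empty_sources: list[str] = []
    for row in rows:
        # Missing, None, and whitespace-only markdown all count as empty here.
        if (row.get("markdown") or "").strip():
            usable.append(row)
        elif row["source"] not in empty_sources:
            empty_sources.append(row["source"])
    return usable, empty_sources
```

The returned list of sources can then be checked by hand against the causes listed above.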
🏷️ About NexGenData
Public-data tools for analysts, developers, and operators. thenextgennexus.com