California & SF Construction Safety Scraper
Pricing
from $0.01 / 1,000 results
Extracts official Cal/OSHA Title 8 safety orders and construction regulations. Designed for LLM/RAG compliance checking. Returns clean JSON with full text.
Developer: Egor Manchulyantsev
Last modified: 15 days ago
California & SF Construction Regulations Scraper
🚧 What does it do?
This Actor scrapes detailed construction safety orders and building regulations from official California government sources (specifically Cal/OSHA Title 8).
It is designed to be the "Data Acquisition Layer" for Construction AI agents, RAG pipelines, and compliance checking tools. Unlike general scrapers, this actor:
- Filters out junk: Ignores "Contact Us", "Jobs", and unrelated pages.
- Deep crawling: Follows links up to 3 levels deep to reach the actual text of the law (Section/Article level).
- Clean Output: Returns structured JSON ready for Vector Databases (Pinecone, Milvus, etc.).
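The filtering and depth-limited crawling described above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: the keyword list, the `is_junk` and `crawl` names, and the `get_links` callback are all assumptions made for the example.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical values: the Actor's real filter rules are not documented here.
JUNK_KEYWORDS = ("contact", "jobs", "careers")
MAX_DEPTH = 3


def is_junk(url: str) -> bool:
    """Return True if the URL path contains one of the junk keywords."""
    path = urlparse(url).path.lower()
    return any(keyword in path for keyword in JUNK_KEYWORDS)


def crawl(start_url, get_links, max_depth=MAX_DEPTH):
    """Breadth-first crawl that stops expanding pages at max_depth.

    get_links(url) is a caller-supplied function returning the outgoing
    links of a page, so the traversal can be exercised without network access.
    Returns a list of (url, depth) pairs in visit order.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth == max_depth:
            continue  # deepest level reached: do not follow further links
        for link in get_links(url):
            if link in seen or is_junk(link):
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return visited
```

Passing `get_links` in as a parameter keeps the traversal logic separate from HTTP fetching and HTML parsing, which a real crawler would handle with its own libraries.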
🎯 Use Cases
- AI Compliance Agents: Feed this data into an LLM to answer questions like "What is the railing height requirement in SF?".
- Construction Bidding: Automatically check if a project plan meets specific safety codes.
- Legal Monitoring: Track changes in Title 8 Safety Orders.
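For the legal monitoring use case, one possible approach is to fingerprint the text of each section and compare two runs. The sketch below assumes each dataset item carries the `source_url` and `text` fields shown in the output example; the function names `fingerprint` and `diff_runs` are hypothetical and not part of the Actor.

```python
import hashlib


def fingerprint(item: dict) -> str:
    """Hash the legal text so two versions of a section can be compared cheaply."""
    return hashlib.sha256(item["text"].encode("utf-8")).hexdigest()


def diff_runs(old_items, new_items):
    """Compare two lists of dataset items, keyed by source_url."""
    old = {item["source_url"]: fingerprint(item) for item in old_items}
    new = {item["source_url"]: fingerprint(item) for item in new_items}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            url for url in old.keys() & new.keys() if old[url] != new[url]
        ),
    }
```

Any whitespace difference in the scraped text would register as a change with this approach, so normalizing the text before hashing may be worthwhile in practice.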
⚙️ Input Configuration
You can configure the target region. Currently optimized for San Francisco / California.
Example Input (JSON):
{
  "city": "San Francisco",
  "state": "CA",
  "max_pages": 99999
}

📦 Output Format
The actor stores results in the Apify Dataset. Each item represents a specific regulation section containing the full legal text.
Example JSON Result:
{
  "source_url": "https://www.dir.ca.gov/title8/1620.html",
  "title": "§1620. Design and Construction of Guardrails.",
  "text": "A. Railing specifications... (full legal text here)...",
  "region": "San Francisco, CA",
  "segment": "cal_osha_detailed",
  "timestamp": "2025-12-06T20:30:00+00:00"
}

🚀 How to use
1. Click the green Start button.
2. Wait for the run to finish (it scans dir.ca.gov).
3. Go to the Storage tab -> Dataset to view results.
4. Click Export to download the data in JSON, CSV, or Excel format.
5. (Optional) Use the API to integrate this data directly into your AI application.
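Once the dataset has been exported as JSON, each item can be split into chunks and turned into records suitable for a vector database. The sketch below is one way to do that under stated assumptions: the record layout (`id`, `text`, `metadata`), the chunk sizes, and the helper names `chunk_text` and `to_records` are illustrative choices, not something the Actor or any particular vector database prescribes.

```python
import hashlib


def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100):
    """Split text into overlapping character windows."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so neighbouring chunks share context
    return chunks


def to_records(item: dict, max_chars: int = 1000, overlap: int = 100):
    """Turn one dataset item into a list of chunk records with metadata."""
    records = []
    for n, chunk in enumerate(chunk_text(item["text"], max_chars, overlap)):
        # Deterministic ID so re-running on the same item overwrites old records.
        key = f'{item["source_url"]}#{n}'
        records.append({
            "id": hashlib.sha1(key.encode("utf-8")).hexdigest(),
            "text": chunk,
            "metadata": {
                "source_url": item["source_url"],
                "title": item.get("title"),
                "region": item.get("region"),
                "segment": item.get("segment"),
                "timestamp": item.get("timestamp"),
                "chunk": n,
            },
        })
    return records
```

The records still need embeddings before they can be upserted; that step depends on the embedding model and database client in use and is left out here.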