Apache & Nginx Log Parser

Parse Apache, Nginx, and IIS access logs into structured JSON. Extracts IPs, timestamps, HTTP methods, paths, status codes, bytes, user agents, and referrers. Includes traffic analytics.

Pricing: Pay per event
Developer: Stas Persiianenko (Maintained by Community)

🪵 Apache & Nginx Log Parser

Parse Apache access logs, Nginx logs, and IIS W3C logs in seconds. Extract structured data — IP addresses, timestamps, HTTP methods, paths, status codes, user agents — and get instant traffic analytics without installing any tools.

No proxy required. No web scraping. Pure computation.


🔍 What does it do?

Apache & Nginx Log Parser reads raw access log files and converts them into clean, structured records you can filter, sort, and export. It works with:

  • Apache Combined Log Format (the default for most Apache 2.x installs)
  • Nginx access logs (same format by default)
  • Common Log Format (CLF) — legacy Apache without referrer/user-agent
  • IIS W3C Extended Log Format — Microsoft IIS server logs

Give it a log file URL or paste raw log content, and it returns:

  1. One structured record per log line — IP, timestamp (ISO 8601), method, path, status code, response bytes, referrer, user agent
  2. A summary record — top pages, top IPs, status code distribution, hourly traffic breakdown, top user agents, HTTP method breakdown
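For illustration, here is how a single Combined Log Format line maps onto that per-line record shape, sketched in Python with a regex. This is a standalone example, not the actor's internal code; note that the regex yields strings, whereas the actor reports statusCode and responseBytes as numbers.

```python
import re

# Illustrative regex for Apache/Nginx Combined Log Format
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<statusCode>\d{3}) (?P<responseBytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<userAgent>[^"]*)"'
)

line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
        '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"')

record = COMBINED.match(line).groupdict()
print(record['ip'], record['method'], record['path'], record['statusCode'])
# → 127.0.0.1 GET /apache_pb.gif 200
```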

👤 Who is it for?

🔧 SEO Auditors

Identify your top crawled pages, detect Googlebot activity, find 404s and redirect chains — all from raw access logs without Google Analytics or Search Console.

🖥️ DevOps & SysAdmins

Quickly diagnose traffic spikes, identify bad bots hammering your server, find which endpoints are slowest, and spot error patterns — without SSH access or log aggregation tools.

📊 Data Analysts

Load access logs into your data pipeline, spreadsheet, or BI tool. Get structured JSON out of messy log text in one step.

🛡️ Security Analysts

Find brute force attempts, unusual IP patterns, and suspicious user agents from raw access logs with no special software.


💡 Why use Apache & Nginx Log Parser?

Traditional log analysis requires command-line tools (awk, grep, GoAccess, AWStats) or a full ELK stack. This actor:

  • ✅ Works in your browser — no SSH, no terminal
  • ✅ Outputs clean JSON you can download, query via API, or pipe into n8n/Make
  • ✅ Auto-detects log format — no configuration needed
  • ✅ Handles large log files (millions of lines) efficiently
  • ✅ Integrates with Apify's dataset storage — keep your analysis history
  • ✅ Zero proxy cost — pure computation, no web requests to target sites

📋 What data is extracted?

Each parsed log line produces a structured record:

| Field | Type | Description |
| --- | --- | --- |
| ip | string | Client IP address |
| timestamp | string | Raw timestamp from log (e.g. 10/Oct/2000:13:55:36 -0700) |
| timestampIso | string | ISO 8601 timestamp (e.g. 2000-10-10T13:55:36-07:00) |
| method | string | HTTP method (GET, POST, PUT, DELETE, etc.) |
| path | string | Request path (may include query string) |
| protocol | string | HTTP protocol (HTTP/1.0, HTTP/1.1, HTTP/2) |
| statusCode | number | HTTP status code (200, 301, 404, 500, etc.) |
| responseBytes | number | Response size in bytes |
| referrer | string | HTTP Referer header value |
| userAgent | string | User-Agent header value |
| logFormat | string | Detected format (apache_combined, common, iis_w3c) |
| parseError | string | Error message if the line couldn't be parsed |
| rawLine | string | Original raw log line |

Plus a summary record (when includeStats: true):

| Field | Description |
| --- | --- |
| topPages | Top N most-requested paths |
| topIPs | Top N most-active IP addresses |
| statusCodes | Count per HTTP status code |
| hourlyTraffic | Request count per hour |
| topUserAgents | Top N user agent strings |
| topMethods | HTTP method distribution |
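If you prefer to compute aggregates like these yourself from the per-line records, a minimal sketch with Python's collections.Counter works. The field names follow the table above; the sample records are made up for the demonstration.

```python
from collections import Counter

# Hypothetical per-line records, shaped like the actor's output
entries = [
    {'path': '/', 'ip': '1.2.3.4', 'statusCode': 200},
    {'path': '/', 'ip': '1.2.3.4', 'statusCode': 200},
    {'path': '/missing', 'ip': '5.6.7.8', 'statusCode': 404},
]

top_n = 2
summary = {
    'topPages': Counter(e['path'] for e in entries).most_common(top_n),
    'topIPs': Counter(e['ip'] for e in entries).most_common(top_n),
    'statusCodes': dict(Counter(e['statusCode'] for e in entries)),
}
print(summary['topPages'])   # → [('/', 2), ('/missing', 1)]
```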

💰 How much does it cost to parse Apache logs?

This actor uses Pay-Per-Event (PPE) pricing — you only pay for what you use:

| Event | FREE tier | DIAMOND tier |
| --- | --- | --- |
| Run start | $0.005 | $0.0025 |
| Per 1,000 log lines | $0.008 | $0.004 |

Example costs

| File size | Lines | Estimated cost (FREE) |
| --- | --- | --- |
| Small log | 1,000 | ~$0.013 |
| Medium log | 10,000 | ~$0.085 |
| Large log | 100,000 | ~$0.805 |
| Huge log | 1,000,000 | ~$8.005 |
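The estimates above are simply the run-start charge plus the per-1,000-lines rate. A small helper makes the arithmetic explicit (FREE-tier rates taken from the pricing table):

```python
def estimated_cost(lines, run_start=0.005, per_1k=0.008):
    """Estimate FREE-tier cost in USD for a single run."""
    return run_start + (lines / 1000) * per_1k

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} lines ≈ ${estimated_cost(n):.3f}")
```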

Parsing is pure computation with zero proxy cost — the only expenses are the nominal run-start and per-1,000-lines charges.

Use maxLines to cap processing and control costs on very large files.


🚀 How to use it

Step 1 — Provide your log data

Option A: Paste log content Set logText to your raw log lines. Ideal for small snippets or when testing.

Option B: Fetch from URL Set logUrl to any publicly accessible log file URL. The actor fetches it over HTTP — no auth or proxy needed.

Step 2 — Choose format (optional)

Leave logFormat as auto to detect automatically. Set it explicitly if auto-detection fails:

  • apache_combined — Apache Combined Log Format
  • nginx — Nginx access log (same as Apache Combined by default)
  • common — Common Log Format (CLF), no referrer/user-agent
  • iis_w3c — IIS W3C Extended Log Format

Step 3 — Run and download results

Results are saved to the actor's dataset. Download as JSON, CSV, or JSONL — or query them via the Apify API.


⚙️ Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| logText | string | — | Paste raw log lines directly |
| logUrl | string | — | URL of a log file to fetch |
| logFormat | string | auto | Log format: auto, apache_combined, nginx, common, iis_w3c |
| maxLines | integer | 0 (unlimited) | Max lines to parse (0 = parse all) |
| includeStats | boolean | true | Include summary statistics record |
| topN | integer | 10 | How many entries in top-N summaries |
Either logText or logUrl is required.
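A minimal valid input, expressed here as the run_input dict used in the Python API example later in this page (the URL is a placeholder):

```python
run_input = {
    'logUrl': 'https://example.com/access.log',  # or provide 'logText' instead
    'logFormat': 'auto',
    'maxLines': 0,          # 0 = parse everything
    'includeStats': True,
    'topN': 10,
}
# one of logText / logUrl is required
assert 'logText' in run_input or 'logUrl' in run_input
```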


📤 Output format

Results are stored in two dataset views:

log-entries view

One record per parsed log line. Use this to filter, sort, or search individual requests.

summary view

A single record with aggregated statistics. Use this for the big-picture traffic breakdown.


🔧 Tips & best practices

Handling large log files

  • Use maxLines to process only the most recent N lines (new entries are appended at the end of a log file)
  • For very large files (> 1GB), consider splitting them before uploading
  • Use includeStats: false if you only need raw entries and want to compute stats yourself

Format auto-detection

Auto-detection samples the first 20 non-comment lines. If your log file has a long header or preamble, auto-detection may fail — set logFormat explicitly in that case.
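A simplified sketch of what such detection might look like — this is an assumption about the approach for illustration; the actor's actual heuristic may differ:

```python
def detect_format(lines, sample_size=20):
    """Guess the log format from a sample of non-comment lines (illustrative)."""
    # IIS W3C files declare their columns in a #Fields: directive
    if any(l.startswith('#Fields:') for l in lines):
        return 'iis_w3c'
    sample = [l for l in lines if l.strip() and not l.startswith('#')][:sample_size]
    # Combined format lines end with two quoted strings (referrer, user agent)
    if sample and all(l.rstrip().endswith('"') for l in sample):
        return 'apache_combined'
    return 'common'

combined = ['1.2.3.4 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 512 "-" "Mozilla/5.0"']
print(detect_format(combined))  # → apache_combined
```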

IIS logs

IIS W3C logs have a #Fields: header that defines column order. The actor reads this header automatically and adjusts field mapping accordingly. Multiple #Fields: headers in a single file (rare but valid) are handled correctly.
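That header-driven mapping can be sketched in a few lines of Python (illustrative only; the sample log and field subset are made up, and re-reading #Fields: on each occurrence handles multiple headers per file):

```python
def parse_iis(lines):
    """Parse IIS W3C lines using the most recent #Fields: header."""
    fields, records = [], []
    for line in lines:
        if line.startswith('#Fields:'):
            fields = line.split()[1:]   # column names after '#Fields:'
        elif line.startswith('#') or not line.strip():
            continue                    # other directives and blank lines
        elif fields:
            records.append(dict(zip(fields, line.split())))
    return records

log = [
    '#Software: Microsoft Internet Information Services',
    '#Fields: date time c-ip cs-method cs-uri-stem sc-status',
    '2024-01-15 08:30:00 192.168.1.10 GET /index.html 200',
]
print(parse_iis(log)[0]['sc-status'])  # → 200
```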

Filtering by status code

After parsing, filter the dataset by statusCode in the Apify console or via API:

GET https://api.apify.com/v2/datasets/{DATASET_ID}/items?fields=ip,path,statusCode&limit=1000

Nginx default format

Nginx's default access_log format is identical to Apache Combined Log Format. The actor parses it without any special configuration.


🔌 Integrations

With n8n

Use the Apify n8n node to trigger the parser, then pass the dataset URL to a HTTP Request node to fetch results. Feed into Spreadsheet File or Google Sheets nodes for pivot tables.

With Make (Integromat)

Trigger the actor via Apify > Run Actor module, then use Apify > Get Dataset Items to fetch structured records into any downstream module (Google Sheets, Airtable, Slack).

With Zapier

Use Apify > Run Actor trigger, connect to Google Sheets > Create Spreadsheet Row — each parsed log line becomes a row.

With your data pipeline

Use the dataset API to stream results:

curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?format=json" \
  -H "Authorization: Bearer YOUR_TOKEN" | jq '.[] | select(.statusCode == 404)'

🤖 API usage

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('automation-lab/apache-log-parser').call({
  logUrl: 'https://example.com/access.log',
  logFormat: 'auto',
  includeStats: true,
  topN: 10,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items.filter(i => i.statusCode === 404));

Python

from apify_client import ApifyClient

client = ApifyClient(token='YOUR_TOKEN')
run = client.actor('automation-lab/apache-log-parser').call(run_input={
    'logUrl': 'https://example.com/access.log',
    'logFormat': 'auto',
    'includeStats': True,
    'topN': 10,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
errors = [i for i in items if i.get('statusCode', 0) >= 500]
print(f"Found {len(errors)} server errors")

cURL

TOKEN="YOUR_TOKEN"
# Start the run
RUN=$(curl -s -X POST "https://api.apify.com/v2/acts/automation-lab~apache-log-parser/runs?token=$TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"logUrl":"https://example.com/access.log","includeStats":true}')
DATASET_ID=$(echo "$RUN" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['defaultDatasetId'])")
# Fetch results
curl "https://api.apify.com/v2/datasets/$DATASET_ID/items?token=$TOKEN" | python3 -m json.tool

🧠 Use with Claude (MCP)

You can use this actor directly from Claude via the Apify MCP server:

Claude Desktop — Add to claude_desktop_config.json:

{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": ["-y", "@apify/mcp-server?tools=automation-lab/apache-log-parser"],
      "env": { "APIFY_TOKEN": "YOUR_TOKEN" }
    }
  }
}

Claude Code — Run in terminal:

claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/apache-log-parser"

Example prompts:

  • "Parse this Apache log and show me the top 10 pages with 404 errors"
  • "Fetch my Nginx log from https://example.com/access.log and give me hourly traffic for yesterday"
  • "Analyze this IIS log: [paste log content] — which IPs are hitting us hardest?"

⚖️ Legality & privacy

This actor processes log data you provide — it does not scrape any website. Log files typically contain IP addresses and user agent strings, which may be considered personal data under GDPR and similar regulations.

Ensure you:

  • Have the right to process the log data you provide
  • Comply with your organization's data retention policies
  • Anonymize or pseudonymize IP addresses if required by your jurisdiction

❓ FAQ

Q: My log lines aren't being parsed — what format should I use? A: Copy one line from your log and check whether it matches Apache Combined Format: IP - user [timestamp] "METHOD /path HTTP/1.1" STATUS BYTES "referrer" "user-agent". If so, use apache_combined. If it's missing the referrer/user-agent at the end, use common. If it has a #Fields: header, use iis_w3c.

Q: How many lines can it process? A: There is no hard limit — the actor processes all lines in memory. For very large files (millions of lines), Apify's 256 MB memory limit applies. Use maxLines to cap processing if you get out-of-memory errors.

Q: Can it parse custom Nginx log formats? A: Currently supports the Nginx default format (which matches Apache Combined). Custom log_format directives produce non-standard output that may not parse correctly. Support for custom format strings is planned for a future version.

Q: The URL fetch fails — what can I do? A: The log URL must be publicly accessible without authentication. If your log is behind auth or a firewall, download it locally and paste the content using logText instead.

Q: Why are some lines showing parseError? A: Lines that don't match the expected format (e.g., blank lines, comment lines in Apache config files accidentally included, or log rotation markers) produce a parseError. The actor continues processing remaining lines — one bad line doesn't stop the run.

Q: Does it support compressed (.gz) log files? A: Not currently. Decompress the file first (gunzip access.log.gz) and then provide the plain text content via logText or host it at a URL.
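If you are scripting this workaround, Python's gzip module can decompress in memory so the plain text can be passed as logText. In the sketch below, the compressed file is created first purely for the demonstration; normally access.log.gz already exists on disk.

```python
import gzip

# Create a sample compressed log (demonstration only; normally this file exists)
with gzip.open('access.log.gz', 'wt', encoding='utf-8') as f:
    f.write('127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 512 "-" "-"\n')

# Decompress in memory; log_text can then be passed to the actor as logText
with gzip.open('access.log.gz', 'rt', encoding='utf-8') as f:
    log_text = f.read()

print(log_text.count('\n'))  # → 1
```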