Orphan Content Analyzer

Pricing

from $0.18 / 1,000 analyzed pages

Detect orphan and ghost pages that silently hurt your SEO. Crawl your website, analyze internal links, and identify pages with zero or one inbound internal link. Ideal for SEO audits, content optimization, and large-scale website analysis.

Developer

Mamadou Diao Bah

Maintained by Community

Last modified: 5 days ago

Orphan Content Analyzer - SEO Internal Links Detector

Detect orphan pages and ghost content by analyzing your website's internal link structure.

This Actor crawls your entire website to identify pages with weak internal linking, which is one of the most costly and difficult SEO problems to solve. It counts how many internal links point to each page and measures content length, helping you discover:

  • 🚨 Orphan pages (0-1 internal backlinks) that search engines struggle to find
  • ⚠️ Weakly linked pages (2-5 backlinks) at risk of becoming orphans
  • 👻 Ghost content (orphan pages with <300 words)
  • 📊 Complete internal linking report for your entire site

🎯 Why This Matters for SEO

The Problem

Orphan pages are a critical SEO issue that affects:

  • Crawlability: Search engines can't easily discover orphan pages
  • PageRank distribution: Orphan pages receive no internal link equity
  • User experience: Important content becomes hard to find
  • Rankings: Pages with poor internal linking rank lower

Professional SEO tools charge $100-300/month to detect orphan content. This Actor solves it with a simple crawl.

Real-World Impact

  • Sites with 10%+ orphan pages can lose 30-50% of potential organic traffic
  • Fixing internal linking is one of the highest ROI SEO improvements
  • A well-structured internal linking system improves crawl efficiency by 40%+

🚀 How to Use

Input Configuration

The Actor accepts two parameters:

Parameter | Type | Description | Default
Start URLs | Array | Website URL(s) to analyze | Required
Max Pages to Crawl | Integer | Maximum pages to analyze (0 = unlimited) | 100

Example Input

{
  "startUrls": [
    { "url": "https://example.com" }
  ],
  "maxRequestsPerCrawl": 200
}
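
If you start runs from a script, it can help to assemble and sanity-check the input before sending it. The following is a minimal sketch using only the Python standard library. `build_actor_input` is a hypothetical helper, not part of the Actor, and it assumes the input has only the two fields shown in the example above.

```python
import json
from urllib.parse import urlparse


def build_actor_input(urls, max_pages=100):
    """Build the Actor input dict (hypothetical helper, not part of the Actor).

    Field names follow the example input; the default of 100 mirrors the
    documented default for "Max Pages to Crawl".
    """
    if max_pages < 0:
        raise ValueError("max_pages must be 0 (unlimited) or a positive integer")
    start_urls = []
    for url in urls:
        parsed = urlparse(url)
        # A start URL needs an http:// or https:// scheme and a host name.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"URL must start with http:// or https://: {url!r}")
        start_urls.append({"url": url})
    if not start_urls:
        raise ValueError("at least one start URL is required")
    return {"startUrls": start_urls, "maxRequestsPerCrawl": max_pages}


if __name__ == "__main__":
    # Prints the same structure as the example input.
    print(json.dumps(build_actor_input(["https://example.com"], 200), indent=2))
```

Rejecting URLs without a scheme up front avoids one of the common causes of an empty crawl.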

Running the Actor

  1. Enter your website's homepage URL
  2. Set the crawl limit (start with 100-200 for testing, increase for full audits)
  3. Click Start
  4. Wait for the crawl to complete (2-10 minutes depending on site size)
  5. Export results as CSV for analysis in Excel/Google Sheets

📊 Output Format

The Actor generates a dataset with the following fields for each page:

Field | Type | Description
url | String | Page URL
internalBacklinksCount | Integer | Number of internal links pointing to this page
wordCount | Integer | Number of words on the page
isOrphan | Boolean | true if page has 0-1 internal backlinks
isGhostContent | Boolean | true if orphan AND <300 words
outgoingLinksCount | Integer | Number of internal links from this page

Example Output Row

{
  "url": "https://example.com/old-product-page",
  "internalBacklinksCount": 1,
  "wordCount": 245,
  "isOrphan": true,
  "isGhostContent": true,
  "outgoingLinksCount": 12
}
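
To work with rows like this one in a script, one option is a small typed record that also re-checks the two boolean flags against the documented rules (orphan: 0-1 backlinks; ghost: orphan and fewer than 300 words). This is an illustrative sketch; `PageRow` and `flags_match_rules` are hypothetical names, and the thresholds are taken from the field descriptions rather than from the Actor's source code.

```python
from dataclasses import dataclass

ORPHAN_MAX_BACKLINKS = 1  # documented rule: 0-1 internal backlinks
GHOST_MAX_WORDS = 300     # documented rule: orphan AND fewer than 300 words


@dataclass
class PageRow:
    """One dataset row; attribute names mirror the output fields."""
    url: str
    internalBacklinksCount: int
    wordCount: int
    isOrphan: bool
    isGhostContent: bool
    outgoingLinksCount: int

    def flags_match_rules(self) -> bool:
        """Return True if both flags agree with the documented thresholds."""
        expected_orphan = self.internalBacklinksCount <= ORPHAN_MAX_BACKLINKS
        expected_ghost = expected_orphan and self.wordCount < GHOST_MAX_WORDS
        return (self.isOrphan, self.isGhostContent) == (expected_orphan, expected_ghost)


if __name__ == "__main__":
    row = PageRow(
        url="https://example.com/old-product-page",
        internalBacklinksCount=1,
        wordCount=245,
        isOrphan=True,
        isGhostContent=True,
        outgoingLinksCount=12,
    )
    print(row.flags_match_rules())
```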

🔍 How to Analyze Results

Step 1: Export as CSV

Click Export results → CSV in the Apify Console.

Step 2: Filter for Problems

Find orphan pages:

  • Filter internalBacklinksCount ≤ 1
  • Sort by wordCount descending to prioritize high-value content

Find weakly linked pages:

  • Filter internalBacklinksCount between 2-5
  • These pages are at risk of becoming orphans

Find ghost content:

  • Filter isGhostContent = true
  • These pages have minimal value and should be improved or removed
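
The three filters above can also be applied in a script instead of a spreadsheet. The sketch below uses Python's `csv` module on an inline sample. It assumes the CSV column names match the output field names and that booleans are exported as the text `true`/`false`; check both against your actual export. `classify_rows` and `SAMPLE_CSV` are hypothetical names.

```python
import csv
import io

# Made-up sample data in the assumed export layout.
SAMPLE_CSV = """url,internalBacklinksCount,wordCount,isOrphan,isGhostContent,outgoingLinksCount
https://example.com/a,0,1200,true,false,4
https://example.com/b,1,245,true,true,12
https://example.com/c,3,800,false,false,9
https://example.com/d,14,950,false,false,20
"""


def classify_rows(csv_text):
    """Split rows into orphan, weakly linked, and ghost-content buckets."""
    buckets = {"orphan": [], "weak": [], "ghost": []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        backlinks = int(row["internalBacklinksCount"])
        words = int(row["wordCount"])
        if backlinks <= 1:
            buckets["orphan"].append((row["url"], words))
        elif backlinks <= 5:
            buckets["weak"].append(row["url"])
        # Assumption: booleans are serialized as "true"/"false" text.
        if row["isGhostContent"].strip().lower() == "true":
            buckets["ghost"].append(row["url"])
    # Highest word count first, so high-value orphans come first.
    buckets["orphan"].sort(key=lambda item: item[1], reverse=True)
    return buckets


if __name__ == "__main__":
    for name, pages in classify_rows(SAMPLE_CSV).items():
        print(name, pages)
```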

Step 3: Fix Issues

For each orphan/weakly-linked page:

  1. Add internal links from relevant high-authority pages
  2. Update navigation menus to include important pages
  3. Create category/tag pages to link related content
  4. Remove or consolidate very low-quality pages

💡 Best Practices

For Small Sites (< 500 pages)

  • Set maxRequestsPerCrawl to 0 (unlimited) for complete analysis
  • Focus on fixing all pages with internalBacklinksCount < 3

For Medium Sites (500-5,000 pages)

  • Start with maxRequestsPerCrawl: 1000 to sample your site
  • Prioritize fixing orphan pages with high wordCount (valuable content)

For Large Sites (5,000+ pages)

  • Run multiple crawls on different sections (blog, products, etc.)
  • Set maxRequestsPerCrawl: 2000 per section
  • Focus on fixing orphans in high-priority categories first

Ongoing Monitoring

  • Run monthly audits to catch new orphan pages
  • Track trends: Is the orphan % increasing or decreasing?
  • Automate with Apify Schedules for continuous monitoring
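
To track the trend between two audits, you can compare the share of orphan pages in each run. The sketch below is one way to compute it, assuming each run's rows have already been loaded as dictionaries with an integer `internalBacklinksCount`. The function names are hypothetical.

```python
def orphan_percentage(rows):
    """Percentage of rows with 0-1 internal backlinks (the documented orphan rule)."""
    if not rows:
        return 0.0
    orphans = sum(1 for row in rows if row["internalBacklinksCount"] <= 1)
    return 100.0 * orphans / len(rows)


def orphan_trend(previous_rows, current_rows):
    """Change in orphan percentage; a negative value means fewer orphans."""
    return orphan_percentage(current_rows) - orphan_percentage(previous_rows)


if __name__ == "__main__":
    # Made-up data for two consecutive monthly audits.
    last_month = [{"internalBacklinksCount": n} for n in (0, 1, 4, 9)]
    this_month = [{"internalBacklinksCount": n} for n in (1, 3, 4, 9)]
    print(orphan_trend(last_month, this_month))
```

Percentages are only comparable between runs that used the same start URLs and crawl limit.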

🛠️ Technical Details

How It Works

  1. Phase 1: Crawling - Discovers all pages starting from your homepage
  2. Phase 2: Link Extraction - Extracts all internal links from each page
  3. Phase 3: Backlink Counting - Counts how many pages link to each URL
  4. Phase 4: Orphan Detection - Identifies pages with 0-1 backlinks
  5. Phase 5: Export - Generates CSV/JSON report sorted by backlink count
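
Phases 3 to 5 can be illustrated on a tiny in-memory link graph. This is a simplified sketch of the general technique, not the Actor's actual code. It assumes each linking page is counted once per target and that self-links are ignored; the source does not confirm either detail. All names are hypothetical.

```python
def count_internal_backlinks(link_graph):
    """Count, for each URL, how many other pages link to it.

    link_graph maps each crawled URL to the list of internal URLs it links to.
    """
    backlinks = {url: 0 for url in link_graph}
    for source, targets in link_graph.items():
        for target in set(targets):  # assumption: count each linking page once
            if target == source:     # assumption: ignore self-links
                continue
            backlinks[target] = backlinks.get(target, 0) + 1
    return backlinks


def build_report(backlinks, max_orphan_backlinks=1):
    """Return (url, count, is_orphan) tuples sorted by backlink count, then URL."""
    return [
        (url, count, count <= max_orphan_backlinks)
        for url, count in sorted(backlinks.items(), key=lambda item: (item[1], item[0]))
    ]


if __name__ == "__main__":
    site = "https://example.com"
    graph = {
        f"{site}/": [f"{site}/blog", f"{site}/products", f"{site}/about"],
        f"{site}/blog": [f"{site}/", f"{site}/blog/post-1", f"{site}/products"],
        f"{site}/products": [f"{site}/", f"{site}/blog"],
        f"{site}/about": [f"{site}/"],
        f"{site}/blog/post-1": [f"{site}/blog"],
    }
    for entry in build_report(count_internal_backlinks(graph)):
        print(entry)
```

In this sample graph, `/about` and `/blog/post-1` each have a single inbound link, so both are flagged as orphans under the 0-1 rule.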

Limitations

  • Only analyzes pages accessible from the homepage via internal links
  • Does not include pages only in XML sitemaps (use a sitemap crawler separately)
  • Requires pages to respond with valid HTML
  • Respects robots.txt and rate limits

Performance

  • Crawls 10-20 pages/second on average
  • 500 pages = ~2-3 minutes
  • 2,000 pages = ~5-10 minutes
  • Uses minimal Apify compute units (very cost-effective)

📈 Use Cases

SEO Agencies

  • Client audits: Identify quick SEO wins for new clients
  • Competitive analysis: Compare orphan % across competitor sites
  • Reporting: Show clients concrete data on internal linking issues

E-commerce Sites

  • Find orphaned product pages that need better category placement
  • Identify seasonal products with outdated links
  • Optimize category pages to distribute link equity

Content Publishers

  • Discover old blog posts losing visibility
  • Ensure pillar content has strong internal links
  • Find author pages or tag pages that need attention

Enterprise Websites

  • Large-scale audits of 10,000+ page sites
  • Multi-domain monitoring for brand portfolios
  • Technical SEO compliance checks

🆘 Troubleshooting

"0 pages crawled"

  • Check that the start URL is accessible and returns HTML
  • Verify your site doesn't block crawlers in robots.txt
  • Ensure the URL includes https:// or http://

"No orphan pages found"

  • This is good! Your site has excellent internal linking
  • Try crawling a different section or subdomain
  • Increase maxRequestsPerCrawl to analyze more pages

"Too many orphan pages"

  • This is common on large, older websites
  • Focus on fixing high-value content first (high word count)
  • Consider using breadcrumbs or related content modules

📄 Pricing & Support

This Actor runs on Apify's infrastructure and consumes compute units based on:

  • Number of pages crawled
  • Crawl duration
  • Data storage

Typical costs:

  • 100 pages: ~$0.05-0.10
  • 500 pages: ~$0.20-0.40
  • 2,000 pages: ~$0.80-1.50

💡 Cost-effective alternative to $100-300/month SEO tools for orphan content detection.

Issues or questions? Contact the Actor developer through the Apify Console.

Feature requests? Submit feedback via the Apify platform.


🏆 Why Choose This Actor?

✅ Zero external API costs - Pure crawling, no third-party dependencies
✅ Fast & efficient - Optimized Cheerio-based crawler
✅ Professional output - Export-ready CSV reports
✅ Actionable insights - Prioritized list of pages to fix
✅ Cost-effective - $0.50-2.00 per audit vs. $100-300/month subscriptions
✅ Pay per use - Only pay for the pages you actually crawl

Start improving your SEO today! 🚀