Aljazeera Scraper

Pricing

$18.00/month + usage

Stay informed with comprehensive coverage of global news, especially critical tensions in the Middle East and Gaza. This scraper extracts up to 100 articles per run from Aljazeera, providing structured data on current events and regional tensions.

Rating

4.0

(1)

Developer

Marco Rodrigues

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 5
  • Monthly active users: 0
  • Last modified: 6 days ago

🌍 Aljazeera Scraper

Stay ahead of the curve with comprehensive coverage of global news, geopolitical developments, and critical ongoing tensions in the Middle East. This powerful Aljazeera scraper makes it incredibly easy to extract massive amounts of structured news data for analysis, research, and AI workflows.

Just choose your desired category (Middle East, Economy, Human Rights, Climate Crisis, and more), and the scraper will dynamically load and extract up to 100 articles per run, neatly packing headlines, full text, author metadata, and publication timestamps into a clean CSV or JSON file.

💡 Perfect for...

  • Journalists & Newsrooms: Track breaking developments across specific regions (like the Middle East or Africa) or topics (Human Rights, Climate Crisis) to build comprehensive timelines.
  • Policy Analysts & Humanitarian Organizations: Monitor regional tensions and conflicts through real-time, on-the-ground reporting.
  • Sentiment & NLP Pipelines: Run classifiers, sentiment analysis, or LLMs on full article bodies and excerpts to track the tone of global events over time.
  • Data Analysts & Academic Institutions: Download clean, structured datasets for visualizations, sociological research, or political dashboards.
  • 📚 RAG Systems: Chunk the content and metadata into vector stores so your custom AI agents can answer geopolitical questions with direct citations to Al Jazeera reporting.
  • 🔗 AI Workflows: Integrate seamlessly with LangChain, OpenClaw, Claude Code, and other AI frameworks that need structured, reliable global news data.
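
For the RAG use case, one possible way to chunk an article while keeping its citation metadata is sketched below. The function name `chunk_article`, the `max_chars` parameter, and the `chunk_index` field are illustrative assumptions and not part of the scraper; only `content`, `title`, `link`, and `date` come from the scraper's output.

```python
from typing import Dict, List


def chunk_article(article: Dict, max_chars: int = 800) -> List[Dict]:
    """Split an article's `content` into chunks of at most `max_chars` characters.

    Each chunk carries the title, link, and date so an answer can cite its source.
    (Hypothetical helper: not something the scraper itself provides.)
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    text = (article.get("content") or "").strip()
    chunks: List[Dict] = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Prefer to cut at the last space inside the window so words stay whole.
            space = text.rfind(" ", start, end)
            if space > start:
                end = space
        piece = text[start:end].strip()
        if piece:
            chunks.append({
                "text": piece,
                "chunk_index": len(chunks),
                "title": article.get("title"),
                "link": article.get("link"),
                "date": article.get("date"),
            })
        start = end
    return chunks
```

Character-based splitting is the simplest option; a token-based or sentence-based splitter may suit a particular embedding model better.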

✨ Why you'll love this scraper

  • 🎯 Deep Category Targeting: Choose from 19 comprehensive news categories, spanning regions (Middle East, US & Canada, Asia Pacific) and topics (Economy, Investigations, Explained, Opinion).
  • ⚙️ Deep Content Extraction: Goes far beyond just the headline. Extracts the full article body, detailed author metadata (including social links), the article excerpt, and normalized publication dates.
  • 📡 Hidden API Scraping: Uses advanced interception of Al Jazeera's GraphQL responses to reliably pull hidden metadata like isBreaking, isLive, and sponsorshipType flags that are hard to get from raw HTML.
  • ⏱️ Dynamic Listing Depth: Automatically clicks "Show more" and scrolls the feed until your exact max_articles quota is met.
  • 📸 Rich Media Details: Captures featured image URLs and extracts video metadata (like duration and video IDs) when present.

📦 What's inside the data?

For every single article, you will get:

  • Core Details: id, link (URL), short_url, title, excerpt, date
  • Content: content (Full main-body text)
  • Categorization: category, seoTitle, postType, sponsorshipType
  • News Flags: isBreaking, isLive, isDeveloping
  • Author Details: author_name, author_id, author_link, author_job_title, author_description, author_twitter, author_facebook, author_linkedin
  • Media Details: featured_image_url, galleryImagesCount, video_id, video_duration, featuredYoutube
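
The author fields arrive as flat `author_*` keys. If a nested shape is easier to work with downstream, a minimal sketch of regrouping them could look like the following. The helper name `nest_author` and the resulting `author` key are assumptions for illustration; the scraper does not emit them.

```python
from typing import Any, Dict

AUTHOR_PREFIX = "author_"


def nest_author(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` with the author_* keys moved under one `author` dict.

    `author` is set to None when every author field is null or empty.
    (Hypothetical helper: the scraper output itself stays flat.)
    """
    author = {
        key[len(AUTHOR_PREFIX):]: value
        for key, value in record.items()
        if key.startswith(AUTHOR_PREFIX)
    }
    rest = {k: v for k, v in record.items() if not k.startswith(AUTHOR_PREFIX)}
    has_author = any(value not in (None, "") for value in author.values())
    rest["author"] = author if has_author else None
    return rest
```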

🚀 Quick start

  1. Go to the actor on Apify (or run it locally with the Apify SDK).
  2. Choose your category (the Al Jazeera region or topic you care about, e.g., Middle East, Human Rights, Economy).
  3. Set max_articles (how many articles you want to scrape, up to 100).
  4. Click Start and let it run! 🗞️ Once it's done, you can export your data as a CSV, Excel spreadsheet, or JSON file.
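
The same steps can also be run from code. The sketch below assumes the `apify-client` Python package, whose `ApifyClient` exposes `actor(...).call(run_input=...)` and `dataset(...).iterate_items()`. The actor ID is a placeholder, since this page does not state it, and `run_scraper` is a hypothetical wrapper name.

```python
from typing import Any, Dict, List

# Hypothetical placeholder: copy the real actor ID from the actor's Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(client: Any, actor_id: str, category: str = "Middle East",
                max_articles: int = 100) -> List[Dict[str, Any]]:
    """Start one run, wait for it to finish, and return the scraped articles.

    `client` is expected to be an `apify_client.ApifyClient` instance.
    """
    run_input = {"category": category, "max_articles": max_articles}
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The actor run did not return a result")
    # Every run writes its items to a default dataset; read them all back.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (needs `pip install apify-client` and an Apify API token):
#   from apify_client import ApifyClient
#   articles = run_scraper(ApifyClient("<YOUR_API_TOKEN>"), ACTOR_ID, "Economy", 20)
```

Passing the client in as an argument keeps the function easy to test with a stub and avoids hard-coding the token.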

Tech details for developers 🧑‍💻

Input Example:

{
  "category": "Science & Technology",
  "max_articles": 100
}

Output Example:

{
  "id": "4382231",
  "title": "Anthropic sues Trump administration to undo US ‘supply chain risk’ tag",
  "excerpt": "In its lawsuit, Anthropic said the designation was unlawful and violated its US free speech and due process rights.",
  "date": "2026-03-09T18:44:03",
  "link": "https://www.aljazeera.com/economy/2026/3/9/anthropic-sues-trump-administration-to-undo-us-supply-chain-risk-tag",
  "short_url": "https://aje.news/r2liof",
  "isBreaking": false,
  "isLive": false,
  "isDeveloping": false,
  "sponsorshipType": null,
  "postType": "post",
  "author_id": null,
  "author_name": null,
  "author_link": null,
  "author_description": null,
  "author_job_title": null,
  "author_twitter": null,
  "author_facebook": null,
  "author_linkedin": null,
  "featured_image_url": "https://www.aljazeera.com/wp-content/uploads/2026/03/2026-03-02T143300Z_417747899_RC2EWJAHOKCX_RTRMADP_3_USA-PENTAGON-ANTHROPIC-1772810507.jpg",
  "galleryImagesCount": null,
  "seoTitle": null,
  "video_id": "",
  "video_duration": "",
  "featuredYoutube": "",
  "category": "Science & Technology",
  "content": "Anthropic has filed a lawsuit to block the Pentagon from placing it on a US national security blacklist, escalating the artificial intelligence lab’s high-stakes battle with the administration of United States President Donald Trump over usage restrictions on its technology.Anthropic said in its lawsuit on Monday that the designation was unlawful and violated its free speech and due process rights. The filing in federal court in the US state of California asked a judge to undo the designation and block federal agencies from enforcing it..."
}
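
In the example above, missing values appear both as null (the author fields) and as empty strings (the video fields), and `date` is an ISO 8601 string without a UTC offset. A small normalization step, sketched here under those observations, can make downstream filtering simpler. The function name `normalize` is an assumption, and since the page does not state the time zone of `date`, the parsed value is left as a naive datetime.

```python
from datetime import datetime
from typing import Any, Dict


def normalize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with empty strings turned into None and `date` parsed.

    Booleans such as isLive are left untouched; only "" is treated as missing.
    """
    clean = {key: (None if value == "" else value) for key, value in record.items()}
    if clean.get("date"):
        # No offset is present in the sample, so this yields a naive datetime.
        clean["date"] = datetime.fromisoformat(clean["date"])
    return clean
```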

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| category | string | No | The Al Jazeera feed to scrape. Options include regions (Middle East, Africa, Asia, Europe, US & Canada, Latin America, Asia Pacific) and topics (Explained, Opinion, Sport, Features, Economy, Human Rights, Climate Crisis, Investigations, Interactives, In Pictures, Science & Technology, Travel). Default: Middle East. |
| max_articles | integer | No | Target number of articles to collect from the listing (increments as "Show more" loads). Min 20, max 100, default 100. |
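
To catch a bad input before spending a run on it, the constraints above can be checked locally. This is a minimal sketch: `build_input` is a hypothetical helper, and it assumes the actor accepts the category labels exactly as they are spelled in the parameter description.

```python
from typing import Dict, Union

# Labels copied from the parameter description; exact accepted spellings are an assumption.
REGIONS = ["Middle East", "Africa", "Asia", "Europe", "US & Canada",
           "Latin America", "Asia Pacific"]
TOPICS = ["Explained", "Opinion", "Sport", "Features", "Economy", "Human Rights",
          "Climate Crisis", "Investigations", "Interactives", "In Pictures",
          "Science & Technology", "Travel"]
CATEGORIES = REGIONS + TOPICS

MIN_ARTICLES = 20
MAX_ARTICLES = 100


def build_input(category: str = "Middle East",
                max_articles: int = MAX_ARTICLES) -> Dict[str, Union[str, int]]:
    """Validate the two parameters and return the actor input as a dict."""
    if category not in CATEGORIES:
        raise ValueError(f"Unknown category: {category!r}")
    if not MIN_ARTICLES <= max_articles <= MAX_ARTICLES:
        raise ValueError(
            f"max_articles must be between {MIN_ARTICLES} and {MAX_ARTICLES}"
        )
    return {"category": category, "max_articles": max_articles}
```

Calling `build_input()` with no arguments reproduces the documented defaults.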