Medium Scraper

2clouds/medium-scraper

Medium Profile Scraper

A robust asynchronous scraper for Medium profiles and articles, built with Python. This project consists of two main components:

  1. A base scraper for finding Medium users and their articles
  2. A detailed profile scraper for gathering comprehensive user information

Author

Afnan Khan

  • GitHub: 2Cloud-S
  • LinkedIn: afnankhan-ak

Features

Base Scraper (medium.py)

  • Asynchronous scraping of Medium profiles
  • Topic-based user discovery
  • Premium content detection
  • Website and email extraction
  • Progress tracking and resumable scraping
  • CSV export with duplicate prevention
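
As an illustration of the email extraction feature, here is a minimal sketch that pulls addresses out of raw page text with a regular expression. The pattern, the `extract_emails` name, and the sample string are assumptions made for this example; the project's actual extraction logic in medium.py may differ.

```python
import re

# Deliberately simple pattern (an assumption for this sketch).
# It will miss some valid addresses and accept some invalid ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list[str]:
    """Return unique email addresses found in text, lowercased, in first-seen order."""
    seen: dict[str, None] = {}
    for match in EMAIL_RE.findall(text):
        # A dict keeps insertion order, so duplicates are dropped without reordering.
        seen.setdefault(match.lower(), None)
    return list(seen)


# Hypothetical sample input, not real scraped data.
sample = '<a href="mailto:Jane@Example.com">Contact</a> or jane@example.com, team@example.org'
print(extract_emails(sample))
```

Lowercasing before deduplication treats `Jane@Example.com` and `jane@example.com` as the same address, which is usually what a contact list needs.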

Profile Scraper (profile_scraper.py)

  • Detailed profile information extraction
  • Bio and social links
  • Article statistics and history
  • User interests and topics
  • Batch processing with rate limiting
  • Structured data export

Installation

  1. Clone the repository:
git clone https://github.com/2Cloud-S/medium-scraper.git
cd medium-scraper
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt

Usage

1. Base Scraper

Run the base scraper to collect Medium users:

python medium.py

This will:

  • Scrape users from specified topics
  • Save progress to data/medium_users_progress.csv
  • Export final results to data/medium_users_final.csv
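
Resumable progress and duplicate prevention can be combined by reloading the usernames already written before appending new rows. The sketch below shows one way to do this with the standard library; the function names and the two-column `FIELDS` subset are hypothetical and not taken from medium.py.

```python
import csv
from pathlib import Path

# Subset of the output columns, chosen only to keep the sketch short.
FIELDS = ["username", "follower_count"]


def load_seen(path: Path) -> set[str]:
    """Return usernames already saved, so an interrupted run can resume."""
    if not path.exists():
        return set()
    with path.open(newline="", encoding="utf-8") as f:
        return {row["username"] for row in csv.DictReader(f)}


def append_user(path: Path, row: dict, seen: set[str]) -> bool:
    """Append row unless its username was already written. Return True if written."""
    if row["username"] in seen:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new_file = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new_file:
            writer.writeheader()  # header only once, on first write
        writer.writerow(row)
    seen.add(row["username"])
    return True
```

A run would call `load_seen(Path("data/medium_users_progress.csv"))` once at startup and pass the resulting set to every `append_user` call.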

2. Profile Scraper

After collecting users, run the profile scraper:

python profile_scraper.py

This will:

  • Read users from data/medium_users_final.csv
  • Collect detailed profile information
  • Save results to data/medium_profiles_detailed.csv
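
The hand-off between the two scrapers amounts to reading the `username` column and splitting it into batches. A minimal sketch, assuming the CSV has a `username` header as listed under Data Structure; `read_usernames`, `batched`, and the in-memory demo data are illustrative names and values, not part of profile_scraper.py.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator


def read_usernames(csv_file) -> Iterator[str]:
    """Yield non-empty usernames from an open CSV file with a 'username' column."""
    for row in csv.DictReader(csv_file):
        name = (row.get("username") or "").strip()
        if name:
            yield name


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


# In-memory stand-in for data/medium_users_final.csv (made-up rows).
demo = io.StringIO("username,follower_count\nalice,10\nbob,5\n,0\ncarol,7\n")
batches = list(batched(read_usernames(demo), size=2))
print(batches)
```

With a real file, `demo` would be replaced by `open("data/medium_users_final.csv", newline="", encoding="utf-8")`.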

Data Structure

Base Scraper Output

  • username
  • is_premium
  • has_newsletter
  • email
  • website
  • website_emails
  • follower_count
  • article_count
  • premium_articles
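
For downstream code, the columns above can be modelled as a typed record. The field names come from the list; the types and defaults are assumptions, since the source does not state them (for example, `website_emails` may be a delimited string or something else entirely).

```python
from dataclasses import dataclass, asdict


@dataclass
class BaseUserRecord:
    """One row of the base scraper output. Types and defaults are assumed."""
    username: str
    is_premium: bool = False
    has_newsletter: bool = False
    email: str = ""
    website: str = ""
    website_emails: str = ""   # assumed: addresses joined into one string
    follower_count: int = 0
    article_count: int = 0
    premium_articles: int = 0  # assumed: a count rather than a list


# Hypothetical example values.
record = BaseUserRecord(username="example_user", follower_count=120)
print(list(asdict(record)))
```

`asdict(record)` produces a dictionary whose keys match the column names, so it can be passed directly to `csv.DictWriter.writerow`.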

Profile Scraper Output

  • username
  • bio
  • total_claps
  • total_responses
  • following_count
  • top_writer_in
  • member_since
  • last_active
  • social_links
  • interests
  • latest_articles

Configuration

Modify the following in medium.py:

  • topics: List of topics to scrape
  • headers: Update cookies for authenticated requests
  • search_paths: Customize URL patterns
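
A sketch of what these three settings might look like and how they could combine into request URLs. All values below are placeholders chosen for illustration, and `build_urls` is a hypothetical helper; check medium.py for the actual variable shapes before editing.

```python
from urllib.parse import quote

# Placeholder values; replace with your own.
topics = ["programming", "data-science"]

headers = {
    "User-Agent": "Mozilla/5.0 (compatible; example-scraper)",
    "Cookie": "",  # paste your own session cookie here for authenticated requests
}

# Assumed pattern style: one "{topic}" placeholder per path.
search_paths = [
    "https://medium.com/tag/{topic}",
    "https://medium.com/tag/{topic}/latest",
]


def build_urls(topics: list[str], search_paths: list[str]) -> list[str]:
    """Expand every search path for every topic, URL-encoding the topic."""
    return [path.format(topic=quote(t)) for t in topics for path in search_paths]


for url in build_urls(topics, search_paths):
    print(url)
```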

Rate Limiting

The scrapers implement:

  • Random delays between requests
  • Batch processing
  • Error handling with retries
  • Session management
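
The first three measures can be sketched with asyncio alone. The code below is not the project's implementation: `with_retries`, `run_in_batches`, and the stand-in `flaky_fetch` are hypothetical names, and the delays are kept tiny so the demo finishes quickly. In a real scraper, `fetch` would wrap a request made through a shared `aiohttp.ClientSession`, which is where session management comes in.

```python
import asyncio
import random


async def with_retries(fetch, url, retries=3, base_delay=0.01):
    """Call fetch(url) with a random pre-request delay and exponential backoff."""
    for attempt in range(1, retries + 1):
        # Random delay so requests are not evenly spaced.
        await asyncio.sleep(random.uniform(base_delay, 2 * base_delay))
        try:
            return await fetch(url)
        except Exception:  # a real scraper would catch narrower error types
            if attempt == retries:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)  # back off before retrying


async def run_in_batches(fetch, urls, batch_size=2):
    """Process urls batch by batch; requests within a batch run concurrently."""
    results = []
    for i in range(0, len(urls), batch_size):
        batch = urls[i:i + batch_size]
        results += await asyncio.gather(*(with_retries(fetch, u) for u in batch))
    return results


# Stand-in for a network call: fails on the first attempt for each URL.
calls: dict[str, int] = {}


async def flaky_fetch(url):
    calls[url] = calls.get(url, 0) + 1
    if calls[url] == 1:
        raise ConnectionError("simulated failure")
    return f"ok:{url}"


results = asyncio.run(run_in_batches(flaky_fetch, ["a", "b", "c"]))
print(results)
```

Processing one batch at a time caps concurrency at `batch_size`, which is a simple way to keep request rates modest.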

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built with aiohttp for async operations
  • Uses BeautifulSoup4 for HTML parsing
  • Follows common web scraping practices such as randomized delays, batching, and retries

Disclaimer

This tool is for educational purposes only. Be sure to comply with Medium's terms of service and implement appropriate delays between requests.

Developer
Maintained by Community

Actor Metrics

  • 1 monthly user
  • No stars yet
  • >99% runs succeeded
  • Created in Feb 2025
  • Modified 6 days ago