Datapilot V2
Pricing
Pay per usage
Developer: Satyam Gupta
AI Company & Topic Researcher
Table of Contents
- Overview
- Key Features
- Installation
- Usage
- Input Schema
- Output Schema
- Research Modes
- Technical Stack
- Examples
- Troubleshooting
Overview
The AI Company & Topic Researcher is a sophisticated web scraping and analysis tool designed for competitive intelligence, market research, and technical deep dives. It automates the entire research pipeline from data collection to AI-powered strategic analysis.
What Makes It Unique?
- Dual Research Modes: Switch between company analysis and general topic research
- Multi-Source Aggregation: Combines official websites, news, reviews, social mentions, and competitive data
- AI-Powered Insights: Uses the Llama 3.3 70B model, served through the Groq API, for deep strategic analysis
- Structured Output: Generates SWOT analysis, pricing strategies, market positioning, and actionable recommendations
- Zero Configuration: Works out-of-the-box with sensible defaults
Key Features
Intelligent Web Scraping
- Smart URL Discovery: Automatically finds and categorizes important pages (pricing, features, about, blog, etc.)
- Static & Dynamic Content: Handles both traditional HTML and JavaScript-rendered content
- Multi-Page Extraction: Scrapes homepage, pricing, features, documentation, blog posts, and more
- Anti-Detection: Uses realistic browser fingerprints and human-like behavior patterns
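The URL discovery step described above can be illustrated with a small keyword-based classifier. This is only a sketch of the general idea, not the Actor's actual logic: the `PAGE_KEYWORDS` map and the `categorize_url` and `group_urls` helpers are hypothetical names introduced for illustration.

```python
from urllib.parse import urlparse

# Hypothetical keyword map (an assumption): the Actor's real rules are not documented.
PAGE_KEYWORDS = {
    "pricing": ("pricing", "plans"),
    "features": ("features", "product"),
    "about": ("about", "company"),
    "blog": ("blog", "news"),
    "docs": ("docs", "documentation"),
}


def categorize_url(url: str) -> str:
    """Return a page category based on the URL's path segments."""
    segments = [s.lower() for s in urlparse(url).path.split("/") if s]
    if not segments:
        return "homepage"
    for category, keywords in PAGE_KEYWORDS.items():
        # A segment matches when it starts with any keyword of the category.
        if any(seg.startswith(k) for seg in segments for k in keywords):
            return category
    return "other"


def group_urls(urls: list[str]) -> dict[str, list[str]]:
    """Group discovered URLs by category, preserving discovery order."""
    grouped: dict[str, list[str]] = {}
    for url in urls:
        grouped.setdefault(categorize_url(url), []).append(url)
    return grouped
```

A crawler could call `group_urls` on the links found on the homepage and then visit one or two URLs per category, which keeps the page count in the 5-10 range quoted under Performance.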
Comprehensive Data Collection
Company Analysis Mode
- Official website content and metadata
- Pricing tiers and monetization models
- Product features and capabilities
- News articles and press releases
- Customer reviews from G2, Trustpilot, Capterra, Product Hunt
- Social media mentions (LinkedIn, Twitter/X, Reddit)
- Competitor identification and analysis
- Blog posts and content marketing insights
General Topic Research Mode
- Top authoritative sources discovery
- Technical documentation and guides
- Recent developments and news
- Community discussions and sentiment
- Implementation examples and use cases
AI-Powered Analysis
The actor uses Llama 3.3 70B Versatile via Groq API to generate:
For Companies:
- Company Overview (description, industry, target market, value proposition)
- Product Analysis (core products, key features, technology stack)
- Pricing Strategy (pricing model, positioning, monetization)
- Market Position (market strength, brand reputation, growth trajectory)
- Competitive Landscape (main competitors, threats, differentiation)
- Customer Sentiment (overall sentiment, praise/complaints, NPS)
- SWOT Analysis (top 5 strengths, weaknesses, opportunities, threats)
- Strategic Insights (growth potential, investment worthiness)
- Risk Assessment (risk score 0-100, key risk factors)
- Executive Summary (4-6 paragraph comprehensive overview)
For Topics:
- Topic Overview (definition, key concepts, importance)
- Technical Deep Dive (how it works, core components, architecture)
- State of the Art (current trends, recent advancements, leading players)
- Pros & Cons (advantages, disadvantages, implementation challenges)
- Use Cases (real-world applications across sectors)
- Future Outlook (predictions, research directions, disruption potential)
- Learning Resources (prerequisites and next steps)
Installation
Prerequisites
- Python 3.11+
- Apify Account (for running as an Actor)
- Groq API Key (optional, fallback provided)
Option 1: Run on Apify Platform (Recommended)
1. Import the Actor to your Apify account
2. Configure inputs via the Apify Console
3. Run and view results in the dataset
Option 2: Local Development
```bash
# Clone the repository
git clone https://github.com/yourusername/ai-company-researcher.git
cd ai-company-researcher

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers
playwright install chromium

# Set environment variables (optional)
export GROQ_API_KEY="your_groq_api_key_here"

# Run locally
python main.py
```
Usage
Using Apify Console
1. Navigate to the Actor in your Apify Console
2. Fill in the input form:
   - Target Name: Enter company/product name or research topic
   - Research Mode: Select "Company Analysis" or "General Topic Research"
   - Groq API Key: (Optional) Enter your key for higher rate limits
3. Click Start
4. View results in the Dataset tab once complete
Using Apify API
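```javascript
// Node.js Example
// apify-client exposes ApifyClient as a named export
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
    token: 'YOUR_APIFY_TOKEN',
});

// Wrapped in an async function because top-level await is unavailable in CommonJS
(async () => {
    const run = await client.actor('your-actor-id').call({
        company_name: 'Stripe',
        mode: 'Company Analysis',
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
})();
```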
```python
# Python Example
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

run = client.actor('your-actor-id').call(run_input={
    'company_name': 'Stripe',
    'mode': 'Company Analysis',
})

dataset = client.dataset(run['defaultDatasetId']).list_items().items
print(dataset)
```
Local Execution
```python
import asyncio

from main import EnhancedCompanyResearchScraper


async def main():
    scraper = EnhancedCompanyResearchScraper(
        company_name="Notion",
        groq_api_key="your_key_here",  # Optional
    )
    results = await scraper.run_complete_research()

    # Save outputs
    scraper.save_to_json(results, "notion_research.json")
    scraper.save_analysis_text(results, "notion_analysis.txt")


asyncio.run(main())
```
Input Schema
Required Fields
| Field | Type | Description | Example |
|---|---|---|---|
| `company_name` | string | Name of company, product, or topic | "Stripe", "React Framework" |
| `mode` | select | Research type | "Company Analysis" or "General Topic Research" |
Optional Fields
| Field | Type | Description | Default |
|---|---|---|---|
| `groq_api_key` | string | Your Groq API key | Uses fallback key |
Input Examples
Company Analysis:
```json
{
  "company_name": "Figma",
  "mode": "Company Analysis"
}
```
Topic Research:
```json
{
  "company_name": "Graph RAG",
  "mode": "General Topic Research"
}
```
Output Schema
Output Structure
```json
{
  "metadata": {
    "research_date": "2024-12-06T10:30:00Z",
    "company_name": "Stripe",
    "scraper_version": "2.0",
    "analysis_model": "llama-3.3-70b-versatile"
  },
  "raw_data": {
    "website_data": { "homepage": {}, "pricing": {} },
    "pricing_tiers": [
      {
        "name": "Starter",
        "price": "$29",
        "billing_period": "month"
      }
    ],
    "features": [
      {
        "name": "Payment Processing",
        "description": "Accept payments worldwide"
      }
    ],
    "news_articles": [],
    "reviews": [],
    "social_mentions": [],
    "competitors": []
  },
  "ai_deep_analysis": {
    "company_overview": {
      "description": "Stripe is a financial infrastructure platform...",
      "industry": "Financial Technology",
      "target_market": "Online businesses and platforms",
      "value_proposition": "Easy-to-integrate payment processing",
      "business_model": "Transaction fees and subscriptions"
    },
    "product_analysis": {},
    "pricing_strategy": {},
    "market_position": {},
    "competitive_landscape": {},
    "customer_sentiment": {},
    "strengths_weaknesses": {
      "top_5_strengths": ["Best-in-class API", "Strong brand", "..."],
      "top_5_weaknesses": ["Premium pricing", "Complex setup", "..."],
      "opportunities": ["B2B expansion", "Crypto integration"],
      "threats": ["Regulatory changes", "Increased competition"]
    },
    "strategic_insights": {},
    "actionable_recommendations": [
      "Consider Stripe for global payment processing",
      "Evaluate pricing for low-volume businesses"
    ],
    "executive_summary": "Stripe is a leading financial infrastructure...",
    "risk_score": {
      "overall_score": "35",
      "risk_factors": ["Regulatory compliance", "Account stability"],
      "confidence_level": "high"
    }
  },
  "metrics": {
    "pages_scraped": 8,
    "pricing_tiers_found": 4,
    "features_identified": 15,
    "news_articles": 10,
    "reviews_collected": 8,
    "competitors_identified": 12
  }
}
```
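The nested output is easier to work with once the fields of interest are pulled into a flat summary. The following sketch reads the keys shown in the structure above; `summarize_result` is a hypothetical helper, and it assumes that `overall_score` arrives as a string (as in the sample) and that any section may be missing.

```python
def summarize_result(item: dict) -> dict:
    """Flatten one dataset item into the fields most often needed downstream."""
    analysis = item.get("ai_deep_analysis", {})
    risk = analysis.get("risk_score", {})
    swot = analysis.get("strengths_weaknesses", {})

    # The sample output stores the score as a string, so convert defensively.
    try:
        score = int(risk.get("overall_score"))
    except (TypeError, ValueError):
        score = None

    return {
        "company": item.get("metadata", {}).get("company_name"),
        "risk_score": score,
        "confidence": risk.get("confidence_level"),
        "strengths": swot.get("top_5_strengths", []),
        "pages_scraped": item.get("metrics", {}).get("pages_scraped", 0),
    }
```

Using `.get` with defaults throughout means a run whose AI analysis failed still yields a summary, with `risk_score` set to `None` rather than raising.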
Research Modes
1. Company Analysis Mode
Best For:
- Competitive intelligence
- Investment research
- Partnership evaluation
- Product comparison
Data Collected:
- Website content, pricing, features
- News and announcements
- Customer reviews and ratings
- Social media presence
- Competitor landscape
AI Analysis:
- Business model breakdown
- Market positioning
- SWOT analysis
- Customer sentiment
- Risk assessment
2. General Topic Research Mode
Best For:
- Technology trends
- Academic research
- Learning new concepts
- Industry analysis
Data Collected:
- Authoritative sources
- Latest developments
- Community discussions
- Implementation examples
AI Analysis:
- Technical deep dive
- State-of-the-art trends
- Pros and cons
- Future outlook
- Learning pathway
Technical Stack
| Component | Technology | Purpose |
|---|---|---|
| Runtime | Python 3.11+ | Core execution environment |
| Web Scraping | Playwright | Headless browser automation |
| AI Analysis | Groq API | Llama 3.3 70B inference |
| Actor Framework | Apify SDK | Scalable execution & storage |
| Parsing | BeautifulSoup4, regex | Content extraction |
Dependencies
```
apify>=1.0.0
playwright>=1.40.0
groq>=0.4.0
requests>=2.31.0
rich>=13.0.0
```
Examples
Example 1: Analyze a SaaS Company
```json
{
  "company_name": "Notion",
  "mode": "Company Analysis"
}
```
Output Highlights:
- 10 pages scraped
- 3 pricing tiers identified
- 25+ features extracted
- 15 news articles collected
- Risk Score: 25/100 (Low Risk)
- Market Position: Strong growth trajectory
Example 2: Research a Technical Topic
```json
{
  "company_name": "Retrieval Augmented Generation",
  "mode": "General Topic Research"
}
```
Output Highlights:
- 3 authoritative sources analyzed
- Technical architecture explained
- Current trends identified
- 5 use cases documented
- Future outlook: High disruption potential
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| "Company website not found" | Verify company name is correct; use full name |
| "AI analysis failed" | Check Groq API key validity; may be rate limited |
| "No pricing found" | Pricing may be on request; check raw_data |
| Empty competitor list | Company may be too niche |
| Slow execution | Normal; typically takes 2-5 minutes |
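For the rate-limit case in the table above, retrying with exponential backoff is a common remedy. The sketch below is a generic helper, not part of the Actor: `call_with_backoff` and its parameters are hypothetical, and it catches every `Exception` for brevity, whereas real code should catch only the rate-limit error raised by the client in use.

```python
import time


def call_with_backoff(func, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt seconds and retry.

    The exception from the final attempt is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:  # assumption: narrow this to the client's rate-limit error
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

With the defaults, the waits between attempts are 2, 4, and 8 seconds. The `sleep` parameter exists so the delay can be replaced in tests.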
Performance
- Total Runtime: 2-5 minutes
- Pages Scraped: 5-10 per company
- Output Size: 50-200 KB JSON
- Memory Usage: 500MB - 1GB
Best Practices
- Use Specific Names: Prefer "Stripe Payment Processing" over "Payment Company"
- Check Domain Accuracy: Ensure the company has a website
- Use Custom API Key: For higher rate limits
- Review Outputs: AI analysis should be verified for critical decisions
License
This project is licensed under the MIT License.
Built For Apify Hackathon 2025
This Actor was developed as part of the Apify Hackathon, showcasing how web scraping can be combined with AI analysis for market intelligence.