Datapilot V1

Pricing

from $0.01 / 1,000 results

Developer

Satyam Gupta

Maintained by Community

🧠 AI Company & Topic Researcher


🎯 Overview

The AI Company & Topic Researcher is a sophisticated web scraping and analysis tool designed for competitive intelligence, market research, and technical deep dives. It automates the entire research pipeline from data collection to AI-powered strategic analysis.

What Makes It Unique?

  • Dual Research Modes: Switch between company analysis and general topic research
  • Multi-Source Aggregation: Combines official websites, news, reviews, social mentions, and competitive data
  • AI-Powered Insights: Uses Groq's Llama 3.3 70B model for deep strategic analysis
  • Structured Output: Generates SWOT analysis, pricing strategies, market positioning, and actionable recommendations
  • Zero Configuration: Works out-of-the-box with sensible defaults

🚀 Key Features

🔍 Intelligent Web Scraping

  • Smart URL Discovery: Automatically finds and categorizes important pages (pricing, features, about, blog, etc.)
  • Static & Dynamic Content: Handles both traditional HTML and JavaScript-rendered content
  • Multi-Page Extraction: Scrapes homepage, pricing, features, documentation, blog posts, and more
  • Anti-Detection: Uses realistic browser fingerprints and human-like behavior patterns
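
The URL categorization idea can be illustrated with a minimal sketch. It is not the Actor's actual discovery logic: the keyword map, the `categorize_url` and `group_urls` helpers, and the matching rule (a path segment that starts with a keyword) are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical keyword map; the categories mirror the page types listed above.
CATEGORY_KEYWORDS = {
    "pricing": ("pricing", "plans"),
    "features": ("features", "product"),
    "about": ("about", "company"),
    "blog": ("blog", "news"),
    "docs": ("docs", "documentation"),
}


def categorize_url(url: str) -> str:
    """Return the first category whose keyword starts any path segment."""
    path = urlparse(url).path.lower()
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        return "homepage"
    for category, keywords in CATEGORY_KEYWORDS.items():
        # str.startswith accepts a tuple of candidate prefixes.
        if any(segment.startswith(keywords) for segment in segments):
            return category
    return "other"


def group_urls(urls):
    """Group a list of URLs by their assumed category."""
    grouped = {}
    for url in urls:
        grouped.setdefault(categorize_url(url), []).append(url)
    return grouped
```

A real crawler would likely also look at link text and page titles; path keywords alone are only a first approximation.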

📊 Comprehensive Data Collection

Company Analysis Mode

✅ Official website content and metadata
✅ Pricing tiers and monetization models
✅ Product features and capabilities
✅ News articles and press releases
✅ Customer reviews from G2, Trustpilot, Capterra, Product Hunt
✅ Social media mentions (LinkedIn, Twitter/X, Reddit)
✅ Competitor identification and analysis
✅ Blog posts and content marketing insights

General Topic Research Mode

✅ Top authoritative sources discovery
✅ Technical documentation and guides
✅ Recent developments and news
✅ Community discussions and sentiment
✅ Implementation examples and use cases

🤖 AI-Powered Analysis

The Actor uses Llama 3.3 70B Versatile via the Groq API to generate:

For Companies:

  • Company Overview (description, industry, target market, value proposition)
  • Product Analysis (core products, key features, technology stack)
  • Pricing Strategy (pricing model, positioning, monetization)
  • Market Position (market strength, brand reputation, growth trajectory)
  • Competitive Landscape (main competitors, threats, differentiation)
  • Customer Sentiment (overall sentiment, praise/complaints, NPS)
  • SWOT Analysis (top 5 strengths, weaknesses, opportunities, threats)
  • Strategic Insights (growth potential, investment worthiness)
  • Risk Assessment (risk score 0-100, key risk factors)
  • Executive Summary (4-6 paragraph comprehensive overview)

For Topics:

  • Topic Overview (definition, key concepts, importance)
  • Technical Deep Dive (how it works, core components, architecture)
  • State of the Art (current trends, recent advancements, leading players)
  • Pros & Cons (advantages, disadvantages, implementation challenges)
  • Use Cases (real-world applications across sectors)
  • Future Outlook (predictions, research directions, disruption potential)
  • Learning Resources (prerequisites and next steps)

📥 Installation

Prerequisites

  • Python 3.11+
  • Apify Account (for running as an Actor)
  • Groq API Key (optional, fallback provided)

Option 1: Apify Platform

  1. Import the Actor to your Apify account
  2. Configure inputs via the Apify Console
  3. Run and view results in the dataset

Option 2: Local Development

# Clone the repository
git clone https://github.com/yourusername/ai-company-researcher.git
cd ai-company-researcher
# Install dependencies
pip install -r requirements.txt
# Install Playwright browsers
playwright install chromium
# Set environment variables (optional)
export GROQ_API_KEY="your_groq_api_key_here"
# Run locally
python main.py

🎮 Usage

Using Apify Console

  1. Navigate to the Actor in your Apify Console
  2. Fill in the input form:
    • Target Name: Enter company/product name or research topic
    • Research Mode: Select "Company Analysis" or "General Topic Research"
    • Groq API Key: (Optional) Enter your key for higher rate limits
  3. Click Start
  4. View results in the Dataset tab once complete

Using Apify API

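// Node.js example
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
    token: 'YOUR_APIFY_TOKEN',
});

// Top-level await is unavailable in CommonJS, so wrap the calls in an async function.
(async () => {
    // Start the Actor and wait for the run to finish
    const run = await client.actor('your-actor-id').call({
        company_name: 'Stripe',
        mode: 'Company Analysis',
    });

    // Fetch the results from the run's default dataset
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
})();

# Python example
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the Actor and wait for the run to finish
run = client.actor('your-actor-id').call(run_input={
    'company_name': 'Stripe',
    'mode': 'Company Analysis',
})

# Fetch the results from the run's default dataset
dataset = client.dataset(run['defaultDatasetId']).list_items().items
print(dataset)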

Local Execution

import asyncio
from main import EnhancedCompanyResearchScraper

async def main():
    scraper = EnhancedCompanyResearchScraper(
        company_name="Notion",
        groq_api_key="your_key_here",  # Optional
    )
    results = await scraper.run_complete_research()

    # Save outputs
    scraper.save_to_json(results, "notion_research.json")
    scraper.save_analysis_text(results, "notion_analysis.txt")

asyncio.run(main())

📝 Input Schema

Required Fields

| Field | Type | Description | Example |
|---|---|---|---|
| company_name | string | Name of company, product, or topic | "Stripe", "React Framework" |
| mode | select | Research type | "Company Analysis" or "General Topic Research" |

Optional Fields

| Field | Type | Description | Default |
|---|---|---|---|
| groq_api_key | string | Your Groq API key | Uses fallback key |

Input Examples

Company Analysis:

{
  "company_name": "Figma",
  "mode": "Company Analysis"
}

Topic Research:

{
  "company_name": "Graph RAG",
  "mode": "General Topic Research"
}
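
Before starting a run, it can help to check an input object against the fields documented above. The following is a minimal sketch, not part of the Actor: `validate_input` and `VALID_MODES` are hypothetical names, and the rules encode only what the input schema states (two required fields, one optional key).

```python
# Allowed values for "mode", taken from the input schema above.
VALID_MODES = ("Company Analysis", "General Topic Research")


def validate_input(actor_input: dict) -> dict:
    """Return a cleaned copy of the input, or raise ValueError if it is invalid."""
    name = actor_input.get("company_name")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("company_name must be a non-empty string")

    mode = actor_input.get("mode")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")

    cleaned = {"company_name": name.strip(), "mode": mode}

    # groq_api_key is optional; pass it along only when one was supplied.
    key = actor_input.get("groq_api_key")
    if key:
        cleaned["groq_api_key"] = key
    return cleaned
```

Failing fast on a misspelled mode avoids spending a multi-minute run on an input the Actor cannot interpret.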

📤 Output Schema

Output Structure

{
  "metadata": {
    "research_date": "2024-12-06T10:30:00Z",
    "company_name": "Stripe",
    "scraper_version": "2.0",
    "analysis_model": "llama-3.3-70b-versatile"
  },
  "raw_data": {
    "website_data": { "homepage": {}, "pricing": {} },
    "pricing_tiers": [
      {
        "name": "Starter",
        "price": "$29",
        "billing_period": "month"
      }
    ],
    "features": [
      {
        "name": "Payment Processing",
        "description": "Accept payments worldwide"
      }
    ],
    "news_articles": [],
    "reviews": [],
    "social_mentions": [],
    "competitors": []
  },
  "ai_deep_analysis": {
    "company_overview": {
      "description": "Stripe is a financial infrastructure platform...",
      "industry": "Financial Technology",
      "target_market": "Online businesses and platforms",
      "value_proposition": "Easy-to-integrate payment processing",
      "business_model": "Transaction fees and subscriptions"
    },
    "product_analysis": {},
    "pricing_strategy": {},
    "market_position": {},
    "competitive_landscape": {},
    "customer_sentiment": {},
    "strengths_weaknesses": {
      "top_5_strengths": ["Best-in-class API", "Strong brand", "..."],
      "top_5_weaknesses": ["Premium pricing", "Complex setup", "..."],
      "opportunities": ["B2B expansion", "Crypto integration"],
      "threats": ["Regulatory changes", "Increased competition"]
    },
    "strategic_insights": {},
    "actionable_recommendations": [
      "Consider Stripe for global payment processing",
      "Evaluate pricing for low-volume businesses"
    ],
    "executive_summary": "Stripe is a leading financial infrastructure...",
    "risk_score": {
      "overall_score": "35",
      "risk_factors": ["Regulatory compliance", "Account stability"],
      "confidence_level": "high"
    }
  },
  "metrics": {
    "pages_scraped": 8,
    "pricing_tiers_found": 4,
    "features_identified": 15,
    "news_articles": 10,
    "reviews_collected": 8,
    "competitors_identified": 12
  }
}
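
A dataset item with the structure above can be reduced to a few headline fields for reporting. The sketch below is illustrative only: `summarize` is a hypothetical helper, and it assumes every key may be missing. Because the sample shows `overall_score` as a string, the sketch coerces it to an integer where possible.

```python
def summarize(result: dict) -> dict:
    """Pull headline fields out of one research result, tolerating missing keys."""
    analysis = result.get("ai_deep_analysis", {})
    swot = analysis.get("strengths_weaknesses", {})
    risk = analysis.get("risk_score", {})

    # The sample output stores the score as a string ("35"); convert defensively.
    try:
        score = int(risk.get("overall_score"))
    except (TypeError, ValueError):
        score = None

    return {
        "company": result.get("metadata", {}).get("company_name"),
        "risk_score": score,
        "risk_factors": risk.get("risk_factors", []),
        "strengths": swot.get("top_5_strengths", []),
        "threats": swot.get("threats", []),
        "pages_scraped": result.get("metrics", {}).get("pages_scraped", 0),
    }
```

Calling `summarize(item)` on each entry returned by the dataset client gives a compact record that is easier to compare across companies than the full JSON.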

🔬 Research Modes

1. Company Analysis Mode

Best For:

  • Competitive intelligence
  • Investment research
  • Partnership evaluation
  • Product comparison

Data Collected:

  • Website content, pricing, features
  • News and announcements
  • Customer reviews and ratings
  • Social media presence
  • Competitor landscape

AI Analysis:

  • Business model breakdown
  • Market positioning
  • SWOT analysis
  • Customer sentiment
  • Risk assessment

2. General Topic Research Mode

Best For:

  • Technology trends
  • Academic research
  • Learning new concepts
  • Industry analysis

Data Collected:

  • Authoritative sources
  • Latest developments
  • Community discussions
  • Implementation examples

AI Analysis:

  • Technical deep dive
  • State-of-the-art trends
  • Pros and cons
  • Future outlook
  • Learning pathway

🛠️ Technical Stack

| Component | Technology | Purpose |
|---|---|---|
| Runtime | Python 3.11+ | Core execution environment |
| Web Scraping | Playwright | Headless browser automation |
| AI Analysis | Groq API | Llama 3.3 70B inference |
| Actor Framework | Apify SDK | Scalable execution & storage |
| Parsing | BeautifulSoup4, regex | Content extraction |

Dependencies

apify>=1.0.0
playwright>=1.40.0
groq>=0.4.0
requests>=2.31.0
rich>=13.0.0

📊 Examples

Example 1: Analyze a SaaS Company

{
  "company_name": "Notion",
  "mode": "Company Analysis"
}

Output Highlights:

  • 10 pages scraped
  • 3 pricing tiers identified
  • 25+ features extracted
  • 15 news articles collected
  • Risk Score: 25/100 (Low Risk)
  • Market Position: Strong growth trajectory

Example 2: Research a Technical Topic

{
  "company_name": "Retrieval Augmented Generation",
  "mode": "General Topic Research"
}

Output Highlights:

  • 3 authoritative sources analyzed
  • Technical architecture explained
  • Current trends identified
  • 5 use cases documented
  • Future outlook: High disruption potential

๐Ÿ› Troubleshooting

Common Issues

IssueSolution
"Company website not found"Verify company name is correct; use full name
"AI analysis failed"Check Groq API key validity; may be rate limited
"No pricing found"Pricing may be on request; check raw_data
Empty competitor listCompany may be too niche
Slow executionNormal; typically takes 2-5 minutes
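
When "AI analysis failed" is caused by rate limiting, retrying with exponential backoff is a common remedy. The sketch below shows the general pattern and is not the Actor's own retry logic: `call_with_backoff` and the `RateLimited` exception are hypothetical stand-ins, and in practice you would pass whatever exception type your API client raises on a rate-limit response.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for a client's rate-limit error."""


def call_with_backoff(fn, max_attempts=4, base_delay=1.0,
                      retry_on=(RateLimited,), sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus up to 0.5s of jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)
```

The `sleep` parameter exists so the wait can be replaced in tests; the jitter keeps several concurrent runs from retrying at the same instant.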

Performance

  • Total Runtime: 2-5 minutes
  • Pages Scraped: 5-10 per company
  • Output Size: 50-200 KB JSON
  • Memory Usage: 500MB - 1GB

💡 Best Practices

  1. Use Specific Names: Prefer "Stripe Payment Processing" over "Payment Company"
  2. Check Domain Accuracy: Ensure the company has a website
  3. Use Your Own API Key: Supplying a Groq API key gives higher rate limits than the fallback key
  4. Review Outputs: Verify the AI analysis before relying on it for critical decisions

📜 License

This project is licensed under the MIT License.


🎖️ Built For Apify Hackathon 2025

This Actor was developed as part of the Apify Hackathon 2025, showcasing the power of combining web scraping with advanced AI analysis for market intelligence.