Moneysmart Scraper
Pricing
from $0.01 / 1,000 results
Extract data from Moneysmart, including text content, search results, images, and external domains linked from pages.
Developer: anuj upadhyay
Moneysmart Scraper
Extract comprehensive financial data from Moneysmart.gov.au - Australia's premier financial guidance website
A powerful, feature-rich Apify Actor that extracts structured data from Moneysmart.gov.au including page content, search results, images, rich metadata, and external domains. Perfect for financial research, content analysis, SEO audits, and data collection.
Why Use This Actor?
Moneysmart.gov.au is the Australian Government's official financial guidance website, providing trusted information on banking, budgeting, investing, superannuation, and more. This Actor helps you:
- Research Financial Topics - Extract government guidance on loans, investments, and retirement
- Content Analysis - Analyze financial literacy resources and educational content
- SEO & Marketing - Study metadata, structured data, and linking patterns
- Media Collection - Download images and visual assets
- Link Discovery - Map external resources and citations
- Academic Research - Build datasets for financial education studies
Key Features
Smart Scraping Modes
- Search Query Mode - Search Moneysmart and extract results
- Direct URL Mode - Scrape specific pages by URL
- Crawl Mode - Follow internal links with depth control
Rich Data Extraction
- Page Content - Full text, headings (H1-H3), and structure
- Metadata - Title, description, keywords, author, publish dates
- Open Graph & Twitter Cards - Social media metadata
- JSON-LD - Structured data (Schema.org)
- Images - URLs, alt text, dimensions (optional download)
- External Domains - Track all outbound links
Performance & Reliability
- Fast - Uses CheerioCrawler (plain HTTP requests and HTML parsing, no headless browser), which the author rates at up to 10x faster than browser-based scraping
- Concurrent - Process multiple pages in parallel
- Reliable - Proxy support and error handling
- Flexible Export - JSON, CSV, Excel, or API
Input Configuration
Core Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| searchQuery | string | No* | "" | Search term to find pages (e.g., "home loans") |
| startUrls | array | No* | [] | List of specific URLs to scrape |
| maxPages | integer | No | 10 | Maximum pages to scrape (1-1000) |
| maxDepth | integer | No | 1 | Link following depth (0-5) |
*Either searchQuery OR startUrls must be provided
Feature Toggles
| Parameter | Type | Default | Description |
|---|---|---|---|
| downloadImages | boolean | false | Extract image URLs and metadata |
| saveImagesToDisk | boolean | false | Download actual image files to storage |
| collectExternalDomains | boolean | false | List all external websites linked |
| extractMetadata | boolean | true | Extract meta tags and structured data |
| extractSearchResults | boolean | true | Parse search result data |
| followLinks | boolean | false | Automatically follow internal links |
Advanced Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| proxyConfiguration | object | {"useApifyProxy": true} | Proxy settings |
| maxConcurrency | integer | 10 | Parallel requests (1-50) |
| pageLoadTimeoutSecs | integer | 60 | Page timeout (10-300 seconds) |
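The constraints in the tables above can be checked before a run is started. The following is a minimal sketch, not part of the Actor: `validate_input` and `RANGES` are hypothetical names, the ranges are copied from the parameter tables, and the image rule comes from the Troubleshooting section.

```python
from typing import Any

# Allowed ranges, copied from the parameter tables above.
RANGES = {
    "maxPages": (1, 1000),
    "maxDepth": (0, 5),
    "maxConcurrency": (1, 50),
    "pageLoadTimeoutSecs": (10, 300),
}


def validate_input(actor_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []

    # At least one of searchQuery or startUrls must be provided.
    if not actor_input.get("searchQuery") and not actor_input.get("startUrls"):
        problems.append("provide searchQuery or startUrls")

    # Only parameters that are present are range-checked; defaults apply otherwise.
    for name, (low, high) in RANGES.items():
        value = actor_input.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")

    # Saving image files needs both image flags (see Troubleshooting).
    if actor_input.get("saveImagesToDisk") and not actor_input.get("downloadImages"):
        problems.append("saveImagesToDisk requires downloadImages")

    return problems
```

Calling `validate_input({"searchQuery": "superannuation", "maxPages": 20})` returns an empty list, so that input can be passed to the Actor as-is.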
Usage Examples
Example 1: Search for Financial Topics
Search Moneysmart for "superannuation" and extract up to 20 pages:
{
  "searchQuery": "superannuation",
  "maxPages": 20,
  "extractMetadata": true,
  "collectExternalDomains": true
}
Example 2: Scrape Specific Pages with Images
Extract data from specific pages and download images:
{
  "startUrls": [
    { "url": "https://moneysmart.gov.au/home-loans" },
    { "url": "https://moneysmart.gov.au/budgeting" },
    { "url": "https://moneysmart.gov.au/superannuation" }
  ],
  "maxPages": 50,
  "downloadImages": true,
  "saveImagesToDisk": true,
  "extractMetadata": true
}
Example 3: Deep Crawl Banking Section
Start from banking page and crawl 2 levels deep:
{
  "startUrls": [
    { "url": "https://moneysmart.gov.au/banking" }
  ],
  "maxPages": 100,
  "maxDepth": 2,
  "followLinks": true,
  "downloadImages": false,
  "collectExternalDomains": true
}
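To see how maxDepth and maxPages bound a crawl like this one, here is a small simulation. It is a sketch under assumptions, not the Actor's code: it assumes a breadth-first visit order, `crawl_order` is a hypothetical helper, and `links` is an in-memory map of page to internal links that stands in for real HTTP requests.

```python
from collections import deque


def crawl_order(links, start_urls, max_pages, max_depth):
    """Return (url, depth) pairs in the order a breadth-first crawl would visit them."""
    queue = deque((url, 0) for url in start_urls)
    seen = set(start_urls)
    visited = []

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # pages at the depth limit are scraped but their links are not followed
        for link in links.get(url, []):
            if link not in seen:  # each page is queued at most once
                seen.add(link)
                queue.append((link, depth + 1))

    return visited
```

With maxDepth set to 2, the start page has depth 0, pages it links to have depth 1, and pages those link to have depth 2; nothing further is queued. maxPages stops the crawl earlier if the limit is reached first.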
Example 4: Full Site Crawl for SEO Analysis
Comprehensive site audit with metadata and external links:
{
  "startUrls": [
    { "url": "https://moneysmart.gov.au/" }
  ],
  "maxPages": 500,
  "maxDepth": 3,
  "followLinks": true,
  "extractMetadata": true,
  "collectExternalDomains": true,
  "downloadImages": false
}
Output Schema
This Actor provides three types of outputs organized for easy access:
1. Scraped Pages (Default Dataset)
All scraped pages are stored in the default dataset with comprehensive data for each page.
Access via:
- Apify Console: Output tab after run completion
- API: https://api.apify.com/v2/datasets/{datasetId}/items
- Template: {{links.apiDefaultDatasetUrl}}/items
2. External Domains (Key-Value Store)
List of all external websites linked from scraped pages (when collectExternalDomains is enabled).
Access via:
- API: https://api.apify.com/v2/key-value-stores/{kvStoreId}/records/EXTERNAL_DOMAINS
- Template: {{links.apiDefaultKeyValueStoreUrl}}/records/EXTERNAL_DOMAINS
3. Downloaded Images (Key-Value Store)
Image files downloaded from pages (when saveImagesToDisk is enabled).
Access via:
- API: https://api.apify.com/v2/key-value-stores/{kvStoreId}/keys
- Template: {{links.apiDefaultKeyValueStoreUrl}}/keys
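The three endpoints above differ only in the storage ID and the path suffix, so they can be built from a run's IDs. A minimal sketch follows; the helper names are hypothetical, and the optional `format` query parameter (for example `csv`) is the Apify dataset API's export option rather than anything specific to this Actor.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, **params):
    """URL for the scraped pages in the default dataset."""
    url = f"{API_BASE}/datasets/{quote(dataset_id)}/items"
    return f"{url}?{urlencode(params)}" if params else url


def kv_record_url(store_id, key):
    """URL for a single record, such as EXTERNAL_DOMAINS, in the key-value store."""
    return f"{API_BASE}/key-value-stores/{quote(store_id)}/records/{quote(key)}"


def kv_keys_url(store_id):
    """URL that lists the keys (including downloaded images) in the key-value store."""
    return f"{API_BASE}/key-value-stores/{quote(store_id)}/keys"
```

Private storages also need an API token on the request, as in the Integration Examples.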
Output Format
Each scraped page produces a rich JSON object with the following structure:
{
  "url": "https://moneysmart.gov.au/budgeting",
  "scrapedAt": "2025-12-25T13:42:59.974Z",
  "depth": 0,
  "title": "Budgeting | Moneysmart",
  "metaDescription": "Learn how to create and manage a budget...",
  "metaKeywords": "budget, money management, savings",
  "author": "Australian Government",
  "publishedDate": "2024-06-15",
  "canonical": "https://moneysmart.gov.au/budgeting",
  "textContent": "Full page text content (up to 10,000 chars)...",
  "headings": {
    "h1": ["Budgeting"],
    "h2": ["How to create a budget", "Track your spending"],
    "h3": ["Set financial goals", "Calculate income and expenses"]
  },
  "openGraph": {
    "title": "Budgeting | Moneysmart",
    "description": "Learn how to create and manage a budget...",
    "image": "https://moneysmart.gov.au/images/budgeting.jpg",
    "url": "https://moneysmart.gov.au/budgeting",
    "type": "article"
  },
  "twitter": {
    "card": "summary_large_image",
    "title": "Budgeting | Moneysmart",
    "description": "Learn how to create and manage a budget...",
    "image": "https://moneysmart.gov.au/images/budgeting.jpg"
  },
  "structuredData": [
    {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "Budgeting guide",
      "author": { "@type": "Organization", "name": "Moneysmart" }
    }
  ],
  "images": [
    {
      "url": "https://moneysmart.gov.au/images/calculator.jpg",
      "alt": "Budget calculator illustration",
      "title": "Calculate your budget",
      "width": "800",
      "height": "600"
    }
  ],
  "downloadedImages": ["image_1735123456789_0.jpg"],
  "externalDomains": ["www.ato.gov.au", "www.servicesaustralia.gov.au"]
}
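Nested fields such as headings and images do not map directly onto spreadsheet columns. The sketch below shows one way to reduce a page object to a flat row; `summarize_page` is a hypothetical helper, and it assumes optional fields (images, externalDomains) may be missing when the matching toggle is off.

```python
def summarize_page(item):
    """Reduce one dataset item to a flat row suitable for CSV export."""
    headings = item.get("headings") or {}
    return {
        "url": item.get("url"),
        "title": item.get("title"),
        "depth": item.get("depth", 0),
        "h1": "; ".join(headings.get("h1", [])),
        # Total number of H1-H3 headings found on the page.
        "heading_count": sum(len(values) for values in headings.values()),
        "image_count": len(item.get("images") or []),
        "external_domains": ", ".join(item.get("externalDomains") or []),
    }
```

Applied to the example object above, this yields a heading_count of 5 and an image_count of 1.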
Special Output Files
When collectExternalDomains is enabled, a separate file is created:
Key-Value Store: EXTERNAL_DOMAINS
[
  "www.ato.gov.au",
  "www.servicesaustralia.gov.au",
  "www.moneysmart.gov.au",
  "asic.gov.au"
]
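For readers who want to reproduce or post-process this list, the following sketch shows one way to derive external hostnames from a page's links. It is an assumption about the general technique, not the Actor's implementation: `external_domains` is a hypothetical helper, and it treats every subdomain of moneysmart.gov.au as internal, whereas the sample above lists www.moneysmart.gov.au, so the Actor's own rule may differ.

```python
from urllib.parse import urlparse


def external_domains(links, site_host="moneysmart.gov.au"):
    """Return the sorted, de-duplicated hostnames of links that leave the site."""
    domains = set()
    for link in links:
        parsed = urlparse(link)
        host = (parsed.hostname or "").lower()
        if parsed.scheme not in ("http", "https") or not host:
            continue  # skip relative links, mailto:, tel:, and similar
        if host == site_host or host.endswith("." + site_host):
            continue  # same site, including subdomains such as www
        domains.add(host)
    return sorted(domains)
```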
Integration Examples
JavaScript / Node.js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({
    token: 'YOUR_APIFY_TOKEN',
});

const input = {
    searchQuery: 'home loans',
    maxPages: 50,
    extractMetadata: true,
    collectExternalDomains: true,
};

// Start the Actor and wait for the run to finish
const run = await client.actor('YOUR_USERNAME/moneysmart-scraper').call(input);

// Fetch results from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.log(`${item.title}: ${item.url}`);
});
Python
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Prepare Actor input
run_input = {
    'startUrls': [{'url': 'https://moneysmart.gov.au/budgeting'}],
    'maxPages': 50,
    'downloadImages': True,
    'extractMetadata': True,
}

# Run the Actor and wait for it to finish
run = client.actor('YOUR_USERNAME/moneysmart-scraper').call(run_input=run_input)

# Fetch results from the run's default dataset
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(f"{item['title']}: {item['url']}")
cURL
curl -X POST "https://api.apify.com/v2/acts/YOUR_USERNAME~moneysmart-scraper/runs" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchQuery": "investment", "maxPages": 30, "extractMetadata": true}'
Use Cases
1. Financial Research & Analysis
Extract Australian Government financial guidance for research papers, reports, or market analysis.
2. Content Marketing & SEO
- Analyze metadata strategies
- Study structured data implementation
- Research keyword usage and content structure
- Discover linking patterns
3. Educational Content Development
Collect financial literacy resources for course development or training materials.
4. Competitive Intelligence
Monitor government financial guidance updates and trends.
5. Data Journalism
Build datasets for investigative journalism on financial topics.
6. Academic Research
Study financial education resources and their effectiveness.
Performance Tips
Maximize Speed
{
  "maxConcurrency": 20,
  "downloadImages": false,
  "saveImagesToDisk": false
}
Maximize Data Richness
{
  "extractMetadata": true,
  "downloadImages": true,
  "collectExternalDomains": true,
  "followLinks": true
}
Balance Speed & Data
{
  "maxConcurrency": 10,
  "extractMetadata": true,
  "downloadImages": true,
  "saveImagesToDisk": false
}
Best Practices
Respectful Scraping
- Uses reasonable delays between requests
- Respects server capacity with appropriate concurrency
- Follows robots.txt guidelines
Data Quality
- Validates and cleans extracted data
- Handles missing elements gracefully
- Provides structured, consistent output
Reliability
- Implements retry strategies
- Handles errors without crashing
- Provides detailed logging
Troubleshooting
Issue: No results returned
Solution: Verify your search query or URLs are valid. Try simpler search terms.
Issue: Images not downloading
Solution: Enable both downloadImages: true AND saveImagesToDisk: true
Issue: Too many/few pages scraped
Solution: Adjust maxPages and maxDepth parameters
Issue: Timeout errors
Solution: Increase pageLoadTimeoutSecs or reduce maxConcurrency
Issue: Proxy warnings
Solution: This is normal for free accounts. Upgrade for proxy access, or set useApifyProxy: false in proxyConfiguration.
Performance Metrics
Based on testing with standard configuration:
- Speed: Up to 174 pages/minute
- Success Rate: 100% (0 failures in testing)
- Avg Response Time: ~1.1 seconds per page
- Concurrency: Handles 10+ parallel requests efficiently
- Data Quality: Complete metadata extraction
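The quoted speed gives a rough lower bound on run time. The arithmetic below is a back-of-the-envelope sketch only: `estimate_minutes` is a hypothetical helper, and it assumes the peak rate above is sustained for the whole run, which proxies, page size, and image downloads may prevent.

```python
def estimate_minutes(pages, pages_per_minute=174):
    """Best-case run time in minutes at the quoted peak throughput."""
    if pages_per_minute <= 0:
        raise ValueError("pages_per_minute must be positive")
    return pages / pages_per_minute
```

For the 500-page crawl in Example 4, this gives about 2.9 minutes at best.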
Supported Data Types
- HTML pages
- Search results
- Images (JPG, PNG, GIF, SVG)
- Metadata (Open Graph, Twitter Cards)
- Structured data (JSON-LD, Schema.org)
- External links
Notes & Limitations
- Rate Limiting: Use an appropriate maxConcurrency to avoid overwhelming servers
- Proxy: Free Apify accounts have proxy limitations (warning is normal)
- Storage: Large image downloads may consume storage quota
- Robots.txt: This Actor respects Moneysmart's robots.txt
- Terms of Service: Moneysmart.gov.au is a public Australian Government website
Built for Apify $1M Challenge
This Actor was created as part of the Apify $1M Developer Challenge to demonstrate:
- Advanced scraping techniques
- Rich data extraction capabilities
- Professional code quality
- Comprehensive documentation
- Real-world utility
License
ISC License - Free to use and modify
Support & Feedback
- Report Issues: Open an issue on GitHub
- Feature Requests: Submit your ideas
- Contact: Via Apify Console
- Documentation: Apify Docs
Resources
- Moneysmart Website: https://moneysmart.gov.au
- Apify Platform: https://apify.com
- Apify SDK Docs: https://docs.apify.com/sdk/js
- Crawlee Framework: https://crawlee.dev
Built with ❤️ for the Apify Community
Version 1.0.0 | Last Updated: December 25, 2025