Moneysmart Scraper

Pricing: from $0.01 / 1,000 results

Extract data from Moneysmart, including text content, search results, images, and external domains linked from pages.

Rating: 5.0 (1)

Developer: anuj upadhyay (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 2 days ago

💰 Moneysmart Scraper

Extract comprehensive financial data from Moneysmart.gov.au - Australia's premier financial guidance website

A powerful, feature-rich Apify Actor that extracts structured data from Moneysmart.gov.au including page content, search results, images, rich metadata, and external domains. Perfect for financial research, content analysis, SEO audits, and data collection.


🌟 Why Use This Actor?

Moneysmart.gov.au is the Australian Government's official financial guidance website, providing trusted information on banking, budgeting, investing, superannuation, and more. This Actor helps you:

  • 📊 Research Financial Topics - Extract government guidance on loans, investments, and retirement
  • 🔍 Content Analysis - Analyze financial literacy resources and educational content
  • 📈 SEO & Marketing - Study metadata, structured data, and linking patterns
  • 🖼️ Media Collection - Download images and visual assets
  • 🔗 Link Discovery - Map external resources and citations
  • 📚 Academic Research - Build datasets for financial education studies

🚀 Key Features

✨ Smart Scraping Modes

  • 🔎 Search Query Mode - Search Moneysmart and extract results
  • 🎯 Direct URL Mode - Scrape specific pages by URL
  • 🕸️ Crawl Mode - Follow internal links with depth control

📊 Rich Data Extraction

  • 📄 Page Content - Full text, headings (H1-H3), and structure
  • 🏷️ Metadata - Title, description, keywords, author, publish dates
  • 🌐 Open Graph & Twitter Cards - Social media metadata
  • 📋 JSON-LD - Structured data (Schema.org)
  • 🖼️ Images - URLs, alt text, dimensions (optional download)
  • 🔗 External Domains - Track all outbound links
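To make these fields concrete, here is a minimal Python sketch of pulling H1-H3 headings and Open Graph / Twitter Card tags out of HTML. It uses only the standard library and is an illustration, not the Actor's own code (the Actor is described as using CheerioCrawler); `PageFieldParser` and `extract_fields` are hypothetical names.

```python
from html.parser import HTMLParser


class PageFieldParser(HTMLParser):
    """Collects H1-H3 text and og:/twitter: meta tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.headings = {"h1": [], "h2": [], "h3": []}
        self.open_graph = {}
        self.twitter = {}
        self._current = None  # heading tag currently being read, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.headings:
            self._current, self._buffer = tag, []
        elif tag == "meta":
            # Open Graph uses property=, Twitter Cards usually use name=
            key = attrs.get("property") or attrs.get("name") or ""
            content = attrs.get("content") or ""
            if key.startswith("og:"):
                self.open_graph[key[3:]] = content
            elif key.startswith("twitter:"):
                self.twitter[key[8:]] = content

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse whitespace so nested inline tags do not leave gaps
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headings[tag].append(text)
            self._current = None


def extract_fields(html):
    parser = PageFieldParser()
    parser.feed(html)
    return {
        "headings": parser.headings,
        "openGraph": parser.open_graph,
        "twitter": parser.twitter,
    }
```

The returned dictionary mirrors the `headings`, `openGraph`, and `twitter` keys of the Actor's output, but the real extraction rules may differ.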

⚡ Performance & Reliability

  • 🚄 Fast - CheerioCrawler parses raw HTML without launching a browser, for up to 10x faster scraping
  • 🔄 Concurrent - Process multiple pages in parallel
  • 🛡️ Reliable - Proxy support and error handling
  • 💾 Flexible Export - JSON, CSV, Excel, or API

📥 Input Configuration

Core Parameters

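| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| searchQuery | string | No* | "" | Search term to find pages (e.g., "home loans") |
| startUrls | array | No* | [] | List of specific URLs to scrape |
| maxPages | integer | No | 10 | Maximum pages to scrape (1-1000) |
| maxDepth | integer | No | 1 | Link following depth (0-5) |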

*Either searchQuery OR startUrls must be provided
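The either/or rule and the documented ranges can be checked before a run is started. The following is a minimal sketch, assuming the limits listed for the core parameters; `validate_core_input` is a hypothetical helper, not part of the Actor or the Apify client.

```python
def validate_core_input(actor_input):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []

    # At least one of searchQuery / startUrls must be non-empty
    has_query = bool((actor_input.get("searchQuery") or "").strip())
    has_urls = bool(actor_input.get("startUrls"))
    if not (has_query or has_urls):
        problems.append("Provide searchQuery or startUrls")

    # Ranges as documented: maxPages 1-1000, maxDepth 0-5
    max_pages = actor_input.get("maxPages", 10)
    if not 1 <= max_pages <= 1000:
        problems.append("maxPages must be between 1 and 1000")

    max_depth = actor_input.get("maxDepth", 1)
    if not 0 <= max_depth <= 5:
        problems.append("maxDepth must be between 0 and 5")

    return problems
```

For example, `validate_core_input({"maxPages": 0})` reports both the missing query/URLs and the out-of-range page limit.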

Feature Toggles

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| downloadImages | boolean | false | Extract image URLs and metadata |
| saveImagesToDisk | boolean | false | Download actual image files to storage |
| collectExternalDomains | boolean | false | List all external websites linked |
| extractMetadata | boolean | true | Extract meta tags and structured data |
| extractSearchResults | boolean | true | Parse search result data |
| followLinks | boolean | false | Automatically follow internal links |

Advanced Settings

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| proxyConfiguration | object | {useApifyProxy: true} | Proxy settings |
| maxConcurrency | integer | 10 | Parallel requests (1-50) |
| pageLoadTimeoutSecs | integer | 60 | Page timeout (10-300 seconds) |
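Because most parameters are optional, it can help to see the effective configuration of a run. The sketch below merges a partial input over the defaults documented in the three parameter tables; `DOCUMENTED_DEFAULTS` and `with_defaults` are hypothetical names, and the Actor itself applies its defaults server-side.

```python
import copy

# Defaults as listed in the parameter tables of this README
DOCUMENTED_DEFAULTS = {
    "searchQuery": "",
    "startUrls": [],
    "maxPages": 10,
    "maxDepth": 1,
    "downloadImages": False,
    "saveImagesToDisk": False,
    "collectExternalDomains": False,
    "extractMetadata": True,
    "extractSearchResults": True,
    "followLinks": False,
    "proxyConfiguration": {"useApifyProxy": True},
    "maxConcurrency": 10,
    "pageLoadTimeoutSecs": 60,
}


def with_defaults(user_input):
    """Return the user's input layered over the documented defaults."""
    merged = copy.deepcopy(DOCUMENTED_DEFAULTS)  # never mutate the shared defaults
    merged.update(user_input)
    return merged
```

Calling `with_defaults({"searchQuery": "superannuation", "maxPages": 20})` keeps the two supplied values and fills in the rest.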

💡 Usage Examples

Example 1: Search for Financial Topics

Search Moneysmart for "superannuation" and extract up to 20 pages:

{
    "searchQuery": "superannuation",
    "maxPages": 20,
    "extractMetadata": true,
    "collectExternalDomains": true
}

Example 2: Scrape Specific Pages with Images

Extract data from specific pages and download images:

{
    "startUrls": [
        { "url": "https://moneysmart.gov.au/home-loans" },
        { "url": "https://moneysmart.gov.au/budgeting" },
        { "url": "https://moneysmart.gov.au/superannuation" }
    ],
    "maxPages": 50,
    "downloadImages": true,
    "saveImagesToDisk": true,
    "extractMetadata": true
}

Example 3: Deep Crawl Banking Section

Start from banking page and crawl 2 levels deep:

{
    "startUrls": [
        { "url": "https://moneysmart.gov.au/banking" }
    ],
    "maxPages": 100,
    "maxDepth": 2,
    "followLinks": true,
    "downloadImages": false,
    "collectExternalDomains": true
}
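To illustrate how `maxDepth` and `maxPages` interact in a crawl like this, here is a toy breadth-first traversal over an in-memory link graph. It is a sketch under assumptions: whether the Actor visits pages breadth-first is not stated, the start page is treated as depth 0 (matching the `depth` field in the sample output), and the paths in the usage note are made up.

```python
from collections import deque


def crawl_order(start_urls, links, max_pages, max_depth):
    """Return (url, depth) pairs in breadth-first visit order."""
    queue = deque((url, 0) for url in start_urls)
    seen = set(start_urls)
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # do not enqueue links beyond the depth limit
        for target in links.get(url, []):
            if target not in seen:  # each page is queued at most once
                seen.add(target)
                queue.append((target, depth + 1))
    return visited
```

With a hypothetical graph where `/banking` links to `/banking/savings` and `/banking/cards`, and `/banking/savings` links one level further, `max_depth=2` reaches the third level but nothing beyond it, while a small `max_pages` stops the crawl earlier regardless of depth.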

Example 4: Full Site Crawl for SEO Analysis

Comprehensive site audit with metadata and external links:

{
    "startUrls": [
        { "url": "https://moneysmart.gov.au/" }
    ],
    "maxPages": 500,
    "maxDepth": 3,
    "followLinks": true,
    "extractMetadata": true,
    "collectExternalDomains": true,
    "downloadImages": false
}

Output Schema

This Actor provides three types of outputs organized for easy access:

1. 📄 Scraped Pages (Default Dataset)

All scraped pages are stored in the default dataset with comprehensive data for each page.

Access via:

  • Apify Console: Output tab after run completion
  • API: https://api.apify.com/v2/datasets/{datasetId}/items
  • Template: {{links.apiDefaultDatasetUrl}}/items
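The dataset items endpoint accepts query parameters such as `format` and `limit`, which is how the JSON, CSV, and Excel exports can be requested over the API. A minimal sketch for building such a URL follows; `dataset_items_url` is a hypothetical helper, and depending on the dataset's access settings a token may also be required.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the items URL for a dataset, e.g. fmt='csv' or fmt='xlsx'."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"
```

For instance, `dataset_items_url("abc123", "csv", 100)` produces a URL that downloads the first 100 items as CSV (`abc123` is a placeholder ID).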

2. 🔗 External Domains (Key-Value Store)

List of all external websites linked from scraped pages (when collectExternalDomains is enabled).

Access via:

  • API: https://api.apify.com/v2/key-value-stores/{kvStoreId}/records/EXTERNAL_DOMAINS
  • Template: {{links.apiDefaultKeyValueStoreUrl}}/records/EXTERNAL_DOMAINS

3. 🖼️ Downloaded Images (Key-Value Store)

Image files downloaded from pages (when saveImagesToDisk is enabled).

Access via:

  • API: https://api.apify.com/v2/key-value-stores/{kvStoreId}/keys
  • Template: {{links.apiDefaultKeyValueStoreUrl}}/keys
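The key listing contains every record in the store, not only images, so image keys have to be picked out. The sketch below assumes keys follow the pattern of the one sample in this README, `image_1735123456789_0.jpg`, read here as `image_<millisecond timestamp>_<index>.<extension>`. That interpretation is an assumption, and `parse_image_key` and `image_keys` are hypothetical helpers.

```python
import re

# Pattern inferred from a single sample key; the real naming may differ
IMAGE_KEY = re.compile(r"^image_(\d+)_(\d+)\.(\w+)$")


def parse_image_key(key):
    """Split an image key into its parts, or return None for other keys."""
    match = IMAGE_KEY.match(key)
    if not match:
        return None
    timestamp_ms, index, extension = match.groups()
    return {
        "timestampMs": int(timestamp_ms),
        "index": int(index),
        "extension": extension.lower(),
    }


def image_keys(keys):
    """Keep only the keys that look like downloaded images."""
    return [key for key in keys if parse_image_key(key) is not None]
```

Each remaining key can then be fetched from `/records/{key}` on the same key-value store.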

📊 Output Format

Each scraped page produces a rich JSON object with the following structure:

{
    "url": "https://moneysmart.gov.au/budgeting",
    "scrapedAt": "2025-12-25T13:42:59.974Z",
    "depth": 0,
    "title": "Budgeting | Moneysmart",
    "metaDescription": "Learn how to create and manage a budget...",
    "metaKeywords": "budget, money management, savings",
    "author": "Australian Government",
    "publishedDate": "2024-06-15",
    "canonical": "https://moneysmart.gov.au/budgeting",
    "textContent": "Full page text content (up to 10,000 chars)...",
    "headings": {
        "h1": ["Budgeting"],
        "h2": ["How to create a budget", "Track your spending"],
        "h3": ["Set financial goals", "Calculate income and expenses"]
    },
    "openGraph": {
        "title": "Budgeting | Moneysmart",
        "description": "Learn how to create and manage a budget...",
        "image": "https://moneysmart.gov.au/images/budgeting.jpg",
        "url": "https://moneysmart.gov.au/budgeting",
        "type": "article"
    },
    "twitter": {
        "card": "summary_large_image",
        "title": "Budgeting | Moneysmart",
        "description": "Learn how to create and manage a budget...",
        "image": "https://moneysmart.gov.au/images/budgeting.jpg"
    },
    "structuredData": [
        {
            "@context": "https://schema.org",
            "@type": "Article",
            "headline": "Budgeting guide",
            "author": { "@type": "Organization", "name": "Moneysmart" }
        }
    ],
    "images": [
        {
            "url": "https://moneysmart.gov.au/images/calculator.jpg",
            "alt": "Budget calculator illustration",
            "title": "Calculate your budget",
            "width": "800",
            "height": "600"
        }
    ],
    "downloadedImages": ["image_1735123456789_0.jpg"],
    "externalDomains": [
        "www.ato.gov.au",
        "www.servicesaustralia.gov.au"
    ]
}
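For reporting, each page object can be reduced to a flat summary row. This is a minimal sketch that assumes items have the shape shown above and that optional keys (such as `images` or `externalDomains`) may be absent when the matching toggle is off; `summarize_page` is a hypothetical helper.

```python
def summarize_page(item):
    """Reduce one scraped page object to a flat row of counts."""
    headings = item.get("headings") or {}
    return {
        "url": item.get("url"),
        "title": item.get("title"),
        "depth": item.get("depth", 0),
        "headingCount": sum(len(headings.get(level, [])) for level in ("h1", "h2", "h3")),
        "imageCount": len(item.get("images") or []),
        "externalDomainCount": len(item.get("externalDomains") or []),
    }
```

Applied to the sample object, the row would report five headings, one image, and two external domains.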

Special Output Files

When collectExternalDomains is enabled, a separate file is created:

Key-Value Store: EXTERNAL_DOMAINS

[
    "www.ato.gov.au",
    "www.servicesaustralia.gov.au",
    "www.moneysmart.gov.au",
    "asic.gov.au"
]
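The per-page `externalDomains` lists can also be combined to see which outside sites are cited most often. The sketch below counts how many pages link to each domain, strips a leading `www.`, and skips the scraped site's own host, since the sample list above includes `www.moneysmart.gov.au`. The normalization rules and the `rank_external_domains` name are assumptions made for illustration.

```python
from collections import Counter


def rank_external_domains(items, site_host="moneysmart.gov.au"):
    """Count pages per external domain, most frequently linked first."""
    counts = Counter()
    for item in items:
        hosts = set()
        for domain in item.get("externalDomains") or []:
            host = domain.lower()
            if host.startswith("www."):
                host = host[4:]
            if host != site_host:  # the scraped site itself is not external
                hosts.add(host)
        counts.update(hosts)  # each page counts once per domain
    # Sort by count (descending), then name, for a stable order
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
```

The result is a list of `(domain, page_count)` pairs that can be written straight to CSV.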

🔧 Integration Examples

JavaScript / Node.js

// Run as an ES module (top-level await)
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({
    token: 'YOUR_APIFY_TOKEN',
});

const input = {
    searchQuery: 'home loans',
    maxPages: 50,
    extractMetadata: true,
    collectExternalDomains: true,
};

// Start the Actor and wait for it to finish
const run = await client.actor('YOUR_USERNAME/moneysmart-scraper').call(input);

// Fetch results from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.log(`${item.title}: ${item.url}`);
});

Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Prepare Actor input
run_input = {
    'startUrls': [
        {'url': 'https://moneysmart.gov.au/budgeting'}
    ],
    'maxPages': 50,
    'downloadImages': True,
    'extractMetadata': True,
}

# Run the Actor and wait for it to finish
run = client.actor('YOUR_USERNAME/moneysmart-scraper').call(run_input=run_input)

# Fetch results from the run's default dataset
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(f"{item['title']}: {item['url']}")

cURL

curl -X POST https://api.apify.com/v2/acts/YOUR_USERNAME~moneysmart-scraper/runs \
    -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
        "searchQuery": "investment",
        "maxPages": 30,
        "extractMetadata": true
    }'
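The same REST call can be prepared from Python without the Apify client. This sketch only builds the request object with the standard library; `build_run_request` is a hypothetical helper, and the `~` separator in the Actor ID follows the cURL example above.

```python
import json
from urllib.request import Request


def build_run_request(actor_id, token, actor_input):
    """Prepare (but do not send) the POST request that starts an Actor run."""
    # The REST path uses "username~actor-name" instead of "username/actor-name"
    url = f"https://api.apify.com/v2/acts/{actor_id.replace('/', '~')}/runs"
    return Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it. As with the cURL call, this endpoint starts the run rather than waiting for results, so the dataset still has to be fetched once the run has finished.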

🎯 Use Cases

1. Financial Research & Analysis

Extract Australian Government financial guidance for research papers, reports, or market analysis.

2. Content Marketing & SEO

  • Analyze metadata strategies
  • Study structured data implementation
  • Research keyword usage and content structure
  • Discover linking patterns

3. Educational Content Development

Collect financial literacy resources for course development or training materials.

4. Competitive Intelligence

Monitor government financial guidance updates and trends.

5. Data Journalism

Build datasets for investigative journalism on financial topics.

6. Academic Research

Study financial education resources and their effectiveness.


⚙️ Performance Tips

Maximize Speed

{
    "maxConcurrency": 20,
    "downloadImages": false,
    "saveImagesToDisk": false
}

Maximize Data Richness

{
    "extractMetadata": true,
    "downloadImages": true,
    "collectExternalDomains": true,
    "followLinks": true
}

Balance Speed & Data

{
    "maxConcurrency": 10,
    "extractMetadata": true,
    "downloadImages": true,
    "saveImagesToDisk": false
}
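These three fragments are partial inputs, so they need to be combined with a search query or start URLs before a run. One way to do that is sketched below; the preset names (`speed`, `richness`, `balanced`) and the `apply_preset` helper are hypothetical labels for the three profiles above.

```python
PRESETS = {
    "speed": {
        "maxConcurrency": 20,
        "downloadImages": False,
        "saveImagesToDisk": False,
    },
    "richness": {
        "extractMetadata": True,
        "downloadImages": True,
        "collectExternalDomains": True,
        "followLinks": True,
    },
    "balanced": {
        "maxConcurrency": 10,
        "extractMetadata": True,
        "downloadImages": True,
        "saveImagesToDisk": False,
    },
}


def apply_preset(base_input, preset_name):
    """Return a new input dict with the preset's values overriding the base."""
    return {**base_input, **PRESETS[preset_name]}
```

For example, `apply_preset({"searchQuery": "home loans", "maxPages": 50}, "speed")` keeps the query and page limit and adds the speed-oriented settings.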

🛡️ Best Practices

✅ Respectful Scraping

  • Uses reasonable delays between requests
  • Respects server capacity with appropriate concurrency
  • Follows robots.txt guidelines

✅ Data Quality

  • Validates and cleans extracted data
  • Handles missing elements gracefully
  • Provides structured, consistent output

✅ Reliability

  • Implements retry strategies
  • Handles errors without crashing
  • Provides detailed logging

🐛 Troubleshooting

Issue: No results returned

Solution: Verify your search query or URLs are valid. Try simpler search terms.

Issue: Images not downloading

Solution: Enable both downloadImages: true AND saveImagesToDisk: true

Issue: Too many/few pages scraped

Solution: Adjust maxPages and maxDepth parameters

Issue: Timeout errors

Solution: Increase pageLoadTimeoutSecs or reduce maxConcurrency
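As a sketch of that adjustment, the helper below doubles the timeout and halves the concurrency for a retry while staying inside the documented limits (10-300 seconds, 1-50 parallel requests). The doubling and halving factors are arbitrary choices for illustration, and `relax_for_timeouts` is a hypothetical name.

```python
def relax_for_timeouts(actor_input):
    """Return a copy of the input with a longer timeout and lower concurrency."""
    current_timeout = actor_input.get("pageLoadTimeoutSecs", 60)
    current_concurrency = actor_input.get("maxConcurrency", 10)
    return {
        **actor_input,
        "pageLoadTimeoutSecs": min(current_timeout * 2, 300),  # documented maximum
        "maxConcurrency": max(current_concurrency // 2, 1),    # documented minimum
    }
```

Starting from the defaults, one call yields a 120-second timeout with 5 parallel requests.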

Issue: Proxy warnings

Solution: This is expected on free Apify accounts, which have limited proxy access. Upgrade your plan, or set useApifyProxy to false in proxyConfiguration.


📈 Performance Metrics

Based on testing with standard configuration:

  • Speed: Up to 174 pages/minute
  • Success Rate: 100% (0 failures in testing)
  • Avg Response Time: ~1.1 seconds per page
  • Concurrency: Handles 10+ parallel requests efficiently
  • Data Quality: Complete metadata extraction
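These figures allow a rough back-of-the-envelope estimate of run time and cost. The sketch below assumes the peak rate of 174 pages/minute, the listed starting price of $0.01 per 1,000 results, and one dataset result per page; actual runs may be slower and priced differently, and `estimate_run` is a hypothetical helper.

```python
def estimate_run(pages, pages_per_minute=174, usd_per_1000_results=0.01):
    """Best-case duration and minimum cost for scraping `pages` pages."""
    minutes = pages / pages_per_minute
    usd = pages / 1000 * usd_per_1000_results
    return {"minutes": round(minutes, 1), "usd": round(usd, 4)}
```

Under these assumptions a 500-page crawl takes about 2.9 minutes at best and costs about half a cent.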

🌐 Supported Data Types

  • ✅ HTML pages
  • ✅ Search results
  • ✅ Images (JPG, PNG, GIF, SVG)
  • ✅ Metadata (Open Graph, Twitter Cards)
  • ✅ Structured data (JSON-LD, Schema.org)
  • ✅ External links

📝 Notes & Limitations

  • Rate Limiting: Use appropriate maxConcurrency to avoid overwhelming servers
  • Proxy: Free Apify accounts have proxy limitations (warning is normal)
  • Storage: Large image downloads may consume storage quota
  • Robots.txt: This Actor respects Moneysmart's robots.txt
  • Terms of Service: Moneysmart.gov.au is a public Australian Government website

🏆 Built for Apify $1M Challenge

This Actor was created as part of the Apify $1M Developer Challenge to demonstrate:

  • Advanced scraping techniques
  • Rich data extraction capabilities
  • Professional code quality
  • Comprehensive documentation
  • Real-world utility

📄 License

ISC License - Free to use and modify


🤝 Support & Feedback

  • 🐛 Report Issues: Open an issue on GitHub
  • 💡 Feature Requests: Submit your ideas
  • 📧 Contact: Via Apify Console
  • 📚 Documentation: Apify Docs


Built with ❤️ for the Apify Community

Version 1.0.0 | Last Updated: December 25, 2025


🌟 If this Actor helps you, please give it a star and share your feedback!