Webpage Link Extractor

Pricing

from $10.00 / 1,000 results

Extract all links from webpages with optional depth crawling

Rating: 0.0 (0)

Developer: Donny (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 1
  • Monthly active users: 0
  • Last modified: 16 hours ago

What does this actor do?

Webpage Link Extractor is an Apify actor that extracts all links from webpages, with optional depth crawling. It runs on the Apify platform and delivers structured data in JSON, CSV, or Excel formats that you can integrate into your workflows. For each link found, the actor records the source URL, the target URL, the anchor text, whether the link is external, and more. All results are stored in an Apify dataset that you can download or access through the Apify API.

Why use this actor?

Manually collecting this data would be extremely time-consuming and error-prone. Webpage Link Extractor automates the entire process, saving you hours of manual work. This actor is ideal for data analysts, researchers, marketers, and developers who need reliable, structured data. You can schedule regular runs to keep your data fresh, integrate results directly into spreadsheets or databases, and scale your data collection without any coding required. The actor handles pagination, rate limiting, and data normalization automatically.

How does it work?

This actor uses the Cheerio library to parse HTML pages fetched from the target website. It sends lightweight HTTP requests without rendering JavaScript, which makes it fast and resource-efficient, although links that are only added to a page by client-side scripts will not be seen. Starting from the given URL, the actor follows links up to the configured maxDepth and extracts structured data from each page using CSS selectors.
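The per-page extraction step described above can be sketched as follows. This is a minimal illustration only: it uses Python's standard html.parser in place of Cheerio, the names LinkParser and extract_links are hypothetical, and the choice to treat "external" as "different host than the source page" and to emit booleans for isExternal and isNofollow is an assumption, not something the actor's documentation confirms.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects <a href> links from one page (hypothetical helper)."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url
        self.links = []
        self._current = None  # link record being built while inside an <a> tag

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return  # anchors without a target are skipped
        target = urljoin(self.source_url, href)  # resolve relative links
        rel = (attrs.get("rel") or "").lower().split()
        self._current = {
            "sourceUrl": self.source_url,
            "targetUrl": target,
            "anchorText": "",
            # Assumption: "external" means the host differs from the source page.
            "isExternal": urlparse(target).netloc != urlparse(self.source_url).netloc,
            "isNofollow": "nofollow" in rel,
        }

    def handle_data(self, data):
        if self._current is not None:
            self._current["anchorText"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["anchorText"] = self._current["anchorText"].strip()
            self.links.append(self._current)
            self._current = None


def extract_links(html, source_url):
    """Return one record per link found in the given HTML string."""
    parser = LinkParser(source_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A depth crawl would repeat this step on each internal targetUrl until the depth limit is reached; that loop is left out here to keep the sketch short.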

Input parameters

Parameter   Type      Description                                         Default
url         string    Starting URL to extract links from                  None
maxDepth    integer   Maximum crawl depth (1 = only the starting page)    1
maxLinks    integer   Maximum number of links to extract                  1000
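As an illustration of how these parameters fit together, the sketch below builds an input object with the documented defaults. The helper name build_run_input and its validation rules are assumptions made for this example; the actor may accept or reject values differently.

```python
def build_run_input(url, max_depth=1, max_links=1000):
    """Build an input dict using the parameter names from the table above.

    The checks below are illustrative assumptions, not the actor's own rules.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    if max_depth < 1:
        raise ValueError("maxDepth must be at least 1 (1 = only the starting page)")
    if max_links < 1:
        raise ValueError("maxLinks must be at least 1")
    return {"url": url, "maxDepth": max_depth, "maxLinks": max_links}


# Example: crawl the start page plus the pages it links to, capped at 200 links.
run_input = build_run_input("https://example.com", max_depth=2, max_links=200)
```

With the official apify-client Python package, a dict like this could be passed as run_input when calling the actor, for example ApifyClient(token).actor(actor_id).call(run_input=run_input). The actor ID is not given on this page, so it is left as a placeholder.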

Output fields

Each item in the output dataset contains the following fields:

Field        Description   Format
sourceUrl    Source URL    text
targetUrl    Target URL    text
anchorText   Anchor Text   text
isExternal   External      text
isNofollow   Nofollow      text

Example output:

{
  "sourceUrl": "Sample Source URL",
  "targetUrl": "Sample Target URL",
  "anchorText": "Sample Anchor Text",
  "isExternal": "Sample External",
  "isNofollow": "Sample Nofollow"
}
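Once the dataset items are downloaded, a common next step is to filter them. The sketch below picks out external link targets. Because the table above lists every field as text and the sample values are placeholders, it is unclear whether isExternal arrives as a boolean or a string, so the hypothetical helper as_bool accepts both; the set of strings treated as true is an assumption.

```python
TRUE_STRINGS = {"true", "yes", "1"}  # assumed spellings of a "true" text value


def as_bool(value):
    """Interpret a flag that may be a real boolean or a text value."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in TRUE_STRINGS


def external_targets(items):
    """Return the targetUrl of every item flagged as external."""
    return [item["targetUrl"] for item in items if as_bool(item.get("isExternal"))]


# Hypothetical items shaped like the example output above.
items = [
    {"sourceUrl": "https://example.com/", "targetUrl": "https://example.com/about",
     "anchorText": "About", "isExternal": "false", "isNofollow": "false"},
    {"sourceUrl": "https://example.com/", "targetUrl": "https://other.example/",
     "anchorText": "Partner", "isExternal": True, "isNofollow": "true"},
]
external = external_targets(items)
```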

Cost and performance

This actor runs with a default memory allocation of 1024 MB. Because it uses lightweight HTTP requests, a run typically costs around $0.10-0.25 in Apify platform credits per 1,000 results. A typical run processing 100 results completes in 1-3 minutes. You can reduce costs by limiting the number of results with the maxLinks parameter and by scheduling runs during off-peak hours.
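The platform-credit figure above scales linearly with the number of results, which can be sketched as a quick estimate. The function name estimate_cost_range is hypothetical, and the calculation assumes the quoted $0.10-0.25 per 1,000 results holds at any volume; it does not include the listed store price of $10.00 per 1,000 results.

```python
def estimate_cost_range(results, low_per_1000=0.10, high_per_1000=0.25):
    """Return (low, high) platform-credit estimates in USD for a result count."""
    if results < 0:
        raise ValueError("results must be non-negative")
    scale = results / 1000
    return round(scale * low_per_1000, 4), round(scale * high_per_1000, 4)


# Estimate for a run capped at maxLinks = 5000.
low, high = estimate_cost_range(5000)
```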

Tips and best practices

  • Start with a small number of results to test your configuration before scaling up.
  • Use the Apify scheduling feature to automate regular data collection runs.
  • Export results in the format that best fits your workflow: JSON for APIs, CSV for spreadsheets, or Excel for reports.
  • Connect this actor with other actors on the Apify platform for more comprehensive data pipelines.
