Video Download Link Crawler

Pricing

Pay per event

Automatically discover and extract video download links from any website. Crawl through multiple pages, follow custom link patterns, and export results in JSON, CSV, HTML, or XML formats. Perfect for content creators, researchers, and media professionals.
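As a rough illustration of how the two regular expressions from the input schema could divide URLs into videos, pages to follow, and everything else, here is a minimal sketch. `classifyUrl` is a hypothetical helper written for this example and is not part of the Actor's source; the patterns are the `videoRegex` default and the `linkRegex` prefill from `.actor/input_schema.json`.

```javascript
// Illustrative only: classifyUrl is a hypothetical helper, not part of the Actor.
const videoRegex = new RegExp('\\.(mp4|avi|mov|mkv|webm|m4v)$', 'i'); // schema default
const linkRegex = new RegExp('sample-videos\\.com', 'i'); // schema prefill

function classifyUrl(url) {
    if (videoRegex.test(url)) return 'video'; // would be recorded as a download link
    if (linkRegex.test(url)) return 'follow'; // would be enqueued for crawling
    return 'skip'; // neither a video nor a page to follow
}

console.log(classifyUrl('https://sample-videos.com/video/big.MP4')); // video
console.log(classifyUrl('https://sample-videos.com/about')); // follow
console.log(classifyUrl('https://other.example/page')); // skip
// The default pattern is anchored with `$`, so a query string hides the extension:
console.log(classifyUrl('https://sample-videos.com/clip.mp4?x=1')); // follow
```

The last call suggests that sites serving videos behind query strings may need a looser pattern, for example one ending in `(\?.*)?$`.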

Developer

Rodrigo Franco

Maintained by Community

Actor stats

Last modified

4 months ago

.dockerignore

# configurations
.idea
.vscode
.zed

# crawlee and apify storage folders
apify_storage
crawlee_storage
storage

# installed files
node_modules

# git folder
.git

.editorconfig

root = true

[*]
indent_style = space
indent_size = 4
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
end_of_line = lf
quote_type = single

.gitignore

# This file tells Git which files shouldn't be added to source control

.DS_Store
.idea
.vscode
.zed
dist
node_modules
apify_storage
storage

# Added by Apify CLI
.venv

.prettierrc

{
    "printWidth": 120,
    "tabWidth": 4,
    "singleQuote": true
}

Dockerfile

# Specify the base Docker image. You can read more about
# the available images at https://docs.apify.com/sdk/js/docs/guides/docker-images
# You can also use any other image from Docker Hub.
FROM apify/actor-node:22

# Check preinstalled packages
RUN npm ls crawlee apify puppeteer playwright

# Copy just package.json and package-lock.json
# to speed up the build using Docker layer cache.
COPY package*.json ./

# Install NPM packages, skip optional and development dependencies to
# keep the image small. Avoid logging too much and print the dependency
# tree for debugging
RUN npm --quiet set progress=false \
    && npm install --omit=dev --omit=optional \
    && echo "Installed NPM packages:" \
    && (npm list --omit=dev --all || true) \
    && echo "Node.js version:" \
    && node --version \
    && echo "NPM version:" \
    && npm --version \
    && rm -r ~/.npm

# Next, copy the remaining files and directories with the source code.
# Since we do this after NPM install, quick build will be really fast
# for most source file changes.
COPY . ./

# Create and run as a non-root user.
RUN adduser -h /home/apify -D apify && \
    chown -R apify:apify ./
USER apify

# Run the image.
CMD npm start --silent

eslint.config.mjs

import prettier from 'eslint-config-prettier';

import apify from '@apify/eslint-config/js.js';

// eslint-disable-next-line import/no-default-export
export default [{ ignores: ['**/dist'] }, ...apify, prettier];

package.json

{
	"name": "video-download-crawler",
	"version": "0.0.1",
	"type": "module",
	"description": "This is a boilerplate of an Apify Actor.",
	"engines": {
		"node": ">=18.0.0"
	},
	"dependencies": {
		"apify": "^3.4.2",
		"crawlee": "^3.13.8"
	},
	"devDependencies": {
		"@apify/eslint-config": "^1.0.0",
		"eslint": "^9.29.0",
		"eslint-config-prettier": "^10.1.5",
		"prettier": "^3.5.3"
	},
	"scripts": {
		"start": "node src/main.js",
		"format": "prettier --write .",
		"format:check": "prettier --check .",
		"lint": "eslint",
		"lint:fix": "eslint --fix",
		"test": "echo \"Error: oops, the Actor has no tests yet, sad!\" && exit 1"
	},
	"author": "It's not you it's me",
	"license": "ISC"
}

.actor/actor.json

{
  "actorSpecification": 1,
  "name": "video-download-link-crawler",
  "title": "Video Download Link Crawler",
  "version": "1.0",
  "description": "Automatically discover and extract video download links from websites with customizable crawling patterns and export options.",
  "storages": {
    "dataset": {
      "actorSpecification": 1,
      "fields": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "videoUrl": {
            "type": "string",
            "format": "uri",
            "title": "Video Download URL",
            "description": "Direct download URL for the video file"
          },
          "sourceUrl": {
            "type": "string",
            "format": "uri",
            "title": "Source Page URL",
            "description": "URL of the webpage where the video link was found"
          },
          "title": {
            "type": "string",
            "title": "Video Title",
            "description": "Title or name of the video file"
          },
          "format": {
            "type": "string",
            "title": "Video Format",
            "description": "File format of the video (e.g., mp4, avi, mov, webm)",
            "enum": ["mp4", "avi", "mov", "mkv", "webm", "m4v", "flv", "hevc", "mpg", "m2ts", "ogv", "unknown"]
          },
          "fileSize": {
            "type": ["string", "null"],
            "title": "File Size",
            "description": "Size of the video file if available"
          },
          "foundAt": {
            "type": "string",
            "format": "date-time",
            "title": "Discovery Time",
            "description": "Timestamp when the video link was discovered"
          },
          "depth": {
            "type": "integer",
            "title": "Crawl Depth",
            "description": "Depth level at which the video was found (0 = start URL)",
            "minimum": 0
          },
          "eventNumber": {
            "type": ["integer", "string"],
            "title": "Event Number",
            "description": "The sequential event number when this video was discovered"
          },
          "note": {
            "type": ["string", "null"],
            "title": "Additional Notes",
            "description": "Additional information about how the video was discovered"
          }
        },
        "required": ["videoUrl", "sourceUrl", "title", "format", "foundAt", "depth"]
      },
      "views": {
        "overview": {
          "title": "Video Links Overview",
          "description": "Complete list of discovered video download links with metadata",
          "transformation": {
            "fields": [
              "eventNumber",
              "title",
              "videoUrl",
              "sourceUrl",
              "format",
              "fileSize",
              "foundAt",
              "depth",
              "note"
            ]
          },
          "display": {
            "component": "table",
            "properties": {
              "eventNumber": {
                "label": "Event #",
                "format": "text"
              },
              "title": {
                "label": "Video Title",
                "format": "text"
              },
              "videoUrl": {
                "label": "Download URL",
                "format": "link"
              },
              "sourceUrl": {
                "label": "Source Page",
                "format": "link"
              },
              "format": {
                "label": "Format",
                "format": "text"
              },
              "fileSize": {
                "label": "File Size",
                "format": "text"
              },
              "foundAt": {
                "label": "Discovered",
                "format": "date"
              },
              "depth": {
                "label": "Depth",
                "format": "number"
              },
              "note": {
                "label": "Notes",
                "format": "text"
              }
            }
          }
        },
        "by-format": {
          "title": "Videos by Format",
          "description": "Video links grouped by file format",
          "transformation": {
            "fields": [
              "format",
              "title",
              "videoUrl",
              "sourceUrl",
              "foundAt"
            ]
          },
          "display": {
            "component": "table",
            "properties": {
              "format": {
                "label": "Format",
                "format": "text"
              },
              "title": {
                "label": "Video Title",
                "format": "text"
              },
              "videoUrl": {
                "label": "Download URL",
                "format": "link"
              },
              "sourceUrl": {
                "label": "Source Page",
                "format": "link"
              },
              "foundAt": {
                "label": "Discovered",
                "format": "date"
              }
            }
          }
        },
        "summary": {
          "title": "Summary",
          "description": "Key video links without detailed metadata",
          "transformation": {
            "fields": [
              "title",
              "videoUrl",
              "format"
            ]
          },
          "display": {
            "component": "table",
            "properties": {
              "title": {
                "label": "Video Title",
                "format": "text"
              },
              "videoUrl": {
                "label": "Download URL",
                "format": "link"
              },
              "format": {
                "label": "Format",
                "format": "text"
              }
            }
          }
        }
      }
    }
  }
}

.actor/input_schema.json

{
    "title": "Video Download Link Crawler Input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrl": {
            "title": "Start URL",
            "type": "string",
            "description": "The URL where crawling will begin",
            "editor": "textfield",
            "pattern": "^https?://.*",
            "example": "https://sample-videos.com/",
            "prefill": "https://sample-videos.com/"
        },
        "videoRegex": {
            "title": "Video Detection Regex",
            "type": "string",
            "description": "Regular expression to identify video download links",
            "editor": "textfield",
            "default": "\\.(mp4|avi|mov|mkv|webm|m4v)$",
            "prefill": "\\.(mp4|avi|mov|mkv|webm|m4v)$",
            "example": "\\.(mp4|webm|mov)$"
        },
        "linkRegex": {
            "title": "Link Following Regex",
            "type": "string",
            "description": "Regular expression to match URLs to follow",
            "editor": "textfield",
            "default": ".*",
            "prefill": "sample-videos\\.com",
            "example": "example\\.com"
        },
        "maxCrawlDepth": {
            "title": "Maximum Crawl Depth",
            "type": "integer",
            "description": "Maximum depth of crawling (0 = start URL only)",
            "default": 2,
            "prefill": 2,
            "minimum": 0,
            "maximum": 5,
            "example": 2
        },
        "maxPages": {
            "title": "Maximum Pages (Events)",
            "type": "integer",
            "description": "Maximum number of pages to crawl. Each page = 1 billable event.",
            "default": 5,
            "prefill": 5,
            "minimum": 1,
            "maximum": 1000,
            "example": 50
        },
        "outputFormat": {
            "title": "Output Format",
            "type": "string",
            "description": "Export format for results",
            "enum": ["JSON", "CSV", "HTML", "XML"],
            "default": "JSON",
            "prefill": "JSON",
            "editor": "select"
        }
    },
    "required": ["startUrl"]
}
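The constraints in the schema above can also be mirrored in code, which may be handy when running the Actor locally without the platform's input validation. The following is a minimal sketch under that assumption; `normalizeInput`, `DEFAULTS`, `LIMITS`, and `FORMATS` are hypothetical names introduced for this example, while the values are copied from the schema.

```javascript
// Defaults, ranges, and allowed formats copied from .actor/input_schema.json.
const DEFAULTS = {
    videoRegex: '\\.(mp4|avi|mov|mkv|webm|m4v)$',
    linkRegex: '.*',
    maxCrawlDepth: 2,
    maxPages: 5,
    outputFormat: 'JSON',
};
const LIMITS = { maxCrawlDepth: [0, 5], maxPages: [1, 1000] };
const FORMATS = ['JSON', 'CSV', 'HTML', 'XML'];

// Hypothetical helper: applies schema defaults and rejects out-of-range values.
function normalizeInput(raw) {
    if (!raw || typeof raw.startUrl !== 'string' || !/^https?:\/\/.*/.test(raw.startUrl)) {
        throw new Error('startUrl is required and must start with http:// or https://');
    }
    const input = { ...DEFAULTS, ...raw };
    for (const [key, [min, max]] of Object.entries(LIMITS)) {
        if (!Number.isInteger(input[key]) || input[key] < min || input[key] > max) {
            throw new Error(`${key} must be an integer between ${min} and ${max}`);
        }
    }
    if (!FORMATS.includes(input.outputFormat)) {
        throw new Error(`outputFormat must be one of ${FORMATS.join(', ')}`);
    }
    return input;
}

// Example: only startUrl and maxPages are supplied, the rest falls back to defaults.
console.log(normalizeInput({ startUrl: 'https://sample-videos.com/', maxPages: 50 }));
```

Since each crawled page counts as one billable event, keeping `maxPages` small while experimenting limits the cost of a run.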

src/main.js

// Import required modules using ES6 syntax
import { Actor } from 'apify';
import { CheerioCrawler } from 'crawlee';

// Log a final summary of events, videos found, failures, and run duration
function logUsageSummary(eventsProcessed, videosFound, failedRequests, startTime, maxPages) {
    const endTime = Date.now();
    const duration = Math.round((endTime - startTime) / 1000);
    // Guard on eventsProcessed to avoid dividing by zero when no page succeeded
    const efficiency = eventsProcessed > 0 ? (videosFound / eventsProcessed).toFixed(2) : '0';

    console.log('\n🔍 === FINAL USAGE SUMMARY ===');
    console.log(`📊 Total Events Processed: ${eventsProcessed} / ${maxPages}`);
    console.log(`🎥 Videos Found: ${videosFound}`);
    console.log(`❌ Failed Requests: ${failedRequests}`);
    console.log(`⏱️  Duration: ${duration} seconds`);
    console.log(`🎯 Efficiency: ${efficiency} videos per event`);
    console.log(`💰 Billable Events: ${eventsProcessed}`);

    if (eventsProcessed < maxPages) {
        console.log(`💡 You used ${maxPages - eventsProcessed} fewer events than your limit`);
    }

    console.log('==============================\n');
}

// Real-time usage monitor: logs progress periodically and returns the interval handle
function startUsageMonitor(maxPages, getEventsProcessed) {
    const monitorInterval = setInterval(() => {
        const eventsProcessed = getEventsProcessed();
        if (eventsProcessed > 0) {
            const percentage = Math.round((eventsProcessed / maxPages) * 100);
            console.log(`📈 Usage: ${eventsProcessed}/${maxPages} events (${percentage}%)`);
        }
    }, 30000); // Log every 30 seconds

    return monitorInterval;
}

// Initialize the actor
await Actor.main(async () => {
    // Usage tracking variables
    let eventsProcessed = 0;
    let videosFound = 0;
    let failedRequests = 0;
    const startTime = Date.now();

    // Get input from user
    const input = await Actor.getInput();

    // Validate input
    if (!input || !input.startUrl) {
        throw new Error('Start URL is required');
    }

    // Set up default values (kept in sync with .actor/input_schema.json)
    const {
        startUrl,
        linkRegex = '.*',
        videoRegex = '\\.(mp4|avi|mov|mkv|webm|m4v)$',
        maxCrawlDepth = 2,
        maxPages = 5,
        outputFormat = 'JSON'
    } = input;

    // Usage tracking and warnings
    console.log('=== USAGE TRACKING ===');
    console.log(`Maximum pages to crawl: ${maxPages}`);
    console.log(`Each page = 1 billable event`);
    console.log(`Estimated maximum cost: ${maxPages} events`);
    console.log('=====================');

    console.log('Starting Video Download Link Crawler...');
    console.log(`Start URL: ${startUrl}`);
    console.log(`Video Regex: ${videoRegex}`);
    console.log(`Max Depth: ${maxCrawlDepth}`);
    console.log(`Max Pages: ${maxPages}`);

    // Create request queue
    const requestQueue = await Actor.openRequestQueue();

    // Add start URL to queue
    await requestQueue.addRequest({
        url: startUrl,
        userData: { depth: 0 }
    });

    // Set up dataset for results
    const dataset = await Actor.openDataset();

    // Configure crawler with usage tracking
    const crawler = new CheerioCrawler({
        requestQueue,
        maxRequestsPerCrawl: maxPages,
        async requestHandler({ request, $ }) {
            const { url } = request;
            const { depth } = request.userData;

            // Increment event counter
            eventsProcessed++;

            console.log(`Event #${eventsProcessed}: Processing ${url} (depth: ${depth})`);

            // Show progress every 10 events
            if (eventsProcessed % 10 === 0) {
                console.log(`📊 PROGRESS: ${eventsProcessed}/${maxPages} events processed`);
            }

            // Show a one-time warning when 80% of the limit is reached
            if (eventsProcessed === Math.ceil(maxPages * 0.8)) {
                console.log(`⚠️  WARNING: Approaching usage limit (${eventsProcessed}/${maxPages} events)`);
            }

            // Extract video links from current page
            const videoLinks = await extractVideoLinks($, url, videoRegex);

            console.log(`Found ${videoLinks.length} video links on ${url}`);
            videosFound += videoLinks.length;

            // Save video links to dataset
            for (const videoLink of videoLinks) {
                await dataset.pushData({
                    sourceUrl: url,
                    videoUrl: videoLink.url,
                    title: videoLink.title,
                    fileSize: videoLink.fileSize,
                    format: videoLink.format,
                    foundAt: new Date().toISOString(),
                    depth: depth,
                    eventNumber: eventsProcessed
                });
            }

            // Find and enqueue new links if within depth limit
            if (depth < maxCrawlDepth) {
                const links = await extractLinks($, url, linkRegex, videoRegex);

                console.log(`Found ${links.length} links to follow from ${url}`);

                for (const link of links) {
                    await requestQueue.addRequest({
                        url: link,
                        // Remember the referring page so failed requests can report it
                        userData: { depth: depth + 1, sourceUrl: url }
                    });
                }
            }
        },
        async failedRequestHandler({ request }) {
            failedRequests++;
            console.error(`❌ Request failed (not counted as event): ${request.url}`);
            console.log(`Failed requests so far: ${failedRequests}`);

            // Check if the failed request is actually a video file we should record
            const videoRegexPattern = new RegExp(videoRegex, 'i');
            if (videoRegexPattern.test(request.url)) {
                console.log(`🎥 Recording failed request as video link: ${request.url}`);

                // Extract the file name from the URL path, ignoring any query string
                const filename = new URL(request.url).pathname.split('/').pop() || request.url;
                const format = getVideoFormat(request.url);

                videosFound++;

                await dataset.pushData({
                    sourceUrl: request.userData.sourceUrl || 'Unknown',
                    videoUrl: request.url,
                    title: filename.replace(/\.[^/.]+$/, ''),
                    fileSize: null,
                    format: format,
                    foundAt: new Date().toISOString(),
                    depth: request.userData.depth || 0,
                    note: 'Found as direct video link',
                    eventNumber: 'N/A (failed request)'
                });
            }
        }
    });

    // Start usage monitor
    const usageMonitor = startUsageMonitor(maxPages, () => eventsProcessed);

    // Run the crawler; always clear the monitor, even if the run throws
    let finalStats;
    try {
        finalStats = await crawler.run();
    } finally {
        clearInterval(usageMonitor);
    }

    // The manually tracked counters are used for the usage summary
    console.log(`Debug: eventsProcessed = ${eventsProcessed}, failedRequests = ${failedRequests}`);

    // Log the crawler's own final statistics for reference
    console.log('Crawler stats for reference:', {
        requestsFinished: finalStats.requestsFinished,
        requestsFailed: finalStats.requestsFailed,
        requestsTotal: finalStats.requestsTotal
    });

    // Export results based on format
    const results = await dataset.getData();
    await exportResults(results.items, outputFormat, eventsProcessed, videosFound);

    // Log comprehensive usage summary
    logUsageSummary(eventsProcessed, videosFound, failedRequests, startTime, maxPages);

    console.log(`✅ Crawling completed! Found ${results.items.length} video links.`);
});

// Helper function to extract video links
async function extractVideoLinks($, baseUrl, videoRegex) {
    const videoLinks = [];
    const regex = new RegExp(videoRegex, 'i');

    try {
        // Check all links on the page
        $('a[href]').each((index, element) => {
            const href = $(element).attr('href');
            if (!href) return;

            try {
                const absoluteUrl = new URL(href, baseUrl).href;

                if (regex.test(absoluteUrl)) {
                    videoLinks.push({
                        url: absoluteUrl,
                        title: $(element).text().trim() || $(element).attr('title') || 'Unknown',
                        fileSize: null,
                        format: getVideoFormat(absoluteUrl)
                    });
                }
            } catch (urlError) {
                console.warn(`Invalid URL: ${href}`);
            }
        });

        // Check for video elements
        $('video source[src], video[src]').each((index, element) => {
            const src = $(element).attr('src');
            if (!src) return;

            try {
                const absoluteUrl = new URL(src, baseUrl).href;

                if (regex.test(absoluteUrl)) {
                    videoLinks.push({
                        url: absoluteUrl,
                        // Use the title of the enclosing <video>, not the first one on the page
                        title: $(element).closest('video').attr('title') || 'Video',
                        fileSize: null,
                        format: getVideoFormat(absoluteUrl)
                    });
                }
            } catch (urlError) {
                console.warn(`Invalid video URL: ${src}`);
            }
        });

        // Check for direct video links in download buttons or specific patterns
        $('a[href*="download"], a[href*="sample"], a[href*="video"]').each((index, element) => {
            const href = $(element).attr('href');
            if (!href) return;

            try {
                const absoluteUrl = new URL(href, baseUrl).href;

                if (regex.test(absoluteUrl)) {
                    const linkText = $(element).text().trim();
                    const title = linkText || $(element).attr('title') || $(element).attr('alt') || 'Video File';

                    videoLinks.push({
                        url: absoluteUrl,
                        title: title,
                        fileSize: null,
                        format: getVideoFormat(absoluteUrl)
                    });
                }
            } catch (urlError) {
                console.warn(`Invalid video link URL: ${href}`);
            }
        });

        // Look for embedded videos or iframes
        $('iframe[src*="video"], embed[src*="video"]').each((index, element) => {
            const src = $(element).attr('src');
            if (!src) return;

            try {
                const absoluteUrl = new URL(src, baseUrl).href;

                if (regex.test(absoluteUrl)) {
                    videoLinks.push({
                        url: absoluteUrl,
                        title: $(element).attr('title') || 'Embedded Video',
                        fileSize: null,
                        format: getVideoFormat(absoluteUrl)
                    });
                }
            } catch (urlError) {
                console.warn(`Invalid embedded video URL: ${src}`);
            }
        });

    } catch (error) {
        console.error('Error extracting video links:', error);
    }

    // Remove duplicates based on URL (the first occurrence wins)
    const uniqueVideos = [];
    const seenUrls = new Set();

    for (const video of videoLinks) {
        if (!seenUrls.has(video.url)) {
            seenUrls.add(video.url);
            uniqueVideos.push(video);
        }
    }

    return uniqueVideos;
}

// Helper function to extract links to follow
async function extractLinks($, baseUrl, linkRegex, videoRegex) {
    const links = [];
    const regex = new RegExp(linkRegex, 'i');
    const videoRegexPattern = new RegExp(videoRegex, 'i');

    try {
        $('a[href]').each((index, element) => {
            const href = $(element).attr('href');
            if (!href) return;

            try {
                const absoluteUrl = new URL(href, baseUrl).href;

                // If it's a video file, don't try to crawl it
                if (videoRegexPattern.test(absoluteUrl)) {
                    console.log(`Found direct video link (not crawling): ${absoluteUrl}`);
                    // Video links are recorded by extractVideoLinks
                    return;
                }

                // Only follow HTML pages and directories
                if (regex.test(absoluteUrl) && absoluteUrl.startsWith('http')) {
                    // Avoid crawling direct file downloads
                    const urlPath = new URL(absoluteUrl).pathname;
                    if (!urlPath.match(/\.(pdf|zip|exe|dmg|pkg|deb|rpm)$/i)) {
                        links.push(absoluteUrl);
                    }
                }
            } catch (urlError) {
                console.warn(`Invalid link URL: ${href}`);
            }
        });
    } catch (error) {
        console.error('Error extracting links:', error);
    }

    return [...new Set(links)]; // Remove duplicates
}

// Helper function to get video format from the file extension in the URL path
function getVideoFormat(url) {
    let pathname;
    try {
        // Use only the path so the host name, query string, and fragment are ignored
        pathname = new URL(url).pathname;
    } catch (urlError) {
        pathname = url.split(/[?#]/)[0];
    }
    const match = pathname.match(/\.([a-z0-9]+)$/i);
    return match ? match[1].toLowerCase() : 'unknown';
}

// Enhanced export function with usage metadata
async function exportResults(results, format, eventsProcessed, videosFound) {
    try {
        // Add usage metadata to results
        const metadata = {
            totalEvents: eventsProcessed,
            totalVideos: results.length,
            videosFound: videosFound,
            exportedAt: new Date().toISOString(),
            format: format
        };

        // Set an explicit content type for text outputs so the record is stored
        // as CSV/HTML/XML text rather than as a JSON value
        switch (format) {
            case 'CSV':
                await Actor.setValue('OUTPUT.csv', convertToCSV(results), {
                    contentType: 'text/csv; charset=utf-8'
                });
                break;
            case 'HTML':
                await Actor.setValue('OUTPUT.html', convertToHTML(results, metadata), {
                    contentType: 'text/html; charset=utf-8'
                });
                break;
            case 'XML':
                await Actor.setValue('OUTPUT.xml', convertToXML(results, metadata), {
                    contentType: 'application/xml; charset=utf-8'
                });
                break;
            default:
                await Actor.setValue('OUTPUT.json', results);
        }

        // Save the usage summary for every format, so the log message below is accurate
        await Actor.setValue('USAGE_SUMMARY.json', metadata);

        console.log(`✅ Results exported in ${format} format`);
        console.log(`📊 Usage summary saved as USAGE_SUMMARY.json`);
    } catch (error) {
        console.error('Error exporting results:', error);
    }
}

// CSV conversion function
function convertToCSV(data) {
    if (!data.length) return '';

    // Collect headers from every row, since optional fields such as `note`
    // only appear on some records
    const headers = [...new Set(data.flatMap((row) => Object.keys(row)))];

    // Quote every value; `??` keeps valid falsy values such as depth 0
    const toCell = (value) => `"${String(value ?? '').replace(/"/g, '""')}"`;

    const csvContent = [
        headers.join(','),
        ...data.map((row) => headers.map((header) => toCell(row[header])).join(','))
    ].join('\n');

    return csvContent;
}

// Escape text for safe interpolation into HTML or XML
function escapeMarkup(value) {
    return String(value ?? '')
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Wrap text in a CDATA section, splitting any "]]>" so it cannot end the section early
function cdata(value) {
    return `<![CDATA[${String(value ?? '').replace(/\]\]>/g, ']]]]><![CDATA[>')}]]>`;
}

// HTML conversion function with metadata (scraped values are escaped before output)
function convertToHTML(data, metadata) {
    const htmlContent = `
<!DOCTYPE html>
<html>
<head>
    <title>Video Download Links</title>
    <style>
        body { font-family: Arial, sans-serif; margin: 20px; }
        .metadata { background: #f5f5f5; padding: 15px; border-radius: 5px; margin-bottom: 20px; }
        table { border-collapse: collapse; width: 100%; }
        th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
        th { background-color: #f2f2f2; }
        a { color: #0066cc; text-decoration: none; }
        a:hover { text-decoration: underline; }
    </style>
</head>
<body>
    <h1>Video Download Links</h1>

    <div class="metadata">
        <h3>Usage Summary</h3>
        <p><strong>Total Events:</strong> ${escapeMarkup(metadata.totalEvents)}</p>
        <p><strong>Videos Found:</strong> ${escapeMarkup(metadata.videosFound)}</p>
        <p><strong>Export Date:</strong> ${escapeMarkup(metadata.exportedAt)}</p>
        <p><strong>Format:</strong> ${escapeMarkup(metadata.format)}</p>
    </div>

    <p>Total videos found: ${data.length}</p>
    <table>
        <thead>
            <tr>
                <th>Event #</th>
                <th>Title</th>
                <th>Video URL</th>
                <th>Source URL</th>
                <th>Format</th>
                <th>File Size</th>
                <th>Found At</th>
                <th>Depth</th>
                <th>Notes</th>
            </tr>
        </thead>
        <tbody>
            ${data.map(item => `
                <tr>
                    <td>${escapeMarkup(item.eventNumber ?? 'N/A')}</td>
                    <td>${escapeMarkup(item.title || 'Unknown')}</td>
                    <td><a href="${escapeMarkup(item.videoUrl)}" target="_blank">${escapeMarkup(item.videoUrl)}</a></td>
                    <td><a href="${escapeMarkup(item.sourceUrl)}" target="_blank">${escapeMarkup(item.sourceUrl)}</a></td>
                    <td>${escapeMarkup(item.format)}</td>
                    <td>${escapeMarkup(item.fileSize || 'Unknown')}</td>
                    <td>${escapeMarkup(item.foundAt)}</td>
                    <td>${escapeMarkup(item.depth)}</td>
                    <td>${escapeMarkup(item.note || '')}</td>
                </tr>
            `).join('')}
        </tbody>
    </table>
</body>
</html>
    `;

    return htmlContent;
}

// XML conversion function with metadata
function convertToXML(data, metadata) {
    const xmlContent = `<?xml version="1.0" encoding="UTF-8"?>
<videoResults>
    <metadata>
        <totalEvents>${escapeMarkup(metadata.totalEvents)}</totalEvents>
        <videosFound>${escapeMarkup(metadata.videosFound)}</videosFound>
        <exportedAt>${escapeMarkup(metadata.exportedAt)}</exportedAt>
        <format>${escapeMarkup(metadata.format)}</format>
    </metadata>
    <videos count="${data.length}">
        ${data.map(item => `
        <video>
            <eventNumber>${escapeMarkup(item.eventNumber ?? 'N/A')}</eventNumber>
            <title>${cdata(item.title || 'Unknown')}</title>
            <videoUrl>${cdata(item.videoUrl)}</videoUrl>
            <sourceUrl>${cdata(item.sourceUrl)}</sourceUrl>
            <format>${escapeMarkup(item.format)}</format>
            <fileSize>${escapeMarkup(item.fileSize || 'Unknown')}</fileSize>
            <foundAt>${escapeMarkup(item.foundAt)}</foundAt>
            <depth>${escapeMarkup(item.depth)}</depth>
            <note>${cdata(item.note || '')}</note>
        </video>
        `).join('')}
    </videos>
</videoResults>`;

    return xmlContent;
}
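
The crawl loop in src/main.js amounts to a depth-limited traversal capped by `maxPages`. The sketch below shows that idea on an in-memory link graph with no network access. It is a simplified, sequential model: the `crawl` function and the `graph` data are hypothetical and written for this example only, whereas the real Actor relies on Crawlee's request queue and processes requests concurrently.

```javascript
// Simplified model of the crawl: breadth-first, depth-limited, capped at maxPages.
// `graph` maps each page URL to the links found on it (hypothetical sample data).
function crawl(graph, startUrl, maxCrawlDepth, maxPages) {
    const queue = [{ url: startUrl, depth: 0 }];
    const seen = new Set([startUrl]); // plays the role of request-queue deduplication
    const visited = [];

    while (queue.length > 0 && visited.length < maxPages) {
        const { url, depth } = queue.shift();
        visited.push({ url, depth }); // one processed page = one event

        // Links are only enqueued while the current page is above the depth limit
        if (depth >= maxCrawlDepth) continue;
        for (const next of graph[url] ?? []) {
            if (!seen.has(next)) {
                seen.add(next);
                queue.push({ url: next, depth: depth + 1 });
            }
        }
    }
    return visited;
}

const graph = {
    '/': ['/a', '/b'],
    '/a': ['/a/1', '/'],
    '/b': ['/b/1'],
    '/a/1': ['/deep'],
};

console.log(crawl(graph, '/', 1, 10).map((page) => page.url)); // [ '/', '/a', '/b' ]
console.log(crawl(graph, '/', 2, 4).map((page) => page.url)); // [ '/', '/a', '/b', '/a/1' ]
```

In the second call the page limit, not the depth limit, ends the crawl: `/b/1` is queued but never processed.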
