# PDF Text Extraction From URLs

**Use case:** Extract text and metadata from direct PDF URLs with configurable concurrency, timeout, and per-page text output.

## Input

```json
{
  "urls": [
    "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
    "https://www.orimi.com/pdf-test.pdf"
  ],
  "maxConcurrency": 3,
  "timeoutPerPdfSecs": 60,
  "includePages": true
}
```
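
One way to submit this input programmatically is through the Apify REST API. The sketch below is a minimal, standard-library-only illustration, not the Actor's official client code. It assumes an API token in an `APIFY_TOKEN` environment variable and uses the synchronous `run-sync-get-dataset-items` endpoint; the helper names `build_input` and `build_request` are hypothetical and exist only for this example.

```python
import json
import os
import urllib.request

# Actor ID taken from the link in this README; the REST API writes "/" as "~".
ACTOR_ID = "automation-lab~pdf-text-extractor"


def build_input(urls, max_concurrency=3, timeout_secs=60, include_pages=True):
    """Return the Actor input shown above as a plain dict."""
    if not urls:
        raise ValueError("at least one PDF URL is required")
    return {
        "urls": list(urls),
        "maxConcurrency": max_concurrency,
        "timeoutPerPdfSecs": timeout_secs,
        "includePages": include_pages,
    }


def build_request(run_input, token):
    """Prepare (but do not send) the POST request that starts a synchronous run."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    run_input = build_input([
        "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
        "https://www.orimi.com/pdf-test.pdf",
    ])
    token = os.environ.get("APIFY_TOKEN")  # assumption: token supplied via env var
    if token:
        # Sends the request and waits for the run to finish.
        with urllib.request.urlopen(build_request(run_input, token), timeout=330) as resp:
            items = json.load(resp)
        for item in items:
            print(item.get("fileName"), item.get("pageCount"))
    else:
        # Without a token, just show the payload that would be sent.
        print(json.dumps(run_input, indent=2))
```

The synchronous endpoint keeps the example short, but it only suits runs that finish within the API's synchronous time limit; for larger URL lists, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.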

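## Output

The block below describes the fields of each result rather than a sample record: every key is an output field, listed with its display label and format.
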
```json
{
  "url": {
    "label": "URL",
    "format": "link"
  },
  "fileName": {
    "label": "File Name",
    "format": "text"
  },
  "pageCount": {
    "label": "Pages",
    "format": "number"
  },
  "title": {
    "label": "Title",
    "format": "text"
  },
  "author": {
    "label": "Author",
    "format": "text"
  },
  "subject": {
    "label": "Subject",
    "format": "text"
  },
  "creator": {
    "label": "Creator",
    "format": "text"
  },
  "creationDate": {
    "label": "Created",
    "format": "text"
  },
  "modificationDate": {
    "label": "Modified",
    "format": "text"
  },
  "fileSizeBytes": {
    "label": "Size (bytes)",
    "format": "number"
  },
  "pdfVersion": {
    "label": "PDF Version",
    "format": "text"
  },
  "fullText": {
    "label": "Full Text",
    "format": "text"
  }
}
```
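
The field list above can double as a lightweight sanity check on returned items. The following sketch maps each field to its declared format and flags values of an unexpected type. It rests on two assumptions that this README does not confirm: that every field is present in each item, and that missing metadata is reported as `null`. The `check_item` helper and the sample record are hypothetical, and the sample values are made up for illustration.

```python
# Field name -> declared format, copied from the schema above.
FIELD_FORMATS = {
    "url": "link",
    "fileName": "text",
    "pageCount": "number",
    "title": "text",
    "author": "text",
    "subject": "text",
    "creator": "text",
    "creationDate": "text",
    "modificationDate": "text",
    "fileSizeBytes": "number",
    "pdfVersion": "text",
    "fullText": "text",
}


def check_item(item):
    """Return a list of problems found in one dataset item (empty if none)."""
    problems = []
    for field, fmt in FIELD_FORMATS.items():
        if field not in item:
            problems.append(f"{field}: missing")
            continue
        value = item[field]
        if value is None:
            # Assumption: absent metadata (e.g. no author) arrives as null.
            continue
        if fmt == "number":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{field}: expected number, got {type(value).__name__}")
        elif not isinstance(value, str):
            problems.append(f"{field}: expected {fmt}, got {type(value).__name__}")
        elif fmt == "link" and not value.startswith(("http://", "https://")):
            problems.append(f"{field}: not an http(s) URL")
    return problems


# Hypothetical record with made-up values, for illustration only.
sample = {
    "url": "https://example.com/report.pdf",
    "fileName": "report.pdf",
    "pageCount": 2,
    "title": "Example Report",
    "author": None,
    "subject": None,
    "creator": "Example Writer",
    "creationDate": "2024-01-01",
    "modificationDate": "2024-01-02",
    "fileSizeBytes": 12345,
    "pdfVersion": "1.4",
    "fullText": "Example text.",
}

print(check_item(sample))
```

A check like this is most useful as a guard before loading results into a typed store, where a string in `pageCount` or `fileSizeBytes` would otherwise fail later and less visibly.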

## About this Actor

This example demonstrates how to use [PDF Text Extractor](https://apify.com/automation-lab/pdf-text-extractor) with a specific input configuration. Visit the [Actor detail page](https://apify.com/automation-lab/pdf-text-extractor) to learn more, explore other use cases, and run it yourself.