Run webhook digest

pocesar/run-webhook-digest

Through webhooks installed on your tasks or actors, this actor lets you provide multiple HTTP endpoints that receive a more complete JSON payload from the run, call those endpoints through a proxy, and make conditional webhook calls with a few lines of JavaScript code.

Free trial for 14 days

Then $5/month

No credit card required now

Author: Paulo Cesar
  • Modified
  • Users: 1
  • Runs: 58

Through webhooks installed on your tasks or actors, this actor lets you provide multiple HTTP endpoints that receive a more complete JSON payload from the run, and lets you call those endpoints through a proxy.

It also enables conditional webhook calls that fire only when certain conditions are met. Because your code has access to most platform resources, you can set individual thresholds with a few lines of JavaScript and receive only the emails you care about.

It sends you an email containing a digest of the run information whenever the run succeeds, times out, or fails.

Webhook

The installed webhook looks like this:

{
    "emails": ["email1@example.com", "email2@example.com"],
    "endpoints": ["https://api.apify.com/v2/acts/username~another-actor/runs/?token=YourTokenHere"],
    // ...
    "resource": {{resource}}
}

The email looks like this:

Task my-task-107    ptew7Wypww36aZNUZ
Status:             TIMED-OUT
Run:                GBsPE3a78ZBH74Yq3
Started At:         2021-08-27T07:25:07.056Z
Finished At:        2021-08-27T07:25:18.236Z
Duration:           9.014 seconds | 0.15 minutes
CU:                 0.0003129861111111111
Dataset Count:      0

By default, the HTTP endpoints will receive a JSON POST with:

{
    "type": "Task", // or "Actor"
    "name": "my-actor",
    "resource": {
        "id": "hhj4yUPrncjTRsfas",
        "actId": "S2xxbN3BVLXLmU2da",
        "userId": "aurPRTH47KhmiaPNJ",
        "startedAt": "2021-08-23T06:00:52.923Z",
        "finishedAt": "2021-08-23T06:00:56.828Z",
        "status": "SUCCEEDED",
        "meta": {
            "origin": "API",
            "userAgent": "axios/0.21.1"
        },
        "stats": {
            "inputBodyLen": 1216,
            "restartCount": 0,
            "durationMillis": 3759,
            "resurrectCount": 0,
            "runTimeSecs": 3.759,
            "metamorph": 0,
            "computeUnits": 0.00013052083333333335,
            "memAvgBytes": 35479552,
            "memMaxBytes": 35999744,
            "memCurrentBytes": 35479552,
            "cpuAvgUsage": 0,
            "cpuMaxUsage": 0,
            "cpuCurrentUsage": 0,
            "netRxBytes": 530,
            "netTxBytes": 150
        },
        "options": {
            "build": "latest",
            "timeoutSecs": 10,
            "memoryMbytes": 128,
            "diskMbytes": 256
        },
        "buildId": "dxryivn95ynb9",
        "exitCode": 0,
        "defaultKeyValueStoreId": "gnr59b7jh9d",
        "defaultDatasetId": "485gwe485gv",
        "defaultRequestQueueId": "4b5c0w845b",
        "buildNumber": "0.0.6",
        "containerUrl": "https://dryuvbdxpory.runs.apify.net"
    },
    "customData": {
        "any": "custom information"
    },
    "datasetCount": 1000,
    "run": {
        "id": "rotne4amGv3YF",
        "name": "496hvw94X5L7XAj",
        "userId": "54vineirn4mBZmm",
        "createdAt": "2019-12-12T07:34:14.202Z",
        "modifiedAt": "2019-12-13T08:36:13.202Z",
        "accessedAt": "2019-12-14T08:36:13.202Z",
        "itemCount": 7,
        "cleanItemCount": 5,
        "actId": null,
        "actRunId": null,
        "fields": []
    }
}
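On the receiving side, this payload can be reduced to a short summary before logging or alerting. The following is a minimal sketch, not part of the actor: `summarizeRun` is a hypothetical helper, and it reads only fields that appear in the sample payload above.

```javascript
// Hypothetical helper for the receiving endpoint; not part of the actor.
// It reads only fields that appear in the sample payload above.
function summarizeRun(payload) {
    const { type, name, resource = {}, datasetCount } = payload;
    const { status, stats = {} } = resource;

    return {
        label: `${type} ${name}`,
        status,
        failed: status !== 'SUCCEEDED',
        seconds: stats.runTimeSecs,
        computeUnits: stats.computeUnits,
        items: datasetCount,
    };
}

// Example call with a trimmed-down payload.
const summary = summarizeRun({
    type: 'Task',
    name: 'my-actor',
    resource: { status: 'SUCCEEDED', stats: { runTimeSecs: 3.759, computeUnits: 0.5 } },
    datasetCount: 1000,
});
console.log(summary.label, summary.status, summary.items);
```

The helper is a pure function, so it can be called from whatever HTTP handler receives the POST.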

Trigger condition

The emails are sent and the endpoints are called only if the triggerCondition function returns a truthy value.

{
    triggerCondition: async ({ Apify, dataset, requestQueue, keyValueStore, abort, data, input: { customData } }) => {
        const { cleanItemCount } = await dataset.getInfo();

        return cleanItemCount === 0 // call the endpoints only if the dataset yielded nothing
            || data.resource.stats.computeUnits > 10 // or the run used more than 10 compute units
            || (await requestQueue.handledCount()) === 0; // or the request queue had an issue and handled 0 requests
    }
}

You have full control over your run's data here, so you can perform as many checks as you need before the request is sent to the endpoints.
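Thresholds do not have to be hard-coded. The sketch below reads them from customData instead; `minItems` and `maxComputeUnits` are hypothetical keys you would define yourself, and the compute-unit path assumes the payload layout shown in the sample JSON. The stub objects at the end stand in for the real platform resources so the condition can be tried locally.

```javascript
// Sketch only: `minItems` and `maxComputeUnits` are hypothetical keys that
// you would place in customData yourself.
const triggerCondition = async ({ dataset, data, input: { customData } }) => {
    const { minItems = 1, maxComputeUnits = 10 } = customData || {};
    const { cleanItemCount } = await dataset.getInfo();

    // Fire when the dataset is smaller than expected or the run was too expensive.
    return cleanItemCount < minItems
        || data.resource.stats.computeUnits > maxComputeUnits;
};

// Local dry run with stub objects in place of the real platform resources.
const dryRun = triggerCondition({
    dataset: { getInfo: async () => ({ cleanItemCount: 0 }) },
    data: { resource: { stats: { computeUnits: 0.01 } } },
    input: { customData: { minItems: 5 } },
});
dryRun.then((shouldSend) => console.log('send digest:', shouldSend));
```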

Custom HTTP endpoints and payloads

The custom HTTP endpoints webhook allows you to use proxies, something that the Apify platform doesn't provide.

This is mainly useful with a proxy group such as StaticUS3, whose static IPs make tunneling or IP whitelisting possible.

You'll also be able to hit multiple endpoints with your data at once.
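As a rough illustration of sending different payloads to several endpoints at once, the sketch below routes by URL. Both URLs are placeholders, and returning `data` unchanged for the second endpoint is an assumption about what that receiver expects rather than documented behavior.

```javascript
// Sketch only: both URLs are placeholders.
const config = {
    endpoints: [
        'https://hooks.example.com/slack-relay',
        'https://hooks.example.com/archive',
    ],
    transformEndpoint: async ({ data, url }) => {
        if (url.includes('slack')) {
            // Short human-readable message for the chat endpoint.
            return { text: `${data.name} finished with status ${data.resource.status}` };
        }
        // Full payload for every other endpoint (an assumption about the receiver).
        return data;
    },
};

// Local dry run: call the transform once per endpoint with stub data.
const sample = { name: 'my-actor', resource: { status: 'SUCCEEDED' } };
const payloads = Promise.all(
    config.endpoints.map((url) => config.transformEndpoint({ data: sample, url })),
);
payloads.then((results) => console.log(results));
```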

Apify Slack actor:

{
    endpoints: [
        "https://api.apify.com/v2/acts/katerinahronik~slack-message/runs?token=YOUR_TOKEN"
    ],
    transformEndpoint: async ({ data, url }) => {
        // you can differentiate by URL
        if (url.includes('slack')) {
            return {
                token: "slack-token",
                channel: "#your-channel",
                text: `<https://my.apify.com/tasks/${data.resource.actorTaskId}|Task> finished with status ${data.resource.status}`
            }
        }
    }
}

MS Teams:

{
    endpoints: [
        "https://m341231.webhook.office.com/..."
    ],
    transformEndpoint: async ({ data, url }) => {
        return {
            "@type": "MessageCard",
            "@context": "http://schema.org/extensions",
            themeColor: "0076D7",
            summary: "Larry Bryant created a new task",
            sections: [{
                activityTitle: "Larry Bryant created a new task",
                activitySubtitle: "On Project Tango",
                activityImage: "https://teamsnodesample.azurewebsites.net/static/img/image5.png",
                facts: [{
                    name: "Assigned to",
                    value: "Unassigned"
                }, {
                    name: "Due date",
                    value: "Mon May 01 2017 17:07:18 GMT-0700 (Pacific Daylight Time)"
                }, {
                    name: "Status",
                    value: "Not started"
                }],
                markdown: true
            }],
        };
    }
}

WordPress:

{
    endpoints: [
        "https://your-wordpress-website.com/wp-json/wp/v2/posts"
    ],
    transformEndpoint: async ({ dataset, keyValueStore }) => {
        // if your actor stores the data inside OUTPUT
        const output = await keyValueStore.getValue('OUTPUT');
        // otherwise access the dataset
        const { items } = await dataset.getData({ desc: true, limit: 1 });

        return {
            // important: this flag makes the returned object be treated as a
            // raw request instead of a data payload, which is needed here
            // because the headers must carry the token
            __isRawRequest: true,
            headers: {
                'Authorization': 'Bearer your_token_here'
            },
            // Body can be a string or an object
            body: {
                date: new Date().toISOString(),
                status: "publish",
                title: "My blog post",
                content: output.content,
                tags: items[0].tags
            }
        }
    }
}

Google Apps Script:

{
    endpoints: [
        "https://script.google.com/macros/s/###/exec"
    ],
    transformEndpoint: async ({ dataset }) => {
        // will be sent as an array that you can access through e.postData.contents
        return {
            data: (await dataset.getData()).items,
        };
    }
}
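On the Apps Script side, the posted body arrives as a string in e.postData.contents. The following is a minimal sketch of a receiver, assuming the body is JSON-encoded with the items under the `data` key as in the example above; `itemsFromEvent` is a hypothetical helper name.

```javascript
// Hypothetical helper: extracts the items array from the Apps Script event object.
// Assumes the request body is JSON with the items under the `data` key.
function itemsFromEvent(e) {
    const body = JSON.parse(e.postData.contents);
    return Array.isArray(body.data) ? body.data : [];
}

// Apps Script entry point for POST requests to the web app.
// ContentService exists only inside the Apps Script runtime.
function doPost(e) {
    const items = itemsFromEvent(e);
    return ContentService
        .createTextOutput(JSON.stringify({ received: items.length }))
        .setMimeType(ContentService.MimeType.JSON);
}
```

Keeping the parsing in a separate function makes it possible to test it outside Apps Script, where ContentService is not available.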

Remote form submission:

{
    endpoints: [
        "https://your-remote-form-website.com/form"
    ],
    transformEndpoint: async ({ data, dataset }) => {
        return {
            // needs this to access the headers parameter
            __isRawRequest: true,
            headers: {
                'Content-Type': 'application/x-www-form-urlencoded'
            },
            body: `cus=${data.resource.stats.computeUnits}&count=${data.datasetCount}&startedAt=${data.resource.startedAt}`
        }
    }
}
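Hand-built form bodies do not percent-encode reserved characters such as the ":" in timestamps. The sketch below builds the same three fields with the standard URLSearchParams class; `buildFormBody` is a hypothetical helper, and the field paths follow the sample payload shown earlier, so adjust them if your payload differs.

```javascript
// Sketch only: builds the form body with URLSearchParams so that reserved
// characters are percent-encoded. Field paths follow the sample payload.
function buildFormBody(data) {
    return new URLSearchParams({
        cus: String(data.resource.stats.computeUnits),
        count: String(data.datasetCount),
        startedAt: data.resource.startedAt,
    }).toString();
}

// Example call with stub data.
const body = buildFormBody({
    resource: { stats: { computeUnits: 0.5 }, startedAt: '2021-08-23T06:00:52.923Z' },
    datasetCount: 1000,
});
console.log(body);
```

The resulting string can be returned as the body of a raw request, in the same way as the template string in the example above.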

Advanced usage:

{
    transformEndpoint: async () => {
        return {
            __isRawRequest: true,
            gotScraping: {
                // this can effectively call gotScraping directly
                url: 'https://new URL',
                retry: {
                    limit: 10,
                    statusCodes: [502,503,504]
                }
            }
        }
    }
}

If you throw an Error inside the transformEndpoint function, the payload won't be delivered.
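This makes throwing a simple way to skip a delivery. The sketch below is illustrative only: the rule "skip successful runs" is an arbitrary choice, and the stub data at the end replaces the real payload for a local dry run.

```javascript
// Sketch only: the "skip successful runs" rule is an illustrative choice.
const transformEndpoint = async ({ data }) => {
    if (data.resource.status === 'SUCCEEDED') {
        // Throwing means the payload is not delivered.
        throw new Error('Run succeeded, nothing to report');
    }
    return { text: `Run ${data.resource.id} ended with ${data.resource.status}` };
};

// Local dry run with stub data; the rejection is turned into a log message.
const outcome = transformEndpoint({ data: { resource: { id: 'abc', status: 'SUCCEEDED' } } })
    .then(() => 'delivered', (err) => `skipped: ${err.message}`);
outcome.then((message) => console.log(message));
```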

License

Apache 2.0