Query Dataset
This Actor is unavailable because the developer has decided to deprecate it. Would you like to try a similar Actor instead?
Query your existing datasets, map and generate a subset of your data
It uses a MongoDB-like query style; for extended documentation, check the MongoDB query documentation.
Matching is done with the sift module, so you can use the query operators that sift supports as query input.
Example
Take this dataset, for example:
[
  {
    "name": "Name 1",
    "anotherValue": 1
  },
  {
    "name": "Name 2",
    "anotherValue": 2
  },
  {
    "name": "",
    "anotherValue": 3
  }
]
You want to query only items whose name isn't empty, so you use the following INPUT:
{
  "datasetId": "YOUR_DATASET_ID",
  "query": {
    "name": { "$ne": "" }
  }
}
$ne means "not equal" in MongoDB, so you'll receive the "Name 1" and "Name 2" items.
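For intuition, the $ne query above behaves like an ordinary inequality filter. The following is a minimal plain-JavaScript sketch, not the Actor's implementation; the items array and the matchesQuery helper are hypothetical names used only for illustration.

```javascript
// Sample items copied from the example dataset above.
const items = [
  { name: 'Name 1', anotherValue: 1 },
  { name: 'Name 2', anotherValue: 2 },
  { name: '', anotherValue: 3 },
];

// Hypothetical stand-in for the query { "name": { "$ne": "" } }:
// keep an item only when its name is not equal to the empty string.
const matchesQuery = (item) => item.name !== '';

const matched = items.filter(matchesQuery);
console.log(matched.map((item) => item.name)); // [ 'Name 1', 'Name 2' ]
```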
Now say you want to change the value of the "name" field and add an extra field:
{
  "datasetId": "YOUR_DATASET_ID",
  "query": {
    "name": { "$ne": "" }
  },
  "filterMap": "({ item }) => { item.name = item.name.replace('Name ', ''); item.extra = true; return item; }"
}
Your generated dataset is now:
[
  {
    "name": "1",
    "anotherValue": 1,
    "extra": true
  },
  {
    "name": "2",
    "anotherValue": 2,
    "extra": true
  }
]
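The effect of that filterMap can be reproduced outside the Actor. This sketch assumes the function is called once per matching item with an object containing item, and that whatever it returns is stored; the driver code around it (matchedItems, generated) is hypothetical.

```javascript
// Items that already passed the { "name": { "$ne": "" } } query.
const matchedItems = [
  { name: 'Name 1', anotherValue: 1 },
  { name: 'Name 2', anotherValue: 2 },
];

// Same function body as the "filterMap" string in the INPUT above.
const filterMap = ({ item }) => {
  item.name = item.name.replace('Name ', '');
  item.extra = true;
  return item;
};

// Assumption: one call per matching item. Each item is copied first
// so the original objects are left untouched.
const generated = matchedItems.map((item) => filterMap({ item: { ...item } }));
console.log(JSON.stringify(generated, null, 2));
```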
filterMap and customOperationSetup
The filterMap parameter exists to do even more complex checks. filterMap is run in a limited context, and these are the variables available inside your function:
sift: the sift module, so you can create a filter on-the-fly
console.log: tied to the 'outside' console.log and outputs information to the actor log
item: the current dataset item
index: the current filtered index
total: total items available in the dataset
filter: the created filter from the query parameter
datasetIndex: the current position in the dataset index
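As an illustration of how these variables might be combined, the sketch below keeps only the first two matching items and logs progress. The driver loop is hypothetical: it assumes index counts the items that matched the query so far (zero-based) and that returning null or undefined drops an item. The Actor's real bookkeeping may differ.

```javascript
// Hypothetical filterMap: keep at most two matches and log progress.
const limitedFilterMap = ({ item, index, total, datasetIndex }) => {
  console.log(`match ${index} at dataset position ${datasetIndex} of ${total}`);
  if (index >= 2) {
    return undefined; // dropped from the generated dataset
  }
  return item;
};

// Hypothetical driver standing in for the Actor's limited context.
const dataset = [
  { name: 'Name 1' },
  { name: '' },
  { name: 'Name 2' },
  { name: 'Name 3' },
];
const filter = (item) => item.name !== ''; // stands in for the filter built from "query"

const output = [];
let filteredIndex = 0;
dataset.forEach((item, datasetIndex) => {
  if (!filter(item)) return; // item did not match the query
  const result = limitedFilterMap({
    item,
    index: filteredIndex,
    total: dataset.length,
    datasetIndex,
    filter,
  });
  filteredIndex += 1;
  if (result !== null && result !== undefined) output.push(result);
});

console.log(output.map((item) => item.name)); // [ 'Name 1', 'Name 2' ]
```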
The customOperationSetup parameter is mostly useful to prepare a custom operation using sift:
() => ({
  $gtDate(params, ownerQuery, options) {
    const timestamp = new Date(params).getTime();

    return createEqualsOperation(
      value => new Date(value).getTime() > timestamp, // 'value' here is the date from the field you provide
      ownerQuery,
      options
    );
  }
})
Then use it directly inside your query ("2020-01-01" is passed as the params argument):
{
  "query": {
    "lastModified": { "$gtDate": "2020-01-01" }
  }
}
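To see what the $gtDate comparison amounts to, here is a plain-JavaScript stand-in that does not use sift at all. The gtDate helper and the records array are hypothetical; only the comparison itself mirrors the operator above.

```javascript
// Hypothetical stand-in for $gtDate: build a predicate from the query parameter.
const gtDate = (params) => {
  const timestamp = new Date(params).getTime();
  // 'value' is the date stored in the queried field.
  return (value) => new Date(value).getTime() > timestamp;
};

const isAfter2020 = gtDate('2020-01-01');

const records = [
  { id: 'a', lastModified: '2019-12-31' },
  { id: 'b', lastModified: '2020-06-15' },
];

const recent = records.filter((record) => isAfter2020(record.lastModified));
console.log(recent.map((record) => record.id)); // [ 'b' ]
```

Note that the comparison is strict, so an item whose lastModified is exactly "2020-01-01" would not match.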
Most of the time, you won't need to use customOperationSetup, since the built-in operators can do a lot by themselves, but it is provided for completeness.
Expected Consumption
Memory requirements should be low, but you need at least 128 MB. Dataset items aren't loaded into memory all at once, but depending on the shape of your query you may need more. The more query parameters you provide, the more memory and CPU are required; with more of both available, your query finishes faster.
Limitations
Some types aren't allowed in JSON, such as Date and RegExp. The workaround is to define a query without those types and then, inside the filterMap, return either null or undefined for items whose date or RegExp check doesn't match.
E.g.:
{
  "datasetId": "YOUR_DATASET_ID",
  "query": {},
  "filterMap": "({ item }) => { if (new Date(item.someDateField).getTime() < new Date(2019, 10, 20).getTime()) { return item } }"
}
Or you can use customOperationSetup and provide your own advanced operator for native types.
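The filterMap workaround applies to RegExp in the same way as to dates. The sketch below is an assumption-labelled illustration: regexFilterMap, candidates and the pattern are hypothetical, and the map/filter driver only imitates dropping items for which the function returns null or undefined.

```javascript
// Hypothetical filterMap: the RegExp lives inside the function,
// not in the JSON query.
const regexFilterMap = ({ item }) => {
  if (/^Name \d+$/.test(item.name)) {
    return item;
  }
  return undefined; // non-matching items are dropped
};

// Hypothetical driver imitating how non-matching items are discarded.
const candidates = [{ name: 'Name 1' }, { name: 'Other' }, { name: 'Name 22' }];
const kept = candidates
  .map((item) => regexFilterMap({ item }))
  .filter((result) => result !== null && result !== undefined);

console.log(kept.map((item) => item.name)); // [ 'Name 1', 'Name 22' ]
```

When a function like this is embedded as a string in the JSON INPUT, backslashes must be escaped, so \d is written as \\d. Also keep in mind that months are zero-based in the JavaScript Date constructor, so new Date(2019, 10, 20) is 20 November 2019.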
License
Apache-2.0