Functions
Available functions
Functions can be used with any field or function, as long as the expected types matches, which means that any args
element can also be another object defining a function.
Arithmetic functions
Add
Can contain any number of arguments and will be added together.
{
"function": "add",
"args": ["field_slug", 3, 5]
}
Subtract
Can contain any number of arguments and will be subtracted from each other.
{
"function": "sub",
"args": ["field_slug", 4]
}
Multiply
Can contain any number of arguments and will multiply all arguments.
{
"function": "mul",
"args": ["field_slug", 3]
}
Divide
Must contain two arguments that will be divided.
{
"function": "div",
"args": ["field_slug", 2]
}
Date functions
Date
Take one datetime argument as input and truncates the time part to only return the date.
{
"function": "date",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be "2020-09-22"
.
Year
Takes one date or datetime as argument and returns the year as an integer.
{
"function": "year",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be 2020
.
Month
Takes one date or datetime as argument and returns the month as an integer.
{
"function": "month",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be 9
.
Year-month
Takes one date or datetime as argument and returns a year-month string.
{
"function": "year_month",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be "2020-09"
.
First day of week
Takes one date or datetime as argument and returns a date string of the first of day of the corresponding week.
Can be used as dimension to aggregate data by week.
{
"function": "first_day_of_week",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be "2020-09-21"
.
First day of month
Takes one date or datetime as argument and returns a date string of the first of day of the corresponding month.
Can be used as dimension to aggregate data by month.
{
"function": "first_day_of_month",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be "2020-09-01"
.
Year and Week Number starting Monday
Takes one date or datetime as argument and returns a year-week_number string, The week number is follows the ISO week number definition.
{
"function": "year_week_number_starting_monday",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be "2020-39"
.
Year and Week Number starting Sunday
Takes one date or datetime as argument and returns a year-week_number string, The week number is follows the ISO week number definition but transposed to start on Sunday.
{
"function": "year_week_number_starting_sunday",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be "2020-39"
.
Truncate date at hour
Takes one date or datetime as argument and returns a datetime string where we truncate until the hours.
Useful to aggregate metrics by a datetime dimension hourly.
{
"function": "truncate_date_at_hour",
"args": ["date_crawled"]
}
If date_crawled
is "2020-09-22 08:07:00Z"
, the result of this function will be "2020-09-22 08:00:00Z"
.
Aggregation functions
To use aggregation functions, see Metrics how to use them in a query.
Sum
Takes one argument and will add all aggregated entries.
{
"function": "sum",
"args": ["field_slug"]
}
Average
Takes one argument and will average all aggregated entries.
{
"function": "avg",
"args": ["field_slug"]
}
Weighted Average
Takes two arguments and will do a weighted average all aggregated entries, depending on the corresponding weight field.
{
"function": "weighted_avg",
"args": ["field_slug", "weight_field_slug"]
}
Min
Takes one argument and will return the smallest value among the aggregated entries.
{
"function": "min",
"args": ["field_slug"]
}
Max
Takes one argument and will return the smallest value among the aggregated entries.
{
"function": "max",
"args": ["field_slug"]
}
Count
Takes one argument and returns the number of aggregated entries.
{
"function": "count",
"args": ["field_slug"]
}
Count Distinct
Takes one argument and returns the number of different aggregated entries.
{
"function": "count_distinct",
"args": ["field_slug"]
}
Approximated Count Distinct
Takes one argument and returns the approximate number of different aggregated entries.
If the query takes some time, using an approximate can speed up the query.
{
"function": "count_distinct_approx",
"args": ["field_slug"]
}
Count True
Takes one argument and returns the number of true aggregated entries.
{
"function": "count_true",
"args": ["field_slug"]
}
Count False
Takes one argument and returns the number of false aggregated entries.
{
"function": "count_false",
"args": ["field_slug"]
}
Count Null
Takes one argument and returns the number of null aggregated entries.
{
"function": "count_null",
"args": ["field_slug"]
}
Count Equals
Takes two arguments and returns the number of aggregated entries where both values are equal.
{
"function": "count_eq",
"args": ["field_slug_1", "field_slug_2"]
}
Count Greater than
Takes two arguments and returns the number of aggregated entries where the first argument is greater than the second.
{
"function": "count_gt",
"args": ["field_slug_1", "field_slug_2"]
}
Count Greater than or equals
Takes two arguments and returns the number of aggregated entries where the first argument is greater than or equal to the second.
{
"function": "count_gte",
"args": ["field_slug_1", "field_slug_2"]
}
Count Lower than
Takes two arguments and returns the number of aggregated entries where the first argument is lower than the second.
{
"function": "count_lt",
"args": ["field_slug_1", "field_slug_2"]
}
Count Lower than or equals
Takes two arguments and returns the number of aggregated entries where the first argument is lower than or equal to the second.
{
"function": "count_lte",
"args": ["field_slug_1", "field_slug_2"]
}
Conditional functions
If
Takes three arguments as input. The first argument must be a boolean, and can either be a field or a function like eq
that will return a boolean.
If the first argument is true, it returns the second argument. If the first argument is false, it returns the third argument.
{
"function": "if",
"args": [true, 1, 2]
}
Will always return 1. A more realistic example, if the number of crawls from Google equals the number of visits from Google, we return true, else we return null.
{
"function": "if",
"args": [
{
"function": "eq",
"args": [
"crawl.20210102.search_engines.google.crawls.count",
"crawl.20210102.visits.organic.google.nb"
]
},
true,
null
]
}
If not exists
Takes two arguments as input. Return the first argument if it is not null, else it returns the second one.
{
"function": "if_not_exists",
"args": ["search_console.period_0.count_clicks", 0]
}
When joining collections, we might not have any clicks for a crawled URL, because it never appeared on a SERP. It can therefore be null, but we don't want to handle null and just replace it with 0.
Equal
Takes two arguments as input and returns true if they are equal.
{
"function": "eq",
"args": ["field_slug_1", "field_slug_2"]
}
Not equal
Takes two arguments as input and returns true if they are different.
{
"function": "ne",
"args": ["field_slug_1", "field_slug_2"]
}
Lower than
Takes two arguments as input and returns true if the first one is lower than the second.
{
"function": "lt",
"args": ["field_slug_1", "field_slug_2"]
}
Greater than
Takes two arguments as input and returns true if the first one is greater than the second.
{
"function": "gt",
"args": ["field_slug_1", "field_slug_2"]
}
Lower than or equal
Takes two arguments as input and returns true if the first one is lower than or equal to the second.
{
"function": "lte",
"args": ["field_slug_1", "field_slug_2"]
}
Greater than or equal
Takes two arguments as input and returns true if the first one is greater than or equal to the second.
{
"function": "gte",
"args": ["field_slug_1", "field_slug_2"]
}
Exists
Takes one argument as input and returns true if the argument is not null.
{
"function": "exists",
"args": ["field_slug"]
}
Not exists
Takes one argument as input and returns true if the argument is null.
{
"function": "not_exists",
"args": ["field_slug"]
}
Not
Takes one argument as input and returns it's negated result.
{
"function": "not",
"args": ["field_slug"]
}
And
Takes any number of arguments as input and returns true if all arguments are true.
{
"function": "and",
"args": ["field_slug_1", "field_slug_2"]
}
Or
Takes any number of arguments as input and returns true if at least one argument is true.
{
"function": "or",
"args": ["field_slug_1", "field_slug_2"]
}
Literal
Takes one argument as input and handle the input as a constant.
Can be used to hard-code strings instead of interpreting them as field slugs.
{
"function": "literal",
"args": ["constant_value"]
}
Will return "constant_value"
for each entry.
List functions
These functions can only be applied of list type fields.
First
Takes one argument and returns the first element of the list as result.
{
"function": "first",
"args": ["field_slug"]
}
Sort
Takes one argument and returns the list with sorted elements.
{
"function": "sort",
"args": ["field_slug"]
}
Array to String
Takes two arguments and returns a string composed of each element of the list, with is the first argument, with a separator which is the second argument.
{
"function": "array_to_string",
"args": ["field_slug", "separator"]
}
HTTP code functions
HTTP code family
Takes one argument as input and returns the family of the HTTP code.
Families are:
2xx
for HTTP codes between 200 and 299, and 304exx
for HTTP codes below 0Yxx
for all other HTTP codes, where Y is the first digit of the HTTP code
{
"function": "http_code_family",
"args": ["http_code"]
}
If the HTTP code is 429
, this function returns 4xx
.
HTTP code quality
Takes one argument as input and returns the quality of the HTTP code.
Quality values are:
good
for HTTP codes between 200 and 299, and 304bad
for all others
{
"function": "http_code_quality",
"args": ["http_code"]
}
If the HTTP code is 429
, this function returns bad
.
Ranges function
The ranges function is explained in details in Dimensions.
It is called with two arguments. The first one if the field with continuous values that we want to aggregate on. The second argument is a list of objects containing one, or both, of the from
and to
keys.
{
"function": "ranges",
"args": [
"search_console.period_0.avg_position",
[
{
"to": 2
},
{
"from": 2,
"to": 5
},
{
"from": 5,
"to": 8
},
{
"from": 8
}
]
]
}
String functions
Concatenation
Takes any number of string arguments and concatenates them together.
{
"function": "concat",
"args": [
"url",
{
"function": "literal",
"args": ["&botify"]
}
]
}
Will concatenate the URL with a &botify
suffix. (the example could be improved to handle if URL path doesn't contain any query string to use a ?
instead. Leaving that as an exercise to the reader).
Matches content
Takes two string arguments as input and returns true if the first argument is contained by the second argument one, when both arguments are normalized.
Normlization means ignoring any non-ascii characters and spaces in the strings.
Used to verify if a keyword is contained the title of the URL for instance:
{
"function": "matches_content",
"args": [
"keyword_meta.normalized_keyword_words_string",
"crawl.20210102.metadata.title.normalized_content"
]
}
Values list functions
Values list are mostly used for RealKeywords Keyword Groups feature, in order group together multiple keywords into one list.
Once you know your values list identifier, you will be able to query it through some functions.
Match lists
Takes as input one object with two keys: field
and lists
.
{
"function": "match_lists",
"args": [
{
"field": "keyword",
"lists": [
"list1-identifier1",
"list2-identifier2",
"list3-identifier3"
]
}
]
}
And it will return the value of the item it matches in one of the lists.
Lists are evaluated in the order they are passed to the function.
Special functions
By dimension
Allows breaking down a metric on another dimension than the ones specified in the BQL query.
Takes as input the metric and the dimension on which you want to break it down.
Used for sparklines in the Botify Application to break down the data by date, additionally to the selected dimensions:
{
"function": "by_dimension",
"args": [
"search_console.period_0.count_clicks",
"period_0.date"
]
}
To JSON string
Takes one argument as input and transforms it to a JSON string.
Useful to get rid of list fields and fit them into a single row.
{
"function": "to_json_string",
"args": ["query_string_keys"]
}
Updated almost 4 years ago