Filtering
Filters in BQL are composable so that any complex condition can be expressed.
This section covers the multiple ways of filtering that are available. Additionally, we cover the differences between filters
and post_filters
.
Types and formats
Predicate filters
The most common filter is the predicate filter, which is used to filter on a specific field against a specific value.
"filters": {
"field": "url",
"predicate": "eq",
"value": "example.com/a"
}
The list of available predicates can be found in Predicates.
The field
can contain a plain field slug, or also any function which can also be used a dimension. For instance:
"filters": {
"field": {
"function": "div",
"args": ["crawl.20210102.http_code", 100]
},
"predicate": "eq",
"value": 2
}
This filter would return rows when the HTTP code is 200, because the result of the function would be equal to 2.
To compare two fields, one way would be to use the equality function in the predicate:
"filters": {
"field": {
"function": "eq",
"args": [
"crawl.20210102.http_code",
"crawl.20201201.http_code"
]
},
"predicate": "eq",
"value": true
}
So that the function returns true
and we filter on rows where the result is true
.
Find all available functions in the related section.
Boolean filters
In order to compound filters, BQL supports the boolean filters and
and or
, which can be nested. For example:
"filters": {
"or": [
{
"field": "url",
"predicate": "contains",
"value": "mobile"
},
{
"and": [
{
"field": "protocol",
"predicate": "eq",
"value": "http"
},
{
"field": "host",
"predicate": "eq",
"value": "m.example.com"
}
]
}
]
}
Negative filters
To negate a filter and exclude rows that match this condition, one can encapsulate the nested filter in a not
:
"filters": {
"not": {
"field": "url",
"predicate": "eq",
"value": "example.com/a"
}
}
Available predicates
An exhaustive list of available predicates can be found in Predicates.
Filters and post filters
BQL queries contain two keys to filter: filters
and post_filters
. Both can used in the same query, but they have a different behavior when filtering on metrics.
In short,
filters
are used to filter on dimensions and metricspost_filters
are used to filter on metrics aggregated by the requested dimensions
It is recommended to filter using filters
, which generally matches the use case.
Filtering on a dimension has the same effect using filters
and post_filters
. Filtering a metric has a different behavior.
Let's illustrate through an example, filtering on RealKeywords data:
{
"dimensions": ["country"],
"metrics": ["search_console.period_0.count_clicks"],
"filters": {
"and": [
{
"field": "keyword",
"predicate": "contains",
"value": "botify"
},
{
"field": "search_console.period_0.count_clicks",
"predicate": "gt",
"value": 10
}
]
},
"post_filters": {
"field": "search_console.period_0.count_clicks",
"predicate": "gt",
"value": 500
}
}
We query the number of clicks for each country.
For filters
:
- we filter only on keywords containing
botify
, and - we filter on rankings (URL + keyword couples) in each country that have received more than 10 clicks
For post_filters
:
- we filter on countries that received more than 500 clicks
Since count_clicks
is a metric, we require an aggregation. The filters
key filters on the metric aggregated on intermediate dimensions, whereas post_filters
filters on the metric aggregated by the final dimensions.
These intermediate dimensions often are a combination of the final dimensions + URL + keyword when available. These intermediate dimensions are what allows BQL to cross data from different collections.
Updated over 1 year ago