Filtering

Filters in BQL are composable so that any complex condition can be expressed.
This section covers the multiple ways of filtering that are available. Additionally, we cover the differences between filters and post_filters.

Types and formats

Predicate filters

The most common filter is the predicate filter, which is used to filter on a specific field against a specific value.

"filters": {
  "field": "url",
  "predicate": "eq",
  "value": "example.com/a"
}

The list of available predicates can be found in Predicates.

The field can contain a plain field slug, or also any function which can also be used a dimension. For instance:

"filters": {
  "field": {
    "function": "div",
    "args": ["crawl.20210102.http_code", 100]
  },
  "predicate": "eq",
  "value": 2
}

This filter would return rows when the HTTP code is 200, because the result of the function would be equal to 2.

To compare two fields, one way would be to use the equality function in the predicate:

"filters": {
  "field": {
    "function": "eq",
    "args": [
      "crawl.20210102.http_code",
      "crawl.20201201.http_code"
    ]
  },
  "predicate": "eq",
  "value": true
}

So that the function returns true and we filter on rows where the result is true.

Find all available functions in the related section.

Boolean filters

In order to compound filters, BQL supports the boolean filters and and or, which can be nested. For example:

"filters": {
  "or": [
    {
      "field": "url",
      "predicate": "contains",
      "value": "mobile"
    },
    {
      "and": [
        {
          "field": "protocol",
          "predicate": "eq",
          "value": "http"
        },
        {
          "field": "host",
          "predicate": "eq",
          "value": "m.example.com"
        }
      ]
    }
  ]
}

Negative filters

To negate a filter and exclude rows that match this condition, one can encapsulate the nested filter in a not:

"filters": {
  "not": {
    "field": "url",
    "predicate": "eq",
    "value": "example.com/a"
  }
}

Available predicates

An exhaustive list of available predicates can be found in Predicates.

Filters and post filters

BQL queries contain two keys to filter: filters and post_filters. Both can used in the same query, but they have a different behavior when filtering on metrics.
In short,

  • filters are used to filter on dimensions and metrics
  • post_filters are used to filter on metrics aggregated by the requested dimensions

It is recommended to filter using filters, which generally matches the use case.

Filtering on a dimension has the same effect using filters and post_filters. Filtering a metric has a different behavior.
Let's illustrate through an example, filtering on RealKeywords data:

{
  "dimensions": ["country"],
  "metrics": ["search_console.period_0.count_clicks"],
  "filters": {
    "and": [
      {
        "field": "keyword",
        "predicate": "contains",
        "value": "botify"
      },
      {
        "field": "search_console.period_0.count_clicks",
        "predicate": "gt",
        "value": 10
      }
    ]
  },
  "post_filters": {
    "field": "search_console.period_0.count_clicks",
    "predicate": "gt",
    "value": 500
  }
}

We query the number of clicks for each country.
For filters:

  • we filter only on keywords containing botify, and
  • we filter on rankings (URL + keyword couples) in each country that have received more than 10 clicks

For post_filters:

  • we filter on countries that received more than 500 clicks

Since count_clicks is a metric, we require an aggregation. The filters key filters on the metric aggregated on intermediate dimensions, whereas post_filters filters on the metric aggregated by the final dimensions.
These intermediate dimensions often are a combination of the final dimensions + URL + keyword when available. These intermediate dimensions are what allows BQL to cross data from different collections.