Collections and periods

Collections

A collection is a source of data in the Botify Application. Each collection exposes a set of fields that can be used as metrics, dimensions, and/or filters.
The exhaustive list of potentially available collections is under this section.

To select collections for a BQL query, specify them as a list:

{
  "collections": [
    "crawl.20210102",
    "search_console"
  ],
  ...
}

This prepares a query that joins SiteCrawler data with RealKeywords data.
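The crawl collection identifiers shown on this page (for example crawl.20210102) appear to follow a crawl.YYYYMMDD pattern derived from the crawl date. Assuming that pattern holds, a minimal Python sketch for assembling the "collections" part of a query might look like the following. The helper names crawl_collection_id and build_collections are hypothetical and not part of the Botify API.

```python
from datetime import date


def crawl_collection_id(crawl_date: date) -> str:
    """Format a crawl date as a collection id, assuming the crawl.YYYYMMDD pattern."""
    return f"crawl.{crawl_date:%Y%m%d}"


def build_collections(crawl_date: date, *others: str) -> dict:
    """Return only the "collections" fragment of a query; other keys are added separately."""
    return {"collections": [crawl_collection_id(crawl_date), *others]}


fragment = build_collections(date(2021, 1, 2), "search_console")
print(fragment)
```

The fragment contains only the "collections" key. The rest of the query, elided as ... in the example above, still has to be supplied.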

Your available collections

There are two methods to find the collections that are available on your project:

Query

Using the information in Getting started, construct the following query:

curl --location --request GET 'https://api.botify.com/v1/projects/<USERNAME>/<PROJECT_SLUG>/collections' \
--header 'Authorization: Token <API_TOKEN>'

which will return a list of collections:

[
    {
        "id": "global",
        "name": "URL Scheme and Segmentation",
        "date": "2021-03-05",
        "timestamped": false
    },
    {
        "id": "crawl.20210302",
        "name": "2021 Mar. 2nd",
        "date": "2021-03-02",
        "timestamped": false
    },
    {
        "id": "crawl.20210223",
        "name": "2021 Feb. 23rd",
        "date": "2021-02-23",
        "timestamped": false
    },
    {
        "id": "search_console",
        "name": "Search Console",
        "date": "2021-03-05",
        "timestamped": true,
        "date_start": "2018-03-17",
        "date_end": "2021-03-02"
    }
]

This response contains four collections: the global collection and the following three:

  • a crawl on March 2nd 2021
  • another crawl on February 23rd 2021
  • search console data from March 17th 2018 to March 2nd 2021
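The same request can be prepared from Python with the standard library. The sketch below only builds the request and then groups an abridged copy of the sample response by its timestamped flag, so it runs without network access. The function names build_collections_request and split_by_timestamped are hypothetical, and the URL layout is copied from the curl example above.

```python
import json
import urllib.request

API_ROOT = "https://api.botify.com/v1"  # base URL taken from the curl example


def build_collections_request(username: str, project_slug: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request shown in the curl example."""
    url = f"{API_ROOT}/projects/{username}/{project_slug}/collections"
    headers = {"Authorization": f"Token {api_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def split_by_timestamped(collections: list) -> tuple:
    """Return (timestamped ids, non-timestamped ids) from a collections response."""
    timestamped = [c["id"] for c in collections if c.get("timestamped")]
    snapshots = [c["id"] for c in collections if not c.get("timestamped")]
    return timestamped, snapshots


# Abridged copy of the sample response, used here instead of a live call.
sample = json.loads("""
[
    {"id": "global", "timestamped": false},
    {"id": "crawl.20210302", "timestamped": false},
    {"id": "crawl.20210223", "timestamped": false},
    {"id": "search_console", "timestamped": true,
     "date_start": "2018-03-17", "date_end": "2021-03-02"}
]
""")

request = build_collections_request("<USERNAME>", "<PROJECT_SLUG>", "<API_TOKEN>")
timestamped_ids, snapshot_ids = split_by_timestamped(sample)
print(request.full_url)
print(timestamped_ids, snapshot_ids)
```

To send the request for real, pass it to urllib.request.urlopen and decode the body with json.load, after replacing the placeholders with your own values.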

Collections Explorer

The Collections Explorer spreadsheet template allows you to retrieve your project's collections, metrics, and dimensions in one location. You must have access to Google Sheets to use this spreadsheet.

  1. Access and make a copy of the Collections Explorer spreadsheet. Do not alter the spreadsheet, including any hidden cells or sheets.
  2. Navigate to your Botify account and copy your API token to the system clipboard.
  3. In the spreadsheet, paste your API token in cell B1. The projects that belong to your user account will populate the cell in the next row after a few seconds.

❗️

To protect your API token, don't share your cloned spreadsheet with anyone.

  4. Expand the dropdown list in cell B3 and select the desired project. The project collections that belong to your user account will populate the cell in the next row after a few seconds.
  5. Expand the dropdown list in cell B4 and select the desired collection.
The metrics and dimensions for the selected collection populate the rows below. To change to another project or collection, clear both cells B3 and B4 before making another selection.

📘

Validation in the Collections Explorer will prevent you from entering invalid information. If you encounter an error, clear the cell contents before trying again. If you encounter an error when cloning the spreadsheet or adding your API token, it should resolve after you select a project.

The global collection

The global collection cannot be queried directly. It is meant to help the Botify Application know which global fields are available. The concept of global fields is explained in Dimensions.

Timestamped collections

One might notice that the collections in the response above contain a timestamped key.
A timestamped collection contains continuous data over a period of time, like the search console collection, which is updated daily to ingest the latest data from Google Search Console.
In contrast, a non-timestamped collection represents a snapshot of data at a single moment, like a crawl collection, which is a snapshot of a website at a certain point in time.

📘

Timestamped collections require at least one period.
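These rules can be checked on the client before a query is sent. The sketch below is one possible pre-flight check, not something the API provides: check_query is a hypothetical helper that takes a query and the metadata returned by the collections endpoint, and it reports the two constraints stated on this page (the global collection cannot be queried directly, and timestamped collections need at least one period).

```python
def check_query(query: dict, available: list) -> list:
    """Return a list of human-readable problems found in a query (empty if none)."""
    by_id = {c["id"]: c for c in available}
    problems = []
    for cid in query.get("collections", []):
        meta = by_id.get(cid)
        if meta is None:
            problems.append(f"{cid}: not in the project's collections")
        elif cid == "global":
            problems.append("global: cannot be queried directly")
        elif meta.get("timestamped") and not query.get("periods"):
            problems.append(f"{cid}: timestamped, needs at least one period")
    return problems


# Abridged metadata in the shape returned by the collections endpoint.
available = [
    {"id": "global", "timestamped": False},
    {"id": "crawl.20210302", "timestamped": False},
    {"id": "search_console", "timestamped": True},
]

print(check_query({"collections": ["crawl.20210302", "search_console"]}, available))
```

Adding a "periods" key with at least one period to the query makes the search_console problem disappear.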

Periods

The format of a period is a list of two strings that represent dates:

{
  ...
  "periods": [
    ["2021-01-01", "2021-01-31"],
    ["2021-02-01", "2021-02-28"]
  ],
  ...
}

In the above example, we define two periods. The first one represents January 2021, and the second one February 2021.
These periods are applied to all specified timestamped collections, and each period can then be queried using a 0-based index.
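Month-long periods like the two above can be generated rather than typed by hand. The following sketch assumes that both dates of a period are inclusive, as the January and February example suggests. month_period is a hypothetical helper built on Python's standard calendar module.

```python
import calendar
from datetime import date


def month_period(year: int, month: int) -> list:
    """Return [first_day, last_day] of a month as ISO date strings."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return [date(year, month, 1).isoformat(), date(year, month, last_day).isoformat()]


periods = [month_period(2021, 1), month_period(2021, 2)]
print(periods)
```

Using calendar.monthrange handles month lengths and leap years, so February 2020 would end on the 29th.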

For example:

{
  "collections": [
    "search_console",
    "conversion"
  ],
  "periods": [
    ["2021-01-01", "2021-01-31"],
    ["2021-02-01", "2021-02-28"]
  ],
  ...
}

Both collections are timestamped, and we define two periods. We are therefore able to query fields that are prefixed as:

  • search_console.period_0.field_slug => RealKeywords data from January 2021
  • search_console.period_1.field_slug => RealKeywords data from February 2021
  • conversion.period_0.field_slug => EngagementAnalytics data from January 2021
  • conversion.period_1.field_slug => EngagementAnalytics data from February 2021
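The prefixing scheme in the list above can be reproduced mechanically. In this sketch, period_field and expand_fields are hypothetical helpers, and field_slug remains a placeholder for a real field name from the collection.

```python
def period_field(collection: str, period_index: int, field_slug: str) -> str:
    """Build a period-prefixed field name such as search_console.period_0.field_slug."""
    return f"{collection}.period_{period_index}.{field_slug}"


def expand_fields(collections: list, periods: list, field_slug: str) -> list:
    """List the prefixed field name for every collection and every period index."""
    return [
        period_field(collection, index, field_slug)
        for collection in collections
        for index in range(len(periods))  # 0-based, in the order periods are declared
    ]


periods = [["2021-01-01", "2021-01-31"], ["2021-02-01", "2021-02-28"]]
for name in expand_fields(["search_console", "conversion"], periods, "field_slug"):
    print(name)
```

The index in each name refers to the position of the period in the "periods" list, so reordering that list changes which dates period_0 and period_1 refer to.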