Collections
Overview
A collection is a source of data in Botify. Each collection exposes a set of fields that can be used as metrics, dimensions, and/or filters.
Current Collections
The following collections are currently in Botify. New collections are added regularly as more data is brought into Botify. Refer to the Your Available Collections section below to find the collections you can access in your project.
Collection Name | Description |
---|---|
conversion | Conversion data |
conversion.dip | Google Analytics conversion data (data integration platform) |
crawl.YYYYMMDD | Crawl data, where YYYYMMDD is the crawl slug |
paid_search.ga4.dip | GA4 paid search data (data integration platform) |
QueryMaskML.YYYYMMDD | Currently contains only the field is_landing |
search_console | Google Search Console data |
search_console_by_property | Google Search Console data by website |
searchenginesorphans._YYYYMMDD_ | Search engine orphan URLs, where YYYYMMDD is the crawl slug |
semrush_domain_organic | Semrush keyword metrics |
sitemaps | Sitemap data |
trended_crawls | Trended crawl data |
visits.adobe | Adobe Analytics visit data |
visits.atinternet | Piano Analytics visit data |
visits.atinternet_airbyte | Piano Analytics visit data by Airbyte |
visits.dip | GA4 Analytics visit data (data integration platform) |
visits.ganalytics | Google Analytics visit data |
visits.ganalytics_premium | Google Analytics 360 visit data |
web_vitals.field_data | Core Web Vitals field data |
web_vitals.field_data_by_origin | Core Web Vitals field data origin summary |
Note: There are many visit providers and collections, but each project can only access one visit collection, depending on the selected provider.
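Several collection names in the table end with a YYYYMMDD slug (for example crawl.YYYYMMDD). As a minimal sketch, the Python helper below splits such an ID into its prefix and a date. The function name split_dated_collection is hypothetical, and the code assumes the slug is always a valid eight-digit calendar date, which the table suggests but does not guarantee.

```python
import re
from datetime import date, datetime
from typing import Optional, Tuple

# Matches IDs such as "crawl.20210302": a prefix, a dot, then eight digits.
_DATED_ID = re.compile(r"^(?P<prefix>.+)\.(?P<slug>\d{8})$")


def split_dated_collection(collection_id: str) -> Tuple[str, Optional[date]]:
    """Return (prefix, date) for dated IDs, or (collection_id, None) otherwise.

    Hypothetical helper: assumes the YYYYMMDD slug is a calendar date.
    """
    match = _DATED_ID.match(collection_id)
    if match is None:
        return collection_id, None
    try:
        parsed = datetime.strptime(match.group("slug"), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real date: treat the ID as undated.
        return collection_id, None
    return match.group("prefix"), parsed
```

IDs without a slug, such as search_console or visits.dip, are returned unchanged with None as the date.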
Your Available Collections
Configuring a project can make new collections available. For instance, if you set a CrUX API key in Data Station, the web_vitals.field_data collection becomes available on your project.
There are two methods to find the collections that are available on your project:
Collections Explorer
The Collections Explorer spreadsheet template allows you to retrieve your project's collections, metrics, and dimensions in one location. You must have access to Google Sheets to use this spreadsheet.
- Access and make a copy of the Collections Explorer spreadsheet. Do not alter the spreadsheet, including any hidden cells or sheets.
- Navigate to your Botify account and copy your API token to the system clipboard.
- In the spreadsheet, paste your API token in cell B1. The projects that belong to your user account will populate the cell in the next row after a few seconds.
To protect your API token, don't share your cloned spreadsheet with anyone.
- Expand the dropdown list in cell B3 and select the desired project. The project collections that belong to your user account will populate the cell in the next row after a few seconds.
- Expand the dropdown list in cell B4 and select the desired collection.
Validation in the Collections Explorer will prevent you from entering invalid information. If you encounter an error, clear the cell contents before trying again. If you encounter an error when cloning the spreadsheet or adding your API token, it should resolve after you select a project.
Query
Using the information in Getting started, construct the following query:
curl --location --request GET 'https://api.botify.com/v1/projects/<USERNAME>/<PROJECT_SLUG>/collections' \
--header 'Authorization: Token <API_TOKEN>'
which will return a list of collections:
[
  {
    "id": "global",
    "name": "URL Scheme and Segmentation",
    "date": "2021-03-05",
    "timestamped": false
  },
  {
    "id": "crawl.20210302",
    "name": "2021 Mar. 2nd",
    "date": "2021-03-02",
    "timestamped": false
  },
  {
    "id": "crawl.20210223",
    "name": "2021 Feb. 23rd",
    "date": "2021-02-23",
    "timestamped": false
  },
  {
    "id": "search_console",
    "name": "Search Console",
    "date": "2021-03-05",
    "timestamped": true,
    "date_start": "2018-03-17",
    "date_end": "2021-03-02"
  }
]
This response contains four collections:
- The Global collection on March 5th 2021
- A Crawl collection on March 2nd 2021
- Another Crawl collection on February 23rd 2021
- The Search Console collection, with data from March 17th 2018 to March 2nd 2021
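The same request can be issued from a script. The following is a minimal Python sketch using only the standard library. It mirrors the curl command above, but the function names (build_collections_request, fetch_collections) and the API_ROOT constant are illustrative and not part of any Botify client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL taken from the curl example above.
API_ROOT = "https://api.botify.com/v1"


def build_collections_request(username: str, project_slug: str,
                              api_token: str) -> urllib.request.Request:
    """Build the GET request for the project's collections (hypothetical helper)."""
    url = (
        f"{API_ROOT}/projects/"
        f"{quote(username, safe='')}/{quote(project_slug, safe='')}/collections"
    )
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Token {api_token}"},
        method="GET",
    )


def fetch_collections(username: str, project_slug: str, api_token: str) -> list:
    """Send the request and decode the JSON list of collections."""
    request = build_collections_request(username, project_slug, api_token)
    # urlopen follows redirects by default, like curl's --location flag.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling fetch_collections("<USERNAME>", "<PROJECT_SLUG>", "<API_TOKEN>") with real values should return a list shaped like the response above. Keep the token out of source control, for example by reading it from an environment variable.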
Timestamped collections
Notice that each collection in the response above contains a timestamped key. A timestamped collection contains continuous data over a period. For example, the search_console collection is updated daily to ingest the latest data from Google Search Console. In contrast, a non-timestamped collection is a snapshot of data at a given moment. For example, a Crawl collection is a snapshot of a website at the time of the crawl.
The Periods page provides more details on targeting specific date intervals.
Timestamped collections require at least one period.
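To illustrate the distinction, the sketch below separates a collections response into timestamped and snapshot collections and reads the available date range of a timestamped one. It relies only on the timestamped, date_start, and date_end keys shown in the response above. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def split_by_timestamped(collections: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (timestamped, snapshots) from a collections response."""
    timestamped = [c for c in collections if c.get("timestamped")]
    snapshots = [c for c in collections if not c.get("timestamped")]
    return timestamped, snapshots


def available_range(collection: Dict) -> Optional[Tuple[str, str]]:
    """Return (date_start, date_end) for a timestamped collection, else None."""
    if not collection.get("timestamped"):
        return None
    return collection.get("date_start"), collection.get("date_end")


# Abridged copy of the response shown earlier on this page.
sample = [
    {"id": "global", "date": "2021-03-05", "timestamped": False},
    {"id": "crawl.20210302", "date": "2021-03-02", "timestamped": False},
    {"id": "search_console", "date": "2021-03-05", "timestamped": True,
     "date_start": "2018-03-17", "date_end": "2021-03-02"},
]

timestamped, snapshots = split_by_timestamped(sample)
for collection in timestamped:
    print(collection["id"], available_range(collection))
```

A period chosen for a timestamped collection would normally fall inside the range returned by available_range. The period syntax itself is described on the Periods page and is not reproduced here.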
Examples
Refer to the following pages for examples of non-timestamped collections (crawl) and timestamped collections (search_console):
To select collections for a BQL query, list them under the collections key. The following example prepares a query that joins SiteCrawler data (crawl) with RealKeywords data (search_console):
{
  "collections": [
    "crawl.20210102",
    "search_console"
  ],
  ...
}
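The collections list can also be assembled from the response of the collections endpoint. The sketch below picks the most recent crawl collection by its date field and pairs it with search_console. The helper names are hypothetical, and the rest of the query body, which is elided above, is deliberately left out.

```python
from typing import Dict, List, Optional


def latest_crawl_id(collections: List[Dict]) -> Optional[str]:
    """Return the ID of the most recent crawl collection, or None if there is none."""
    crawls = [c for c in collections if c["id"].startswith("crawl.")]
    if not crawls:
        return None
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return max(crawls, key=lambda c: c["date"])["id"]


def collections_for_join(collections: List[Dict],
                         other: str = "search_console") -> Dict:
    """Build only the "collections" part of a query body (hypothetical helper)."""
    crawl_id = latest_crawl_id(collections)
    available = {c["id"] for c in collections}
    if crawl_id is None or other not in available:
        raise ValueError("project lacks a crawl collection or the requested collection")
    return {"collections": [crawl_id, other]}
```

Checking availability first avoids sending a query that names a collection the project cannot access. Because search_console is timestamped, a complete query would also need at least one period.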