Backends and connectors
Connectors
One of the strengths of Botify's data exports is the range of backends that deliver the data exactly where you want it. The backend (or connector) is selected with the connector key, and its options and configuration are passed through the extra_config key:
{
"connector": "<STRING>",
"extra_config": {...}
}
Available backends and their configuration
Direct Download
"connector": "direct_download"
This backend is always available for export. Botify will store the export and provide a link to download the file.
It's a great backend for testing your exports and their accuracy before automating them.
One limitation of this backend is an upper bound on export size. By default, the limit is one million data rows, but it can be raised depending on your plan. Don't hesitate to contact your CSM about this limit.
There are no options available on the direct download backend. All formatters can be used with this backend.
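As a minimal sketch, the connector fragment for a direct download can be built as a plain dictionary. The helper name `connector_fragment` is hypothetical, and the local check that refuses options for `direct_download` is an assumption of this sketch based on the statement above, not documented API behavior.

```python
import json


def connector_fragment(connector, extra_config=None):
    """Build the connector part of an export job payload (hypothetical helper)."""
    if connector == "direct_download" and extra_config:
        # Assumption: the direct download backend takes no options, so this
        # sketch refuses them locally instead of sending them.
        raise ValueError("direct_download does not accept extra_config options")
    fragment = {"connector": connector}
    if extra_config:
        # Copy the options so later changes by the caller do not leak in.
        fragment["extra_config"] = dict(extra_config)
    return fragment


print(json.dumps(connector_fragment("direct_download")))
```

The fragment still has to be merged into the rest of the export job payload, which is outside the scope of this sketch.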
S3
"connector": "<UUID>"
This backend lets us push the data directly to your AWS S3 bucket.
Available options:
{
"extra_config": {
"filename": "<STRING>",
"subdirectory": "<STRING>",
"push_helpers": BOOLEAN
}
}
filename
: The file name. By default: data.EXT.gz, with EXT depending on the chosen formatter. Can contain context variables.
filetype
: The compressed file format. You only need to include this for .ZIP files; you do not need to include this for .GZ files, the default format.
subdirectory
: The directory in which we want to store the file. By default empty. Can contain context variables.
push_helpers
: Also create helper files on the backend when the formatter enables them. See Formatters.
All formatters can be used with this backend.
We don't expose any API yet to create a connector. To create this connector, please contact your CSM. We need some specific permissions on the bucket in order to export to it.
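The options above can be assembled programmatically. The following is a sketch only: the helper name `storage_extra_config` and its `zipped` flag are hypothetical, and the example file and directory names are placeholders. Unset options are left out so that the backend defaults described above apply.

```python
def storage_extra_config(filename=None, subdirectory=None,
                         push_helpers=None, zipped=False):
    """Assemble extra_config for a bucket backend (hypothetical helper)."""
    config = {}
    if filename is not None:
        config["filename"] = filename
    if zipped:
        # filetype is only needed for .ZIP output; .GZ is the default.
        config["filetype"] = "zip"
    if subdirectory is not None:
        config["subdirectory"] = subdirectory
    if push_helpers is not None:
        config["push_helpers"] = bool(push_helpers)
    return {"extra_config": config}


# Placeholder values, for illustration only.
bucket_options = storage_extra_config(
    filename="botify-export.csv.gz",
    subdirectory="exports",
    push_helpers=True,
)
```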
Google Cloud Storage
"connector": "<UUID>"
This backend lets us push the data directly to your Google Cloud Storage bucket.
Available options:
{
"extra_config": {
"filename": "<STRING>",
"filetype": "zip",
"subdirectory": "<STRING>",
"push_helpers": BOOLEAN
}
}
filename
: The file name. By default: data.EXT.gz, with EXT depending on the chosen formatter. Can contain context variables.
filetype
: The compressed file format. You only need to include this for .ZIP files; you do not need to include this for .GZ files, the default format.
subdirectory
: The directory in which we want to store the file. By default empty. Can contain context variables.
push_helpers
: Also create helper files on the backend when the formatter enables them. See Formatters.
All formatters can be used with this backend.
We don't expose any API yet to create a connector. To create this connector, please contact your CSM. We need some specific permissions on the bucket in order to export to it.
List your available backends
GET https://api.botify.com/v1/connectors/USERNAME
Using the same authentication method as usual, this endpoint will list your available backends.
Example response:
{
"count": 2,
"next": null,
"previous": null,
"results": [
{
"id": "12345678-ABCD-1234-ABCD-1234ABCD5678EF00",
"type": "s3",
"name": "s3://some.botify.bucket.export"
},
{
"id": "direct_download",
"type": "direct_download",
"name": "Direct download"
}
]
}
You can then specify the id as the connector in your export job.
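To illustrate how the listing can be used, here is a small sketch that picks connector ids out of a response shaped like the example above. The function names are hypothetical, the HTTP call and authentication are deliberately left out, and only a single page of results is handled.

```python
import json


def connectors_url(username):
    """Return the listing endpoint for a given username."""
    return f"https://api.botify.com/v1/connectors/{username}"


def find_connector_ids(listing, connector_type):
    """Return the ids of all listed connectors with the given type."""
    return [
        item["id"]
        for item in listing.get("results", [])
        if item.get("type") == connector_type
    ]


# The example response from this page, parsed as JSON.
example_listing = json.loads("""
{
  "count": 2,
  "next": null,
  "previous": null,
  "results": [
    {"id": "12345678-ABCD-1234-ABCD-1234ABCD5678EF00",
     "type": "s3",
     "name": "s3://some.botify.bucket.export"},
    {"id": "direct_download",
     "type": "direct_download",
     "name": "Direct download"}
  ]
}
""")

s3_ids = find_connector_ids(example_listing, "s3")
```

Each returned id can be used as the connector value of an export job.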
Context variables
These are dynamic variables that can be used in filenames and directory paths, which adapt to the context of the export.
For example, when exporting to a backend that supports custom filenames and subdirectories, you could use the following:
{
"extra_config": {
"subdirectory": "$crawl.20210102.year-$crawl.20210102.month_2digits",
"filename": "botify-$crawl.20210102.day_2digits.csv.gz"
}
}
to create the export 2021-01/botify-02.csv.gz.
The examples below use the project botify-team/botify-blog, which has a crawl on January 2, 2021. Here the crawl_collection_slug corresponds to crawl.20210102. For more information, see Collections and periods.
Context variable | Description | Example |
---|---|---|
user | The username associated with the project. | botify-team |
project | The project slug. | botify-blog |
crawl_collection_slug | The analysis slug. | 20210102 |
crawl_collection_slug.date | The analysis date. | 2021-01-02 |
crawl_collection_slug.day | The analysis day. | 2 |
crawl_collection_slug.day_2digits | The analysis day in 2 digits. | 02 |
crawl_collection_slug.month | The analysis month. | 1 |
crawl_collection_slug.month_2digits | The analysis month in 2 digits. | 01 |
crawl_collection_slug.year | The analysis year. | 2021 |
crawl_collection_slug.week_number | The analysis ISO week number. | 53 |
crawl_collection_slug.date_next_week | The analysis date one week after. | 2021-01-09 |
crawl_collection_slug.day_next_week | The analysis day one week after. | 9 |
crawl_collection_slug.day_2digits_next_week | The analysis day one week after in 2 digits. | 09 |
crawl_collection_slug.month_next_week | The analysis month one week after. | 1 |
crawl_collection_slug.month_2digits_next_week | The analysis month one week after in 2 digits. | 01 |
crawl_collection_slug.year_next_week | The analysis year one week after. | 2021 |
crawl_collection_slug.week_number_next | The analysis ISO week number one week after. | 1 |
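To preview locally what a filename or subdirectory template would expand to, the date-derived values in the table can be reproduced with the standard library. This is a sketch under stated assumptions: `context_variables` and `render` are hypothetical helpers, the bare slug variable is omitted, and the replacement logic is a guess at how expansion behaves, not Botify's implementation.

```python
from datetime import date, timedelta


def context_variables(user, project, slug, crawl_date):
    """Compute the date-derived context variables for one crawl."""
    next_week = crawl_date + timedelta(days=7)
    return {
        "user": user,
        "project": project,
        f"{slug}.date": crawl_date.isoformat(),
        f"{slug}.day": str(crawl_date.day),
        f"{slug}.day_2digits": f"{crawl_date.day:02d}",
        f"{slug}.month": str(crawl_date.month),
        f"{slug}.month_2digits": f"{crawl_date.month:02d}",
        f"{slug}.year": str(crawl_date.year),
        # isocalendar() returns (ISO year, ISO week, ISO weekday).
        f"{slug}.week_number": str(crawl_date.isocalendar()[1]),
        f"{slug}.date_next_week": next_week.isoformat(),
        f"{slug}.day_next_week": str(next_week.day),
        f"{slug}.day_2digits_next_week": f"{next_week.day:02d}",
        f"{slug}.month_next_week": str(next_week.month),
        f"{slug}.month_2digits_next_week": f"{next_week.month:02d}",
        f"{slug}.year_next_week": str(next_week.year),
        f"{slug}.week_number_next": str(next_week.isocalendar()[1]),
    }


def render(template, values):
    """Replace each $name in the template with its value."""
    # Longest names first, so "$x.month_2digits" is not clobbered by "$x.month".
    for name in sorted(values, key=len, reverse=True):
        template = template.replace("$" + name, values[name])
    return template


values = context_variables(
    "botify-team", "botify-blog", "crawl.20210102", date(2021, 1, 2)
)
subdirectory = render(
    "$crawl.20210102.year-$crawl.20210102.month_2digits", values
)
filename = render("botify-$crawl.20210102.day_2digits.csv.gz", values)
print(f"{subdirectory}/{filename}")
```

January 2, 2021 falls in ISO week 53 of ISO year 2020, which is why week_number is 53 while year remains 2021.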