Backends and connectors

Connectors

One of the strengths of Botify's data exports is the range of backends, which let you get the data exactly where you want it.
The backend (or connector) is defined through the connector key, and its options and configuration are passed through the extra_config key:

{
  "connector": "<STRING>",
  "extra_config": {...}
}
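As a minimal sketch, the two keys above could be assembled in Python before being merged into an export job. The helper name backend_config is hypothetical, and whether the API accepts an empty extra_config object is an assumption, not something this page states.

```python
import json


def backend_config(connector, extra_config=None):
    """Return the backend portion of an export job as a plain dict.

    `connector` is either "direct_download" or a connector id (UUID).
    `extra_config` holds backend-specific options; it defaults to an
    empty object here (assumption: the API tolerates an empty object).
    """
    if not connector:
        raise ValueError("connector must be a non-empty string")
    return {"connector": connector, "extra_config": dict(extra_config or {})}


if __name__ == "__main__":
    # Direct download takes no options, so extra_config stays empty.
    print(json.dumps(backend_config("direct_download"), indent=2))
```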

Available backends and their configuration

Direct Download

"connector": "direct_download"

This backend is always available for an export.
Botify stores the export for you and provides a link to download the file.

It's a great backend for testing your exports and their accuracy before automating them.

One limitation of this backend is a maximum export size. By default, the limit is one million data rows, but it can be raised depending on your plan. Contact your CSM if you need a higher limit.

No options are available on the direct download backend.
All formatters can be used with this backend.

S3

"connector": "<UUID>"

This backend lets you push the data directly to your AWS S3 bucket.

Available options:

{
  "extra_config": {
    "filename": "<STRING>",
    "subdirectory": "<STRING>",
    "push_helpers": BOOLEAN
  }
}
  • filename: the name of the exported file. Default: data.EXT.gz, where EXT depends on the chosen formatter. Can contain context variables.
  • subdirectory: the directory in which to store the file. Empty by default. Can contain context variables.
  • push_helpers: also create helper files on the backend when the formatter enables them. See Formatters.
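The defaults described above can be illustrated with a small sketch that computes where a file would land for a given set of options. The function export_object_path is hypothetical. Joining subdirectory and filename with "/" is an assumption based on the 2021-01/botify-02.csv.gz example in the Context variables section, and the formatter extension is supplied by the caller.

```python
def export_object_path(extension, filename=None, subdirectory=""):
    """Return the path of the exported file inside the bucket.

    extension: file extension produced by the formatter, e.g. "csv"
               (assumed to be known by the caller).
    filename: custom filename, or None to use the default data.EXT.gz.
    subdirectory: target directory, empty by default.
    """
    name = filename or f"data.{extension}.gz"
    directory = subdirectory.strip("/")
    return f"{directory}/{name}" if directory else name


if __name__ == "__main__":
    print(export_object_path("csv"))
    print(export_object_path("csv", "botify-02.csv.gz", "2021-01"))
```

With no options set, the sketch falls back to data.csv.gz at the root of the bucket.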

All formatters can be used with this backend.

We don't yet expose an API to create a connector. To create one, please contact your CSM. Specific permissions are needed on the bucket so that we can export to it.

Google Cloud Storage

"connector": "<UUID>"

This backend lets you push the data directly to your Google Cloud Storage bucket.

Available options:

{
  "extra_config": {
    "filename": "<STRING>",
    "subdirectory": "<STRING>",
    "push_helpers": BOOLEAN
  }
}
  • filename: the name of the exported file. Default: data.EXT.gz, where EXT depends on the chosen formatter. Can contain context variables.
  • subdirectory: the directory in which to store the file. Empty by default. Can contain context variables.
  • push_helpers: also create helper files on the backend when the formatter enables them. See Formatters.

All formatters can be used with this backend.

We don't yet expose an API to create a connector. To create one, please contact your CSM. Specific permissions are needed on the bucket so that we can export to it.

List your available backends

GET https://api.botify.com/v1/connectors/USERNAME
This endpoint lists your available backends. It uses the same authentication method as the rest of the API.
Example response:

{
    "count": 2,
    "next": null,
    "previous": null,
    "results": [
        {
            "id": "12345678-ABCD-1234-ABCD-1234ABCD5678EF00",
            "type": "s3",
            "name": "s3://some.botify.bucket.export"
        },
        {
            "id": "direct_download",
            "type": "direct_download",
            "name": "Direct download"
        }
    ]
}

You can then specify the id as the connector in your export job.
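A minimal sketch of calling this endpoint and picking a connector id from the response, using only the Python standard library. The "Authorization: Token ..." header is an assumption about the usual authentication method, and both function names are hypothetical. Pagination through the next field is not handled here.

```python
import json
import urllib.request

API_ROOT = "https://api.botify.com/v1"


def fetch_connectors(username, token):
    """GET /connectors/USERNAME and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_ROOT}/connectors/{username}",
        # Assumption: token-based header; adapt to your usual auth method.
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def connector_ids_by_type(listing, connector_type):
    """Return the ids of all connectors of the given type in a listing."""
    return [
        item["id"]
        for item in listing["results"]
        if item["type"] == connector_type
    ]


# Same shape as the example response above.
SAMPLE_LISTING = {
    "count": 2,
    "next": None,
    "previous": None,
    "results": [
        {
            "id": "12345678-ABCD-1234-ABCD-1234ABCD5678EF00",
            "type": "s3",
            "name": "s3://some.botify.bucket.export",
        },
        {
            "id": "direct_download",
            "type": "direct_download",
            "name": "Direct download",
        },
    ],
}

if __name__ == "__main__":
    print(connector_ids_by_type(SAMPLE_LISTING, "s3"))
```

In real use, the dict returned by fetch_connectors would be passed to connector_ids_by_type in place of SAMPLE_LISTING.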

Context variables

Context variables are dynamic variables that can be used in filenames and directory paths, and that adapt to the context of the export.

For example, when exporting to a backend that supports custom filenames and subdirectories, the following configuration:

{
  "extra_config": {
    "subdirectory": "$crawl.20210102.year-$crawl.20210102.month_2digits",
    "filename": "botify-$crawl.20210102.day_2digits.csv.gz"
  }
}

would create the export 2021-01/botify-02.csv.gz.
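To make the substitution concrete, here is a local illustration in Python. It is not Botify's resolver: the placeholder pattern and the render function are assumptions made for this sketch, and it only handles the $crawl.YYYYMMDD.attribute form used in the example.

```python
import re

# Matches placeholders such as $crawl.20210102.month_2digits.
# Group 1 is the collection slug, group 2 the attribute name.
PLACEHOLDER = re.compile(r"\$(crawl\.\d{8})\.([a-z0-9_]+)")


def render(template, values):
    """Replace each placeholder with its value from `values`.

    `values` maps a collection slug to a dict of attribute values.
    """
    def substitute(match):
        slug, attribute = match.groups()
        return str(values[slug][attribute])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    values = {
        "crawl.20210102": {
            "year": 2021,
            "month_2digits": "01",
            "day_2digits": "02",
        }
    }
    subdirectory = render(
        "$crawl.20210102.year-$crawl.20210102.month_2digits", values
    )
    filename = render("botify-$crawl.20210102.day_2digits.csv.gz", values)
    print(f"{subdirectory}/{filename}")  # prints 2021-01/botify-02.csv.gz
```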

The examples below use the project botify-team/botify-blog, which has a crawl dated January 2, 2021.
Here, crawl_collection_slug corresponds to crawl.20210102. For more information, see Collections and periods.

Context variable | Description | Example
--- | --- | ---
user | The username associated with the project. | botify-team
project | The project slug. | botify-blog
crawl_collection_slug | The analysis slug. | 20210102
crawl_collection_slug.date | The analysis date. | 2021-01-02
crawl_collection_slug.day | The analysis day. | 2
crawl_collection_slug.day_2digits | The analysis day on 2 digits. | 02
crawl_collection_slug.month | The analysis month. | 1
crawl_collection_slug.month_2digits | The analysis month on 2 digits. | 01
crawl_collection_slug.year | The analysis year. | 2021
crawl_collection_slug.week_number | The analysis ISO week number. | 53
crawl_collection_slug.date_next_week | The analysis date one week after. | 2021-01-09
crawl_collection_slug.day_next_week | The analysis day one week after. | 9
crawl_collection_slug.day_2digits_next_week | The analysis day one week after on 2 digits. | 09
crawl_collection_slug.month_next_week | The analysis month one week after. | 1
crawl_collection_slug.month_2digits_next_week | The analysis month one week after on 2 digits. | 01
crawl_collection_slug.year_next_week | The analysis year one week after. | 2021
crawl_collection_slug.week_number_next | The analysis ISO week number one week after. | 1
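The date-based values listed above can be reproduced locally with Python's datetime module, which is handy for predicting the filename an export will get. This is a sketch and not Botify's implementation. The function date_variables is hypothetical, and the choice of integers for the unpadded values is an assumption.

```python
from datetime import date, timedelta


def date_variables(analysis_date):
    """Compute the date-based context variables for an analysis date."""
    next_week = analysis_date + timedelta(weeks=1)
    return {
        "date": analysis_date.isoformat(),
        "day": analysis_date.day,
        "day_2digits": f"{analysis_date.day:02d}",
        "month": analysis_date.month,
        "month_2digits": f"{analysis_date.month:02d}",
        "year": analysis_date.year,
        # isocalendar() returns (ISO year, ISO week, ISO weekday).
        "week_number": analysis_date.isocalendar()[1],
        "date_next_week": next_week.isoformat(),
        "day_next_week": next_week.day,
        "day_2digits_next_week": f"{next_week.day:02d}",
        "month_next_week": next_week.month,
        "month_2digits_next_week": f"{next_week.month:02d}",
        "year_next_week": next_week.year,
        "week_number_next": next_week.isocalendar()[1],
    }


if __name__ == "__main__":
    for name, value in date_variables(date(2021, 1, 2)).items():
        print(f"{name}: {value}")
```

The week number for January 2, 2021 is 53 because ISO 8601 counts that day as part of the last week of 2020. One week later, January 9 falls in ISO week 1 of 2021.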

