Backends and connectors

Connectors

One of the strengths of Botify's data exports is the range of backends, which let you deliver the data exactly where you want it.
The backend (or connector) is selected with the connector key, and its options and configuration are passed through the extra_config key:

{
  "connector": "<STRING>",
  "extra_config": {...}
}
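
As a rough illustration, the fragment above can be assembled in Python before it is merged into an export job. The helper name connector_settings is hypothetical, and sending an empty extra_config for backends without options is an assumption that simply mirrors the shape shown above.

```python
import json


def connector_settings(connector, extra_config=None):
    """Return the connector part of an export job as a plain dict.

    `connector` is "direct_download" or the id of a connector available
    to your account; `extra_config` carries the backend options.
    """
    if not isinstance(connector, str) or not connector:
        raise ValueError("connector must be a non-empty string")
    # Copy the options so later changes by the caller do not leak in.
    return {"connector": connector, "extra_config": dict(extra_config or {})}


if __name__ == "__main__":
    print(json.dumps(connector_settings("direct_download"), indent=2))
```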

Available backends and their configuration

Direct Download

"connector": "direct_download"

This backend is always available for an export.
Botify stores the export for you and provides a link to download the file.

It's a great backend for testing your exports and their accuracy before automating them.

One limitation of this backend is an upper bound on export size. By default, the limit is one million data rows, but it can be raised depending on your plan. Don't hesitate to contact your CSM about this limit.

The direct download backend has no options.
All formatters can be used with this backend.

S3

"connector": "<UUID>"

This backend pushes the data directly to your AWS S3 bucket.

Available options:

{
  "extra_config": {
    "filename": "<STRING>",
    "subdirectory": "<STRING>",
    "push_helpers": BOOLEAN
  }
}
  • filename: the name of the exported file. Default: data.EXT.gz, where EXT depends on the chosen formatter. Can contain context variables.
  • subdirectory: the directory in which to store the file. Empty by default. Can contain context variables.
  • push_helpers: also create helper files on the backend when the formatter enables them. See Formatters.

All formatters can be used with this backend.

We don't yet expose an API to create a connector. To create this connector, please contact your CSM. Specific permissions are needed on the bucket so that we can export to it.
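
The three options above can be checked locally before a job is submitted. This is a minimal sketch: the helper bucket_extra_config, the file name export.csv.gz and the directory botify/exports are illustrative assumptions, and "<UUID>" stands for the id of your own connector. The same option names are documented for the Google Cloud Storage backend.

```python
# Option names and types as documented for the bucket backends.
ALLOWED_OPTIONS = {"filename": str, "subdirectory": str, "push_helpers": bool}


def bucket_extra_config(**options):
    """Build extra_config for a bucket backend, rejecting unknown keys."""
    config = {}
    for key, value in options.items():
        expected = ALLOWED_OPTIONS.get(key)
        if expected is None:
            raise KeyError(f"unsupported option: {key}")
        if not isinstance(value, expected):
            raise TypeError(f"{key} must be of type {expected.__name__}")
        config[key] = value
    return config


job_fragment = {
    "connector": "<UUID>",  # placeholder: the id of your S3 connector
    "extra_config": bucket_extra_config(
        filename="export.csv.gz",        # hypothetical file name
        subdirectory="botify/exports",   # hypothetical directory
        push_helpers=False,
    ),
}
```

Options left out of the call are not sent, so the documented defaults apply.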

Google Cloud Storage

"connector": "<UUID>"

This backend pushes the data directly to your Google Cloud Storage bucket.

Available options:

{
  "extra_config": {
    "filename": "<STRING>",
    "subdirectory": "<STRING>",
    "push_helpers": BOOLEAN
  }
}
  • filename: the name of the exported file. Default: data.EXT.gz, where EXT depends on the chosen formatter. Can contain context variables.
  • subdirectory: the directory in which to store the file. Empty by default. Can contain context variables.
  • push_helpers: also create helper files on the backend when the formatter enables them. See Formatters.

All formatters can be used with this backend.

We don't yet expose an API to create a connector. To create this connector, please contact your CSM. Specific permissions are needed on the bucket so that we can export to it.

List your available backends

GET https://api.botify.com/v1/connectors/USERNAME
This endpoint lists your available backends. It uses the same authentication method as the rest of the API.
Example response:

{
    "count": 2,
    "next": null,
    "previous": null,
    "results": [
        {
            "id": "12345678-ABCD-1234-ABCD-1234ABCD5678EF00",
            "type": "s3",
            "name": "s3://some.botify.bucket.export"
        },
        {
            "id": "direct_download",
            "type": "direct_download",
            "name": "Direct download"
        }
    ]
}

You can then specify the id as the connector in your export job.
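
The listing can be queried and filtered with the Python standard library, as sketched below. The function names are hypothetical, and the "Authorization: Token ..." header is an assumption about the usual authentication scheme; adjust it to whatever your other API calls use. Only one page is read, so the next and previous fields are ignored.

```python
import json
import urllib.request

API_ROOT = "https://api.botify.com/v1"


def build_connectors_request(username, token):
    """Prepare the GET request that lists the connectors of `username`."""
    return urllib.request.Request(
        f"{API_ROOT}/connectors/{username}",
        headers={"Authorization": f"Token {token}"},  # assumed auth scheme
    )


def connector_ids(payload, connector_type=None):
    """Return connector ids from a listing, optionally filtered by type."""
    return [
        item["id"]
        for item in payload["results"]
        if connector_type is None or item["type"] == connector_type
    ]


def list_connectors(username, token):
    """Fetch one page of connectors and decode the JSON body."""
    request = build_connectors_request(username, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the example response above, connector_ids(payload, "s3") returns a list holding the id of the S3 connector, which is the value to use as connector in the export job.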

Context variables

These are dynamic variables that can be used in filenames and directory paths and that adapt to the context of the export.

For example, when exporting to a backend that supports custom filenames and subdirectories, the configuration

{
  "extra_config": {
    "subdirectory": "$crawl.20210102.year-$crawl.20210102.month_2digits",
    "filename": "botify-$crawl.20210102.day_2digits.csv.gz"
  }
}

would create the export 2021-01/botify-02.csv.gz.

As an example, take the project botify-team/botify-blog, which has a crawl dated January 2, 2021.
The crawl_collection_slug then corresponds to crawl.20210102. For more information, see Collections and periods.

Context variable | Description | Example
---------------- | ----------- | -------
user | The username associated to the project | botify-team
project | The project slug | botify-blog
crawl_collection_slug | The analysis slug | 20210102
crawl_collection_slug.date | The analysis date | 2021-01-02
crawl_collection_slug.day | The analysis day | 2
crawl_collection_slug.day_2digits | The analysis day on 2 digits | 02
crawl_collection_slug.month | The analysis month | 1
crawl_collection_slug.month_2digits | The analysis month on 2 digits | 01
crawl_collection_slug.year | The analysis year | 2021
crawl_collection_slug.week_number | The analysis ISO week number | 53
crawl_collection_slug.date_next_week | The analysis date one week after | 2021-01-09
crawl_collection_slug.day_next_week | The analysis day one week after | 9
crawl_collection_slug.day_2digits_next_week | The analysis day one week after on 2 digits | 09
crawl_collection_slug.month_next_week | The analysis month one week after | 1
crawl_collection_slug.month_2digits_next_week | The analysis month one week after on 2 digits | 01
crawl_collection_slug.year_next_week | The analysis year one week after | 2021
crawl_collection_slug.week_number_next | The analysis ISO week number one week after | 1
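
To preview what a template will resolve to before running an export, the substitution can be emulated locally. The sketch below is not Botify's implementation: the functions crawl_variables and expand are hypothetical, and the $ prefix is inferred from the example configuration above. It derives every date value in the table from the crawl date.

```python
import re
from datetime import date, timedelta


def crawl_variables(crawl_date):
    """Compute the date-derived values listed in the table for one crawl."""
    next_week = crawl_date + timedelta(days=7)
    return {
        "date": crawl_date.isoformat(),
        "day": str(crawl_date.day),
        "day_2digits": f"{crawl_date.day:02d}",
        "month": str(crawl_date.month),
        "month_2digits": f"{crawl_date.month:02d}",
        "year": str(crawl_date.year),
        "week_number": str(crawl_date.isocalendar()[1]),
        "date_next_week": next_week.isoformat(),
        "day_next_week": str(next_week.day),
        "day_2digits_next_week": f"{next_week.day:02d}",
        "month_next_week": str(next_week.month),
        "month_2digits_next_week": f"{next_week.month:02d}",
        "year_next_week": str(next_week.year),
        "week_number_next": str(next_week.isocalendar()[1]),
    }


def expand(template, user, project, slug, crawl_date):
    """Expand $-prefixed context variables in `template` (local emulation)."""
    prefix = f"crawl.{slug}"
    values = {"user": user, "project": project, prefix: slug}
    for name, value in crawl_variables(crawl_date).items():
        values[f"{prefix}.{name}"] = value
    # Try the longest names first so `day_2digits` wins over `day`.
    names = sorted(values, key=len, reverse=True)
    pattern = re.compile(r"\$(" + "|".join(re.escape(n) for n in names) + r")")
    return pattern.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    crawl = date(2021, 1, 2)
    args = ("botify-team", "botify-blog", "20210102", crawl)
    print(expand("$crawl.20210102.year-$crawl.20210102.month_2digits", *args))
    print(expand("botify-$crawl.20210102.day_2digits.csv.gz", *args))
```

Matching the longest variable name first matters because several names share a prefix, for example day, day_2digits and day_2digits_next_week.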
