Export my RealKeywords data
In this section, we will see how to export one million raw data rows from your RealKeywords integration from January 2021.
1. Get your configuration
We will need 3 pieces of information to run the export. All can be gathered by following the guide in Getting started.
- the username and project_slug, which identify the project we are targeting
- your API Token, which is used to identify you
In the rest of the tutorial, we will consider these values:
- username: botify-team
- project_slug: botify-blog
- API Token: 123abc
The period we will use in the snippets below runs from January 1st 2021 to January 31st 2021. If you want to export data for another period of time, you can change the periods key and select a period for which data is available on your project.
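If you target another month, the two dates of the period can be computed instead of typed by hand. The snippet below is a minimal sketch in Python; month_period is a hypothetical helper written for this tutorial, not part of the Botify API.

```python
import calendar
from datetime import date


def month_period(year, month):
    """Return [first_day, last_day] of the given month as ISO date strings."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return [date(year, month, 1).isoformat(), date(year, month, last_day).isoformat()]


# Value to place under the "periods" key of the query.
periods = [month_period(2021, 1)]
print(periods)  # prints [['2021-01-01', '2021-01-31']]
```

The result has the same shape as the periods key used in the rest of this tutorial: a list containing one [start, end] pair.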
2. The BQL query
This is the BQL query that we will run in order to fetch keywords data.
For each combination of URL, keyword, device and country, and for each day of data, this query will fetch:
- the number of clicks
- the number of impressions
- the average position
- the Clickthrough rate (CTR)
- the number of missed clicks
- whether this combination of dimensions is new on this day or not
{
  "collections": [
    "search_console"
  ],
  "periods": [
    ["2021-01-01", "2021-01-31"]
  ],
  "query": {
    "dimensions": [
      "url",
      "keyword",
      "device",
      "country",
      "search_console.period_0.date"
    ],
    "metrics": [
      "search_console.period_0.count_clicks",
      "search_console.period_0.count_impressions",
      "search_console.period_0.avg_position",
      "search_console.period_0.ctr",
      "search_console.period_0.count_missed_clicks",
      "search_console.period_0.is_new"
    ],
    "sort": [
      {
        "index": 0,
        "type": "metrics",
        "order": "desc"
      }
    ]
  }
}
This query should give you a good overview of your keywords data, up to the first million combinations.
Feel free to remove dimensions in order to aggregate your data by only a subset of those.
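As an illustration of aggregating by a subset of dimensions, the sketch below builds the same query in Python with only the dimensions you pass in. build_query is a hypothetical helper name; the field names and the sort clause are copied from the query above, and nothing here calls the API.

```python
import json

# Dimensions and metrics taken from the BQL query shown in this section.
ALL_DIMENSIONS = ["url", "keyword", "device", "country", "search_console.period_0.date"]
METRICS = [
    "search_console.period_0.count_clicks",
    "search_console.period_0.count_impressions",
    "search_console.period_0.avg_position",
    "search_console.period_0.ctr",
    "search_console.period_0.count_missed_clicks",
    "search_console.period_0.is_new",
]


def build_query(dimensions, start, end):
    """Build the BQL query, aggregated on the given subset of dimensions."""
    unknown = [d for d in dimensions if d not in ALL_DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    return {
        "collections": ["search_console"],
        "periods": [[start, end]],
        "query": {
            "dimensions": list(dimensions),
            "metrics": list(METRICS),
            # Sort on the first metric (clicks), highest first.
            "sort": [{"index": 0, "type": "metrics", "order": "desc"}],
        },
    }


# Aggregate by keyword and country only.
query = build_query(["keyword", "country"], "2021-01-01", "2021-01-31")
print(json.dumps(query, indent=2))
```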
3. Execute the API call
To launch the export, you will need to send an HTTP request to our servers.
You should be able to import the cURL command below into an HTTP tool if you use one.
Use your own configuration
Don't forget to replace:
- 123abc in --header 'Authorization: Token 123abc' with your own API token value
- botify-team in "username": "botify-team", with the project's username
- botify-blog in "project": "botify-blog", with your project slug
curl --location --request POST 'https://api.botify.com/v1/jobs' \
  --header 'Authorization: Token 123abc' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "job_type": "export",
    "payload": {
      "username": "botify-team",
      "project": "botify-blog",
      "connector": "direct_download",
      "formatter": "csv",
      "formatter_config": {
        "print_header": true
      },
      "export_size": 5000,
      "query": {
        "collections": ["search_console"],
        "periods": [
          ["2021-01-01", "2021-01-31"]
        ],
        "query": {
          "dimensions": [
            "url",
            "keyword",
            "device",
            "country",
            "search_console.period_0.date"
          ],
          "metrics": [
            "search_console.period_0.count_clicks",
            "search_console.period_0.count_impressions",
            "search_console.period_0.avg_position",
            "search_console.period_0.ctr",
            "search_console.period_0.count_missed_clicks"
          ],
          "sort": [
            {
              "index": 0,
              "type": "metrics",
              "order": "desc"
            }
          ]
        }
      }
    }
  }'
If the export was launched correctly, you should get a response like:
{
  "job_id": 99999,
  "job_type": "export",
  "job_url": "/v1/jobs/99999",
  "job_status": "CREATED",
  "payload": {...},
  "results": null,
  "date_created": "2021-03-15T16:45:48.110189Z",
  "user": "botify-team",
  "metadata": null
}
In the actual response, the payload field is shown in full rather than as {...}.
If the job_status is CREATED, the job was created successfully 🎉
The information you will need here is the job_id: 99999. We will use it to fetch the job's status.
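If you prefer a script to cURL, the same request can be sketched with Python's standard library. This is only an illustration under assumptions: build_export_request and launch_export are hypothetical helper names, the payload mirrors the cURL command above, and the response is assumed to have the shape shown in this section.

```python
import json
import urllib.request

API_URL = "https://api.botify.com/v1/jobs"


def build_export_request(token, username, project, bql_query, export_size=5000):
    """Prepare (without sending) the POST request that creates the export job."""
    body = {
        "job_type": "export",
        "payload": {
            "username": username,
            "project": project,
            "connector": "direct_download",
            "formatter": "csv",
            "formatter_config": {"print_header": True},
            "export_size": export_size,
            "query": bql_query,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def launch_export(request):
    """Send the request and return the job_id found in the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["job_id"]


# Usage, with the example values of this tutorial (not executed here):
# request = build_export_request("123abc", "botify-team", "botify-blog", bql_query)
# job_id = launch_export(request)
```

Building the request separately from sending it makes it easy to inspect the headers and body before anything reaches the API.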
4. Fetch the job status
Now that the job is in the pipeline, we will fetch its status until it is done.
For more details, see Export job reference.
We will send a GET request using the job_id from the previous response.
curl --location --request GET 'https://api.botify.com/v1/jobs/99999' \
--header 'Authorization: Token 123abc'
This will return something like:
{
  "job_id": 99999,
  "job_type": "export",
  "job_url": "/v1/jobs/99999",
  "job_status": "DONE",
  "results": {
    "nb_lines": 956,
    "download_url": "https://d121xa69ioyktv.cloudfront.net/collection_exports/a/b/c/abcdefghik987654321/botify-2021-03-15.csv.gz"
  },
  "date_created": "2021-03-15T16:45:48.110189Z",
  "payload": {...},
  "user": "botify-team",
  "metadata": null
}
If the job_status is PROCESSING, wait a bit and run the same request until the status switches to DONE.
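The "wait a bit and run the same request" loop can be automated. Here is a minimal sketch in Python, with assumptions stated plainly: fetch_job and wait_for_job are hypothetical helper names, the 10-second interval and the limit of 60 attempts are arbitrary, and any status other than CREATED, PROCESSING or DONE is treated as an error because this tutorial does not list the other possible values.

```python
import json
import time
import urllib.request


def fetch_job(job_id, token):
    """GET the job description, as in the cURL command above."""
    request = urllib.request.Request(
        f"https://api.botify.com/v1/jobs/{job_id}",
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(fetch, interval=10.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until job_status is DONE, then return the results object."""
    for _ in range(max_attempts):
        job = fetch()
        status = job["job_status"]
        if status == "DONE":
            return job["results"]
        if status not in ("CREATED", "PROCESSING"):
            # Assumption: an unlisted status means the job will not complete.
            raise RuntimeError(f"unexpected job status: {status}")
        sleep(interval)
    raise TimeoutError("the job did not finish within the allowed attempts")


# Usage, with the example values of this tutorial (not executed here):
# results = wait_for_job(lambda: fetch_job(99999, "123abc"))
# print(results["download_url"])
```

Passing the fetch function as an argument keeps the loop independent of the network, so it can be tried with a fake function first.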
5. Fetch the results
Once the job is done, the results object will have a download_url field. The URL links directly to your exported SEO data. Download it by accessing the given link.
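The download can also be scripted. The sketch below streams the file to disk with Python's standard library; download_export is a hypothetical helper name, and it assumes the download_url can be fetched with a plain GET and no extra headers.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_export(download_url, target_dir="."):
    """Stream the file behind download_url into target_dir and return its path."""
    # Reuse the file name at the end of the URL, e.g. botify-2021-03-15.csv.gz.
    filename = Path(urlparse(download_url).path).name
    target = Path(target_dir) / filename
    with urllib.request.urlopen(download_url) as response, open(target, "wb") as out:
        # Copy in chunks so that a large export is never fully held in memory.
        shutil.copyfileobj(response, out)
    return target


# Usage (not executed here): pass the download_url from the results object.
# path = download_export(results["download_url"])
```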
6. Extract the result
Once the file is downloaded, you may notice that its name ends with .csv.gz: the data is compressed. Software on your operating system should be able to extract the CSV file.
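The file can also be read without extracting it first. The following Python sketch decompresses it on the fly and yields one dictionary per row; read_export is a hypothetical helper name, and it assumes a UTF-8, comma-separated file with a header line (the export above was requested with print_header set to true).

```python
import csv
import gzip


def read_export(path, limit=None):
    """Yield the rows of a .csv.gz export as dicts keyed by the header line."""
    # Text mode ("rt") decompresses while reading; newline="" is what the csv module expects.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            if limit is not None and index >= limit:
                break
            yield row


# Usage (not executed here): print the first five rows of the downloaded file.
# for row in read_export("botify-2021-03-15.csv.gz", limit=5):
#     print(row)
```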
For more details about the data export options, see Export your data and its subsections. Among the existing options, you can connect this kind of export directly to your storage system through a connector.