SiteCrawler - crawl collection
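The crawl collection corresponds to a single analysis available in SiteCrawler. It is a non-timestamped collection.
This page lists a subset of the fields available in the crawl collection. In the field slugs below, `20210411` stands in for the `YYYYMMDD` part of the collection identifier.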
Identifier
{
"collections": ["crawl.YYYYMMDD"],
...
}
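As a minimal sketch, the `crawl.YYYYMMDD` identifier can be built from a date object. The helper name `crawl_collection` and the assumption that `YYYYMMDD` is the date of the analysis are illustrative and not part of the Botify API. The rest of the request body (the `...` above) is left out.

```python
from datetime import date


def crawl_collection(analysis_date: date) -> str:
    """Format a date as a crawl collection identifier (hypothetical helper)."""
    # %Y%m%d yields the zero-padded YYYYMMDD form shown in the identifier.
    return f"crawl.{analysis_date:%Y%m%d}"


# Only the "collections" key is taken from the documentation above.
payload = {"collections": [crawl_collection(date(2021, 4, 11))]}
print(payload)
```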
Indexability fields
Name | Slug | Type |
---|---|---|
Is Indexable | crawl.20210411.indexable.is_indexable | Boolean |
Non-Indexable Reason is Non-Self Canonical Tag | crawl.20210411.indexable.reason.canonical | Boolean |
Non-Indexable Reason is Noindex Status | crawl.20210411.indexable.reason.noindex | Boolean |
Non-Indexable Reason is Non-200 HTTP Status Code | crawl.20210411.indexable.reason.http_code | Boolean |
Non-Indexable Reason is Bad Content-Type | crawl.20210411.indexable.reason.content_type | Boolean |
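The reason fields in the table above can be read together to explain why a URL is not indexable. The sketch below assumes each result row is available as a flat dictionary keyed by field slug. That row shape and the helper name `non_indexable_reasons` are assumptions for illustration, not the documented response format.

```python
from typing import Dict, List

# Slug suffixes and labels taken from the indexability table.
REASON_LABELS: Dict[str, str] = {
    "canonical": "Non-Self Canonical Tag",
    "noindex": "Noindex Status",
    "http_code": "Non-200 HTTP Status Code",
    "content_type": "Bad Content-Type",
}


def non_indexable_reasons(row: Dict[str, object],
                          collection: str = "crawl.20210411") -> List[str]:
    """Return the labels of every reason field that is true in the row."""
    prefix = f"{collection}.indexable.reason."
    return [label for suffix, label in REASON_LABELS.items()
            if row.get(prefix + suffix) is True]


# Hypothetical row: a noindexed page that also returned a non-200 status.
row = {
    "crawl.20210411.indexable.is_indexable": False,
    "crawl.20210411.indexable.reason.noindex": True,
    "crawl.20210411.indexable.reason.http_code": True,
    "crawl.20210411.indexable.reason.canonical": False,
}
print(non_indexable_reasons(row))
```

Missing reason fields are treated as not set, so a row that carries only some of the slugs still works.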
Crawl fields
Name | Slug | Type |
---|---|---|
Depth | crawl.20210411.depth | Integer |
HTTP Status Code | crawl.20210411.http_code | Integer |
Content Type | crawl.20210411.content_type | String |
Content Byte Size | crawl.20210411.byte_size | Integer |
Delay First Byte Received | crawl.20210411.delay_first_byte | Integer |
Delay Total | crawl.20210411.delay_last_byte | Integer |
Date Crawled | crawl.20210411.date_crawled | Datetime |
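Every slug in the tables above is the collection identifier followed by a field path. The sketch below splits a slug into those two parts and points it at another analysis. The helper names `split_slug` and `retarget` are hypothetical, and `crawl.20220101` is only an example identifier.

```python
from typing import Tuple


def split_slug(slug: str) -> Tuple[str, str]:
    """Split 'crawl.YYYYMMDD.<field path>' into (collection, field path)."""
    # The collection is the first two dot-separated parts; the field path
    # may itself contain dots (e.g. 'indexable.reason.canonical').
    name, day, field_path = slug.split(".", 2)
    return f"{name}.{day}", field_path


def retarget(slug: str, collection: str) -> str:
    """Rewrite a slug so it refers to the same field in another collection."""
    _, field_path = split_slug(slug)
    return f"{collection}.{field_path}"


print(split_slug("crawl.20210411.delay_first_byte"))
print(retarget("crawl.20210411.depth", "crawl.20220101"))
```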
Many more fields are available and can be explored in Botify.