SiteCrawler - crawl collection
The crawl collection corresponds to one analysis that is available in SiteCrawler. It is a non-timestamped collection.
This page lists a subset of the fields available in the crawl collection.
Identifier
{
"collections": ["crawl.YYYYMMDD"],
...
}
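As a minimal sketch, the identifier can be built programmatically. This assumes that `YYYYMMDD` stands for the date of the analysis, which the snippet above suggests but does not state. The helper names `crawl_collection_id` and `query_skeleton` are hypothetical and not part of any Botify library.

```python
from datetime import date


def crawl_collection_id(analysis_date: date) -> str:
    """Return an identifier of the form crawl.YYYYMMDD.

    Assumption: YYYYMMDD is the date of the analysis.
    """
    return f"crawl.{analysis_date:%Y%m%d}"


def query_skeleton(analysis_date: date) -> dict:
    """Build only the "collections" part of a query body.

    The rest of the body is elided ("...") in the snippet above,
    so it is deliberately left out here as well.
    """
    return {"collections": [crawl_collection_id(analysis_date)]}
```

For example, `crawl_collection_id(date(2023, 1, 5))` returns `"crawl.20230105"`.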
Indexability fields
Name | Slug | Type |
---|---|---|
Is Indexable | | Boolean |
Non-Indexable Reason is Non-Self Canonical Tag | | Boolean |
Non-Indexable Reason is Noindex Status | | Boolean |
Non-Indexable Reason is Non-200 HTTP Status Code | | Boolean |
Non-Indexable Reason is Bad Content-Type | | Boolean |
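The reason fields are independent Boolean flags, so a page can carry several at once. The sketch below collects the reasons set on a single record. It is illustrative only: the record is a plain dictionary keyed by the display names from the table, because the slugs are not shown here, and `non_indexable_reasons` is a hypothetical helper rather than a Botify function.

```python
from typing import Dict, List

# Display names taken from the table above. Real queries would use
# the field slugs, which are not listed in this section.
REASON_FIELDS = [
    "Non-Indexable Reason is Non-Self Canonical Tag",
    "Non-Indexable Reason is Noindex Status",
    "Non-Indexable Reason is Non-200 HTTP Status Code",
    "Non-Indexable Reason is Bad Content-Type",
]


def non_indexable_reasons(record: Dict[str, bool]) -> List[str]:
    """Return the names of the reason flags that are True on this record."""
    return [name for name in REASON_FIELDS if record.get(name, False)]
```

A record with only the noindex flag set to `True` yields a one-element list containing that field's name; a record with no flags set yields an empty list.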
Crawl fields
Name | Slug | Type |
---|---|---|
Depth | | Integer |
HTTP Status Code | | Integer |
Content Type | | String |
Content Byte Size | | Integer |
Delay First Byte Received | | Integer |
Delay Total | | Integer |
Date Crawled | | Datetime |
Many more fields are available and can be explored in Botify.