Documents

The list of all indexed documents can be reached from the Administration interface, by clicking on Documents. Regular expressions can be used in the search bar to match URLs or page titles.

_images/documents_list.png
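
As a rough illustration of the kind of patterns that can be typed in the search bar, the sketch below filters a few made-up documents with Python's `re` module. The sample URLs and titles and the `filter_documents` helper are hypothetical, and the regular expression dialect accepted by the search bar may differ from Python's.

```python
import re

# Hypothetical sample of indexed documents: (URL, title)
DOCUMENTS = [
    ("https://example.com/blog/2023/hello", "Hello world"),
    ("https://example.com/docs/install", "Installation guide"),
    ("https://example.org/about", "About us"),
]


def filter_documents(pattern, documents):
    """Return the documents whose URL or title matches the pattern."""
    regex = re.compile(pattern)
    return [
        (url, title)
        for url, title in documents
        if regex.search(url) or regex.search(title)
    ]


# Match every URL on example.com under /blog/ or /docs/
print(filter_documents(r"^https://example\.com/(blog|docs)/", DOCUMENTS))
# Match titles that start with "About"
print(filter_documents(r"^About", DOCUMENTS))
```

Anchoring the pattern with `^` keeps it from matching in the middle of a URL, and escaping the dot in `example\.com` avoids matching unrelated hosts.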

The document page contains fields about the crawl status of the page:

Status

Shows if the document triggered an error during its last crawl.

Error

The error that was triggered during the last crawl, if any.

Crawl DT

The interval before the next recrawl of the document.

Recursion remaining

The number of recursion levels remaining, when the matching policy crawls Depending on depth.
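
The idea of a remaining recursion counter can be sketched as follows. This is only a simplified model, not the crawler's actual code: the `LINKS` graph, the `crawl` function, and the rule that each followed link decrements the counter by one are assumptions made for illustration.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to
LINKS = {
    "a": ["b"],
    "b": ["c"],
    "c": ["d"],
    "d": [],
}


def crawl(start, recursion_depth):
    """Breadth-first crawl that stops following links when the counter hits 0.

    Returns a dict mapping each crawled page to its remaining recursion level.
    """
    remaining = {start: recursion_depth}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if remaining[page] == 0:
            continue  # no recursion left: links of this page are not followed
        for link in LINKS.get(page, []):
            if link not in remaining:
                remaining[link] = remaining[page] - 1
                queue.append(link)
    return remaining


print(crawl("a", 2))  # {'a': 2, 'b': 1, 'c': 0}
```

In this model, page `d` is never reached because page `c` is crawled with no recursion levels left.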

Rejected by robots.txt

Indicates if the URL was not crawled due to a robots.txt rule. If necessary, the robots.txt can be ignored in the Domain settings.

Too many redirects

Indicates if the page was not crawled due to too many redirections. The limit can be set in the configuration file.
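
A redirect limit of this kind can be modelled with a simple hop counter. The sketch below is an assumption-laden illustration rather than the crawler's implementation: the `REDIRECTS` table, the `resolve` function, and the `max_redirects` parameter name are all hypothetical.

```python
# Hypothetical redirect table: URL -> URL it redirects to
REDIRECTS = {
    "/a": "/b",
    "/b": "/c",
    "/c": "/d",
}


def resolve(url, max_redirects):
    """Follow redirects and return (last_url, too_many_redirects)."""
    hops = 0
    while url in REDIRECTS:
        if hops >= max_redirects:
            return url, True  # limit reached: give up on this page
        url = REDIRECTS[url]
        hops += 1
    return url, False


print(resolve("/a", 5))  # ('/d', False)
print(resolve("/a", 2))  # ('/c', True)
```

With a limit of 2, the chain is abandoned at `/c` and the page would be flagged, whereas a limit of 5 lets the chain resolve to `/d`.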

Show on homepage

When the browsable home option is enabled, this parameter controls whether the document is shown on the homepage. (See Archiving.)