Dealing with Captchas

User agent

By default, the crawlers send HTTP requests with a SOSSE User-Agent HTTP header. This can sometimes lead websites to flag the crawler as a robot and display a captcha. To mitigate this, SOSSE can use the Fake user-agent library to simulate a real browser user agent. This is enabled through the corresponding options in the configuration file.

Cookies

The captcha can also be solved manually in a browser. The browser's cookies can then be exported and imported into SOSSE; see the Cookies documentation.
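
Before importing an exported cookie file, it can help to check that it actually contains the cookie set after solving the captcha. The following is a minimal sketch, not part of SOSSE, and it assumes the browser export uses the Netscape cookies.txt format. The helper name load_exported_cookies, the sample domain, and the cookie name captcha_passed are all hypothetical, chosen for illustration only.

```python
import http.cookiejar
import os
import tempfile


def load_exported_cookies(path):
    """Return (domain, name, value) tuples from a Netscape-format cookie file."""
    jar = http.cookiejar.MozillaCookieJar()
    # Keep session cookies and expired cookies so nothing is silently dropped.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return [(cookie.domain, cookie.name, cookie.value) for cookie in jar]


# Hypothetical export: one cookie for example.com (assumed file contents).
SAMPLE = (
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t4102444800\tcaptcha_passed\t1\n"
)

# Write the sample to a temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(SAMPLE)
    sample_path = handle.name

cookies = load_exported_cookies(sample_path)
os.remove(sample_path)

for domain, name, value in cookies:
    print(domain, name, value)
```

In practice, the path of the real exported file would be passed to the helper instead of the temporary sample. If the expected cookie is missing from the output, the export was probably taken before the captcha was solved or from a different browser profile.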