I am trying to crawl some domains with different user agents. My crawler works fine; the problem occurs when a domain does not have an SSL certificate and is insecure. In that case, I do not get any response with HttpClient. To get around that, I use an HttpClientHandler and handle the certificate validation myself.
With this solution I get a 301 for all of those domains. It feels as if AllowAutoRedirect were false, but it is not. I also tried setting MaxAutomaticRedirections to 5, but that did not help either.
Here is my code:
public async Task<int> Crawl(string userAgent, string url)
{
    var handler = new HttpClientHandler();
    handler.ClientCertificateOptions = ClientCertificateOption.Manual;
    // Accept every server certificate, including invalid ones (insecure).
    handler.ServerCertificateCustomValidationCallback =
        (httpRequestMessage, cert, certChain, policyErrors) =>
        {
            return true;
        };
    var httpClient = new HttpClient(handler);
    httpClient.DefaultRequestHeaders.Add("User-Agent", userAgent);
    // Send a GET request and return the numeric HTTP status code.
    var statusCode = (int)(await httpClient.SendAsync(new HttpRequestMessage(HttpMethod.Get, url))).StatusCode;
    return statusCode;
}