
The script has a simple purpose: checking whether a series of websites are up and running. I tried urllib first, but I get a certificate error.

Using http.client and tunneling through a proxy returns odd output until it reaches a website where it crashes with the [SSL: UNKNOWN_PROTOCOL] error.

The 2 problems that I have are:

  1. I cannot understand why I get a 404 response for one particular website, although the website works when I check it in the browser.

  2. At some point (when I check another website), I get the error "ssl.SSLError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:777)".

The code:

import http.client, csv

my_file = open('active_site.csv')
my_reader = csv.reader(my_file)
my_data = list(my_reader)
my_len = len(my_data)

g = 1
while g < 10:
    print("Checking {}....\n".format(my_data[g][3]))
    conn = http.client.HTTPSConnection("My_Proxy", my_port)
    conn.set_tunnel(my_data[g][3])
    conn.request("HEAD", "/index.html")
    res = conn.getresponse()
    if res.status == 200:
        print("{} is online!".format(my_data[g][3]))
        g += 1
        conn.close()
    else:
        print("{} seems to be offline".format(my_data[g][3]))
        g += 1
        conn.close()

I would appreciate any advice on where I am going wrong or where the code is incomplete.

Robert

1 Answer


@Robert,

For 1, the main reason for this behavior is usually that the server requires session information, such as an authorization token or a cookie. Open the same URL in the browser's incognito mode to see whether it still works. If that request returns 404, cookies and headers are the cause. Inspect the cookies and headers the browser sends in normal mode and try sending them with your HTTPSConnection request.
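As a rough sketch of passing browser headers along with the request, something like the following could work. The helper names build_headers and head_status, the default User-Agent string, and the Bearer token scheme are all assumptions for illustration; the real values have to be copied from what your browser actually sends.

```python
import http.client


def build_headers(cookie=None, token=None, user_agent="Mozilla/5.0"):
    """Assemble request headers copied from the browser.

    All values here are placeholders (assumptions), not values known
    to work for any particular site.
    """
    headers = {"User-Agent": user_agent}
    if cookie:
        headers["Cookie"] = cookie
    if token:
        # Assumes the site uses a Bearer token; adjust to what the browser shows.
        headers["Authorization"] = "Bearer " + token
    return headers


def head_status(host, proxy_host, proxy_port, path="/index.html", headers=None):
    """Send a HEAD request to host through the proxy tunnel and return the status code."""
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=10)
    try:
        conn.set_tunnel(host)
        # http.client accepts extra headers as a plain dict.
        conn.request("HEAD", path, headers=headers or {})
        return conn.getresponse().status
    finally:
        # Close the connection whether or not the request succeeded.
        conn.close()
```

In the loop from the question, this would be called as head_status(my_data[g][3], "My_Proxy", my_port, headers=build_headers(cookie="...")), with the cookie string taken from the browser's developer tools.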

For 2, I guess it is because your server uses TLS 1.3 for HTTPS. Try Python 3.8, which supports this version. See: https://docs.python.org/3/library/http.client.html#http.client.HTTPSConnection
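Before upgrading, it may help to check what the local interpreter supports. This is only a small sketch; tls_report is a made-up helper name, and ssl.HAS_TLSv1_3 exists only on Python 3.7 and later, which is why the code falls back to False through getattr.

```python
import ssl
import sys


def tls_report():
    """Collect the Python version, the linked OpenSSL version, and TLS 1.3 support."""
    return {
        "python": sys.version.split()[0],
        "openssl": ssl.OPENSSL_VERSION,
        # HAS_TLSv1_3 is missing on older Python versions, so default to False.
        "tls1_3": getattr(ssl, "HAS_TLSv1_3", False),
    }


print(tls_report())
```

If the report shows an old OpenSSL build or no TLS 1.3 support, a protocol mismatch with the server becomes a more plausible explanation, although a proxy or target that answers with plain HTTP on the tunneled port can also produce this kind of error.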