I have a DataFrame common_ips
containing IPs as shown below.
I need to achieve two basic tasks:
- Identify private and public IPs.
- Check organisation for public IPs.
Here is what I am doing:
import json
import urllib
import re
baseurl = 'http://ipinfo.io/' # no HTTPS supported (at least: not without a plan)
def isIPpublic(ipaddress):
return not isIPprivate(ipaddress)
def isIPprivate(ipaddress):
if ipaddress.startswith("::ffff:"):
ipaddress=ipaddress.replace("::ffff:", "")
# IPv4 Regexp from https://stackoverflow.com/questions/30674845/
if re.search(r"^(?:10|127|172\.(?:1[6-9]|2[0-9]|3[01])|192\.168)\..*", ipaddress):
# Yes, so match, so a local or RFC1918 IPv4 address
return True
if ipaddress == "::1":
# Yes, IPv6 localhost
return True
return False
def getipInfo(ipaddress):
url = '%s%s/json' % (baseurl, ipaddress)
try:
urlresult = urllib.request.urlopen(url)
jsonresult = urlresult.read() # get the JSON
parsedjson = json.loads(jsonresult) # put parsed JSON into dictionary
return parsedjson
except:
return None
def checkIP(ipaddress):
if (isIPpublic(ipaddress)):
if bool(getipInfo(ipaddress)):
if 'bogon' in getipInfo(ipaddress).keys():
return 'Private IP'
elif bool(getipInfo(ipaddress).get('org')):
return getipInfo(ipaddress)['org']
else:
return 'No organization data'
else:
return 'No data available'
else:
return 'Private IP'
And applying it to my common_ips
DataFrame with
common_ips['Info'] = common_ips.IP.apply(checkIP)
But it's taking longer than I expected. And for some IPs, it's giving incorrect Info.
For instance:
where it should have been AS19902 Department of Administrative Services
as I cross-checked it by
and
What am I missing here ? And how can I achieve these tasks in a more Pythonic way ?