I am trying to download a file (named 010010-99999-year.gz) from an FTP server. The same file exists for different years, and each year's copy sits in its own FTP directory. For instance:
ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-lite/2000/010010-99999-2000.gz
ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-lite/2001/010010-99999-2001.gz
and so on. The picture illustrates one of the directories:
The file is not present in every directory (i.e. for every year). In that case I want the script to skip the missing file, print "not available", and continue with the next directory (i.e. the next year). I could do this with an NLST listing, by first generating a list of the files in the current FTP directory and then checking whether my file is on that list, but that is slow, and NOAA (the organization that owns the server) does not like file listings (source). Therefore I came up with this code:
from ftplib import FTP

def FtpDownloader2(url="ftp.ncdc.noaa.gov"):
    ftp=FTP(url)
    ftp.login()  # anonymous login
    for year in range(1901,2015):
        # each year has its own directory under isd-lite
        ftp.cwd("/pub/data/noaa/isd-lite")
        ftp.cwd(str(year))
        fullStationId="010010-99999-%s.gz" % year
        try:
            file=open(fullStationId,"wb")
            ftp.retrbinary('RETR %s' % fullStationId, file.write)
            print("File is available")
            file.close()
        except:
            print("File not available")
    ftp.close()
This downloads the existing files (years 1973-2014) correctly, but it also creates empty files for the years 1901-1972, even though the file is not on the FTP server for those years. Am I doing something wrong in my use of try and except, or is it some other issue?
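One likely explanation for the empty files: open(fullStationId, "wb") creates the local file before RETR is sent, so when the server rejects the request the empty file is already on disk. The sketch below is one possible way around that, not necessarily the only or best one. The helper name download_if_present and its parameters are hypothetical, and it assumes the server answers a request for a missing file with a permanent error (a 5xx reply such as 550), which ftplib raises as error_perm.

```python
import os
from ftplib import error_perm


def download_if_present(ftp, remote_name, local_path):
    """Try to RETR remote_name into local_path; return True on success.

    Hypothetical helper. Assumes a missing remote file makes the server
    send a 5xx reply, which ftplib raises as error_perm.
    """
    try:
        # open() creates the local file immediately, before RETR is sent.
        with open(local_path, "wb") as fh:
            ftp.retrbinary("RETR %s" % remote_name, fh.write)
    except error_perm:
        # The with-block has already closed the file, so the empty
        # (or partial) local file can be removed safely here.
        if os.path.exists(local_path):
            os.remove(local_path)
        return False
    return True
```

Inside the loop over years, the try/except could then become a call such as download_if_present(ftp, fullStationId, fullStationId), printing "File not available" when it returns False. Catching only error_perm, rather than using a bare except, also keeps unrelated problems (for example a dropped connection) from being reported as a missing file.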