I have written my code and tested the first part (logging in to the website). Now I am trying to add a screen-scraping part and am having trouble getting the result I want. When I run the code it prints "None", and I'm unsure what is causing this. I think I may not be targeting the right attribute when scraping.

import requests
import urllib2
from bs4 import BeautifulSoup

with requests.session() as c:
    url = 'https://signin.acellus.com/SignIn/index.html'
    USERNAME = 'My user name'
    PASSWORD = 'my password'
    c.get(url)
    login_data = dict(Name=USERNAME, Psswrd=PASSWORD, next='/')
    c.post(url, data=login_data, headers={"Referer": "https://www.acellus.com/"})
    page = c.get('https://admin252.acellus.com/StudentFunctions/progress.html?ClassID=326')


quote_page = 'https://admin252.acellus.com/StudentFunctions/progress.html?ClassID=326'
page = urllib2.urlopen(quote_page)
soup = BeautifulSoup(page, 'html.parser')
price_box = soup.find('div', attrs={'class':'Object7069'})
price = price_box
print price

This is a screenshot of the "inspect element" of the data I want to screen scrape

Kyle
  • I'm confused; you get the page using requests (while logged in), but then get it again using urllib2, in which you don't log in... did you check whether the second one redirected you to a login page? – Foon Jan 09 '18 at 23:01
  • You create a request session, login, and then close it. – SuperStew Jan 09 '18 at 23:11
  • Sorry this probably sounds like a stupid question but how would I check if it redirected me to a login page? – Kyle Jan 10 '18 at 02:05
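
One rough way to check, assuming the site answers with an ordinary HTTP redirect (it could instead return the login form with a 200 status, which this would miss): a requests response keeps the redirect chain in `response.history` and the final address in `response.url`. The helper name `was_redirected_to_login` and the `login_host` default are only illustrative.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def was_redirected_to_login(response, login_host='signin.acellus.com'):
    """Return True if the request was redirected and ended on the login host.

    `response` is expected to look like a requests.Response: it needs a
    `.url` (final URL) and a `.history` (list of redirect responses).
    The default `login_host` is an assumption taken from the login URL
    in the question.
    """
    final_host = urlparse(response.url).netloc
    was_redirected = len(response.history) > 0
    return was_redirected and final_host == login_host

# Usage (needs network access):
#   import requests
#   r = requests.get('https://admin252.acellus.com/StudentFunctions/progress.html?ClassID=326')
#   print(r.url, r.history, was_redirected_to_login(r))
```

Printing `r.url` on its own is often enough: if it shows the sign-in address instead of the progress page, the request was not authenticated.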

1 Answer

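I don't think mixing requests and urllib2 is a good idea here: the page you fetch with urllib2 is requested outside the session you logged in with. For Python 2.x there is the mechanize module, which lets you log in through the form and then retrieve content with the same browser object. Here is what your code would look like: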

import mechanize
from bs4 import BeautifulSoup

# logging in...
br = mechanize.Browser()
br.set_handle_robots(False)
br.open("https://signin.acellus.com/SignIn/index.html")
br.select_form(nr=0)
br['AcellusID'] = 'your username'
br['Password'] = 'your password'
br.submit()

# parsing required information..
quote_page = 'https://admin252.acellus.com/StudentFunctions/progress.html?ClassID=326'
page = br.open(quote_page).read()
soup = BeautifulSoup(page, 'html.parser')
# find() returns None if the page has no <div> with this class
price_box = soup.find('div', attrs={'class':'Object7069'})
print price_box

Reference link: http://www.pythonforbeginners.com/mechanize/browsing-in-python-with-mechanize/

P.S.: At the time of writing, mechanize is only available for Python 2.x. If you wish to use Python 3.x, there are other options (see "Installing mechanize for python 3.4").
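
One such option, as a rough sketch rather than a tested solution: stay with requests and reuse the logged-in session for the second request, so the login cookies are sent along. This also runs on Python 3. The function names `find_div` and `scrape_progress` are made up for illustration, and the form field names (`AcellusID`, `Password`) and the idea of posting to the sign-in URL are assumptions carried over from the code above; they have to match the real login form.

```python
from bs4 import BeautifulSoup

LOGIN_URL = 'https://signin.acellus.com/SignIn/index.html'
PROGRESS_URL = 'https://admin252.acellus.com/StudentFunctions/progress.html?ClassID=326'


def find_div(html, css_class):
    """Return the first <div> with the given class, or None if there is none."""
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find('div', attrs={'class': css_class})


def scrape_progress(session, username, password, css_class='Object7069'):
    """Log in and fetch the progress page with the SAME session object.

    `session` is expected to behave like requests.Session (get/post).
    Field names and the POST target are assumptions, not verified
    against the real site.
    """
    session.get(LOGIN_URL)  # pick up any initial cookies
    session.post(LOGIN_URL, data={'AcellusID': username, 'Password': password})
    page = session.get(PROGRESS_URL)  # cookies from the login are reused here
    return find_div(page.text, css_class)

# Usage (needs network access and valid credentials):
#   import requests
#   with requests.Session() as s:
#       print(scrape_progress(s, 'your username', 'your password'))
```

If this still prints None, the login probably did not succeed or the element is filled in by JavaScript after the page loads, in which case it will not be in the HTML that requests receives.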

manji369