I'd like to grab my medical summary page from the Stanford Health website https://myhealth.stanfordmedicine.org/myhealth/inside.asp?mode=download&view=true and dump it into a JSON file. However, I seem to be struggling with just getting past the login page.

Here's the code I've come up with so far:

import mechanize

br = mechanize.Browser()

br.set_handle_robots(True)
br.set_handle_refresh(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1) 

# Open webpage and inspect its contents
url = "https://myhealth.stanfordmedicine.org/"

response = br.open(url)

# Test to make sure we've got the right page
# print response.read() # the text of the page

# Select form
br.select_form(nr=0)

# User credentials
br.form["Login"] = 'user@example.com'
br.form["Password"] = 'password123'
br.submit()

However, when I run it, I get the following error:

Traceback (most recent call last):
  File "test_mech_bitbybit.py", line 27, in <module>
    br.form["Login"] = 'user@example.com'
  File "build/bdist.macosx-10.6-intel/egg/mechanize/_form.py", line 2784, in __setitem__
ValueError: control 'Login' is disabled

From some research, it appears that JavaScript must be enabled for the login to be processed (in fact, with JavaScript disabled in a browser, the login/password fields become disabled and it's impossible to type anything into them). This leads me to believe that JavaScript has something to do with keeping the session alive and, possibly, handing off cookies to the browser. This is the point where I get overwhelmed and question whether I should even be using mechanize for this task.
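To illustrate the point about disabled fields: the `disabled` attribute lives only in the page markup, so the field names can still be read from the HTML even when a non-JavaScript client sees them as disabled. Below is a minimal sketch using only the Python 3 standard library. The sample markup, the `token` hidden field, and the `FormFieldCollector` class are all made up for illustration; I have not confirmed what the real login form contains beyond the `Login` and `Password` controls.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from <input> tags, including disabled ones."""

    def __init__(self):
        super().__init__()
        self.fields = {}       # field name -> default value from the markup
        self.disabled = set()  # names of fields marked disabled in the markup

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if not name:
            return
        # A bare attribute such as `disabled` arrives with the value None.
        self.fields[name] = attr.get("value") or ""
        if "disabled" in attr:
            self.disabled.add(name)


# Hypothetical markup resembling a login form whose inputs start out disabled.
SAMPLE_HTML = """
<form action="/login" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="Login" disabled="disabled">
  <input type="password" name="Password" disabled>
</form>
"""

collector = FormFieldCollector()
collector.feed(SAMPLE_HTML)
print(collector.fields)
print(sorted(collector.disabled))
```

Whether the server accepts a submission built from these fields without the site's JavaScript having run is a separate question that this sketch does not answer.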

Does anyone with experience in this area have the patience to walk me through it and explain what I need to do to get past this login page and/or mimic whatever the JavaScript is doing?

  • I don't think the mechanize module can interact with JavaScript; it's purely Python and HTTP based. Going through the site, it appears that the forms are actually disabled if JavaScript is turned off. Perhaps you should try something like http://stackoverflow.com/questions/5793414/mechanize-and-javascript – Rijvi Rajib Sep 22 '13 at 05:21
  • Rijvi, thank you for your response. You're correct, as I stated in my OP, the form is disabled if the site detects JavaScript isn't enabled (I'm assuming that's some sort of safeguard to prevent someone from logging in and then being unable to use the site and not knowing why). I'd like to know if there's a way to make the site believe I have JS enabled (via Python) and be able to properly submit login credentials. I also checked out the post you suggested and do not see it addressing my specific concern. – Jonathan Piccolo Sep 22 '13 at 07:03
  • You can try controlling a real browser to do the job using, e.g., [selenium](http://selenium-python.readthedocs.org/en/latest/), or attempt to `POST` the form data directly. If the JavaScript does not do any complicated tampering on submit, you could try implementing your own function that imitates the JS on the site. – Maciej Gol Sep 22 '13 at 14:07
  • Thank you, kroolik! I'm going to attempt to POST the data. Do you have a recommendation on what I should use? urllib, requests, or something else? – Jonathan Piccolo Sep 22 '13 at 19:07
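
For anyone following the `POST` suggestion above, here is a rough sketch of what that could look like with the Python 3 standard library (`urllib` plus a cookie jar to hold session cookies). The field names `Login` and `Password` come from the question; `LOGIN_ACTION_URL`, the helper names, and the `extra_fields` parameter are placeholders I made up, since the form's real action URL and any hidden fields would have to be read from the page source. The request is only built here, not sent.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_opener_with_cookies():
    """Return an opener that keeps cookies between requests, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(action_url, username, password, extra_fields=None):
    """Build (but do not send) a POST request carrying the login form data."""
    fields = dict(extra_fields or {})  # e.g. hidden inputs copied from the page
    fields["Login"] = username
    fields["Password"] = password
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(action_url, data=body, method="POST")


# Placeholder: the real value is the form's `action` attribute on the login page.
LOGIN_ACTION_URL = "https://example.invalid/login"

request = build_login_request(LOGIN_ACTION_URL, "user@example.com", "password123")
print(request.get_method(), request.data)

# Sending it would look roughly like this (left commented out on purpose):
# opener, jar = build_opener_with_cookies()
# opener.open("https://myhealth.stanfordmedicine.org/")  # GET first to collect cookies
# response = opener.open(request)                        # then POST the credentials
```

Using one opener for both the initial GET and the POST matters because the cookie jar is what carries the session from one request to the next. If the site's JavaScript computes extra values at submit time, those would need to be reproduced and passed in through `extra_fields`.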
