2

I wrote a simple test function using Selenium WebDriver in Python:

from selenium import webdriver

def test_webdriver():
    web = webdriver.PhantomJS()
    web.get('http://example.com')
    web.find_element_by_tag_name('html')
    web.find_element_by_tag_name('head')
    web.find_element_by_tag_name('meta')
    web.find_element_by_tag_name('body')
    web.find_element_by_tag_name('title')
    web.find_element_by_tag_name('p')
    web.find_element_by_tag_name('div')

This function took much longer than expected to run, so I profiled it with cProfile and saw some lines like this:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      ...
        9    0.000    0.000    0.157    0.017 .../python2.7/urllib2.py:386(open)
      ...

This clearly indicates that WebDriver is accessing the network on every find call in my test function.

I thought that WebDriver fetched the DOM once, and only once, with get() and then searched and manipulated it locally, similar to BeautifulSoup. It clearly doesn't work like that, so I'm left with some questions:

  • Is this the normal, expected behavior of webdriver, or just a misconfiguration on my part?
  • If this is normal behavior, then is there a way to force webdriver to not access the network on every function call?
  • What is it accessing the network for? It can't be refreshing the page on every find; that just wouldn't make any sense.

NOTE: I understand that JavaScript on the test page may fire off unintended network calls, which is why I'm using http://example.com as my test page, to eliminate that possibility.

galarant

3 Answers

5

I believe that communication between WebDriver and the browser happens over a network connection: https://code.google.com/p/selenium/wiki/JsonWireProtocol

So while it's certainly not making nine requests to example.com, it could still be making nine local network requests to WebDriver - in your example, that's one to provision a browser, one to ask the browser to perform the GET, and seven lookups within the page DOM.
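To make that count concrete, here is a rough sketch using only the Python 3 standard library. It does not use Selenium or a real browser: `StubDriverServer` is a hypothetical stand-in that simply records each HTTP request it receives, and `send_command` and `run_demo` are made-up helper names. The paths (`/session`, `/session/:id/url`, `/session/:id/element`) follow the JSON Wire Protocol page linked above; the session id `abc` and the response body are placeholders.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

requests_seen = []  # (method, path) for every HTTP request the stub receives


class StubDriverServer(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a WebDriver server; it drives no real browser."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the JSON command body
        requests_seen.append((self.command, self.path))
        body = json.dumps({"sessionId": "abc", "status": 0, "value": {}}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default per-request stderr logging


# Bypass any proxy configured in the environment; the stub is on localhost.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def send_command(base_url, path, payload):
    """POST one JSON command to the stub, as a client library would."""
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with opener.open(req) as resp:
        return json.loads(resp.read())


def run_demo():
    """Replay the test function's commands and return the request count."""
    requests_seen.clear()
    server = HTTPServer(("127.0.0.1", 0), StubDriverServer)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_address[1]
    try:
        send_command(base, "/session", {"desiredCapabilities": {}})  # provision
        send_command(base, "/session/abc/url", {"url": "http://example.com"})  # get()
        for tag in ["html", "head", "meta", "body", "title", "p", "div"]:
            send_command(base, "/session/abc/element",
                         {"using": "tag name", "value": tag})  # one per find call
    finally:
        server.shutdown()
        server.server_close()
    return len(requests_seen)
```

Calling `run_demo()` sends one request to create the session, one for the page load, and seven element lookups, all to localhost and none to example.com.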

There should be some way to get your WebDriver client library to log the actual calls it makes to the browser.
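One possible way to do that from Python, as a sketch: this assumes the Selenium Python client reports its HTTP commands through the standard `logging` module, using loggers named under the `selenium` package. I have not checked that for every version, so treat the logger name as an assumption. `enable_wire_logging` is a made-up helper name.

```python
import logging


def enable_wire_logging(stream=None):
    """Send DEBUG records from loggers under the 'selenium' namespace to a stream."""
    # Assumption: the client library logs under the 'selenium' package name.
    logger = logging.getLogger("selenium")
    handler = logging.StreamHandler(stream)  # defaults to stderr when stream is None
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger
```

Call `enable_wire_logging()` once before creating the driver. If the assumption holds, each command the client sends should show up as a log line.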

ArthurDenture
  • The server console logs each request it gets and each response it sends back, so checking whether you are right is as easy as looking at the server console output (I believe you are, so +1) – Arran Jul 07 '14 at 08:28
1

WebDriver is pretty low-level. You wouldn't want to implement general DOM caching there because the DOM changes constantly. Instead, build a framework on top of WebDriver which allows you to specify when caching will be appropriate. An example is the @CacheLookup annotation used by the Page Factory pattern of the Selenium-Java project.
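As a rough Python illustration of that idea (not the Java `@CacheLookup` implementation): `CachedLookup` is a hypothetical wrapper that remembers element handles it has already located, and `FakeDriver` is a stand-in that counts lookups instead of talking to a browser. The only driver method assumed is `find_element_by_tag_name`, the one used in the question.

```python
class CachedLookup:
    """Hypothetical helper: remembers elements it has already located."""

    def __init__(self, driver):
        self._driver = driver
        self._cache = {}

    def find_by_tag(self, tag):
        # Only ask the driver (one round trip) the first time a tag is requested.
        if tag not in self._cache:
            self._cache[tag] = self._driver.find_element_by_tag_name(tag)
        return self._cache[tag]

    def invalidate(self):
        # Call this whenever the page may have changed, e.g. after navigation.
        self._cache.clear()


class FakeDriver:
    """Stand-in for a real driver; counts lookups instead of making requests."""

    def __init__(self):
        self.lookups = 0

    def find_element_by_tag_name(self, tag):
        self.lookups += 1
        return "<%s element>" % tag
```

With a real driver this only saves the repeated lookup request. Any later call on the cached element still goes over the wire, and a cached handle can go stale if the DOM changes, which is why the caller has to decide when to invalidate.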

Community
Joe Coder
-1

You see network activity for each WebDriver call because that is how the WebDriver client communicates with the browser.

CMerrill