I am using Ruby on Rails with the Mechanize library to scrape store websites. The problem is that I often can't scrape certain elements, even though I can see them when I 'view source' on the site.
For example, the category on the Walmart page below (in this case "Health") is unscrapeable. I believe this is because the HTML is produced dynamically (e.g. by JavaScript). In order to scrape it, I need a browser to process the web request.
http://www.walmart.com/ip/Replacement-Sensor-Module-for-AlcoMate-Prestige-Breathalyzer/10167376
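To illustrate what I suspect is happening, here is a minimal sketch. The markup and the `static_text` helper are made up for illustration (this is not Walmart's actual page, and the element ids are assumptions). It uses REXML from Ruby's standard distribution instead of Mechanize, so it runs without a network connection. Like Mechanize, it only parses the markup it is given and never executes scripts.

```ruby
require 'rexml/document'

# Hypothetical markup, NOT Walmart's real page: the server sends an empty
# placeholder, and a script fills it in only when a browser executes it.
RAW_HTML = <<~HTML
  <html>
    <body>
      <h1 id="title">Replacement Sensor Module</h1>
      <span id="category"></span>
      <script>
        document.getElementById('category').textContent = 'Health';
      </script>
    </body>
  </html>
HTML

# Hypothetical helper: returns the text a static parser finds directly inside
# the element with the given id, or nil if no such element exists.
# No JavaScript is executed at any point.
def static_text(html, id)
  doc = REXML::Document.new(html)
  node = REXML::XPath.first(doc, "//*[@id='#{id}']")
  node ? node.texts.map(&:value).join.strip : nil
end

puts static_text(RAW_HTML, 'title').inspect    # text present in the raw markup
puts static_text(RAW_HTML, 'category').inspect # placeholder only; script never ran
```

If this is really the cause, the category element is found but comes back empty, because the script that would fill it in is never run.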
I am also running on a Linux machine on Amazon EC2, so it would be tough to install a browser for UI scraping. Is there any Rails gem or plugin that can help me?
Thanks, all!!