Using: Rails 3.1.1

I am using the googleajax gem to perform Google searches in a script that runs several thousand searches.

After about 20 searches, I need a rescue that waits and retries, since it seems you cannot perform more than a certain number of searches in a row. After approximately one minute the retry succeeds and the script continues for about 10 more searches. The result is roughly one minute per 10 searches, which makes the script incredibly slow.

It seems likely that Google limits the number of searches one can perform (based on IP? based on the googleajax referrer?), but is there a way around it?

What can I do to be able to perform Google searches through the googleajax gem without having to pause and wait all the time? What alternatives do I have?

The code (with unimportant parts cut out):

            begin
              puts "Searching with " + gsquery
              results = GoogleAjax::Search.web(gsquery)[:results]
              if results.count > 0
                puts "#{results.count} results found for #{page.name}. Registering the connection!"
              end
            # Timeout::Error must come first: on Ruby 1.9+ it is a StandardError,
            # so a bare rescue placed above it would catch it instead.
            rescue Timeout::Error
              puts "Timeout Error, sleep 15 sec"
              sleep 15
              retry
            rescue
              puts "Try again in 3 sec"
              sleep 3
              retry
            end
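
The wait-and-retry logic above loops forever if the block never lifts. A minimal sketch of a bounded retry with exponential backoff might look like the following. The helper name `with_backoff` and its parameters are hypothetical (not part of googleajax), and this only spaces out the retries; it does not get around whatever quota Google applies.

```ruby
require "timeout"

# Hypothetical helper: runs the given block and, on failure, retries it with
# exponentially growing pauses. Gives up (re-raises) after max_attempts.
# The sleeper argument exists only so the pause can be stubbed out in tests.
def with_backoff(max_attempts = 5, base_delay = 2, sleeper = Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield
  rescue StandardError, Timeout::Error => e
    # Re-raise the original error once the attempt budget is used up.
    raise if attempt >= max_attempts
    delay = base_delay * (2**(attempt - 1)) # 2, 4, 8, ... seconds by default
    puts "#{e.class}: retrying in #{delay} sec (attempt #{attempt} of #{max_attempts})"
    sleeper.call(delay)
    retry
  end
end
```

With this in place, the search call would presumably become `results = with_backoff { GoogleAjax::Search.web(gsquery)[:results] }`.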
Gian
Christoffer

2 Answers

Sorry, but I think you're out of luck. GoogleAjax uses the web search API, which has been deprecated for over a year and may disappear at any point, making the gem useless. Secondly, both the web search API and its replacement are limited to a maximum number of queries a day, beyond which the service simply stops responding; for the Custom Search API the limit is 100 queries a day. To get more than that you'll have to pay (the rate is $5 per 1,000 searches). The rate limit is based on the number of queries associated with a single API key.

I'd suggest that you:

  1. Use the google-api-client gem instead of GoogleAjax (it uses the Custom Web Search API which replaces the web search API)
  2. Get an API key for the custom search API using Google's API console
  3. Consider enabling billing. Half a cent per search is not terrible: a couple of thousand searches will cost you only about $10.
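
As a rough illustration of steps 1 and 2, here is a minimal sketch that queries the Custom Search JSON API directly over HTTPS using only the Ruby standard library, rather than going through the google-api-client gem's own interface (which is not shown here). The method names are hypothetical, and `api_key` and `engine_id` stand for the key from the API console and the ID of a custom search engine you would have to create yourself.

```ruby
require "net/http"
require "json"
require "uri"

# Endpoint of the Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

# Builds the request URI. "key" is the API key, "cx" the search engine ID,
# and "q" the query string.
def custom_search_uri(query, api_key, engine_id)
  uri = URI.parse(CSE_ENDPOINT)
  uri.query = URI.encode_www_form("key" => api_key, "cx" => engine_id, "q" => query)
  uri
end

# Extracts the result URLs from a JSON response body.
# A response with no hits has no "items" key, so fall back to an empty list.
def parse_links(body)
  (JSON.parse(body)["items"] || []).map { |item| item["link"] }
end

# Performs the search (hypothetical wrapper; needs network access and valid credentials).
def custom_search(query, api_key, engine_id)
  uri = custom_search_uri(query, api_key, engine_id)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  response = http.request(Net::HTTP::Get.new(uri.request_uri))
  parse_links(response.body)
end
```

Each call to `custom_search` counts against the daily quota tied to the API key, so the limits described above still apply.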
Chris Bailey
  • Thanks, Chris! I feared that would be the answer. But is there no way to hack GoogleAjax (while it is still around) to perform more searches before that happens? I guess the limit is set on searches per IP number or something? If I run the script on several computers (servers) or, even better, alter the IP artificially, would that make it work? Perhaps alter the referring URL (I haven't been able to test that yet since the script is running)? Is that even possible, or will Google slap my fingers if I do? I have some 150,000 searches (don't ask) and am not too happy to pay that sum right now. – Christoffer Jan 07 '12 at 01:29
  • My understanding was that the limit is applied per API key (this may only be true for newer APIs though). If you're not setting an API key then I'd assume that the limit is per IP address (as is the case for things like the static map API). – Chris Bailey Jan 08 '12 at 13:35
I've found this neat little gem to be quite handy in my latest project: Ruby - Google Search API

Here is a simple use case for searching for an image (in a Haml view). If the item's name is not an empty string, it searches for the item's name and renders the first 5 images found. If the item's name is an empty string, nothing is rendered.

    - if item.name != ""
      - Google::Search::Image.new(:query => item.name).first(5).each do |image|
        = image_tag(image.uri)