Created
July 11, 2011 12:47
google search by image
| require 'rubygems' | |
| require 'net/http' | |
| require 'open-uri' | |
| require 'cgi' | |
| require 'nokogiri' | |
| def search_by_image_results_page(image) | |
| sbi_url = "http://www.google.com/searchbyimage?sbisrc=cr_1_0_0&image_url=#{CGI.escape(image)}" | |
| response = Net::HTTP.get_response(URI.parse(sbi_url)) | |
| response['location'] # Google answers with a 302 redirect; the Location header holds the results page URL | |
| end | |
| def parse_results(url) | |
| # Send a browser User-Agent: without one, Google redirects to the plain search page instead of the results | |
| chrome_user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.803.0 Safari/535.1" | |
| Nokogiri::HTML(URI.open(url, "User-Agent" => chrome_user_agent)) # URI.open: Kernel#open no longer fetches URLs on Ruby 3.0+ | |
| end | |
| def report(set) | |
| puts "\n"*3 | |
| unless set.empty? | |
| puts "The image [#{@image_url}] is used on the following urls:" | |
| set.each_with_index do |link, idx| | |
| puts " ##{idx+1} - #{link.text}: #{link['href']}" | |
| end | |
| end | |
| puts "\n"*3 | |
| end | |
| # The image we want to check | |
| @image_url = "http://www.ikbrunel.org.uk/userFiles/GreatBritainWeb.jpg" | |
| results_page = search_by_image_results_page(@image_url) | |
| parsed_page = parse_results(results_page) | |
| links = parsed_page.search('//li[@class="g"]//h3[@class="r"]//a') # quick, dirty and fragile | |
| report(links) | |
| [aitor@banba:~/workshop/github/aitor/eikon 12:45:34] ruby eikon.rb | |
| The image [http://www.ikbrunel.org.uk/userFiles/GreatBritainWeb.jpg] is used on the following urls: | |
| #1 - Brunel's ss Great Britain - a great day out in Bristol: http://www.ssgreatbritain.org/ | |
| #2 - SS Great Britain - Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/SS_Great_Britain | |
| #3 - Visually similar images: /search?hl=en&tbs=simg:CAESVhpUCxCwjKcIGjgKNggBEhAyO6IBpALWAXFalgHHA8ICGiC7Z4CRSqBwRbPycayBw-pY1paEp0-Eh4efms4VtdVSZgwLEI6u_1ggaCgoICAESBKuhSWcM&q=ss+great+britain&tbm=isch&sa=X&ei=efAaTtW7EMSu8gPkhpwO&ved=0CC8Qsw4 | |
| #4 - MyBrunel.co.uk :: Great Britain :: © 2011: http://www.mybrunel.co.uk/ships/britain/index.php | |
| #5 - SS (Steam Ship) Great Britain: http://www.crwflags.com/fotw/flags/gb~ssgb.html | |
| #6 - Brunel 200 Legacy: The Brunel Banquet: http://www.brunel200.com/legacy/brunel_200_events/banquet.htm | |
| #7 - victorian bristol | ss great britian | Event view: http://www.xtimeline.com/evt/view.aspx?id=73970 | |
| #8 - SS Great Britain hakkında ansiklopedik bilgi: http://www.turkcebilgi.com/ss_great_britain/ansiklopedi |
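Result #3 in the output above ("Visually similar images") comes back as a relative href, while the others are absolute URLs. A minimal sketch of one way to normalise them, assuming the base is the same `http://www.google.com` host used to build the search URL; `GOOGLE_BASE` and `absolute_href` are hypothetical names and not part of the original script:

```ruby
require 'uri'

# Assumed base host, taken from the search URL built in the script (hypothetical constant).
GOOGLE_BASE = "http://www.google.com"

# Hypothetical helper: resolve an href scraped from the results page
# against the base. Absolute URLs pass through unchanged.
def absolute_href(href, base = GOOGLE_BASE)
  URI.join(base, href).to_s
end

puts absolute_href("/search?hl=en&tbm=isch")           # relative href, resolved against the base
puts absolute_href("http://www.ssgreatbritain.org/")   # already absolute, left as-is
```

In `report`, this could be applied as `absolute_href(link['href'])` so that every printed link is usable on its own.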