Ruby Watir-webdriver saving image when navigating directly to the image - ruby

I'm trying to grab information from a series of pages that are loaded via JS. To do that, I'm using watir-webdriver to load each page and Nokogiri to parse it. This works well, but I also need to grab a picture from the page. The picture's path is generated when the page loads, so I wrote the following to build an array of relative URLs to the images and navigate directly to the absolute URL of the element at index 1, which is always the image I want.
img_srcs = $page_html.css('img').map{ |i| i['src'] } # generates an array of relative URLs pointing to every image
imageURL = "website.com" + img_srcs[1].gsub("..","").to_s # takes the relative URL of the image at index 1 and converts it to an absolute URL
$browser.goto(imageURL)
How can I save the image that the browser has loaded directly? Any help would be appreciated, and please let me know if anything is unclear.
Edit:
I've now added the following code
image_source = $browser.image(:class => "decoded").image.src
File.open("#{$imageID}.txt", "w") do |f|
  f.write open(image_source).read
  f.close
end
However, I'm getting the error
C:/Ruby192/lib/ruby/gems/1.9.1/gems/watir-webdriver-0.6.4/lib/watir-webdriver/elements/element.rb:490:in 'assert_exists': unable to locate element, using {:tag_name=>"img"} (Watir::Exception::UnknownObjectException)
from C:/Ruby192/lib/ruby/gems/1.9.1/gems/watir-webdriver-0.6.4/lib/watir-webdriver/attribute_helper.rb:71:in 'block in define_string_attribute'
from 12.rb:121:in 'imageDownload'
from 12.rb:134:in 'navAndGrab'
from 12.rb:137:in '<main>'

When you do:
$browser.image(:class => "decoded").image.src
You are looking for the html:
<img class="decoded">
<img src="what_you_want"></img>
</img>
I am guessing your html is not like that, hence you get the exception regarding finding the image within the image.
You probably just want the first image with class decoded (remove the second .image):
image_source = $browser.image(:class => "decoded").src
Or maybe you want the full list of images and then get the first one:
image_source = $browser.images(:class => "decoded").first.src
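Once image_source holds the image URL, writing it to disk could look like the sketch below. It assumes a reasonably modern Ruby (keyword arguments and URI.open; on Ruby 1.9 with open-uri, plain open(url) plays the same role) and that the image can be fetched without the browser's session cookies. The helper name save_image and its fetcher parameter are hypothetical, not part of Watir.

```ruby
require 'open-uri'

# Default way to fetch the bytes: open-uri over HTTP(S).
DEFAULT_FETCHER = ->(url) { URI.open(url) { |io| io.read } }

# Hypothetical helper: download the bytes at +src+ and write them to +dest+.
# The fetcher is injectable so the file-writing logic can be exercised
# without a network connection.
def save_image(src, dest, fetcher: DEFAULT_FETCHER)
  data = fetcher.call(src)
  # "wb" avoids newline translation on Windows, which would corrupt binary data.
  File.open(dest, 'wb') { |f| f.write(data) }
  dest
end

# Usage (assumes $browser and $imageID from the question):
#   image_source = $browser.image(:class => "decoded").src
#   save_image(image_source, "#{$imageID}.jpg")
```

Note the binary mode and an image file extension here; the question's code writes to a .txt file in text mode, which can mangle the image on Windows.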

Related

How to assert that image has been uploaded with Capybara?

Let's say I have a form where user can create a post with an image attached to it. I want to make sure the image attached is displayed on the next page:
visit url
fill_in the_form
click_on 'Create'
assert_selector '.post'
post = Post.first
img = page.find '.post .image'
assert_equal post.file.thumb.url, URI(img[:src]).path
But I'm told asserting against database objects in system tests is to be avoided. What do I do then?
So long as there's no "complex" file renaming happening on the backend, you already know the uploaded filename when populating the form:
fill_in the_form
Therefore, you could assert that the page contains an image with this name (perhaps using an xpath).
If there is trivial file renaming (e.g. replacing spaces with hyphens), then you could either (ideally) just choose a filename that does not change, or reproduce the renaming in your test.
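As a rough sketch of that idea, the check can be reduced to comparing the file name in the rendered src with the name of the file attached in the form. The helper name image_src_matches? and the file name avatar.png are assumptions for illustration; the end-of-name match allows for a version prefix such as thumb_.

```ruby
require 'uri'

# Hypothetical helper: true when the image URL's file name ends with the
# name of the file that was attached in the form. Uploaders often prefix
# versions (e.g. "thumb_avatar.png"), so an end-of-name match is used.
def image_src_matches?(src, uploaded_filename)
  File.basename(URI(src).path).end_with?(uploaded_filename)
end

# In a system test this might look like (field and file names are assumptions):
#   attach_file 'Image', 'test/fixtures/files/avatar.png'
#   click_on 'Create'
#   img = page.find '.post .image'
#   assert image_src_matches?(img[:src], 'avatar.png')
```

This keeps the assertion tied to what the user sees on the page rather than to a database record.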

download pdf files that are href links using ruby mechanize

Using Ruby Mechanize, I have successfully submitted input values to a form and can get the results page based on the search criteria. The results page has PDF files as href links that I need to download.
Attribute href has value:
href='xxx.do?FILENAME=path/abc.pdf&SEARCHTEXT=aaa&ID=123_4
where SEARCHTEXT is the text originally entered as input. When I click the link manually, the PDF opens in a new window with the URL http://someip:8080/xxx/temp/123_4, which contains the same ID seen in the href attribute. The actual filename, however, is different and has the form xxx.123_2_.doc. My code below returns a 0-byte file:
scraper.pluggable_parser.pdf = Mechanize::FileSaver
File.open('n1pdf.pdf', 'wb'){|f| f << scraper.get(alink).body}
where alink=http://someip:8080/xxx/temp/123_4
If I use
File.open("new.pdf", "w") do |f|
  uri = URI(alink)
  f << Net::HTTP.get(uri)
end
I get an HTTP Not Found error.
I am not sure whether I am doing this correctly. Is ID a session ID that is generated dynamically? All PDF files on the results page have this ID, with _1, _2, _3 as the filename (or URL) suffix.
Note that when I manually click and open a PDF file and then hardcode that URL in my code, the file downloads, but it does not when my code dynamically extracts the ID value and assigns it to alink. I'm not sure whether this is related to cookies. Kindly help. Thank you.
Make sure it's the right absolute url:
uri = scraper.page.uri.merge(a[:href])
puts uri # just check to be sure
File.open('n1pdf.pdf', 'wb'){|f| f << scraper.get(uri).body}
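To see what merge does with the href from the question, here is a small standalone sketch using only Ruby's uri library. The results page URL (results.do) is an assumption made up for illustration; in the real script it would come from scraper.page.uri.

```ruby
require 'uri'

# Assumed URL of the results page; in the real script this would be
# scraper.page.uri.
page_uri = URI('http://someip:8080/xxx/results.do')

# The relative href as it appears in the link.
href = 'xxx.do?FILENAME=path/abc.pdf&SEARCHTEXT=aaa&ID=123_4'

# Resolve the relative reference against the page it was found on.
absolute = page_uri.merge(href)
puts absolute
```

The resolved URL keeps the host, port, and directory of the page and swaps in the link's file name and query string, which is what the browser requests when the link is clicked.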

Ruby Watir -- Trying to loop through links in cnn.com and click each one of them

I have created this method to loop through the links in a certain div on the website. The purpose of the method is to collect the links, insert them into an array, and then click each one of them.
require 'watir-webdriver'
require 'watir-webdriver/wait'
site = Watir::Browser.new :chrome
url = "http://www.cnn.com/"
site.goto url
box = Array.new
container = site.div(class: "column zn__column--idx-1")
wanted_links = container.links
box << wanted_links
wanted_links.each do |link|
  link.click
  site.goto url
  site.div(id: "nav__plain-header").wait_until_present
end
site.close
So far it seems like I am only able to click on the first link then I get an error message stating this:
unable to locate element, using {:element=>#<Selenium::WebDriver::Element:0x634e0a5400fdfade id="0.06177683611003881-3">} (Watir::Exception::UnknownObjectException)
I am very new to ruby. I appreciate any help. Thank you.
The problem is that once you navigate to another page, all of the element references (ie those in wanted_links) become stale. Even if you return to the same page, Watir/Selenium does not know it is the same page and does not know where the stored elements are.
If you are going to navigate away, you need to collect all of the data you need first. In this case, you just need the href values.
# Collect the href of each link
wanted_links = container.links.map(&:href)
# You have each page URL, so you can navigate directly without returning to the homepage
wanted_links.each do |link|
  site.goto link
end
In the event that the links do not directly navigate to a page (eg they execute JavaScript when clicked), you will need to collect enough data to re-locate the elements later. What you use as the locator will depend on what is known to be static/unique. As an example, I will assume that the link text is a good locator.
# Collect the text of each link
wanted_links = container.links.map(&:text)
# Iterate through the links
wanted_links.each do |link_text|
  container = site.div(class: "column zn__column--idx-1")
  container.link(text: link_text).click
  site.back
end

How can I resize external images and serve them on-the-fly?

I have a Sinatra app that gets image URLs from an API, and I want to scale the images and then serve them without storing them on the server. Most of the gems I have seen only handle local images and process each one in a queue. I only need to scale five images and display them on the page. Is there a fast way to do this?
More Clarification:
I need a way to fetch an external image (e.g. notmysite.com/img.jpg) and have the code serve the scaled image on the page. I can't do it with CSS or other front-end methods because this page is going to be rendered by a script that distorts images scaled on the front end.
Dragonfly uses ImageMagick to scale the images. Here's some code I've cobbled together from previous stuff I've done with MiniMagick, so it'll be fairly similar.
Get the file into a Tempfile. I've done this here with Faraday and Typhoeus. Then scale it using magick!
require 'tempfile'
require 'mini_magick'
require 'faraday'
require 'faraday_middleware'
#require 'faraday/adapter/typhoeus' # see https://github.com/typhoeus/typhoeus/issues/226#issuecomment-9919517 if you get a problem with the requiring
require 'typhoeus/adapters/faraday'
configure do
  Faraday.default_connection = Faraday::Connection.new(
    :headers => { :accept => 'image/*',
                  :user_agent => "Sinatra via Faraday"}
  ) do |conn|
    conn.use Faraday::Adapter::Typhoeus
  end
end
helpers do
  def grab_image_and_scale
    response = Faraday.get url # you'll need to supply this variable somehow, your choice
    filename = "SOMETHING.jpg"
    # Keep a reference to the Tempfile itself; the block form of Tempfile.open
    # returns the block's value, not the file.
    tempfile = Tempfile.new([File.basename(filename, ".*"), File.extname(filename)])
    tempfile.binmode
    tempfile.write(response.body)
    tempfile.close
    thumb = MiniMagick::Image.open( tempfile.path )
    thumb.thumbnail( "75x75" )
    thumb.write( File.join settings.public_folder, "images", "thumb_#{filename}")
    # The original read from an undefined secure_path; the downloaded tempfile is used instead.
    scaled = MiniMagick::Image.open( tempfile.path )
    scaled.resize( "600" )
    scaled.write( File.join settings.public_folder, "images", "scaled_#{filename}")
  end
end
I'll leave it to you to work out how to change the path to the public images folder into a tempfile (and it'd be nice if you shared how it's done:)
One way, unrelated to Ruby or Sinatra, is to add width and height attributes to the img HTML tag. You'll end up with something like this:
<img src="img.source.from.API.JPG" width="2000" height="100" />
Edit
Another method is to change the dimensions using JavaScript, as suggested here: https://stackoverflow.com/a/11333825/693597. You might consider writing your own JS file and including it in the header of your HTML.

Rails3 - Problem saving base64 image with paperclip?

This is similar to the problem posted here, but that solution doesn't work for me. Maybe it's because I'm not passing the data in correctly.
I'm pulling screenshots from Flash and displaying them on the page using jQuery:
$SNAPSHOT_PREVIEW.attr("src","data:image/jpg;base64," + imgData);
$HIDDEN_BASE64_STRING.val(imgData);
I had it working nicely when saving the image to Rails from Flash, but for security reasons Flash won't allow you to invoke a POST action without the user pressing a button. Makes sense. Anyway, now I can't get Paperclip to save the image coming from the HTML form:
# (Photo has_attached_file :image)
@photo = params[:photo]
data = StringIO.new(Base64.decode64(params[:base64_string]))
data.class.class_eval { attr_accessor :original_filename, :content_type }
data.original_filename = "screenshots.jpg"
data.content_type = "image/jpg"
@photo.image = data
Yields the error:
NoMethodError (undefined method `image=' for #<ActiveSupport::HashWithIndifferentAccess:0x8c2e420>):
How do I need to finesse the base64 image data into a paperclip attachment?
For bonus points, do I need the hidden field to pass the data or is there a clever, browser compatible way to use the image src as a form value?
You have the code:
@photo = params[:photo]
params is just a hash, so later, when you call @photo.image, Rails bugs out. Perhaps you want:
@photo = Photo.new(params[:photo])
instead?
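For the decoding step itself, a minimal sketch is shown below. It wraps the decoded bytes in a StringIO subclass that carries the two attributes an uploaded file normally exposes, rather than reopening StringIO for the whole process as the question's class_eval does. The class name Base64Upload is hypothetical, and whether a given Paperclip version accepts such an object directly is an assumption carried over from the question's approach.

```ruby
require 'base64'
require 'stringio'

# Hypothetical wrapper: a StringIO holding the decoded image bytes, plus the
# file name and content type that an upload object usually provides.
class Base64Upload < StringIO
  attr_accessor :original_filename, :content_type

  def initialize(base64_string, filename, content_type)
    super(Base64.decode64(base64_string))
    self.original_filename = filename
    self.content_type = content_type
  end
end

# In the controller this might be used as (assuming a Photo model):
#   @photo = Photo.new(params[:photo])
#   @photo.image = Base64Upload.new(params[:base64_string], "screenshots.jpg", "image/jpeg")
#   @photo.save
```

The registered MIME type for JPEG is image/jpeg, so that is used in the usage comment instead of image/jpg.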
