I'm grabbing image data using the Requests module. The data that comes back looks like binary data interpreted as text, like so:
`����JFIF��>CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), default quality
��C
$.' ",#(7),01444'9=82<.342��C
2!!
I have tried saving using:
image = open("test.jpg", "wb")
image.write(image_data)
image.close()
But that complains that it needs a bytes-like object. I have tried doing result.text.encode() with various formats like "utf-8" etc but the resulting image file cannot be opened. I have also tried doing bytes(result.text, "utf-8") and bytearray(result.text, "utf-8") and same problem. I think those are all roughly equivalent, anyway. Can someone help me convert this to a bytes-like object without destroying the data?
Also, the header in my request is 'image/jpeg', but it still sends me the data as a string.
Thanks!
Use the content field instead of text:
import requests
r = requests.get('https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png')
with open('test.png', 'wb') as file:
    file.write(r.content)
See: https://requests.readthedocs.io/en/master/user/quickstart/#binary-response-content
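To see why encoding the text again cannot recover the image, here is a small sketch that needs no network access. It assumes the text was produced by decoding the raw bytes as UTF-8 with invalid bytes replaced, which is roughly what happens when a text decoding is applied to binary data. The sample bytes are only the start of a typical JFIF header, not data from the question.

```python
# First bytes of a JPEG file: SOI marker followed by a JFIF APP0 header.
raw = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"

# Decoding as text swaps every invalid byte for U+FFFD (the "?" diamonds).
as_text = raw.decode("utf-8", errors="replace")

# Encoding that text again yields the bytes of U+FFFD, not the originals.
round_tripped = as_text.encode("utf-8")

print(raw == round_tripped)   # the two byte strings differ
print(round_tripped[:3])      # UTF-8 encoding of the first U+FFFD
```

The original bytes are gone once the replacement happens, so the only reliable route is to take the bytes before any decoding, which is what content gives you.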
I am using prawnpdf/pdf-inspector to test that content of a PDF generated in my Rails app is correct.
I want to check that the PDF file contains a link with a certain URL. I looked at yob/pdf-reader but haven't found any useful information on this topic.
Is it possible to test URLs within PDF with Ruby/RSpec?
I would like something like the following:
expect(urls_in_pdf(pdf)).to include 'https://example.com/users/1'
pdf-reader (https://github.com/yob/pdf-reader) provides a method called text on each page.
Do something like
pdf = PDF::Reader.new("tmp/pdf.pdf")
assert pdf.pages[0].text.include? 'https://example.com/users/1'
assuming what you are looking for is on the first page.
Since pdf-inspector seems to return only text, you could try using pdf-reader directly (pdf-inspector uses it anyway).
reader = PDF::Reader.new("somefile.pdf")
reader.pages.each do |page|
  puts page.raw_content # This should also give you the link
end
I only took a quick look at the GitHub page, so I am not sure exactly what raw_content returns. There is also a low-level method to access the objects of the PDF directly:
reader = PDF::Reader.new("somefile.pdf")
puts reader.objects.inspect
With that it should be possible to get the URL.
I am attempting to use the following code to save an image from a URL using Python:
image = urllib.URLopener()
image.retrieve("http://example.com/image.jpg","image.jpg")
The image saves as expected. I was wondering whether it is possible to set a User-Agent using the urllib method?
I don't think you can add custom headers while using urllib,
but I know there are multiple ways to do it using urllib2.
One way is like this:
import urllib2
headers = { 'User-Agent' : 'Mozilla/5.0' }
req = urllib2.Request('http://example.com/image.jpg', None, headers)
data = urllib2.urlopen(req).read()
# 'wb' writes the raw bytes and creates the file if it does not exist
with open('download.jpg', 'wb') as f:
    f.write(data)
This downloads the image to download.jpg, creating the file if it does not already exist.
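On Python 3, where urllib2 was folded into urllib.request, the same idea might look like the sketch below. The function names build_request and download are made up for illustration, and the URL and User-Agent string are just the placeholders used above.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, path, user_agent="Mozilla/5.0"):
    """Fetch url with the given User-Agent and save the raw bytes to path."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req) as resp:
        data = resp.read()
    # 'wb' writes the bytes unchanged and creates the file if needed.
    with open(path, "wb") as f:
        f.write(data)
```

Calling download("http://example.com/image.jpg", "image.jpg") would then save the image under that name, sending the custom header with the request.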
There are more ways to do it; I would take a look at "Setting the User-Agent".
Also take a look at this question.
Good Luck!
I am trying to use Net::HTTP's get to download an image of a Google chart from a URL I created.
This was my first attempt:
failures_url = [title, type, data, size, colors, labels].join("&")
require 'net/http'
Net::HTTP.start("http://chart.googleapis.com") { |http|
  resp = http.get("/chart?#{failures_url")
  open("pie.png" ,"wb") { |file|
    file.write(resp.body)
  }
}
Which produced only an empty PNG file.
For my second attempt I pasted the value stored in failures_url directly into the http.get() call.
require 'net/http'
Net::HTTP.start("http://chart.googleapis.com") { |http|
  resp = http.get("/chart?chtt=Builds+in+the+last+12+months&cht=bvg&chd=t:296,1058,1217,1615,1200,611,2055,1663,1746,1950,2044,2781,1553&chs=800x375&chco=4466AA&chxl=0:|Jul-2010|Aug-2010|Sep-2010|Oct-2010|Nov-2010|Dec-2010|Jan-2011|Feb-2011|Mar-2011|Apr-2011|May-2011|Jun-2011|Jul-2011|2:|Months|3:|Builds&chxt=x,y,x,y&chg=0,6.6666666666666666666666666666667,5,5,0,0&chxp=3,50|2,50&chbh=23,5,30&chxr=1,0,3000&chds=0,3000")
  open("pie.png" ,"wb") { |file|
    file.write(resp.body)
  }
}
And, for some reason, this version works even though the first attempt had the same data inside the http.get() call. Does anyone know why this is?
SOLUTION:
After trying to figure out why this is happening, I found "How do I download a binary file over HTTP?".
One of the comments mentions removing http:// from the Net::HTTP.start(...) call, otherwise it won't succeed. Sure enough, after I did this:
failures_url = [title, type, data, size, colors, labels].join("&")
require 'net/http'
Net::HTTP.start("chart.googleapis.com") { |http|
  resp = http.get("/chart?#{failures_url}")
  open("pie.png" ,"wb") { |file|
    file.write(resp.body)
  }
}
it worked.
I'd go after the file using Ruby's OpenURI:
require "open-uri"
File.open('pie.png', 'wb') do |fo|
  # URI.open is the open-uri entry point; plain open no longer accepts URLs in Ruby 3
  fo.write URI.open("http://chart.googleapis.com/chart?#{failures_url}").read
end
The reason I prefer OpenURI is that it handles redirects automatically, so WHEN Google makes a change to their back-end and tries to redirect the URL, the code will handle it magically. It also handles timeouts and retries more gracefully, if I remember right.
If you must have lower-level control then I'd look at one of the many other HTTP clients for Ruby; Net::HTTP is fine for creating new services or when a client doesn't exist, but I'd use OpenURI or something besides Net::HTTP until the need presents itself.
The URL:
http://chart.googleapis.com/chart?chtt=Builds+in+the+last+12+months&cht=bvg&chd=t:296,1058,1217,1615,1200,611,2055,1663,1746,1950,2044,2781,1553&chs=800x375&chco=4466AA&chxl=0:|Jul-2010|Aug-2010|Sep-2010|Oct-2010|Nov-2010|Dec-2010|Jan-2011|Feb-2011|Mar-2011|Apr-2011|May-2011|Jun-2011|Jul-2011|2:|Months|3:|Builds&chxt=x,y,x,y&chg=0,6.6666666666666666666666666666667,5,5,0,0&chxp=3,50|2,50&chbh=23,5,30&chxr=1,0,3000&chds=0,3000
makes URI upset. I suspect it is seeing characters that should be percent-encoded in URLs, most likely the | characters.
For documentation purposes, here is what URI says when trying to parse that URL as-is:
URI::InvalidURIError: bad URI(is not URI?)
If I encode the URI first, I get a successful parse. Testing further using OpenURI shows it is able to retrieve the document at that point and returns 23701 bytes.
I think encoding is the appropriate fix for the problem if some of those characters are truly not acceptable to URI and fall outside the RFC.
Just for information, Addressable::URI from the addressable gem is a great replacement for the built-in URI.
resp = http.get("/chart?#{failures_url")
If you copied your original code then you're missing a closing curly bracket in your path string.
Your original version did not have the parameter name for each parameter, just the data. For example, on the title, you cannot just submit "Builds+in+the+last+12+months", but instead it must be "chtt=Builds+in+the+last+12+months".
Try this:
failures_url = ["chtt="+title, "cht="+type, "chd="+data, "chs="+size, "chco="+colors, "chxl="+labels].join("&")
I am trying to do something quite simple using Sinatra and RMagick:
Take an image through a simple form file upload
Use RMagick to resize it
Then store it in a database for persistence (irrelevant)
But after going through the RDocs and endless head-banging testing, I can't seem to get the form image into an RMagick object cleanly.
This is the horrible thing that is currently working for me:
def image_resize(img_data)
  filecount = rand
  # Binary mode and write (not puts) keep the image bytes intact
  writer = File.new("/tmp/#{filecount}.jpg", "wb")
  writer.write(img_data)
  writer.close
  resized_image = Magick::ImageList.new("/tmp/#{filecount}.jpg").first
  resized_image.crop_resized!(100, 100, Magick::NorthGravity)
  resized_image.format = 'jpeg'
  resized_image.to_blob
end
#call the method with my form image data
image_resize(params[:image][:tempfile].read)
So how do I do the obvious right thing and just stick my form image data straight into an RMagick object without having to write to and read from the disk?
I have tried various ways of reading in Magick::Image and ImageLists but have only got an abundance of errors barfed at me.
Thanks for any kind of direction
You need to get the path from the tempfile and pass that to Magick::Image’s read.
Here’s an example:
post "/upload-photo" do
  image = Magick::Image.read(params[:image][:tempfile].path)[0]
  image.crop_resized! 100, 100, Magick::CenterGravity
  store_image_data image.to_blob
  redirect "/done"
end
Or you can read straight from the upload, such as a Rails ActionDispatch::Http::UploadedFile, like so (from_blob returns an array of images, so take the first):
image = Magick::Image.from_blob(params[:image].read).first