How to use open-uri or paperclip to download images into a database and feed them to a REST API

I am working on a data integration app that needs to fetch images from one API (the image URLs come in its XML responses) and post the images to a Rails-built REST API.
I tried Paperclip to download all the images, but I don't know how to handle the Paperclip::Attachment type when trying to post the images with HTTMultiParty.
I am thinking about using open-uri instead of Paperclip, which would store the file as binary data. Can anyone give me an example of that? And is there any good option for posting an image to an API apart from HTTMultiParty?

I'd better answer this question myself, because the solution can vary.
Fetching images and feeding them through an API can be done with httparty (download and upload text) + paperclip (download an image by URL) + httmultiparty (upload the image). Here are some code examples I use in my application.
To me, httparty is the easiest way to deal with an API. The code can be as simple as this:
response = HTTParty.get('url')
response = HTTParty.post('url',
  # HTTParty expects :headers to be a Hash of header names to values
  :headers => { 'Header-Name' => 'header value' },
  :body => { 'data': 'data content' })
A code example for Paperclip is here: answer on Stack Overflow
The important part is reading the Paperclip attachment as binary data:
Paperclip.io_adapters.for(productData[0].image).read
The last example is HTTMultiParty. When you pass a query with an instance of a File as a value for a PUT or POST request, the wrapper will use a bit of magic and multipart-post to execute a multipart upload. Apart from that, it is pretty much the same as httparty:
class ImgClient
  include HTTMultiParty
  base_uri 'http://localhost:3000'
end

response = ImgClient.post('url',
  :headers => head,
  :query => {
    :image => Paperclip.io_adapters.for(product.image)
  })
Hope this will be helpful for other api newbies.
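For the open-uri route the question asks about, a minimal standard-library sketch might look like the following. It assumes the image URL is publicly readable and that the target API accepts a multipart field named image (a hypothetical name). It uses Net::HTTP in place of HTTMultiParty, and fetch_image, build_image_upload and send_request are illustrative helper names, not part of any gem.

```ruby
require 'net/http'
require 'open-uri'
require 'stringio'
require 'uri'

# Download an image into memory as a binary String.
# URI.open needs Ruby 2.5 or newer.
def fetch_image(url)
  URI.open(url, 'rb') { |io| io.read }
end

# Build (but do not send) a multipart POST that carries the image bytes.
# The field name 'image' is an assumption about what the API expects.
def build_image_upload(endpoint, image_bytes, filename)
  request = Net::HTTP::Post.new(URI(endpoint))
  request.set_form(
    [['image', StringIO.new(image_bytes),
      { filename: filename, content_type: 'application/octet-stream' }]],
    'multipart/form-data'
  )
  request
end

# Send a prepared request to the host named in its URI.
def send_request(request)
  uri = request.uri
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
end

# Usage (URLs are placeholders):
# bytes = fetch_image('http://example.com/picture.jpg')
# response = send_request(
#   build_image_upload('http://localhost:3000/products', bytes, 'picture.jpg')
# )
```

Because the bytes stay in a String, they can also be stored in a binary database column before being posted.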

Related

Accept file upload (without a form) in Sinatra

I have this Sinatra::Base code:
class Crush < Sinatra::Base
  post '/upload' do
    erb params.inspect
  end
end
I am using Postman and its interface for uploading a file. So I send a POST request with form-data, where in the body of the request the name is hello and the value is a file test.txt which contains just a simple string hey there.
When I do params.inspect I get this long string
{"------WebKitFormBoundaryocOEEr26iZGSe75n\r\nContent-Disposition: form-data; name"=>"\"hello\"; filename=\"test.txt\"\r\nContent-Type: text/plain\r\n\r\nhey there\r\n------WebKitFormBoundaryocOEEr26iZGSe75n--\r\n"}
So basically a long hash with a single key and a single value. Reading most Sinatra tutorials (where the file is accepted from a form), there's a nice way Sinatra handles this using params[:file], but this doesn't seem to be the case when the file is coming straight from the body of an HTTP request.
I tried a non-modular approach too, without Sinatra::Base, thinking some parsing middleware was missing, but got the same result.
Is there something I'm missing here? Must I go and make my own custom parser to get the content of this long hash? Or is there an easier way?
I figured out it's a Postman issue. When I switch from 'x-www-form-urlencoded' to 'form-data' in Postman, the Content-Type => application/x-www-form-urlencoded field in the Headers section is NOT removed. So for those who encounter this problem, make sure you remove it manually.
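To see why the stale header produces that one-key hash, here is a small sketch using only Ruby's standard library. It is an illustration, not the parser Sinatra actually uses: it feeds the multipart body from the question to URI.decode_www_form, which splits pairs on "&" and each pair on its first "=", much like a urlencoded form parser would.

```ruby
require 'uri'

# Raw multipart body, copied from the params.inspect output in the question.
body = "------WebKitFormBoundaryocOEEr26iZGSe75n\r\n" \
       "Content-Disposition: form-data; name=\"hello\"; filename=\"test.txt\"\r\n" \
       "Content-Type: text/plain\r\n\r\n" \
       "hey there\r\n" \
       "------WebKitFormBoundaryocOEEr26iZGSe75n--\r\n"

# Treat the body as if it were application/x-www-form-urlencoded.
# There is no "&" in it, so the whole body becomes a single pair,
# split at the first "=" (the one in name="hello").
pairs = URI.decode_www_form(body)

puts pairs.length         # number of key/value pairs found
puts pairs[0][0].inspect  # the "key": everything before the first "="
```

Once the request really is sent as multipart/form-data (with its boundary), the upload should show up under params[:hello], with the content available through params[:hello][:tempfile].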

Send an HTML string in a multipart POST call using the rest-client Ruby gem

I am using the RestClient Ruby gem to make a multipart call like the one below, and it works fine:
RestClient.post 'my_url', {:myfile => File.new("/path/to/image.jpg", 'rb'), :multipart => true}
In a second case I have an HTML string like this: "<html> <title>sometitle</title></html>"
I want to post this HTML as a file in the above call. Is that possible?
One option is to create a new file with the above HTML content and use that, but I don't want to do this. I am new to Ruby. Can anyone help?

Using Ruby Mechanize to download file served as attachment

I need the ability to grab reports off of a particular website. The method below does everything I need it to do; the only catch is that the report, "report.csv", is served back with "content-disposition:filename=report.csv" in the response header when the page is posted (the page posts to itself).
def download_report
  page = @mechanize.click(@mechanize.current_page().link_with(:text => /Reporting/))
  page.form.field_with(:name => "rep").option_with(:value => "adperf").click
  page.form_with(:name => "get-report").field_with(:id => "sasReportingQuery.dateRange").option_with(:value => "Custom").click
  start_date = DateTime.parse(@start_date)
  end_date = DateTime.parse(@end_date)
  page.form_with(:name => "get-report").field_with(:name => "sd_display").value = start_date.strftime("%m/%d/%Y")
  page.form_with(:name => "get-report").field_with(:name => "ed_display").value = end_date.strftime("%m/%d/%Y")
  page.form_with(:name => "get-report").submit
end
As far as I can tell, Mechanize is not capturing the file anywhere that I can get to it. Is there a way to get Mechanize to capture and download this file?
@mechanize.current_page() does not contain the file and @mechanize.history() does not show that the file url was presented to Mechanize.
The server appears to be telling the browser to save the document. "Content-disposition:filename" is the clue to that. Mechanize won't know what to do with that, and will try to read and parse the content, which, if it's a CSV, will not work.
Without seeing the HTML page you're working with it's impossible to know exactly what mechanism they're using to trigger the download. Clicking an element could fire a JavaScript event, which Mechanize won't handle. Or, it could send a form to the server, which responds with the document download. In either case, you have to figure out what is being sent, why, and what specifically defines the document you want, then use that information to request the document.
Mechanize isn't the right tool to download an attachment. Use Mechanize to navigate forms, then use Mechanize's embedded Nokogiri to extract the URL for the document.
Then use something like curb or Ruby's built-in OpenURI to retrieve the attachment, or see "Using WWW:Mechanize to download a file to disk without loading it all in memory first" for more information.
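As a rough sketch of the OpenURI route, assuming the report can be fetched with a plain GET once its URL is known (the question's report comes back from a form POST, so this may not apply directly): filename_from_disposition and save_attachment are hypothetical helper names, and the cookie string is whatever session cookie the site issued.

```ruby
require 'open-uri'

# Pull the file name out of a Content-Disposition header value, e.g.
# 'filename=report.csv' or 'attachment; filename="report.csv"'.
def filename_from_disposition(value, fallback = 'download.bin')
  match = value.to_s.match(/filename="?([^";]+)"?/i)
  # File.basename drops any directory part the server might send.
  match ? File.basename(match[1].strip) : fallback
end

# Fetch url and save the body under dir, named after the header.
# String keys in the options hash are sent by open-uri as request headers.
def save_attachment(url, dir, cookie = nil)
  headers = cookie ? { 'Cookie' => cookie } : {}
  URI.open(url, 'rb', headers) do |io|
    name = filename_from_disposition(io.meta['content-disposition'])
    path = File.join(dir, name)
    File.binwrite(path, io.read)
    path
  end
end

# Usage (placeholder URL and cookie):
# save_attachment('http://example.com/report', '/tmp', 'JSESSIONID=abc123')
```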
Check the class of the returned page with page.class. If it is File, then you can just save it.
...
page = page.form_with(:name => "get-report").submit
page.class # File?
page.save('path/to/file')

RestClient multipart upload from IO

I am trying to upload data as multipart using RestClient like so:
response = RestClient.post(url, io, {
  :cookies => {
    'JSESSIONID' => @sessionid
  },
  :multipart => true,
  :content_type => 'multipart/form-data'
})
The io argument is a StringIO that contains my file, so it's from memory instead of from the disk.
The server (Tomcat servlet) is unable to read the multipart data, giving an error:
org.apache.commons.fileupload.FileUploadException: the request was rejected because no multipart boundary was found
So I believe that RestClient is not sending it in multipart format? Anyone see the problem? I am assuming the problem is on the Ruby (client) side, but I can post my servlet (Spring) code if anyone thinks it might be a server-side problem.
I also wonder what RestClient would use for the uploaded filename, since there isn't an actual file... Can you have a multipart request without a filename?
You can do this, it simply requires subclassing StringIO and adding a non-nil path method to it:
class MailIO < StringIO
  def path
    'message'
  end
end
I've just checked this, and the Mailgun api is pretty down with this.
After consulting with the author of the rest-client library (Archiloque), it seems that if this is possible, the API is not set up to handle it easily. Using the :multipart => true parameter will cause the IO to be treated like a file, and it looks for a non-nil #path on the IO, which for a StringIO is always nil.
If anyone needs this in the future, you'll need to consult with the library's mailing list (code@archiloque.net), as the author seems to think it is possible but perhaps not straightforward.
It CAN easily do streaming uploads from an IO as long as it's not multipart format, which is what I ended up settling for.
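Putting the answers together, a hedged sketch of an in-memory multipart upload might look like this. It rests on two assumptions about rest-client taken from the discussion above: that multipart encoding is chosen when the payload is a Hash containing :multipart => true, and that a value is treated as a file when it responds to both read and path. In the question's call the io was the payload itself and :multipart sat in the headers hash, which is a likely reason no boundary was generated. NamedStringIO, upload_from_memory and the :file field name are illustrative.

```ruby
require 'stringio'

# StringIO that also reports a file name, so multipart encoders
# that ask the object for #path get a usable value.
class NamedStringIO < StringIO
  def initialize(content, name)
    super(content)
    @name = name
  end

  def path
    @name
  end
end

# Post in-memory data as one multipart file field.
def upload_from_memory(url, data, session_id)
  require 'rest-client' # loaded lazily; needs the rest-client gem
  RestClient.post(
    url,
    # Payload is a Hash: the file-like object plus the multipart flag.
    { :file => NamedStringIO.new(data, 'upload.bin'), :multipart => true },
    # Headers hash: only the session cookie, no hand-written Content-Type.
    { :cookies => { 'JSESSIONID' => session_id } }
  )
end
```

Leaving out the manual :content_type lets the library write the header together with the boundary it generates.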

Manual POST request

Scenario: I have logged into a website, gained cookies etc, got to a particular webpage with a form + hidden fields. I now want to be able to create my own http post with my own hidden form data instead of what is on the webpage and verify the response instead of using the one on the webpage.
Reason: Testing against pre-existing data (I know, I know) which could be different on each environment hence no predictable way to use it. We need a workaround.
Is there any way to do this without manually editing the existing form and submitting that? Feels a little 'hacky'.
Ideally, I would like to say something like:
browser.post 'url', 'field1=test&field2=abc'
I would probably switch to Mechanize to muck around at the protocol level. Something like this, added to your script:
require 'mechanize'

b = Mechanize.new  # called WWW::Mechanize before Mechanize 1.0
b.get('http://yoursite.com/current_page') do |page|
  # Submit the login form
  my_form = page.form_with(:action => '/post/url') do |f|
    f.form_loginname = 'tim'
    f.form_pw = 'password'
  end.click_button
end
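If the goal is only to send hand-built form fields along with the existing session cookie, a plain Net::HTTP sketch could also work. This is a swapped-in alternative to Mechanize that uses only the standard library; the URL, field names and cookie string are placeholders, and build_form_post and submit are hypothetical helper names.

```ruby
require 'net/http'
require 'uri'

# Build a urlencoded POST with custom fields and an optional Cookie header.
def build_form_post(url, fields, cookie = nil)
  request = Net::HTTP::Post.new(URI(url))
  request.set_form_data(fields)            # encodes fields into the body
  request['Cookie'] = cookie if cookie     # reuse the logged-in session
  request
end

# Send the request and return the Net::HTTPResponse for verification.
def submit(request)
  uri = request.uri
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
end

# Usage (placeholder values):
# req = build_form_post('http://yoursite.com/post/url',
#                       { 'field1' => 'test', 'field2' => 'abc' },
#                       'session=xyz')
# response = submit(req)
# puts response.code
```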

Resources