I am trying to automate filling out a form (which includes a file upload) using Mechanize. I have gone through the process in the GUI interface and the file uploads fine, so I know the file isn't corrupt, but when I run my mechanize script it fails. The script executes correctly, and according to the debug it uploaded the file, but Canvas (the service I'm uploading to) says the file could not be read. I have contacted Canvas support but they are unable to help since it's a non-standard use of their system.
Here is the script (which has been anonymized):
mech = Mechanize.new
mech.log = Logger.new(STDOUT)
mech.user_agent_alias = 'Mac Mozilla'
mech.get("https://ucdenver.test.instructure.com") do |page|
  page.form_with(:action => "/login") do |f|
    user_field = f.field_with(:name => "pseudonym_session[unique_id]")
    user_field.value = user
    pwd_field = f.field_with(:name => "pseudonym_session[password]")
    pwd_field.value = pwd
  end.submit
end

mech.get("https://ucdenver.test.instructure.com/accounts/1") do |page|
  page.form_with(:action => "/accounts/1/courses") do |f|
    course_field = f.field_with(:name => "course[name]")
    course_field.value = "38492"
  end.submit
end

mech.page.link_with(:href => %r/settings/,
                    :class => "settings").click

mech.page.link_with(:href => %r/import/,
                    :text => %r/Import Content/).click

mech.page.link_with(:href => %r/imports\/migrate/,
                    :text => %r/Import content from a content package/).click

mech.page.form_with(:action => %r/imports\/migrate/) do |f|
  export_system = "migration_settings[migration_type]"
  f[export_system] = "blackboard_exporter"
  f.file_uploads.first.mime_type = "application/zip"
  f.file_uploads.first.file_name = bkb_export
end.submit
Canvas support said the server is returning:
Could not unzip archive file, exit status 9, message:
[/mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg] End-of-central-
directory signature not found. Either this file is not a zipfile, or it constitutes one
disk of a multi-part archive. In the latter case the central directory and zipfile comment
will be found on the last disk(s) of this archive. unzip: cannot find zipfile directory in
one of /mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg or
/mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg.zip, and cannot
find /mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg.ZIP, period.
Unfortunately that error makes no sense to me, but it seems odd that it references .jpg files when the script is uploading a .zip file. Any ideas or assistance would be greatly appreciated, and I'm happy to provide additional information if I've left anything useful out.
If you're really feeling like a go-getter you can sign up for a free account at http://canvas.instructure.com and see the code/network activity for yourself.
Make sure you are uploading the file with the correct content type. As far as I know, ruby-mechanize sends
Content-Type: application/octet-stream
for every type of file, so chances are the server is checking the Content-Type and it does not match the type it requires. I don't remember the exact version of mechanize that I used, but for me the value was hardcoded in the mechanize gem, so you have to go through the mechanize source code and fix the place where it hardcodes the content type for multipart/form-data file elements.
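Separately from the Content-Type header, it may help to confirm locally that the file the script points at really ends in a zip "end of central directory" record, since that is the signature the server-side unzip says it cannot find. The following is only a diagnostic sketch using the standard library; looks_like_zip? is a made-up helper name, and it says nothing about what Mechanize actually puts on the wire.

```ruby
require 'tempfile'

# Signature that starts the zip "end of central directory" (EOCD) record.
EOCD_SIGNATURE = "PK\x05\x06".b

# Hypothetical helper: true if an EOCD signature appears in the tail of the file.
# The record is at least 22 bytes long and may be followed by a comment of up
# to 65,535 bytes, so only that many trailing bytes need to be scanned.
def looks_like_zip?(path)
  File.open(path, 'rb') do |f|
    tail_len = [f.size, 22 + 65_535].min
    f.seek(-tail_len, IO::SEEK_END)
    tail = f.read(tail_len)
    !tail.rindex(EOCD_SIGNATURE).nil?
  end
end
```

Running looks_like_zip?(bkb_export) before the submit would at least rule out a wrong path or a truncated export on the client side.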
I am attempting to run the tool certlint, specifically the module called cablint-ct but when we try we get errors. There are three modules, certlint, cablint and cablint-ct -- all of which work except for cablint-ct.
Here is the command I am running in ruby:
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "cert.der"
Here is the error I receive:
/certlint-master/lib/certlint/ct.rb:149:in `initialize': undefined method `+' for nil:NilClass (NoMethodError)
from /certlint-master/bin/cablint-ct:39:in `new'
from /certlint-master/bin/cablint-ct:39:in `<main>'
Here is the referring block of code in ct.rb (line 39):
30   def initialize(tbs_der)
31     asn = OpenSSL::ASN1.decode(tbs_der)
32     # tbsCertificate.version is optional, so we don't have a fixed
33     # offset. Check if the first item is a pure ASN1Data, which
34     # is a strong hint that it is an EXPLICIT wrapper for the first
35     # element in the struct. If so, this is the version, so everything
36     # is offset by one.
37     skip = asn.value[0].instance_of?(OpenSSL::ASN1::ASN1Data) ? 1 : 0
38     sig_alg_der = asn.value[1 + skip].to_der
39     @raw = OpenSSL::ASN1::Sequence.new([tbs_der, sig_alg_der, DER_SIG]).to_der
40     super(@raw)
41   end
42 end
and ct.rb (line 149)
148 def initialize(log)
149   @log = URI.parse(log + '/').normalize
150 end
I've opened issue #37 on GitHub with the owner of the tool but have not seen a response yet.
I would be grateful if someone could tell me whether I am doing something wrong with my command or whether there is a coding issue somewhere.
UPDATE 1
I have figured out that I need to pass a URL to the command rather than a cert file. For example:
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "https://ct.ws.symantec.com/ct/v1/get-entries?start=932966&end=932966"
I believe the code expects a JSON response from this URL, and this link does return a file with JSON data. However, I get the following error:
/usr/share/ruby/json/common.rb:155:in `initialize': A JSON text must at least contain two octets! (JSON::ParserError)
from /usr/share/ruby/json/common.rb:155:in `new'
from /usr/share/ruby/json/common.rb:155:in `parse'
from /certlint-master/lib/certlint/ct.rb:184:in `_call'
from /certlint-master/lib/certlint/ct.rb:160:in `get_entries'
from /certlint-master/bin/cablint-ct:40:in `<main>'
Any ideas?
Resolved: the command expects a known CT log URL (or name) and an index ID, e.g.
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "https://ct.ws.symantec.com" 173977
or
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "symantec" 173977
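To make the two-argument shape concrete, here is a rough sketch of how a log URL and an index can be turned into the kind of get-entries URL shown in UPDATE 1. get_entries_uri is a hypothetical helper written for illustration, not certlint's own code, and I have not checked how cablint-ct builds its requests internally. It also fails early with a readable message instead of the `+' for nil error when the log argument is missing.

```ruby
require 'uri'

# Hypothetical helper (not part of certlint): build a get-entries URL for a
# single log entry from a CT log base URL and an index.
def get_entries_uri(log_url, index)
  raise ArgumentError, 'log URL is required' if log_url.nil? || log_url.empty?

  idx = Integer(index) # rejects nil and non-numeric input
  # Same normalisation idea as ct.rb line 149: make sure the base ends in "/".
  base = URI.parse(log_url.chomp('/') + '/').normalize
  uri = URI.join(base.to_s, 'ct/v1/get-entries')
  uri.query = URI.encode_www_form({ 'start' => idx, 'end' => idx })
  uri
end

puts get_entries_uri('https://ct.ws.symantec.com', '173977')
```

The string form of the index is accepted because command-line arguments arrive as strings.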
I'm having issues getting data from GitHub Archive.
The main issue is my problem with encoding {} and .. in my URL. Maybe I am misreading the Github API or not understanding encoding correctly.
require 'open-uri'
require 'faraday'
conn = Faraday.new(:url => 'http://data.githubarchive.org/') do |faraday|
  faraday.request :url_encoded # form-encode POST params
  faraday.response :logger # log requests to STDOUT
  faraday.adapter Faraday.default_adapter # make requests with Net::HTTP
end
#query = '2015-01-01-15.json.gz' #this one works!!
query = '2015-01-01-{0..23}.json.gz' #this one doesn't work
encoded_query = URI.encode(query)
response = conn.get(encoded_query)
p response.body
The GitHub Archive example for retrieving a range of files is:
wget http://data.githubarchive.org/2015-01-01-{0..23}.json.gz
The {0..23} part is not sent to the server literally; it is expanded into the range 0 .. 23 by the shell (bash brace expansion) before wget runs. You can see the result by executing the command with the -v flag, which returns:
wget -v http://data.githubarchive.org/2015-01-01-{0..1}.json.gz
--2015-06-11 13:31:07-- http://data.githubarchive.org/2015-01-01-0.json.gz
Resolving data.githubarchive.org... 74.125.25.128, 2607:f8b0:400e:c03::80
Connecting to data.githubarchive.org|74.125.25.128|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2615399 (2.5M) [application/x-gzip]
Saving to: '2015-01-01-0.json.gz'
2015-01-01-0.json.gz 100%[===========================================================================================================================================>] 2.49M 3.03MB/s in 0.8s
2015-06-11 13:31:09 (3.03 MB/s) - '2015-01-01-0.json.gz' saved [2615399/2615399]
--2015-06-11 13:31:09-- http://data.githubarchive.org/2015-01-01-1.json.gz
Reusing existing connection to data.githubarchive.org:80.
HTTP request sent, awaiting response... 200 OK
Length: 2535599 (2.4M) [application/x-gzip]
Saving to: '2015-01-01-1.json.gz'
2015-01-01-1.json.gz 100%[===========================================================================================================================================>] 2.42M 867KB/s in 2.9s
2015-06-11 13:31:11 (867 KB/s) - '2015-01-01-1.json.gz' saved [2535599/2535599]
FINISHED --2015-06-11 13:31:11--
Total wall clock time: 4.3s
Downloaded: 2 files, 4.9M in 3.7s (1.33 MB/s)
In other words, the values are substituted into the URL and wget then fetches each resulting URL. The substitution is shell brace expansion rather than a wget feature, which isn't obvious from the example, but you can find mention of the pattern "out there". For instance in "All the Wget Commands You Should Know":
7. Download a list of sequentially numbered files from a server
wget http://example.com/images/{1..20}.jpg
To do what you want, you need to iterate over the range in Ruby using something like this untested code:
0.upto(23) do |i|
  response = conn.get("/2015-01-01-#{ i }.json.gz")
  p response.body
end
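If it helps to see the expansion on its own, separate from the HTTP calls, this small sketch builds the same list of URLs that the brace expression produces. hourly_archive_urls and ARCHIVE_BASE are names made up for this example; only the URL pattern comes from the GitHub Archive example above.

```ruby
# Base URL taken from the wget example; the constant name is illustrative.
ARCHIVE_BASE = 'http://data.githubarchive.org'

# Hypothetical helper: one URL per hour, like {0..23} does in the shell.
def hourly_archive_urls(date, hours = 0..23)
  hours.map { |hour| "#{ARCHIVE_BASE}/#{date}-#{hour}.json.gz" }
end

hourly_archive_urls('2015-01-01', 0..1).each { |url| puts url }
```

Each element can then be passed to conn.get (or any other HTTP client) one at a time.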
To get a better idea of what's going wrong, let's start with the example given in the GitHub documentation:
wget http://data.githubarchive.org/2015-01-01-{0..23}.json.gz
The thing to note here is that {0..23} is automagically getting expanded by bash. You can see this by running the following command:
echo {0..23}
> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
This means wget is invoked once but receives 24 separate URLs as arguments. The problem you're having is that Ruby doesn't automagically expand {0..23} like bash does, so you're making a literal request for http://data.githubarchive.org/2015-01-01-{0..23}.json.gz, which doesn't exist.
Instead you will need to loop through 0..23 yourself and make a single call every time:
(0..23).each do |n|
  # The file name contains only URL-safe characters, so no encoding step is
  # needed (URI.encode was removed in Ruby 3.0).
  query = "2015-01-01-#{n}.json.gz"
  response = conn.get(query)
  p response.body
end
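Note that p response.body will print compressed bytes, since each file is a .json.gz. A minimal sketch of decoding one body with the standard library follows; it assumes each archive holds one JSON object per line, which is my understanding of the GitHub Archive format rather than something shown in this thread, and parse_archive_body is a hypothetical helper name.

```ruby
require 'zlib'
require 'stringio'
require 'json'

# Hypothetical helper: gunzip a response body and parse it as
# newline-delimited JSON (assumed format: one event object per line).
def parse_archive_body(gz_bytes)
  events = []
  Zlib::GzipReader.new(StringIO.new(gz_bytes)).each_line do |line|
    line = line.strip
    events << JSON.parse(line) unless line.empty?
  end
  events
end
```

Inside the loop this would be used as parse_archive_body(response.body) in place of printing the raw body.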
I'm building out a Yahoo Finance API and keep hitting a brick wall when trying to use open-uri.
Code:
uri = ("http://ichart.finance.yahoo.com/table.csv?s=#{URI.escape(code)}&a=#{start_month}&b=#{start_day}&c=#{start_year}&d=#{end_month}&e=#{end_day}&f=#{end_year}&g=d&ignore=.csv")
puts "#{uri}"
conn = open(uri)
Error:
`split': bad URI(is not URI?): http://ichart.finance.yahoo.com/table.csv?s=%255EIXIC&a=00&b=1&c=1994&d=09&e=14&f=2014&g=d&ignore=.csv} (URI::InvalidURIError)
I have tried URI.unescape(code) which outputs code as ^IXIC, as well as leaving any URI methods out and code will come through as %5EIXIC.
After reading around on Stack Overflow, I've tried both of these methods to no avail:
uri = URI.parse(URI.encode(url.strip))
safeurl = URI.encode(url.strip)
Even after looking through the code for another ruby yahoo-finance gem, here, I can't seem to find a solution. Any help is greatly appreciated. Thanks
EDIT: I am able to use open(uri) when I manually enter in the url in single quotes. Do double quotes, (used for inserting ruby objects), play a role here?
Don't try to inject variables into URLs. If they contain characters that need to be encoded per the spec, they won't be by interpolation. Instead, take advantage of the right tools for the job, like Ruby's URI class or the Addressable::URI gem.
See "How to post a URL containing curly braces and colons" for how to do this using well tested wheels.
In your situation, something like this will work:
require 'uri'
code = 'qwer3456*&^%'
start_month = 1
start_day = 1
start_year = 2014
end_month = 12
end_day = 31
end_year = 2015
uri = URI.parse("http://ichart.finance.yahoo.com/table.csv")
uri.query = URI.encode_www_form(
  {
    'g' => 'd',
    'ignore' => '.csv',
    's' => code,
    'a' => start_month,
    'b' => start_day,
    'c' => start_year,
    'd' => end_month,
    'e' => end_day,
    'f' => end_year
  }
)
uri.to_s # => "http://ichart.finance.yahoo.com/table.csv?g=d&ignore=.csv&s=qwer3456*%26%5E%25&a=1&b=1&c=2014&d=12&e=31&f=2015"
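One more detail from the question: the URL in the error message contains s=%255EIXIC, which is what you get when an already-encoded value (%5EIXIC) is encoded a second time, because the % itself becomes %25. The sketch below only illustrates that effect with the standard library; the constant names are made up for the example. The error message also ends in a } that the posted snippet does not contain, which may be worth checking.

```ruby
require 'uri'

TICKER = '^IXIC'

# Encoding the raw ticker once gives the form the server expects.
ENCODED_ONCE = URI.encode_www_form_component(TICKER)        # "%5EIXIC"

# Encoding the already-encoded string again also escapes the "%" sign.
ENCODED_TWICE = URI.encode_www_form_component(ENCODED_ONCE) # "%255EIXIC"

puts ENCODED_ONCE
puts ENCODED_TWICE
```

So if code already arrives as %5EIXIC, either decode it first or pass the raw ^IXIC and let URI.encode_www_form do the encoding exactly once.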
The code works for me though I don't think the API endpoint is correct:
[1] pry(main)> uri = URI("http://ichart.finance.yahoo.com/table.csv?s=%255EIXIC&a=00&b=1&c=1994&d=09&e=14&f=2014&g=d&ignore=.csv")
=> #<URI::HTTP:0x007fd63a2fff40 URL:http://ichart.finance.yahoo.com/table.csv?s=%255EIXIC&a=00&b=1&c=1994&d=09&e=14&f=2014&g=d&ignore=.csv>
[3] pry(main)> Net::HTTP.get(uri)
=> "<!doctype html public \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">\n<html><head><title>Yahoo! - 404 Not Found</title><style>\n/* nn4 hide */ \n/*/*/\nbody {font:small/1.2em arial,helvetica,clean,sans-serif;font:x-small;text-align:center;}table {font-size:inherit;font:x-small;}\nhtml>body {font:83%/1.2em arial,helvetica,clean,sans-serif;}input {font-size:100%;vertical-align:middle;}p, form {margin:0;padding:0;}\np {padding-bottom:6px;margin-bottom:10px;}#doc {width:48.5em;margin:0 auto;border:1px solid #fff;text-align:center;}#ygma {text-align:right;margin-bottom:53px}\n#ygma img {float:left;}#ygma div {border-bottom:1px solid #ccc;padding-bottom:8px;margin-left:152px;}#bd {clear:both;text-align:left;width:75%;margin:0 auto 20px;}\nh1 {font-size:135%;text-align:center;margin:0 0 15px;}legend {display:none;}fieldset {border:0 solid #fff;padding:.8em 0 .8em 4.5em;}\nform {position:relative;background:#eee;margin-bottom:15px;border:1px solid #ccc;border-width:1px 0;}\n#s1p {width:15em;margin-right:.1em;}\nform span {position:absolute;left:70%;top:.8em;}form a {font:78%/1.2em arial;display:block;padding-left:.8em;white-space:nowrap;background: url(http://l.yimg.com/a/i/s/bullet.gif) no-repeat left center;} \nform .sep {display:none;}.more {text-align:center;}#ft {padding-top:10px;border-top:1px solid #999;}#ft p {text-align:center;font:78% arial;}\n/* end nn4 hide */\n</style></head>\n<body><div id=\"doc\">\n<div id=\"ygma\"><img\nsrc=http://l.yimg.com/a/i/yahoo.gif\nwidth=147 height=31 border=0 alt=\"Yahoo!\"><div><a\nhref=\"http://us.rd.yahoo.com/404/*http://www.yahoo.com\">Yahoo!</a>\n - Help</div></div>\n<div id=\"bd\"><h1>Sorry, the page you requested was not found.</h1>\n<p>Please check the URL for proper spelling and capitalization. If\nyou're having trouble locating a destination on Yahoo!, try visiting the\n<strong><a\nhref=\"http://us.rd.yahoo.com/404/*http://www.yahoo.com\">Yahoo! 
home\npage</a></strong> or look through a list of <strong><a\nhref=\"http://us.rd.yahoo.com/404/*http://docs.yahoo.com/docs/family/more/\">Yahoo!'s\nonline services</a></strong>. Also, you may find what you're looking for\nif you try searching below.</p>\n<form name=\"s1\" action=\"http://us.rd.yahoo.com/404/*-http://search.yahoo.com/search\"><fieldset>\n<legend><label for=\"s1p\">Search the Web</label></legend>\n<input type=\"text\" size=30 name=\"p\" id=\"s1p\" title=\"enter search terms here\">\n<input type=\"submit\" value=\"Search\">\n<span>advanced search <span class=sep>|</span> most popular</span>\n</fieldset></form>\n<p class=\"more\">Please try <strong><a\nhref=\"http://us.rd.yahoo.com/404/*http://help.yahoo.com\">Yahoo!\nHelp Central</a></strong> if you need more assistance.</p>\n</div><div id=\"ft\"><p>Copyright © 2014 Yahoo! Inc.\nAll rights reserved. <a\nhref=\"http://us.rd.yahoo.com/404/*http://privacy.yahoo.com\">Privacy\nPolicy</a> - <a\nhref=\"http://us.rd.yahoo.com/404/*http://docs.yahoo.com/info/terms/\">Terms\nof Service</a></p></div>\n</div></body></html>\n"
Looks like your problem is the ignore=.csv part.
I mean this is probably trying to encode it as a domain extension. Probably you should remove the dot to solve the problem.
I'm using the Axlsx gem and it's working great, but I need to add an image to a cell.
I know it can be done with an image file (see Adding image to Excel file generated by Axlsx.?), but I'm having a lot of trouble using our images stored in S3 (through Carrierwave).
Things I've tried:
# image.url = 'http://.../test.jpg'
ws.add_image(:image_src => image.url,:noSelect => true, :noMove => true) do |image|
# ArgumentError: File does not exist
or
ws.add_image(:image_src => image,:noSelect => true, :noMove => true) do |image|
# Invalid Data #<Object ...>
Not sure how to proceed
Try using read to pull the contents into a tempfile and use that location:
t = Tempfile.new('my_image')
t.binmode
t.write image.read
t.close
ws.add_image(:image_src => t.path, ...
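A slightly fuller version of the same idea, as a sketch: image_tempfile is a hypothetical helper, and the '.jpg' default is an assumption. It returns the Tempfile object rather than just the path, because Ruby may delete a tempfile once the object is garbage-collected, so the caller should hold on to it until the spreadsheet has been written.

```ruby
require 'tempfile'

# Hypothetical helper: write raw image bytes to a tempfile that carries a
# real extension, and return the Tempfile so the caller controls its lifetime.
def image_tempfile(bytes, extension = '.jpg')
  file = Tempfile.new(['axlsx_image', extension])
  file.binmode          # image data must not go through newline translation
  file.write(bytes)
  file.flush            # make sure the bytes are on disk before the path is used
  file
end
```

Usage would look like tmp = image_tempfile(image.read), then :image_src => tmp.path, and tmp.close! after the workbook is serialized.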
Here is an alternative answer for Paperclip & S3, as I couldn't find a reference for that combination besides this answer.
I'm using Rails 5.0.2 and Paperclip 4.3.1.
With image URLs like: http://s3.amazonaws.com/prod/accounts/logos/000/000/001/original/logo.jpg?87879987987987
@logo = @account.company_logo
if @logo.present?
  @logo_image = Tempfile.new(['', ".#{@logo.url.split('.').last.split('?').first}"])
  @logo_image.binmode # note that our tempfile must be in binary mode
  @logo_image.write open(@logo.url).read
  @logo_image.rewind
end
In the .xlsx file
sheet.add_image(image_src: @logo_image.path, noSelect: true, noMove: true, hyperlink: "#") do |image|...
Reference link: http://mensfeld.pl/tag/tempfile/ for more reading.
The .split('.').last.split('?').first chain extracts jpg from logo.jpg?87879987987987; the leading dot is added by the surrounding string.
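As an alternative to the split chain, the extension can be read from the URL's path with the standard library, which ignores the query string for you. This is just a sketch; extension_from_url is a made-up helper name.

```ruby
require 'uri'

# Hypothetical helper: return the file extension (including the dot) from a
# URL, ignoring any query string such as Paperclip's cache-busting timestamp.
def extension_from_url(url)
  File.extname(URI.parse(url).path)
end

puts extension_from_url('http://s3.amazonaws.com/prod/accounts/logos/000/000/001/original/logo.jpg?87879987987987')
```

The result already includes the dot, so it can be passed straight in as the second element of the Tempfile.new name array.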
I am trying to implement the youtube_it youtube api wrapper for ruby and have it working except I'm stumped as to how the query results should be accessed.
Here is my query:
client.videos_by(:query => "penguin", :max_results => 1)
Submitting request [url=http://gdata.youtube.com/feeds/api/videos?max-results=1&start-index=1&vq=penguin].
=> #<YouTubeIt::Response::VideoSearch:0xb6c41b14 @feed_id="http://gdata.youtube.com/feeds/api/videos", @updated_at=Wed Nov 03 18:01:39 UTC 2010, @videos=[#<YouTubeIt::Model::Video:0xb6c424d8 @thumbnails=[#<YouTubeIt::Model::Thumbnail:0xb6c6b694 @url="http://i.ytimg.com/vi/oSbLpQEZP1Y/2.jpg", @width=120, @height=90, @time="00:01:34">, #<YouTubeIt::Model::Thumbnail:0xb6c6b248 @url="http://i.ytimg.com/vi/oSbLpQEZP1Y/1.jpg", @width=120, @height=90, @time="00:00:47">, #<YouTubeIt::Model::Thumbnail:0xb6c6a988 @url="http://i.ytimg.com/vi/oSbLpQEZP1Y/3.jpg", @width=120, @height=90, @time="00:02:21">, #<YouTubeIt::Model::Thumbnail:0xb6c69e34 @url="http://i.ytimg.com/vi/oSbLpQEZP1Y/0.jpg", @width=320, @height=240, @time="00:01:34">], @categories=[#<YouTubeIt::Model::Category:0xb6ca5d6c @term="Music", @label="Music">], @noembed=false, @racy=false, @favorite_count=7862, @duration=188, @author=#<YouTubeIt::Model::Author:0xb6c9942c @name="wili", @uri="http://gdata.youtube.com/feeds/api/users/wili">, @updated_at=Tue Nov 02 08:45:25 UTC 2010, @longitude=nil, @position=nil, @view_count=1682350, @html_content="penguin", @media_content=[#<YouTubeIt::Model::Content:0xb6c770d4 @url="http://www.youtube.com/v/oSbLpQEZP1Y?f=videos&app=youtube_gdata", @duration=188, @format=#<YouTubeIt::Model::Video::Format:0xb656d108 @name=:swf, @format_code=5>, @default=true, @mime_type="application/x-shockwave-flash">, #<YouTubeIt::Model::Content:0xb6c766d4 @url="rtsp://v5.cache3.c.youtube.com/CiILENy73wIaGQlWPxkBpcsmoRMYDSANFEgGUgZ2aWRlb3MM/0/0/0/video.3gp", @duration=188, @format=#<YouTubeIt::Model::Video::Format:0xb656d11c @name=:rtsp, @format_code=1>, @default=false, @mime_type="video/3gpp">, #<YouTubeIt::Model::Content:0xb6c75d38 @url="rtsp://v8.cache3.c.youtube.com/CiILENy73wIaGQlWPxkBpcsmoRMYESARFEgGUgZ2aWRlb3MM/0/0/0/video.3gp", @duration=188, @format=#<YouTubeIt::Model::Video::Format:0xb656d0f4 @name=:three_gpp, @format_code=6>, @default=false, @mime_type="video/3gpp">], @description="penguin",
@latitude=nil, @title="penguin", @published_at=Mon May 08 18:11:01 UTC 2006, @player_url="http://www.youtube.com/watch?v=oSbLpQEZP1Y&feature=youtube_gdata_player", @rating=#<YouTubeIt::Model::Rating:0xb6c5eb4c @min=1, @max=5, @average=4.676985, @rater_count=2746>, @keywords=["pigloo", "penguin"], @video_id="http://gdata.youtube.com/feeds/api/videos/oSbLpQEZP1Y", @where=nil>], @total_result_count=291282, @offset=1, @max_result_count=1>
I would like to retrieve the URL and thumbnail links. Any ideas?
I don't have a great deal of knowledge of this particular gem, but the answer should at least be close to this. You can reach the results through the videos accessor on the response, which gives you the video objects; each video has thumbnails, and each thumbnail has a url. So you could do the following:
reply = client.videos_by(:query => "penguin", :max_results => 1)
reply.videos.first.thumbnails.first.url # the thumbnail for the first video
reply.videos.first.player_url # The website for the video
reply.videos.first.media_content.first.url # direct embed url
It might be useful to search for some ruby beginners guides to help catch you up to speed as well. Good luck!
william's answer is correct. When you call
response = client.videos_by(:query => "penguin", :max_results => 1)
the response exposes an array called videos, so you just need to iterate over it:
response.videos.each do |video|
  video.title
  video.thumbnails
  video.video_id
end
Good luck!
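To show the traversal without needing the gem installed, here is a sketch that uses plain Structs as stand-ins for the response, video and thumbnail objects. The Struct definitions and video_links are hypothetical and only mirror the accessor names visible in the inspect output above (videos, title, player_url, thumbnails, url); with the real gem you would pass the object returned by client.videos_by instead.

```ruby
# Stand-ins for the gem's objects, limited to the accessors used below.
Thumbnail = Struct.new(:url, :width, :height)
Video     = Struct.new(:title, :player_url, :thumbnails)
Response  = Struct.new(:videos)

# Hypothetical helper: collect the watch URL and thumbnail URLs per video.
def video_links(response)
  response.videos.map do |video|
    {
      title: video.title,
      url: video.player_url,
      thumbnails: video.thumbnails.map(&:url)
    }
  end
end

sample = Response.new([
  Video.new(
    'penguin',
    'http://www.youtube.com/watch?v=oSbLpQEZP1Y&feature=youtube_gdata_player',
    [Thumbnail.new('http://i.ytimg.com/vi/oSbLpQEZP1Y/2.jpg', 120, 90)]
  )
])

video_links(sample).each { |links| puts links[:url], links[:thumbnails] }
```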