URL encoding issues with curly braces - ruby

I'm having issues getting data from GitHub Archive.
The main issue is encoding the {} and .. in my URL. Maybe I am misreading the GitHub API or not understanding encoding correctly.
require 'open-uri'
require 'faraday'

conn = Faraday.new(:url => 'http://data.githubarchive.org/') do |faraday|
  faraday.request :url_encoded             # form-encode POST params
  faraday.response :logger                 # log requests to STDOUT
  faraday.adapter Faraday.default_adapter  # make requests with Net::HTTP
end

#query = '2015-01-01-15.json.gz' # this one works!!
query = '2015-01-01-{0..23}.json.gz' # this one doesn't work
encoded_query = URI.encode(query)
response = conn.get(encoded_query)
p response.body

The GitHub Archive example for retrieving a range of files is:
wget http://data.githubarchive.org/2015-01-01-{0..23}.json.gz
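The {0..23} part is expanded by the shell (bash brace expansion) into the range 0 .. 23 before wget runs, so wget receives one URL per value. You can see the effect by executing the command with the -v flag, which shows a separate request for each file: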
wget -v http://data.githubarchive.org/2015-01-01-{0..1}.json.gz
--2015-06-11 13:31:07-- http://data.githubarchive.org/2015-01-01-0.json.gz
Resolving data.githubarchive.org... 74.125.25.128, 2607:f8b0:400e:c03::80
Connecting to data.githubarchive.org|74.125.25.128|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2615399 (2.5M) [application/x-gzip]
Saving to: '2015-01-01-0.json.gz'
2015-01-01-0.json.gz 100%[===========================================================================================================================================>] 2.49M 3.03MB/s in 0.8s
2015-06-11 13:31:09 (3.03 MB/s) - '2015-01-01-0.json.gz' saved [2615399/2615399]
--2015-06-11 13:31:09-- http://data.githubarchive.org/2015-01-01-1.json.gz
Reusing existing connection to data.githubarchive.org:80.
HTTP request sent, awaiting response... 200 OK
Length: 2535599 (2.4M) [application/x-gzip]
Saving to: '2015-01-01-1.json.gz'
2015-01-01-1.json.gz 100%[===========================================================================================================================================>] 2.42M 867KB/s in 2.9s
2015-06-11 13:31:11 (867 KB/s) - '2015-01-01-1.json.gz' saved [2535599/2535599]
FINISHED --2015-06-11 13:31:11--
Total wall clock time: 4.3s
Downloaded: 2 files, 4.9M in 3.7s (1.33 MB/s)
In other words, the shell substitutes each value into the URL and wget then fetches the resulting URLs. This isn't obvious behavior if you haven't met brace expansion before, but you can find mention of it "out there". For instance in "All the Wget Commands You Should Know":
7. Download a list of sequentially numbered files from a server
wget http://example.com/images/{1..20}.jpg
To do what you want, you need to iterate over the range in Ruby using something like this untested code:
0.upto(23) do |i|
  response = conn.get("/2015-01-01-#{i}.json.gz")
  p response.body
end

To get a better idea of what's going wrong, let's start with the example given in the GitHub documentation:
wget http://data.githubarchive.org/2015-01-01-{0..23}.json.gz
The thing to note here is that {0..23} is automagically getting expanded by bash. You can see this by running the following command:
echo {0..23}
> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
This means wget never sees the braces: it is called once with 24 separate URLs as arguments and downloads each in turn. The problem you're having is that Ruby doesn't automagically expand {0..23} like bash does, so you're making a literal request for http://data.githubarchive.org/2015-01-01-{0..23}.json.gz, which doesn't exist.
Instead you will need to loop through 0..23 yourself and make one request per file. The file names contain only URL-safe characters, so no extra encoding is needed (URI.encode was removed in Ruby 3.0):
(0..23).each do |n|
  query = "2015-01-01-#{n}.json.gz"
  response = conn.get(query)
  p response.body
end
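As a rough illustration of what the shell does before wget ever sees the URL, here is a minimal Ruby sketch of brace expansion for a single numeric {a..b} range. The helper name expand_numeric_braces is hypothetical (it is not part of Ruby, Faraday, or any library), and it deliberately handles only one numeric range per string.

```ruby
# Hypothetical helper: mimics bash brace expansion for ONE numeric {a..b}
# range. Strings without such a range are returned unchanged.
def expand_numeric_braces(pattern)
  match = pattern.match(/\{(\d+)\.\.(\d+)\}/)
  return [pattern] unless match

  first = match[1].to_i
  last = match[2].to_i
  # Rebuild the string once per value, keeping the text around the braces.
  (first..last).map { |n| "#{match.pre_match}#{n}#{match.post_match}" }
end

paths = expand_numeric_braces('2015-01-01-{0..23}.json.gz')
puts paths.length
puts paths.first
puts paths.last
```

Each resulting path could then be passed to conn.get in turn, in the same way as the loops above.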

Related

Inefficient memory usage when using object.get

I thought that downloading an S3 object into a file would write it in chunks, to avoid loading the whole file into memory.
But apparently, this is not the case, this is my code:
puts("Memory (before file loaded): #{((`ps -o rss= -p #{Process.pid}`.to_i) / 1024.0).round(2)} MB")
my_s3_object.get(response_target: file_path)
puts("Memory (after file loaded): #{((`ps -o rss= -p #{Process.pid}`.to_i) / 1024.0).round(2)} MB")
Output:
Memory (before file loaded): 191.08 MB
Memory (after file loaded): 259.41 MB
Here my_s3_object is a 130 MB zip archive. OK, so it's not fully loaded into memory, but about half of it is.
Is there a way to reduce memory usage by passing some parameters to the get method? If not, how else can I do it?
I think you are looking for ranged requests, which are a general HTTP feature (the Range header) and are supported by the AWS SDK.
The documentation provides the following examples, which should allow you to download parts of the object, write them, discard the bytes from memory and read the next bytes until the whole file is downloaded. In the end memory usage will depend on the range of bytes you download with every request.
Example: To retrieve a byte range of an object
# The following example retrieves an object for an S3 bucket.
# The request specifies the range header to retrieve a specific
# byte range.
resp = client.get_object({
  bucket: "examplebucket",
  key: "SampleFile.txt",
  range: "bytes=0-9",
})
resp.to_h outputs the following:
{
  accept_ranges: "bytes",
  content_length: 10,
  content_range: "bytes 0-9/43",
  content_type: "text/plain",
  etag: "\"0d94420ffd0bc68cd3d152506b97a9cc\"",
  last_modified: Time.parse("Thu, 09 Oct 2014 22:57:28 GMT"),
  metadata: {
  },
  version_id: "null",
}
Streaming data to a block
# WARNING: yielding data to a block disables retries of networking errors
# However truncation of the body will be retried automatically using a range request
File.open('/path/to/file', 'wb') do |file|
  s3.get_object(bucket: 'bucket-name', key: 'object-key') do |chunk, headers|
    # headers['content-length']
    file.write(chunk)
  end
end
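To make the ranged-request idea concrete, the sketch below computes the Range header values needed to walk through an object in fixed-size pieces. byte_ranges is a hypothetical helper, not part of the AWS SDK; the commented-out loop at the end shows how it might be combined with get_object and is untested against a real bucket.

```ruby
# Hypothetical helper (not part of the AWS SDK): returns the Range header
# values needed to read total_size bytes in pieces of at most chunk_size.
def byte_ranges(total_size, chunk_size)
  raise ArgumentError, 'chunk_size must be positive' unless chunk_size.positive?

  (0...total_size).step(chunk_size).map do |start|
    finish = [start + chunk_size, total_size].min - 1 # HTTP ranges are inclusive
    "bytes=#{start}-#{finish}"
  end
end

# The 43-byte object from the example above, read 10 bytes at a time.
puts byte_ranges(43, 10)

# Assumed usage with a real client (untested sketch; bucket, key and the
# 5 MB chunk size are placeholders):
#   size = client.head_object(bucket: 'examplebucket', key: 'SampleFile.txt').content_length
#   File.open('/path/to/file', 'wb') do |file|
#     byte_ranges(size, 5 * 1024 * 1024).each do |range|
#       resp = client.get_object(bucket: 'examplebucket', key: 'SampleFile.txt', range: range)
#       file.write(resp.body.read)
#     end
#   end
```

With this approach, peak memory should depend mostly on the chunk size chosen rather than on the size of the object.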

issues with executing cablint-ct part of certlint tool

I am attempting to run the tool certlint, specifically the module called cablint-ct, but when I try I get errors. There are three modules, certlint, cablint and cablint-ct, all of which work except for cablint-ct.
Here is the command I am running in ruby:
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "cert.der"
Here is the error I receive:
/certlint-master/lib/certlint/ct.rb:149:in `initialize': undefined method `+' for nil:NilClass (NoMethodError)
from /certlint-master/bin/cablint-ct:39:in `new'
from /certlint-master/bin/cablint-ct:39:in `<main>'
Here is the referring block of code in ct.rb (line 39):
30 def initialize(tbs_der)
31   asn = OpenSSL::ASN1.decode(tbs_der)
32   # tbsCertificate.version is optional, so we don't have a fixed
33   # offset. Check if the first item is a pure ASN1Data, which
34   # is a strong hint that it is an EXPLICIT wrapper for the first
35   # element in the struct. If so, this is the version, so everything
36   # is offset by one.
37   skip = asn.value[0].instance_of?(OpenSSL::ASN1::ASN1Data) ? 1 : 0
38   sig_alg_der = asn.value[1 + skip].to_der
39   @raw = OpenSSL::ASN1::Sequence.new([tbs_der, sig_alg_der, DER_SIG]).to_der
40   super(@raw)
41 end
42 end
and ct.rb (line 149)
148 def initialize(log)
149   @log = URI.parse(log + '/').normalize
150 end
I've opened issue #37 on GitHub with the owner of the tool but have not seen a response yet.
I would be grateful if someone could tell me whether I am doing something wrong with my command or whether there is a coding issue somewhere.
UPDATE 1
I have figured out that I need to pass a URL to the command rather than a cert file. For example:
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "https://ct.ws.symantec.com/ct/v1/get-entries?start=932966&end=932966"
I believe the code expects a JSON response from this URL, and this link returns a file with JSON data. However, I get the following error:
/usr/share/ruby/json/common.rb:155:in `initialize': A JSON text must at least contain two octets! (JSON::ParserError)
from /usr/share/ruby/json/common.rb:155:in `new'
from /usr/share/ruby/json/common.rb:155:in `parse'
from /certlint-master/lib/certlint/ct.rb:184:in `_call'
from /certlint-master/lib/certlint/ct.rb:160:in `get_entries'
from /certlint-master/bin/cablint-ct:40:in `<main>'
Any ideas?
Resolved: the command expects a known CT log URL and an index ID, e.g.
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "https://ct.ws.symantec.com" 173977
or
ruby -I "/certlint-master/lib" "/certlint-master/bin/cablint-ct" "symantec" 173977

How to refer tag value in yaml

server:
  - import:
      cmd: GET GPRS <gprsEn> <gprsVa> <gprsSt>
  - update:
      gprsEn: 1
      gprsVa: 202
      gprsSt: reegan
This is my YAML file. How can I refer to the gprsEn, gprsVa and gprsSt values inside the cmd string? The output I need is:
GET GPRS 1 202 reegan
There is no string substitution defined anywhere in the YAML specification, so you have to do this yourself e.g. by doing:
from ruamel.yaml import YAML

yaml_str = """\
server:
  - import:
      cmd: GET GPRS <gprsEn> <gprsVa> <gprsSt>
  - update:
      gprsEn: 1
      gprsVa: 202
      gprsSt: reegan
"""

yaml = YAML()  # round-trip loader by default
data = yaml.load(yaml_str)
# turn the <name> placeholders into {name} so str.format can fill them in
cmd = data['server'][0]['import']['cmd'].replace('<', '{').replace('>', '}')
keywords = data['server'][1]['update']
print(cmd.format(**keywords))
which prints exactly the output you want:
GET GPRS 1 202 reegan
You can of course also extend the parser, but it would still need to jump through hoops to specify the source of the keyword/value expansion, which in your case is not an obvious one (i.e. it is not some top-level mapping).
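For readers working in Ruby, roughly the same substitution can be sketched with the standard yaml library. This is a minimal sketch and not part of the answer above: the method name expand_cmd is hypothetical, and it assumes the same layout as the question's file (the import entry first, the update entry second).

```ruby
require 'yaml'

YAML_STR = <<~DOC
  server:
    - import:
        cmd: GET GPRS <gprsEn> <gprsVa> <gprsSt>
    - update:
        gprsEn: 1
        gprsVa: 202
        gprsSt: reegan
DOC

# Hypothetical helper: fills the <name> placeholders in the cmd string with
# the values found under the update mapping of the same document.
def expand_cmd(yaml_text)
  data = YAML.safe_load(yaml_text)
  cmd = data['server'][0]['import']['cmd']
  keywords = data['server'][1]['update']
  # Replace each <name> placeholder with the value stored under that key;
  # fetch raises KeyError if a placeholder has no matching key.
  cmd.gsub(/<(\w+)>/) { keywords.fetch(Regexp.last_match(1)).to_s }
end

puts expand_cmd(YAML_STR)
```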

why do I get "200 Type set to I. (Net::FTPReplyError)"

Note that I have both blocks of code (see below) in the same .rb file. The first call to ftp.getbinaryfile() works; after that it throws the error.
Note that the file variable is a static path to the file, used for debugging purposes only.
I have this code in ruby 2.0.0p481 (2014-05-08) [x64-mingw32]
file = "/Filetrack/E-mail_Gateway/_Installer/GA/E-mail Gateway_10.0_Changes_PUBLIC.pdf"
list = ftp.list('*')
list.each { |item|
  counter += 1
  ftp.getbinaryfile(file, where_to_save + File.basename(file) + counter.to_s, 1024)
  puts "downloaded - .each used"
}
then in the same .rb file I got this code
ftp.list('*') { |item|
  puts "downloading using .list('*') {"
  counter += 1
  ftp.getbinaryfile(file, where_to_save + File.basename(file) + counter.to_s, 1024)
  puts "downloaded #{file}"
}
and that code throws me this error
Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:974:in `parse227': 200 Type set to I. (Net::FTPReplyError)
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:394:in `makepasv'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:406:in `transfercmd'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:487:in `block (2 levels) in retrbinary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:199:in `with_binary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:485:in `block in retrbinary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:484:in `retrbinary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:617:in `getbinaryfile'
ftp session is created by
ftp = Net::FTP.new('ftp.***.***.net')
ftp.passive = false
ftp.debug_mode = true
ftp.login(ftp_username, ftp_password)
Could someone explain why the first version works and the second one does not?
UPDATE
Added ftp debugging log:
put: USER r.***
get: 331 Password required for r.***.
put: PASS ************
get: 230-Welcome to FTP
get: 230 User r.****logged in.
put: TYPE I
get: 200 Type set to I.
put: CWD /Filetrack/E-mail_Gateway/_Installer/GA/010_000_003_000/
get: 250 CWD command successful.
put: TYPE A
get: 200 Type set to A.
put: PASV
get: 227 Entering Passive Mode (194,212,10,23,195,92).
put: LIST *
get: 125 Data connection already open; Transfer starting.
get: 226 Transfer complete.
put: TYPE I
get: 200 Type set to I.
put: PASV
get: 227 Entering Passive Mode (194,212,10,23,195,93).
put: RETR /Filetrack/E-mail_Gateway/_Installer/GA/010_000_003_000/E-mail Gateway_10.0_Changes_PUBLIC.pdf
get: 125 Data connection already open; Transfer starting.
get: 226 Transfer complete.
downloaded - .each used
put: TYPE A
get: 200 Type set to A.
put: PASV
get: 227 Entering Passive Mode (***,***,10,23,195,97).
put: LIST *
get: 125 Data connection already open; Transfer starting.
downloading using .list('*') {
put: TYPE I
get: 226 Transfer complete.
put: PASV
get: 200 Type set to I.
put: TYPE A
get: 227 Entering Passive Mode (***,***,10,23,195,98).
put: TYPE I
get: 200 Type set to A.
d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:974:in `parse227': 200 Type set to I. (Net::FTPReplyError)
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:394:in `makepasv'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:406:in `transfercmd'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:487:in `block (2 levels) in retrbinary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:199:in `with_binary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:485:in `block in retrbinary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:484:in `retrbinary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:617:in `getbinaryfile'
from download2 - debugging.rb:41:in `block in <main>'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:518:in `block (3 levels) in retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:515:in `loop'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:515:in `block (2 levels) in retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:199:in `with_binary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:512:in `block in retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:511:in `retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:760:in `list'
from download2 - debugging.rb:38:in `<main>'
UPDATE2
log if ftp.passive = false is used
downloading using .list('*') {
put: TYPE I
get: 226 Transfer complete.
put: PORT ***,***,20,102,235,136
get: 200 Type set to I.
put: RETR /Filetrack/E-mail_Gateway/_Installer/GA/010_000_003_000/Email Gateway_10.0_Changes_PUBLIC.pdf
get: 200 PORT command successful.
put: TYPE A
put: TYPE I
d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:211:in `write': An existing connection was forcibly closed by the remote host. (Errno::ECONNRESET)
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:211:in `write0'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:185:in `block in write'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:202:in `writing'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:184:in `write'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:283:in `putline'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:360:in `block in voidcmd'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:359:in `voidcmd'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:183:in `send_type_command'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:172:in `binary='
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:201:in `ensure in with_binary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:201:in `with_binary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:512:in `block in retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:511:in `retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:760:in `list'
from download2 - debugging.rb:39:in `<main>'
UPDATE3
I tried to run the same code in ftp active mode few times and actually all files are downloaded but the script finishes with an error.
downloading using .list('*') {
put: TYPE I
get: 200 Type set to I.
put: PORT **,**,20,102,197,73
get: 200 PORT command successful.
put: RETR /Filetrack/E-mail_Gateway/_Installer/GA/010_000_003_000/E-mail Gateway_10.0_Changes_PUBLIC.pdf
get: 150 Opening BINARY mode data connection for /Filetrack/E-mail_Gateway/_Installer/GA/010_000_003_000/E-mail Gateway_10.0_Changes_PUBLIC.pdf(
60911 bytes).
get: 226 Transfer complete.
put: TYPE A
get: 200 Type set to A.
downloaded /Filetrack/E-mail_Gateway/_Installer/GA/010_000_003_000/E-mail Gateway_10.0_Changes_PUBLIC.pdf
put: TYPE I
get: 200 Type set to I.
d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:158:in `rescue in rbuf_fill': Net::ReadTimeout (Net::ReadTimeout)
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:152:in `rbuf_fill'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/protocol.rb:134:in `readuntil'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:1108:in `readline'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:289:in `getline'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:300:in `getmultiline'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:318:in `getresp'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:338:in `voidresp'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:526:in `block (2 levels) in retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:199:in `with_binary'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:512:in `block in retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:511:in `retrlines'
from d:/prog/Ruby200-x64/lib/ruby/2.0.0/net/ftp.rb:760:in `list'
from download2 - debugging.rb:39:in `<main>'
Reason for `200 Type set to I. (Net::FTPReplyError)`
Your connection uses passive mode. Since you have not shown the code that creates the FTP object, I will assume that you are setting the mode to passive explicitly:
ftp = Net::FTP.new('example.com')
ftp.passive = true
Based on the stack trace of the exception, the issue occurs when the method makepasv issues the PASV command: instead of getting a response such as 227 Entering Passive Mode (194,212,10,23,195,93)., it gets the response 200 Type set to I.
The implementation of makepasv and parse227 (see Reference 1 and Reference 2 later in this post) specifically checks for a reply code of 227; if the reply has any other code, it raises FTPReplyError.
This is what is happening in the given scenario.
Why does `parse227` receive wrong response?
This can be attributed to the syntax shown below.
This behavior may not be well understood (I discovered it myself while answering this post).
ftp.list('*') { |item|
  ftp.getbinaryfile(file, where_to_save+File.basename(file)+counter.to_s, 1024)
}
In the above code, a LIST * command is issued by ftp.list('*'). A typical response to this command looks like this:
put: LIST *
get: 125 Data connection already open; Transfer starting.
get: 226 Transfer complete.
As can be seen, LIST * produces two reply lines. This fact is crucial to understanding the issue.
The block passed to ftp.list('*') downloads a binary file using getbinaryfile method.
getbinaryfile will typically issue the commands below:
TYPE I to put the connection in image (binary) mode
PASV to enter passive mode
RETR /path/of/file/to/download
When the block executes for the first result of ftp.list('*') and starts issuing the commands related to getbinaryfile, only the first reply line of LIST * has been read; the second line is yet to be read. It is this second line that shows up as the response to the next command issued in the block.
Hence, when the first command, TYPE I, is issued, the code reads the second line of LIST * as its response (as is evident in the debug logs):
put: TYPE I
get: 226 Transfer complete.
When second command PASV is issued, the code reads the response of TYPE I (as evident from debug logs)
put: PASV
get: 200 Type set to I.
makepasv expects the reply to have code 227 (see line 394 in Reference 1 and line 973 in Reference 2). Net::FTPReplyError gets raised in this case because parse227 was passed the reply to the TYPE I command.
In summary, when using passive mode, it does not seem feasible to perform other FTP operations inside the block given to `ftp.list('*')`.
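The off-by-one reading of replies can be reproduced without a server. The sketch below is a toy model only: FakeControlConnection and its methods are hypothetical names, not part of Net::FTP, and the model simply assumes that every command reads the oldest unread reply line.

```ruby
# Toy model of an FTP control connection (hypothetical, not Net::FTP):
# replies queue up in order and each command reads the oldest unread one.
class FakeControlConnection
  def initialize
    @unread = []
  end

  # Server side: a reply line arrives on the control connection.
  def push_reply(line)
    @unread << line
  end

  # Client side: send a command; the server answers with server_reply,
  # but the client reads whatever reply line is first in the queue.
  def send_command(_command, server_reply)
    push_reply(server_reply)
    @unread.shift
  end
end

conn = FakeControlConnection.new
# LIST * produced "125 ..." and "226 ...". The block starts running after
# only "125 ..." was read, so "226 ..." is still unread at this point.
conn.push_reply('226 Transfer complete.')

type_reply = conn.send_command('TYPE I', '200 Type set to I.')
pasv_reply = conn.send_command('PASV', '227 Entering Passive Mode (194,212,10,23,195,93).')

puts "TYPE I -> #{type_reply}"
puts "PASV   -> #{pasv_reply}"
```

In this model PASV is answered with the reply that belonged to TYPE I, which is the same pairing the debug log shows just before parse227 raises.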
Why does `ftp.list('*').each` work?
In this case, the ftp.list('*') is invoked without a block, and hence it returns the Array of strings as output. Using each on that array does not create the similar situation - and hence, there are no issues observed.
Solution
It seems that author(s) of FTP#list expected the below two variants to work in equivalent manner:
ftp.list('*') { |f| } # block given to list
ftp.list('*').each { |f| } # block given to enum returned by list
As per official documentation of list API:
list(*args) { |line| ... }
Returns an array of file information in the
directory (the output is like ls -l). If a block is given, it
iterates through the listing.
If we look at the implementation of list, we see that when a block is given, each line read from the LIST * response is yielded to the block one by one. When using passive mode, if the block tries to execute any other FTP command, this causes the problem described above.
754 def list(*args, &block) # :yield: line
755   cmd = "LIST"
756   args.each do |arg|
757     cmd = cmd + " " + arg.to_s
758   end
759   if block
760     retrlines(cmd, &block)
761   else
762     lines = []
763     retrlines(cmd) do |line|
764       lines << line
765     end
766     return lines
767   end
768 end
We can solve this problem by making the implementation equivalent to the ftp.list('*').each variant: first collect all lines of the LIST * response into an array, and then yield each element of that array to the block if a block was given. This still stays true to the API documentation.
def list(*args, &block) # :yield: line
  cmd = "LIST"
  args.each do |arg|
    cmd = cmd + " " + arg.to_s
  end
  # First, fetch all the lines
  lines = []
  retrlines(cmd) do |line|
    lines << line
  end
  if block
    lines.each { |l| yield l }
  else
    return lines
  end
end
I have reported a bug in Ruby Bug Tracker suggesting above change in implementation of FTP#list method.
Reference 1 - Implementation of makepasv
391 # sends the appropriate command to enable a passive connection
392 def makepasv # :nodoc:
393   if @sock.peeraddr[0] == "AF_INET"
394     host, port = parse227(sendcmd("PASV"))
395   else
396     host, port = parse229(sendcmd("EPSV"))
397     # host, port = parse228(sendcmd("LPSV"))
398   end
399   return host, port
400 end
Reference 2 - Implementation of parse227
968 # handler for response code 227
969 # (Entering Passive Mode (h1,h2,h3,h4,p1,p2))
970 #
971 # Returns host and port.
972 def parse227(resp) # :nodoc:
973   if resp[0, 3] != "227"
974     raise FTPReplyError, resp
975   end
976   if m = /\((?<host>\d+(,\d+){3}),(?<port>\d+,\d+)\)/.match(resp)
977     return parse_pasv_ipv4_host(m["host"]), parse_pasv_port(m["port"])
978   else
979     raise FTPProtoError, resp
980   end
981 end
Source code snippets were taken from ftp.rb.
UPDATE: 13 Sep, 2015
The proposed change has been accepted by Ruby core team for this issue.
I've solved it by disabling the "This server is sitting behind a router/firewall" option inside Server Advanced.

File upload with Mechanize fails

I am trying to automate filling out a form (which includes a file upload) using Mechanize. I have gone through the process in the GUI interface and the file uploads fine, so I know the file isn't corrupt, but when I run my mechanize script it fails. The script executes correctly, and according to the debug it uploaded the file, but Canvas (the service I'm uploading to) says the file could not be read. I have contacted Canvas support but they are unable to help since it's a non-standard use of their system.
Here is the script (which has been anonymized):
mech = Mechanize.new
mech.log = Logger.new(STDOUT)
mech.user_agent_alias = 'Mac Mozilla'
mech.get("https://ucdenver.test.instructure.com") do |page|
  page.form_with(:action => "/login") do |f|
    user_field = f.field_with(:name => "pseudonym_session[unique_id]")
    user_field.value = user
    pwd_field = f.field_with(:name => "pseudonym_session[password]")
    pwd_field.value = pwd
  end.submit
end

mech.get("https://ucdenver.test.instructure.com/accounts/1") do |page|
  page.form_with(:action => "/accounts/1/courses") do |f|
    course_field = f.field_with(:name => "course[name]")
    course_field.value = "38492"
  end.submit
end

mech.page.link_with(:href => %r/settings/,
                    :class => "settings").click

mech.page.link_with(:href => %r/import/,
                    :text => %r/Import Content/).click

mech.page.link_with(:href => %r/imports\/migrate/,
                    :text => %r/Import content from a content package/).click

mech.page.form_with(:action => %r/imports\/migrate/) do |f|
  export_system = "migration_settings[migration_type]"
  f[export_system] = "blackboard_exporter"
  f.file_uploads.first.mime_type = "application/zip"
  f.file_uploads.first.file_name = bkb_export
end.submit
Canvas support said the server is returning:
Could not unzip archive file, exit status 9, message:
[/mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg] End-of-central-
directory signature not found. Either this file is not a zipfile, or it constitutes one
disk of a multi-part archive. In the latter case the central directory and zipfile comment
will be found on the last disk(s) of this archive. unzip: cannot find zipfile directory in
one of /mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg or
/mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg.zip, and cannot
find /mnt/var/web/migration_tool/data/attachment_420130214-13303-sv6ue5-0.jpg.ZIP, period.
Unfortunately that error makes no sense to me, but it seems odd that it references a couple of .jpg's when the script is uploading a .zip file. Any ideas or assistance would be greatly appreciated, and I'm happy to provide additional information if I've left anything useful out.
If you're really feeling like a go getter you can sign up for a free account at http://canvas.instructure.com and see the code/network activity for yourself.
Make sure you are uploading the file with the correct content type. As far as I know, ruby-mechanize specifies
Content-Type: application/octet-stream
for every type of file. Chances are the server checks the Content-Type and it does not match the type it requires. I don't remember the exact version of mechanize that I used, but in my case the value was hardcoded in the mechanize gem, so you may have to go through the mechanize source code and fix the place where it hardcodes the content type for multipart/form-data file elements.
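Since the server's message is about a missing end-of-central-directory signature, one quick local sanity check is to confirm that the file being uploaded actually contains that signature. The sketch below is only a debugging aid under stated assumptions: zip_eocd_present? is a hypothetical helper, it only looks for the four signature bytes near the end of the file, and it says nothing about what mechanize actually sends over the wire.

```ruby
require 'tempfile'

# Signature that starts the ZIP end-of-central-directory (EOCD) record.
EOCD_SIGNATURE = "PK\x05\x06".b
# The EOCD record is 22 bytes plus a comment of at most 65,535 bytes,
# so it must start within this many bytes of the end of the file.
MAX_EOCD_DISTANCE = 22 + 65_535

# Hypothetical helper: true if the tail of the file contains the signature.
def zip_eocd_present?(path)
  File.open(path, 'rb') do |f|
    tail_length = [f.size, MAX_EOCD_DISTANCE].min
    f.seek(-tail_length, IO::SEEK_END)
    tail = f.read(tail_length) || ''.b
    !tail.rindex(EOCD_SIGNATURE).nil?
  end
end

# Demo with a throwaway file that is clearly not a zip archive.
Tempfile.create('not-a-zip') do |tmp|
  tmp.binmode
  tmp.write('plain text, no zip structure')
  tmp.flush
  puts zip_eocd_present?(tmp.path)
end
```

If the local file passes this check but the server still reports the error, that would point toward the upload step (content type or how the file is read) rather than the archive itself.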
