I have a string representation of an MD5 hex digest for a file, which I want to convert to base64 in order to use it in the Content-MD5 HTTP header when uploading the file. Is there a clearer or more efficient way to do this than the following?
def hex_to_base64_digest(hexdigest)
[[hexdigest].pack("H*")].pack("m").strip
end
hex_digest = "65a8e27d8879283831b664bd8b7f0ad4"
expected_base64_digest = "ZajifYh5KDgxtmS9i38K1A=="
raise "Does not match" unless hex_to_base64_digest(hex_digest) == expected_base64_digest
Seems pretty clear and efficient to me. You can save the call to strip by specifying a count of 0 for the 'm' pack format (if the count is 0, no line feeds are added; see RFC 4648):
def hex_to_base64_digest(hexdigest)
[[hexdigest].pack("H*")].pack("m0")
end
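As a hedged alternative sketch: if the raw file contents are still at hand, Ruby's standard Digest::MD5 class can produce the base64 form directly with base64digest, which avoids the hex round trip. The helper hex_to_base64_digest is the one from the answer above; the sample data string is only an illustration and not part of the original post.

```ruby
require 'digest/md5'

# Same conversion as above: hex string -> raw bytes -> base64 without line feeds.
def hex_to_base64_digest(hexdigest)
  [[hexdigest].pack("H*")].pack("m0")
end

data = "sample file contents"   # illustrative input, not from the original post

# Route 1: convert an existing hex digest.
from_hex = hex_to_base64_digest(Digest::MD5.hexdigest(data))
# Route 2: ask Digest::MD5 for the base64 form directly.
direct = Digest::MD5.base64digest(data)

puts from_hex == direct
```

If only the hex string is available (for example, it was read from a manifest), the pack-based conversion remains the way to go.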
I may receive either of these two strings:
base = Base64.encode64(File.open("/home/usr/Desktop/test", "rb").read)
=> "YQo=\n"
string = File.open("/home/usr/Desktop/test", "rb").read
=> "a\n"
What I have tried so far is to check the string with a regular expression, i.e. /([A-Za-z0-9+\/]{4})*([A-Za-z0-9+\/]{4}|[A-Za-z0-9+\/]{3}=|[A-Za-z0-9+\/]{2}==$)/, but this would be very heavy if the file is big.
I have also tried base.encoding.name and string.encoding.name, but both return the same value.
I have also seen this post and got the regular expression solution, but is there any other solution?
Any ideas? I just want to find out whether the string is actually plain text or Base64-encoded text.
You can use something like this. It is not very performant, but it only accepts strings that round-trip exactly through Base64, so anything that is not valid Base64 is rejected:
require 'base64'
def base64?(value)
value.is_a?(String) && Base64.strict_encode64(Base64.decode64(value)) == value
end
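Using strict_encode64 rather than encode64 prevents Ruby from inserting newlines into the output for long strings. See this post for details. Note that this also means input that already contains newlines, such as the "YQo=\n" produced by encode64 in the question, will not match unless the newlines are removed first.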
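A minimal sketch of a variant that tolerates the line feeds encode64 adds, assuming it is acceptable to ignore "\r" and "\n" before validating. The name looks_like_base64? is hypothetical. It relies on Base64.strict_decode64 raising ArgumentError for invalid input. Keep in mind that short plain text can itself be valid Base64, so a true result only means "could be Base64".

```ruby
require 'base64'

# Hypothetical helper: true if value decodes as Base64 once line feeds are removed.
def looks_like_base64?(value)
  return false unless value.is_a?(String)
  compact = value.delete("\r\n")
  return false if compact.empty?
  Base64.strict_decode64(compact)   # raises ArgumentError on invalid input
  true
rescue ArgumentError
  false
end

puts looks_like_base64?("YQo=\n")  # output of Base64.encode64 from the question
puts looks_like_base64?("a\n")     # the raw file contents from the question
puts looks_like_base64?("abcd")    # plain text that is also valid Base64
```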
I would like to write a Ruby script (repl.rb) that replaces a string in a binary file (the string is defined by a regex) with a different string of the same length.
It works like a filter and writes to STDOUT, which can be redirected (ruby repl.rb data.bin > data2.bin); the regex and replacement can be hardcoded. My approach is:
#!/usr/bin/ruby
fn = ARGV[0]
regex = /\-\-[0-9a-z]{32,32}\-\-/
replacement = "--0ca2765b4fd186d6fc7c0ce385f0e9d9--"
blk_size = 1024
File.open(fn, "rb") { |f|
  until f.eof?
    data = f.read(blk_size)
    # Replace every match inside this block only.
    data.gsub!(regex, replacement)
    print data
  end
}
My problem is that the string may be positioned in the file in a way that interferes with the block size used for reading. For example, when blk_size=1024 and the first occurrence of the string begins at byte position 1000, it straddles two blocks, so I will not find it in the "data" variable, and the same happens on the next read cycle. Should I process the whole file twice with different block sizes to avoid this worst-case scenario, or is there another approach?
I would posit that a tool like sed might be a better choice for this. That said, here's an idea: Read block 1 and block 2 and join them into a single string, then perform the replacement on the combined string. Split them apart again and print block 1. Then read block 3 and join block 2 and 3 and perform the replacement as above. Split them again and print block 2. Repeat until the end of the file. I haven't tested it, but it ought to look something like this:
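File.open(fn, "rb") do |f|
  # Start with the first block (empty string if the file is empty).
  last_block = f.read(blk_size) || ""
  until f.eof?
    this_block = f.read(blk_size)
    # Replace on the joined blocks so a match straddling the boundary is found.
    data = (last_block + this_block).gsub(regex, replacement)
    # The replacement has the same length as the match, so the split point is unchanged.
    print data.slice!(0, last_block.size)
    last_block = data
  end
  # Covers files that fit in a single block; harmless otherwise.
  print last_block.gsub(regex, replacement)
end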
There's probably a nontrivial performance penalty for doing it this way, but it could be acceptable depending on your use case.
Maybe a cheeky
f.pos = f.pos - replacement.size
at the end of the while loop, just before reading the next chunk.
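The overlap idea can also be expressed without seeking, by holding back the tail of each chunk and prepending it to the next one. The following is only a sketch under stated assumptions: the replacement has exactly the same byte length as every match, and the helper name replace_stream and its parameters are hypothetical. It uses StringIO so it can be tried without a real file. Adjacent matches that share their "--" delimiters may be handled differently than a single whole-file gsub would handle them.

```ruby
require 'stringio'

# Hypothetical helper: copies input to output, replacing fixed-length matches,
# even when a match straddles a chunk boundary.
def replace_stream(input, output, regex, replacement, blk_size = 1024)
  overlap = replacement.bytesize - 1   # longest possible incomplete match
  carry = "".b
  while (chunk = input.read(blk_size))
    data = carry + chunk
    data.gsub!(regex, replacement)
    # Hold back the tail, since it may be the start of a match completed by the next chunk.
    keep = [overlap, data.bytesize].min
    output.write(data.byteslice(0, data.bytesize - keep))
    carry = data.byteslice(data.bytesize - keep, keep)
  end
  output.write(carry)
end

regex = /--[0-9a-z]{32}--/
replacement = "--0ca2765b4fd186d6fc7c0ce385f0e9d9--"

# The match starts at byte 1000 and crosses the 1024-byte boundary.
source = ("x" * 1000) + "--" + ("a" * 32) + "--" + ("y" * 100)
out = StringIO.new
replace_stream(StringIO.new(source), out, regex, replacement)
puts out.string.include?(replacement)
```

With a real file, the same helper could be called with File.open(fn, "rb") as the input and STDOUT as the output.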
I need to generate a boundary for a multipart upload:
post << "--#{BOUNDARY}\r\n"
post << "Content-Disposition: form-data; name=\"datafile\"; filename=\"#{filename}\"\r\n"
post << "Content-Type: text/plain\r\n"
post << "\r\n"
post << file
post << "\r\n--#{BOUNDARY}--\r\n"
The BOUNDARY needs to be a random string (not present in the file).
In Rails, I could do SecureRandom.hex(10).
How can I do it without loading ActiveSupport?
If you need a random alphanumeric string, use something like:
rand(10000000000000).floor.to_s(36)
This will make a random number (change the multiplier to make the string longer) and represent it in radix 36 (10 numbers + 26 letters).
For a Base64 string, you could do something like
require 'base64'
Base64.encode64(rand(10000000000000).to_s).chomp("=\n")
If you need strings of a fixed length, play with the random number range you're supplying, using something like 1000000 + rand(10000000).
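One way to get a fixed length without tuning the range is to pad the result. This is a small sketch, with random_base36 as a hypothetical helper name: it draws a number below 36**length and left-pads the radix-36 string with "0" so short results still have the requested length.

```ruby
# Hypothetical helper: random lowercase alphanumeric string of exactly `length` characters.
def random_base36(length)
  # rand(36**length) can yield a small number, so left-pad with "0" to keep the length fixed.
  rand(36**length).to_s(36).rjust(length, "0")
end

puts random_base36(10)
```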
Last time I used MD5 on rand like this:
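# The old 'md5' library was removed in Ruby 1.9; Digest::MD5 is the current equivalent.
require 'digest/md5'
random_string = Digest::MD5.hexdigest(rand(1234567).to_s)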
If you want a random alphanumeric string, you can do it like this:
o = [('a'..'z'),('A'..'Z'),('0'..'9')].map{|i| i.to_a}.flatten
string = (0...50).map{ o[rand(o.length)] }.join
This will also generate a random alphanumeric string:
rand(36**length).to_s(36)
You can pass "length" to set the size of the random string, e.g. 8 or 10. Note that the result can be shorter than "length" when the random number happens to be small.
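For what it is worth, SecureRandom ships with Ruby's standard library, so it can be loaded with require 'securerandom' and without ActiveSupport. The sketch below also retries until the candidate does not occur in the data, which covers the "not present in the file" requirement. The helper name boundary_for and the sample file contents are hypothetical.

```ruby
require 'securerandom'

# Hypothetical helper: returns a hex boundary that does not occur in `body`.
def boundary_for(body)
  loop do
    candidate = SecureRandom.hex(10)   # 10 random bytes as 20 hex characters
    return candidate unless body.include?(candidate)
  end
end

file = "example file contents\n"   # stand-in for the uploaded data
boundary = boundary_for(file)
puts boundary
```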
I have Ruby code to parse data in an Excel file using the Parseexcel gem. I need to save two columns from that file into a Hash. Here is my code:
worksheet.each { |row|
if row != nil
key = row.at(1).to_s.strip
value = row.at(0).to_s.strip
if !parts.has_key?(key) and key.length > 0
parts[key] = value
end
end
}
However, it still saves duplicate keys into the hash: "020098-10". I checked the Excel file at the specified rows and found that the values differ: " 020098-10" and "020098-10". The first one has a leading space while the second doesn't. I don't understand this; doesn't .strip already remove all leading and trailing whitespace?
Also, when I tried to print key.length, it gave me these weird numbers:
020098-10 length 18
020098-10 length 17
when both should be 9.
If you inspect the strings you receive, you will probably see something like:
" \x000\x002\x000\x000\x009\x008\x00-\x001\x000\x00"
This happens because of the string's encoding. Excel works with Unicode (UTF-16, judging by the interleaved \x00 bytes), while Ruby here is reading the bytes as a single-byte encoding. The encodings will differ on various platforms.
You need to convert the data you receive from Excel to a printable encoding.
However, you should not convert strings created in Ruby itself, as you will end up with garbage.
Consider this code:
# Converts the UTF-16LE bytes coming from Excel into UTF-8.
@enc = Encoding::Converter.new("UTF-16LE", "UTF-8")
def convert(cell)
  if cell.numeric
    cell.value
  else
    @enc.convert(cell.value).strip
  end
end
parts = {}
worksheet.each do |row|
  next unless row   # skip empty rows
  key = convert row.at(1)
  value = convert row.at(0)
  parts[key] = value unless parts.has_key?(key) or key.to_s.empty?
end
You may want to change the encodings to different ones.
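To see the effect in isolation, here is a small sketch that simulates the raw bytes instead of reading a real spreadsheet, assuming the cell text arrives as UTF-16LE. It uses only core String methods (encode, b, force_encoding); the sample key is the one from the question.

```ruby
# Simulate the raw bytes Excel would hand over: UTF-16LE text tagged as binary.
raw = " 020098-10".encode("UTF-16LE").b

puts raw.bytesize   # two bytes per character, including the leading space

# Tag the bytes with their real encoding, then transcode to UTF-8 before stripping.
key = raw.force_encoding("UTF-16LE").encode("UTF-8").strip

puts key
puts key.length
```

Once both cells are converted this way, the two variants of the key compare equal and the duplicate disappears.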
The newer Spreadsheet gem handles charset conversion automatically for you (to UTF-8 by default, I think, but you can change it), so I'd recommend using it instead.