I have a base64-encoded, 32-bit integer, that's an IP address: DMmN60
I cannot for the life of me figure out how to both unpack it and turn it into a quad-dotted representation I can actually use.
Unpacking it with unpack('m') actually only gives me three bytes. I don't see how that's right either, but this is far from my expertise.
Hello from the future (two years later) with Ruby 1.9.3. To encode:
require 'base64'
ps = "204.152.222.180"
pa = ps.split('.').map { |_| _.to_i } # => [204, 152, 222, 180]
pbs = pa.pack("C*") # => "\xCC\x98\xDE\xB4"
es = Base64.encode64(s) # => "zJjetA==\n"
To decode:
require 'base64'
es = "zJjetA==\n"
pbs = Base64.decode64(es) # => "\xCC\x98\xDE\xB4"
pa = pbs.unpack("C*") # => [204, 152, 222, 180]
ps = pa.map { |_| _.to_s }.join('.') # => "204.152.222.180"
Name explanation:
ps = plain string
pa = plain array
pbs = plain binary string
es = encoded string
For your example:
Base64.decode64("DMmN60==").unpack("C*").map { |_| _.to_s }.join('.')
# => "12.201.141.235"
Just like #duskwuff said.
Your comment above said that the P10 base64 is non-standard, so I took a look at: http://carlo17.home.xs4all.nl/irc/P10.html which defines:
<base64> = 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P',
'Q','R','S','T','U','V','W','X','Y','Z','a','b','c','d','e','f',
'g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v',
'w','x','y','z','0','1','2','3','4','5','6','7','8','9','[',']'
It is the same as http://en.wikipedia.org/wiki/Base64 -- except that the last two characters are + and / in the normal Base64. I don't see how this explains the discrepancy, since your example did not use those characters. I do wonder, however, if it was due to some other factor -- perhaps an endian issue.
Adding padding (DMmN60==) and decoding gives me the bytes:
0C C9 8D EB
Which decodes to 12.201.141.235.
Related
I am trying to create a bitcoin address in ruby according to the documentation of bitcoin wiki (bitcoin creation according bitcoin wiki).
Starting point is just some random string which emulates the output of ripmed160.
Unfortunately I don't quite succeed in doing so, here is my code:
require 'base58_gmp'
tx_hash = "a8d0c0184dde994a09ec054286f1ce581bebf46446a512166eae7628734ea0a5"
ripmed160 = tx_hash[0..39]
ripmed160_with_pre = "00" + ripmed160
sha1 = Digest::SHA256.hexdigest ripmed160_with_pre
sha2 = Digest::SHA256.hexdigest sha1
bin_address = Integer("0x" + ripmed160_with_pre + sha2[0..7])
bitcoin_address = "1" + Base58GMP.encode(bin_address, 'bitcoin') # => "1GPcbTYDBwJ42MfKkedxjmJ3nrgoaNd2Sf"
I get something that looks like a bitcoin address but it is not recognised by blockchain.info so I guess it is invalid.
Can you please help me to make that work.
When you calculate the SHA256 checksum, make sure to calculate it over the actual bytes of the previous step, not the hex encoding of those bytes:
# First convert to actual bytes.
bytes = [ripmed160_with_pre].pack('H*')
# Now calculate the first hash over the raw bytes, and
# return the raw bytes again for the next hash
# (note: digest not hexdigest).
sha1 = Digest::SHA256.digest bytes
# Second SHA256, using the raw bytes from the previous step
# but this time we can use hexdigest as the rest of the code
# assumes hex encoded strings
sha2 = Digest::SHA256.hexdigest sha1
I'm trying to format a UUIDv4 into a url friendly string. The typical format in base16 is pretty long and has dashes:
xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
To avoid dashes and underscores I was going to use base58 (like bitcoin does) so each character fully encode sqrt(58).floor = 7 bits.
I can pack the uuid into binary with:
[ uuid.delete('-') ].pack('H*')
To get 8-bit unsigned integers its:
binary.unpack('C*')
How can i unpack every 7-bits into 8-bit unsigned integers? Is there a pattern to scan 7-bits at a time and set the high bit to 0?
require 'base58'
uuid ="123e4567-e89b-12d3-a456-426655440000"
Base58.encode(uuid.delete('-').to_i(16))
=> "3fEgj34VWmVufdDD1fE1Su"
and back again
Base58.decode("3fEgj34VWmVufdDD1fE1Su").to_s(16)
=> "123e4567e89b12d3a456426655440000"
A handy pattern to reconstruct the uuid format from a template
template = 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
src = "123e4567e89b12d3a456426655440000".each_char
template.each_char.reduce(''){|acc, e| acc += e=='-' ? e : src.next}
=> "123e4567-e89b-12d3-a456-426655440000"
John La Rooy's answer is great, but I just wanted to point out how simple the Base58 algorithm is because I think it's neat. (Loosely based on the base58 gem, plus bonus original int_to_uuid function):
ALPHABET = "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ".chars
BASE = ALPHABET.size
def base58_to_int(base58_val)
base58_val.chars
.reverse_each.with_index
.reduce(0) do |int_val, (char, index)|
int_val + ALPHABET.index(char) * BASE ** index
end
end
def int_to_base58(int_val)
''.tap do |base58_val|
while int_val > 0
int_val, mod = int_val.divmod(BASE)
base58_val.prepend ALPHABET[mod]
end
end
end
def int_to_uuid(int_val)
base16_val = int_val.to_s(16)
[ 8, 4, 4, 4, 12 ].map do |n|
base16_val.slice!(0...n)
end.join('-')
end
uuid = "123e4567-e89b-12d3-a456-426655440000"
int_val = uuid.delete('-').to_i(16)
base58_val = int_to_base58(int_val)
int_val2 = base58_to_int(base58_val)
uuid2 = int_to_uuid(int_val2)
printf <<END, uuid, int_val, base_58_val, int_val2, uuid2
Input UUID: %s
Input UUID as integer: %d
Integer encoded as base 58: %s
Integer decoded from base 58: %d
Decoded integer as UUID: %s
END
Output:
Input UUID: 123e4567-e89b-12d3-a456-426655440000
Input UUID as integer: 24249434048109030647017182302883282944
Integer encoded as base 58: 3fEgj34VWmVufdDD1fE1Su
Integer decoded from base 58: 24249434048109030647017182302883282944
Decoded integer as UUID: 123e4567-e89b-12d3-a456-426655440000
I'm using an API in which I have to send client informations as a Json-object over a telnet connection (very strange, I know^^).
I'm german so the client information contains very often umlauts or the ß.
My procedure:
I generate a Hash with all the command information.
I convert the Hash to a Json-object.
I convert the Json-object to a string (with .to_s).
I send the string with the Net::Telnet.puts command.
My puts command looks like: (cmd is the Json-object)
host.puts(cmd.to_s.force_encoding('UTF-8'))
In the log files I see, that the Json-object don't contain the umlauts but for example this: ü instead of ü.
I proved that the string is (with or without the force_encoding() command) in UTF-8. So I think that the puts command doesn't send the strings in UTF-8.
Is it possible to send the command in UTF-8? How can I do this?
The whole methods:
host = Net::Telnet::new(
'Host' => host_string,
'Port' => port_integer,
'Output_log' => 'log/'+Time.now.strftime('%Y-%m-%d')+'.log',
'Timeout' => false,
'Telnetmode' => false,
'Prompt' => /\z/n
)
def send_cmd_container(host, cmd, params=nil)
cmd = JSON.generate({'*C'=>'se','Q'=>[get_cmd(cmd, params)]})
host.puts(cmd.to_s.force_encoding('UTF-8'))
add_request_to_logfile(cmd)
end
def get_cmd(cmd, params=nil)
if params == nil
return {'*C'=>'sq','CMD'=>cmd}
else
return {'*C'=>'sq','CMD'=>cmd,'PARAMS'=>params}
end
end
Addition:
I also log my sended requests by this method:
def add_request_to_logfile(request_string)
directory = 'log/'
File.open(File.join(directory, Time.now.strftime('%Y-%m-%d')+'.log'), 'a+') do |f|
f.puts ''
f.puts '> '+request_string
end
end
In the logfile my requests also don't contain UTF-8 umlauts but for example this: ü
TL;DR
Set 'Binmode' => true and use Encoding::BINARY.
The above should work for you. If you're interested in why, read on.
Telnet doesn't really have a concept of "encoding." Telnet just has two modes: Normal mode assumes you're sending 7-bit ASCII characters, and binary mode assumes you're sending 8-bit bytes. You can't tell Telnet "this is UTF-8" because Telnet doesn't know what that means. You can tell it "this is ASCII-7" or "this is a sequence of 8-bit bytes," and that's it.
This might seem like bad news, but it's actually great news, because it just so happens that UTF-8 encodes text as sequences of 8-bit bytes. früh, for example, is five bytes: 66 72 c3 bc 68. This is easy to confirm in Ruby:
puts str = "\x66\x72\xC3\xBC\x68"
# => früh
puts str.bytes.size
# => 5
In Net::Telnet we can turn on binary mode by passing the 'Binmode' => true option to Net::Telnet::new. But there's one more thing we have to do: Tell Ruby to treat the string like binary data, i.e. a sequence of 8-bit bytes.
You already tried to use String#force_encoding, but what you might not have realized is that String#force_encoding doesn't actually convert a string from one encoding to another. Its purpose isn't to change the data's encoding—its purpose is to tell Ruby what encoding the data is already in:
str = "früh" # => "früh"
p str.encoding # => #<Encoding:UTF-8>
p str[2] # => "ü"
p str.bytes # => [ 102, 114, 195, 188, 104 ] # This is the decimal represent-
# ation of the hexadecimal bytes
# we saw before, `66 72 c3 bc 68`
str.force_encoding(Encoding::BINARY) # => "fr\xC3\xBCh"
p str[2] # => "\xC3"
p str.bytes # => [ 102, 114, 195, 188, 104 ] # Same bytes!
Now I'll let you in on a little secret: Encoding::BINARY is just an alias for Encoding::ASCII_8BIT. Since ASCII-8BIT doesn't have multi-byte characters, Ruby shows ü as two separate bytes, \xC3\xBC. Those bytes aren't printable characters in ASCII-8BIT, so Ruby shows the \x## escape codes instead, but the data hasn't changed—only the way Ruby prints it has changed.
So here's the thing: Even though Ruby is now calling the string BINARY or ASCII-8BIT instead of UTF-8, it's still the same bytes, which means it's still UTF-8. Changing the encoding it's "tagged" as, however, means when Net::Telnet does (the equivalent of) data[n] it will always get one byte (instead of potentially getting multi-byte characters as in UTF-8), which is just what we want.
And so...
host = Net::Telnet::new(
# ...all of your other options...
'Binmode' => true
)
def send_cmd_container(host, cmd, params=nil)
cmd = JSON.generate('*C' => 'se','Q' => [ get_cmd(cmd, params) ])
cmd.force_encoding(Encoding::BINARY)
host.puts(cmd)
# ...
end
(Note: JSON.generate always returns a UTF-8 string, so you never have to do e.g. cmd.to_s.)
Useful diagnostics
A quick way to check what data Net::Telnet is actually sending (and receiving) is to set the 'Dump_log' option (in the same way you set the 'Output_log' option). It will write both sent and received data to a log file in hexdump format, which will allow you to see if the bytes being sent are correct. For example, I started a test server (nc -l 5555) and sent the string früh (host.puts "früh".force_encoding(Encoding::BINARY)), and this is what was logged:
> 0x00000: 66 72 c3 bc 68 0a fr..h.
You can see that it sent six bytes: the first two are f and r, the next two make up ü, and the last two are h and a newline. On the right, bytes that aren't printable characters are shown as ., ergo fr..h.. (By the same token, I sent the string I❤NY and saw I...NY. in the right column, because ❤ is three bytes in UTF-8: e2 9d a4).
So, if you set 'Dump_log' and send a ü, you should see c3 bc in the output. If you do, congratulations—you're sending UTF-8!
P.S. Read Yehuda Katz' article Ruby 1.9 Encodings: A Primer and the Solution for Rails. In fact, read it yearly. It's really, really useful.
I have a table like this
table = {57,55,0,15,-25,139,130,-23,173,148,-24,136,158}
it is utf8 encoded byte array by php unpack function
unpack('C*',$str);
how can I convert it to utf-8 string I can read in lua?
Lua doesn't provide a direct function for turning a table of utf-8 bytes in numeric form into a utf-8 string literal. But it's easy enough to write something for this with the help of string.char:
function utf8_from(t)
local bytearr = {}
for _, v in ipairs(t) do
local utf8byte = v < 0 and (0xff + v + 1) or v
table.insert(bytearr, string.char(utf8byte))
end
return table.concat(bytearr)
end
Note that none of lua's standard functions or provided string facilities are utf-8 aware. If you try to print utf-8 encoded string returned from the above function you'll just see some funky symbols. If you need more extensive utf-8 support you'll want to check out some of the libraries mention from the lua wiki.
Here's a comprehensive solution that works for the UTF-8 character set restricted by RFC 3629:
do
local bytemarkers = { {0x7FF,192}, {0xFFFF,224}, {0x1FFFFF,240} }
function utf8(decimal)
if decimal<128 then return string.char(decimal) end
local charbytes = {}
for bytes,vals in ipairs(bytemarkers) do
if decimal<=vals[1] then
for b=bytes+1,2,-1 do
local mod = decimal%64
decimal = (decimal-mod)/64
charbytes[b] = string.char(128+mod)
end
charbytes[1] = string.char(vals[2]+decimal)
break
end
end
return table.concat(charbytes)
end
end
function utf8frompoints(...)
local chars,arg={},{...}
for i,n in ipairs(arg) do chars[i]=utf8(arg[i]) end
return table.concat(chars)
end
print(utf8frompoints(72, 233, 108, 108, 246, 32, 8364, 8212))
--> Héllö €—
Using Ruby, how can one parse the ID3 tags of remote mp3 files without downloading the entire file to disk?
This question has been asked in Java and Silverlight, but no Ruby.
Edit: Looking at the Java answer, it seems as though it's possible (HTTP supports it) to download just the tail end of a file, which is where the tags are. Can this be done in Ruby?
which Ruby version are you using?
which ID3 Tag version are you trying to read?
ID3v1 tags are at the end of a file, in the last 128 bytes. With Net::HTTP it doesn't seem to be possible to seek forward towards the end of the file and read only the last N bytes. If you try that, using
headers = {"Range" => "bytes=128-"} , it always seems to download the complete file. resp.body.size => file-size . But no big loss, because ID3 version 1 is pretty much outdated at this point because of it's limitations, such as fixed length format, only ASCII text, ...). iTunes uses ID3 version 2.2.0.
ID3v2 tags are at the beginning of a file - to support streaming - you can download the initial part of the MP3 file, which contains the ID3v2 header, via HTTP protocol >= 1.1
The short answer:
require 'net/http'
require 'uri'
require 'id3' # id3 RUby library
require 'hexdump'
file_url = 'http://example.com/filename.mp3'
uri = URI(file_url)
size = 1000 # ID3v2 tags can be considerably larger, because of embedded album pictures
Net::HTTP.version_1_2 # make sure we use higher HTTP protocol version than 1.0
http = Net::HTTP.new(uri.host, uri.port)
resp = http.get( file_url , {'Range' => "bytes=0-#{size}"} )
# should check the response status codes here..
if resp.body =~ /^ID3/ # we most likely only read a small portion of the ID3v2 tag..
# file has ID3v2 tag
puts resp.body.hexdump
tag2 = ID3::Tag2.new
tag2.read_from_buffer( resp.body )
#id3_tag_size = tag2.ID3v2tag_size # that's the size of the whole ID3v2 tag
# we should now re-fetch the tag with the correct / known size
# ...
end
e.g.:
index 0 1 2 3 4 5 6 7 8 9 A B C D E F
00000000 ["49443302"] ["00000000"] ["11015454"] ["3200000d"] ID3.......TT2...
00000010 ["004b6167"] ["75796120"] ["48696d65"] ["00545031"] .Kaguya Hime.TP1
00000020 ["00000e00"] ["4a756e6f"] ["20726561"] ["63746f72"] ....Juno reactor
00000030 ["0054414c"] ["00001100"] ["4269626c"] ["65206f66"] .TAL....Bible of
00000040 ["20447265"] ["616d7300"] ["54524b00"] ["00050036"] Dreams.TRK....6
00000050 ["2f390054"] ["59450000"] ["06003139"] ["39370054"] /9.TYE....1997.T
00000060 ["434f0000"] ["1300456c"] ["65637472"] ["6f6e6963"] CO....Electronic
00000070 ["612f4461"] ["6e636500"] ["54454e00"] ["000d0069"] a/Dance.TEN....i
00000080 ["54756e65"] ["73207632"] ["2e300043"] ["4f4d0000"] Tunes v2.0.COM..
00000090 ["3e00656e"] ["67695475"] ["6e65735f"] ["43444442"] >.engiTunes_CDDB
000000a0 ["5f494473"] ["00392b36"] ["34374334"] ["36373436"] _IDs.9+647C46746
000000b0 ["38413234"] ["38313733"] ["41344132"] ["30334544"] 8A248173A4A203ED
000000c0 ["32323034"] ["4341422b"] ["31363333"] ["39390000"] 2204CAB+163399..
000000d0 ["00000000"] ["00000000"] ["00000000"] ["00000000"] ................
The long answer looks something like this: (you'll need id3 library version 1.0.0_pre or newer)
require 'net/http'
require 'uri'
require 'id3' # id3 RUby library
require 'hexdump'
file_url = 'http://example.com/filename.mp3'
def get_remote_id3v2_tag( file_url ) # you would call this..
id3v2tag_size = get_remote_id3v2_tag_size( file_url )
if id3v2tag_size > 0
buffer = get_remote_bytes(file_url, id3v2tag_size )
tag2 = ID3::Tag2.new
tag2.read_from_buffer( buffer )
return tag2
else
return nil
end
end
private
def get_remote_id3v2_tag_size( file_url )
buffer = get_remote_bytes( file_url, 100 )
if buffer.bytesize > 0
return buffer.ID3v2tag_size
else
return 0
end
end
private
def get_remote_bytes( file_url, n)
uri = URI(file_url)
size = n # ID3v2 tags can be considerably larger, because of embedded album pictures
Net::HTTP.version_1_2 # make sure we use higher HTTP protocol version than 1.0
http = Net::HTTP.new(uri.host, uri.port)
resp = http.get( file_url , {'Range' => "bytes=0-#{size-1}"} )
resp_code = resp.code.to_i
if (resp_code >= 200 && resp_code < 300) then
return resp.body
else
return ''
end
end
get_remote_id3v2_tag_size( file_url )
=> 2262
See:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35
http://en.wikipedia.org/wiki/Byte_serving
some examples how to download in parts files can be found here:
but please note that there seems to be no way to start downloading "in the middle"
How do I download a binary file over HTTP?
http://unixgods.org/~tilo/Ruby/ID3/docs/index.html
you would at least have to download the last blocks of the file, which contain the ID3 tags -- see ID3 tag definitions...
if you have access to the files on the remote file system, you could do. this remotely, and then transfer back the ID3 tags
Edit:
I was thinking of ID3 v1 tags -- version 2 tags are in the front.