How can I output leading zeros in Ruby? - ruby

I'm outputting a set of numbered files from a Ruby script. The numbers come from incrementing a counter, but to make them sort nicely in the directory, I'd like to use leading zeros in the filenames. In other words
file_001...
instead of
file_1
Is there a simple way to add leading zeros when converting a number to a string? (I know I can do "if less than 10.... if less than 100").

Use the % operator with a string:
irb(main):001:0> "%03d" % 5
=> "005"
The left-hand-side is a printf format string, and the right-hand side can be a list of values, so you could do something like:
irb(main):002:0> filename = "%s/%s.%04d.txt" % ["dirname", "filename", 23]
=> "dirname/filename.0023.txt"
Here's a printf format cheat sheet you might find useful in forming your format string. The printf format is originally from the C function printf, but similar formatting functions are available in Perl, Ruby, Python, Java, PHP, etc.
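To illustrate why the padding matters for directory listings, here is a minimal sketch comparing how padded and unpadded names sort as plain strings. The counter values and the file_ prefix are made up for the example.

```ruby
counters = [1, 2, 10, 100]

# Unpadded names sort lexicographically, so "file_10" lands before "file_2".
unpadded = counters.map { |i| "file_#{i}" }

# Zero-padded names keep their numeric order when sorted as strings.
padded = counters.map { |i| format('file_%03d', i) }

puts unpadded.sort.inspect
puts padded.sort.inspect
```

A width of 3 only keeps the order intact up to 999; a larger counter needs a wider field.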

If the maximum number of digits in the counter is known (e.g., n = 3 for counters 1..876), you can do
str = "file_" + i.to_s.rjust(n, "0")
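If the width should follow from the largest counter rather than being hard-coded, one possible sketch is below. The helper name numbered_name and its parameters are hypothetical, not part of any library.

```ruby
# Hypothetical helper: pads i to the width of the largest expected counter.
def numbered_name(i, max, prefix = 'file_')
  width = max.to_s.length # e.g. 3 when max is 876
  prefix + i.to_s.rjust(width, '0')
end

puts numbered_name(7, 876)
puts numbered_name(42, 12_345)
```

Note that rjust never truncates, so a counter wider than the computed width is returned unpadded rather than cut off.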

Can't you just use string formatting of the value before you concat the filename?
"%03d" % number

Use String#next as the counter.
>> n = "000"
>> 3.times { puts "file_#{n.next!}" }
file_001
file_002
file_003
next is relatively 'clever', meaning you can even go for
>> n = "file_000"
>> 3.times { puts n.next! }
file_001
file_002
file_003

As stated by the other answers, "%03d" % number works pretty well, but it goes against the RuboCop Ruby style guide:
Favor the use of sprintf and its alias format over the fairly
cryptic String#% method
We can obtain the same result in a more readable way using the following:
format('%03d', number)

filenames = '000'.upto('100').map { |index| "file_#{index}" }
Outputs
[file_000, file_001, file_002, file_003, ..., file_098, file_099, file_100]

Related

How to convert bytes in number into a string of characters? (character representation of a number)

How do I easily convert a number, e.g. 0x616263, equivalently 6382179 in base 10, into a string by dividing the number up into sequential bytes? So the example above should convert into 'abc'.
I've experimented with Array#pack but can't figure out how to get it to convert more than one byte in the number, e.g. [0x616263].pack("C*") returns 'c'.
I've also tried 0x616263.to_s(256), but that throws an ArgumentError: invalid radix. I guess it needs some sort of encoding information?
(Note: Other datatypes in pack like N work with the example I've given above, but only because it fits within 4 bytes, so e.g. [0x616263646566].pack("N") gives cdef, not abcdef)
This question is vaguely similar to this one, but not really. Also, I sort of figured out how to get the hex representation string from a character string using "abcde".unpack("c*").map{|c| c.to_s(16)}.join(""), which gives '6162636465'. I basically want to go backwards.
I don't think this is an X-Y problem, but in case it is - I'm trying to convert a number I've decoded with RSA into a character string.
Thanks for any help. I'm not too experienced with Ruby. I'd also be interested in a Python solution (for fun), but I don't know if it's right to add tags for two separate programming languages to this question.
To convert a single number 0x00616263 into 3 characters, what you really need to do first is separate them into three numbers: 0x00000061, 0x00000062, and 0x00000063.
For the last number, the hex digits you want are already in the correct place. But for the other two, you have to do a bitshift using >> 16 and >> 8 respectively.
Afterwards, use a bitwise and to get rid of the other digits:
num1 = (0x616263 >> 16) & 0xFF
num2 = (0x616263 >> 8) & 0xFF
num3 = 0x616263 & 0xFF
For the characters, you could then do:
char1 = ((0x616263 >> 16) & 0xFF).chr
char2 = ((0x616263 >> 8) & 0xFF).chr
char3 = (0x616263 & 0xFF).chr
Of course, bitwise operations aren't very Ruby-esque. There are probably more Ruby-like answers that someone else might provide.
64 bit integers
If your number is smaller than 2**64 (8 bytes), you can:
convert the "big-endian unsigned long long" to 8 bytes
remove the leading zero bytes
Ruby
[0x616263].pack('Q>').sub(/\A\x00+/, '')
# "abc"
[0x616263646566].pack('Q>').sub(/\A\x00+/, '')
# "abcdef"
Python 2 & 3
In Python, pack returns bytes, not a string. You can use decode() to convert bytes to a string:
import struct
# lstrip removes only the leading zero bytes, leaving any inside the data alone
print(struct.pack(">Q", 0x616263646566).lstrip(b'\x00').decode())
# abcdef
print(struct.pack(">Q", 0x616263).lstrip(b'\x00').decode())
# abc
Large numbers
With gsub
If your number doesn't fit in 8 bytes, you could use a modified version of your code. This is shorter and outputs the string correctly if the first byte is smaller than 0x10 (e.g. for "\t"), in which case the hex string has an odd number of digits:
def decode(int)
  if int < 2**64
    # \A anchors the match so only leading zero bytes are removed
    [int].pack('Q>').sub(/\A\x00+/, '')
  else
    nhex = int.to_s(16)
    nhex = '0' + nhex if nhex.size.odd?
    nhex.gsub(/../) { |hh| hh.to_i(16).chr }
  end
end
puts decode(0x616263) == 'abc'
# true
puts decode(0x616263646566) == 'abcdef'
# true
puts decode(0x0961) == "\ta"
# true
puts decode(0x546869732073656e74656e63652069732077617920746f6f206c6f6e6720666f7220616e20496e743634)
# This sentence is way too long for an Int64
By the way, here's the reverse method:
def encode(str)
  str.reverse.each_byte.with_index.map { |b, i| b * 256**i }.inject(:+)
end
You should still check if your RSA code really outputs arbitrarily large numbers or just an array of integers.
With shifts
Here's another way to get the result. It's similar to @Nathan's answer, but it works for any integer size:
def decode(int)
  a = []
  while int > 0
    a << (int & 0xFF)
    int >>= 8
  end
  a.reverse.pack('C*')
end
According to fruity, it's twice as fast as the gsub solution.
I'm currently rolling with this:
n = 0x616263
nhex = n.to_s(16)
nhexarr = nhex.scan(/.{1,2}/)
nhexarr = nhexarr.map {|e| e.to_i(16)}
out = nhexarr.pack("C*")
But was hoping for a concise/built-in way to do this, so I'll leave this answer unaccepted for now.
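For a more concise variant, one option is Integer#digits with base 256, which splits the number into bytes directly. This is only a sketch: it assumes Ruby 2.4 or later and a non-negative integer, and the names decode_bytes and encode_bytes are made up for the example.

```ruby
# Assumes a non-negative Integer and Ruby 2.4+ (Integer#digits, String#unpack1).
def decode_bytes(int)
  # digits(256) yields the bytes least-significant first, so reverse them.
  int.digits(256).reverse.pack('C*')
end

# Reverse direction: read the bytes as one big hexadecimal number.
def encode_bytes(str)
  str.unpack1('H*').to_i(16)
end

puts decode_bytes(0x616263)
puts encode_bytes('abc').to_s(16)
```

As with the other approaches, leading zero bytes cannot survive the round trip, because the integer itself does not record them.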

Force Ruby to not output a float in standard form / scientific notation / exponential notation

I have the same problem as is found here for python, but for ruby.
I need to output a small number like this: 0.00001, not 1e-5.
For more information about my particular problem, I am outputting to a file using f.write("My number: " + small_number.to_s + "\n")
For my problem, accuracy isn't that big of an issue, so just doing an if statement to check if small_number < 1e-5 and then printing 0 is okay, it just doesn't seem as elegant as it should be.
So what is the more general way to do this?
f.printf "My number: %.5f\n", small_number
You can replace .5 (5 digits to the right of the decimal point) with any precision you like. For example, %8.3f pads the result to a minimum total width of 8 characters (counting the decimal point) with three digits to the right of the decimal point, much like C/C++ printf format strings.
If you always want 5 decimal places, you could use:
"%.5f" % small_number
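As a quick illustration of the difference between the default conversion and an explicit format, here is a small sketch. The sample values are arbitrary.

```ruby
small_number = 0.00001

default_text = small_number.to_s              # scientific notation
fixed_text   = format('%.5f', small_number)   # fixed-point, 5 decimals
padded_text  = format('%8.3f', 3.14159)       # width 8, 3 decimals

puts default_text
puts fixed_text
puts padded_text.inspect
```

The width in %8.3f is a minimum: shorter results are left-padded with spaces, while longer ones are printed in full.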
I would do something like this so you can strip off trailing zeros:
puts ("%.15f" % small_number).sub(/0*$/,"")
Don't go too far past 15, or you will suffer from the imprecision of floating point numbers.
puts ("%.25f" % 0.01).sub(/0*$/,"")
0.0100000000000000002081668
This also works on integers, trims excess zeros, and always returns the number as a valid floating-point literal. For clarity, it uses sprintf instead of the more cryptic % operator.
def format_float(number)
  sprintf('%.15f', number).sub(/0+$/, '').sub(/\.$/, '.0')
end
Examples:
format_float(1) => "1.0"
format_float(0.00000001) => "0.00000001"

Converting a hexadecimal number to binary in ruby

I am trying to convert a hex value to a binary value (each digit in the hex string should have an equivalent four-bit binary value). I was advised to use this:
num = "0ff" # (say for eg.)
bin = "%0#{num.size*4}b" % num.hex.to_i
This gives me the correct output 000011111111. I am confused with how this works, especially %0#{num.size*4}b. Could someone help me with this?
You can also do:
num = "0ff"
num.hex.to_s(2).rjust(num.size*4, '0')
You may have already figured this out, but num.size*4 is the number of digits to pad the output to with 0, because one hexadecimal digit is represented by four (log_2 16 = 4) binary digits.
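Breaking the one-liner from the question into steps may make it easier to follow. This is just the same expression unrolled; the intermediate variable names are made up for the example.

```ruby
num    = '0ff'
width  = num.size * 4         # 3 hex digits -> 12 binary digits
fmt    = "%0#{width}b"        # string interpolation builds "%012b"
binary = format(fmt, num.hex) # String#hex parses the hex text into an Integer

puts fmt
puts binary
```

In the format string, 0 is the zero-padding flag, 12 is the minimum width, and b selects binary output.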
You'll find the answer in the documentation of Kernel#sprintf (as pointed out by the docs for String#%):
http://www.ruby-doc.org/core/classes/Kernel.html#M001433
This is the most straightforward solution I found to convert from hexadecimal to binary:
['DEADBEEF'].pack('H*').unpack('B*').first # => "11011110101011011011111011101111"
And from binary to hexadecimal:
['11011110101011011011111011101111'].pack('B*').unpack1('H*') # => "deadbeef"
Here you can find more information:
Array#pack: https://ruby-doc.org/core-2.7.1/Array.html#method-i-pack
String#unpack1 (similar to unpack): https://ruby-doc.org/core-2.7.1/String.html#method-i-unpack1
This doesn't answer your original question, but I assume that a lot of people coming here are not looking to turn hexadecimal into actual "0s and 1s" binary output, but rather to decode hexadecimal into a byte string (in the spirit of utilities such as hex2bin). Here is a method for doing exactly that:
def hex_to_bin(hex)
  # Prepend a '0' for padding if you don't have an even number of chars
  hex = '0' << hex unless (hex.length % 2) == 0
  hex.scan(/[A-Fa-f0-9]{2}/).inject('') { |encoded, byte| encoded << [byte].pack('H2') }
end
Getting back to hex again is much easier:
def bin_to_hex(bin)
  bin.unpack('H*').first
end
Converting the string of hex digits back to binary is just as easy. Take the hex digits two at a time (since each byte can range from 00 to FF), convert the digits to a character, and join them back together.
def hex_to_bin(s)
  s.scan(/../).map { |x| x.hex.chr }.join
end
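Since the H* directive already processes the whole string, the pairwise scan can arguably be skipped. The sketch below shows a round trip; the helper names hex_to_bytes and bytes_to_hex are hypothetical, and left-padding odd-length input is an assumption about what the caller wants.

```ruby
# Hypothetical helper: decode a hex string into raw bytes in one pack call.
def hex_to_bytes(hex)
  # Left-pad odd-length input; otherwise pack pads the final nibble on the right.
  hex = "0#{hex}" if hex.length.odd?
  [hex].pack('H*')
end

# Hypothetical helper: encode raw bytes back into lowercase hex.
def bytes_to_hex(bytes)
  bytes.unpack1('H*')
end

puts hex_to_bytes('616263')
puts bytes_to_hex('abc')
```

Unlike the scan-based versions, this does not filter or validate the input, so non-hex characters should be rejected beforehand if that matters.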

Ruby: Fuzzing through all unicode characters (UTF8/Encoding/String Manipulation)

I can't iterate over the entire range of unicode characters.
I searched everywhere...
I am building a fuzzer and want to embed into a url, all unicode characters (one at a time).
For example:
http://www.example.com?a=\uff1c
I know that there are some built tools but I need more flexibility.
If I could do something like the following: "\u" + "ff1c" it would be great.
This is the closest I got:
char = "\u0000"
...
#within iteration
char.succ!
...
but after the character "\u0039", which is the number 9, I will get "10" instead of ":"
You could use pack to convert numbers to UTF8 characters but I'm not sure if this solves your problem.
You can either create an array with the numeric values of all the characters and use pack to get a UTF-8 string, or you can just loop from 0 to whatever you need and use pack within the loop.
I've written a small example to explain myself. The code below prints out the hex value of each character followed by the character itself.
0.upto(100) do |i|
  puts "%04x" % i + ": " + [i].pack("U*")
end
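Applied to the fuzzing use case from the question, the same pack call could be combined with percent-encoding to build one URL per codepoint. This is only a sketch: each_fuzz_url and the SURROGATES constant are made-up names, the query parameter a is taken from the question's example URL, and skipping the surrogate range is an assumption about what the fuzzer should send.

```ruby
require 'uri'

# U+D800..U+DFFF are surrogates, which are not valid characters in UTF-8.
SURROGATES = (0xD800..0xDFFF)

# Hypothetical helper: yields one URL per codepoint in the given range.
def each_fuzz_url(base, codepoints)
  codepoints.each do |cp|
    next if SURROGATES.cover?(cp)

    char = [cp].pack('U') # integer codepoint -> UTF-8 string
    yield "#{base}?a=#{URI.encode_www_form_component(char)}"
  end
end

each_fuzz_url('http://www.example.com', 0xFF1C..0xFF1E) { |url| puts url }
```

Passing 0..0x10FFFF as the range would cover every codepoint; yielding instead of building an array keeps memory use flat for a range that large.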
Here's some simpler code, albeit slightly obfuscated, that takes advantage of the fact that Ruby will convert an integer on the right-hand side of the << operator to a codepoint. In Ruby 1.8 this only works for integer values <= 255; in 1.9 it also works for values greater than 255.
0.upto(100) do |i|
  puts "" << i
end

Ruby String pad zero OPE ID

I'm working with OPE IDs. One file has them with two trailing zeros, e.g., [998700, 1001900]. The other file has them with one or two leading zeros for a total length of six, e.g., [009987, 010019]. I want to convert every OPE ID (in both files) to an eight-digit string with exactly two leading zeros and however many zeros at the end to get it to be eight digits long.
Try this:
a = [ "00123123", "077934", "93422", "1231234", "12333" ]
a.map { |n| n.gsub(/^0*/, '00').ljust(8, '0') }
=> ["00123123", "00779340", "00934220", "001231234", "00123330"]
If you have your data parsed and stored as strings, it could be done like this, for example.
n = ["998700", "1001900", "009987", "0010019"]
puts n.map { |i|
i =~ /^0*([0-9]+?)0*$/
"00" + $1 + "0" * [0, 6 - $1.length].max
}
Output:
00998700
00100190
00998700
00100190
This example on codepad.
I'm not very sure, though, that I got the description exactly right. Please check the comments and I'll correct it in case it's not exactly what you were looking for.
With the help of the answers given by @detunized & @nimblegorilla, I came up with:
"998700"[0..-3].rjust(6, '0').to_sym
to make the first format I described (always with two trailing zeros) equal to the second.

Resources