I have the following array that represents decimal values of ASCII and non ASCII characters.
a=[32, 57, 50, 32, 56, 51, 32, 65, 52, 130, 0, 101, 131, 69, 72, 38, 146, 89, 9]
Converting to char looks like this
a.map{|b| b.chr}
=> [" ", "9", "2", " ", "8", "3", " ", "A", "4", "\x82", "\x00", "e", "\x83", "E", "H", "&", "\x92", "Y", "\t"]
Joining the characters gives a string that starts with pairs of hexadecimal digits ([0-9A-F]), followed by raw bytes:
a.map{|b| b.chr}.join
=> " 92 83 A4\x82\x00e\x83EH&\x92Y\t"
Then I want to remove everything from the first non-ASCII value (\x82) onward. I tried the following, but nothing happens:
a.map{|b| b.chr}.join.gsub(/\\x.*/,"")
=> " 92 83 A4\x82\x00e\x83EH&\x92Y\t"
My expected output is to have only the hexadecimal numbers below:
92 83 A4
How can I do this?
Thanks for any help.
UPDATE
Testing with a larger array like the one below, I see that the output is correct only for @rewritten's solution. The expected output for this new array is " 92 83 49 26 92 59 00".
a=[32, 57, 50, 32, 56, 51, 32, 52, 57, 32, 50, 54, 32, 57, 50, 32, 53, 57,
32, 48, 48, 0, 0, 0, 0, 2, 130, 0, 0, 8, 254, 70, 124, 0, 6, 0, 3, 0, 3,
27, 0,2, 27, 3, 0, 227, 7, 1, 14, 17, 33, 0, 28, 14, 47, 38, 146, 89, 9]
a.map(&:chr).join.match(/^( \h\h)+/)[0] # rewritten's solution
a.map(&:chr).take_while(&"\x80".method(:>)).join # Aleksei's solution
a.map(&:chr).take_while(&:ascii_only?).join # cremno's solution
irb(main): a.map(&:chr).join.match(/^( \h\h)+/)[0]
=> " 92 83 49 26 92 59 00"
irb(main): a.map(&:chr).take_while(&"\x80".method(:>)).join
=> " 92 83 49 26 92 59 00\x00\x00\x00\x00\x02"
irb(main): a.map(&:chr).take_while(&:ascii_only?).join
=> " 92 83 49 26 92 59 00\x00\x00\x00\x00\x02"
Thanks to all for the help.
Just filter it out before joining an array into a string:
[" ", "9", "2", " ", "8", "3", " ", "A", "4", "\x82", "\x00"].
take_while(&"\x80".method(:>))
#⇒ [" ", "9", "2", " ", "8", "3", " ", "A", "4"]
Then do whatever you want with the resulting array.
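The gsub in the question has nothing to match because the joined string holds the single byte 0x82, not the four characters backslash, x, 8, 2; "\x82" is only how irb displays that byte. Here is a minimal sketch of the same filtering done on the integers before any conversion, assuming "non-ASCII" means a value above 127 (the names `cut` and `prefix` are only for illustration):

```ruby
a = [32, 57, 50, 32, 56, 51, 32, 65, 52, 130, 0, 101, 131, 69, 72, 38, 146, 89, 9]

# The joined string contains raw bytes, not a literal backslash followed by "x".
joined = a.pack('C*')
has_literal_backslash_x = joined.include?('\x') # single quotes: backslash + x

# Index of the first value outside the 7-bit ASCII range (or the end of the array).
cut = a.index { |b| b > 127 } || a.size

# Convert only the leading ASCII values to a string.
prefix = a.first(cut).pack('C*')
puts prefix.inspect
```

Note that this cuts at the first byte above 127 only; control bytes such as 0 still count as ASCII and are kept.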
Given the comment, I assume that you really want to ask about matching pattern "space, hex, hex" up to the first non-match.
That would look like this:
a.map(&:chr).join.match(/^( \h\h)+/)[0]
It uses the special \h placeholder for regular expressions, which matches hex digits (0-9, A-F, a-f).
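As a small sketch of the "space, hex, hex" matching described above, the helper below (the name `leading_hex_pairs` is made up for this example) anchors with \A and uses Ruby's \h hex-digit class. It packs the integers directly instead of mapping through chr, which is an assumption about what is convenient rather than a requirement.

```ruby
small = [32, 57, 50, 32, 56, 51, 32, 65, 52, 130, 0, 101]
large = [32, 57, 50, 32, 56, 51, 32, 52, 57, 32, 50, 54, 32, 57, 50, 32, 53, 57,
         32, 48, 48, 0, 0, 0, 0, 2, 130, 0, 0, 8, 254]

# \A anchors at the start of the string; (?: \h\h)+ is one or more
# "space, hex digit, hex digit" groups. String#[] returns nil when
# nothing matches, so callers should be ready for that.
def leading_hex_pairs(bytes)
  bytes.pack('C*')[/\A(?: \h\h)+/]
end

puts leading_hex_pairs(small).inspect
puts leading_hex_pairs(large).inspect
```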
Additional info:
Again based on my interpretation of the question, if the original array is long (or a stream) there is no need to consume it all. It is better to stop generating characters as soon as possible:
hexs = "0123456789ABCDEF".bytes # byte values of the allowed hex digits
a.
lazy.
each_slice(3).
take_while { |spc, h1, h2| spc == 32 && hexs.include?(h1) && hexs.include?(h2) }.
flat_map { |slice| slice.map(&:chr) }. # each slice is an array of three integers
to_a.
join
This way the rest of your integer array is never even looked at.
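To illustrate the point about not consuming the whole input, here is a sketch that runs the same lazy pipeline over an endless enumerator. The method name `hex_prefix` and the sample stream are invented for this example; the call returns only because take_while stops at the first slice that is not "space, hex, hex".

```ruby
HEX_BYTES = '0123456789ABCDEF'.bytes.freeze

# Reads groups of three values and stops at the first group that is not
# a space followed by two uppercase hex digits.
def hex_prefix(stream)
  stream.lazy
        .each_slice(3)
        .take_while { |spc, h1, h2| spc == 32 && HEX_BYTES.include?(h1) && HEX_BYTES.include?(h2) }
        .flat_map { |triple| triple.map(&:chr) }
        .to_a
        .join
end

# An endless source: the seven values repeat forever.
endless = [32, 57, 50, 32, 56, 51, 130].cycle

puts hex_prefix(endless).inspect
```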
I have an array of [1, 0, 11, 0, 4, 0, 106, 211, 169, 1, 0, 12, 0, 8, 0, 1, 26, 25, 32, 189, 77, 216, 1, 0, 1, 0, 4, 0, 0, 0, 0, 12, 15].
I would love to create a string version mostly for logging purposes. My end result would be "01000B0004006AD3..."
I could not find a simple way to take each array byte value and pack a string with an ASCII presentation of the byte value.
My solution is cumbersome. I appreciate advice on making the solution slick.
array.each {|x|
value = (x>>4)&0x0f
if( value>9 ) then
result_string.concat (value-0x0a + 'A'.ord).chr
else
result_string.concat (value + '0'.ord).chr
end
value = (x)&0x0f
if( value>9 ) then
result_string.concat (value-0x0a + 'A'.ord).chr
else
result_string.concat (value + '0'.ord).chr
end
}
Your question isn't very clear, but I guess something like this is what you are looking for:
array.map {|n| n.to_s(16).rjust(2, '0').upcase }.join
#=> "01000B0004006AD3A901000C000800011A1920BD4DD80100010004000000000C0F"
or
array.map(&'%02X'.method(:%)).join
#=> "01000B0004006AD3A901000C000800011A1920BD4DD80100010004000000000C0F"
Which one of the two is more readable depends on how familiar your readers are with sprintf-style format strings, I guess.
It's actually pretty simple:
def hexpack(data)
data.pack('C*').unpack('H*')[0]
end
That packs your bytes using integer values (C) and unpacks the resulting string to hex (H). In practice:
hexpack([1, 0, 11, 0, 4, 0, 106, 211, 169, 1, 0, 12, 0, 8, 0, 1, 26, 25, 32, 189, 77, 216, 1, 0, 1, 0, 4, 0, 0, 0, 0, 12, 15])
# => "01000b0004006ad3a901000c000800011a1920bd4dd80100010004000000000c0f"
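The same pack/unpack pair also works in reverse, which can be handy when reading the log back. This is a sketch with hypothetical helper names (`bytes_to_hex`, `hex_to_bytes`); it adds upcase to get the uppercase form asked for in the question.

```ruby
# Integers (0..255) -> uppercase hex string.
def bytes_to_hex(bytes)
  bytes.pack('C*').unpack('H*').first.upcase
end

# Hex string -> integers. 'H*' reads two hex digits per byte, high nibble first.
def hex_to_bytes(hex)
  [hex].pack('H*').bytes
end

sample = [1, 0, 11, 0, 4, 0, 106, 211]
hex = bytes_to_hex(sample)
puts hex
puts hex_to_bytes(hex).inspect
```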
I might suggest you stick to hex or base64 instead of making your own formatting
dat = [1, 0, 11, 0, 4, 0, 106, 211, 169, 1, 0, 12, 0, 8, 0, 1, 26, 25, 32, 189, 77, 216, 1, 0, 1, 0, 4, 0, 0, 0, 0, 12, 15]
Hexadecimal
hex = dat.map { |x| sprintf('%02x', x) }.join
# => 01000b0004006ad3a901000c000800011a1920bd4dd80100010004000000000c0f
Base64
require 'base64'
base64 = Base64.encode64(dat.pack('c*'))
# => AQALAAQAatOpAQAMAAgAARoZIL1N2AEAAQAEAAAAAAwP\n
Proquints
What? Proquints are pronounceable unique identifiers, which makes them great for reading and communicating binary data. In your case they may not be the best fit because you're dealing with 30+ bytes, but they're very suitable for smaller byte strings.
# proquint.rb
# adapted to ruby from https://github.com/deoxxa/proquint
module Proquint
C = %w(b d f g h j k l m n p r s t v z)
V = %w(a i o u)
def self.encode(bytes)
bytes += [0] if bytes.size.odd? # pad a copy to an even length; the caller's array is left untouched
bytes.pack('c*').unpack('S*').reduce([]) do |acc, n|
c1 = n & 0x0f
v1 = (n >> 4) & 0x03
c2 = (n >> 6) & 0x0f
v2 = (n >> 10) & 0x03
c3 = (n >> 12) & 0x0f
acc << C[c1] + V[v1] + C[c2] + V[v2] + C[c3]
end.join('-')
end
def self.decode(str)
# learner's exercise
# or see some proquint library (eg) https://github.com/deoxxa/proquint
end
end
Proquint.encode dat
# => dabab-rabab-habab-potat-nokab-babub-babob-bahab-pihod-bohur-tadot-dabab-dabab-habab-babab-babub-zabab
Of course the entire process is reversible too. You might not need it, so I'll leave it as an exercise for the learner
It's particularly nice for things like IP address, or any other short binary blobs. You gain familiarity too as you see common byte strings in their proquint form
Proquint.encode [192, 168, 11, 51] # bagop-rasag
Proquint.encode [192, 168, 11, 52] # bagop-rabig
Proquint.encode [192, 168, 11, 66] # bagop-ramah
Proquint.encode [192, 168, 22, 19] # bagop-kisad
Proquint.encode [192, 168, 22, 20] # bagop-kibid
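For anyone who wants to see the reverse direction, here is one possible sketch of a decoder that mirrors the bit layout of the encoder above. It lives in a separate, hypothetical module (`ProquintSketch`) and uses the explicit little-endian format 'v*' rather than the native-endian 'S*', which is an assumption that matches the sample outputs shown here. A padding zero added for odd-length input cannot be told apart from real data when decoding.

```ruby
module ProquintSketch
  C = %w[b d f g h j k l m n p r s t v z].freeze
  V = %w[a i o u].freeze

  # Same bit layout as the encoder above, with explicit little-endian words.
  def self.encode(bytes)
    padded = bytes.size.odd? ? bytes + [0] : bytes
    padded.pack('C*').unpack('v*').map do |n|
      C[n & 0x0f] + V[(n >> 4) & 0x03] + C[(n >> 6) & 0x0f] +
        V[(n >> 10) & 0x03] + C[(n >> 12) & 0x0f]
    end.join('-')
  end

  # Rebuilds each 16-bit word from its five letters, then unpacks to bytes.
  def self.decode(str)
    words = str.split('-').map do |word|
      c1, v1, c2, v2, c3 = word.chars
      C.index(c1) | (V.index(v1) << 4) | (C.index(c2) << 6) |
        (V.index(v2) << 10) | (C.index(c3) << 12)
    end
    words.pack('v*').bytes
  end
end

puts ProquintSketch.encode([192, 168, 11, 51])
puts ProquintSketch.decode('bagop-rasag').inspect
```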
We perform checksums of some data in sql server as follows:
declare @cs int;
select
@cs = CHECKSUM_AGG(CHECKSUM(someid, position))
from
SomeTable
where
userid = @userId
group by
userid;
This data is then shared with clients. We'd like to be able to repeat the checksum at the client end... however there doesn't seem to be any info about how the checksums in the functions above are calculated. Can anyone enlighten me?
On SQL Server Forum, at this page, it's stated:
The built-in CHECKSUM function in SQL Server is built on a series of 4 bit left rotational xor operations. See this post for more explanation.
The CHECKSUM function doesn't provide a very good quality checksum and IMO is pretty useless for most purposes. As far as I know the algorithm isn't published. If you want a check that you can reproduce yourself then use the HashBytes function and one of the standard, published algorithms such as MD5 or SHA.
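If both sides switch to a published hash as suggested, the client only has to hash exactly the same bytes as the server. The sketch below shows the client side in Ruby, assuming the server uses HASHBYTES with SHA2_256. The helper name `client_hash` and the `nvarchar:` flag are invented for this example; the flag reflects the assumption that nvarchar values are hashed as UTF-16LE bytes while varchar values with ASCII-only content are hashed as-is.

```ruby
require 'digest'

# Returns the SHA-256 digest as uppercase hex, the form SQL Server tools
# usually display for varbinary values (minus the leading 0x).
def client_hash(text, nvarchar: false)
  bytes = nvarchar ? text.encode('UTF-16LE').b : text.b
  Digest::SHA256.hexdigest(bytes).upcase
end

puts client_hash('abc')
puts client_hash('abc', nvarchar: true)
```

The two calls give different digests for the same text, which is why the column type on the server matters when comparing results.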
// Quick hash sum with mirrored SQL and C# implementations (Ukraine)
private Int64 HASH_ZKCRC64(byte[] Data)
{
Int64 Result = 0x5555555555555555;
if (Data == null || Data.Length <= 0) return 0;
int SizeGlobalBufer = 8000;
int Ost = Data.Length % SizeGlobalBufer;
int LeftLimit = (Data.Length / SizeGlobalBufer) * SizeGlobalBufer;
for (int i = 0; i < LeftLimit; i += 64)
{
Result = Result
^ BitConverter.ToInt64(Data, i)
^ BitConverter.ToInt64(Data, i + 8)
^ BitConverter.ToInt64(Data, i + 16)
^ BitConverter.ToInt64(Data, i + 24)
^ BitConverter.ToInt64(Data, i + 32)
^ BitConverter.ToInt64(Data, i + 40)
^ BitConverter.ToInt64(Data, i + 48)
^ BitConverter.ToInt64(Data, i + 56);
if ((Result & 0x0000000000000080) != 0)
Result = Result ^ BitConverter.ToInt64(Data, i + 28);
}
if (Ost > 0)
{
byte[] Bufer = new byte[SizeGlobalBufer];
Array.Copy(Data, LeftLimit, Bufer, 0, Ost);
for (int i = 0; i < SizeGlobalBufer; i += 64)
{
Result = Result
^ BitConverter.ToInt64(Bufer, i)
^ BitConverter.ToInt64(Bufer, i + 8)
^ BitConverter.ToInt64(Bufer, i + 16)
^ BitConverter.ToInt64(Bufer, i + 24)
^ BitConverter.ToInt64(Bufer, i + 32)
^ BitConverter.ToInt64(Bufer, i + 40)
^ BitConverter.ToInt64(Bufer, i + 48)
^ BitConverter.ToInt64(Bufer, i + 56);
if ((Result & 0x0000000000000080)!=0)
Result = Result ^ BitConverter.ToInt64(Bufer, i + 28);
}
}
byte[] MiniBufer = BitConverter.GetBytes(Result);
Array.Reverse(MiniBufer);
return BitConverter.ToInt64(MiniBufer, 0);
#region SQL_FUNCTION
/* CREATE FUNCTION [dbo].[HASH_ZKCRC64] (@data as varbinary(MAX)) Returns bigint
AS
BEGIN
Declare @I64 as bigint Set @I64=0x5555555555555555
Declare @Bufer as binary(8000)
Declare @i as int Set @i=1
Declare @j as int
Declare @Len as int Set @Len=Len(@data)
if ((@data is null) Or (@Len<=0)) Return 0
While @i<=@Len
Begin
Set @Bufer=Substring(@data,@i,8000)
Set @j=1
While @j<=8000
Begin
Set @I64=@I64
^ CAST(Substring(@Bufer,@j, 8) as bigint)
^ CAST(Substring(@Bufer,@j+8, 8) as bigint)
^ CAST(Substring(@Bufer,@j+16,8) as bigint)
^ CAST(Substring(@Bufer,@j+24,8) as bigint)
^ CAST(Substring(@Bufer,@j+32,8) as bigint)
^ CAST(Substring(@Bufer,@j+40,8) as bigint)
^ CAST(Substring(@Bufer,@j+48,8) as bigint)
^ CAST(Substring(@Bufer,@j+56,8) as bigint)
if @I64<0 Set @I64=@I64 ^ CAST(Substring(@Bufer,@j+28,8) as bigint)
Set @j=@j+64
End;
Set @i=@i+8000
End
Return @I64
END
*/
#endregion
}
I figured out the CHECKSUM algorithm, at least for ASCII characters. I created a proof of it in JavaScript (see https://stackoverflow.com/a/59014293/9642).
In a nutshell: rotate 4 bits left and xor by a code for each character. The trick was figuring out the "XOR codes". Here's the table of those:
var xorcodes = [
0, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 31,
0, 33, 34, 35, 36, 37, 38, 39, // !"#$%&'
40, 41, 42, 43, 44, 45, 46, 47, // ()*+,-./
132, 133, 134, 135, 136, 137, 138, 139, // 01234567
140, 141, 48, 49, 50, 51, 52, 53, 54, // 89:;<=>?@
142, 143, 144, 145, 146, 147, 148, 149, // ABCDEFGH
150, 151, 152, 153, 154, 155, 156, 157, // IJKLMNOP
158, 159, 160, 161, 162, 163, 164, 165, // QRSTUVWX
166, 167, 55, 56, 57, 58, 59, 60, // YZ[\]^_`
142, 143, 144, 145, 146, 147, 148, 149, // abcdefgh
150, 151, 152, 153, 154, 155, 156, 157, // ijklmnop
158, 159, 160, 161, 162, 163, 164, 165, // qrstuvwx
166, 167, 61, 62, 63, 64, 65, 66, // yz{|}~
];
The main thing to note is the bias towards alphanumerics (their codes are similar and ascending). English letters use the same code regardless of case.
I haven't tested high codes (128+) nor Unicode.
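The "rotate 4 bits left and xor by a code" description can be sketched in Ruby as follows. This is only an illustration of the steps described above and has not been checked against SQL Server. The 32-bit word size, the starting value of 0 and the signed result are assumptions, and the names `xor_code` and `sketch_checksum` are made up. The `case` ranges reproduce the table above for codes 0 to 127.

```ruby
# XOR code per ASCII byte, condensed from the table above.
def xor_code(c)
  case c
  when 32       then 0        # space
  when 0..47    then c        # control characters and !"#$%&'()*+,-./
  when 48..57   then c + 84   # 0-9  -> 132..141
  when 58..64   then c - 10   # :;<=>?@ -> 48..54
  when 65..90   then c + 77   # A-Z  -> 142..167
  when 91..96   then c - 36   # [\]^_` -> 55..60
  when 97..122  then c + 45   # a-z  -> 142..167 (same as uppercase)
  when 123..127 then c - 62   # {|}~ and DEL -> 61..65
  else raise ArgumentError, "untested byte value: #{c}"
  end
end

# Assumption: 32-bit accumulator, rotated left by 4 bits before each XOR.
def sketch_checksum(str)
  acc = str.bytes.reduce(0) do |sum, byte|
    rotated = ((sum << 4) | (sum >> 28)) & 0xFFFFFFFF
    rotated ^ xor_code(byte)
  end
  acc >= 2**31 ? acc - 2**32 : acc # interpret as a signed 32-bit int
end

puts sketch_checksum('a')
puts sketch_checksum('ab')
```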
I'm trying to convert a Binary file to Hexadecimal using Ruby.
At the moment I have the following:
File.open(out_name, 'w') do |f|
f.puts "const unsigned int modFileSize = #{data.length};"
f.puts "const char modFile[] = {"
first_line = true
data.bytes.each_slice(15) do |a|
line = a.map { |b| ",#{b}" }.join
if first_line
f.puts line[1..-1]
else
f.puts line
end
first_line = false
end
f.puts "};"
end
This is what the code above generates:
const unsigned int modFileSize = 82946;
const char modFile[] = {
116, 114, 97, 98, 97, 108, 97, 115, 104, 0, 0, 0, 0, 0, 0
, 0, 0, 0, 0, 0, 62, 62, 62, 110, 117, 107, 101, 32, 111, 102
, 32, 97, 110, 97, 114, 99, 104, 121, 60, 60, 60, 8, 8, 130, 0
};
What I need is the following:
const unsigned int modFileSize = 82946;
const char modFile[] = {
0x74, 0x72, etc, etc
};
So I need to be able to convert a string to its hexadecimal value.
"116" => "0x74", etc
Thanks in advance.
Ruby also has String#hex, which goes in the opposite direction:
"0x101".hex returns 257, the integer value of the hexadecimal number in the string.
Change this line:
line = a.map { |b| ", #{b}" }.join
to this:
line = a.map { |b| sprintf(", 0x%02X",b) }.join
(Change to %02x if necessary, it's unclear from the example whether the hex digits should be capitalized.)
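Building on the sprintf suggestion, here is a sketch of a helper that formats the whole array body and joins the rows with ",\n", so the first-line special case is no longer needed. The helper name `c_array_body` and the sample input are assumptions made for illustration.

```ruby
# Formats each byte as 0xNN, puts per_line values on a row,
# and separates rows with a comma and a newline.
def c_array_body(data, per_line = 15)
  data.bytes
      .each_slice(per_line)
      .map { |row| row.map { |b| format('0x%02X', b) }.join(', ') }
      .join(",\n")
end

data = 'trabalash'
puts "const unsigned int modFileSize = #{data.length};"
puts 'const char modFile[] = {'
puts c_array_body(data, 4)
puts '};'
```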
I don't know if this is the best solution, but this is a solution:
class String
def to_hex
"0x" + self.to_i.to_s(16)
end
end
"116".to_hex
=> "0x74"
Binary to hex conversion in four languages (including Ruby) might be helpful.
One of the comments on that page seems to provide a very easy shortcut. The example covers reading input from STDIN, but any string representation should do:
STDIN.read.to_i(base=16).to_s(base=2)
For another approach, check out unpack
I have this code to split a string into groups of 3 bytes:
str="hello"
ix=0, iy=0
bytes=[]
tby=[]
str.each_byte do |c|
if iy==3
iy=0
bytes[ix]=[]
tby.each_index do |i|
bytes[ix][i]=tby[i]
end
ix+=1
end
tby[iy]=c
iy+=1
end
puts bytes
I've based it on this example: http://www.ruby-forum.com/topic/75570
However I'm getting type errors from it. Thanks.
ix = 0, iy = 0 translates to ix = [0, (iy = 0)], which is why you get a type error.
However there is a less "procedural" way to do what you want to do:
For ruby 1.8.7+:
"hello world".each_byte.each_slice(3).to_a
#=> [[104, 101, 108], [108, 111, 32], [119, 111, 114], [108, 100]]
For ruby 1.8.6:
require 'enumerator'
"hello world".enum_for(:each_byte).enum_for(:each_slice, 3).to_a
Your problem is the line
ix=0, iy=0
It sets ix to the array [0, 0] and iy to 0. You should replace it with
ix, iy = 0, 0
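A quick way to see the difference between the two forms is to inspect the variables after each assignment. The variable names below are chosen only for this sketch.

```ruby
# Parsed as: wrong_ix = [0, (wrong_iy = 0)]
wrong_ix = 0, wrong_iy = 0
puts wrong_ix.inspect # an array, so wrong_ix += 1 would raise an error
puts wrong_iy.inspect

# Parallel assignment: each variable receives its own value.
ix, iy = 0, 0
puts ix.inspect
puts iy.inspect
```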