Why are my byte arrays not different even though print() says they are? - python-2.x

I am new to Python, so please forgive me if I'm asking a dumb question. In my function I generate a random byte array of a given length, called "input_data"; then I add some bit errors bytewise and store the result in another byte array called "output_data". The print output shows that it works exactly as expected: the bytes differ. But when I compare the byte arrays afterwards, they appear to be identical!
import binascii
import random

def simulate_ber(packet_length, ber, verbose=False):
    # generate input data
    input_data = bytearray(random.getrandbits(8) for _ in xrange(packet_length))
    if(verbose):
        print(binascii.hexlify(input_data)+" <-- simulated input vector")
    output_data = input_data
    # add bit errors
    num_errors = 0
    for byte in range(len(input_data)):
        error_mask = 0
        for bit in range(0,7,1):
            if(random.uniform(0, 1)*100 < ber):
                error_mask |= 1 << bit
                num_errors += 1
        output_data[byte] = input_data[byte] ^ error_mask
    if(verbose):
        print(binascii.hexlify(output_data)+" <-- output vector")
        print("number of simulated bit errors: " + str(num_errors))
    if(input_data == output_data):
        print("data identical")
number of packets: 1
bytes per packet: 16
simulated bit error rate: 5
start simulation...
0d3e896d61d50645e4e3fa648346091a <-- simulated input vector
0d3e896f61d51647e4e3fe648346001a <-- output vector
number of simulated bit errors: 6
data identical
Where is the bug? I am sure the problem is somewhere between my ears...
Thank you in advance for your help!

output_data = input_data
In Python, assignment binds a name to an object rather than copying the object. After the line above, both names refer to the same bytearray in memory, e.g.:
>>> y=['Hello']
>>> x=y
>>> x.append('World!')
>>> x
['Hello', 'World!']
>>> y
['Hello', 'World!']
Construct output_data as a new bytearray, which copies the bytes, and you should be good:
output_data = bytearray(input_data)
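With a real copy, mutating one array no longer affects the other; a quick session to illustrate:
>>> a = bytearray(b'\x00\x01')
>>> b = bytearray(a)
>>> b[0] = 0xff
>>> a
bytearray(b'\x00\x01')
>>> b
bytearray(b'\xff\x01')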

Related

Lua Random number generator always produces the same number

I have looked up several tutorials on how to generate random numbers with Lua, and each said to use math.random(), so I did. However, I get the same number every time: always the lowest possible one. I have tried rewriting the code, and I even included a random seed based on the OS time. Code below.
require "math"
math.randomseed(os.time())
num = math.random(0,10)
print(num)
I'm using the random function like this:
math.randomseed(os.time())
num = math.random() and math.random() and math.random() and math.random(0, 10)
This works fine: the extra math.random() calls discard the first few values, which on some platforms change very little with the seed. Another option would be to improve the built-in random function, described here.
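For comparison, here is the same seed-once-then-draw pattern in a Python sketch; Python's generator does not need the warm-up calls:
import random
import time

random.seed(time.time())      # seed once at program start
print(random.randint(0, 10))  # then draw as many values as needed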
This might help! I had to use these functions to write a class that generates Nano IDs. I basically take the sub-second digits from os.clock() and use them as the seed for math.randomseed().
NanoId = {
  validCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-",
  generate = function (size, validChars)
    local response = ""
    local ms = string.match(tostring(os.clock()), "%d%.(%d+)")
    math.randomseed(ms)
    if (size > 0 and string.len(validChars) > 0) then
      for i = 1, size do
        local num = math.random(string.len(validChars))
        response = response .. string.sub(validChars, num, num)
      end
    end
    return response
  end
}

function NanoId:Generate()
  return self.generate(21, self.validCharacters)
end

-- Runtime Testing
for i = 1, 10 do
  print(NanoId:Generate())
end
--[[
Output:
>>> p2r2-WqwvzvoIljKa6qDH
>>> pMoxTET2BrIjYUVXNMDNH
>>> w-nN7J0RVDdN6-R9iv4i-
>>> cfRMzXB4jZmc3quWEkAxj
>>> aFeYCA2kgOx-s4UN02s0s
>>> xegA--_EjEmcDk3Q1zh7K
>>> 6dkVRaNpW4cMwzCPDL3zt
>>> R2Fct5Up5OwnHeExDnqZI
>>> JwnlLZcp8kml-MHUEFAgm
>>> xPr5dULuv48UMaSTzdW5J
]]

Why do I need to run this function twice to get the expected output?

In SystemVerilog I have a random number generator like this:
task set_rand_value(output bit [7:0] rand_frms, output bit [13:0] rand_bcnt);
  begin
    bit [7:0] rand_num;
    // Set the random value between 2 and 254
    rand_num = $urandom_range(254, 2);
    rand_bcnt = $urandom_range(2047, 1);
    // Check to ensure number is even
    if ((rand_num/2)*2 != rand_num) begin
      rand_num = rand_num + 1;
      $display("Generated %d frames. 8'h%h", rand_num, rand_num);
      $display("Packets generated with 0x%4h bytes.", rand_bcnt);
    end
    // Set the random frame number
    else begin
      rand_frms = rand_num;
      $display("Generated %d frames. 8'h%h", rand_num, rand_num);
      $display("Packets generated with 0x%4h bytes.", rand_bcnt);
    end
  end
endtask // set_rand_value
For some reason, the first time I run this (and the third, and the fifth, etc.) it returns a value of zero even though a number is properly generated.
I am calling it like this:
bit [ 7:0] num_frms;
bit [13:0] size_val;
// Generate random values
set_rand_value(num_frms, size_val); // Run twice to correct initial write error
$display("Error correction for packets sent: 0x%h", num_frms);
set_rand_value(num_frms, size_val);
$display("The number of packets being sent: 0x%h", num_frms);
Which gives me this output:
Generated 214 frames. 8'hd6
Packets generated with 0x05b2 bytes.
Error correction for packets sent: 0x00
Generated 252 frames. 8'hfc
Packets generated with 0x011f bytes.
The number of packets being sent: 0xfc
0x80fc011f // Expected number
I have tried quite a few things to correct it, but for some reason the only way I can get it to behave the way I expect is to just run it twice every time I need to use it.
You only set num_frms if the random number is even. Why not just do
function void set_rand_value(output bit [7:0] rand_frms, output bit [13:0] rand_bcnt);
  // Random even value between 2 and 254: draw 1..127 and double it
  rand_frms = $urandom_range(127, 1) * 2;
  rand_bcnt = $urandom_range(2047, 1);
endfunction
P.S. Only use tasks for routines that may block; this one can be a plain function.
The reason the code above did not work is that rand_frms = rand_num; is only assigned in the else branch of the if condition, so rand_frms only receives a value about 50% of the time.
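As a sanity check outside the simulator, the even-in-[2,254] logic is the same in any language; here is a quick Python sketch of the range trick (illustrative only):
import random

def rand_frames():
    # even value in [2, 254]: draw 1..127, then double
    return random.randrange(1, 128) * 2

print(rand_frames())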

Why is the length of the encrypted numerical value different?

I encrypted numerical values as follows.
> secret = SecureRandom::hex(128)
> encryptor = ::ActiveSupport::MessageEncryptor.new(secret, cipher: 'aes-256-cbc')
> message1 = 1
> message1.size
=> 8
> message1.class
=> Fixnum
> encrypt_message1 = encryptor.encrypt_and_sign(message1)
> encrypt_message1.length
=> 110
> message2 = 10000
> message2.size
=> 8
> message2.class
=> Fixnum
> encrypt_message2 = encryptor.encrypt_and_sign(message2)
> encrypt_message2.length
=> 110
This is the result I expected.
Any number less than 4611686018427387903 is a Fixnum, and a Fixnum's size is 8 bytes.
In addition, the block size of AES is 128 bits (16 bytes).
8 bytes < 16 bytes.
So the encrypted values of 1 and 10000 have the same length.
But in the following case, the length of the encrypted value is different.
> message3 = 1000000000000000000000000000
> message3.size
=> 12
> message3.class
=> Bignum
> encrypt_message3 = encryptor.encrypt_and_sign(message3)
> encrypt_message3.size
=> 138
1000000000000000000000000000 is a Bignum, but its size is 12 bytes, still less than 16 (the AES block size).
So I expected the length of its encrypted value to be the same as in the Fixnum cases.
But they are different...
Why are they different?
There are multiple layers to what is happening here, and you cannot explain it solely from the data size plus the encryption used (i.e. you also have to factor in the transformations that are applied).
Look at: https://github.com/rails/rails/blob/29be3f5d8386fc9a8a67844fa9b7d6860574e715/activesupport/lib/active_support/message_encryptor.rb
and after that look at:
https://github.com/rails/rails/blob/29be3f5d8386fc9a8a67844fa9b7d6860574e715/activesupport/lib/active_support/message_verifier.rb which is used in the encryptor.
There are a few stages:
serializing the data you pass in (done using Marshal.dump if you don't specify any serializer)
base64-encoding the serialized data
generating a digest (i.e. a signature) for the data
encrypting the data plus digest and storing the result, together with the IV from the cipher, in encoded form
If you want to understand the generated encrypted data you basically need to trace through the code above, but:
::Base64.strict_encode64(Marshal.dump(1)).size is 8
::Base64.strict_encode64(Marshal.dump(10000)).size is 8
::Base64.strict_encode64(Marshal.dump(1000000000000000000000000000)).size is 24
But:
Marshal.dump(1).size is 4
Marshal.dump(10000).size is 6
Marshal.dump(1000000000000000000000000000).size is 17
Here is how Marshal.dump works internally: http://jakegoulding.com/blog/2013/01/15/a-little-dip-into-rubys-marshal-format/
Here is how base64 encoding works: https://blogs.oracle.com/rammenon/entry/base64_explained (look at the rules for padding).
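The same serialize-then-encode growth is easy to reproduce in Python, with pickle standing in for Marshal (an analogy, not Rails' actual pipeline):
import base64
import pickle

for n in (1, 10000, 1000000000000000000000000000):
    raw = pickle.dumps(n)        # serialized form, like Marshal.dump
    b64 = base64.b64encode(raw)  # transport encoding, like strict_encode64
    print("%d -> %d bytes serialized, %d bytes base64" % (n, len(raw), len(b64)))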

How to advance past a deflate byte sequence contained in a byte stream?

I have a byte stream that is a concatenation of sections, where each section is composed of a header plus a deflated byte stream.
I need to split this byte stream into sections, but the header only contains information about the data in uncompressed form, with no hint about the compressed data length, so I cannot simply advance through the stream and parse the next section.
So far the only way I found to advance past the deflated byte sequence is to parse it according to this specification. From what I understood by reading the specification, a deflate stream is composed of blocks, which can be compressed blocks or literal blocks.
Literal blocks contain a size header which can be used to easily advance past it.
Compressed blocks are composed of 'prefix codes', bit sequences of variable length that have special meanings to the deflate algorithm. Since I'm only interested in finding the deflated stream's length, I guess the only code I need to look for is '0000000', which according to the specification signals the end of a block.
So I came up with this CoffeeScript function to parse the deflate stream (I'm working on Node.js):
# The job of this function is to return the position
# after the deflate stream contained in 'buffer'. The
# deflated stream begins at 'pos'.
advanceDeflateStream = (buffer, pos) ->
  byteOffset = 0
  finalBlock = false
  while 1
    if byteOffset == 6
      firstTypeBit = 0b00000001 & buffer[pos]
      pos++
      secondTypeBit = 0b10000000 & buffer[pos]
      type = firstTypeBit | (secondTypeBit << 1)
    else
      if byteOffset == 7
        pos++
      type = buffer[pos] & (0b01100000 >>> byteOffset)
    if type == 0
      # Literal block
      # ignore the remaining bits and advance position
      byteOffset = 0
      pos++
      len = buffer.readUInt16LE(pos)
      pos += 2
      lenComplement = buffer.readUInt16LE(pos)
      if (len ^ ~lenComplement)
        throw new Error('Literal block length check fail')
      pos += (2 + len) # Advance past literal block
    else if type in [1, 2]
      # huffman block
      # we are only interested in finding the 'block end' marker
      # which is signaled by the bit string 0000000 (256)
      eob = false
      matchedZeros = 0
      while !eob
        byte = buffer[pos]
        for i in [byteOffset..7]
          # loop the remaining bits looking for 7 consecutive zeros
          if (byte ^ (0b10000000 >>> byteOffset)) >>> (7 - byteOffset)
            matchedZeros++
          else
            # reset counter
            matchedZeros = 0
          if matchedZeros == 7
            eob = true
            break
          byteOffset++
        if !eob
          byteOffset = 0
          pos++
    else
      throw new Error('Invalid deflate block')
    finalBlock = buffer[pos] & (0b10000000 >>> byteOffset)
    if finalBlock
      break
  return pos
To check if this works, I wrote a simple mocha test case:
zlib = require 'zlib'

test 'sample deflate stream', (done) ->
  data = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' # length 30
  zlib.deflate data, (err, deflated) ->
    # deflated.length == 11
    advanceDeflateStream(deflated, 0).should.eql(11)
    done()
The problem is that this test fails and I do not know how to debug it. I will accept any answer that points out what I missed in the parsing algorithm or that contains a corrected version of the above function in any language.
The only way to find the end of a deflate stream, or even of a deflate block, is to decode all of the Huffman codes contained within it. There is no bit pattern you can search for that cannot also appear earlier in the stream.
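If pulling in a real inflater is an option, you can let it do that decoding and report the boundary for you. Below is a minimal Python sketch (assuming the buffer begins with one complete zlib-wrapped stream, which is what Node's zlib.deflate produces): after decompression, everything past the end of the stream is left in unused_data, so the stream length falls out by subtraction.
import zlib

def advance_deflate_stream(buffer, pos):
    # Inflate from 'pos'; the decompressor stops at end-of-stream
    # and keeps any trailing bytes in unused_data.
    d = zlib.decompressobj()
    d.decompress(buffer[pos:])
    return len(buffer) - len(d.unused_data)

data = zlib.compress(b'a' * 30) + b'...next section...'
print(advance_deflate_stream(data, 0))  # index just past the deflate stream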

Determining All Possibilities for a Random String?

I was hoping someone with better math capabilities would assist me in figuring out the total number of possibilities for a string, given its length and character set.
i.e. [a-f0-9]{6}
What are the possibilities for this pattern of random characters?
It is equal to the number of characters in the set raised to the 6th power.
In Python (3.x) interpreter:
>>> len("0123456789abcdef")
16
>>> 16**6
16777216
>>>
EDIT 1:
Why 16.7 million? Well, 000000 ... 999999 gives 10^6 = 1M combinations, and widening each digit from 10 to 16 choices scales that by (16/10)^6 = 1.6^6:
>>> 1.6**6
16.77721600000000
EDIT 2:
To create a list in Python, do: print(['{0:06x}'.format(i) for i in range(16**6)])
However, this is too huge. Here is a simpler, shorter example:
>>> ['{0:06x}'.format(i) for i in range(100)]
['000000', '000001', '000002', '000003', '000004', '000005', '000006', '000007', '000008', '000009', '00000a', '00000b', '00000c', '00000d', '00000e', '00000f', '000010', '000011', '000012', '000013', '000014', '000015', '000016', '000017', '000018', '000019', '00001a', '00001b', '00001c', '00001d', '00001e', '00001f', '000020', '000021', '000022', '000023', '000024', '000025', '000026', '000027', '000028', '000029', '00002a', '00002b', '00002c', '00002d', '00002e', '00002f', '000030', '000031', '000032', '000033', '000034', '000035', '000036', '000037', '000038', '000039', '00003a', '00003b', '00003c', '00003d', '00003e', '00003f', '000040', '000041', '000042', '000043', '000044', '000045', '000046', '000047', '000048', '000049', '00004a', '00004b', '00004c', '00004d', '00004e', '00004f', '000050', '000051', '000052', '000053', '000054', '000055', '000056', '000057', '000058', '000059', '00005a', '00005b', '00005c', '00005d', '00005e', '00005f', '000060', '000061', '000062', '000063']
>>>
EDIT 3:
As a function:
def generateAllHex(numDigits):
    assert(numDigits > 0)
    ceiling = 16**numDigits
    formatStr = '{0:0' + str(numDigits) + 'x}'
    for i in range(ceiling):
        print(formatStr.format(i))
This will take a while to print at numDigits = 6.
I recommend dumping this to a file instead, like so:
def generateAllHex(numDigits, fileName):
    assert(numDigits > 0)
    ceiling = 16**numDigits
    formatStr = '{0:0' + str(numDigits) + 'x}'
    with open(fileName, 'w') as fout:
        for i in range(ceiling):
            fout.write(formatStr.format(i) + '\n')  # newline keeps values separated
If you are just looking for the number of possibilities, the answer is (charset.length)^(length). If you need to actually generate a list of the possibilities, just loop through each character, recursively generating the remainder of the string.
e.g.
void generate(char[] charset, int length)
{
    generate("", charset, length);
}

void generate(String prefix, char[] charset, int length)
{
    for (int i = 0; i < charset.length; i++)
    {
        if (length == 1)
            System.out.println(prefix + charset[i]);
        else
            generate(prefix + charset[i], charset, length - 1);
    }
}
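For comparison, a compact Python version of the same enumeration, using itertools.product instead of explicit recursion:
import itertools

def generate(charset, length):
    # print every string of the given length over the charset
    for combo in itertools.product(charset, repeat=length):
        print(''.join(combo))

generate('01', 2)  # prints 00, 01, 10, 11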
The number of possibilities is the size of your alphabet raised to the power of the size of your string (in the general case, of course).
Assuming your string size is 4 (_ _ _ _) and your alphabet = { 0, 1 }:
there are 2 possibilities for the first place, 2 for the second, and so on.
So it all multiplies out to: alphabet_size^string_size
first: 000000
last: ffffff
The pattern simply enumerates the six-digit hexadecimal numbers, 0 through 16^6 - 1.
For any given set of possible values, the number of distinct strings is the number of possibilities raised to the power of the number of items.
In this case, that is 16 to the 6th power, or 16777216 possibilities.
