Bitwise convert a random number to FP64 with values between 0 and 1 - random

I am writing a laboratory data processing script in Excel VBA.
The idea is to obtain better random numbers using RtlGenRandom.
The difficulty is that RtlGenRandom fills a byte array, byte by byte.
So, to save time and keep the "randomish" distribution, I want to apply a bitwise mask that reduces the number to a value in the range [0, 1].
Below is a pseudocode explanation of what I want to do, regardless of the exact implementation:
longlong buffer
longlong mask
double rand_val
RtlGenRandom(buffer,8)
buffer=not(buffer or mask)
rand_val=typeless_copy(rand_val, buffer)
So I got a little lost as to what the value of that mask should be for the said truncation of values.
Bitwise, or in 0xHex.
I thought it should be 0xC000 0000 0000 0000, but something is wrong.
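One commonly used approach to this kind of conversion, sketched here in C rather than VBA and not taken from the question itself: keep the low 52 random bits as the mantissa, force the sign and exponent bits to 0x3FF (the exponent of 1.0), which gives a value in [1, 2), and then subtract 1.0. The function name and the two constants' names are illustrative assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: maps 64 random bits to a double in [0, 1).
 * Assumes IEEE 754 binary64 doubles. */
double bits_to_unit_double(uint64_t random_bits)
{
    const uint64_t mantissa_mask = 0x000FFFFFFFFFFFFFULL; /* low 52 bits */
    const uint64_t one_exponent  = 0x3FF0000000000000ULL; /* sign 0, exponent of 1.0 */

    /* Keep the random mantissa, overwrite sign and exponent. */
    uint64_t pattern = (random_bits & mantissa_mask) | one_exponent;

    double value;
    memcpy(&value, &pattern, sizeof value); /* the "typeless copy" */
    return value - 1.0;                      /* [1, 2) -> [0, 1) */
}
```

Note that this yields values in [0, 1), never exactly 1, and uses only 52 of the 64 random bits.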

Related

Probability of a collision using 32 bit CRC of a unique 32 byte array

I am trying to figure out whether using a 32-bit CRC will produce collisions on a 32-byte array.
Background
My system reads some configuration from an external flash whenever it boots up. I store the SHA256 hash of the last known configuration, and whenever I read the configuration I calculate the SHA256 hash and compare it. If the two hashes are different, then the data is different.
I need to take that SHA256 and make it into a 32bit hash for another part of the system (due to some legacy code restrictions).
Questions
Will there be a high number of collisions if I compute the 32-bit CRC on the 32-byte hash from SHA256?
I calculate the probability of collision to be 0. Can you let me know if this is correct?
The number of sample K is always 2 in my problem (I think) because I am calculating 32 bit CRC on two 32 bytes byte array (SHA256 byte array).
see calculation here
That's correct, if by "0" you mean that very small number. That small number is the probability that you would get a 32-bit CRC from random data that accidentally matches what you were expecting. It is simply 2^-32.
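The figure can be checked with a small birthday-style calculation. This is a minimal sketch, assuming the CRC values behave like uniformly random 32-bit numbers; the function name is illustrative.

```c
/* Probability that at least two of `samples` uniformly random values
 * drawn from a space of `hash_space` values coincide.
 * Computed as 1 - product over i of (1 - i / hash_space). */
double collision_probability(unsigned samples, double hash_space)
{
    double no_collision = 1.0;
    for (unsigned i = 1; i < samples; i++) {
        no_collision *= 1.0 - (double)i / hash_space;
    }
    return 1.0 - no_collision;
}
```

With samples = 2 and hash_space = 4294967296.0 (2^32) this gives 1/2^32, about 2.3e-10, which matches the answer above.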

How are code lengths limited to 16 bits at maximum in JPEG?

According to ITU T.81, which describes the JPEG format, BITS stores the "code length counts". Its creation is described in Annex K, Figure K.2 of the specification. The JPEG specification expects that symbols may exist that require Huffman codes up to 32 bits in length while encoding is being carried out. However, it limits the Huffman code length to 16 bits at maximum when data is encoded. For this purpose the code lengths must be limited to 16 bits. The procedure for this is contained in Annex K, Figure K.3, shown below:
My question is: will BITS have negative values as well when we do BITS(I)-2 and BITS(I)-1? Does it have to be declared as signed? If so, what meaning do negative values have? I have implemented this in code, but it gives me negative values. Some images encode just fine, but for others, where BITS has to be manipulated down to 16 bits, the images always get corrupted.
As I understand it, negative values should be fine since those are i=[17,32] values, which are not used once you are done reducing it to 16 bits. The algorithm assumes signed math, notice the BITS(i) > 0 condition, negative values will fall through the "No" branch and eventually end after dealing with BITS(17).
In your implementation, I think you could use unsigned math if you really want to and just clamp the underflow to 0 (Naively, something like BITS(i) = BITS(i) > 2 ? BITS(i) - 2 : 0).
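For comparison, here is a sketch of the length-limiting step as I read Figure K.3, including its final step that removes the reserved code point. It is an assumption-laden illustration, not the asker's code: bits[1..32] are the code-length counts (bits[0] is unused), and the input is assumed to describe a complete Huffman tree, so codes at the deepest level come in pairs.

```c
#define MAX_CODE_LEN 32

/* Limits code lengths to 16 bits. bits[i] = number of codes of length i. */
void limit_code_lengths(int bits[MAX_CODE_LEN + 1])
{
    int i, j;

    for (i = MAX_CODE_LEN; i > 16; i--) {
        while (bits[i] > 0) {
            j = i - 2;                /* find a shorter length that is in use */
            while (bits[j] == 0) {
                j--;
            }
            bits[i] -= 2;             /* remove two codes of length i */
            bits[i - 1] += 1;         /* one of them moves up to length i-1 */
            bits[j + 1] += 2;         /* two codes of length j+1 replace ... */
            bits[j] -= 1;             /* ... one code of length j */
        }
    }

    /* Remove the count for the reserved code point from the longest length. */
    while (bits[i] == 0) {
        i--;
    }
    bits[i] -= 1;
}
```

Under the stated assumption the counts in this sketch do not go negative, so seeing negative values may point at an input table that is not a complete tree or at an indexing slip.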

Actual length of input vector in VHDL

I am running HDL code written in VHDL, and I have an input vector with a maximum length of 512 bits. Some of my inputs are shorter than the maximum size. So I want to know whether there is a way to find the actual length of every input, in order to cut the unwanted zeros at the most significant bits of the input vector. Is there any possible way to do this?
I guess you are looking for an unambiguous padding method for your data. What I would recommend in your case is an adaptation of the ISO/IEC 9797-1 padding method 2, as follows:
For every input (even if it already has 512 bits), you add a leading '1' bit. Then you add leading '0' bits (possibly none) to fill up your vector.
To implement this scheme you would have to enlarge your input vector to 513 bits (because you always have to add at least one bit).
To remove the padding, you simply go through the vector starting at the MSB and find the first '1' bit, which marks the end of your padding pattern.
Example (for 8+1 bit):
input: 10101
padded: 0001 10101
input: 00000000
padded: 1 00000000
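The scheme can be modeled in software before writing the VHDL. The following is a small C sketch for short vectors only; the function names are illustrative, and it assumes a length of at most 30 bits and a non-zero padded word.

```c
#include <stdint.h>

/* Adds the marker '1' bit directly above the `length` data bits.
 * All higher bits stay 0 and act as the fill. */
uint32_t pad_bits(uint32_t value, unsigned length)
{
    return (1u << length) | value;
}

/* Finds the marker bit (the most significant '1'), stores the data bits
 * below it in *value, and returns the original length. */
unsigned unpad_bits(uint32_t padded, uint32_t *value)
{
    unsigned length = 0;
    while ((padded >> (length + 1)) != 0) {
        length++;
    }
    *value = padded & ((1u << length) - 1u);
    return length;
}
```

For the first example above, pad_bits(0x15, 5) gives 0x35 (binary 0001 10101), and unpad_bits recovers length 5 and value 0x15.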

Fixed Point Multiplication for FFT

I’m writing a Radix-2 DIT FFT algorithm in VHDL, which requires some fractional multiplication of input data by a Twiddle Factor (TF). I use fixed-point arithmetic to achieve that, with every word being 16 bits long, where 1 bit is a sign bit and the rest is distributed between integer and fraction. Hence my dilemma:
I have no idea, in what range my input data will be, so if I just decide that 4 bits go to integer and the rest 11 bits to fraction, in case I get integer numbers higher than 4 bits = 15 decimal, I’m screwed. The same applies if I do 50/50, like 7 bits to integer and the rest to fraction. If I get numbers, which are very small, I’m screwed because of truncation or rounding, i.e:
Let’s assume I have an integer "3"(0000 0011) on input and TF of "0.7071" ( 0.10110101 - 8 bit), and let’s assume, for simplicity, my data is 8 bit long, therefore:
3x0.7071 = 2.1213
3x0.7071 = 0000 0010 . 0001 1111 = 2.12109375 (for 16 bits).
Here comes the trick - I need to up/down round or truncate 16 bits to 8 bits, therefore, I get 0000 0010, i.e 2 - the error is way too high.
My questions are:
How would you solve this problem of range vs precision if you don’t know the range of your input data AND you would have numbers represented in fixed point?
Should I make a process, which decides after every multiplication where to put the comma? Wouldn’t it make the multiplication slower?
The Xilinx IP Core has 3 different modes of fixed-point arithmetic: Unscaled (similar to what I want to do, just truncate in case overflow happens), Scaled fixed point (I would assume that in that case it decides after each multiplication where the comma should be and what should be rounded) and Block Floating Point (no idea what it is or how it works, I would appreciate an explanation). So how does this IP Core decide where to put the comma? If the decision is made depending on the highest value in my dataset, then in case I have just 1 high peak and the rest of the data is low, the error will be very high.
I will appreciate any ideas or information on any known methods.
You don't need to know the fixed-point format of your input. You can safely treat it as normalized -1 to 1 range or full integer-range.
The reason is that your output will have the same format as the input. Or, more likely for an FFT, a known relationship such as a 3-bit increase, which would mean the output has 3 more integer bits than the input.
It is the core user's burden to know where the decimal point will end up. You have to document the change in dynamic range, of course.
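The point about a known relationship between input and output formats can be seen in the question's own numbers. Below is a minimal C sketch, assuming unsigned values for simplicity (the question uses a sign bit): multiplying an 8-bit integer by an 8-bit fraction (Q0.8) gives a 16-bit result whose binary point sits 8 bits from the right (Q8.8). The function name is illustrative.

```c
#include <stdint.h>

/* Multiplies an 8-bit integer by a Q0.8 fraction.
 * The raw product is a Q8.8 value: the multiplier itself does not need to
 * know where the binary point is, only the user of the result does. */
uint16_t mul_int8_by_q0_8(uint8_t integer_value, uint8_t fraction_q0_8)
{
    /* 255 * 255 = 65025 fits in 16 bits, so nothing is lost here. */
    return (uint16_t)((uint16_t)integer_value * (uint16_t)fraction_q0_8);
}
```

With 3 and 181 (0.10110101 binary, about 0.7071) the raw product is 543, i.e. 543/256 = 2.12109375 when read as Q8.8. The error only appears when the low 8 bits are dropped, which is a choice made by whoever consumes the result.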

Pseudorandom generator in Assembly Language

I need a pseudorandom number generator algorithm for an assembler program assigned in a course, and I would prefer a simple algorithm. However, I cannot use an external library.
What is a good, simple pseudorandom number generator algorithm for assembly?
An easy one is to just choose two big relative primes a and b, then keep multiplying your random number by a and adding b. Use the modulo operator to keep the low bits as your random number and keep the full value for the next iteration.
This algorithm is known as the linear congruential generator.
Volume 2 of The Art of Computer Programming has a lot of information about pseudorandom number generation. The algorithms are demonstrated in assembler, so you can see for yourself which are simplest in assembler.
If you can link to an external library or object file, though, that would be your best bet. Then you could link to, e.g., Mersenne Twister.
Note that most pseudorandom number generators are not safe for cryptography, so if you need secure random number generation, you need to look beyond the basic algorithms (and probably should tap into OS-specific crypto APIs).
Simple code for testing; don't use it for crypto.
From Testing Computer Software, page 138.
With 32-bit maths, you don't need the MOD 2^32 operation:
RNG = (69069*RNG + 69069) MOD 2^32
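As a sketch of how the formula above behaves with 32-bit arithmetic, here it is in C (the function name is illustrative). Unsigned 32-bit overflow wraps around, which is exactly the MOD 2^32 step, so no explicit modulo is needed.

```c
#include <stdint.h>

/* One step of RNG = (69069*RNG + 69069) MOD 2^32.
 * uint32_t arithmetic wraps modulo 2^32 by definition. */
uint32_t lcg_next(uint32_t state)
{
    return 69069u * state + 69069u;
}
```

In assembler this corresponds to a 32-bit multiply where the high half of the product is simply ignored, followed by an add.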
Well, since I haven't seen a reference to the good old Linear Feedback Shift Register, I'll post some SSE-intrinsic-based C code, just for completeness. I wrote that thing a couple of months ago to sharpen my SSE skills again.
#include <emmintrin.h>

static __m128i LFSR;

void InitRandom (int Seed)
{
    LFSR = _mm_cvtsi32_si128 (Seed);
}

int GetRandom (int NumBits)
{
    __m128i seed = LFSR;
    __m128i one = _mm_cvtsi32_si128(1);
    __m128i mask;
    int i;
    for (i=0; i<NumBits; i++)
    {
        // generate xor of adjacent bits
        __m128i temp = _mm_xor_si128(seed, _mm_srli_epi64(seed,1));
        // generate xor of feedback bits 5,6 and 62,61
        __m128i NewBit = _mm_xor_si128( _mm_srli_epi64(temp,5),
                                        _mm_srli_epi64(temp,61));
        // Mask out single bit:
        NewBit = _mm_and_si128 (NewBit, one);
        // Shift & insert new result bit:
        seed = _mm_or_si128 (NewBit, _mm_add_epi64 (seed,seed));
    }
    // Write back seed...
    LFSR = seed;
    // generate mask of NumBits ones.
    mask = _mm_srli_epi64 (_mm_cmpeq_epi8(seed, seed), 64-NumBits);
    // return random number:
    return _mm_cvtsi128_si32 (_mm_and_si128(seed,mask));
}
Translating this code to assembler is trivial. Just replace the intrinsics with the real SSE instructions and add a loop around it.
Btw, the sequence this code generates repeats after 4.61169E+18 numbers. That's a lot more than you'll get via the prime method and 32-bit arithmetic. If unrolled, it's faster as well.
#jjrv
What you're describing is actually a linear congruential generator. The most random bits are the highest bits. To get a number from 0..N-1 you multiply the full value by N (32 bits by 32 bits giving 64 bits) and use the high 32 bits.
You shouldn't just use any number for a (the multiplier for progressing from one full value to the next), the numbers recommended in Knuth (Table 1 section 3.3.4 TAOCP vol 2 1981) are 1812433253, 1566083941, 69069 and 1664525.
You can just pick any odd number for b. (the addition).
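A minimal C sketch of both points: one LCG step using one of the multipliers listed above, and the multiply-and-take-the-high-half range reduction. The function names and the increment of 1 (an arbitrary odd number) are assumptions made for illustration.

```c
#include <stdint.h>

/* One LCG step: multiplier 1664525 from the list above,
 * increment 1 chosen arbitrarily as an odd number. */
uint32_t lcg_step(uint32_t state)
{
    return 1664525u * state + 1u;
}

/* Maps a full 32-bit value to 0..n-1 using the high 32 bits of the
 * 64-bit product, so the result depends mostly on the high bits. */
uint32_t scale_to_range(uint32_t full_value, uint32_t n)
{
    return (uint32_t)(((uint64_t)full_value * n) >> 32);
}
```

On x86, a 32-bit MUL leaves the high half of the product in EDX, so the range reduction costs a single instruction in assembler.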
Why not use an external library??? That wheel has been invented a few hundred times, so why do it again?
If you need to implement an RNG yourself, do you need to produce numbers on demand -- i.e. are you implementing a rand() function -- or do you need to produce streams of random numbers -- e.g. for memory testing?
Do you need an RNG that is crypto-strength? How long does it have to go before it repeats? Do you have to absolutely, positively guarantee uniform distribution of all bits?
Here's a simple hack I used several years ago. I was working in embedded, I needed to test RAM on power-up, and I wanted really small, fast code and very little state, so I did this:
Start with an arbitrary 4-byte constant for your seed.
Compute the 32-bit CRC of those 4 bytes. That gives you the next 4 bytes.
Feed back those 4 bytes into the CRC32 algorithm, as if they had been appended. The CRC32 of those 8 bytes is the next value.
Repeat as long as you want.
This takes very little code (although you need a table for the crc32 function) and has very little state, but the pseudorandom output stream has a very long cycle time before it repeats. Also, it doesn't require SSE on the processor. And assuming you have the CRC32 function handy, it's trivial to implement.
Using masm615 to assemble:
delay_function macro
    mov cx,0ffffh
    .repeat
        push cx
        mov cx,0f00h
        .repeat
            dec cx
        .until cx==0
        pop cx
        dec cx
    .until cx==0
endm
random_num macro
    mov cx,64           ;assume we want to get 64 random numbers
    mov si,0
get_num:
    push cx
    delay_function      ;the CPU clock is fast, so wait for the timer to change
    mov ah,2ch
    int 21h             ;DOS get system time: DL = 1/100 sec
    mov al,dl
    xor ah,ah           ;AX = 0..99, so the 8-bit quotient of div cannot overflow
    div num             ;num is a byte; remainder 0..num-1 is left in AH
    mov arry[si],ah     ;save to the array you set up
    inc si
    pop cx
    loop get_num        ;here we finish getting the random numbers
endm
Also, you can probably emulate a shift register that XORs selected bits together to form the feedback bit, which will give you a pseudo-random sequence of numbers.
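A sketch of such a shift register in C, using the commonly cited 16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11. The tap choice and the function name are assumptions for illustration and are not taken from the answer above.

```c
#include <stdint.h>

/* One step of a 16-bit Fibonacci LFSR (polynomial x^16 + x^14 + x^13 + x^11 + 1).
 * The state must never be 0, otherwise the register stays at 0 forever. */
uint16_t lfsr_step(uint16_t state)
{
    /* XOR the tapped bits to form the new bit. */
    uint16_t new_bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1u;
    /* Shift right and insert the new bit at the top. */
    return (uint16_t)((state >> 1) | (new_bit << 15));
}
```

Starting from any non-zero state, this register visits all 65535 non-zero 16-bit values before repeating. In assembler it needs only shifts, XORs and an AND.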
Linear congruential (X = AX+C mod M) PRNGs might be a good choice to assign for an assembler course, as your students will have to deal with carry bits for intermediate AX results over 2^31 and with computing a modulus. If you are the student, they are fairly straightforward to implement in assembler and may be what the lecturer had in mind.

Resources