I'm currently converting a Visual Basic application to Ruby because we're moving it to the web. However, when converting some of the algorithms I've run into a problem concerning bit shifting.
As I understand it, the problem lies in the size mask VB enforces on Integer types (as explained here). Ruby, in practice, doesn't make this distinction.
So the problem:
Visual Basic
Dim i As Integer = 182
WriteLine(i << 24) '-1241513984
Ruby
puts 182 << 24 # 3053453312
I've been Googling and reading up on bit shifting for the last few hours but haven't found a way, or even a direction, to tackle this problem.
You need to replicate what Visual Basic is doing, namely:
mask the shift amount as documented,
mask the result with 0xFFFFFFFF (since Ruby will have promoted the value to a Bignum for you),
if the topmost bit is set, subtract 2^32 from the result (since signed integers are stored in two's complement).
For example
def shift_32(x, shift_amount)
  shift_amount &= 0x1F              # VB masks the shift count to its low 5 bits
  x <<= shift_amount
  x &= 0xFFFFFFFF                   # keep only the low 32 bits
  if (x & (1 << 31)).zero?
    x
  else
    x - 2**32                       # reinterpret as a signed 32-bit value
  end
end
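To cross-check the numbers from the question, here is the same masking arithmetic in Python (which, like Ruby, has arbitrary-precision integers); the helper name vb_int32_shl is mine, just for illustration:

def vb_int32_shl(x, shift_amount):
    shift_amount &= 0x1F                    # VB masks the shift count to its low 5 bits
    x = (x << shift_amount) & 0xFFFFFFFF    # keep only the low 32 bits
    return x - 2**32 if x & (1 << 31) else x

print(182 << 24)               # 3053453312, the unbounded Ruby/Python result
print(vb_int32_shl(182, 24))   # -1241513984, matching the VB output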
Related
Suppose I have a 64-bit unsigned integer (u64) mask with one or more bits set.
I want to select one of the set bits uniformly at random from mask to give a new mask x such that x & mask has exactly one bit set. Some pseudocode that does this might be:
def uniform_random_bit_from_mask(mask):
    assert mask > 0
    set_indices = get_set_indices(mask)
    random_index = uniform_random_choice(set_indices)
    new_mask = set_bit(random_index, 0)
    return new_mask
However I need to do this as fast as possible (code similar to the above in a low-level language is slowing a hot loop). Does anyone have a more efficient scheme?
The details of how to optimize this depend on several factors you did not specify: the target architecture, the expected number of set bits in the mask, the language you want to use, the requirements on the randomness, and more. Without knowing further details, it's hard to give a useful answer, but I'll give a few hints that may prove useful anyway.
Most modern architectures have an instruction to count the number of set bits in an integer, generally called "popcount", and this instruction is exposed in most low-level languages. In Rust, you can use the count_ones() method. This gives you the total number k of bits to select from.
You can then generate a random number i between 0 and k - 1 (inclusive). The next step is to select the ith set bit in mask. An efficient approach to do so is this loop (Rust code):
for _ in 0..i {
    mask &= mask - 1;
}
let new_mask = 1 << mask.trailing_zeros();
The loop clears the least significant set bit in each iteration. Since i < k, we know that mask can't be zero after the loop. The last line generates a new mask from the least significant bit of mask that is still set.
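Putting the pieces together in plain Python for illustration (the function name uniform_random_set_bit is mine, and random.randrange stands in for whatever generator the hot loop actually uses):

import random

def uniform_random_set_bit(mask: int) -> int:
    # Return a mask with exactly one of mask's set bits, chosen uniformly.
    assert mask > 0
    k = bin(mask).count("1")     # popcount; count_ones() in Rust
    i = random.randrange(k)      # uniform index in 0..k-1
    for _ in range(i):
        mask &= mask - 1         # clear the least significant set bit
    return mask & -mask          # isolate the lowest bit that is still set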
On common architectures, it is likely that the bottleneck will be the random number generator. If you are using Rust's rand crate, you can use SmallRng for improved performance, at the cost of being cryptographically insecure, which may not be relevant for your use case.
The famous linear congruential random number generator, also known as the "minimal standard" generator, uses the formula
x(i+1) = 16807 * x(i) mod (2^31 - 1)
I want to implement this using Fortran.
However, as pointed out by "Numerical Recipes", directly implementing the formula with the default integer type (32-bit) will cause 16807*x(i) to overflow.
So the book recommends Schrage's algorithm, which is based on an approximate factorization of m. This method can still be implemented with the default integer type.
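For reference, a rough Python sketch of what Schrage's factorization looks like (the constants are the standard ones for a = 16807 and m = 2^31 - 1; the function name is mine, not the book's):

A = 16807
M = 2**31 - 1      # 2147483647
Q = M // A         # 127773
R = M % A          # 2836

def schrage_step(x):
    # Compute (A * x) mod M without any intermediate value exceeding M,
    # which is what makes it safe for 32-bit integers.
    hi, lo = divmod(x, Q)
    x = A * lo - R * hi            # both products stay below M because R < Q
    return x if x >= 0 else x + M

assert all(schrage_step(x) == (A * x) % M for x in (1, 12345, M - 1))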
However, Fortran actually has an Integer(8) type, whose range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 is much bigger than 16807*x(i) could ever be.
Yet the book even says the following:
It is not possible to implement equations (7.1.2) and (7.1.3) directly
in a high-level language, since the product of a and m − 1 exceeds the
maximum value for a 32-bit integer.
So why can't we just use Integer(8) type to implement the formula directly?
Whether or not you can have 8-byte integers depends on your compiler and your system. What's worse is that the actual value to pass to kind to get a specific precision is not standardized. While most Fortran compilers I know use the number of bytes (so 8 would be 64 bit), this is not guaranteed.
You can use the selected_int_kind function to get an integer kind that covers a certain decimal range. This code compiles on my 64-bit computer and works fine:
program ran
    implicit none
    integer, parameter :: i8 = selected_int_kind(R=18)
    integer(kind=i8) :: x
    integer :: i

    x = 100
    do i = 1, 100
        x = my_rand(x)
        write(*, *) x
    end do

contains

    function my_rand(x)
        implicit none
        integer(kind=i8), intent(in) :: x
        integer(kind=i8) :: my_rand
        my_rand = mod(16807_i8 * x, 2_i8**31 - 1)
    end function my_rand

end program ran
Update and explanation of #VladimirF's comment below
Modern Fortran delivers an intrinsic module called iso_fortran_env that supplies constants that reference the standard variable types. In your case, one would use this:
program ran
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none
    integer(kind=int64) :: x
and then as above. This code is easier to read than the old selected_int_kind. (Why did R have to be 18 again?)
Yes. The simplest thing is to append _8 to the integer constants to make them 8 bytes. I know it is "old style" Fortran, but it is portable and unambiguous.
By the way, when you write:
16807*x mod (2^31-1)
be careful not to replace the mod with a plain bit mask: iand(16807_8*x, Z'7FFFFFFF') keeps only the low 31 bits, which reduces modulo 2^31, not modulo 2^31 - 1.
Because the modulus is the Mersenne prime 2^31 - 1, you can still avoid the comparatively expensive mod by folding the high bits of the product back in and subtracting the modulus once if needed:
t = 16807_8 * x
t = iand(t, 2147483647_8) + ishft(t, -31)
if (t >= 2147483647_8) t = t - 2147483647_8
Update after comment:
Use the decimal constant 2147483647_8 rather than the BOZ literal Z'7FFFFFFF' if your compiler refuses BOZ constants as arguments to iand.
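A quick sanity check of that fold in plain Python (arbitrary-precision integers, so overflow is not a concern; this only verifies the arithmetic):

M = 2**31 - 1

def fold_mod_m(t):
    # 2**31 is congruent to 1 modulo M, so the bits above bit 30 fold back in
    t = (t & M) + (t >> 31)
    return t - M if t >= M else t

assert all(fold_mod_m(16807 * x) == (16807 * x) % M for x in (1, 127773, M - 1))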
I was going through the Go tutorial on golang.org and I came across an example that I only partially understand...
MaxInt uint64 = 1<<64 - 1
Now I understand this to be shifting the bit 64 places to the left, which would make it a 1 followed by 64 zeroes.
My question is: why is this the maximum integer that can be achieved in a 64-bit number? Wouldn't the maximum be 111111... (64 ones) rather than 1000... (a 1 followed by 64 zeroes)?
What happens here, step by step:
Take 1.
Shift it to the left by 64 bits. This is the tricky part: the result actually needs 65 bits for representation, namely a 1 followed by 64 zeroes. Since we are calculating a 64-bit value here, why does this even compile instead of overflowing to 0 or 1, or producing a compile error?
It works because the arithmetic used to calculate constants in Go is a bit magic (https://blog.golang.org/constants) in that it has nothing whatsoever to do with the type of the named constant being calculated. You can say foo uint8 = 1<<415 / 1<<414 and foo is now 2.
Subtract 1. This brings us back into 64-bit range, as the result is 11....1 (64 ones), which is indeed the maximum value of uint64. Without this step, the compiler would complain about trying to cram a 65-bit value into a uint64.
Name the constant MaxInt and give it type uint64. Success!
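The same arithmetic is easy to check in Python, whose integers are arbitrary precision much like Go's untyped constants (this is only an illustration of the numbers, not Go code):

max_uint64 = (1 << 64) - 1       # 1 shifted left 64 places, minus 1

print(max_uint64)                # 18446744073709551615
print(hex(max_uint64))           # 0xffffffffffffffff
print(format(max_uint64, "b"))   # sixty-four 1 bits, not a 1 followed by zeroes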
The magic arithmetic used to calculate constants still has limitations (obviously). Shifts greater than 500 or so produce errors with the amusing name "stupid shift".
I am currently working on a framework which transforms C to VHDL, and I am stuck on the implementation of long long division. My framework can only work on 32-bit variables, so parsing a C long long variable results in two VHDL variables, one containing the most significant part and one containing the least significant part. So, to sum up, from this:
long long a = 1LL;
The generated VHDL will be something like:
var30 <= 00000000000000000000000000000000;
var31 <= 00000000000000000000000000000001;
Now my problem is: how can I divide two long long parameters (in VHDL), given that each is split across two variables? I had no problem with addition/subtraction, since I can work on the most (resp. least) significant part independently (just a carry to propagate), but I really don't see how I could perform a division, since for this kind of operation the least and most significant parts are tightly bound together... If someone has an idea, it would be much appreciated.
PS: I have the same problem for multiplication.
EDIT: I work on both signed and unsigned variables, and the result should be a 64-bit variable.
For both the multiplication and the division problem you can break things down like this: each 64-bit value x can be expressed as k*x.hi + x.lo, where x.hi is the upper 32 bits, x.lo is the lower 32 bits, and k = 2^32. So for multiplication:
a*b = (a.hi*k+a.lo)*(b.hi*k+b.lo)
= a.hi*b.hi*k*k + (a.hi*b.lo + a.lo*b.hi)*k + a.lo*b.lo
If you just want a 64 bit result then the first term disappears and you get:
a*b = (a.hi*b.lo + a.lo*b.hi)*k + a.lo*b.lo
Remember that, in general, multiplication doubles the number of bits, so each 32-bit x 32-bit multiply in the above expressions generates a 64-bit term. For the cross terms (the first two products in the expression above) you only need the low 32 bits, but for the last term, a.lo*b.lo, you need both the low and high 32 bits.
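To make the bookkeeping concrete, here is a small Python model of the low 64 bits of the product built only from 32-bit halves (the helper name mul64_lo is mine; Python's big integers stand in for the hardware 32x32 -> 64 multiplies):

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def mul64_lo(a, b):
    # Low 64 bits of a*b, from the expansion above; the a.hi*b.hi*k*k term is dropped.
    a_hi, a_lo = (a >> 32) & MASK32, a & MASK32
    b_hi, b_lo = (b >> 32) & MASK32, b & MASK32
    cross = (a_hi * b_lo + a_lo * b_hi) & MASK32   # only the low 32 bits survive the *k shift
    return ((cross << 32) + a_lo * b_lo) & MASK64

# Spot check against Python's native big-integer multiply.
for a, b in ((3, 5), (0xDEADBEEF, 0x123456789ABCDEF0), (MASK64, MASK64)):
    assert mul64_lo(a, b) == (a * b) & MASK64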
Devise a simple algorithm which creates a file which contains nothing but its own checksum.
Let's say it is CRC-32, so this file must be 4 bytes long.
There might be some smart mathematical way of finding it out (or proving that none exists), if you know how the algorithm works.
But since I'm lazy and CRC32 has only 2^32 values, I would brute force it. While waiting for the algorithm to go through all 2^32 values, I would use Google and Stack Overflow to find whether somebody has a solution to it.
In case of SHA-1, MD5 and other more-or-less cryptographically secure algorithms, I would get intimidated by the mathematicians who designed those algorithms and just give up.
EDIT 1: Brute forcing... So far I've found one: CC4FBB6A in big-endian encoding. There might still be more. I'm checking 4 different encodings: ASCII uppercase and lowercase, and binary big-endian and little-endian.
EDIT 2: Brute force done. Here are the results:
CC4FBB6A (big-endian)
FFFFFFFF (big-endian & little-endian)
32F3B737 (uppercase ASCII)
The code is here. On my overclocked C2Q6600 it takes about 1.5 hours to run. The program is single-threaded, but it would be easy to make it multi-threaded, which would give nice linear scalability.
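For reference, a minimal Python sketch of the binary little-endian part of that search, using zlib's crc32 (this is only an outline, not the original program, and pure Python will be far slower):

import struct
import zlib

def crc32_fixed_points_le():
    # Yield every 32-bit value whose little-endian byte encoding has itself as CRC-32.
    for value in range(2**32):
        if zlib.crc32(struct.pack("<I", value)) == value:
            yield value

# for v in crc32_fixed_points_le():
#     print("%08X" % v)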
Aside from Jerry Coffin and Esko Luontola's good answers to an unusual problem, I'd like to add:
Mathematically, we're looking for X such that F(X) = X, where F is the checksum function, and X is the data itself.
Since the checksum's output is of fixed size, and the input we are looking for is of the same size, there is no guarantee that such an X even exists! It could very well be that every input value of that fixed size maps to a different value of the same size. Indeed, if F behaved like a random function on the 2^32 possible 4-byte inputs, the chance that it has no fixed point at all would be about (1 - 2^-32)^(2^32), roughly 1/e, or 37%.
EDIT: Your question didn't specify the exact way the checksum is supposed to be formatted within the file, so I assumed you mean the byte-representation of the checksum. When strings and encodings and formatted-strings come to play, things become more complex.
Lacking any specific guidance to the contrary, I'd define the checksum of nonexistent data as a nonexistent checksum, so creating an empty file would fulfill the requirement.
Another typical method is a negative checksum -- i.e. after the data you write a value that makes the checksum of the whole file (including the checksum) come out to zero. In this case, you write a checksum of 0, and it all works out.
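As an illustration of the negative-checksum idea with a simple additive word checksum (sum of 32-bit words mod 2^32, not CRC-32; the helper names are mine):

import struct

def word_checksum(data: bytes) -> int:
    # Sum of the little-endian 32-bit words, modulo 2**32.
    words = struct.unpack("<%dI" % (len(data) // 4), data)
    return sum(words) & 0xFFFFFFFF

def append_negative_checksum(data: bytes) -> bytes:
    neg = (-word_checksum(data)) & 0xFFFFFFFF
    return data + struct.pack("<I", neg)

payload = struct.pack("<4I", 1, 2, 3, 0xDEADBEEF)
assert word_checksum(append_negative_checksum(payload)) == 0

# A file that is nothing but its own checksum: four zero bytes.
assert word_checksum(struct.pack("<I", 0)) == 0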
Brute force. This is Adler32, which I haven't implemented before, and didn't bother testing, so it's quite likely I've messed it up. I wouldn't expect a corrected version to run significantly slower, though, unless I've done something colossally wrong.
This assumes that the 32-bit checksum value is written to the file little-endian (I didn't find a fixed point with it big-endian):
#include <iostream>
#include <stdint.h>
#include <iomanip>
const int modulus = 65521;
void checkAllAdlers(uint32_t sofar, int depth, uint32_t a, uint32_t b) {
    if (depth == 4) {
        if ((b << 16) + a == sofar) {
            std::cout << "Got a fixed point: 0x" <<
                std::hex << std::setw(8) << std::setfill('0') <<
                sofar << "\n";
        }
        return;
    }
    for (uint32_t i = 0; i < 256; ++i) {
        uint32_t newa = a + i;
        if (newa >= modulus) newa -= modulus;
        uint32_t newb = b + a;
        if (newb >= modulus) newb -= modulus;
        checkAllAdlers(sofar + (i << (depth*8)), depth + 1, newa, newb);
    }
    return;
}

int main() {
    checkAllAdlers(0, 0, 1, 0);
}
Output:
$ g++ adler32fp.cpp -o adler32fp -O3 && time ./adler32fp
Got a fixed point: 0x03fb01fe
real 0m31.215s
user 0m30.326s
sys 0m0.015s
[Edit: several bugs fixed already, I have no confidence whatever in the correctness of this code ;-) Anyway, you get the idea: a 32 bit checksum which uses each byte of input only once is very cheap to brute force. Checksums are usually designed to be fast to compute, whereas hashes are usually much slower, even though they have superficially similar effects. If your checksum was "2 rounds of Adler32" (meaning that the target checksum was the result of computing the checksum and then computing the checksum of that checksum) then my recursive approach wouldn't help so much, there'd be proportionally less in common between inputs with a common prefix. MD5 has 4 rounds, SHA-512 has 80.]
Brute force it. CRC-32 gives you a 32-bit value, usually written as 8 hexadecimal digits (0-9 and A-F). Try every combination, which gives you 16^8 = 4,294,967,296 possibilities. Then hash each possibility and see if it gives you the original value.
You can try optimizing it by assuming the solution will use each character no more than two or three times; this might make it finish faster.
If you have access to a CRC32 implementation, you can also try to break the algorithm and find a solution much faster, but I have no idea how you'd do this.