What is an efficient algorithm to perform a one-bit logical shift of an integer variable?

Let a and b be two integer variables with values >= 0. To shift the binary bit sequence representing a (e.g. 10110010 for a = 178) one bit to the left (i.e. into b = 01100100 = 100), you can double a and then subtract 2^n, where n is the bit length of the binary representation of a, so that b = a + a - 2^n.
However, this algorithm takes O(n) time, where n is the number of bits in the sequence, and is in practice not viable for sequences of upwards of 10,000 bits.
What is a more efficient alternative whose efficiency does not depend on the length of the bit chain?
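For concreteness, here is a minimal Python sketch (illustrative, not part of the question) of the add-and-subtract method next to the usual alternative: the built-in shift operator plus a mask to drop the bit that is shifted out. On fixed-width integers the shift is a single machine instruction; even on arbitrary-precision integers it proceeds word by word rather than bit by bit.

    def shift_left_by_addition(a):
        # The method from the question: b = a + a - 2^n, with n the bit length of a.
        if a == 0:
            return 0
        n = a.bit_length()           # position just above the highest set bit
        return a + a - (1 << n)      # double, then remove the overflowing bit

    def shift_left_builtin(a):
        # Same result via the shift operator, masking off the carried-out bit.
        if a == 0:
            return 0
        return (a << 1) & ((1 << a.bit_length()) - 1)

    assert shift_left_by_addition(178) == 100    # 10110010 -> 01100100
    assert shift_left_builtin(178) == 100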

Related

Is there any way to traverse all balanced bit-vectors?

Suppose a bit-vector has 2^n bits. It is a balanced bit-vector if it has exactly 2^(n-1) one bits (and therefore exactly 2^(n-1) zero bits). Given n, can I print out all balanced 2^n-bit bit-vectors in O(2^n)?
You can't.
Printing a single 2^n-bit vector already takes O(2^n) time, and there are nCr(2^n, 2^(n-1)) such vectors, so printing all of them takes O(2^n * nCr(2^n, 2^(n-1))) time, which is far more.
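To see why the output itself is the bottleneck, here is a sketch (mine, for illustration) that enumerates the balanced vectors with itertools.combinations; already for small n the nCr(2^n, 2^(n-1)) vectors dwarf 2^n:

    from itertools import combinations

    def balanced_bitvectors(n):
        length = 1 << n            # 2^n bits per vector
        ones = length // 2         # exactly half of them set
        for positions in combinations(range(length), ones):
            vec = ['0'] * length
            for p in positions:
                vec[p] = '1'
            yield ''.join(vec)

    # n = 2: nCr(4, 2) = 6 vectors of 4 bits each
    print(list(balanced_bitvectors(2)))
    # ['1100', '1010', '1001', '0110', '0101', '0011']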

big-Theta running time in terms of input size N

while a > 1:       # a is an N-bit integer
    a = a // 2     # halve by integer division
I was thinking it is log(n), because each time through the while loop you divide a by two, but my friend thinks it's 2 log(n).
Clearly your algorithm is in big-Theta(log(a)), where a is your number.
But as far as I understand your problem, you want to know the asymptotic runtime in terms of the number of bits of your number.
That is harder to say and depends on the number itself:
Let's say you have an n-bit integer whose most significant bit is 1. You have to divide it n - 1 times before it drops to 1.
Now look at an integer where only the least significant bit is 1 (so it equals the number 1 in the decimal system). There the loop exits immediately, without a single division.
So I would say it takes about n/2 divisions on average, which makes it big-Theta(n), where n is the number of bits of your number. The worst case is also in big-Theta(n) and the best case is in big-Theta(1).
NOTE: Dividing a number by two in the binary system is analogous to dividing a number by ten in the decimal system: it drops the last digit.
Dividing an integer by two can be efficiently implemented by taking the number in binary notation and shifting the bits. In the worst case, all the bits are set and you have to shift (n-1) bits for the first division, (n-2) bits for the second, and so on, until you shift 1 bit on the last iteration and find the number has become equal to 1, at which point you stop. This means your algorithm must shift 1 + 2 + ... + (n-1) = n(n-1)/2 bits, making your algorithm O(n^2) in the number of bits of input.
A more efficient algorithm that leaves a with the same final value is a = (a == 0 ? 0 : 1). This produces the same answer in linear time (the equality check is linear in the number of bits), and it works because your code leaves a = 0 only if a is originally zero; in all other cases, the highest-order set bit ends up in the units place.
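A quick sketch (mine, not from the answers) that checks both observations: with the top bit of an n-bit value set the loop performs n - 1 divisions, and its net effect on a is exactly the zero-or-one shortcut:

    def halve_until_one(a):
        # The loop from the question; returns (final value, iteration count).
        steps = 0
        while a > 1:
            a //= 2          # one-bit right shift
            steps += 1
        return a, steps

    assert halve_until_one(0b10000000) == (1, 7)   # 8-bit value, 7 divisions
    for a in range(100):                           # final value is 0 iff a was 0
        assert halve_until_one(a)[0] == (0 if a == 0 else 1)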

Representation of binary matrix with space complexity of O(n)

I have n x n binary matrices (i.e. matrices whose elements are 0 or 1). Using a two-dimensional array (that is, storing the value of each element) has a space complexity of O(n^2).
Is there any way to store them such that the space complexity is O(n)? Support for operations like addition, subtraction, etc. is welcome.
The matrices are not sparse, so using a list of the non-zero elements is out of the question.
No, you can not store an n x n binary matrix in O(n) space.
The proof is just the pigeonhole principle.
Suppose you devise a way to store an arbitrary n x n binary matrix.
There are 2^(n*n) possible binary matrices of that size.
If you use k bits for the storage, there are only 2^k possible contents of your storage.
Now, if k < n*n, we have 2^k < 2^(n*n), and by the pigeonhole principle there exist two different matrices (say, A and B) which are stored the same way (say, as X).
So, when you have that X stored, you cannot say whether the matrix you actually intended to store was A or B (or maybe some other matrix).
Thus you cannot uniquely decode your storage back into the stored matrix, which defeats the whole purpose of storing it.
First proof: an n*n-bit matrix has 2^(n*n) states. However, with an n-bit string you can only represent 2^n states. So unless n >= n*n (which only holds for n = 1), there is no way to encode n*n bits in an n-bit sequence.
Second proof, less abstract but also less complete:
Imagine you have a 16*16 matrix, i.e. 256 bits, and somehow manage to store it in 16 bits.
Now, of course, you could take those 16 bits, view them as a 4x4 matrix, and apply your algorithm again, resulting in 4 bits. Then you store those 4 bits as a 2x2 matrix and compress them into 2 bits.
Essentially, such an algorithm would be able to compress any imaginable amount of data into just 2 bits. While this is not an actual proof, it is still quite obvious that such an algorithm cannot exist.
I don't think it can guarantee you O(n) space, but you can look at a compression algorithm called LZW (Lempel-Ziv-Welch).
It is quite simple to code, it is easy to understand why and how it works, it should work well for binary arrays, and the bigger your matrix is, the better the compression rate will be.
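If you want to try that suggestion, here is a minimal textbook LZW compressor in Python (a sketch for experimentation; a production LZW also handles code widths and dictionary resets):

    def lzw_compress(data: bytes) -> list:
        dictionary = {bytes([i]): i for i in range(256)}   # all single bytes
        next_code = 256
        w = b""
        codes = []
        for b in data:
            wc = w + bytes([b])
            if wc in dictionary:
                w = wc                        # keep extending the current match
            else:
                codes.append(dictionary[w])   # emit the longest known prefix
                dictionary[wc] = next_code    # learn the new string
                next_code += 1
                w = bytes([b])
        if w:
            codes.append(dictionary[w])
        return codes

    # Repetitive input, like rows of a structured binary matrix, compresses well:
    print(len(lzw_compress(b"\x01\x00\x01\x00" * 8)))   # far fewer codes than 32 input bytes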
Anyway, if you know something about the structure of the matrix, you can try to represent it in an array in a way that you can restore. For example:
if your matrix is 32x32, you can take any row of it and represent it as a single int, so a whole row becomes a single number and you may have your O(n), although that is O(n) machine words only because a row happens to fit in one word.
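A sketch of that row-packing idea (the helper names are mine):

    def pack_rows(matrix):
        # Pack each row of an n x n 0/1 matrix into one integer.
        packed = []
        for row in matrix:
            value = 0
            for bit in row:
                value = (value << 1) | bit   # append the next bit
            packed.append(value)
        return packed

    def unpack_rows(packed, n):
        # Restore the n x n matrix from the packed row integers.
        return [[(value >> (n - 1 - j)) & 1 for j in range(n)]
                for value in packed]

    m = [[1, 0, 1, 1],
         [0, 1, 0, 0],
         [1, 1, 1, 0],
         [0, 0, 0, 1]]
    assert unpack_rows(pack_rows(m), 4) == m   # row 0 packs to 0b1011 = 11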

Algorithm to sort a list in Θ(n) time

The problem is to sort a list containing n distinct integers that range in value from 1 to kn inclusive where k is a fixed positive integer. Design an algorithm to solve the problem in Θ(n) time.
I don't just want an answer; an explanation, or a pointer in the right direction, would help.
I know that Θ(n) time means the algorithm time is directly proportional to the number of elements. Not sure where to go from there.
Easy for fixed k: create an array of kn counters and set them all to zero. Iterate through the input, increasing counter i by one whenever an array element equals i. Then use the array of counters to re-create the sorted array.
Obviously this is inefficient if k > log n.
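A Python sketch of that counting sort (assuming, as the problem states, distinct integers in 1..k*n with k fixed):

    def counting_sort(values, k):
        n = len(values)
        counts = [0] * (k * n + 1)       # one counter per possible value
        for v in values:
            counts[v] += 1               # counter v records occurrences of v
        result = []
        for v in range(1, k * n + 1):    # walk the counters in value order
            result.extend([v] * counts[v])
        return result                    # Theta(k*n) = Theta(n) for fixed k

    print(counting_sort([7, 2, 9, 4, 1], k=2))   # [1, 2, 4, 7, 9]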
The key is that the integers only range from 1 to kn, so their length is limited. This is a little tricky:
The common assumption when we say that a sorting algorithm is O(N) is that the number N fits into a constant number of machine words, so that we can do arithmetic on numbers of that size in constant time. Under this assumption, kN also fits into a constant number of machine words, since k is a fixed positive integer. Your input is therefore O(N) words long, and each word is a fixed number of bits, so your input is O(N) bits long.
Therefore, any algorithm that takes time proportional to the number of bits in the input is considered O(N).
There are actually lots of choices, but when this particular question is asked in this particular way, the person asking usually wants you to come up with a radix sort:
https://en.wikipedia.org/wiki/Radix_sort
The MSB-first radix sort just partitions the integers into 2^W buckets according to the values of their top W bits, and then partitions each bucket according to the next W bits, etc., until all the bits are processed.
The time taken for this is O(N*(word_size/W)), but as we said the word size is constant, and W is constant, so this is O(N).
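The answer above describes the MSB-first variant; for a concrete picture, here is a sketch of the slightly-simpler-to-code LSD variant with W-bit digits (W = 8 is an arbitrary illustrative choice):

    def radix_sort(values, W=8):
        # LSD radix sort for non-negative ints: one stable bucket pass per W-bit digit.
        if not values:
            return values
        mask = (1 << W) - 1
        max_bits = max(values).bit_length()
        shift = 0
        while shift < max_bits:
            buckets = [[] for _ in range(1 << W)]
            for v in values:
                buckets[(v >> shift) & mask].append(v)   # stable within each bucket
            values = [v for bucket in buckets for v in bucket]
            shift += W
        return values

    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
    # [2, 24, 45, 66, 75, 90, 170, 802]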

Choosing radix and modulus prime in Rabin-Karp rolling hash

The hash function is explained on Wikipedia.
It says, "The choice of a and n is critical to get good hashing," and refers to a Linear congruential generator article that doesn't feel relevant. I can't figure out how the values are chosen. Any suggestions?
The basis of this algorithm is that a nonzero polynomial of degree at most d has at most d zeros. Each length-k string has its own associated polynomial of degree k - 1, and we screen for possible matches by subtracting the polynomials of the strings in question and evaluating at a. If the strings are equal, then the result is always zero. If the strings are not equal, then the result is zero if and only if a is one of the zeros of the polynomial difference (this is the fact that puts the primality requirement on n, as the integers mod n otherwise would not be a field).
In theory, at least, we want a to be random so that an oblivious adversary cannot create false positives with any frequency. If we don't expect trouble, then it might be better to choose a so that multiplication by a is cheap (e.g., the binary expansion of a has a small number of one bits). Nevertheless, some choices are bad on typical string sets (e.g., a = 1). We want n to be large enough to avoid false positives (probability (k - 1)/n) by random chance but small enough and preferably of a special form so that the modulo computations are efficient.
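To make the roles of a and n concrete, here is a small Python sketch of the rolling hash inside a Rabin-Karp scan. The prime modulus 1_000_000_007 and the random radix are illustrative choices in the spirit of the answer, not prescribed values:

    import random

    N = 1_000_000_007            # large prime modulus (illustrative)
    A = random.randrange(2, N)   # random radix, per the adversary argument above

    def hash_window(s):
        # Polynomial hash: ord(s[0])*A^(k-1) + ... + ord(s[k-1]), mod N.
        h = 0
        for ch in s:
            h = (h * A + ord(ch)) % N
        return h

    def roll(h, old, new, k):
        # Slide a length-k window one position: drop `old`, append `new`.
        h = (h - ord(old) * pow(A, k - 1, N)) % N   # remove the leading term
        return (h * A + ord(new)) % N               # shift and add the new term

    text, pattern = "abracadabra", "cad"
    k = len(pattern)
    target = hash_window(pattern)
    h = hash_window(text[:k])
    for i in range(len(text) - k + 1):
        if h == target and text[i:i + k] == pattern:   # verify to rule out false positives
            print("match at", i)                       # match at 4
        if i + k < len(text):
            h = roll(h, text[i], text[i + k], k)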
