Iterating through a bit sequence and finding each set bit - algorithm

My question is more or less what's in the title; I'm wondering if there's a fast way of going through a sequence of bits and finding each bit that's set.
More detailed information:
I'm currently working on a data structure that represents a set of objects. In order to support some operations I need, the structure must be able to perform very fast intersection of subsets internally. The solution I've come up with is to have each subset of the structure's superset represented by a "bit array", where each bit maps to an index in the array that holds the superset's data. Example: if bit #1 is set in a subset, then the element at index 1 in the superset's array is present in the subset.
Each subset consists of an array of ulong big enough that there are enough bits to represent the entire superset (if the superset contains 256 elements, the array's length must be 256 / 64 = 4). To find the intersection of two subsets, S1 and S2, I can simply iterate through S1's and S2's arrays and take the bitwise AND of the ulongs at each index.
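For illustration, a minimal sketch of that chunk-wise intersection (the method name here is just made up for the example):

// Sketch: intersect two subsets represented as equal-length ulong bit arrays.
static ulong[] Intersect(ulong[] s1, ulong[] s2)
{
    var result = new ulong[s1.Length];
    for (int i = 0; i < s1.Length; i++)
        result[i] = s1[i] & s2[i];   // bitwise AND of the corresponding 64-bit chunks
    return result;
}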
Now back to what my question is really about:
In order to return the data of a subset, I have to iterate through all the bits in the subset's "bit array" and find the bits that are set. This is how I currently do it:
/// <summary>
/// Gets an enumerator that enables enumeration over the strings in the subset.
/// </summary>
/// <returns> An enumerator. </returns>
public IEnumerator<string> GetEnumerator()
{
    int bitArrayChunkIndex = 0;
    int bitArrayChunkOffset = 0;
    int bitArrayChunkCount = this.bitArray.Length;
    while (bitArrayChunkIndex < bitArrayChunkCount)
    {
        ulong bitChunk = bitArray[bitArrayChunkIndex];
        // RELEVANT PART
        if (bitChunk != 0)
        {
            int bit = 0;
            while (bit < BIT_ARRAY_CHUNK_SIZE /* 64 */)
            {
                if (bitChunk.BitIsSet(bit))
                    yield return supersetData[bitArrayChunkOffset + bit];
                bit++;
            }
        }
        bitArrayChunkIndex++;
        bitArrayChunkOffset += BIT_ARRAY_CHUNK_SIZE;
        // END OF RELEVANT PART
    }
}
Are there any obvious ways to optimize this? Any bit hacks that would make it very fast? Thanks!

On Intel 386 and later, you can use the machine instruction BSF (bit scan forward).
Below is a sample for gcc. It's a little tricky for 64-bit words,
but it still works quickly and efficiently.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, char **argv) {
    uint64_t val;
    sscanf(argv[1], "%" SCNx64, &val);
    printf("val=0x%" PRIx64 "\n", val);

    uint32_t result;
    if ((uint32_t)val) { // first set bit is inside the lowest 32 bits
        asm("bsfl %1,%0" : "=r"(result) : "r"((uint32_t)val));
    } else {             // first set bit is outside the lowest 32 bits
        asm("bsfl %1,%0" : "=r"(result) : "r"((uint32_t)(val >> 32)));
        result += 32;
    }
    printf("val=%" PRIu64 "; result=%u\n", val, result);
    return 0;
}
Also, since you're on an x64 architecture, you can use the bsfq instruction and remove the if/else.
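For the C# code in the question, the same idea (jump straight to the next set bit instead of testing all 64 positions) can be sketched with the classic clear-lowest-set-bit trick. On .NET Core 3.0+, System.Numerics.BitOperations.TrailingZeroCount compiles down to TZCNT/BSF. A rough sketch, reusing the question's bitArray and supersetData fields:

using System.Collections.Generic;
using System.Numerics;   // BitOperations.TrailingZeroCount (.NET Core 3.0+)

static IEnumerable<string> EnumerateSetBits(ulong[] bitArray, string[] supersetData)
{
    for (int chunkIndex = 0; chunkIndex < bitArray.Length; chunkIndex++)
    {
        ulong bits = bitArray[chunkIndex];
        int baseIndex = chunkIndex * 64;
        while (bits != 0)
        {
            int bit = BitOperations.TrailingZeroCount(bits); // position of lowest set bit
            yield return supersetData[baseIndex + bit];
            bits &= bits - 1;                                // clear the lowest set bit
        }
    }
}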

Take an array of sixteen integers, initialized with the number of bits set for the integers from zero to fifteen (i.e. 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4). Now take bitChunk % 16 and look up the result in that array; that's the number of set bits in the lowest four bits of the chunk. Right-shift by four bits and repeat the entire operation fifteen more times.
You can do this with an array of 256 integers and 8 bit sub-chunks instead. I wouldn't recommend using an array of 4096 integers with 12 bit sub-chunks, that's getting a bit ridiculous.
int[] lookup = new int[16] { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4 };
int bitCount = 0;
for (int i = 0; i < 16; i++) {
    int firstFourBits = (int)(bitChunk % 16);
    bitCount += lookup[firstFourBits];
    bitChunk = bitChunk >> 4;
}
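If you go with the 256-entry / 8-bit variant, a rough C# sketch might look like the following. Note that this counts the set bits in a chunk; the question ultimately needs their positions, so treat it as a building block rather than a complete answer:

// Sketch of the 256-entry variant: count set bits in a 64-bit chunk one byte at a time.
static readonly int[] BitsInByte = BuildTable();

static int[] BuildTable()
{
    var table = new int[256];
    for (int i = 0; i < 256; i++)
        table[i] = (i & 1) + table[i >> 1];   // bits(i) = lowest bit + bits(i / 2)
    return table;
}

static int CountSetBits(ulong chunk)
{
    int count = 0;
    for (int i = 0; i < 8; i++)               // 8 bytes in a 64-bit chunk
    {
        count += BitsInByte[(int)(chunk & 0xFF)];
        chunk >>= 8;
    }
    return count;
}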

Related

Intel Secure Key (RDRAND) possibly strange behaviour

I've been using the Intel-provided RNG feature for some time, to provide myself with some randomness by means of a C++/CLI program I wrote myself.
However, after some time, something struck me as particularly suspicious. Among other uses, I asked for a random number between 1 and 4 and wrote the result down on paper each time. Here are the results:
2, 3, 3, 2, 1, 3, 4, 2, 3, 2, 3, 1, 3, 2, 3, 1, 2, 4, 2, 2, 1, 2, 1, 3, 1, 3, 3, 3, 3.
Number of 1s : 6
Number of 2s : 9
Number of 3s : 12
Number of 4s : 2
Total : 29
I'm actually wondering if there's a problem with Intel's RNG, my algorithm, my methodology, or maybe something else? Or do you consider the bias not to be significant enough yet?
I'm using Windows 10 Pro, my CPU is an Intel Core i7-4710MQ.
Compiled with VS2017.
Methodology :
Start a Powershell command prompt
Load my assembly with Add-Type -Path <mydll>
Invoke [rdrw.Random]::Next(4)
Add one to the result
A detail that may be of importance : I don't ask for that number very often, so there's some time between draws and it usually comes when the RNG hasn't been used for some time (one hour at least).
And yes it's a lazy algorithm, I didn't want to bother myself with exceptions.
Algorithm follows:
#include <immintrin.h>

namespace rdrw {

#pragma managed(push,off)
    unsigned long long getRdRand() {
        unsigned long long val = 0;
        while (!_rdrand64_step(&val));
        return val;
    }
#pragma managed(pop)

    public ref class Random abstract sealed
    {
    public:
        // Returns a random 64 bit unsigned integer
        static unsigned long long Next() {
            return getRdRand();
        }

        // Return a random unsigned integer between 0 and max-1 (inclusive)
        static unsigned long long Next(unsigned long long max) {
            unsigned long long nb = max - 1;
            unsigned long long mask = 1;
            unsigned long long draw = 0;
            if (max <= 1)
                return 0;
            // Create a bitmask that's at least as big as the biggest acceptable value
            while ((nb & mask) != nb)
            {
                mask <<= 1;
                mask |= 1;
            }
            do
            {
                // Throw unnecessary bits
                draw = Next() & mask;
            } while (draw > nb);
            return draw;
        }

        // Return a random unsigned integer between min and max-1 inclusive
        static unsigned long long Next(unsigned long long min, unsigned long long max) {
            if (max == min)
                return min;
            if (max < min)
                return 0;
            unsigned long long diff = max - min;
            return Next(diff) + min;
        }
    };
}
Thanks for your insights!

Counting sort - Efficiency

I was thinking about counting sort, how we implement it, and how the algorithm actually works. I'm stuck on one part: the algorithm is really straightforward and easy to understand, but one part of it doesn't seem necessary. I thought people might be mistaken, but it seems like everyone uses the same method, so I must be mistaken somewhere. Can you please explain?
Here is the code for counting sort from GeeksforGeeks:
// C program for counting sort
#include <stdio.h>
#include <string.h>
#define RANGE 255

// The main function that sorts the given string arr[] in
// alphabetical order
void countSort(char arr[])
{
    // The output character array that will have sorted arr
    char output[strlen(arr)];

    // Create a count array to store the count of individual
    // characters and initialize the count array to 0
    int count[RANGE + 1], i;
    memset(count, 0, sizeof(count));

    // Store count of each character
    for (i = 0; arr[i]; ++i)
        ++count[arr[i]];

    // Change count[i] so that count[i] now contains the actual
    // position of this character in the output array
    for (i = 1; i <= RANGE; ++i)
        count[i] += count[i-1];

    // Build the output character array
    for (i = 0; arr[i]; ++i)
    {
        output[count[arr[i]]-1] = arr[i];
        --count[arr[i]];
    }

    // Copy the output array to arr, so that arr now
    // contains sorted characters
    for (i = 0; arr[i]; ++i)
        arr[i] = output[i];
}

// Driver program to test the above function
int main()
{
    char arr[] = "geeksforgeeks"; // "applepp";
    countSort(arr);
    printf("Sorted character array is %s\n", arr);
    return 0;
}
Cool, but what about this part:
// Build the output character array
for (i = 0; arr[i]; ++i)
{
    output[count[arr[i]]-1] = arr[i];
    --count[arr[i]];
}
Why do I need this?? OK, I counted my numbers:
Let's say I had the array -> [1, 3, 6, 3, 2, 4]
INDEXES: 0 1 2 3 4 5 6
I created this -> [0, 1, 1, 2, 1, 0, 1]
Then this part does this:
[0, 1+0, 1+1, 2+2, 1+4, 0+5, 1+5]
[0, 1, 2, 4, 5, 5, 6]
BUT WHY??
Can't I just use my array like the one before? Here is my idea and my code; please explain why it's wrong, or why the other way is more useful.
void countingSort (int *arr) {
    int countingArray[MAX_NUM] = {0};
    for (i = 0 ; i < ARRAY_SIZE ; i++)
        countingArray[arr[i]]++;
    int output_Index = 0;
    for (i = 0 ; i < MAX_NUM ; i++)
        while ( countingArray[i]-- )
            arr[output_Index++] = i;
}
For the simple case where you are sorting an array of integers, your code is simpler and better.
However, counting sort is a general sorting algorithm that can sort based on a sorting key derived from the items to be sorted, which is used to compare them, as opposed to directly comparing the items themselves. In the case of an array of integers, the items and the sort keys can be one and the same; you just compare them directly.
It looks to me as though the geeksforgeeks code has been adapted from a more generic example that allows the use of sorting keys, something like this:
// Store count of each item
for (i = 0; arr[i]; ++i)
    ++count[key(arr[i])];

// Change count[i] so that count[i] now contains the actual
// position of this character in the output array
for (i = 1; i <= RANGE; ++i)
    count[i] += count[i-1];

// Build the output array
for (i = 0; arr[i]; ++i)
{
    output[count[key(arr[i])]-1] = arr[i];
    --count[key(arr[i])];
}
Where key is a function that computes a sort key based on an item (for an integer type you could just return the integer itself). In this case MAX_NUM would have to be replaced with MAX_KEY.
This approach uses the extra output array because the final result is generated by copying the items from arr rather than simply from the information in count (which only contains the count of items with each key). However, an in-place counting sort is possible.
The algorithm also guarantees a stable sort (items with the same sort key have their relative order preserved by sorting) - this is meaningless when sorting integers.
However, since they have removed the ability to sort based on key, there's no reason for the extra complexity and your way is better.
It's also possible that they have copied the code from a language like C++, where the int cast (which will be called when using an item to index an array) could be overloaded to return the sort key, but have mistakenly converted to C.
I think your version is a better approach. I suspect that the person who wrote this code sample had probably written similar code samples for other sorting algorithms — there are many sorting algorithms where you do need separate "scratch space" — and didn't put enough thought into this one.
Alternatively, (s)he may have felt that the algorithm is easier to explain if we separate "generating the result" from "moving the result into place". I don't agree, if so, but the detailed comments make it clear that (s)he had pedagogy in mind.
That said, there are a few minor issues with your version (a cleaned-up sketch follows this list):
You forgot to declare i.
You should take the array-length as a parameter, rather than using a hardcoded ARRAY_SIZE. (In the code sample, this issue is avoided by using a string, so they could iterate until the terminating null byte.)
This may be subjective, but rather than while ( countingArray[i]-- ), I think it's clearer to write for (int j = 0; j < countingArray[i]; ++j).
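Putting those suggestions together, here's a minimal sketch of the integer version (written in C# purely for illustration; maxValue is an assumed exclusive upper bound on the values, playing the role of MAX_NUM):

// Sketch: the integer counting sort with the fixes above applied.
// In C you would also pass the array length explicitly; C# arrays know their own length.
static void CountingSort(int[] arr, int maxValue)   // maxValue: exclusive upper bound on values
{
    var counts = new int[maxValue];
    foreach (int value in arr)
        counts[value]++;                            // count occurrences of each value

    int outputIndex = 0;
    for (int i = 0; i < maxValue; i++)
        for (int j = 0; j < counts[i]; j++)         // write value i, counts[i] times
            arr[outputIndex++] = i;
}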

BitSet bug in cracking the coding interview?

The following is the implementation of BitSet in the solution to question 10-4 of the Cracking the Coding Interview book. Why is it allocating an array of size/32 and not (size/32 + 1)? Am I missing something here, or is this a bug?
If I pass 33 to the constructor of BitSet, then it will allocate only one int, and if I try to set or get bit 32, I will get an out-of-bounds error!
package Question10_4;

class BitSet {
    int[] bitset;

    public BitSet(int size) {
        bitset = new int[size >> 5]; // divide by 32
    }

    boolean get(int pos) {
        int wordNumber = (pos >> 5);  // divide by 32
        int bitNumber = (pos & 0x1F); // mod 32
        return (bitset[wordNumber] & (1 << bitNumber)) != 0;
    }

    void set(int pos) {
        int wordNumber = (pos >> 5);  // divide by 32
        int bitNumber = (pos & 0x1F); // mod 32
        bitset[wordNumber] |= 1 << bitNumber;
    }
}
From what I can gather from reading the solution you mention (on page 205), and the little I understand about computer programming, it seems to me that this is a special implementation of a bitset, meant to take an argument of 32,000 in its constructor (see the checkDuplicates function; the question is about examining an array with numbers from 1 to N, where N is at most 32,000, using only 4 KB of memory).
This way, an array of 1000 elements is created, each one used for 32 bits in the bit set. You can see in the bitset class that to get a bit's position, we (floor) divide by 32 to get the array index, and then mod 32 to get the specific bit position.
Yes, the answer in the book is incorrect. The correct allocation rounds up to the next multiple of 32:
bitset = new int[(size + 31) >> 5]; // divide by 32, rounding up
For example, with size = 33 this allocates (33 + 31) >> 5 = 2 ints, so bit 32 can be set and read safely.

Printing numbers of the form 2^i * 5^j in increasing order

How do you print numbers of the form 2^i * 5^j in increasing order?
For example:
1, 2, 4, 5, 8, 10, 16, 20
This is actually a very interesting question, especially if you don't want the solution to have O(N^2) or O(N log N) complexity.
What I would do is the following:
Define a data structure containing 2 values (i and j) and the result of the formula.
Define a collection (e.g. std::vector) containing this data structures
Initialize the collection with the value (0,0) (the result is 1 in this case)
Now in a loop do the following:
Look in the collection and take the instance with the smallest value
Remove it from the collection
Print this out
Create 2 new instances based on the instance you just processed
In the first instance increment i
In the second instance increment j
Add both instances to the collection (if they aren't in the collection yet)
Loop until you had enough of it
The performance can be easily tweaked by choosing the right data structure and collection.
E.g. in C++, you could use an std::map, where the key is the result of the formula, and the value is the pair (i,j). Taking the smallest value is then just taking the first instance in the map (*map.begin()).
I quickly wrote the following application to illustrate it (it works, but contains no further comments, sorry):
#include <math.h>
#include <map>
#include <iostream>

typedef __int64 Integer;
typedef std::pair<Integer,Integer> MyPair;
typedef std::map<Integer,MyPair> MyMap;

Integer result(const MyPair &myPair)
{
    return pow((double)2,(double)myPair.first) * pow((double)5,(double)myPair.second);
}

int main()
{
    MyMap myMap;
    MyPair firstValue(0,0);
    myMap[result(firstValue)] = firstValue;

    while (true)
    {
        auto it = myMap.begin();
        if (it->first < 0) break; // overflow
        MyPair myPair = it->second;
        std::cout << it->first << "= 2^" << myPair.first << "*5^" << myPair.second << std::endl;
        myMap.erase(it);

        MyPair pair1 = myPair;
        ++pair1.first;
        myMap[result(pair1)] = pair1;

        MyPair pair2 = myPair;
        ++pair2.second;
        myMap[result(pair2)] = pair2;
    }
}
This is well suited to a functional programming style. In F#:
let min (a,b) = if (a < b) then a else b;;

type stream (current, next) =
    member this.current = current
    member this.next():stream = next();;

let rec merge(a:stream, b:stream) =
    if (a.current < b.current) then new stream(a.current, fun() -> merge(a.next(), b))
    else new stream(b.current, fun() -> merge(a, b.next()));;

let rec Squares(start) = new stream(start, fun() -> Squares(start*2));;
let rec AllPowers(start) = new stream(start, fun() -> merge(Squares(start*2), AllPowers(start*5)));;
let Results = AllPowers(1);;
This works well, with Results then being a stream type with a current value and a next method.
Walking through it:
I define min for completeness.
I define a stream type to have a current value and a method to return a new stream, essentially the head and tail of a stream of numbers.
I define the function merge, which takes the smaller of the current values of two streams and then advances that stream. It then recurses to provide the rest of the stream. Essentially, given two streams which are in order, it will produce a new stream which is in order.
I define Squares to be a stream increasing in powers of 2.
AllPowers takes the start value and merges the stream of its further doublings (Squares(start*2)) with the stream resulting from multiplying it by 5 (AllPowers(start*5)), since these are your only two options. You effectively are left with a tree of results.
The result is merging more and more streams, so you merge the following streams
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
.
.
.
Merging all of these turns out to be fairly efficient with tail recursion and compiler optimisations etc.
These could be printed to the console like this:
let rec PrintAll(s:stream) =
    if (s.current > 0) then
        do System.Console.WriteLine(s.current)
        PrintAll(s.next());;

PrintAll(Results);
let v = System.Console.ReadLine();
Similar things could be done in any language which allows for recursion and passing functions as values (it's only a little more complex if you can't pass functions as variables).
For an O(N) solution, you can use a list of numbers found so far and two indexes: one representing the next number to be multiplied by 2, and the other the next number to be multiplied by 5. Then in each iteration you have two candidate values to choose the smaller one from.
In Python:
numbers = [1]
next_2 = 0
next_5 = 0
for i in xrange(100):
    mult_2 = numbers[next_2]*2
    mult_5 = numbers[next_5]*5
    if mult_2 < mult_5:
        next = mult_2
        next_2 += 1
    else:
        next = mult_5
        next_5 += 1
    # The comparison here is to avoid appending duplicates
    if next > numbers[-1]:
        numbers.append(next)
print numbers
So we have two loops, one incrementing i and a second one incrementing j, both starting from zero, right? (The multiplication symbol in the title of the question is confusing.)
You can do something very straightforward (sketched below):
Add all items to an array
Sort the array
Or do you need another solution with more mathematical analysis?
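For what it's worth, a minimal C# sketch of that brute-force generate-and-sort idea (the 1,000,000 cutoff is arbitrary, purely for illustration):

using System;
using System.Collections.Generic;

class GenerateAndSort
{
    static void Main()
    {
        const long limit = 1_000_000;              // arbitrary cutoff for illustration
        var values = new List<long>();
        for (long p2 = 1; p2 < limit; p2 *= 2)     // all powers of 2 below the cutoff
            for (long v = p2; v < limit; v *= 5)   // multiplied by powers of 5, still below it
                values.Add(v);
        values.Sort();                             // sort once at the end
        Console.WriteLine(string.Join(", ", values));
    }
}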
EDIT: A smarter solution, leveraging the similarity with the merge sort problem
If we imagine the infinite sets of numbers 2^i and 5^j as two independent streams/lists, this problem looks very much like the well-known merge sort problem.
So the solution steps are:
Get one number from each of the two streams (of 2s and of 5s)
Compare them
Return the smallest
Get the next number from the stream that produced the previously returned smallest
and that's it! ;)
PS: The complexity of merge sort is always O(n*log(n))
I visualize this problem as a matrix M where M(i,j) = 2^i * 5^j. This means that both the rows and columns are increasing.
Think about drawing a line through the entries in increasing order, clearly beginning at entry (1,1). As you visit entries, the row and column increasing conditions ensure that the shape formed by those cells will always be an integer partition (in English notation). Keep track of this partition (mu = (m1, m2, m3, ...) where mi is the number of smaller entries in row i -- hence m1 >= m2 >= ...). Then the only entries that you need to compare are those entries which can be added to the partition.
Here's a crude example. Suppose you've visited all the xs (mu = (5,3,3,1)), then you need only check the #s:
x x x x x #
x x x #
x x x
x #
#
Therefore the number of checks is the number of addable cells (equivalently the number of ways to go up in Bruhat order if you're of a mind to think in terms of posets).
Given a partition mu, it's easy to determine what the addable cells are. Imagine an infinite string of 0s following the last positive entry. Then you can increase mi by 1 if and only if m(i-1) > mi.
Back to the example, for mu = (5,3,3,1) we can increase m1 (6,3,3,1) or m2 (5,4,3,1) or m4 (5,3,3,2) or m5 (5,3,3,1,1).
The solution to the problem then finds the correct sequence of partitions (saturated chain). In pseudocode:
mu = [1,0,0,...,0];
while (/* some termination condition, or go on forever */) {
    minNext = 0;
    nextCell = -1;
    // look through all addable cells
    for (int i = 0; i < mu.length; ++i) {
        if (i == 0 or mu[i-1] > mu[i]) {
            // check for new minimum value; the addable cell in row i is worth
            // 2^(mu[i]) * 5^i, since row i holds 5^i times increasing powers of 2
            if (minNext == 0 or 2^(mu[i]) * 5^i < minNext) {
                nextCell = i;
                minNext = 2^(mu[i]) * 5^i;
            }
        }
    }
    // print next largest entry and update mu
    print(minNext);
    mu[nextCell]++;
}
I wrote this in Maple, stopping after 12 iterations. It printed:
1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50
and the sequence of cells added was:
1 2 3 5 7 10
4 6 8 11
9 12
corresponding to this matrix representation:
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
First of all (as others have mentioned already), this question is very vague!
Nevertheless, I'm going to give it a shot based on your vague equation and the pattern in your expected result. I'm not sure the following will be true for what you're trying to do, but it may give you some idea about Java collections!
import java.util.List;
import java.util.ArrayList;
import java.util.SortedSet;
import java.util.TreeSet;

public class IncreasingNumbers {

    private static List<Integer> findIncreasingNumbers(int maxIteration) {
        SortedSet<Integer> numbers = new TreeSet<Integer>();
        SortedSet<Integer> numbers2 = new TreeSet<Integer>();

        for (int i=0; i < maxIteration; i++) {
            int n1 = (int)Math.pow(2, i);
            numbers.add(n1);

            for (int j=0; j < maxIteration; j++) {
                int n2 = (int)Math.pow(5, i);
                numbers.add(n2);

                for (Integer n: numbers) {
                    int n3 = n*n1;
                    numbers2.add(n3);
                }
            }
        }
        numbers.addAll(numbers2);
        return new ArrayList<Integer>(numbers);
    }

    /**
     * Based on the following fuzzy question @ StackOverflow
     * http://stackoverflow.com/questions/7571934/printing-numbers-of-the-form-2i-5j-in-increasing-order
     *
     * Result:
     * 1 2 4 5 8 10 16 20 25 32 40 64 80 100 125 128 200 256 400 625 1000 2000 10000
     */
    public static void main(String[] args) {
        List<Integer> numbers = findIncreasingNumbers(5);
        for (Integer i: numbers) {
            System.out.print(i + " ");
        }
    }
}
If O(n log n) is acceptable, here's a simple solution:
Get an empty min-heap
Put 1 in the heap
while (you want to continue)
Get num from heap
print num
put num*2 and num*5 in the heap
There you have it. (By min-heap, I mean a heap that always gives you its smallest element.)
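For illustration, a rough C# sketch of that loop (PriorityQueue is the built-in min-heap in .NET 6+; the HashSet guards against enqueuing duplicates such as 10, which is reachable as both 2*5 and 5*2):

using System;
using System.Collections.Generic;

class HeapApproach
{
    static void Main()
    {
        var heap = new PriorityQueue<long, long>();   // .NET 6+ min-heap
        var seen = new HashSet<long> { 1 };
        heap.Enqueue(1, 1);

        for (int count = 0; count < 20; count++)      // print the first 20 values
        {
            long num = heap.Dequeue();                // smallest value not yet printed
            Console.WriteLine(num);
            foreach (long next in new[] { num * 2, num * 5 })
                if (seen.Add(next))                   // skip values already queued
                    heap.Enqueue(next, next);
        }
    }
}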
As a mathematician the first thing I always think about when looking at something like this is "will logarithms help?".
In this case it might.
If our series A is increasing then the series log(A) is also increasing. Since all terms of A are of the form 2^i.5^j then all members of the series log(A) are of the form i.log(2) + j.log(5)
We can then look at the series log(A)/log(2) which is also increasing and its elements are of the form i+j.(log(5)/log(2))
If we work out the i and j that generates the full ordered list for this last series (call it B) then that i and j will also generate the series A correctly.
This is just changing the nature of the problem but hopefully to one where it becomes easier to solve. At each step you can either increase i and decrease j or vice versa.
Looking at a few of the early changes you can make (which I will refer to as transforms of i and j, or just transforms) gives us some clues about where we are going.
Clearly increasing i by 1 will increase B by 1. However, given that log(5)/log(2) is approximately 2.3, increasing j by 1 while decreasing i by 2 will give an increase of just about 0.3. The problem then is, at each stage, finding the minimum possible increase in B for changes of i and j.
To do this I just kept a record as I increased of the most efficient transforms of i and j (ie what to add and subtract from each) to get the smallest possible increase in the series. Then applied whichever one was valid (ie making sure i and j don't go negative).
Since at each stage you can either decrease i or decrease j there are effectively two classes of transforms that can be checked individually. A new transform doesn't have to have the best overall score to be included in our future checks, just better than any other in its class.
To test my thoughts I wrote a sort of program in LinqPad. Key things to note are that the Dump() method just outputs the object to the screen, and that the syntax/structure isn't valid for a real C# file. Converting it if you want to run it should be easy though.
Hopefully anything not explicitly explained will be understandable from the code.
void Main()
{
    double C = Math.Log(5)/Math.Log(2);
    int i = 0;
    int j = 0;
    int maxi = i;
    int maxj = j;

    List<int> outputList = new List<int>();
    List<Transform> transforms = new List<Transform>();
    outputList.Add(1);

    while (outputList.Count < 500)
    {
        Transform tr;
        if (i == maxi)
        {
            // We haven't considered i this big before. Let's see if we can find an efficient transform by getting this many i and taking away some j.
            maxi++;
            tr = new Transform(maxi, (int)(-(maxi-maxi%C)/C), maxi%C);
            AddIfWorthwhile(transforms, tr);
        }
        if (j == maxj)
        {
            // We haven't considered j this big before. Let's see if we can find an efficient transform by getting this many j and taking away some i.
            maxj++;
            tr = new Transform((int)(-(maxj*C)), maxj, (maxj*C)%1);
            AddIfWorthwhile(transforms, tr);
        }
        // We have a set of transforms. We first find the ones that are valid, then order them by score and take the first (smallest) one.
        Transform bestTransform = transforms.Where(x => x.I >= -i && x.J >= -j).OrderBy(x => x.Score).First();
        // Apply the transform
        i += bestTransform.I;
        j += bestTransform.J;
        // Output the next number in our list.
        int value = GetValue(i, j);
        // This line just gets it to stop when it overflows. I would have expected an exception but maybe LinqPad does magic with them?
        if (value < 0) break;
        outputList.Add(value);
    }
    outputList.Dump();
}

public int GetValue(int i, int j)
{
    return (int)(Math.Pow(2,i)*Math.Pow(5,j));
}

public void AddIfWorthwhile(List<Transform> list, Transform tr)
{
    if (list.Where(x => (x.Score < tr.Score && x.IncreaseI == tr.IncreaseI)).Count() == 0)
    {
        list.Add(tr);
    }
}

// Define other methods and classes here
public class Transform
{
    public int I;
    public int J;
    public double Score;

    public bool IncreaseI
    {
        get { return I > 0; }
    }

    public Transform(int i, int j, double score)
    {
        I = i;
        J = j;
        Score = score;
    }
}
I've not bothered looking at the efficiency of this, but I strongly suspect it's better than some other solutions because at each stage all I need to do is check my set of transforms; working out how many of these there are compared to "n" is non-trivial. It is clearly related, since the further you go the more transforms there are, but the number of new transforms becomes vanishingly small at higher numbers, so maybe it's just O(1). This O stuff always confused me though. ;-)
One advantage over other solutions is that it allows you to calculate i,j without needing to calculate the product, allowing me to work out what the sequence would be without needing to calculate the actual number itself.
For what it's worth, after the first 230 numbers (when int runs out of space) I had 9 transforms to check each time. And given that it's only my total that overflowed, I ran it for the first million results and got to i=5191 and j=354. The number of transforms was 23. The size of this number in the list is approximately 10^1810. Runtime to get to this level was approximately 5 seconds.
P.S. If you like this answer please feel free to tell your friends since I spent ages on this and a few +1s would be nice compensation. Or in fact just comment to tell me what you think. :)
I'm sure everyone might have got the answer by now, but I just wanted to give a direction to this solution.
It's a Ctrl-C + Ctrl-V from
http://www.careercup.com/question?id=16378662
void print(int N)
{
    int arr[N];
    arr[0] = 1;
    int i = 0, j = 0, k = 1;
    int numJ, numI;
    int num;

    for (int count = 1; count < N; )
    {
        numI = arr[i] * 2;
        numJ = arr[j] * 5;
        if (numI < numJ)
        {
            num = numI;
            i++;
        }
        else
        {
            num = numJ;
            j++;
        }
        if (num > arr[k-1])
        {
            arr[k] = num;
            k++;
            count++;
        }
    }

    for (int counter = 0; counter < N; counter++)
    {
        printf("%d ", arr[counter]);
    }
}
The question as put to me was to return an infinite set of solutions. I pondered the use of trees, but felt there was a problem with figuring out when to harvest and prune the tree, given an infinite number of values for i and j. I realized that a sieve algorithm could be used: starting from zero, determine whether each positive integer has values for i and j. This was facilitated by turning answer = (2^i)*(5^j) around and solving for i instead. That gave me i = log2(answer / (5^j)). Here is the code:
class Program
{
    static void Main(string[] args)
    {
        var startTime = DateTime.Now;
        int potential = 0;
        do
        {
            if (ExistsIandJ(potential))
                Console.WriteLine("{0}", potential);
            potential++;
        } while (potential < 100000);
        Console.WriteLine("Took {0} seconds", DateTime.Now.Subtract(startTime).TotalSeconds);
    }

    private static bool ExistsIandJ(int potential)
    {
        // potential = (2^i)*(5^j)
        // 1 = (2^i)*(5^j)/potential
        // 1/(2^i) = (5^j)/potential or (2^i) = potential / (5^j)
        // i = log2 (potential / (5^j))
        for (var j = 0; Math.Pow(5, j) <= potential; j++)
        {
            var i = Math.Log(potential / Math.Pow(5, j), 2);
            if (i == Math.Truncate(i))
                return true;
        }
        return false;
    }
}

Convert Array of Decimal Digits to an Array of Binary Digits

This is probably a quite exotic question.
My Problem is as follows:
The TI 83+ graphing calculator allows you to program on it using either Assembly and a link cable to a computer or its built-in TI-BASIC programming language.
According to what I've found, it supports only 16-Bit Integers and some emulated floats.
I want to work with a bit larger numbers however (around 64 bit), so for that I use an array with the single digits:
{1, 2, 3, 4, 5}
would be the Decimal 12345.
In binary, that's 110000 00111001, or as a binary digit array:
{1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1}
which would be how the calculator displays it.
How would I go about converting this array of decimal digits (which is too large for the calculator to display as a native type) into an array of binary digits?
Efficiency is not an issue. This is NOT homework.
This would leave me free to implement addition and other operations for such arrays.
Thanks!
I thought about it, and I think I would do it with the following 'algorithm':
Check the last digit (5 in the example case).
If it is odd, store (filling in reverse order) a 1 in the binary array; if it is even, store a 0.
Now divide the number by 2 using the following method:
Begin with the first digit and clear the 'carry' variable.
Divide the digit by 2 and add the 'carry' variable. If the original digit was odd (check this with an & 1 before you divide), put 5 in the carry.
Repeat until all digits have been done.
Repeat both steps until the whole number is reduced to 0s.
The number in your binary array is the binary representation.
Your example:
1,2,3,4,5
The 5 is odd, so we store 1 in the binary array: 1
We divide the array by 2 using the algorithm:
0,2,3,4,5 => 0,1+5,3,4,5 => 0,6,1,4,5 => 0,6,1,2+5,5 => 0,6,1,7,2
And repeat:
the last digit of 0,6,1,7,2 is even, so we store a 0: 0,1 (notice we fill the binary string from right to left)
etc.
You end up with the binary representation.
EDIT:
Just to clarify the above: all I'm doing is the age-old algorithm:
int value = 12345;
while (value > 0)
{
    binaryArray.push(value & 1);
    value >>= 1; // divide by 2
}
except in your example we don't have an int but an array which represents a base-10 int ;^)
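To make that concrete, here is a rough sketch of the digit-array version of the same idea (in C#, just for illustration; the method names are made up):

using System;
using System.Collections.Generic;
using System.Linq;

class DigitsToBinary
{
    // Convert a base-10 digit array (most significant digit first)
    // to a base-2 digit array, using the repeated halving described above.
    static int[] ToBinaryDigits(int[] decimalDigits)
    {
        var digits = (int[])decimalDigits.Clone();
        var bits = new List<int>();                    // collected least significant bit first

        while (digits.Any(d => d != 0))
        {
            bits.Add(digits[digits.Length - 1] & 1);   // parity of the last digit = next bit

            int carry = 0;                             // halve the whole digit array
            for (int i = 0; i < digits.Length; i++)
            {
                int current = digits[i];
                digits[i] = current / 2 + carry;
                carry = (current & 1) == 1 ? 5 : 0;    // an odd digit passes 5 down to the next one
            }
        }

        bits.Reverse();                                // most significant bit first
        return bits.Count > 0 ? bits.ToArray() : new[] { 0 };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", ToBinaryDigits(new[] { 1, 2, 3, 4, 5 })));
        // prints 1,1,0,0,0,0,0,0,1,1,1,0,0,1  (12345 in binary)
    }
}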
One way would be to convert each digit in the decimal representation to its binary representation and then add the binary representations of all the digits:
5 = 101
40 = 101000
300 = 100101100
2000 = 11111010000
10000 = 10011100010000
101
101000
100101100
11111010000
+ 10011100010000
----------------
11000000111001
Proof of concept in C#:
Methods for converting to an array of binary digits, adding arrays and multiplying an array by ten:
private static byte[] GetBinary(int value) {
    int bit = 1, len = 1;
    while (bit * 2 <= value) {   // <= so exact powers of two get the full length
        bit <<= 1;
        len++;
    }
    byte[] result = new byte[len];
    for (int i = 0; value > 0; i++) {
        if (value >= bit) {
            value -= bit;
            result[i] = 1;
        }
        bit >>= 1;
    }
    return result;
}

private static byte[] Add(byte[] a, byte[] b) {
    byte[] result = new byte[Math.Max(a.Length, b.Length) + 1];
    int carry = 0;
    for (int i = 1; i <= result.Length; i++) {
        if (i <= a.Length) carry += a[a.Length - i];
        if (i <= b.Length) carry += b[b.Length - i];
        result[result.Length - i] = (byte)(carry & 1);
        carry >>= 1;
    }
    if (result[0] == 0) {
        byte[] shorter = new byte[result.Length - 1];
        Array.Copy(result, 1, shorter, 0, shorter.Length);
        result = shorter;
    }
    return result;
}

private static byte[] Mul2(byte[] a, int exp) {
    byte[] result = new byte[a.Length + exp];
    Array.Copy(a, result, a.Length);
    return result;
}

private static byte[] Mul10(byte[] a, int exp) {
    for (int i = 0; i < exp; i++) {
        a = Add(Mul2(a, 3), Mul2(a, 1));
    }
    return a;
}
Converting an array:
byte[] digits = { 1, 2, 3, 4, 5 };
byte[][] bin = new byte[digits.Length][];
int exp = 0;
for (int i = digits.Length - 1; i >= 0; i--) {
    bin[i] = Mul10(GetBinary(digits[i]), exp);
    exp++;
}
byte[] result = null;
foreach (byte[] digit in bin) {
    result = result == null ? digit : Add(result, digit);
}
// output array
Console.WriteLine(
    result.Aggregate(
        new StringBuilder(),
        (s, n) => s.Append(s.Length == 0 ? "" : ",").Append(n)
    ).ToString()
);
Output:
1,1,0,0,0,0,0,0,1,1,1,0,0,1
Edit:
I added methods for multiplying an array by powers of ten. Instead of multiplying the digit before converting it to a binary array, the multiplication has to be done on the array.
The main issue here is that you're going between bases which aren't multiples of one another, and thus there isn't a direct isolated mapping between input digits and output digits. You're probably going to have to start with your least significant digit, output as many least significant digits of the output as you can before you need to consult the next digit, and so on. That way you only need to have at most 2 of your input digits being examined at any given point in time.
You might find it advantageous in terms of processing order to store your numbers in reversed form (such that the least significant digits come first in the array).
