A problem from a programming competition... Digit Sums - algorithm

I need help solving problem N from this earlier competition:
Problem N: Digit Sums
Given 3 positive integers A, B and C,
find how many positive integers less
than or equal to A, when expressed in
base B, have digits which sum to C.
Input will consist of a series of
lines, each containing three integers,
A, B and C, 2 ≤ B ≤ 100, 1 ≤ A, C ≤
1,000,000,000. The numbers A, B and C
are given in base 10 and are separated
by one or more blanks. The input is
terminated by a line containing three
zeros.
Output will be the number of numbers,
for each input line (it must be given
in base 10).
Sample input
100 10 9
100 10 1
750000 2 2
1000000000 10 40
100000000 100 200
0 0 0
Sample output
10
3
189
45433800
666303
The relevant rules:
Read all input from the keyboard, i.e. use stdin, System.in, cin or equivalent. Input will be redirected from a file to form the input to your submission.
Write all output to the screen, i.e. use stdout, System.out, cout or equivalent. Do not write to stderr. Do NOT use, or even include, any module that allows direct manipulation of the screen, such as conio, Crt or anything similar. Output from your program is redirected to a file for later checking. Use of direct I/O means that such output is not redirected and hence cannot be checked. This could mean that a correct program is rejected!
Unless otherwise stated, all integers in the input will fit into a standard 32-bit computer word. Adjacent integers on a line will be separated by one or more spaces.
Of course, it's fair to say that I should learn more before trying to solve this, but I'd really appreciate it if someone here told me how it's done.
Thanks in advance, John.

Others have pointed out the trivial solution: iterate over all numbers from 1 to A. But this problem can actually be solved in nearly constant time: the recursion is only O(length of A) levels deep, which is O(log(A)).
The code provided is for base 10. Adapting it to an arbitrary base is trivial.
To reach the above time estimate, you need to add memoization to the recursion. Let me know if you have questions about that part.
Now, the recursive function itself. It is written in Java, but it should work in C#/C++ with little or no change. It's big, but mostly because of the comments where I try to clarify the algorithm.
// returns the count of numbers strictly less than 'num' with sum of digits 'sum'
// pay attention to the word 'strictly'
int count(int num, int sum) {
    // no numbers with negative sum of digits
    if (sum < 0) {
        return 0;
    }
    int result = 0;
    // imagine, 'num' == 1234
    // let's check numbers 1233, 1232, 1231, 1230 manually
    while (num % 10 > 0) {
        --num;
        // check if current number is good
        if (sumOfDigits(num) == sum) {
            // one more result
            ++result;
        }
    }
    if (num == 0) {
        // zero reached, no more numbers to check
        return result;
    }
    num /= 10;
    // Using example above (1234), now we're left with numbers
    // strictly less than 1230 to check (1..1229)
    // It means, any number less than 123 with arbitrary digit appended to the right
    // E.g., if this digit in the right (last digit) is 3,
    // then sum of the other digits must be "sum - 3"
    // and we need to add to result 'count(123, sum - 3)'
    // let's iterate over all possible values of last digit
    for (int digit = 0; digit < 10; ++digit) {
        result += count(num, sum - digit);
    }
    return result;
}
Helper function
// returns sum of digits, plain and simple
int sumOfDigits(int x) {
int result = 0;
while (x > 0) {
result += x % 10;
x /= 10;
}
return result;
}
Now, let's write a little tester
int A = 12345;
int C = 13;
// recursive solution
System.out.println(count(A + 1, C));
// brute-force solution
int total = 0;
for (int i = 1; i <= A; ++i) {
if (sumOfDigits(i) == C) {
++total;
}
}
System.out.println(total);
You can write a more comprehensive tester that checks all values of A, but overall the solution seems to be correct. (I tried several random A's and C's.)
Don't forget, you can't test the solution for A == 1000000000 without memoization: it'll run too long. But with memoization, you can test it even for A == 10^1000.
edit
Just as a proof of concept, here is a poor man's memoization (in Java; hashtables are declared differently in other languages). But if you want to learn something, it might be better to try to do it yourself.
// hold values here
private Map<String, Integer> mem = new HashMap<String, Integer>();
int count(int num, int sum) {
// no numbers with negative sum of digits
if (sum < 0) {
return 0;
}
String key = num + " " + sum;
if (mem.containsKey(key)) {
return mem.get(key);
}
// ...
// continue as above...
// ...
mem.put(key, result);
return result;
}

Here's the same memoized recursive solution that Rybak posted, but with a simpler implementation, in my humble opinion:
HashMap<String, Integer> cache = new HashMap<String, Integer>();
int count(int bound, int base, int sum) {
// No negative digit sums.
if (sum < 0)
return 0;
// Handle one digit case.
if (bound < base)
return (sum <= bound) ? 1 : 0;
String key = bound + " " + base + " " + sum; // include base so cached results from different bases cannot collide
if (cache.containsKey(key))
return cache.get(key);
int count = 0;
for (int digit = 0; digit < base; digit++)
count += count((bound - digit) / base, base, sum - digit);
cache.put(key, count);
return count;
}
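For comparison, here is a minimal Python sketch of the same last-digit recursion. The names count_upto and brute_force are mine, not from the answers above, and functools.lru_cache stands in for the hand-written HashMap (it keys on all three arguments, base included). It counts integers from 0 to bound, so it assumes the target sum is at least 1, as the problem statement guarantees.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_upto(bound: int, base: int, target: int) -> int:
    """Count integers n with 0 <= n <= bound whose base-`base` digits sum to `target`."""
    if target < 0:
        return 0
    if bound < base:
        # Single-digit numbers 0..bound: at most one of them equals target.
        return 1 if target <= bound else 0
    # Fix the last digit d; the remaining prefix m must satisfy base*m + d <= bound.
    return sum(
        count_upto((bound - d) // base, base, target - d)
        for d in range(base)
    )


def brute_force(bound: int, base: int, target: int) -> int:
    """Reference implementation: check every number from 1 to bound."""
    total = 0
    for n in range(1, bound + 1):
        s, v = 0, n
        while v:
            s += v % base
            v //= base
        total += s == target
    return total


if __name__ == "__main__":
    for a, b, c in [(100, 10, 9), (100, 10, 1), (750000, 2, 2)]:
        print(a, b, c, count_upto(a, b, c))
```

The first three sample lines from the question give 10, 3 and 189, which is a cheap sanity check before trusting it on the larger inputs.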

This is not the complete solution (no input parsing). To get the number in base B, repeatedly take the modulo B, and then divide by B until the result is 0. This effectively computes the base-B digit from the right, and then shifts the number right.
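int A, B, C; // from input
int count = 0;
for (int x = 1; x <= A; x++) // the problem asks for numbers less than or equal to A
{
    int sumDigits = 0;
    int v = x;
    while (v != 0) {
        sumDigits += (v % B); // rightmost base-B digit
        v /= B;               // drop that digit
    }
    if (sumDigits == C)
        count++;
}
cout << count << endl; // the required output is the count, not the numbers themselves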
This is a brute force approach. It may be possible to compute this quicker by determining which sets of base B digits add up to C, arranging these in all permutations that are less than A, and then working backwards from that to create the original number.

Yum.
Try this:
int number, digitSum, resultCounter = 0;
for(int i=1; i<=A; i++)
{
    number = i; //to avoid screwing up our counter
    digitSum = 0;
    while(number > 1)
    {
        //this is the next "digit" of the number as it would be in base B;
        //works with any base including 10.
        digitSum += (number % B);
        //remove this digit from the number by dividing by the base, rinse, repeat
        number /= B;
    }
    //what is left is 0 or 1, i.e. the final digit (if any)
    digitSum += number;
    //Does the sum match?
    if(digitSum == C)
        resultCounter++;
}
That's your basic algorithm for one line. Now you wrap this in another For loop for each input line you received, preceded by the input collection phase itself. This process can be simplified, but I don't feel like coding your entire answer to see if my algorithm works, and this looks right whereas the simpler tricks are harder to pass by inspection.
The way this works is by repeatedly taking the remainder modulo the base and then dividing by the base. Simple example, 1234 in base 10:
1234 % 10 = 4
1234 / 10 = 123 //integer division truncates any fraction
123 % 10 = 3 //sum is 7
123 / 10 = 12
12 % 10 = 2 //sum is 9
12 / 10 = 1 //end condition, add this and the sum is 10
A harder example to figure out by inspection would be the same number in base 12:
1234 % 12 = 10 //you can call it "A" like in hex, but we need a sum anyway
1234 / 12 = 102
102 % 12 = 6 // sum 16
102/12 = 8
8 % 12 = 8 //sum 24
8 / 12 = 0 //end condition, sum still 24.
So 1234 in base 12 would be written 86A. Check the math:
8*12^2 + 6*12 + 10 = 1152 + 72 + 10 = 1234
Have fun wrapping the rest of the code around this.
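The two worked examples above can be checked with a few lines of Python. This is only a sketch of the remainder-then-divide loop; digit_sum is a name I made up for illustration.

```python
def digit_sum(n: int, base: int) -> int:
    """Sum of the digits of a non-negative integer n written in the given base."""
    total = 0
    while n > 0:
        n, digit = divmod(n, base)  # digit is the rightmost base-`base` digit
        total += digit
    return total


if __name__ == "__main__":
    print(digit_sum(1234, 10))  # 1 + 2 + 3 + 4
    print(digit_sum(1234, 12))  # 8 + 6 + 10
```

Both results match the sums worked out by hand: 10 in base 10 and 24 in base 12.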

Related

Algorithm: Print the nth consecutive prime number

I'm currently learning algorithms and have come across a code challenge from an interviewer about a function that prints out the first n prime numbers in order. So it would be something like:
getPrimeNth(10) will print 1 2 3 5 7 11 13 17 19 23
but most of the ones I found will print out just the nth number, so 23, or just ones that will detect if it is prime numbers. I am going to risk getting downvoted for this but I can't seem to find the right solution for this.
One is not a prime, for starters.
Second, your question needs more clarification....
Primes are not challenging - there is a lot of information available.
The simplest solution for you would be to simply test every number by modding up to the square root of that number. If it mods to zero, it is not prime. Store the primes in an array one after another. I'm not going to straight up give you the answer, but read more about The Sieve of Eratosthenes - which is highly inefficient IMO, but where you must start.
Therefore, the first prime would be in slot 0, second in slot 1, etc, etc.
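A minimal Python sketch of that suggestion, under my own naming (first_primes is not from the question): each candidate is trial-divided only by the primes already stored, stopping once the divisor exceeds the candidate's square root.

```python
def first_primes(count: int) -> list[int]:
    """Return the first `count` primes using trial division up to sqrt(candidate)."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        is_prime = True
        for p in primes:
            if p * p > candidate:  # no divisor found up to the square root
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)  # slot 0 holds the first prime, and so on
        candidate += 1
    return primes


if __name__ == "__main__":
    print(*first_primes(10))
```

Dividing only by stored primes is a small refinement over testing every integer up to the square root; either works for an interview-sized n.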
The code below finds and stores the first N primes (N is defined by the macro). It just calls the utility function is_prime(), which checks whether a given number is prime or not.
#define TRUE 1
#define FALSE 0
#define N 10
typedef short int bool;
bool is_prime(int num)
{
int i = 2;
for (i = 2; i <= (num - 1); i++) {
if ((num % i) == 0) {
return FALSE;
}
}
return TRUE;
}
int main()
{
int primes[N];
int num_primes = 0;
int num = 2; /* start with 2 */
while (num_primes != N) {
if (is_prime (num) == TRUE) {
primes[num_primes] = num;
num_primes++;
}
num++;
}
int i = 0;
for (i = 0; i < N; i++) {
printf ("%d ", primes[i]);
}
printf ("\n");
}
Output: 2 3 5 7 11 13 17 19 23 29

Finding the number of digits of an integer

What is the best method to find the number of digits of a positive integer?
I have found these three basic methods:
conversion to string
String s = new Integer(t).toString();
int len = s.length();
for loop
for(long long int temp = number; temp >= 1;)
{
temp/=10;
decimalPlaces++;
}
logarithmic calculation
digits = floor( log10( number ) ) + 1;
where you can calculate log10(x) = ln(x) / ln(10) in most languages.
First I thought the string method is the dirtiest one but the more I think about it the more I think it's the fastest way. Or is it?
There's always this method:
n = 1;
if ( i >= 100000000 ) { n += 8; i /= 100000000; }
if ( i >= 10000 ) { n += 4; i /= 10000; }
if ( i >= 100 ) { n += 2; i /= 100; }
if ( i >= 10 ) { n += 1; }
Well the correct answer would be to measure it - but you should be able to make a guess about the number of CPU steps involved in converting strings and going through them looking for an end marker
Then think how many FPU operations/s your processor can do and how easy it is to calculate a single log.
edit: wasting some more time on a monday morning :-)
String s = new Integer(t).toString();
int len = s.length();
One of the problems with high level languages is guessing how much work the system is doing behind the scenes of an apparently simple statement. Mandatory Joel link
This statement involves allocating memory for a string, and possibly a couple of temporary copies of a string. It must parse the integer and copy the digits of it into a string, possibly having to reallocate and move the existing memory if the number is large. It might have to check a bunch of locale settings to decide if your country uses "," or ".", it might have to do a bunch of unicode conversions.
Then finding the length has to scan the entire string, again considering unicode and any locale-specific settings, such as whether you are in a right-to-left language.
Alternatively:
digits = floor( log10( number ) ) + 1;
Just because this would be harder for you to do on paper doesn't mean it's hard for a computer! In fact a good rule in high performance computing seems to have been - if something is hard for a human (fluid dynamics, 3d rendering) it's easy for a computer, and if it's easy for a human (face recognition, detecting a voice in a noisy room) it's hard for a computer!
You can generally assume that the built-in maths functions (log, sin, cos, etc.) have been an important part of computer design for 50 years. So even if they don't map directly onto a hardware function in the FPU, you can bet that the alternative implementation is pretty efficient.
I don't know, and the answer may well be different depending on how your individual language is implemented.
So, stress test it! Implement all three solutions. Run them on 1 through 1,000,000 (or some other huge set of numbers that's representative of the numbers the solution will be running against) and time how long each of them takes.
Pit your solutions against one another and let them fight it out. Like intellectual gladiators. Three algorithms enter! One algorithm leaves!
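To make the "let them fight it out" idea concrete, here is a rough Python harness. It is only a sketch: the function names are mine, the default range is smaller than 1,000,000 to keep the run short, and timings will of course differ by language and machine.

```python
import math
import timeit


def digits_str(n: int) -> int:
    return len(str(n))


def digits_loop(n: int) -> int:
    count = 1
    while n >= 10:
        n //= 10
        count += 1
    return count


def digits_log(n: int) -> int:
    # Floating-point based; may be off by one for some large n.
    return math.floor(math.log10(n)) + 1


def benchmark(limit: int = 100_000) -> dict[str, float]:
    """Time each method once over 1..limit and return seconds per method."""
    numbers = range(1, limit + 1)
    results = {}
    for fn in (digits_str, digits_loop, digits_log):
        results[fn.__name__] = timeit.timeit(
            lambda: [fn(n) for n in numbers], number=1
        )
    return results


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.3f}s")
```

Use a range that is representative of your real inputs; a ranking measured on small numbers says little about 18-digit ones.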
Test conditions
Decimal numeral system
Positive integers
Up to 10 digits
Language: ActionScript 3
Results
digits: [1,10],
no. of runs: 1,000,000
random sample: 8777509,40442298,477894,329950,513,91751410,313,3159,131309,2
result: 7,8,6,6,3,8,3,4,6,1
CONVERSION TO STRING: 724ms
LOGARITMIC CALCULATION: 349ms
DIV 10 ITERATION: 229ms
MANUAL CONDITIONING: 136ms
Note: Author refrains from making any conclusions for numbers with more than 10 digits.
Script
package {
import flash.display.MovieClip;
import flash.utils.getTimer;
/**
* @author Daniel
*/
public class Digits extends MovieClip {
private const NUMBERS : uint = 1000000;
private const DIGITS : uint = 10;
private var numbers : Array;
private var digits : Array;
public function Digits() {
// ************* NUMBERS *************
numbers = [];
for (var i : int = 0; i < NUMBERS; i++) {
var number : Number = Math.floor(Math.pow(10, Math.random()*DIGITS));
numbers.push(number);
}
trace('Max digits: ' + DIGITS + ', count of numbers: ' + NUMBERS);
trace('sample: ' + numbers.slice(0, 10));
// ************* CONVERSION TO STRING *************
digits = [];
var time : Number = getTimer();
for (var i : int = 0; i < numbers.length; i++) {
digits.push(String(numbers[i]).length);
}
trace('\nCONVERSION TO STRING - time: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
// ************* LOGARITMIC CALCULATION *************
digits = [];
time = getTimer();
for (var i : int = 0; i < numbers.length; i++) {
digits.push(Math.floor( Math.log( numbers[i] ) / Math.log(10) ) + 1);
}
trace('\nLOGARITMIC CALCULATION - time: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
// ************* DIV 10 ITERATION *************
digits = [];
time = getTimer();
var digit : uint = 0;
for (var i : int = 0; i < numbers.length; i++) {
digit = 0;
for(var temp : Number = numbers[i]; temp >= 1;)
{
temp/=10;
digit++;
}
digits.push(digit);
}
trace('\nDIV 10 ITERATION - time: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
// ************* MANUAL CONDITIONING *************
digits = [];
time = getTimer();
var digit : uint;
for (var i : int = 0; i < numbers.length; i++) {
var number : Number = numbers[i];
if (number < 10) digit = 1;
else if (number < 100) digit = 2;
else if (number < 1000) digit = 3;
else if (number < 10000) digit = 4;
else if (number < 100000) digit = 5;
else if (number < 1000000) digit = 6;
else if (number < 10000000) digit = 7;
else if (number < 100000000) digit = 8;
else if (number < 1000000000) digit = 9;
else if (number < 10000000000) digit = 10;
digits.push(digit);
}
trace('\nMANUAL CONDITIONING: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
}
}
}
This algorithm might also be good, assuming that:
The number is an integer and binary encoded (so the << operation is cheap)
We don't know the number's bounds
var num = 123456789L;
var len = 0;
var tmp = 1L;
while(tmp <= num) // <= so that exact powers of 10 are counted correctly
{
    len++;
    tmp = (tmp << 3) + (tmp << 1); // tmp * 10
}
This algorithm should have speed comparable to the for-loop (2) provided, but may be a bit faster because it uses two bit-shifts and an addition instead of a division.
As for the Log10 algorithm, it will give you only an approximate answer (close to the real one, but still approximate), since the analytic formula for computing the Log function is an infinite series and can't be evaluated exactly (Wiki).
Use the simplest solution in whatever programming language you're using. I can't think of a case where counting digits in an integer would be the bottleneck in any (useful) program.
C, C++:
char buffer[32];
int length = sprintf(buffer, "%ld", (long)123456789);
Haskell:
len = (length . show) 123456789
JavaScript:
length = String(123456789).length;
PHP:
$length = strlen(123456789);
Visual Basic (untested):
length = Len(str(123456789)) - 1
conversion to string: This will have to iterate through each digit, find the character that maps to the current digit, add a character to a collection of characters. Then get the length of the resulting String object. Will run in O(n) for n=#digits.
for-loop: will perform 2 mathematical operation: dividing the number by 10 and incrementing a counter. Will run in O(n) for n=#digits.
logarithmic: Will call log10 and floor, and add 1. Looks like O(1) but I'm not really sure how fast the log10 or floor functions are. My knowledge of this sort of things has atrophied with lack of use so there could be hidden complexity in these functions.
So I guess it comes down to: is looking up digit mappings faster than multiple mathematical operations or whatever is happening in log10? The answer will probably vary. There could be platforms where the character mapping is faster, and others where doing the calculations is faster. Also keep in mind that the first method will create a new String object that only exists for the purpose of getting the length. This will probably use more memory than the other two methods, but it may or may not matter.
You can obviously eliminate method 1 from the competition, because the itoa/toString algorithm it uses would be similar to method 2.
Method 3's speed depends on whether the code is being compiled for a system whose instruction set includes log base 10.
For very large integers, the log method is much faster. For instance, with a 2491327 digit number (the 11920928th Fibonacci number, if you care), Python takes several minutes to execute the divide-by-10 algorithm, and milliseconds to execute 1+floor(log(n,10)).
import math
def numdigits(n):
    return ( int(math.floor(math.log10(n))) + 1 )
Regarding the three methods you propose for "determining the number of digits necessary to represent a given number in a given base", I don't like any of them, actually; I prefer the method I give below instead.
Re your method #1 (strings): Anything involving converting back-and-forth between strings and numbers is usually very slow.
Re your method #2 (temp/=10): This is fatally flawed because it assumes that x/10 always means "x divided by 10". But in many programming languages (eg: C, C++), if "x" is an integer type, then "x/10" means "integer division", which isn't the same thing as floating-point division, and it introduces round-off errors at every iteration, and they accumulate in a recursive formula such as your solution #2 uses.
Re your method #3 (logs): it's buggy for large numbers (at least in C, and probably other languages as well), because floating-point data types tend not to be as precise as 64-bit integers.
Hence I dislike all 3 of those methods: #1 works but is slow, #2 is broken, and #3 is buggy for large numbers. Instead, I prefer this, which works for numbers from 0 up to about 18.44 quintillion:
unsigned NumberOfDigits (uint64_t Number, unsigned Base)
{
unsigned Digits = 1;
uint64_t Power = 1;
while ( Number / Power >= Base )
{
++Digits;
Power *= Base;
}
return Digits;
}
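If you like the speed of the log method but share the precision worry, one possible compromise is to use log10 only as a first guess and then correct it with exact integer comparisons. This is a Python sketch under that assumption, not something proposed in the answers above; digits_checked is a made-up name.

```python
import math


def digits_checked(n: int) -> int:
    """Decimal digit count of n >= 1: log10 estimate, corrected with exact integer math."""
    d = math.floor(math.log10(n)) + 1  # estimate; may be off near powers of 10
    while d > 1 and 10 ** (d - 1) > n:  # estimate too high
        d -= 1
    while 10 ** d <= n:  # estimate too low
        d += 1
    return d


if __name__ == "__main__":
    for n in (9, 10, 999_999_999_999_999, 10 ** 18):
        print(n, digits_checked(n))
```

The two correction loops normally run zero or one time, so the cost stays close to that of the plain log call while the result is exact.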
Keep it simple:
long long int a = 223452355415634664;
int x;
for (x = 1; a >= 10; x++)
{
a = a / 10;
}
printf("%d", x);
You can use a recursive solution instead of a loop, though it is quite similar:
@tailrec
def digits (i: Long, carry: Int=1) : Int = if (i < 10) carry else digits (i/10, carry+1)
digits (8345012978643L)
With longs, the picture might change - measure small and long numbers independently against different algorithms, and pick the appropriate one, depending on your typical input. :)
Of course nothing beats a switch:
switch (x) {
case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7: case 8: case 9: return 1;
case 10: case 11: // ...
case 99: return 2;
case 100: // you get the point :)
default: return 10; // switch only over int
}
except a plain-o-array:
int [] size = {1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,... };
int x = 234561798;
return size [x];
Some people will tell you to optimize the code-size, but yaknow, premature optimization ...
log(x,n)-mod(log(x,n),1)+1
Where x is the base and n is the number.
Here is the measurement in Swift 4.
Algorithms code:
extension Int {
var numberOfDigits0: Int {
var currentNumber = self
var n = 1
if (currentNumber >= 100000000) {
n += 8
currentNumber /= 100000000
}
if (currentNumber >= 10000) {
n += 4
currentNumber /= 10000
}
if (currentNumber >= 100) {
n += 2
currentNumber /= 100
}
if (currentNumber >= 10) {
n += 1
}
return n
}
var numberOfDigits1: Int {
return String(self).count
}
var numberOfDigits2: Int {
var n = 1
var currentNumber = self
while currentNumber > 9 {
n += 1
currentNumber /= 10
}
return n
}
}
Measurement code:
var timeInterval0 = Date()
for i in 0...10000 {
i.numberOfDigits0
}
print("timeInterval0: \(Date().timeIntervalSince(timeInterval0))")
var timeInterval1 = Date()
for i in 0...10000 {
i.numberOfDigits1
}
print("timeInterval1: \(Date().timeIntervalSince(timeInterval1))")
var timeInterval2 = Date()
for i in 0...10000 {
i.numberOfDigits2
}
print("timeInterval2: \(Date().timeIntervalSince(timeInterval2))")
Output
timeInterval0: 1.92149806022644
timeInterval1: 0.557608008384705
timeInterval2: 2.83262193202972
On this measurement basis String conversion is the best option for the Swift language.
I was curious after seeing @daniel.sedlacek's results, so I did some testing using Swift for numbers having more than 10 digits. I ran the following script in the playground.
let base = [Double(100090000000), Double(100050000), Double(100050000), Double(100000200)]
var rar = [Double]()
for i in 1...10 {
for d in base {
let v = d*Double(arc4random_uniform(UInt32(1000000000)))
rar.append(v*Double(arc4random_uniform(UInt32(1000000000))))
rar.append(Double(1)*pow(1,Double(i)))
}
}
print(rar)
var timeInterval = NSDate().timeIntervalSince1970
for d in rar {
floor(log10(d))
}
var newTimeInterval = NSDate().timeIntervalSince1970
print(newTimeInterval-timeInterval)
timeInterval = NSDate().timeIntervalSince1970
for d in rar {
var c = d
while c > 10 {
c = c/10
}
}
newTimeInterval = NSDate().timeIntervalSince1970
print(newTimeInterval-timeInterval)
Results of 80 elements
0.105069875717163 for floor(log10(x))
0.867973804473877 for div 10 iterations
Adding one more approach to many of the already mentioned approaches.
The idea is to use binarySearch on an array containing the range of integers based on the digits of the int data type.
The signature of Java Arrays class binarySearch is :
binarySearch(dataType[] array, dataType key) which returns the index of the search key, if it is contained in the array; otherwise, (-(insertion point) – 1).
The insertion point is defined as the point at which the key would be inserted into the array.
Below is the implementation:
static int [] digits = {9,99,999,9999,99999,999999,9999999,99999999,999999999,Integer.MAX_VALUE};
static int digitsCounter(int N)
{
int digitCount = Arrays.binarySearch(digits , N<0 ? -N:N);
return 1 + (digitCount < 0 ? ~digitCount : digitCount);
}
Please note that the above approach only works for Integer.MIN_VALUE < N <= Integer.MAX_VALUE (negating Integer.MIN_VALUE overflows), but it can easily be extended to the Long data type by adding more values to the digits array.
For example,
I) for N = 555, digitCount = Arrays.binarySearch(digits , 555) returns -3 (-(2)-1), as 555 is not present in the array but would be inserted at index 2, between 99 and 999, like [9, 99, 555, 999].
As the index we got is negative, we need to take the bitwise complement of the result, which gives 2.
Finally, we add 1 to the result to get the actual number of digits in the number N, here 3.
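The same lookup can be sketched in Python with the standard bisect module, which returns the insertion point directly, so no complement step is needed. THRESHOLDS and digits_bisect are illustrative names of mine, and the table here covers up to 10 decimal digits.

```python
from bisect import bisect_left

# Largest value with 1, 2, ..., 10 decimal digits.
THRESHOLDS = [10 ** k - 1 for k in range(1, 11)]


def digits_bisect(n: int) -> int:
    """Decimal digit count of n, valid for abs(n) <= 9_999_999_999."""
    # bisect_left gives the index of the first threshold >= abs(n).
    return bisect_left(THRESHOLDS, abs(n)) + 1


if __name__ == "__main__":
    for n in (0, 9, 10, 555, -2_147_483_648):
        print(n, digits_bisect(n))
```

For 555 the insertion point is 2 (between 99 and 999), giving 3 digits, the same result as the walk-through above.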
In Swift 5.x, you can get the number of digits in an integer as follows:
Convert to a string and then count the number of characters in the string
let nums = [1, 7892, 78, 92, 90]
for i in nums {
    let ch = String(describing: i)
    print(ch.count)
}
Calculate the number of digits in the integer using a loop
for i in nums {
    var digitCount = 0 // reset for each number
    var tmp = i
    while tmp >= 1 {
        tmp /= 10
        digitCount += 1
    }
    print(digitCount)
}
let numDigits num =
let num = abs(num)
let rec numDigitsInner num =
match num with
| num when num < 10 -> 1
| _ -> 1 + numDigitsInner (num / 10)
numDigitsInner num
F# Version, without casting to a string.

How to find a binary logarithm very fast? (O(1) at best)

Is there any very fast method to find a binary logarithm of an integer number? For example, given a number
x=52656145834278593348959013841835216159447547700274555627155488768 such algorithm must find y=log(x,2) which is 215. x is always a power of 2.
The problem seems to be really simple. All that is required is to find the position of the most significant 1 bit. There is a well-known method, FloorLog, but it is not very fast, especially for very long multi-word integers.
What is the fastest method?
A quick hack: Most floating-point number representations automatically normalise values, meaning that they effectively perform the loop Christoffer Hammarström mentioned in hardware. So simply converting from an integer to FP and extracting the exponent should do the trick, provided the numbers are within the FP representation's exponent range! (In your case, your integer input requires multiple machine words, so multiple "shifts" will need to be performed in the conversion.)
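In Python this hack can be sketched with math.frexp, which splits a float into mantissa and exponent. The function name is mine, and the sketch assumes n is an exact power of two below 2**1024 (larger values do not fit in a double, and other values may round up to the next power of two during conversion).

```python
import math


def log2_via_float(n: int) -> int:
    """log2 of n, assuming n is an exact power of two with 1 <= n < 2**1024."""
    mantissa, exponent = math.frexp(float(n))  # float(n) == mantissa * 2**exponent
    # frexp normalises the mantissa to [0.5, 1), so a power of two has mantissa 0.5.
    return exponent - 1


if __name__ == "__main__":
    print(log2_via_float(1 << 215))
```

For the 65-digit example in the question, which is 2^215, this yields 215.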
If the integers are stored in a uint32_t a[], then my obvious solution would be as follows:
Run a linear search over a[] to find the highest-valued non-zero uint32_t value a[i] in a[] (test using uint64_t for that search if your machine has native uint64_t support)
Apply the bit twiddling hacks to find the binary log b of the uint32_t value a[i] you found in step 1.
Evaluate 32*i+b.
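The three steps above can be sketched as follows in Python, with the multi-word integer modelled as a list of 32-bit words, least significant word first. That layout and the name log2_limbs are assumptions for illustration; int.bit_length stands in for the bit twiddling hack.

```python
def log2_limbs(limbs: list[int]) -> int:
    """floor(log2) of a non-zero integer stored as 32-bit words, least significant first."""
    # Step 1: linear search from the top for the highest non-zero word.
    for i in range(len(limbs) - 1, -1, -1):
        if limbs[i] != 0:
            # Step 2: binary log b of that word; step 3: 32*i + b.
            return 32 * i + limbs[i].bit_length() - 1
    raise ValueError("log2 of zero is undefined")


if __name__ == "__main__":
    # 2**215 = 2**(32*6 + 23): six zero words, then bit 23 set in word 6.
    print(log2_limbs([0] * 6 + [1 << 23]))
```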
The answer is implementation or language dependent. Any implementation can store the number of significant bits along with the data, as it is often useful. If it must be calculated, then find the most significant word/limb and the most significant bit in that word.
If you're using fixed-width integers then the other answers already have you pretty-well covered.
If you're using arbitrarily large integers, like int in Python or BigInteger in Java, then you can take advantage of the fact that their variable-size representation uses an underlying array, so the base-2 logarithm can be computed easily and quickly in O(1) time using the length of the underlying array. The base-2 logarithm of a power of 2 is simply one less than the number of bits required to represent the number.
So when n is an integer power of 2:
In Python, you can write n.bit_length() - 1 (docs).
In Java, you can write n.bitLength() - 1 (docs).
You can create an array of logarithms beforehand. This precomputes floor(log2(k)) for every k below N:
#define N 100000
int naj[N];
naj[1] = 0;
naj[2] = 1;
for ( int i = 3; i < N; i++ ) // i < N: valid indices of naj are 0..N-1
{
    naj[i] = naj[i-1];
    if ( (1 << (naj[i]+1)) <= i )
        naj[i]++;
}
The array naj holds your logarithmic values, where naj[k] = floor(log2(k)).
The logarithm is base two.
This uses binary search for finding the closest power of 2.
public static int binLog(int x,boolean shouldRoundResult){
// assuming 32-bit integer
int lo=0;
int hi=31;
int rangeDelta=hi-lo;
int expGuess=0;
int guess;
while(rangeDelta>1){
expGuess=(lo+hi)/2; // or (loGuess+hiGuess)>>1
guess=1<<expGuess;
if(guess<x){
lo=expGuess;
} else if(guess>x){
hi=expGuess;
} else {
lo=hi=expGuess;
}
rangeDelta=hi-lo;
}
if(shouldRoundResult && hi>lo){
int loGuess=1<<lo;
int hiGuess=1<<hi;
int loDelta=Math.abs(x-loGuess);
int hiDelta=Math.abs(hiGuess-x);
if(loDelta<hiDelta)
expGuess=lo;
else
expGuess=hi;
} else {
expGuess=lo;
}
int result=expGuess;
return result;
}
The best option off the top of my head would be an O(log(log n)) approach, using binary search. Here is an example for a 64-bit ( <= 2^63 - 1 ) number (in C++):
int log2(int64_t num) {
    int res = 0;
    for(int i = 32; i > 0; i >>= 1) { // halve the step: 32, 16, 8, 4, 2, 1
        res += i;
        if(((1ULL << res) - 1) & num) // some bit below position res is set
            res -= i;
    }
    return res;
}
This algorithm will basically provide me with the highest number res such that ((2^res - 1) & num) == 0. Of course, for any number, you can work it out in a similar manner:
int log2_better(int64_t num) {
    int res = 0;
    for(int i = 32; i > 0; i >>= 1) {
        if( (1ULL << (res + i)) <= (uint64_t)num )
            res += i;
    }
    return res;
}
Note that this method relies on the fact that the "bitshift" operation is more or less O(1). If this is not the case, you would have to precompute either all the powers of 2, or the numbers of form 2^2^i (2^1, 2^2, 2^4, 2^8, etc.) and do some multiplications(which in this case aren't O(1)) anymore.
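Here is the same binary search written as a short Python sketch, mainly to make the halving steps explicit. floor_log2_64 is a hypothetical name, and the sketch assumes 1 <= num < 2**64.

```python
def floor_log2_64(num: int) -> int:
    """floor(log2(num)) for 1 <= num < 2**64, using six halving steps."""
    if not 1 <= num < 1 << 64:
        raise ValueError("num must be in [1, 2**64)")
    res = 0
    for step in (32, 16, 8, 4, 2, 1):
        if num >> (res + step):  # true exactly when num >= 2**(res + step)
            res += step
    return res


if __name__ == "__main__":
    print(floor_log2_64(1 << 40), floor_log2_64(5))
```

Six shifts and comparisons replace up to 63 iterations of the naive loop, which is where the O(log(log n)) comes from.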
The example in the OP is an integer string of 65 characters, which is not representable by an INT64 or even an INT128. It is still very easy to get Log(2,x) from this string by converting it to a double-precision number. This at least gives you easy access to integers up to 2^1023.
Below you find some form of pseudocode
# 1. read the string
string="52656145834278593348959013841835216159447547700274555627155488768"
# 2. extract the length of the string
l=length(string) # l = 65
# 3. read the first min(l,17) digits in a float
float=to_float(string(1: min(17,l) ))
# 4. multiply with the correct power of 10
float = float * 10^(l-min(17,l) ) # float = 5.2656145834278593E64
# 5. Take the log2 of this number and round to the nearest integer
log2 = Round( Log(float,2) ) # 215
Note:
some computer languages can convert arbitrary strings into a double precision number. So steps 2,3 and 4 could be replaced by x=to_float(string)
Step 5 could be done quicker by just reading the double-precision exponent (bits 53 up to and including 63) and subtracting 1023 from it.
Quick example code: If you have awk you can quickly test this algorithm.
The following code creates the first 300 powers of two:
awk 'BEGIN{for(n=0;n<300; n++) print 2^n}'
The following reads the input and does the above algorithm:
awk '{ l=length($0); m = (l > 17 ? 17 : l)
x = substr($0,1,m) * 10^(l-m)
print log(x)/log(2)
}'
So the following bash-command is a convoluted way to create a consecutive list of numbers from 0 to 299:
$ awk 'BEGIN{for(n=0;n<300; n++) print 2^n}' | awk '{ l=length($0); m = (l > 17 ? 17 : l); x = substr($0,1,m) * 10^(l-m); print log(x)/log(2) }'
0
1
2
...
299
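For readers without awk at hand, the same steps can be sketched in Python. log2_from_decimal is my own name for it, and like the awk version it assumes the input is the decimal string of a power of two small enough to fit in a double.

```python
import math


def log2_from_decimal(text: str) -> int:
    """log2 of a power of two given as a decimal string, via a double approximation."""
    digits = len(text)
    lead = min(17, digits)  # a double holds about 17 significant digits
    approx = float(text[:lead]) * 10.0 ** (digits - lead)
    return round(math.log2(approx))


if __name__ == "__main__":
    print([log2_from_decimal(str(2 ** n)) for n in (0, 1, 215, 299)])
```

The rounding step absorbs the small error from truncating to 17 digits, which is why this only makes sense when the input is known to be a power of two.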

Multiplication of very long integers

Is there an algorithm for accurately multiplying two arbitrarily long integers together? The language I am working with is limited to 64-bit unsigned integer length (maximum integer size of 18446744073709551615). Realistically, I would like to be able to do this by breaking up each number, processing them somehow using the unsigned 64-bit integers, and then being able to put them back together in to a string (which would solve the issue of multiplied result storage).
Any ideas?
Most languages have functions or libraries that do this, usually called a Bignum library (GMP is a good one.)
If you want to do it yourself, I would do it the same way that people do long multiplication on paper. To do this you could either work with strings containing the number, or do it in binary using bitwise operations.
Example:
   45
  x67
  ---
  315
+2700
-----
 3015
Or in binary:
   101
  x101
  ----
   101
  0000
+10100
------
 11001
Edit: After doing it in binary I realized that it would be much simpler (and faster of course) to code using bitwise operations instead of strings containing the base-10 numbers. I've edited my binary multiplying example to show a pattern: for each 1-bit in the bottom number, add the top number, bit-shifted left the position of the 1-bit times to a variable. At the end, that variable will contain the product.
To store the product, you'll have to have two 64-bit numbers and imagine one of them being the first 64 bits and the other one the second 64 bits of the product. You'll have to write code that carries the addition from bit 63 of the second number to bit 0 of the first number.
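The shift-and-add pattern described above can be sketched in a few lines of Python. Python integers are already arbitrary precision, so this only illustrates the pattern and not the two-word storage; shift_add_multiply is an illustrative name and the inputs are assumed non-negative.

```python
def shift_add_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers using only shifts and additions."""
    product = 0
    shift = 0
    while b:
        if b & 1:  # this bit of the bottom number is 1
            product += a << shift  # add the top number, shifted to this bit position
        b >>= 1
        shift += 1
    return product


if __name__ == "__main__":
    print(shift_add_multiply(0b101, 0b101))  # the binary example: 5 * 5
    print(shift_add_multiply(45, 67))
```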
If you can't use an existing bignum library like GMP, check out Wikipedia's article on binary multiplication with computers. There are a number of good, efficient algorithms for this.
The simplest way would be to use the schoolbook mechanism, splitting your arbitrarily sized numbers into chunks of 32 bits each.
Given A B C D * E F G H (each chunk 32 bits, for a total of 128 bits per operand)
You need an output array 8 dwords wide (the product of two 4-dword numbers always fits in 8 dwords).
Set out[0..7] to 0
You'd start by doing: H * D + out[7] => 64-bit result.
Store the low 32 bits in out[7] and take the high 32 bits as carry
Next: (H * C) + out[6] + carry
Again, store the low 32 bits in out[6], use the high 32 bits as carry
After doing H*A + out[4] + carry, add the remaining carry into out[3] and keep going until you have no carry.
Then repeat with G, F, E.
For G, you'd start at out[6] instead of out[7], and so forth.
Finally, walk through and convert the large integer into digits (which will require a "divide large number by a single word" routine)
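A rough Python sketch of the steps above, assuming the numbers are stored as lists of 32-bit limbs with the most significant limb first (as in A B C D). The names `BASE`, `multiply_limbs` and `to_decimal` are invented for this example; `to_decimal` is the "divide large number by a single word" routine, dividing by 10 repeatedly.

```python
BASE = 1 << 32  # each limb holds 32 bits

def multiply_limbs(x, y):
    """Schoolbook product of two limb lists (most significant limb first)."""
    out = [0] * (len(x) + len(y))
    for j in range(len(y) - 1, -1, -1):
        carry = 0
        pos = len(out) - 1 - (len(y) - 1 - j)   # each row starts one limb further left
        for i in range(len(x) - 1, -1, -1):
            total = x[i] * y[j] + out[pos] + carry   # never exceeds 2**64 - 1
            out[pos] = total % BASE                  # low 32 bits stay here
            carry = total // BASE                    # high 32 bits move on
            pos -= 1
        while carry:                                 # propagate what is left
            total = out[pos] + carry
            out[pos] = total % BASE
            carry = total // BASE
            pos -= 1
    return out

def to_decimal(limbs):
    """Convert a limb list to a decimal string by repeated division by 10."""
    limbs = list(limbs)
    digits = []
    while any(limbs):
        remainder = 0
        for i in range(len(limbs)):                  # long division, limb by limb
            cur = remainder * BASE + limbs[i]
            limbs[i] = cur // 10
            remainder = cur % 10
        digits.append(str(remainder))
    return "".join(reversed(digits)) or "0"
```

In a language with only 64-bit unsigned integers the same loops apply unchanged, because every intermediate `total` fits in 64 bits.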
Yes, you do it using a datatype that is effectively a string of digits (just like a normal 'string' is a string of characters). How you do this is highly language-dependent. For instance, Java uses BigInteger for arbitrarily large integers. What language are you using?
This is often given as a homework assignment. The algorithm you learned in grade school will work. Use a library (several are mentioned in other posts) if you need this for a real application.
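Here is my code in C, using the good old schoolbook multiplication method:
#include <stdlib.h>
#include <string.h>

/* Convert a digit character to its integer value. */
static int get_int(char ch) { return ch - '0'; }

/* Reverse a string in place (strrev is not part of standard C). */
static void reverse(char *s) {
    size_t i, n = strlen(s);
    for (i = 0; i < n / 2; i++) {
        char t = s[i];
        s[i] = s[n - 1 - i];
        s[n - 1 - i] = t;
    }
}

/* Multiply two decimal digit strings; the caller must free() the result.
   s1 and s2 must be distinct, writable buffers: they are reversed while
   working and restored before returning. */
char *multiply(char s1[], char s2[]) {
    int l1 = strlen(s1);
    int l2 = strlen(s2);
    int i, j, k = 0, c = 0;
    char *r = (char *) malloc(l1 + l2 + 1); // one extra byte for the terminating '\0'
    int temp;
    if (r == NULL) return NULL;
    reverse(s1); // work with the least significant digit first
    reverse(s2);
    for (i = 0; i < l1 + l2; i++) {
        r[i] = '0';
    }
    for (i = 0; i < l1; i++) {
        c = 0; k = i;
        for (j = 0; j < l2; j++) {
            temp = get_int(s1[i]) * get_int(s2[j]);
            temp = temp + c + get_int(r[k]);
            c = temp / 10;
            r[k] = temp % 10 + '0';
            k++;
        }
        if (c != 0) {
            r[k] = c + '0';
            k++;
        }
    }
    r[k] = '\0';
    reverse(r);
    reverse(s1); // restore the caller's strings
    reverse(s2);
    return r;
}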
// Here is a JavaScript version of the Karatsuba algorithm, which needs fewer
// digit multiplications than the usual multiplication method.
// Note: it works with parseInt and Number arithmetic, so results are exact
// only up to Number.MAX_SAFE_INTEGER (2^53 - 1).
function zeroPad(numberString, zeros, left = true) {
    // Return the string with zeros added to the left or right.
    const padding = '0'.repeat(zeros);
    return left ? padding + numberString : numberString + padding;
}
function largeMultiplication(x, y) {
    x = x.toString();
    y = y.toString();
    if (x.length == 1 && y.length == 1)
        return parseInt(x) * parseInt(y);
    if (x.length < y.length)
        x = zeroPad(x, y.length - x.length);
    else
        y = zeroPad(y, x.length - y.length);
    // All variables are declared locally; as implicit globals they would be
    // overwritten by the recursive calls below.
    const n = x.length;
    // split point, rounded up for odd digit counts
    const j = Math.ceil(n / 2);
    const BZeroPadding = n - j;
    const AZeroPadding = BZeroPadding * 2;
    const a = parseInt(x.substring(0, j));
    const b = parseInt(x.substring(j));
    const c = parseInt(y.substring(0, j));
    const d = parseInt(y.substring(j));
    // recursively calculate the three products
    const ac = largeMultiplication(a, c);
    const bd = largeMultiplication(b, d);
    const k = largeMultiplication(a + b, c + d);
    const A = parseInt(zeroPad(ac.toString(), AZeroPadding, false));
    const B = parseInt(zeroPad((k - ac - bd).toString(), BZeroPadding, false));
    return A + B + bd;
}
// testing the function here
const example = largeMultiplication(12, 34);
console.log(example);
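For comparison, here is a hedged sketch of the same three-product Karatsuba split in Python, where integers are not limited to 53 bits. The function name `karatsuba` and the 32-bit cutoff are choices made for this example; below the cutoff it simply hands off to the built-in `*`.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using Karatsuba's three-product split."""
    if x < 2**32 or y < 2**32:
        return x * y                       # small operand: use a direct product
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    a, b = x >> half, x & mask             # x = a * 2**half + b
    c, d = y >> half, y & mask             # y = c * 2**half + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    k = karatsuba(a + b, c + d)            # equals ac + ad + bc + bd
    # ad + bc is recovered as k - ac - bd, saving one multiplication
    return (ac << (2 * half)) + ((k - ac - bd) << half) + bd
```

Splitting on bits instead of decimal digits means the padding steps become plain shifts.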

Is there a simple algorithm that can determine if X is prime?

I have been trying to work my way through Project Euler, and have noticed a handful of problems ask for you to determine a prime number as part of it.
I know I can just divide x by 2, 3, 4, 5, ..., square root of X and if I get to the square root, I can (safely) assume that the number is prime. Unfortunately this solution seems quite clunky.
I've looked into better algorithms on how to determine if a number is prime, but get confused fast.
Is there a simple algorithm that can determine if X is prime, and not confuse a mere mortal programmer?
Thanks much!
The first algorithm is quite good and used a lot on Project Euler. If you know the maximum number that you want, you can also research the Sieve of Eratosthenes.
If you maintain a list of primes, you can also refine the first algorithm to divide only by primes up to the square root of the number.
With these two algorithms (dividing and the sieve) you should be able to solve the problems.
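To make the combination concrete, here is a small Python sketch (the function names `primes_upto` and `is_prime_by_primes` are made up for illustration). It builds the prime list once with a sieve and then trial-divides only by those primes; it assumes the list covers every prime up to the square root of the number being tested.

```python
import math

def primes_upto(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            for m in range(n * n, limit + 1, n):
                is_prime[m] = False
    return [n for n, flag in enumerate(is_prime) if flag]

def is_prime_by_primes(x, primes):
    """Trial-divide x by the known primes up to sqrt(x)."""
    if x < 2:
        return False
    for p in primes:
        if p * p > x:
            break              # no divisor found up to sqrt(x)
        if x % p == 0:
            return False
    return True
```

With `primes = primes_upto(100)` this handles any x up to 10,000, since 100 squared is 10,000.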
To generate all prime numbers less than a limit, the Sieve of Eratosthenes (the page contains variants in 20 programming languages) is the oldest and the simplest solution.
In Python:
def iprimes_upto(limit):
    is_prime = [True] * limit
    for n in range(2, limit):
        if is_prime[n]:
            yield n
            for i in range(n*n, limit, n): # start at ``n`` squared
                is_prime[i] = False
Example:
>>> list(iprimes_upto(15))
[2, 3, 5, 7, 11, 13]
I see that Fermat's primality test has already been suggested, but I've been working through Structure and Interpretation of Computer Programs, and they also give the Miller-Rabin test (see Section 1.2.6, problem 1.28) as another alternative. I've been using it with success for the Euler problems.
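For reference, a minimal Python sketch of the Miller-Rabin test (not taken from SICP; the name `is_probable_prime` and the default of 20 rounds are assumptions for this example). It is probabilistic: a composite number can slip through a single round with probability at most 1/4, so more rounds make a wrong "prime" answer less likely.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin: False means composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):         # quick exit for small factors
        if n % p == 0:
            return n == p
    # write n - 1 as d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)                   # modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                   # a is a witness: n is composite
    return True
```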
Here's a simple optimization of your method that isn't quite the Sieve of Eratosthenes but is very easy to implement: first try dividing X by 2 and 3, then loop over j = 1, 2, 3, ... for as long as 6*j-1 <= sqrt(X), trying to divide by 6*j-1 and 6*j+1. (Stopping at j = sqrt(X)/6 would miss cases such as X = 25, where the only divisor to find is 5.) This automatically skips over all numbers divisible by 2 or 3, gaining you a pretty nice constant factor acceleration.
Keeping in mind the following facts (from MathsChallenge.net):
All primes except 2 are odd.
All primes greater than 3 can be written in the form 6k - 1 or 6k + 1.
You don't need to check past the square root of n
Here's the C++ function I use for relatively small n:
#include <cmath>

bool isPrime(unsigned long n)
{
    if (n < 2) return false; // 0 and 1 are not prime
    if (n < 4) return true; // 2 and 3 are both prime
    if ((n % 2) == 0) return false; // exclude even numbers
    if (n < 9) return true; // we have already excluded 4, 6, and 8.
    if ((n % 3) == 0) return false; // exclude remaining multiples of 3
    unsigned long r = static_cast<unsigned long>(std::floor(std::sqrt(static_cast<double>(n))));
    unsigned long f = 5;
    while (f <= r)
    {
        if ((n % f) == 0) return false; // divisor of the form 6k - 1
        if ((n % (f + 2)) == 0) return false; // divisor of the form 6k + 1
        f = f + 6;
    }
    return true; // (in all other cases)
}
You could probably think of more optimizations of your own.
I'd recommend Fermat's primality test. It is a probabilistic test, but it is correct surprisingly often. And it is incredibly fast when compared with the sieve.
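A minimal sketch of the Fermat test in Python, assuming random bases and a made-up function name `fermat_test`. One known weakness worth keeping in mind: Carmichael numbers such as 561 pass for every base that is coprime to them, so a "probably prime" answer is not a proof.

```python
import random

def fermat_test(n, rounds=20):
    """Fermat test: False means composite, True means probably prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        # For prime n, Fermat's little theorem gives a**(n-1) % n == 1.
        if pow(a, n - 1, n) != 1:
            return False
    return True
```

The three-argument `pow` does modular exponentiation, which is what keeps each round fast even for very large n.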
For reasonably small numbers, x%n for up to sqrt(x) is awfully fast and easy to code.
Simple improvements:
test 2 and odd numbers only.
test 2, 3, and multiples of 6 + or - 1 (all primes other than 2 or 3 are multiples of 6 +/- 1, so you're essentially just skipping all even numbers and all multiples of 3)
test only prime numbers (requires calculating or storing all primes up to sqrt(x))
You can use the sieve method to quickly generate a list of all primes up to some arbitrary limit, but it tends to be memory intensive. You can use the multiples of 6 trick to reduce memory usage down to 1/3 of a bit per number.
I wrote a simple prime class (C#) that uses two bitfields, one for multiples of 6+1 and one for multiples of 6-1, then does a simple lookup. If the number I'm testing is outside the bounds of the sieve, it falls back on testing by 2, 3, and multiples of 6 +/- 1. I found that generating a large sieve actually takes more time than calculating primes on the fly for most of the Euler problems I've solved so far. KISS principle strikes again!
I wrote a prime class that uses a sieve to pre-calculate smaller primes, then relies on testing by 2, 3, and multiples of six +/- 1 for ones outside the range of the sieve.
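The idea of a sieve with a trial-division fallback might look roughly like this in Python. This is not the C# class described above: the name `PrimeChecker` is hypothetical, and for simplicity it keeps one flag per number instead of two bitfields for 6k+1 and 6k-1.

```python
import math

class PrimeChecker:
    """Sieve lookup for small n, 6k +/- 1 trial division beyond the sieve."""

    def __init__(self, sieve_limit=1000):
        # The sieve always covers at least 0..3 so the fallback never sees 2 or 3.
        size = max(sieve_limit, 3) + 1
        self.flags = [True] * size
        self.flags[0] = self.flags[1] = False
        for n in range(2, math.isqrt(size - 1) + 1):
            if self.flags[n]:
                for m in range(n * n, size, n):
                    self.flags[m] = False

    def is_prime(self, n):
        if n < len(self.flags):
            return n >= 0 and self.flags[n]    # inside the sieve: plain lookup
        if n % 2 == 0 or n % 3 == 0:
            return False
        f = 5
        while f * f <= n:                      # test 6k - 1 and 6k + 1
            if n % f == 0 or n % (f + 2) == 0:
                return False
            f += 6
        return True
```

A small sieve keeps construction cheap, which fits the observation that a large sieve can cost more than it saves.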
For Project Euler, having a list of primes is really essential. I would suggest maintaining a list that you use for each problem.
I think what you're looking for is the Sieve of Eratosthenes.
You're right, the simplest is the slowest. You can optimize it somewhat.
Look into using modulus instead of square roots.
Keep track of your primes. You only need to divide 7 by 2, 3, and 5 since 6 is a multiple of 2 and 3, and 4 is a multiple of 2.
Rslite mentioned the Sieve of Eratosthenes. It is fairly straightforward. I have it in several languages at home. Add a comment if you want me to post that code later.
Here is my C++ one. It has plenty of room to improve, but it is fast compared to the dynamic languages versions.
// Author: James J. Carman
// Project: Sieve of Eratosthenes
// Description: I take an array of 2 ... max values. Instead of removing the non-prime numbers,
// I mark them as 0 and ignore them.
// More info: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
#include <iostream>
int main(void) {
// using unsigned short.
// maximum value is around 65000
const unsigned short max = 50000;
unsigned short x[max];
for(unsigned short i = 0; i < max; i++)
x[i] = i + 2;
for(unsigned short outer = 0; outer < max; outer++) {
if( x[outer] == 0)
continue;
unsigned short item = x[outer];
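// The largest stored value is max + 1. Using <= includes it; the strict comparison against x[max - 1] left 50001 (3 * 16667) unmarked.
for(unsigned short multiplier = 2; (multiplier * item) <= max + 1; multiplier++) {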
unsigned int searchvalue = item * multiplier;
unsigned int maxValue = max + 1;
for( unsigned short maxIndex = max - 1; maxIndex > 0; maxIndex--) {
if(x[maxIndex] != 0) {
maxValue = x[maxIndex];
break;
}
}
for(unsigned short searchindex = multiplier; searchindex < max; searchindex++) {
if( searchvalue > maxValue )
break;
if( x[searchindex] == searchvalue ) {
x[searchindex] = 0;
break;
}
}
}
}
for(unsigned short printindex = 0; printindex < max; printindex++) {
if(x[printindex] != 0)
std::cout << x[printindex] << "\t";
}
return 0;
}
I will post the Perl and Python code I have as well, as soon as I find it. They are similar in style, just fewer lines.
Here is a simple primality test in D (Digital Mars):
/**
* to compile:
* $ dmd -run prime_trial.d
* to optimize:
* $ dmd -O -inline -release prime_trial.d
*/
module prime_trial;
import std.conv : to;
import std.stdio : w = writeln;
/// Adapted from: http://www.devx.com/vb2themax/Tip/19051
bool
isprime(Integer)(in Integer number)
{
/* manually test 1, 2, 3 and multiples of 2 and 3 */
if (number == 2 || number == 3)
return true;
else if (number < 2 || number % 2 == 0 || number % 3 == 0)
return false;
/* we can now avoid to consider multiples
* of 2 and 3. This can be done really simply
* by starting at 5 and incrementing by 2 and 4
* alternatively, that is:
* 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, ...
* we don't need to go higher than the square root of the number */
for (Integer divisor = 5, increment = 2; divisor*divisor <= number;
divisor += increment, increment = 6 - increment)
if (number % divisor == 0)
return false;
return true; // if we get here, the number is prime
}
/// print all prime numbers less than a given limit
void main(char[][] args)
{
const limit = (args.length == 2) ? to!(uint)(args[1]) : 100;
for (uint i = 0; i < limit; ++i)
if (isprime(i))
w(i);
}
I am working through the Project Euler problems as well and in fact just finished #3 (by id), which is the search for the highest prime factor of a composite number (the number in the question is 600851475143).
I looked at all of the info on primes (the sieve techniques already mentioned here) and on integer factorization on Wikipedia, and came up with a brute force trial division algorithm that I decided would do.
As I am doing the Euler problems to learn Ruby, I was looking into coding my algorithm and stumbled across the mathn library, which has a Prime class and an Integer class with a prime_division method. How cool is that? I was able to get the correct answer to the problem with this Ruby snippet:
require "mathn.rb"
puts 600851475143.prime_division.last.first
This snippet outputs the correct answer to the console. Of course I ended up doing a ton of reading and learning before I stumbled upon this little beauty; I just thought I would share it with everyone...
I like this Python code.
def primes(limit):
    limit += 1
    x = list(range(limit))
    for i in range(2, limit):
        if x[i] == i:  # i is still unmarked, so it is prime
            x[i] = 1
            for j in range(i*i, limit, i):
                x[j] = i  # remember a prime factor of j
    return [j for j in range(2, limit) if x[j] == 1]
A variant of this can be used to generate the factors of a number.
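def factors(limit):
    limit += 1
    x = list(range(limit))
    for i in range(2, limit):
        if x[i] == i:  # i is still unmarked, so it is prime
            x[i] = 1
            for j in range(i*i, limit, i):
                x[j] = i  # remember a prime factor of j
    result = []
    y = limit - 1  # the number being factored
    while x[y] != 1:
        divisor = x[y]
        result.append(divisor)
        y //= divisor  # integer division keeps y usable as an index
    result.append(y)
    return result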
Of course, if I were factoring a batch of numbers, I would not recalculate the cache; I'd do it once and do lookups in it.
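A sketch of that "build the cache once, then look up" idea in Python. It is a variant rather than the code above: the table stores the smallest prime factor of each number, and the names `smallest_factor_table` and `factorize` are invented for this example.

```python
import math

def smallest_factor_table(limit):
    """spf[n] is the smallest prime factor of n, for 2 <= n <= limit."""
    spf = list(range(limit + 1))
    for i in range(2, math.isqrt(limit) + 1):
        if spf[i] == i:                        # i is prime
            for j in range(i * i, limit + 1, i):
                if spf[j] == j:                # not yet claimed by a smaller prime
                    spf[j] = i
    return spf

def factorize(n, spf):
    """Prime factors of n in ascending order, using a prebuilt table."""
    result = []
    while n > 1:
        result.append(spf[n])
        n //= spf[n]
    return result
```

After `spf = smallest_factor_table(1000)`, each call such as `factorize(360, spf)` only walks the table, so factoring a batch of numbers reuses the same sieve.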
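It is not optimized, but it's a very simple function.
function isprime(number){
    if (number < 2)
        return false; // 0, 1 and negative numbers are not prime
    var times = 0; // how many divisors have been found
    for (var i = 1; i <= number; i++){
        if (number % i == 0){
            times++;
        }
    }
    if (times > 2){
        return false; // a prime has exactly two divisors: 1 and itself
    }
    return true;
}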
Maybe this implementation in Java can be helpful:
public class SieveOfEratosthenes {
/**
* Calling this method with argument 7 will return: true true false false true false true false
* which must be interpreted as : 0 is NOT prime, 1 is NOT prime, 2 IS prime, 3 IS prime, 4 is NOT prime
* 5 is prime, 6 is NOT prime, 7 is prime.
* Caller may either revert the array for easier reading, count the number of primes or extract the prime values
* by looping.
* @param upTo Find prime numbers up to this value. Must be a positive integer.
* @return a boolean array where the index represents the integer value and the value at that index tells
* whether the number is NOT prime.
*/
public static boolean[] isIndexNotPrime(int upTo) {
if (upTo < 2) {
return new boolean[0];
}
// 0-index array, upper limit must be upTo + 1
final boolean[] isIndexNotPrime = new boolean[upTo + 1];
isIndexNotPrime[0] = true; // 0 is not a prime number.
isIndexNotPrime[1] = true; // 1 is not a prime number.
// Find all non primes starting from 2 by finding 2 * 2, 2 * 3, 2 * 4 until 2 * multiplier > isIndexNotPrime.len
// Find next by 3 * 3 (since 2 * 3 was found before), 3 * 4, 3 * 5 until 3 * multiplier > isIndexNotPrime.len
// Move to 4, since isIndexNotPrime[4] is already True (not prime) no need to loop..
// Move to 5, 5 * 5, (2 * 5 and 3 * 5 was already set to True..) until 5 * multiplier > isIndexNotPrime.len
// Repeat process until i * i > isIndexNotPrime.len.
// Assume we are looking up to 100. Break once you reach 11 since 11 * 11 == 121 and we are not interested in
// primes above 100.
for (int i = 2; i < isIndexNotPrime.length; i++) {
if (i * i >= isIndexNotPrime.length) {
break;
}
if (isIndexNotPrime[i]) {
continue;
}
int multiplier = i;
while (i * multiplier < isIndexNotPrime.length) {
isIndexNotPrime[i * multiplier] = true;
multiplier++;
}
}
return isIndexNotPrime;
}
public static void main(String[] args) {
final boolean[] indexNotPrime = SieveOfEratosthenes.isIndexNotPrime(7);
assert !indexNotPrime[2]; // Not (not prime)
assert !indexNotPrime[3]; // Not (not prime)
assert indexNotPrime[4]; // (not prime)
assert !indexNotPrime[5]; // Not (not prime)
assert indexNotPrime[6]; // (not prime)
assert !indexNotPrime[7]; // Not (not prime)
}
}
The AKS prime testing algorithm:
Input: Integer n > 1
if (n has the form a^b with b > 1) then output COMPOSITE
r := 2
while (r < n) {
    if (gcd(n,r) is not 1) then output COMPOSITE
    if (r is prime greater than 2) then {
        let q be the largest prime factor of r-1
        if (q > 4*sqrt(r)*log n) and (n^((r-1)/q) is not 1 (mod r)) then break
    }
    r := r+1
}
for a = 1 to 2*sqrt(r)*log n {
    if ( (x-a)^n is not (x^n - a) (mod x^r - 1, n) ) then output COMPOSITE
}
output PRIME;
Another way in Python is:
import math
def main():
    print(2)  # the only even prime; the loop below tests odd numbers only
    count = 3
    while True:  # runs until interrupted
        isprime = True
        for x in range(2, int(math.sqrt(count) + 1)):
            if count % x == 0:
                isprime = False
                break
        if isprime:
            print(count)
        count += 2
if __name__ == '__main__':
    main()

Resources