Haskell performance tuning


I'm quite new to Haskell, and to learn it better I started solving problems here and there. I ended up with this one (Project Euler 34):
145 is a curious number, as 1! + 4! + 5! = 1 + 24 + 120 = 145.
Find the sum of all numbers which are equal to the sum of the factorial of their digits.
Note: as 1! = 1 and 2! = 2 are not sums they are not included.
I wrote a brute-force solution in C and one in Haskell.
Could someone explain why the Haskell version is ~15x slower than the C implementation (~6.5 s vs ~0.450 s), and how I could tune and speed up the Haskell solution?
unsigned int solve(){
unsigned int result = 0;
unsigned int i=10;
while(i<2540161){
unsigned int sumOfFacts = 0;
unsigned int number = i;
while (number > 0) {
unsigned int d = number % 10;
number /= 10;
sumOfFacts += factorial(d);
}
if (sumOfFacts == i)
result += i;
i++;
}
return result;
}
Here is the Haskell solution:
--BRUTE FORCE SOLUTION
solve :: Int
solve = sum (filter (\x -> sfc x 0 == x) [10..2540160])
--sum factorial of digits
sfc :: Int -> Int -> Int
sfc 0 acc = acc
sfc n acc = sfc n' (acc+fc r)
    where
        n' = div n 10
        r = mod n 10 --n-(10*n')
        fc 0 = 1
        fc 1 = 1
        fc 2 = 2
        fc 3 = 6
        fc 4 = 24
        fc 5 = 120
        fc 6 = 720
        fc 7 = 5040
        fc 8 = 40320
        fc 9 = 362880

First, compile with optimizations. With ghc-7.10.1 -O2 -fllvm, the Haskell version runs in 0.54 secs for me. This is already pretty good.
If we want to do even better, we should first replace div with quot and mod with rem. div and mod do some extra work, because they handle the rounding of negative numbers differently. Since we only have positive numbers here, we should switch to the faster functions.
Second, we should replace the pattern matching in fc with an array lookup. GHC uses a branching construct for Int patterns, and uses binary search when the number of cases is large enough. We can do better here with a lookup.
The new code looks like this:
import qualified Data.Vector.Unboxed as V
facs :: V.Vector Int
facs =
    V.fromList [1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
--BRUTE FORCE SOLUTION
solve :: Int
solve = sum (filter (\x -> sfc x 0 == x) [10..2540160])
--sum factorial of digits
sfc :: Int -> Int -> Int
sfc 0 acc = acc
sfc n acc = sfc n' (acc + V.unsafeIndex facs r)
    where
        (n', r) = quotRem n 10
main = print solve
It runs in 0.095 seconds on my computer.
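The two changes in this answer (one combined quotient/remainder step per digit, and a table lookup instead of a chain of pattern matches) can be sketched in Python. This is only an illustration of the structure, not a claim about Python speed; the names `FACTORIALS`, `digit_factorial_sum` and `curious_numbers` are made up for this sketch.

```python
# Factorials of the digits 0..9, precomputed once (plays the role of `facs`).
FACTORIALS = (1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880)


def digit_factorial_sum(n):
    """Sum of the factorials of the decimal digits of a positive integer n."""
    total = 0
    while n > 0:
        # divmod returns quotient and remainder together, like quotRem
        n, r = divmod(n, 10)
        total += FACTORIALS[r]
    return total


def curious_numbers(limit):
    """All n in [10, limit] that equal the sum of the factorials of their digits."""
    return [n for n in range(10, limit + 1) if digit_factorial_sum(n) == n]
```

The search bound 2540160 used in the question is 7 * 9!: a number with eight or more digits is always larger than the largest possible sum of its digit factorials. Calling `curious_numbers(50000)` already finds 145 and 40585.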

Related

Implementing a FIR filter using Vectors

I have implemented a FIR filter in Haskell. I don't know that much about FIR filters and my code is heavily based on an existing C# implementation. Therefore, I have a feeling that my implementation has too much of a C# style and is not really Haskell-like. I would like to know if there is a more idiomatic Haskell way of implementing my code. Ideally, I'm looking for some combination of higher-order functions (map, filter, fold, etc.) that implements the algorithm.
My Haskell code looks like this:
applyFIR :: Vector Double -> Vector Double -> Vector Double
applyFIR b x = generate (U.length x) help
  where
    help i = if i >= (U.length b - 1) then loop i (U.length b - 1) else 0
    loop yi bi = if bi < 0 then 0 else b !! bi * x !! (yi-bi) + loop yi (bi-1)
    vec !! i = unsafeIndex vec i -- Shorthand for unsafeIndex
This code is based on the following C# code:
public float[] RunFilter(double[] x)
{
int M = coeff.Length;
int n = x.Length;
//y[n]=b0x[n]+b1x[n-1]+....bmx[n-M]
var y = new float[n];
for (int yi = 0; yi < n; yi++)
{
double t = 0.0f;
for (int bi = M - 1; bi >= 0; bi--)
{
if (yi - bi < 0) continue;
t += coeff[bi] * x[yi - bi];
}
y[yi] = (float) t;
}
return y;
}
As you can see, it's almost a straight copy. How can I turn my implementation into a more Haskell-like one? Do you have any ideas? The only thing I could come up with was using Vector.generate.
I know that the DSP library has an implementation available. But it uses lists and is way too slow for my use case. This Vector implementation is a lot faster than the one in DSP.
I've also tried implementing the algorithm using Repa. It is faster than the Vector implementation. Here is the result:
applyFIR :: V.Vector Float -> Array U DIM1 Float -> Array D DIM1 Float
applyFIR b x = R.traverse x id (\_ (Z :. i) -> if i >= len then loop i (len - 1) else 0)
  where
    len = V.length b
    loop :: Int -> Int -> Float
    loop yi bi = if bi < 0 then 0 else (V.unsafeIndex b bi) * x !! (Z :. (yi-bi)) + loop yi (bi-1)
    arr !! i = unsafeIndex arr i

First of all, I don't think that your initial vector code is a faithful translation - that is, I think it disagrees with the C# code. For example, suppose that both "x" and "b" ("b" is coeff in C#) have length 3, and have all values of 1.0. Then for y[0] the C# code would produce x[0] * coeff[0], or 1.0. (it would hit continue for all other values of bi)
With your Haskell code, however, help 0 produces 0. Your Repa version seems to suffer from the same problem.
So let's start with a more faithful translation:
applyFIR :: Vector Double -> Vector Double -> Vector Double
applyFIR b x = generate (U.length x) help
  where
    help i = loop i (min i $ U.length b - 1)
    loop yi bi = if bi < 0 then 0 else b !! bi * x !! (yi-bi) + loop yi (bi-1)
    vec !! i = unsafeIndex vec i -- Shorthand for unsafeIndex
Now, you're basically doing a calculation like this for computing, say, y[3]:
... b[3] | b[2] | b[1] | b[0]
x[0] | x[1] | x[2] | x[3] | x[4] | x[5] | ....
multiply
b[3]*x[0]|b[2]*x[1] |b[1]*x[2] |b[0]*x[3]
sum
y[3] = b[3]*x[0] + b[2]*x[1] + b[1]*x[2] + b[0]*x[3]
So one way to think of what you're doing is "take the b vector, reverse it, and to compute spot i of the result, line b[0] up with x[i], multiply all the corresponding x and b entries, and compute the sum".
So let's do that:
applyFIR :: Vector Double -> Vector Double -> Vector Double
applyFIR b x = generate (U.length x) help
  where
    revB = U.reverse b
    bLen = U.length b
    help i = let sliceLen = min (i+1) bLen
                 bSlice = U.slice (bLen - sliceLen) sliceLen revB
                 xSlice = U.slice (i + 1 - sliceLen) sliceLen x
             in U.sum $ U.zipWith (*) bSlice xSlice
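To make the "reverse b, line it up with x, multiply and sum" idea concrete, here is a rough sketch of the same slicing in Python with plain lists. The function name `apply_fir` is an assumption of this sketch, and list slices stand in for `U.slice`.

```python
def apply_fir(b, x):
    """y[i] = sum of b[k] * x[i - k] over all k with 0 <= k < len(b) and i - k >= 0."""
    rev_b = b[::-1]
    b_len = len(b)
    y = []
    for i in range(len(x)):
        # near the start of x only the first i + 1 coefficients have a partner
        slice_len = min(i + 1, b_len)
        b_slice = rev_b[b_len - slice_len:]      # ..., b[1], b[0]
        x_slice = x[i + 1 - slice_len:i + 1]     # ..., x[i-1], x[i]
        y.append(sum(bk * xk for bk, xk in zip(b_slice, x_slice)))
    return y
```

With `b = [1, 1, 1]` and `x = [1, 1, 1]` the first output is 1 (only `b[0] * x[0]`), which is the case where the original translation disagreed with the C# code.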

Optimising F# answer for Euler #4

I have recently begun learning F#, hoping to use it for mathematically heavy algorithms in C# applications and to broaden my knowledge.
I have so far avoided StackOverflow as I didn't want to see the answer to this until I came to one myself.
I want to be able to write very efficient F# code, focused on performance and then maybe in other ways, such as writing in F# concisely (number of lines etc.).
Project Euler Question 4:
A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.
Find the largest palindrome made from the product of two 3-digit numbers.
My Answer:
let IsPalindrome (x:int) = if x.ToString().ToCharArray() = Array.rev(x.ToString().ToCharArray()) then x else 0
let euler4 = [for i in [100..999] do
                for j in [i..999] do yield i*j]
             |> Seq.filter(fun x -> x = IsPalindrome(x)) |> Seq.max |> printf "Largest product of two 3-digit numbers is %d"
I tried using option and returning Some(x) and None in IsPalindrome, but kept getting compile errors because I was passing in an int and returning an int option. I got a NullReferenceException trying to return None.Value.
Instead I return 0 if the number isn't a palindrome; unfortunately, these 0s go into the sequence.
Maybe I could order the sequence and then take the top value instead of using Seq.max? Or keep only results > 1?
Would this be better? Any advice would be much appreciated, even if it's general F# advice.

Efficiency being a primary concern, using string allocation/manipulation to find a numeric palindrome seems misguided – here's my approach:
module NumericLiteralG =
    let inline FromZero () = LanguagePrimitives.GenericZero
    let inline FromOne () = LanguagePrimitives.GenericOne
module Euler =
    let inline isNumPalindrome number =
        let ten = 1G + 1G + 1G + 1G + 1G + 1G + 1G + 1G + 1G + 1G
        let hundred = ten * ten
        let rec findHighDiv div =
            let div' = div * ten
            if number / div' = 0G then div else findHighDiv div'
        let rec impl n div =
            div = 0G || n / div = n % ten && impl (n % div / ten) (div / hundred)
        findHighDiv 1G |> impl number
    let problem004 () =
        { 100 .. 999 }
        |> Seq.collect (fun n -> Seq.init (1000 - n) ((+) n >> (*) n))
        |> Seq.filter isNumPalindrome
        |> Seq.max
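For readers who do not use F#, the point about avoiding strings can be sketched in Python. Note that this sketch swaps in a simpler arithmetic test (reverse the digits and compare) rather than the outer-digit comparison used above, and it adds an early exit that is not in the answer. The names `is_palindrome` and `largest_palindrome_product` are made up for the example.

```python
def is_palindrome(n):
    """True if the non-negative integer n reads the same both ways in base 10."""
    reversed_n, rest = 0, n
    while rest > 0:
        rest, digit = divmod(rest, 10)
        reversed_n = reversed_n * 10 + digit
    return reversed_n == n


def largest_palindrome_product(lo, hi):
    """Largest palindrome i * j with lo <= i <= j <= hi, or None if there is none."""
    best = None
    for i in range(hi, lo - 1, -1):
        for j in range(hi, i - 1, -1):
            product = i * j
            if best is not None and product <= best:
                break  # a smaller j can only give a smaller product
            if is_palindrome(product):
                best = product
    return best
```

For the two-digit example from the problem statement, `largest_palindrome_product(10, 99)` gives 9009.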

Here's one way to do it:
/// handy extension for reversing a string
type System.String with
    member s.Reverse() = String(Array.rev (s.ToCharArray()))
let isPalindrome x = let s = string x in s = s.Reverse()
seq {
    for i in 100..999 do
        for j in i..999 -> i * j
}
|> Seq.filter isPalindrome
|> Seq.max
|> printfn "The answer is: %d"

let IsPalindrom (str:string)=
    let rec fn(a,b)=a>b||str.[a]=str.[b]&&fn(a+1,b-1)
    fn(0,str.Length-1)
let IsIntPalindrome = (string>>IsPalindrom)
let sq={100..999}
sq|>Seq.map (fun x->sq|>Seq.map (fun y->(x,y),x*y))
  |>Seq.concat|>Seq.filter (snd>>IsIntPalindrome)|>Seq.maxBy (snd)

just my solution:
let isPalin x =
x.ToString() = new string(Array.rev (x.ToString().ToCharArray()))
let isGood num seq1 = Seq.exists (fun elem -> (num % elem = 0 && (num / elem) < 999)) seq1
{998001 .. -1 .. 10000} |> Seq.filter(fun x -> isPalin x) |> Seq.filter(fun x -> isGood x {999 .. -1 .. 100}) |> Seq.nth 0

The simplest way is to go from 999 down to 100, because the answer is much more likely to be the product of two large numbers.
j can then start from i, because the other order was already tested.
Other optimisations would generate the multiplications in descending order, but that makes everything a little more difficult. In general it can be expressed as list merging.
Haskell (my best try in functional programming)
merge f x [] = x
merge f [] y = y
merge f (x:xs) (y:ys)
  | f x y = x : merge f xs (y:ys)
  | otherwise = y : merge f (x:xs) ys
compare_tuples (a,b) (c,d) = a*b >= c*d
gen_mul n = (n,n) : merge compare_tuples
  ( gen_mul (n-1) )
  ( map (\x -> (n,x)) [n-1,n-2 .. 1] )
is_product_palindrome (a,b) = x == reverse x where x = show (a*b)
main = print $ take 10 $ map ( \(a,b)->(a,b,a*b) )
  $ filter is_product_palindrome $ gen_mul 9999
output (less than 1s)- first 10 palindromes =>
[(9999,9901,99000099),
(9967,9867,98344389),
(9999,9811,98100189),
(9999,9721,97200279),
(9999,9631,96300369),
(9999,9541,95400459),
(9999,9451,94500549),
(9767,9647,94222249),
(9867,9547,94200249),
(9999,9361,93600639)]
One can see that this sequence is generated lazily, from large to small.
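The same "products in descending order" idea can be sketched in Python with a heap instead of a lazy list merge. This is a different mechanism from the recursive `merge` above, used here only because it is the idiomatic Python way to merge many descending streams; `products_descending` and `palindromic_products` are names invented for this sketch.

```python
import heapq


def products_descending(hi, lo=1):
    """Yield (a, b, a * b) for lo <= b <= a <= hi, largest product first."""
    # One heap entry per a, holding its largest unseen product (negated,
    # because heapq is a min-heap). Each a starts with b = a.
    heap = [(-a * a, a, a) for a in range(lo, hi + 1)]
    heapq.heapify(heap)
    while heap:
        neg_product, a, b = heapq.heappop(heap)
        yield a, b, -neg_product
        if b > lo:
            heapq.heappush(heap, (-a * (b - 1), a, b - 1))


def palindromic_products(hi, lo=1):
    """Lazily yield the palindromic products, largest first."""
    for a, b, product in products_descending(hi, lo):
        text = str(product)
        if text == text[::-1]:
            yield a, b, product
```

Because both functions are generators, `next(palindromic_products(9999))` stops as soon as the largest palindrome is reached and only inspects the few thousand products above it.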

Optimized version:
let Euler dgt=
    let [mine;maxe]=[dgt-1;dgt]|>List.map (fun x->String.replicate x "9"|>int)
    let IsPalindrom (str:string)=
        let rec fn(a,b)=a>b||str.[a]=str.[b]&&fn(a+1,b-1)
        fn(0,str.Length-1)
    let IsIntPalindrome = (string>>IsPalindrom)
    let rec fn=function
        |x,y,max,a,_ when a=mine->x,y,max
        |x,y,max,a,b when b=mine->fn(x,y,max,a-1,maxe)
        |x,y,max,a,b->a*b|>function
            |m when b=maxe&&m<max->x,y,max
            |m when m>max&&IsIntPalindrome(m)->fn(a,b,m,a-1,maxe)
            |m when m>max->fn(x,y,max,a,b-1)
            |_->fn(x,y,max,a-1,maxe)
    fn(0,0,0,maxe,maxe)
Log (switch #time on):
> Euler 2;;
Real: 00:00:00.004, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
val it : int * int * int = (99, 91, 9009)
> Euler 3;;
Real: 00:00:00.004, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
val it : int * int * int = (993, 913, 906609)
> Euler 4;;
Real: 00:00:00.002, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it : int * int * int = (9999, 9901, 99000099)
> Euler 5;;
Real: 00:00:00.702, CPU: 00:00:00.686, GC gen0: 108, gen1: 1, gen2: 0
val it : int * int * int = (99793, 99041, 1293663921) //int32 overflow
Extended to BigInteger:
let Euler dgt=
    let [mine;maxe]=[dgt-1;dgt]|>List.map (fun x->new System.Numerics.BigInteger(String.replicate x "9"|>int))
    let IsPalindrom (str:string)=
        let rec fn(a,b)=a>b||str.[a]=str.[b]&&fn(a+1,b-1)
        fn(0,str.Length-1)
    let IsIntPalindrome = (string>>IsPalindrom)
    let rec fn=function
        |x,y,max,a,_ when a=mine->x,y,max
        |x,y,max,a,b when b=mine->fn(x,y,max,a-1I,maxe)
        |x,y,max,a,b->a*b|>function
            |m when b=maxe&&m<max->x,y,max
            |m when m>max&&IsIntPalindrome(m)->fn(a,b,m,a-1I,maxe)
            |m when m>max->fn(x,y,max,a,b-1I)
            |_->fn(x,y,max,a-1I,maxe)
    fn(0I,0I,0I,maxe,maxe)
Check:
Euler 5;;
Real: 00:00:02.658, CPU: 00:00:02.605, GC gen0: 592, gen1: 1, gen2: 0
val it :
System.Numerics.BigInteger * System.Numerics.BigInteger *
System.Numerics.BigInteger =
(99979 {...}, 99681 {...}, 9966006699 {...})

Display all the possible numbers having its digits in ascending order

Write a program that displays all the numbers between two given numbers whose digits are in ascending order.
For Example:-
Input: 5000 to 6000
Output: 5678 5679 5689 5789
Input: 90 to 124
Output: 123 124
A brute-force approach would count through all the numbers and check the digits of each one. But I want an approach that can skip some numbers and bring the complexity below O(n). Does any such solution exist?

I offer a solution in Python. It is efficient as it considers only the relevant numbers. The basic idea is to count upwards, but handle overflow somewhat differently. While we normally set overflowing digits to 0, here we set them to the previous digit +1. Please check the inline comments for further details. You can play with it here: http://ideone.com/ePvVsQ
def ascending( na, nb ):
    assert nb>=na
    # split each number into a list of digits
    a = list( int(x) for x in str(na))
    b = list( int(x) for x in str(nb))
    d = len(b) - len(a)
    # if both numbers have different length add leading zeros
    if d>0:
        a = [0]*d + a # add leading zeros
    assert len(a) == len(b)
    n = len(a)
    # check if the initial value has increasing digits as required,
    # and fix if necessary
    for x in range(d+1, n):
        if a[x] <= a[x-1]:
            for y in range(x, n):
                a[y] = a[y-1] + 1
            break
    res = [] # result set
    while a<=b:
        # if we found a value and add it to the result list
        # turn the list of digits back into an integer
        if max(a) < 10:
            res.append( int( ''.join( str(k) for k in a ) ) )
        # in order to increase the number we look for the
        # least significant digit that can be increased
        for x in range( n-1, -1, -1): # count down from n-1 to 0
            if a[x] < 10+x-n:
                break
        # digit x is to be increased
        a[x] += 1
        # all subsequent digits must be increased accordingly
        for y in range( x+1, n ):
            a[y] = a[y-1] + 1
    return res
print( ascending( 5000, 9000 ) )

Sounds like task from Project Euler. Here is the solution in C++. It is not short, but it is straightforward and effective. Oh, and hey, it uses backtracking.
#include <algorithm>
#include <iostream>
#include <vector>
// Higher order digits at the back
typedef std::vector<int> Digits;
// Extract decimal digits of a number
Digits ExtractDigits(int n)
{
Digits digits;
while (n > 0)
{
digits.push_back(n % 10);
n /= 10;
}
if (digits.empty())
{
digits.push_back(0);
}
return digits;
}
// Main function
void PrintNumsRec(
const Digits& minDigits, // digits of the min value
const Digits& maxDigits, // digits of the max value
Digits& digits, // digits of current value
int pos, // current digits with index greater than pos are already filled
bool minEq, // currently filled digits are the same as of min value
bool maxEq) // currently filled digits are the same as of max value
{
if (pos < 0)
{
// Print current value. Handle leading zeros by yourself, if need
for (auto pDigit = digits.rbegin(); pDigit != digits.rend(); ++pDigit)
{
if (*pDigit >= 0)
{
std::cout << *pDigit;
}
}
std::cout << std::endl;
return;
}
// Compute iteration boundaries for current position
int first = minEq ? minDigits[pos] : 0;
int last = maxEq ? maxDigits[pos] : 9;
// The last filled digit
int prev = digits[pos + 1];
// Make sure generated number has increasing digits
int firstInc = std::max(first, prev + 1);
// Iterate through possible cases for current digit
for (int d = firstInc; d <= last; ++d)
{
digits[pos] = d;
if (d == 0 && prev == -1)
{
// Mark leading zeros with -1
digits[pos] = -1;
}
PrintNumsRec(minDigits, maxDigits, digits, pos - 1, minEq && (d == first), maxEq && (d == last));
}
}
// High-level function
void PrintNums(int min, int max)
{
auto minDigits = ExtractDigits(min);
auto maxDigits = ExtractDigits(max);
// Make digits array of the same size
while (minDigits.size() < maxDigits.size())
{
minDigits.push_back(0);
}
Digits digits(minDigits.size());
int pos = digits.size() - 1;
// Placeholder for leading zero
digits.push_back(-1);
PrintNumsRec(minDigits, maxDigits, digits, pos, true, true);
}
int main()
{
PrintNums(53, 297);
}
It uses recursion to handle arbitrary amount of digits, but it is essentially the same as the nested loops approach. Here is the output for (53, 297):
056
057
058
059
067
068
069
078
079
089
123
124
125
126
127
128
129
134
135
136
137
138
139
145
146
147
148
149
156
157
158
159
167
168
169
178
179
189
234
235
236
237
238
239
245
246
247
248
249
256
257
258
259
267
268
269
278
279
289
A much more interesting problem would be to count all these numbers without explicitly computing them. One would use dynamic programming for that.
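As a rough illustration of counting without enumerating, here is a Python sketch. It uses binomial coefficients directly instead of a dynamic-programming table, which is a substitution on my part: once a digit c is fixed, the remaining r digits are simply any r-element subset of c+1..9. The function name `count_ascending_up_to` is hypothetical, and the sketch assumes "ascending" means strictly increasing with no leading zero.

```python
from math import comb


def count_ascending_up_to(n):
    """Count the integers in [1, n] whose decimal digits are strictly increasing."""
    if n < 1:
        return 0
    digits = [int(ch) for ch in str(n)]
    k = len(digits)
    # Every shorter number: pick any j of the digits 1..9 and sort them.
    total = sum(comb(9, j) for j in range(1, k))
    prev = 0
    for i, d in enumerate(digits):
        remaining = k - i - 1
        # Put a smaller digit c here (prev < c < d); the rest is free.
        for c in range(prev + 1, d):
            total += comb(9 - c, remaining)
        if d <= prev:
            return total  # the prefix of n itself is no longer ascending
        prev = d
    return total + 1  # n itself has ascending digits
```

The count for a range [lo, hi] is then `count_ascending_up_to(hi) - count_ascending_up_to(lo - 1)`; for the question's example 90 to 124 this gives 2.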

There is only a very limited number of numbers which can match your definition (with 9 digits max) and these can be generated very fast. But if you really need speed, just cache the tree or the generated list and do a lookup when you need your result.
using System;
using System.Collections.Generic;
namespace so_ascending_digits
{
class Program
{
class Node
{
int digit;
int value;
List<Node> children;
public Node(int val = 0, int dig = 0)
{
digit = dig;
value = (val * 10) + digit;
children = new List<Node>();
for (int i = digit + 1; i < 10; i++)
{
children.Add(new Node(value, i));
}
}
public void Collect(ref List<int> collection, int min = 0, int max = Int16.MaxValue)
{
if ((value >= min) && (value <= max)) collection.Add(value);
foreach (Node n in children) if (value * 10 < max) n.Collect(ref collection, min, max);
}
}
static void Main(string[] args)
{
Node root = new Node();
List<int> numbers = new List<int>();
root.Collect(ref numbers, 5000, 6000);
numbers.Sort();
Console.WriteLine(String.Join("\n", numbers));
}
}
}
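The "generate once, cache, then look up" suggestion can be sketched compactly in Python. This is a minimal sketch assuming strictly increasing digits and no leading zero; it builds the list from digit combinations rather than from the tree above, and the names `ASCENDING` and `ascending_in_range` are made up for the example.

```python
from bisect import bisect_left, bisect_right
from itertools import combinations

# Every strictly ascending number is a non-empty subset of the digits 1..9
# written in increasing order, so there are 2**9 - 1 = 511 of them in total.
ASCENDING = sorted(
    int("".join(chosen))
    for k in range(1, 10)
    for chosen in combinations("123456789", k)
)


def ascending_in_range(lo, hi):
    """All ascending-digit numbers n with lo <= n <= hi, via binary search."""
    return ASCENDING[bisect_left(ASCENDING, lo):bisect_right(ASCENDING, hi)]
```

After the one-time setup each query costs two binary searches over 511 entries plus the size of the answer.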

Why the brute force algorithm may be very inefficient.
One efficient way of encoding the input is to provide two numbers: the lower end of the range, a, and the number of values in the range, b - a + 1. This can be encoded in O(lg a + lg (b - a)) bits, since the number of bits needed to represent a number in base-2 is roughly equal to the base-2 logarithm of the number. We can simplify this to O(lg b), because intuitively if b - a is small, then a = O(b), and if b - a is large, then b - a = O(b). Either way, the total input size is O(2 lg b) = O(lg b).
Now the brute force algorithm just checks each number from a to b, and outputs the numbers whose digits in base 10 are in increasing order. There are b - a + 1 possible numbers in that range. However, when you represent this in terms of the input size, you find that b - a + 1 = 2^(lg (b - a + 1)) = 2^O(lg b) for a large enough interval.
This means that for an input size n = O(lg b), you may need to check in the worst case O(2^n) values.
A better algorithm
Instead of checking every possible number in the interval, you can simply generate the valid numbers directly. Here's a rough overview of how. A number n can be thought of as a sequence of digits n1 ... nk, where k is again roughly log10 n.
For a and a four-digit number b, the iteration would look something like
for w in a1 .. 9:
    for x in w+1 .. 9:
        for y in x+1 .. 9:
            for z in y+1 .. 9:
                m = 1000 * w + 100 * x + 10 * y + z
                if m < a:
                    next
                if m > b:
                    exit
                output w ++ x ++ y ++ z (++ is just string concatenation)
where a1 can be considered 0 if a has fewer digits than b.
For larger numbers, you can imagine just adding more nested for loops. In general, if b has d digits, you need d = O(lg b) loops, each of which iterates at most 10 times. The running time is thus O(10 lg b) = O(lg b), which is far better than the O(2^(lg b)) running time you get by checking whether every number is sorted or not.
There is one other detail that I have glossed over, which actually does affect the running time. As written, the algorithm needs to consider the time it takes to generate m. Without going into the details, you could assume that this adds at worst a factor of O(lg b) to the running time, resulting in an O(lg^2 b) algorithm. However, using a little extra space at the top of each for loop to store partial products would save lots of redundant multiplication, allowing us to preserve the originally stated O(lg b) running time.

One way (pseudo-code):
for (digit3 = '5'; digit3 <= '6'; digit3++)
    for (digit2 = digit3+1; digit2 <= '9'; digit2++)
        for (digit1 = digit2+1; digit1 <= '9'; digit1++)
            for (digit0 = digit1+1; digit0 <= '9'; digit0++)
                output = digit3 + digit2 + digit1 + digit0; // concatenation

An efficient algorithm to calculate the integer square root (isqrt) of arbitrarily large integers

Notice
For a solution in Erlang or C / C++, go to Trial 4 below.
Wikipedia Articles
Integer square root
The definition of "integer square root" could be found here
Methods of computing square roots
An algorithm that does "bit magic" could be found here
[ Trial 1 : Using Library Function ]
Code
isqrt(N) when erlang:is_integer(N), N >= 0 ->
erlang:trunc(math:sqrt(N)).
Problem
This implementation uses the sqrt() function from the C library, so it does not work with arbitrarily large integers (Note that the returned result does not match the input. The correct answer should be 12345678901234567890):
Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.4 (abort with ^G)
1> erlang:trunc(math:sqrt(12345678901234567890 * 12345678901234567890)).
12345678901234567168
2>
[ Trial 2 : Using Bigint + Only ]
Code
isqrt2(N) when erlang:is_integer(N), N >= 0 ->
isqrt2(N, 0, 3, 0).
isqrt2(N, I, _, Result) when I >= N ->
Result;
isqrt2(N, I, Times, Result) ->
isqrt2(N, I + Times, Times + 2, Result + 1).
Description
This implementation is based on the following observation:
isqrt(0) = 0 # <--- One 0
isqrt(1) = 1 # <-+
isqrt(2) = 1 # |- Three 1's
isqrt(3) = 1 # <-+
isqrt(4) = 2 # <-+
isqrt(5) = 2 # |
isqrt(6) = 2 # |- Five 2's
isqrt(7) = 2 # |
isqrt(8) = 2 # <-+
isqrt(9) = 3 # <-+
isqrt(10) = 3 # |
isqrt(11) = 3 # |
isqrt(12) = 3 # |- Seven 3's
isqrt(13) = 3 # |
isqrt(14) = 3 # |
isqrt(15) = 3 # <-+
isqrt(16) = 4 # <--- Nine 4's
...
Problem
This implementation involves only bigint additions so I expected it to run fast. However, when I fed it with 1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111, it seems to run forever on my (very fast) machine.
[ Trial 3 : Using Binary Search with Bigint +1, -1 and div 2 Only ]
Code
Variant 1 (My original implementation)
isqrt3(N) when erlang:is_integer(N), N >= 0 ->
isqrt3(N, 1, N).
isqrt3(_N, Low, High) when High =:= Low + 1 ->
Low;
isqrt3(N, Low, High) ->
Mid = (Low + High) div 2,
MidSqr = Mid * Mid,
if
%% This also catches N = 0 or 1
MidSqr =:= N ->
Mid;
MidSqr < N ->
isqrt3(N, Mid, High);
MidSqr > N ->
isqrt3(N, Low, Mid)
end.
Variant 2 (modified above code so that the boundaries go with Mid+1 or Mid-1 instead, with reference to the answer by Vikram Bhat)
isqrt3a(N) when erlang:is_integer(N), N >= 0 ->
isqrt3a(N, 1, N).
isqrt3a(N, Low, High) when Low >= High ->
HighSqr = High * High,
if
HighSqr > N ->
High - 1;
HighSqr =< N ->
High
end;
isqrt3a(N, Low, High) ->
Mid = (Low + High) div 2,
MidSqr = Mid * Mid,
if
%% This also catches N = 0 or 1
MidSqr =:= N ->
Mid;
MidSqr < N ->
isqrt3a(N, Mid + 1, High);
MidSqr > N ->
isqrt3a(N, Low, Mid - 1)
end.
Problem
Now it solves the 79-digit number (namely 1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111) at lightning speed; the result is shown immediately. However, it takes 60 seconds (+- 2 seconds) on my machine to solve one million (1,000,000) 61-digit numbers (namely, from 1000000000000000000000000000000000000000000000000000000000000 to 1000000000000000000000000000000000000000000000000000001000000). I would like to do it even faster.
[ Trial 4 : Using Newton's Method with Bigint + and div Only ]
Code
isqrt4(0) -> 0;
isqrt4(N) when erlang:is_integer(N), N >= 0 ->
isqrt4(N, N).
isqrt4(N, Xk) ->
Xk1 = (Xk + N div Xk) div 2,
if
Xk1 >= Xk ->
Xk;
Xk1 < Xk ->
isqrt4(N, Xk1)
end.
Code in C / C++ (for your interest)
Recursive variant
#include <stdint.h>
uint32_t isqrt_impl(
uint64_t const n,
uint64_t const xk)
{
uint64_t const xk1 = (xk + n / xk) / 2;
return (xk1 >= xk) ? xk : isqrt_impl(n, xk1);
}
uint32_t isqrt(uint64_t const n)
{
if (n == 0) return 0;
if (n == 18446744073709551615ULL) return 4294967295U;
return isqrt_impl(n, n);
}
Iterative variant
#include <stdint.h>
uint32_t isqrt_iterative(uint64_t const n)
{
uint64_t xk = n;
if (n == 0) return 0;
if (n == 18446744073709551615ULL) return 4294967295U;
do
{
uint64_t const xk1 = (xk + n / xk) / 2;
if (xk1 >= xk)
{
return xk;
}
else
{
xk = xk1;
}
} while (1);
}
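For comparison, the same Newton iteration can be sketched in Python, whose built-in integers are arbitrary precision like Erlang's. The only deliberate difference from Trial 4 is the starting point: instead of N itself, this sketch starts from a power of two with about half the bits of N, which is my own assumption about a cheap way to cut the number of iterations and has not been timed in Erlang. The name `isqrt_newton` is made up for the example.

```python
def isqrt_newton(n):
    """floor(sqrt(n)) for a non-negative integer n, using only +, // and shifts."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0
    # Any start >= floor(sqrt(n)) works. 2**ceil(bits / 2) is such a value
    # and is far closer to the root than n itself.
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2
        if y >= x:       # the sequence stopped decreasing: x is the root
            return x
        x = y
```

Starting from n, roughly half of the iterations are spent just halving the estimate down to the right order of magnitude, which is the work the tighter start removes.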
Problem
The Erlang code solves one million (1,000,000) 61-digit numbers in 40 seconds (+- 1 second) on my machine, so this is faster than Trial 3. Can it go even faster?
About My Machine
Processor : 3.4 GHz Intel Core i7
Memory : 32 GB 1600 MHz DDR3
OS : Mac OS X Version 10.9.1
Related Questions
Integer square root in python
The answer by user448810 uses "Newton's Method". I'm not sure whether doing the division using "integer division" is okay or not. I'll try this later as an update. [UPDATE (2015-01-11): It is okay to do so]
The answer by math involves using a 3rd party Python package gmpy, which is not very favourable to me, since I'm primarily interested in solving it in Erlang with only builtin facilities.
The answer by DSM seems interesting. I don't really understand what is going on, but it seems that "bit magic" is involved there, and so it's not quite suitable for me too.
Infinite Recursion in Meta Integer Square Root
This question is for C++, and the algorithm by AraK (the questioner) looks like it's from the same idea as Trial 2 above.

How about a binary search like the following? It needs no floating-point division, only integer multiplication (though it is slower than Newton's method):
low = 1;
/* More efficient bound
high = pow(10,log10(target)/2+1);
*/
high = target;
while(low<high) {
    mid = (low+high)/2;
    currsq = mid*mid;
    if(currsq==target) {
        return(mid);
    }
    if(currsq<target) {
        if((mid+1)*(mid+1)>target) {
            return(mid);
        }
        low = mid+1;
    }
    else {
        high = mid-1;
    }
}
return(low); /* for target >= 1 the loop ends with low == high == isqrt(target) */
This takes O(log N) iterations, so it should not run forever even for very large numbers.
Log10(target) Computation if needed :-
acc = target
log10 = 0;
while(acc>0) {
log10 = log10 + 1;
acc = acc/10;
}
Note : acc/10 is integer division
Edit :-
Efficient bound :- The sqrt(n) has about half the number of digits as n so you can pass high = 10^(log10(N)/2+1) && low = 10^(log10(N)/2-1) to get tighter bound and it should provide 2 times speed up.
Evaluate bound:-
bound = 1;
acc = N;
count = 0;
while(acc>0) {
acc = acc/10;
if(count%2==0) {
bound = bound*10;
}
count++;
}
high = bound*10;
low = bound/10;
isqrt(N,low,high);
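The bounded binary search can be sketched in Python as follows. This sketch derives the bounds from the bit length rather than from the decimal digit count used above, simply because Python exposes `int.bit_length()` directly; the function name `isqrt_bisect` and the half-open invariant are choices made for this example.

```python
def isqrt_bisect(n):
    """floor(sqrt(n)) by binary search, using only +, *, // and shifts."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # The root has about half as many bits as n, which brackets it
    # between two neighbouring powers of two.
    half_bits = (n.bit_length() - 1) // 2
    low = 1 << half_bits          # low * low <= n
    high = 1 << (half_bits + 1)   # high * high > n
    # Invariant: low * low <= n < high * high
    while high - low > 1:
        mid = (low + high) // 2
        if mid * mid <= n:
            low = mid
        else:
            high = mid
    return low
```

With these bounds the search interval has the same order of magnitude as the root itself, so the loop runs about half as many times as when starting from [1, n].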

How to find the number of values in a given range divisible by a given value?

I have three numbers x, y, z.
Consider the range of numbers between x and y.
How can I find how many numbers n in that range satisfy n % z == 0, i.e. how many numbers between x and y are divisible by z?

It can be done in O(1): find the first one, find the last one, find the count of all other.
I'm assuming the range is inclusive. If your ranges are exclusive, adjust the bounds by one:
Find the first value at or after x that is divisible by z. You can discard the original x:
x_mod = x % z;
if(x_mod != 0)
    x += (z - x_mod);
Find the last value at or before y that is divisible by z. You can discard the original y:
y -= y % z;
Find the size of this range:
if(x > y)
    return 0;
else
    return (y - x) / z + 1;
If mathematical floor and ceil functions are available, the first two parts can be written more readably. Also the last part can be compressed using math functions:
x = ceil (x, z);
y = floor (y, z);
return max((y - x) / z + 1, 0);
If the input is guaranteed to be a valid range (x <= y), the last test or max is unnecessary:
x = ceil (x, z);
y = floor (y, z);
return (y - x) / z + 1;

(2017, answer rewritten thanks to comments)
The number of multiples of z from 1 up to a number n is simply n / z
/ being the integer division, meaning decimals that could result from the division are simply ignored (for instance 17/5 => 3 and not 3.4).
Now, in a range from x to y, how many multiples of z are there?
Let's see how many multiples m we have up to y
0----------------------------------x------------------------y
-m---m---m---m---m---m---m---m---m---m---m---m---m---m---m---
You see where I'm going... to get the number of multiples in the range [ x, y ], take the number of multiples up to y, then subtract the number of multiples before x, which is (x-1) / z
Solution: ( y / z ) - (( x - 1 ) / z )
Programmatically, you could make a function numberOfMultiples
function numberOfMultiples(n, z) {
    return Math.floor(n / z); // integer division
}
to get the number of multiples in a range [x, y]
numberOfMultiples(y, z) - numberOfMultiples(x - 1, z)
The function is O(1), there is no need of a loop to get the number of multiples.
Examples of results you should find
[30, 90] ÷ 13 => 4
[1, 1000] ÷ 6 => 166
[100, 1000000] ÷ 7 => 142843
[777, 777777777] ÷ 7 => 111111001
For the first example, 90 / 13 = 6, (30-1) / 13 = 2, and 6-2 = 4
---26---39---52---65---78---91--
^ ^
30<---(4 multiples)-->90
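As a quick way to check the formula against the examples above, here is a small Python sketch. The function name `count_multiples` is hypothetical, and the sketch assumes an inclusive range and a positive z; Python's `//` floors, so the subtraction also behaves for x = 0 or negative bounds.

```python
def count_multiples(x, y, z):
    """Number of integers n with x <= n <= y that are divisible by z (z > 0)."""
    if z <= 0:
        raise ValueError("z must be positive")
    if x > y:
        return 0
    # multiples up to y, minus the multiples strictly below x
    return y // z - (x - 1) // z
```

For the first example, `count_multiples(30, 90, 13)` evaluates 90 // 13 - 29 // 13 = 6 - 2 = 4.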

I also encountered this on Codility. It took me much longer than I'd like to admit to come up with a good solution, so I figured I would share what I think is an elegant solution!
Straightforward Approach 1/2:
O(N) time solution with a loop and counter, unrealistic when N = 2 billion.
Awesome Approach 3:
We want the number of integers in some range that are divisible by K.
Simple case: assume range [0 .. n*K], N = n*K
N/K represents the number of integers in [0,N) that are divisible by K, given N%K = 0 (aka. N is divisible by K)
ex. N = 9, K = 3, count = |{0 3 6}| = 3 = 9/3
Similarly,
N/K + 1 represents the number of integers in [0,N] divisible by K
ex. N = 9, K = 3, count = |{0 3 6 9}| = 4 = 9/3 + 1
I think really understanding the above fact is the trickiest part of this question, I cannot explain exactly why it works.
The rest boils down to prefix sums and handling special cases.
Now we don't always have a range that begins with 0, and we cannot assume the two bounds will be divisible by K.
But wait! We can fix this by calculating our own nice upper and lower bounds and using some subtraction magic :)
First find the closest upper and lower bounds in the range [A,B] that are divisible by K.
Upper bound (easier): ex. B = 10, K = 3, new_B = 9... the pattern is B - B%K
Lower bound: ex. A = 10, K = 3, new_A = 12... try a few more and you will see the pattern is A - A%K + K (or A itself when A is already divisible by K)
Then calculate the following using the above technique:
Determine the total number of integers X in [0,B] that are divisible by K
Determine the total number of integers Y in [0,A) that are divisible by K
Calculate the number of integers in [A,B] that are divisible by K in constant time by the expression X - Y
Website: https://codility.com/demo/take-sample-test/count_div/
class CountDiv {
public int solution(int A, int B, int K) {
int firstDivisible = A%K == 0 ? A : A + (K - A%K);
int lastDivisible = B%K == 0 ? B : B - B%K; //B/K behaves this way by default.
return (lastDivisible - firstDivisible)/K + 1;
}
}
This is my first time explaining an approach like this. Feedback is very much appreciated :)

This is one of the Codility Lesson 3 questions. For this question, the input is guaranteed to be in a valid range. I answered it using Javascript:
function solution(x, y, z) {
var totalDivisibles = Math.floor(y / z),
excludeDivisibles = Math.floor((x - 1) / z),
divisiblesInArray = totalDivisibles - excludeDivisibles;
return divisiblesInArray;
}
https://codility.com/demo/results/demoQX3MJC-8AP/
(I actually wanted to ask about some of the other comments on this page but I don't have enough rep points yet).

Divide y-x by z, rounding down. Add one if y%z < x%z or if x%z == 0.
No mathematical proof, unless someone cares to provide one, but test cases, in Perl:
#!perl
use strict;
use warnings;
use Test::More;
sub multiples_in_range {
my ($x, $y, $z) = @_;
return 0 if $x > $y;
my $ret = int( ($y - $x) / $z);
$ret++ if $y%$z < $x%$z or $x%$z == 0;
return $ret;
}
for my $z (2 .. 10) {
for my $x (0 .. 2*$z) {
for my $y (0 .. 4*$z) {
is multiples_in_range($x, $y, $z),
scalar(grep { $_ % $z == 0 } $x..$y),
"[$x..$y] mod $z";
}
}
}
done_testing;
Output:
$ prove divrange.pl
divrange.pl .. ok
All tests successful.
Files=1, Tests=3405, 0 wallclock secs ( 0.20 usr 0.02 sys + 0.26 cusr 0.01 csys = 0.49 CPU)
Result: PASS
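For readers who want the missing argument, here is a short sketch of why the rule holds, assuming integers 0 <= x <= y and z >= 1. The symbols q, r, p, s and N are introduced here for illustration only; N is the true count of multiples of z in [x, y].

```latex
\begin{align*}
&\text{Write } x = qz + r,\quad y = pz + s,\quad 0 \le r,\, s < z. \\
&\text{Then } N = \left\lfloor \tfrac{y}{z} \right\rfloor
   - \left\lfloor \tfrac{x-1}{z} \right\rfloor
 \quad\text{and}\quad y - x = (p - q)z + (s - r),\ \ -z < s - r < z. \\[4pt]
&r = 0:\quad \left\lfloor \tfrac{x-1}{z} \right\rfloor = q - 1,\quad
   N = p - q + 1,\quad \left\lfloor \tfrac{y-x}{z} \right\rfloor = p - q. \\
&r > 0,\ s \ge r:\quad \left\lfloor \tfrac{x-1}{z} \right\rfloor = q,\quad
   N = p - q = \left\lfloor \tfrac{y-x}{z} \right\rfloor. \\
&r > 0,\ s < r:\quad \left\lfloor \tfrac{x-1}{z} \right\rfloor = q,\quad
   N = p - q,\quad \left\lfloor \tfrac{y-x}{z} \right\rfloor = p - q - 1.
\end{align*}
```

So one has to be added exactly when x%z == 0 (first case) or y%z < x%z (third case), which is the rule stated at the top of this answer.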

Let [A;B] be an interval of non-negative integers including A and B such that 0 <= A <= B, and let K be the divisor.
It is easy to see that there are N(A) = ⌊A / K⌋ = floor(A / K) positive multiples of K in interval [0;A]:
         1K       2K       3K       4K       5K
●········x········x··●·····x········x········x···>
0                    A
Similarly, there are N(B) = ⌊B / K⌋ = floor(B / K) positive multiples of K in interval [0;B]:
         1K       2K       3K       4K       5K
●········x········x········x········x···●····x···>
0                                       B
Then N = N(B) - N(A) equals the number of multiples of K (the integers divisible by K) in range (A;B]. The point A is not included, because the subtracted N(A) includes this point. Therefore, the result should be incremented by one if A mod K is zero:
N := N(B) - N(A)
if (A mod K = 0)
N := N + 1
Implementation in PHP
function solution($A, $B, $K) {
if ($K < 1)
return 0;
$c = floor($B / $K) - floor($A / $K);
if ($A % $K == 0)
$c++;
return (int)$c;
}
In PHP, the effect of the floor function can be achieved by casting to the integer type:
$c = (int)($B / $K) - (int)($A / $K);
which, I think, is faster.

Here is my short and simple solution in C++ which got 100/100 on codility. :)
Runs in O(1) time. I hope it's not difficult to understand.
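int solution(int A, int B, int K) {
// write your code in C++11
// multiples of K in (A, B], assuming 0 <= A <= B and K >= 1
int cnt = B/K - A/K;
// A is excluded by the subtraction above, so count it if it is a multiple
// (the (B - A)/K form misses cases such as A=5, B=7, K=3)
if(A%K==0)
cnt++;
return cnt;
}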

(floor)(high/d) - (floor)(low/d) - (high%d==0)
Explanation:
There are floor(a/d) positive numbers divisible by d from 0 to a (d != 0).
Therefore (floor)(high/d) - (floor)(low/d) gives the count of numbers divisible by d in the range (low,high] (note that low is excluded and high is included in this range).
Now, to remove high from the range, just subtract (high%d==0).
Works for integers, floats or whatever (Use fmodf function for floats)
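The formula above can be sketched in Python as follows. This is a minimal sketch for integers only, assuming 0 <= low < high and d >= 1; the function name `count_strictly_between` is made up for illustration. Note that with low == high the formula would subtract high from an already empty range, so the sketch rejects that case.

```python
def count_strictly_between(low: int, high: int, d: int) -> int:
    """Count the multiples of d in the open range (low, high).

    Assumes 0 <= low < high and d >= 1 (integers only).
    """
    if not (0 <= low < high and d >= 1):
        raise ValueError("expected 0 <= low < high and d >= 1")
    in_half_open = high // d - low // d      # multiples of d in (low, high]
    high_is_multiple = 1 if high % d == 0 else 0
    return in_half_open - high_is_multiple   # drop high itself if it was counted
```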

I won't strive for an O(1) solution; I'll leave that to someone more clever :) I just feel this is a perfect use case for functional programming. Simple and straightforward.
> x,y,z=1,1000,6
=> [1, 1000, 6]
> (x..y).select {|n| n%z==0}.size
=> 166
EDIT: after reading others' O(1) solutions, I feel ashamed. Programming makes people too lazy to think...

By definition, division (a/b = c) means taking a set of size a and forming groups of size b; the number of such groups that can be formed, c, is the quotient of a and b. That quotient is nothing more than the number of integers within the interval ]0..a] (not including zero, but including a) that are divisible by b.
so by definition:
Y/Z - number of integers within ]0..Y] that are divisible by Z
and
X/Z - number of integers within ]0..X] that are divisible by Z
thus:
result = [Y/Z] - [X/Z] + x (where x = 1 if and only if X is divisible by Z, otherwise 0 - assuming the given range [X..Y] includes X)
example :
for (6, 12, 2) we have 12/2 - 6/2 + 1 (as 6%2 == 0) = 6 - 3 + 1 = 4 // {6, 8, 10, 12}
for (5, 12, 2) we have 12/2 - 5/2 + 0 (as 5%2 != 0) = 6 - 2 + 0 = 4 // {6, 8, 10, 12}
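The rule above translates almost word for word into code. A minimal Python sketch, assuming integers 0 <= X <= Y and Z >= 1; the function name `count_in_range` is made up for illustration:

```python
def count_in_range(x: int, y: int, z: int) -> int:
    """Count the integers in [x..y] divisible by z, assuming 0 <= x <= y, z >= 1."""
    in_half_open = y // z - x // z       # multiples of z in ]x..y]
    includes_x = 1 if x % z == 0 else 0  # add x back when it is itself a multiple
    return in_half_open + includes_x
```

Both `count_in_range(6, 12, 2)` and `count_in_range(5, 12, 2)` give 4, matching the two worked examples.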

The time complexity of the solution will be constant, O(1).
Code Snippet :
int countDiv(int a, int b, int m)
{
int mod = (min(a, b)%m==0);
int cnt = abs(floor(b/m) - floor(a/m)) + mod;
return cnt;
}

Here n gives the count of numbers in the range that are divisible by k, and the program prints the sum of all those numbers.
int a = sc.nextInt();
int b = sc.nextInt();
int k = sc.nextInt();
int first = 0;
if (a > k) {
first = (a % k == 0) ? a : a + (k - a % k); // round a up to the next multiple of k
} else {
first = k;
}
int last = b - b%k;
if (first > last) {
System.out.println(0);
} else {
int n = (last - first)/k+1;
System.out.println(n * (first + last)/2);
}

Here is the solution to the problem written in Swift Programming Language.
Step 1: Find the first number in the range divisible by z.
Step 2: Find the last number in the range divisible by z.
Step 3: Use a mathematical formula to find the number of divisible numbers by z in the range.
func solution(_ x : Int, _ y : Int, _ z : Int) -> Int {
var numberOfDivisible = 0
var firstNumber: Int
var lastNumber: Int
if y == x {
return x % z == 0 ? 1 : 0
}
//Find first number divisible by z
let moduloX = x % z
if moduloX == 0 {
firstNumber = x
} else {
firstNumber = x + (z - moduloX)
}
//Find last number divisible by z
let moduloY = y % z
if moduloY == 0 {
lastNumber = y
} else {
lastNumber = y - moduloY
}
//Math formula
numberOfDivisible = Int(floor(Double((lastNumber - firstNumber) / z))) + 1
return numberOfDivisible
}

public static int Solution(int A, int B, int K)
{
int count = 0;
//If A is divisible by K
if(A % K == 0)
{
count = (B / K) - (A / K) + 1;
}
//If A is not divisible by K
else if(A % K != 0)
{
count = (B / K) - (A / K);
}
return count;
}

This can be done in O(1).
Here is a solution in C++.
auto first{ x % z == 0 ? x : x + z - x % z };
auto last{ y % z == 0 ? y : y - y % z };
auto ans{ (last - first) / z + 1 };
Where first is the first number that ∈ [x; y] and is divisible by z, last is the last number that ∈ [x; y] and is divisible by z and ans is the answer that you are looking for.
