Finding numbers whose digits sum to a prime - algorithm

I was trying to solve this problem on SPOJ, in which I have to find how many numbers in a range have digits that sum to a prime. The range can be very big (an upper bound of 10^8 is given). The naive solution, which loops over the entire range and checks the condition for each number, timed out. I can't seem to find a pattern or a formula either. Could someone please point me in a direction to proceed?
Thanks in advance...

Here are some tips:
Try to write a function that finds how many numbers in a given range have a given digit sum. The easiest way is to write a function that returns how many numbers up to a given value a have a given digit sum (call this f(sum, a)); the count of such numbers in the range a to b is then f(sum, b) - f(sum, a - 1).
Note that the digit sum itself will not be very high: at most 8 * 9 = 72 for numbers below 10^8, so the number of prime sums to check is really small.
Hope this helps.
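The tips above can be sketched as a small digit-by-digit count. This is only an illustration under stated assumptions: the names ways, count_with_digit_sum (playing the role of f(sum, a)), count_prime_digit_sums and PRIME_SUMS are made up here, and the bound of 81 on the digit sum assumes inputs below 10^9.

```python
from functools import lru_cache

# Primes that can occur as a digit sum (assumption: inputs below 10**9, so sum <= 81).
PRIME_SUMS = [p for p in range(2, 82) if all(p % d for d in range(2, p))]

@lru_cache(maxsize=None)
def ways(positions, remaining):
    """Ways to fill `positions` free digits (0-9 each) so that they sum to `remaining`."""
    if remaining < 0:
        return 0
    if positions == 0:
        return 1 if remaining == 0 else 0
    return sum(ways(positions - 1, remaining - d) for d in range(10))

def count_with_digit_sum(target, upper):
    """f(sum, a): how many integers in [0, upper] have digit sum `target`."""
    if upper < 0:
        return 0
    digits = [int(ch) for ch in str(upper)]
    total = 0
    remaining = target
    for i, limit in enumerate(digits):
        # Put a smaller digit here; every later position is then unconstrained.
        for d in range(limit):
            total += ways(len(digits) - i - 1, remaining - d)
        remaining -= limit  # otherwise keep following the digits of `upper`
        if remaining < 0:
            break
    else:
        if remaining == 0:
            total += 1  # `upper` itself has the requested digit sum
    return total

def count_prime_digit_sums(a, b):
    """How many integers in [a, b] have a prime digit sum."""
    return sum(count_with_digit_sum(p, b) - count_with_digit_sum(p, a - 1)
               for p in PRIME_SUMS)
```

For example, count_prime_digit_sums(1, 20) counts 2, 3, 5, 7, 11, 12, 14, 16 and 20.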

I (seriously) doubt whether this 'opposite' approach will be any faster than @izomorphius's suggestion, but it might prompt some thoughts about improving the performance of your program:
1) Get the list of primes in the range 2..71 (you can omit 1 and 72 from any consideration since neither is prime).
2) Enumerate the integer partitions of each of the prime numbers in the list. Here's some Python code. You'd want to modify this so as not to generate partitions which were invalid, such as those containing numbers larger than 9.
3) For each of those partitions, pad out with 0s to make a set of 8 digits, then enumerate all the permutations of the padded set.
Now you have the list of numbers you require.

Generate the primes using the sieve of Eratosthenes up to the maximum sum (9 + 9...). Put them in a Hash table. Then you could likely loop quickly through 10^8 numbers and add up their sums. There might be more efficient methods, but this should be quick enough.
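A minimal sketch of this brute-force idea, assuming a set stands in for the hash table and that 81 (nine digits of 9) is a safe upper bound for the digit sum. The names sieve and count_brute_force are illustrative only, and in an interpreted language a loop over 10^8 numbers may still be too slow.

```python
def sieve(limit):
    """Sieve of Eratosthenes: return the set of primes <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return {i for i, flag in enumerate(is_prime) if flag}

def count_brute_force(a, b):
    """Count x in [a, b] whose digit sum is prime by checking every x."""
    prime_sums = sieve(81)  # assumed bound: at most nine digits, each at most 9
    return sum(1 for x in range(a, b + 1)
               if sum(map(int, str(x))) in prime_sums)
```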

Related

How to use the Pisano Period to find the Last Digit of the Sum of Fibonacci Numbers?

I'm taking an online algorithms course, and I've come across a problem where I need to find the last digit of the sum of the Fibonacci numbers up the nth (current) number.
I need some help connecting the dots. As I understand it, the "Last Digit of the Sum of Fibonacci Numbers" problem has a solution that is somehow related to the Pisano Period.
But I don't really understand what that means.
The Pisano Period was used to calculate the remainder given some value of m for an extremely large Fibonacci Number, which was the focus of a prior problem (I.E., Solve Fn mod m = ???).
Forum posts (and the instruction set) seem to suggest that the length can somehow help us quickly zero in on the sum for the current Fibonacci without having to actually build up to it normally through a loop.
I would rather avoid just looking at someone else's solution if possible, so if anyone has any useful hints that can help me see the missing link, I would really appreciate it.
The last digit of a Fibonacci number is just that number reduced modulo 10. Pisano periods are the periods with which the sequence of Fibonacci numbers, taken modulo some base, repeats. So, if you're interested in F(x) mod 10, you'd be interested in the Pisano period p(10).
If we have this period, say it was something like [1, 5, 2, 7, 0] (it's not, but for the sake of example), we'd know that the 3rd integer in the sequence ends with a "2". And because it repeats, we'd know the 8th integer also ends in a "2", and the 13th...
Generalizing this, we could say that the last digit of the number N could be found at the ith index in our list we just built, for i satisfying N = 5 * k + i, where k is just any integer, and 5 comes from the fact that our list has 5 elements (and thus repeats every 5 values). Rewriting this, we could say i = N mod 5.
Putting that all together (spoilers), we just need to find the actual values of the repeating sequence mod 10, and then take our input value N (for finding the Nth Fibonacci number mod 10), and index into said repeatingSequence at index N mod len(repeatingSequence) for our answer.
For reference, for base 10, the actual repeating sequence is:
011235831459437077415617853819099875279651673033695493257291
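The indexing idea above can be sketched as follows. The function names are made up for illustration, and the last function relies on the standard identity F(0) + F(1) + ... + F(n) = F(n+2) - 1, which the answer itself does not spell out.

```python
def pisano_cycle(m):
    """Return one full period of the Fibonacci sequence modulo m (assumes m >= 2)."""
    cycle = [0, 1]
    while True:
        cycle.append((cycle[-1] + cycle[-2]) % m)
        # The sequence starts over once the opening pair 0, 1 shows up again.
        if cycle[-2] == 0 and cycle[-1] == 1:
            return cycle[:-2]

def fib_last_digit(n):
    """Last digit of F(n), found by indexing into the repeating sequence mod 10."""
    cycle = pisano_cycle(10)
    return cycle[n % len(cycle)]

def fib_sum_last_digit(n):
    """Last digit of F(0) + ... + F(n), using the identity sum = F(n+2) - 1."""
    return (fib_last_digit(n + 2) - 1) % 10
```

Joining the digits of pisano_cycle(10) reproduces the 60-digit sequence quoted above.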

Return the count of all prime numbers in range [a,b] such that all the digits are from set {1,5,9} . 1<=a<=b<=10⁹

Return the count of all prime numbers in range [a,b] such that all the digits are from set {1,5,9} . 1<=a<=b<=10⁹.
My approach -
I was trying to generate all the numbers whose digits are from the set {1,5,9}, which comes out to 3^9 (19683) numbers, and then checking each one for primality.
Can I do this in a better time complexity?
Never generate a large set and only afterwards check all of its elements, ruling out most of them. That requires a lot of memory to store things you'll be discarding. Instead, find a single number with "valid" digits, check it for primeness, and only then store it in a set. Accessing large arrays of memory is very time-intensive on modern computers compared to doing math.
"I produced all the numbers": I hope you're doing this smartly! You never have to check a number with a last digit being 5 for primeness (there's only a single prime that ends in 5; that's 5 itself!), for example. Also, you hopefully don't just build all combinations of digits "manually". Say, you find a number 19551, then 19559 is also a candidate, you never have to manually "combine" digits to try out the last digit.
Of course, your prime-checking algorithm needs to be matching your kind of problem: You can remove the initial check for divisibility by 2 (you never produce even numbers), for example. You never need to check for divisibility by 5, because you never use 5 or 0 as last digit. Depending on your prime checking algorithm, you also would want to save the factor that "killed" the xxxx1 – that's one factor you don't have to check xxxx9 against. Do your 3-factor-checking based on the count of 1,5 and 9 in your number; you can directly infer cross-sum and hence 3-divisibility from that.
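A rough sketch of the generate-then-test-immediately idea, under some assumptions: plain trial division by odd numbers stands in for whatever prime check you actually use, only the "skip a trailing 5" shortcut is applied, and the names is_prime and count_159_primes are made up for illustration.

```python
from itertools import product

def is_prime(n):
    """Trial division by odd numbers (a simple stand-in primality check)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def count_159_primes(a, b):
    """Count primes in [a, b] whose digits all come from {1, 5, 9}."""
    count = 0
    for length in range(1, len(str(b)) + 1):
        for digits in product("159", repeat=length):
            # A multi-digit number ending in 5 is divisible by 5, so skip it.
            if length > 1 and digits[-1] == "5":
                continue
            n = int("".join(digits))
            # Test each candidate right away instead of storing all of them first.
            if a <= n <= b and is_prime(n):
                count += 1
    return count
```

For instance, count_159_primes(1, 100) counts 5, 11, 19 and 59.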

Number of positive integers in [1,1e18] that cannot be divided by any integers in [2,10]

I am having difficulty trying to solve the following problem:
For Q queries, Q <= 1e6, where each query is a positive integer N, N <= 1e18, find the number of integers in [1,N] that cannot be
divided by integers in [2,10] for each query.
I thought of using a sieve method to filter out numbers in [1,1e18] for each query (similar to the sieve of Eratosthenes). However, the value of N could be very large, so there is no way I could use this method. The most useful observation that I could make is that numbers ending with 0,2,4,5,6,8 are invalid. But that does not help me with this problem.
I saw a solution for a similar problem that uses a smaller number of queries (Q <= 200). But it doesn't work for this problem (and I don't understand that solution).
Could someone please advise me on how to solve this problem?
The only numbers in [2,10] that matter are the primes: 2, 3, 5, 7.
So a number that cannot be divided by any integer in [2,10] is a number that cannot be divided by any of {2,3,5,7}.
The count of such numbers equals the total count of numbers in [1,n] minus the count of numbers divisible by at least one of {2,3,5,7}.
Now for the fun part: how many numbers in [1,n] are divisible by 2?
The answer is n/2, rounded down (why? Simple: one in every 2 consecutive numbers is divisible by 2).
Similarly, how many numbers are divisible by 5? The answer is n/5, rounded down.
...
So, do we have our answer yet? No, because we have double-counted the numbers divisible by both of {2,5}, or {2,7}, ..., so now we need to subtract them.
But wait, it seems we have now subtracted the numbers divisible by all of {2,5,7}, ... once too often, so we need to add them back.
...
Keep doing this until all combinations are taken care of.
There are 2^4 combinations, 16 in total, which is pretty small to deal with.
Take a look at the inclusion-exclusion principle for a good understanding.
Good luck!
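The inclusion-exclusion steps above can be sketched like this. The name count_undivided is made up for illustration, and math.prod assumes Python 3.8 or later.

```python
from itertools import combinations
from math import prod

PRIMES = (2, 3, 5, 7)

def count_undivided(n):
    """Count x in [1, n] not divisible by any of 2, 3, 5, 7 (inclusion-exclusion)."""
    total = 0
    for size in range(len(PRIMES) + 1):
        # Subsets of even size are added, subsets of odd size are subtracted.
        sign = -1 if size % 2 else 1
        for subset in combinations(PRIMES, size):
            # n // product = how many numbers in [1, n] are divisible by every prime in the subset
            total += sign * (n // prod(subset))
    return total
```

Each query touches only the 16 subsets, so the cost per query does not depend on the size of N.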
Here is an approach on how to handle this.
The place to start is to think about how you can split this into pieces. With such a problem, a good starting point is the least common multiple (LCM), in this case 2,520 (the smallest number divisible by all the numbers from 2 to 10).
The idea is that if x is not divisible by any number from 2-10, then x + 2,520 is not divisible by any of them either.
Hence, you can divide the problem into two pieces:
How many numbers between 1 and 2,520 are "relatively prime" to the numbers from 2-10?
How many times does 2,520 go into your target number? You need to take the remainder into account as well.
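A small sketch of the two pieces, assuming a precomputed prefix table over one period of 2,520. The names PERIOD, PREFIX and count_by_period are illustrative only; the gcd test works because 2,520 has exactly the prime factors 2, 3, 5 and 7.

```python
from math import gcd

PERIOD = 2520  # least common multiple of 2..10

# PREFIX[r] = how many x in [1, r] are not divisible by any number from 2 to 10.
PREFIX = [0] * (PERIOD + 1)
for x in range(1, PERIOD + 1):
    PREFIX[x] = PREFIX[x - 1] + (1 if gcd(x, PERIOD) == 1 else 0)

def count_by_period(n):
    """Count valid numbers in [1, n]: full periods plus the remainder."""
    full, rest = divmod(n, PERIOD)
    return full * PREFIX[PERIOD] + PREFIX[rest]
```

After the one-off table build, each query is a division and a lookup.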

Algorithm to list all possible ways to break a number into k factors?

As part of my effort to explore algorithms through Project Euler, I'm trying to write a method that will accept an integer 'n' and a number of factors 'k', and factorize n into k factors. If it's not possible, it will throw an error.
For instance, if I enter factorize(13257440,3), the function will return a list of all possible unique sets with 3 elements where the product of the 3 elements is equal to 13257440.
My first thought is to generate a multiset of the prime factors of n (with 'm' representing the size of the set), then partition the set into k parts. Once the partition sizes are determined, I would treat it as a combinations problem.
I'm having trouble however formulating algorithms for the two parts above, and have no idea where to start. Am I over complicating a simple problem with a simple solution? If not, what are some recommended approaches? Thanks!
Prime decomposition
Find all primes that divide n without remainder. Use the sieve of Eratosthenes to speed up the process considerably.
You can use/modify mine (warning this link is project Euler spoiler)
get primes up to n in C++
Now you need to modify the code so that the prime list becomes a list of multiplicands. For example, for n=12 this will find { 2,3 } but you need { 2,2,3 }, so when a dividing prime is found, check it again and again until n is no longer divisible by it, dividing n by it each time.
Add a flag to each found prime (is it used?) to speed up the next step...
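A minimal sketch of building the list of multiplicands. Note that it swaps in plain trial division rather than the sieve-based code the answer links to, and the name prime_factors is made up for illustration.

```python
def prime_factors(n):
    """Return the prime factors of n with repetition, e.g. 12 -> [2, 2, 3]."""
    factors = []
    d = 2
    while d * d <= n:
        # Keep dividing by d until n is no longer divisible by it.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever is left is itself a prime
    return factors
```

For the number from the question, prime_factors(13257440) gives [2, 2, 2, 2, 2, 5, 7, 7, 19, 89].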
The combination part
I assume the multiplicands can be the same, so add 1 to the prime list k times at the start, and create a function that generates all possible numbers up to some x from the unused primes found. Add a counter m for the unused primes, so that at the start m is set to the prime list size and all flags are set to unused.
Now you need to find all possibilities of using 1..m-k+1 numbers from the list. Each iteration marks the picked numbers as used and decreases m, so it is something like:
for (o=1;o<=m-k+1;o++)
Here, find all combinations of o unused numbers; handle this as generating o-digit numbers in base o without digit repetitions, which gives o! permutations.
You can use this (warning this link is Euler spoiler):
permutations in C++
Do not forget to set the flag for each used number and unset it after the iteration is done. Rewrite this function so that it is iterative, with calls findfirst(), findnext(), similar to my permutation class.
Now you can nest all this k times (using nested fors from the permutation link, or via recursion, each time reducing k and n).
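As a point of comparison, here is a sketch of a different and simpler technique than the flag-and-permutation scheme described above: a recursion over divisors that emits each set of k factors exactly once, in non-decreasing order. The name factorizations and the smallest parameter are made up for illustration; pass smallest=2 to exclude factors of 1.

```python
def factorizations(n, k, smallest=1):
    """Yield tuples of k non-decreasing factors (each >= smallest) with product n."""
    if k == 1:
        if n >= smallest:
            yield (n,)
        return
    d = smallest
    # d is the smallest of the k remaining factors, so d**k cannot exceed n.
    while d ** k <= n:
        if n % d == 0:
            for rest in factorizations(n // d, k - 1, d):
                yield (d,) + rest
        d += 1
```

For example, list(factorizations(12, 3, smallest=2)) gives [(2, 2, 3)]. An empty result plays the role of the "not possible" case from the question.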

Counting permutation of Strings

I need help with a problem. Given an input string with repetitions, say "aab", how can I
count the number of distinct permutations of that string?
One formula that could be used is n!/(n1! n2! ... nr!).
However, calculating these ni's takes time O(rn), or O(n) if we
use a lookup table.
However, I need a solution that does not use such tables. Is any recursive or
dynamic programming solution possible for this problem?
Thanks in advance.
The number of distinct permutations is n!/(c1!*c2!*...*cr!),
where n is the length of the string
and ck denotes the number of occurrences of the k-th of the r distinct characters.
For example, for the string aabb: n=4, ca=2, cb=2,
so the solution is 4!/(2!*2!)=6.
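The formula can be written out directly. This is a minimal sketch that uses collections.Counter for the character counts; the function name is illustrative.

```python
from collections import Counter
from math import factorial

def count_distinct_permutations(s):
    """n! divided by the factorial of each character's count."""
    result = factorial(len(s))
    for count in Counter(s).values():
        result //= factorial(count)  # each division is exact
    return result
```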
If you want to do this for very large strings, consider using the gamma function (with gamma(n+1)=n!), which is faster for large n and still gives you floating-point accuracy even in cases where you would get an int overflow.
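One way to follow the gamma-function suggestion is to work with logarithms, so that even a huge result does not overflow. The sketch below assumes math.lgamma (the logarithm of the gamma function) is accurate enough for your purposes; the function name is made up for illustration.

```python
from collections import Counter
from math import lgamma

def log_distinct_permutations(s):
    """Natural log of n!/(c1!*c2!*...), computed with lgamma(k + 1) = log(k!)."""
    counts = Counter(s).values()
    return lgamma(len(s) + 1) - sum(lgamma(c + 1) for c in counts)
```

Exponentiating the result recovers an approximate count when it is small enough to fit in a float.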
If you have arbitrary precision arithmetic, you could probably push the effort down to O(r+n) by exploiting the fact that you can, e.g., write 1*2*3 * 1*2*3*4 * 1*2*3*4*5*6*7 as (1*2*3)^3 * 4^2 * 5*6*7. The end result will still have O(rn) digits and you'll still have an O(rn) time consumption, because multiplication cost increases with the size of the number.
I don't see the difference between lookup tables and dynamic programming - basically, dynamic programming uses a lookup table that you build on-the-fly. (i.e., use a lookup table, but only populate it on-demand).
Do you need approximate answers, or exact ones? Which part of this calculation do you think is slow?
If you need approximate answers, use the gamma function as @Yannick Versley suggested.
If you need exact answers, here is how I'd do it. I'd first figure out the prime factorization of the answer, then multiply those factors out. This avoids division. The hard part of figuring out the prime factorization is figuring out the prime factorization of n!. For that you can use a trick. Suppose that p is a prime, and k is the integer part of n/p. Then the number of times that p divides n! is k plus the number of times that p divides k!. Proceed recursively and it is quick to see that, for instance, the number of times that 3 is a factor of 80! is 26 + 8 + 2 = 36. So after you find the primes up to n, it isn't hard to find the prime factorization of n!.
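The recursive trick above unrolls into a short loop. This is a minimal sketch with an illustrative function name.

```python
def prime_power_in_factorial(n, p):
    """How many times the prime p divides n!."""
    exponent = 0
    while n:
        n //= p          # k = integer part of n / p
        exponent += n    # add k, then repeat the same step on k
    return exponent
```

For 80! and p = 3 the loop adds 26, then 8, then 2.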
Once you know the prime factorization, you can multiply it out. You expect to be dealing with large numbers, so try to arrange to do lots of small multiplications first, and only a few big ones. Here is a simple way to do that.
Make an array of the prime factors. Scramble it (to mix up big and small factors). Then as long as you have at least 2 factors in your array, grab the first two, multiply them, and push the product onto the end. When you have one number left, that is your answer.
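The queue-based multiplication just described might look like the sketch below. It assumes a deque as the "array" and random.shuffle for the scrambling; the name multiply_out is made up for illustration.

```python
import random
from collections import deque

def multiply_out(factors):
    """Multiply all factors, pairing them up so the operands stay similar in size."""
    pool = list(factors)
    random.shuffle(pool)  # mix up big and small factors
    queue = deque(pool)
    if not queue:
        return 1  # empty product
    while len(queue) > 1:
        # Take the first two, multiply them, push the product onto the end.
        queue.append(queue.popleft() * queue.popleft())
    return queue[0]
```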
This should be much, much faster for large strings than the naive approach of multiplying the numbers one at a time. However in the end you will have very large numbers, and nothing can make multiplying those fast.
You can keep a running count for each character, and build the result up as you go along. It's impossible to do better than O(n), since without looking at every character in the string you can't know how many of each character there are.
I've written some code in Python, with some simple unit tests. The code carefully avoids large intermediate values when the result is going to be small (in fact, the variable result is never larger than len(s) times the final result). If you were going to code this up in another language, say C, then you might use an array of size 256 rather than the defaultdict.
If you want an exact result, then I don't think you can do better than this.
from collections import defaultdict

def permutations(s):
    # Count how often each character occurs.
    seen = defaultdict(int)
    for c in s:
        seen[c] += 1
    result = 1
    n = 0
    for k, count in seen.items():
        for j in range(count):
            # Extend the running binomial coefficient; the division is always exact.
            n += 1
            result *= n
            result //= j + 1
    return result

test_cases = [
    ('abc', 6),
    ('aab', 3),
    ('abcd', 24),
    ('aabb', 6),
    ('aaaaa', 1),
    ('a', 1)]

# Print only the cases where the result differs from the expected value.
for s, want in test_cases:
    got = permutations(s)
    if got != want:
        print('permutations(%s) = %s want %s' % (s, got, want))
As @MRalwasser says, the number of permutations should be n!. You can generate those permutations fairly simply, but the run time is going to be exponential because you have to hit exponentially many output strings. (A quick way to see that n! grows at least exponentially in n is Stirling's formula.)

Resources