I am attempting to implement this "find the nth prime number" algorithm in Ruby 2.1.
I've tagged it 'algorithm' as well because I think the question is language-agnostic, and the Ruby code is simple enough to read even if you're unfamiliar with the language. I've used descriptive variable names to help with that.
1. Iterate over the whole number system, ignoring even numbers greater than 2 (2, 3, 5, 7, …)
2. For each integer, p, check if p is prime:
   a. Iterate over the primes already found which are less than or equal to the square-root of p
   b. For each prime in this set, f, check to see if it is a factor of p:
      i. If f divides p then p is non-prime. Continue from 2 for the next p.
   c. If no factors are found, p is prime. Continue to 3.
3. If p is not the nth prime we have found, add it to the list of primes. Continue from 2 for the next p.
4. Otherwise, p is the nth prime we have found and we should return it.
Sounds simple enough. So I write my method (function):
def nth_prime(n)
  primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
  primes[-1].upto(Float::INFINITY) do |p|
    return primes[n-1] if primes.length >= n-1
    possible_prime = true
    primes_to_check = primes.select{|x|x<=Math.sqrt(p)}
    primes_to_check.each do |f|
      if f%p==0
        possible_prime = false
        break
      end
    end
    primes << p if possible_prime
  end
end
The intent is to say nth_prime(10) and get the 10th prime number.
To explain my logic:
I start with a list of known primes, since the algorithm requires that. I list the first ten.
Then I iterate over the entire number system. (primes[-1]+2).upto(Float::INFINITY) do |p| will offer each number up from the last known prime plus two (since +1 will result in an even number and evens over 2 cannot be prime) to infinity to the indented block as p. I have not skipped even numbers and have
The first thing I do is return the nth prime number if the list of known primes is already at least n elements long. This works for the known values -- if you ask for the 5th, you'll get 11 as a result.
Then I set a flag, possible_prime, to true. This indicates that nothing has proved it to be not a prime yet. I'm going to do some tests and if it survives those without the flag being changed to false, then p is proven to be prime and is appended to the known-primes array. Eventually that array will be as long as n and return the nth value.
I create an array, primes_to_check, containing all known primes <= the square root of p. Each one gets tested in turn as f.
If f can cleanly divide p, I know that p is not prime, so I change the flag to false, and break, which brings us out of the primes-to-check loop and back in the upto-infinity loop. There's only one statement left in that loop, the one that appends to the known-primes array if the flag is true, which it isn't so we restart the loop with the next number.
If no fs can cleanly divide p then p must be prime, which means it survives to the end of the primes-to-check loop with the flag still set to true, and reaches the final 'append p to known primes' statement.
Eventually this will make the primes array sufficiently longer to answer the question "What is the nth prime?".
Problem
Asking for the 10th prime does get me 29, the last prime I pre-supplied. But asking for 11 gets nil, or no value. I've gone over the code a hundred times and can't imagine a case in which no value gets returned.
What have I done wrong?
return primes[n-1] if primes.length >= n-1
For primes to have an element at index n-1, it must have length at least n.
if f%p==0
This checks whether a known prime is divisible by the candidate, not whether the candidate is divisible by a known prime.
primes[-1].upto(Float::INFINITY) do |p|
This starts the loop at a prime already in the list (29). 29 is correctly found to be prime, so it is added to the list again. You'll want to start the loop at a number after 29.
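For reference, here is a minimal sketch with all three fixes applied. It is written in Python rather than Ruby, and details such as the `[2, 3]` seed list and stepping over even numbers are my own assumptions, not part of the original code.

```python
import math
from itertools import count


def nth_prime(n: int) -> int:
    """Return the n-th prime (1-based) by trial division against known primes."""
    if n < 1:
        raise ValueError("n must be at least 1")
    primes = [2, 3]  # seed list (an assumption; the question seeds ten primes)
    # Fix 3: start after the last seeded prime, stepping over even numbers.
    for candidate in count(primes[-1] + 2, 2):
        # Fix 1: index n-1 exists only once the list holds at least n primes.
        if len(primes) >= n:
            return primes[n - 1]
        limit = math.isqrt(candidate)
        is_prime = True
        for f in primes:
            if f > limit:
                break  # no known prime up to sqrt(candidate) divides it
            # Fix 2: test whether the known prime divides the candidate.
            if candidate % f == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
```

With these changes `nth_prime(10)` is 29 and `nth_prime(11)` is 31.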
Algorithm for testing whether a number is prime:
1)Input num
2)if num < 2 then broadcast not a prime no. and stop
3)counter = num-1
4)repeat until counter = 1
5)  remainder = num%counter
6)  if remainder = 0 then
7)    broadcast not a prime no. and stop
8)  change counter by -1
9)say it's a prime and stop
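The countdown test above could look roughly like this in Python. This is a sketch that assumes the counter starts at num - 1 and that numbers below 2 are reported as not prime; the name `is_prime` is mine.

```python
def is_prime(num: int) -> bool:
    """Trial division, counting the divisor down from num - 1 to 2."""
    if num < 2:
        return False  # 0, 1 and negative numbers are not prime
    counter = num - 1
    while counter > 1:
        if num % counter == 0:
            return False  # found a divisor strictly between 1 and num
        counter -= 1
    return True
```

It is much slower than testing only primes up to the square root, but it needs no list of known primes.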
I got a surprising interview question today at a big Bay Area tech company that I was absolutely stumped by despite seeming so easy. Was wondering if anyone has seen it or can offer a simpler solution as the interviewer didn't want to show me the answer. The solution can be written in any language or pseudocode.
Question:
Given a list of numbers, remove any extraneous repeating suffix sequences of numbers that appear at the end of the list until it has no repeating suffix sequences. The repeating sequence can be cut-off.
For example:
[1,2,3,4,5,6,7,5,6,7,5,6] -> [1,2,3,4,5,6,7]
explanation: [5, 6, 7] were repeating
Also consider the situation
[1,2,3,4,5,4,5,1,4,5,4,5,1,4,5,4,5] -> [1,2,3,4,5,4,5,1] # not [1,2,3,4,5,4,5,1,4,5,4,5,1]
explanation: [4,5,4,5,1] is a repeating sequence
There are always two ways to approach this kind of problem: finding any solution, and finding an efficient one. It is usually better to start with any solution and then think about how to optimize it.
Now as we can see in the second example, the problem is complicated by the fact that the repeating pattern is not known. So we could just do it for all the possible patterns at the end. Then we would need to check two things
is it actually repeating
how long is the result
Then we could just take the shortest result. Here is the Python code:
def remove_repeating_tail(a: list) -> list:
    results = []
    for i in range(len(a)):
        tail = a[i:]
        results.append(remove_repeats(a, tail))
    if len(results) == 0:
        return a
    return sorted(results, key=len)[0]
Also we made sure we cover all the cases. Empty list, no repeating pattern. Next we need to write remove_repeats. Also we check the empty repeating pattern, so we need to be aware of that.
def remove_repeats(a: list, tail: list) -> list:
    assert len(tail) <= len(a)
    if len(tail) == 0:
        return a
    remainder = a
    count = 0
    while remainder[-len(tail):] == tail:
        remainder = remainder[:-len(tail)]
        count += 1
    if count <= 1:
        return a
    return remainder
We remove the repeating pattern for as long as it matches the end of the list, and return the original list unless it occurred at least twice. Now it's time to test whether the code actually works, if that is possible in the interview.
remove_repeating_tail([1,2,3,4,5,6,7,5,6,7,5,6])
-> [1, 2, 3, 4, 5, 6]
remove_repeating_tail([1,2,3,4,5,4,5,1,4,5,4,5,1,4,5,4,5])
-> [1, 2, 3, 4, 5, 4, 5]
Also good to check some other cases:
remove_repeating_tail([1,2,3,4])
-> [1, 2, 3, 4]
remove_repeating_tail([])
-> []
After quite a bit of fixing we got the above, which I think is correct. In particular I missed:
first I had an infinite loop in remove_repeats for an empty tail
remove_repeats always removed the tail, and sometimes everything, as I wasn't checking that there is at least one repeat. I then added the counting.
I made simple mistakes like writing results = res instead of results.append(res), leading to some exceptions.
Then a lot of simplification. First I used a sentinel None to communicate back that the tail is not repeating, but we could just return the whole list. Then I checked for the repetition with an if before the while loop, but realized it's basically doing the same as the first iteration, so I used counting.
Similarly I don't like the if len(results) == 0: check. I would probably add a to the result in the beginning and remove the check, as now there is always a result. Then we could start the counting from 1 instead of 0. Still I kept it in.
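A sketch of what that simplified version might look like, seeding the results with `a` itself and taking the minimum directly rather than sorting. The behaviour is meant to match the code above; the exact shape is my guess at the simplification, not a tested rewrite by the author.

```python
def remove_repeats(a: list, tail: list) -> list:
    """Strip `tail` from the end of `a` while it matches; keep `a` unless it occurred twice or more."""
    if not tail:
        return a
    remainder, count = a, 0
    while remainder[-len(tail):] == tail:
        remainder = remainder[:-len(tail)]
        count += 1
    return remainder if count > 1 else a


def remove_repeating_tail(a: list) -> list:
    """Try every suffix as the repeating pattern and keep the shortest result."""
    results = [a]  # seeding with `a` means there is always a candidate
    for i in range(len(a)):
        results.append(remove_repeats(a, a[i:]))
    return min(results, key=len)
```

Because `min` returns the first of several equally short candidates, a list with no repeating tail comes back unchanged.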
If we want something fast, we first need to analyze the complexity.
So remove_repeats for a list of size n and a tail of size k takes O(n / k) loop iterations. Then we call this function n times. And then we sort it. Wait, why do we sort it? We could just take the minimum: return min(results, key=len). That's better.
In each loop we call remove_repeats starting with k = 1 to n. So we have:
sum(k = 1 .. n) O(n / k). This is n / 1 + n / 2 + n / 3 + .. + n / n = n * H_n, where H_n = 1 + 1/2 + .. + 1/n. I had to look this up on Wikipedia, but the H_n are called harmonic numbers. We can also just make our life easy and say it's less than O(n^2) for now. Otherwise I found the approximation H_n ≈ ln(n) + 0.58, so the sum is about n ln(n) + 0.58 n. So the complexity overall is O(n log n), counted in loop iterations. Not too bad I would say. Is it optimal? Maybe. Here I would compare it to some other similar algorithms (like substring search, etc).
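Written out, the estimate is the following, where γ ≈ 0.577 is the Euler-Mascheroni constant. Note that it counts loop iterations only, so treat it as a rough sketch rather than a full cost analysis.

```latex
\sum_{k=1}^{n} \frac{n}{k}
  \;=\; n \sum_{k=1}^{n} \frac{1}{k}
  \;=\; n\,H_n
  \;\approx\; n \ln n + \gamma\, n ,
  \qquad \gamma \approx 0.577
```

Since the linear term is dominated by n ln n, the sum is O(n log n).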
Before going there, at this point, I would check with the interviewer, where he would like to go next. As there are many directions.
This seems a tricky question and there may not be a simple solution. The best solution I can think of would be O(n) time and O(n) space, and that is if I am not missing any edge case.
Let's take as example
[1,2,3,4,5,4,5,1,4,5,4,5,1,4,5,4,5] -> [1,2,3,4,5,4,5,1]
Steps would be as follows:
Iterate over the input array from last index to first and build a dictionary (hashtable) with every number in the array being a key and value: a list of positions where the specific number is found in the array.
Occurrences dictionary will become:
{
  5: [16, 14, 11, 9, 6, 4],
  4: [15, 13, 10, 8, 5, 3],
  1: [12, 7, 0],
  3: [2],
  2: [1]
}
Find the possible suffix lengths by calculating deltas between every position and first position for every number. This way we take into consideration the case in which a specific number repeats in the suffix or in the prefix.
We then add each distinct possible suffix length to a set.
We sort the possible suffix lengths in descending order.
We get following suffix lengths:
[12, 10, 7, 5, 2]
For every possible length l, we test if arr[n-1] == arr[n-1-l]. If l is our suffix's length, it means that the number at last position is repeated at exactly l positions before. We then check the last l elements to respect the same condition. If they do, we found the maximum suffix length. If not, the max suffix length is even smaller, so we check the next possible length.
After finding the correct suffix length, we delete the remaining numbers that repeat at positions pos-l. We then return the slice of array with suffix removed.
def removeRepeatingSuffixes(arr):
    if not arr:
        return []
    n = len(arr)
    # map each value to the positions where it occurs, last position first
    occurrences = {}
    for i in range(n - 1, -1, -1):
        c = arr[i]
        if c not in occurrences:
            occurrences[c] = []
        occurrences[c].append(i)
    # treat edge case: no repeating suffix
    if len(occurrences[arr[n-1]]) == 1:
        return arr
    # create a set of possible suffix lengths,
    # based on the differences between the positions of each number.
    possible_suffixes_lengths_set = set()
    for c, olist in occurrences.items():
        if len(olist) >= 2:
            for i in range(len(olist)-1):
                delta = olist[i] - olist[len(olist)-1]
                possible_suffixes_lengths_set.add(delta)
    suff_lengths = sorted(possible_suffixes_lengths_set, reverse=True)
    for l in suff_lengths:
        if arr[n - 1] == arr[n - 1 - l]:
            # possible suffix length, check if last l characters repeat
            ok_length = True
            for j in range(n-2, n-1-l, -1):
                # j - l < 0 would wrap around to the end of the list
                if j - l < 0 or arr[j] != arr[j-l]:
                    ok_length = False
                    break
            if ok_length:
                last_i = n-1-l
                while last_i - l >= 0 and arr[last_i] == arr[last_i - l]:
                    last_i -= 1
                # return non-repeating slice, from 0 to last_i
                return arr[0:last_i + 1]
    # no candidate length matched: nothing to remove
    return arr
A quick way to remove repeats, or dedupe, is to change the type to a set() instead of a list.
I had an interesting interview question the other day that sort of stumped me. I couldn't find a really good answer for it. The problem stated:
Suppose you are given a number B and an array A of length n. The number B is a natural number, and all numbers in array A are distinct, natural numbers. Design an algorithm that would find the shortest sequence of numbers in array A that would sum up to the number B. Duplicates can be used.
So, as an example, let us say I have a number B = 19, and A = [9, 6, 3, 1]. I could say a solution is 6+6+6+1, or 3+3+3+3+3+3+1, but the solution they are looking for is 9+9+1, because that is the shortest sequence of numbers.
The algorithm that I designed would sort the array and reach into the largest number and subtract it from the original number. It would keep doing this until it could no longer subtract the largest number. It would then go through the array and see if it could keep finding any numbers that it could subtract from B. It actually looked a lot like this:
def domath(b, a):
    a.sort()
    x = []
    n = 0
    idx = -1
    while b != 0:
        n = a[idx]
        if(b >= n):
            b -= n
            x.append(n)
        else:
            idx -= 1
    return x
But this solution would not always work. It would only work if you were lucky enough to have, say, a 2 or a 1 in the array, or the numbers that you kept subtracting from b magically worked. Consider if B=21 and A=[7,8,9]. If it kept subtracting 9, it would not be able to find a solution.
So I was thinking "Okay, then maybe I need to backtrack a bit.".
If I reached into the x array, which keeps track of all the numbers we kept subtracting, I could add the latest number we subtracted back to b, then try to move the idx to the next largest number. So, instead of doing 21 - 9, then 12 - 9, it would do 21 - 9, then 12 - 8. It still wouldn't find anything, so then it would try 21 - 9, then 12 - 7. It still wouldn't find anything, so it would try 21 - 8, then 13 - 8, and it wouldn't find anything, so it would do 21 - 8, then 13 - 7, and it still wouldn't find anything, so it would try 21 - 7, and continue on that, and determine if it could do it. If it can't (in this case, it should), it would just return "False" or something.
Is that... a good solution? I feel like there must be a better one, because the interviewers were kind of iffy about this solution.
Tricky. The linked Wikipedia page suggests an approach that will take, I think, O(B * length(A)), which would take quite long if we had B = 1,000,000,000,000 instead of B = 21, with A = [9, 8, 7]. Your backtracking algorithm would handle this reasonably quickly if you start with a division:
111,111,111,111 nines leaves one, no way.
111,111,111,110 nines leaves ten, no way (trying 1 or 0 eights)
111,111,111,109 nines leaves 19, no way (trying 2, 1 or 0 eights)
111,111,111,108 nines leaves 28 = 4x7 (trying 3 .. 0 eights). Best so far.
111,111,111,107 nines leaves 37. 4x8 < 37, no solution can beat what we have.
In your example, B = 21, backtracking would also work quite well. If we just denote the numbers of nines, eights, and sevens, then you would just try the following: 2,0,0; 1,1,0; 1,0,1; 0,2,0; 0,1,1; 0,0,3.
You'd want to stop search branches when you have a solution and can prove that no further solution can be better. That's what I did: When you have 37 left and the highest number available is 8 then you need at least 5 numbers. And for every nine that you remove that number is going up at least by one, so the best solution so far cannot be beaten.
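Here is a rough Python sketch of that backtracking with the division step and the pruning bound. The function name `shortest_sum` and the choice to return counts per value (rather than the full sequence) are my own assumptions, made so that a very large B does not need a very large list.

```python
from typing import Dict, List, Optional


def shortest_sum(target: int, values: List[int]) -> Optional[Dict[int, int]]:
    """Return {value: copies} for a shortest sum reaching `target`, or None if impossible."""
    coins = sorted({v for v in values if v > 0}, reverse=True)  # largest value first
    best_total: Optional[int] = None
    best_counts: List[int] = []

    def search(idx: int, remaining: int, used: int, counts: List[int]) -> None:
        nonlocal best_total, best_counts
        if remaining == 0:
            if best_total is None or used < best_total:
                best_total = used
                best_counts = counts + [0] * (len(coins) - len(counts))
            return
        if idx == len(coins):
            return  # nothing left to try on this branch
        coin = coins[idx]
        # Division step: start with as many copies as fit, then back off one at a time.
        for copies in range(remaining // coin, -1, -1):
            rest = remaining - copies * coin
            if rest > 0:
                if idx + 1 == len(coins):
                    break  # no smaller value left to cover the rest
                if best_total is not None:
                    # At least ceil(rest / next largest value) more numbers are needed.
                    needed = -(-rest // coins[idx + 1])
                    if used + copies + needed >= best_total:
                        break  # using fewer copies only makes this bound worse
            search(idx + 1, rest, used + copies, counts + [copies])

    search(0, target, 0, [])
    if best_total is None:
        return None
    return {coin: c for coin, c in zip(coins, best_counts) if c > 0}
```

For example, `shortest_sum(19, [9, 6, 3, 1])` gives `{9: 2, 1: 1}` and `shortest_sum(21, [7, 8, 9])` gives `{7: 3}`. When no solution exists the search is still exhaustive, so this is not a worst-case guarantee.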
Consider a list [1,1,1,...,1,0,0,...,0] (an arbitrary list of zeros and ones). We want all possible permutations of this array; there will be binomial(l,k) permutations (l stands for the length of the list and k for the number of ones in the list).
Right now, I have tested three different algorithms to generate all possible permutations: one that uses a recursive function, one that calculates the permutations by walking the number interval from [1,...,1,0,0,...,0] to [0,0,...0,1,1,...,1] (since this can be seen as a binary number interval), and one that calculates the permutations using lexicographic order.
So far, the first two approaches fail in performance when the permutations are approx. 32. The lexicographic technique still works pretty nicely (only a few milliseconds to finish).
My question is, specifically for Julia, which is the best way to calculate permutations as I described earlier? I don't know too much about combinatorics, but I think a decent benchmark would be to generate all permutations from the total binomial(l,l/2).
As you have mentioned yourself in the comments, the case where l >> k is definitely desired. When this is the case, we can substantially improve performance by not handling vectors of length l until we really need them, and instead handle a list of indexes of the ones.
In the RAM-model, the following algorithm will let you iterate over all the combinations in space O(k^2), and time O(k^2 * binom(l,k))
Note however, that every time you generate a bit-vector from an index combination, you incur an overhead of O(l), in which you will also have the lower-bound (for all combinations) of Omega(l*binom(l,k)), and the memory usage grows to Omega(l+k^2).
The algorithm
"""
Produces all `k`-combinations of integers in `1:l` with prefix `current`, in a
lexicographical order.
# Arguments
- `current`: The current combination
- `l`: The parent set size
- `k`: The target combination size
"""
function combination_producer(l, k, current)
    if k == length(current)
        produce(current)
    else
        j = (length(current) > 0) ? (last(current)+1) : 1
        for i=j:l
            # `[current; i]` concatenates; `[current, i]` nests on newer Julia versions
            combination_producer(l, k, [current; i])
        end
    end
end
"""
Produces all combinations of size `k` from `1:l` in a lexicographical order
"""
function combination_producer(l,k)
    combination_producer(l,k, [])
end
Example
You can then iterate over all the combinations as follows:
for c in @task(combination_producer(l, k))
    # do something with c
end
Notice how this algorithm is resumable: You can stop the iteration whenever you want, and continue again:
iter = @task(combination_producer(5, 3))
for c in iter
    println(c)
    if c[1] == 2
        break
    end
end
println("took a short break")
for c in iter
    println(c)
end
This produces the following output:
[1,2,3]
[1,2,4]
[1,2,5]
[1,3,4]
[1,3,5]
[1,4,5]
[2,3,4]
took a short break
[2,3,5]
[2,4,5]
[3,4,5]
If you want to get a bit-vector out of c then you can do e.g.
function combination_to_bitvector(l, c)
    result = zeros(l)
    result[c] = 1
    result
end
where l is the desired length of the bit-vector.
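As a language-neutral illustration of the same idea (Python here, not Julia), a generator gives the resumable, lexicographic iteration over index combinations. The names mirror the Julia code above but are otherwise my own.

```python
from typing import Iterator, List, Tuple


def index_combinations(l: int, k: int, current: Tuple[int, ...] = ()) -> Iterator[Tuple[int, ...]]:
    """Yield all k-combinations of 1..l that extend `current`, in lexicographic order."""
    if len(current) == k:
        yield current
        return
    start = current[-1] + 1 if current else 1
    for i in range(start, l + 1):
        yield from index_combinations(l, k, current + (i,))


def combination_to_bitvector(l: int, combo: Tuple[int, ...]) -> List[int]:
    """Expand 1-based positions of the ones into a 0/1 list of length l."""
    bits = [0] * l
    for pos in combo:
        bits[pos - 1] = 1
    return bits
```

Only the k indexes are carried around during iteration; the length-l vector is built on demand, which is where the O(l) overhead per combination comes from.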
I am trying to loop the numbers 1 to 1000 in such a way that I have all possible pairs, e.g., 1 and 1, 1 and 2, 1 and 3, ..., but also 2 and 1, 2 and 2, 2 and 3, et cetera, and so on.
In this case I have a condition (amicable_pair?) that returns true if two numbers are an amicable pair. I want to check all numbers from 1 to n against each other and add all amicable pairs to a total. The first value will be added to the total if it is part of an amicable pair (not the second value of the pair, since we'll find that later in the loop). To do this I wrote the following "Java-like" code:
def add_amicable_pairs(n)
  amicable_values = []
  for i in 1..n
    for j in 1..n
      if (amicable_pair?(i,j))
        amicable_values.push(i)
        puts "added #{i} from amicable pair #{i}, #{j}"
      end
    end
  end
  return amicable_values.inject(:+)
end
Two issues with this: (1) it is really slow. (2) In Ruby you should not use for-loops.
This is why I am wondering how this can be accomplished in a faster and more Ruby-like way. Any help would be greatly appreciated.
Your code has O(n^2) runtime, so if n gets moderately large then it will naturally be slow. Brute-force algorithms are always slow if the search space is large. To avoid this, is there some way you can directly find the "amicable pairs" rather than looping through all possible combinations and checking one by one?
As far as how to write the loops in a more elegant way, I would probably rewrite your code as:
(1..n).to_a.product((1..n).to_a).select { |a,b| amicable_pair?(a,b) }.map(&:first).reduce(0, :+)
(1..1000).to_a.repeated_permutation(2).select{|pair| amicable_pair?(*pair)}
.map(&:first).inject(:+)
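One way to find the pairs directly, sketched in Python: compute the sum of proper divisors d(m) for every m up to n with a sieve, then keep each a whose only possible partner b = d(a) points back to it. This assumes amicable_pair? uses the standard definition (a != b, d(a) == b, d(b) == a) and that both numbers must lie in 1..n, as in the nested loops above.

```python
def sum_amicable(n: int) -> int:
    """Sum every a in 1..n that has an amicable partner b in 1..n (a != b)."""
    # divisor_sum[m] = sum of the proper divisors of m, built with a sieve.
    divisor_sum = [0] * (n + 1)
    for d in range(1, n // 2 + 1):
        for multiple in range(2 * d, n + 1, d):
            divisor_sum[multiple] += d
    total = 0
    for a in range(2, n + 1):
        b = divisor_sum[a]
        # The only possible partner of a is b = d(a); check the reverse direction.
        if b != a and b <= n and divisor_sum[b] == a:
            total += a
    return total
```

The sieve does roughly n log n additions instead of the n^2 pair checks, and `sum_amicable(1000)` is 504 (the pair 220 and 284).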
I need to generate a list of numbers (about 120). The numbers range from 1 to X (max 10), both included. The algorithm should use every number an equal number of times, or at least try to; if some numbers are used once less, that's OK.
This is the first time I have had to make this kind of algorithm. I've created very simple ones, but I'm stumped on how to do this. I tried googling first, but I don't really know what to call this kind of algorithm, so I couldn't find anything.
Thanks a lot!
It sounds like what you want to do is first fill a list with the numbers you want and then shuffle that list. One way to do this would be to add each of your numbers to the list and then repeat that process until the list has as many items as you want. After that, randomly shuffle the list.
In pseudo-code, generating the initial list might look something like this:
list = []
while length(list) < N
    for i in 1, 2, ..., X
        if length(list) >= N
            break
        end if
        list.append(i)
    end for
end while
I leave the shuffling part as an exercise to the reader.
EDIT:
As pointed out in the comments the above will always put more smaller numbers than larger numbers. If this isn't what's desired, you could iterate over the possible numbers in a random order. For example:
list = []
numbers = shuffle( [1, 2, ..., X] )
while length(list) < N
    for i in 1, 2, ..., X
        if length(list) >= N
            break
        end if
        list.append( numbers[i] )
    end for
end while
I think this should remove that bias.
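Put together with the shuffle, the approach could look like this in Python. It is a sketch only: the name `balanced_random_list` and the optional `rng` parameter are assumptions for illustration.

```python
import random
from typing import List, Optional


def balanced_random_list(n: int, x: int, rng: Optional[random.Random] = None) -> List[int]:
    """Return n numbers from 1..x whose counts differ by at most one, in random order."""
    if x < 1 or n < 0:
        raise ValueError("need x >= 1 and n >= 0")
    if rng is None:
        rng = random.Random()
    full_rounds, leftover = divmod(n, x)
    numbers = list(range(1, x + 1))
    result = numbers * full_rounds            # every number appears full_rounds times
    result += rng.sample(numbers, leftover)   # a random subset gets one extra copy
    rng.shuffle(result)                       # randomise the final order
    return result
```

For N = 120 and X = 7, two numbers are used 18 times and the other five 17 times, and which two get the extra copy is random.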
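What you want is a uniformly distributed random number (wiki). It means that every number from 1 to 10 is equally likely on each draw, so over many draws each number appears about equally often. It does not guarantee equal counts, though: if you generate only 10 numbers between 1 and 10, it is very unlikely (about 0.04%) that all the numbers 1 up to 10 are present in the list.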
The Random() class in java gives a fairly uniform distribution. So just go for it. To test, just check this:
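import java.util.Random;
Random rand = new Random();
for (int i = 0; i < 10; i++) {
    int rNum = rand.nextInt(10) + 1; // nextInt(10) returns 0..9; add 1 for 1..10
    System.out.println(rNum);
}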
And see in the result which of the numbers between 1 and 10 you get; raise the loop count to watch the counts even out.
One more similar discussion that might help: Uniform distribution with Random class