I am learning about sets, Cartesian products, cardinality, and so on. I am having trouble understanding some notation on my assignment; there was nothing in the notes about it.
The question is: what are the first 6 elements of the set {n | n >= 5 and n is prime}, and also
what are the 5 smallest elements of the set {n | ∃i > 0 such that n = 2i}?
I have no clue what this n | notation means, or what the '∃' means.
Any help is greatly appreciated!
The | symbol separates the variable representing the elements of the set from its constraints.
The question asks for the first 6 elements of the set whose elements are described by the variable 'n', where the constraints on 'n' are that it is greater than or equal to 5 and that it is prime.
So this results in the following set: {5, 7, 11, 13, 17, 19, ...}.
The answer is the first 6 elements of this set: 5, 7, 11, 13, 17, 19. The ∃ symbol in the second set reads "there exists": n belongs to that set if some i > 0 satisfying the stated equation exists.
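If it helps to see the notation as code, here is a minimal Python sketch that reads {n | condition} as "all integers n for which the condition holds" and lists the smallest ones. The helper names (is_prime, first_elements) are made up for this illustration, and it assumes the set ranges over non-negative integers.

```python
from itertools import count, islice


def is_prime(n):
    """Trial-division primality test; fine for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_elements(predicate, how_many, start=0):
    """Return the smallest `how_many` integers n >= start with predicate(n) true."""
    matching = (n for n in count(start) if predicate(n))
    return list(islice(matching, how_many))


# {n | n >= 5 and n is prime}: the part after | is the condition on n.
first_six = first_elements(lambda n: n >= 5 and is_prime(n), 6)
```

The lambda plays the role of everything to the right of the | symbol.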
I have an array of size n and I can apply any number of operations (zero included) on it. In one operation, I can take any two elements and replace them with their absolute difference. We have to find the minimum possible element that can be generated using this operation (n < 1000).
Here's an example of how operation works. Let the array be [1,3,4]. Applying operation on 1,3 gives [2,4] as the new array.
Ex: 2 6 11 3 => ans = 0
This is because 11-6 = 5 and 5-3 = 2 and 2-2 = 0
Ex: 20 6 4 => ans = 2
Ex: 2 6 10 14 => ans = 0
Ex: 2 6 10 => ans = 2
Can anyone tell me how I can approach this problem?
Edit:
We can use recursion to generate all possible cases and pick the minimum element from them. This would have complexity of O(n^2 !).
Another approach I tried is sorting the array and then making a recursive call where, starting from either index 0 or 1, I apply the operation on all consecutive pairs of elements. This continues until there is only one element left in the array, and we can return the minimum seen at any point in the recursion. This has a complexity of O(n^2) but doesn't necessarily give the right answer.
Ex: 2 6 10 15 => (4 5) & (2 4 15) => (1) & (2 15) & (2 11) => (13) & (9). The minimum of this will be 1 which is the answer.
When you choose two elements for the operation, you subtract the smaller one from the bigger one. So if you choose 1 and 7, the result is 7 - 1 = 6.
Now having 2 6 and 8 you can do:
8 - 2 -> 6 and then 6 - 6 = 0
You may also write it like this: 8 - 2 - 6 = 0
Let's consider a different operation: you can take two elements and replace them by their sum or their difference.
Even though you can obtain completely different values using the new operation, the absolute value of the element closest to 0 will be exactly the same as with the old one.
First, let's try to solve this problem using the new operation, then we'll make sure that the answer is indeed the same as with the old one.
What you are trying to do is choose two non-intersecting subsets of the initial array and subtract the sum of the elements of the second set from the sum of the elements of the first. You want the two subsets whose result is as close to 0 as possible. That is an NP-hard problem in general, but it can be solved efficiently with a pseudopolynomial algorithm similar to the one for the knapsack problem, in O(n * sum of all elements).
Each element of the initial array can belong to the positive set (the set whose sum you subtract from), to the negative set (the set whose sum you subtract), or to neither. In other words, each element can be added to the result, subtracted from it, or left untouched. Let's say we have already calculated all values obtainable using elements from the first to the i-th. Now we consider the (i+1)-th element. We can take any of the obtainable values and increase or decrease it by the value of the (i+1)-th element. After doing that for all the elements, we get all values obtainable from the array. Then we choose the one closest to 0.
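The procedure just described can be sketched roughly as follows in Python. This is only an illustration: the function name min_obtainable is made up, and it assumes at least one element must actually be used, since leaving every element untouched would trivially give 0. It keeps a set of signed sums, which never holds more than 2 * (sum of all elements) + 1 values.

```python
def min_obtainable(values):
    """Smallest |signed sum| over selections that use at least one element.

    Each element is added, subtracted, or left untouched.
    Assumes `values` is a non-empty list of non-negative integers.
    """
    reachable = set()  # signed sums that use at least one element so far
    for v in values:
        new_sums = {v, -v}  # start a new selection with just this element
        for s in reachable:
            new_sums.add(s + v)  # put v in the positive set
            new_sums.add(s - v)  # put v in the negative set
        reachable |= new_sums    # old sums stay: v left untouched
    return min(abs(s) for s in reachable)
```

On the examples from the question this gives 0 for 2 6 11 3, 2 for 20 6 4, and 1 for 2 6 10 15.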
Now the harder part, why is it always a correct answer?
Let's consider positive and negative sets from which we obtain minimal result. We want to achieve it using initial operations. Let's say that there are more elements in the negative set than in the positive set (otherwise swap them).
What if we have only one element in the positive set and only one element in the negative set? Then absolute value of their difference is equal to the value obtained by using our operation on it.
What if we have one element in the positive set and two in the negative one?
1) One of the negative elements is smaller than the positive element. Then we take those two and use the operation on them. The result is a new element in the positive set, which reduces this to the previous case.
2) Both negative elements are larger than the positive one. Then removing the bigger element from the negative set gives a result closer to 0, so this case cannot happen.
Let's say we have n elements in the positive set and m elements in the negative set (n <= m), and we are able to obtain the absolute value of the difference of their sums (let's call it x) by using some operations. Now let's add an element to the negative set. If the difference before adding the new element was negative, decreasing it by any other number makes it smaller, that is, farther from 0, so that is impossible. So the difference must have been positive. Then we can use our operation on x and the new element to get the result.
Now the second case: let's say we have n elements in the positive set and m elements in the negative set (n < m), and we are able to obtain the absolute value of the difference of their sums (again, let's call it x) by using some operations. Now we add a new element to the positive set. Similarly, the difference must have been negative, so x is in the negative set. Then we obtain the result by doing the operation on x and the new element.
Using induction we can prove that the answer is always correct.
In looking through the dynamic programming algorithm for computing the minimum edit distance between two strings I am having a hard time grasping one thing. To me it seems like given the two strings s and t inserting a character into s would be the same as deleting a character from t. Why then do we need to consider these operations separately when computing the edit distance? I always have a hard time computing the indices in the recurrence relation because I can't intuitively understand this part.
I've read through Skiena and some other sources, but none of them explain this part well. This SO link explains the insert and delete operations better than elsewhere, in terms of understanding which string is being inserted into or deleted from, but I still can't figure out why they aren't one and the same.
Edit: Ok, I didn't do a very good job of detailing the source of my confusion.
The way Skiena explains computing the minimum edit distance m(i,j) of the first i characters of a string s and the first j characters of a string t based on already having computed solutions to the subproblems is as follows. m(i,j) will be the minimum of the following 3 possibilities:
opt[MATCH] = m[i-1][j-1].cost + match(s[i],t[j]);
opt[INSERT] = m[i][j-1].cost + indel(t[j]);
opt[DELETE] = m[i-1][j].cost + indel(s[i]);
The way I understand it the 3 operations are all operations on the string s. An INSERT means you have to insert a character at the end of string s to get the minimum edit distance. A DELETE means you have to delete the character at the end of string s to get the minimum edit distance.
Given s = "SU" and t = "SATU" INSERT and DELETE would be as follows:
Insert:
SU_
SATU
Delete:
SU
SATU_
My confusion was that an INSERT into s is the same as a DELETION from t. I'm probably confused on something basic but it's not intuitive to me yet.
Edit 2: I think this link kind of clarifies my confusion but I'd love an explanation given my specific questions above.
They aren't the same thing any more than < and > are the same thing. There is of course a sort of duality and you are correct to point it out. a < b if and only if b > a so if you have a good algorithm to test for b > a then it makes sense to use it when you need to test if a < b.
It is much easier to directly test if s can be obtained from t by deletion rather than to directly test if t can be obtained from s by insertion. It would be silly to randomly insert letters to s and see if you get t. I can't imagine that any implementation of edit-distance actually does that. Still, it doesn't mean that you can't distinguish between insertion and deletion.
More abstractly, there is a relation R on any set of strings defined by
s R t <=> t can be obtained from s by insertion
Deletion is the inverse relation: closely related, but not the same.
The problem of edit distance can be restated as a problem of converting the source string into target string with minimum number of operations (including insertion, deletion and replacement of a single character).
Thus, in the process of converting a source string into a target string, if inserting a character from target string or deleting a character from the source string or replacing a character in the source string with a character from the target string yields the same (minimum) edit distance, then, well, all the operations can be said to be equivalent. In other words, it does not matter how you arrive at the target string as long as you have done minimum number of edits.
This is realized by looking at how the cost matrix is calculated. Consider a simpler problem where source = AT (represented vertically) and target = TA (represented horizontally). The matrix is then constructed as (coming from west, northwest, north in that order):
  | ε | T                 | A                 |
ε | 0 | 1                 | 2                 |
A | 1 | min(2, 1, 2) = 1  | min(2, 1, 3) = 1  |
T | 2 | min(3, 1, 2) = 1  | min(2, 2, 2) = 2  |
The idea of filling this matrix is:
If we moved east, we insert the current target string character.
If we moved south, we delete the current source string character.
If we moved southeast, we replace the current source character with current target character.
If all or any two of these impart the same cost in terms of editing, then they can be said to be equivalent and you can break the ties arbitrarily.
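Here is a small Python sketch of filling that matrix, assuming unit cost for insertion, deletion, and replacement, and zero cost for a match. The function names are just for illustration. Insertion and deletion read different neighbouring cells (west versus north), which is why the recurrence keeps them separate even though swapping source and target turns one into the other.

```python
def edit_distance_table(source, target):
    """Return the cost matrix c, where c[i][j] is the edit distance
    between the first i characters of source and the first j of target."""
    rows, cols = len(source) + 1, len(target) + 1
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        c[i][0] = i  # delete all i source characters
    for j in range(cols):
        c[0][j] = j  # insert all j target characters
    for i in range(1, rows):
        for j in range(1, cols):
            mismatch = 1 if source[i - 1] != target[j - 1] else 0
            insert = c[i][j - 1] + 1              # arrive from the west
            replace = c[i - 1][j - 1] + mismatch  # arrive from the northwest
            delete = c[i - 1][j] + 1              # arrive from the north
            c[i][j] = min(insert, replace, delete)
    return c


def edit_distance(source, target):
    return edit_distance_table(source, target)[-1][-1]
```

For source "AT" and target "TA" the table comes out as [[0, 1, 2], [1, 1, 1], [2, 1, 2]]. Calling edit_distance(t, s) instead of edit_distance(s, t) gives the same number, with the roles of insert and delete exchanged.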
This first shows up when we compute c(2, 2) in the cost matrix. (The entries c(0, 0) through c(0, 2), the minimum costs of converting an empty string to "", "T", and "TA" respectively, and c(0, 0) through c(2, 0), the costs of converting "", "A", and "AT" respectively to the empty string, are clear.)
The value of c(2, 2) can be realized by:
inserting the current character in target, 'A' (we move east from c(2, 1)) -- cost is 1 + 1 = 2, or
replacing the current character 'T' in source by the current character in target 'A' (we move southeast from c(1, 1)) -- cost is 1 + 1 = 2, or
deleting the current character in source, 'T' (we move south from c(1, 2)) -- cost is 1 + 1 = 2
Since all values are the same, which one are you going to choose?
If you choose to move from west, your alignment could be:
A T -
- T A
(one deletion, one 0-cost replacement, one insertion)
If you choose to move from north, your alignment could be:
- A T
T A -
(one insertion, one 0-cost replacement, one deletion)
If you choose to move from northwest, your alignment could be:
A T
T A
(Two 1-cost replacements).
All these alignments are equivalent in terms of the edit distance (under the given cost function).
Edit distance is only interested in the minimum number of operations required to transform one sequence into another; it is not interested in the uniqueness of the transformation. In practice, there are often multiple ways to transform one string into another, that all have the minimum number of operations.
This is my approach for the problem http://www.spoj.com/problems/ABSP1/ - please check whether there is any corner case on which my code fails, because on my own test cases it gives the correct answer.
Problem Statement:
You are given an array of N numbers in non-decreasing order. You have
to answer the summation of the absolute difference of all distinct
pairs in the given array.
scanf("%d",&TotalElements);
for(i=0;i<TotalElements;i++)
    scanf("%d",&Array[i]);
FirstSum=TotalSum=0;
/* FirstSum: sum of |Array[i]-Array[0]| over all i */
for(i=0;i<TotalElements;i++)
    FirstSum+=abs(Array[i]-Array[0]);
TotalSum=FirstSum;
SumTillNow=Array[0];   /* sum of Array[0..i-1] */
for(i=1;i<TotalElements;i++){
    Difference=Array[i]-Array[0];
    /* NextSum: sum of (Array[m]-Array[i]) over m > i */
    NextSum=FirstSum-Difference*(TotalElements-i)-SumTillNow+(i)*Array[0];
    TotalSum+=NextSum;
    SumTillNow+=Array[i];
}
printf("%lld\n",TotalSum);
As far as I can tell, your logic is fine.
I think the Wrong Answer may be related to the types of the variables that you have used.
Let's look at this statement in your code closely.
NextSum=FirstSum-Difference*(TotalElements-i)-SumTillNow+(i)*Array[0];
Here FirstSum = summation( A[k] - A[0] ) for all k > 0
              = summation( A[k] ) - N*A[0]
and Difference = A[i] - A[0].
Hence the statement becomes:
NextSum = summation( A[k] ) - N*A[0] - (A[i] - A[0])*(N-i) - summation( A[j] ){j<i} + i*A[0]
        = summation( A[m] ){m >= i} - A[i]*(N-i)
This sum takes into account all the absolute differences between A[i] and A[m] where m > i. This should give you the correct answer.
Also, there is a simpler way to carry out the summation. I include it for completeness.
If you look at the number of times each A[i] appears in the sum of absolute differences:
"-A[0]" appears N-1 times.
"-A[1]" appears N-2 times and "A[1]" appears 1 time, so the net effect is (1 - (N-2))*A[1].
Similarly, the term for A[i] is (i - (N-i-1))*A[i] = (2i + 1 - N)*A[i].
You can calculate the series accordingly.
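As a rough sketch of that series in Python (the function names are made up here, and the input is assumed to be sorted in non-decreasing order as in the problem statement), with a brute-force version to compare against. Python integers do not overflow, so this sidesteps the 64-bit concern; in C the accumulator would need to be a long long.

```python
from itertools import combinations


def sum_abs_differences(sorted_values):
    """Sum of |a - b| over all unordered pairs, using the per-element
    coefficient (2i + 1 - N). Requires non-decreasing input."""
    n = len(sorted_values)
    return sum((2 * i + 1 - n) * a for i, a in enumerate(sorted_values))


def sum_abs_differences_bruteforce(values):
    """O(N^2) reference: add up |a - b| for every pair of positions."""
    return sum(abs(a - b) for a, b in combinations(values, 2))
```

For the array 3 5 6 both give 2 + 3 + 1 = 6.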
The problem setter doesn't seem to have enough experience. Here's why:
1) He says distinct pairs, but that doesn't seem to be what is meant. Maybe he meant UNORDERED pairs.
2) The test data doesn't conform to the constraints; asserts on the input data verify this. Use 64-bit signed integers for input.
Make changes to your program keeping these two points in mind (especially point 2) and it should get accepted.
One thing comes to mind. SPOJ says:
Do you know what distinct pair means? Suppose you have an array of 3 elements: 3 5 6
Then all the distinct pairs are:
3 5
3 6
5 6
If you instead had the array of 3 elements: 3 3 4 then how many distinct pairs do you have? Similarly for: 3 3 3. SPOJ doesn't exactly clarify what a distinct pair is, but I assume two pairs a1 a2 and b1 b2 are distinct only if a1 <> b1 or a2 <> b2. If this is the case then you will need to filter out all duplicates in the array.
I was solving this problem: http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=286&page=show_problem&problem=3268
and I am stuck and can't find any hints.
The question:
You will be given an integer n ( n<=10^9 ) now you have to tell how many
distinct sets of integers are there such that each number from 1 to n can
be generated uniquely from a set. Also sum of set should be n. eg for n=5 , one such set is:
{1,2,2} as
1 can be generated only by { 1 }
2 by { 2 }
3 by {1,2} ( note the two 2's are indistinguishable)
4 by {2,2}
5 by {1,2,2}
for generating a number each number of a set can be used only once. ie for above set
we can't do {1,1} to generate 2 as only one 1 is there.
Also the set {1,2,2} is equivalent to {2,1,2} ie sets are unordered.
My approach:
The conclusion I came to was this. Let F(S,k) denote the number of desired sets of sum S whose
largest element is k. Then to construct a valid set we can take two paths from this
state: either to F(S+k,k) or to F(2*S+1,S+1). I keep a count of how many times I come
to a state where S=n (the desired sum) and do not go further if S becomes > n. This is
clearly brute force, which I just wrote to see if my logic was correct (it is).
But this will exceed the time limit. How do I improve my approach? I have a
feeling it is done by dp/memoization.
This is a known integer sequence.
Spoilers: http://oeis.org/A002033
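A hedged sketch of how that sequence could be computed in Python. It assumes the usual characterization that the number of such sets for n equals the number of ordered factorizations of n + 1, so treat it as an illustration rather than a verified solution to the judge problem. All function names are made up, and count_sets_bruteforce is just the F(S,k) recursion from the question, kept for cross-checking small n.

```python
from functools import lru_cache


def divisors(m):
    """All positive divisors of m in increasing order (trial division)."""
    small, large = [], []
    d = 1
    while d * d <= m:
        if m % d == 0:
            small.append(d)
            if d != m // d:
                large.append(m // d)
        d += 1
    return small + large[::-1]


@lru_cache(maxsize=None)
def ordered_factorizations(m):
    """Ways to write m as an ordered product of factors > 1 (1 for m == 1)."""
    if m == 1:
        return 1
    return sum(ordered_factorizations(d) for d in divisors(m) if d < m)


def count_sets(n):
    # Assumption: the count for n is the ordered factorization count of n + 1.
    return ordered_factorizations(n + 1)


def count_sets_bruteforce(n):
    """The recursion from the question: from (S, k) go to (S+k, k) or (2S+1, S+1)."""
    def walk(s, k):
        if s > n:
            return 0
        if s == n:
            return 1
        return walk(s + k, k) + walk(2 * s + 1, s + 1)
    return walk(1, 1)
```

For n = 5 both functions give 3, matching the sets {1,1,1,1,1}, {1,2,2} and {1,1,3}. The memoized version only ever recurses on divisors of n + 1, so the number of distinct states stays small.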
Say S = 5 and N = 3; the solutions would look like <0,0,5> <0,1,4> <0,2,3> <0,3,2> <5,0,0> <2,3,0> <3,2,0> <1,2,2>, etc.
In the general case, N nested loops can be used to solve the problem: run N nested loops and, inside them, check whether the loop variables add up to S.
If we do not know N ahead of time, we can use a recursive solution. At each level, run a loop from 0 to S and then call the function itself again. When we reach a depth of N, check whether the numbers obtained add up to S.
Any other dynamic programming solution?
Try this recursive function:
f(s, n) = 1                                        if s = 0
        = 0                                        if s != 0 and n = 0
        = sum of f(s - i, n - 1) over i in [0, s]  otherwise
To use dynamic programming you can cache the value of f after evaluating it, and check if the value already exists in the cache before evaluating it.
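A minimal Python sketch of that recursion with caching, using functools.lru_cache as the cache. This is one possible way to memoize it; an explicit table would work just as well.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(s, n):
    """Number of ways to write s as an ordered sum of n non-negative integers."""
    if s == 0:
        return 1  # only the all-zero tuple
    if n == 0:
        return 0  # something is left over but no slots remain
    # i is the value placed in the current slot
    return sum(f(s - i, n - 1) for i in range(s + 1))
```

With the cache there are at most (s + 1) * (n + 1) distinct states, each costing O(s) work. For the example in the question, f(5, 3) evaluates to 21.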
There is a closed form formula : binomial(s + n - 1, s) or binomial(s+n-1,n-1)
Those numbers are the simplex numbers.
If you want to compute them, use the log gamma function or arbitrary precision arithmetic.
See https://math.stackexchange.com/questions/2455/geometric-proof-of-the-formula-for-simplex-numbers
I have my own formula for this. My friend Gio and I wrote an investigative report on it. The formula we got is [2 raised to (n-1) - 1], where n is the number whose count of addend combinations we are looking for.
Let's try.
If n is 1, the count is 0: there are no two or more numbers that we can add to get a sum of 1 (excluding 0). Let's try a higher number.
Let's try 4. 4 has addends: 1+1+1+1, 1+2+1, 1+1+2, 2+1+1, 1+3, 2+2, 3+1. The total is 7.
Let's check with the formula. 2 raised to (4-1) - 1 = 2 raised to (3) - 1 = 8 - 1 = 7.
Let's try 15. 2 raised to (15-1) - 1 = 2 raised to (14) - 1 = 16384 - 1 = 16383. Therefore, there are 16383 ways to add numbers that add up to 15.
(Note: Addends are positive numbers only.)
(You can try other numbers, to check whether our formula is correct or not.)
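One way to try other numbers is a small brute-force count in Python. The sketch below (count_compositions is a made-up name) counts ordered sums of at least two positive addends, so 1+3 and 3+1 are counted separately as in the list for 4 above. It is exponential, so keep n small.

```python
def count_compositions(n, min_parts=2):
    """Count ordered ways to write n as a sum of at least `min_parts`
    positive integers, by trying every possible first addend."""
    def walk(remaining, parts_used):
        if remaining == 0:
            return 1 if parts_used >= min_parts else 0
        return sum(walk(remaining - first, parts_used + 1)
                   for first in range(1, remaining + 1))
    return walk(n, 0)
```

For n = 4 this returns 7, which agrees with 2 raised to (4-1) - 1.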
This can be calculated in O(s+n) (or O(1) if you don't mind an approximation) in the following way:
Imagine we have a string with n-1 X's in it and s o's. So for your example of s=5, n=3, one example string would be
oXooXoo
Notice that the X's divide the o's into three distinct groupings: one of length 1, length 2, and length 2. This corresponds to your solution of <1,2,2>. Every possible string gives us a different solution, by counting the number of o's in a row (a 0 is possible: for example, XoooooX would correspond to <0,5,0>). So by counting the number of possible strings of this form, we get the answer to your question.
There are s+(n-1) positions to choose for s o's, so the answer is Choose(s+n-1, s).
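To make the correspondence concrete, here is a short Python sketch that picks the positions of the n-1 X's and reads off the run lengths of o's between them. The name solutions is just for illustration, and it assumes n >= 1.

```python
from itertools import combinations
from math import comb


def solutions(s, n):
    """Yield every n-tuple of non-negative integers summing to s (n >= 1)."""
    length = s + n - 1  # total characters in the X/o string
    for bars in combinations(range(length), n - 1):
        # Sentinels before the first and after the last character
        edges = (-1,) + bars + (length,)
        # Number of o's strictly between consecutive X positions
        yield tuple(edges[k + 1] - edges[k] - 1 for k in range(n))
```

For s=5, n=3 the X positions (1, 4) correspond to oXooXoo and yield (1, 2, 2), and the generator produces comb(7, 5) = 21 tuples in total.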
There is a fixed formula for the answer. If you want to find the number of ways to get N as the sum of R elements, the answer is always:
(N+R-1)!/((R-1)!*(N)!)
or in other words:
(N+R-1) C (R-1)
This actually looks a lot like a Towers of Hanoi problem, without the constraint of stacking disks only on larger disks. You have S disks that can be in any combination on N towers. So that's what got me thinking about it.
What I suspect is that there is a formula we can deduce that doesn't require the recursive programming. I'll need a bit more time though.