N groups each have a list of numbers they can pick from. Determine if all groups will be able to choose a number if one group chooses number i - algorithm

Say that there are groups, each with a list of numbers it can pick from:
Group 1: [1, 2, 3, 4],
Group 2: [2, 3, 4],
Group 3: [2, 3, 4],
Group 4: [4]
Say that Group 4 initially chooses 4 from its list. Now, the other groups can't use 4. One possible arrangement is that group 1 chooses the 1, group 3 chooses the 2, and group 2 chooses the 3. However, if group 1 initially chooses 4, there is no valid arrangement, since the only number group 4 can use is 4. Since 4 was used by group 1, group 4 can't choose any number, and thus not every group can choose a number. Return true or false: given that group N chooses a number i that it can pick, can all the rest of the groups still each choose a (distinct) number? Is there an efficient O(n) or O(n^2) algorithm?
My solution attempt/thoughts:
Originally, I thought about converting this into a directed graph representation and determining whether there is a way for every node to be reached from another node. However, this would mean there are loops involved, and it is also polynomial.
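For what it's worth, here is a minimal Python sketch of one way to make the graph idea concrete: treat it as bipartite matching between groups and numbers, fix the forced choice, and check whether every remaining group can still be matched using an augmenting-path search. This is only an illustration (the function name and setup are mine), and it is polynomial rather than the O(n) or O(n^2) bound asked about.

def all_groups_can_choose(groups, forced_group, forced_number):
    # match maps a number to the group currently holding it
    match = {}

    def try_assign(g, seen):
        # Try to give group g some number, possibly reassigning other groups.
        for num in groups[g]:
            if num == forced_number or num in seen:
                continue
            seen.add(num)
            if num not in match or try_assign(match[num], seen):
                match[num] = g
                return True
        return False

    for g in range(len(groups)):
        if g == forced_group:
            continue
        if not try_assign(g, set()):
            return False
    return True

groups = [[1, 2, 3, 4], [2, 3, 4], [2, 3, 4], [4]]
print(all_groups_can_choose(groups, 3, 4))  # True:  group 4 takes 4
print(all_groups_can_choose(groups, 0, 4))  # False: group 1 takes 4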

Related

How do I derive an expression for the worst-case number of comparisons needed to merge two sorted arrays of length n/2

Merge sort uses the divide-and-conquer approach.
The worst-case number of comparisons needed to merge two sorted arrays of length n/2 is n-1.
This is because when two values have been compared with each other, one of those two values will never be used in a comparison again: it is the value that comes next in sorted order, and it is no longer part of the rest of the merge process.
In the worst case, the last comparison will be between the last value of the first array and the last value of the second array. All the other n-2 values were already excluded from further comparisons, so that means we already did n-2 comparisons. Now the last one executes, which brings the comparison count to n-1.
Example of a worst case input
n = 10
A = [0, 2, 4, 6, 8]
B = [1, 3, 5, 7, 9]
Comparisons during the merge:
0 1
2 1
2 3
4 3
4 5
6 5
6 7
8 7
8 9
Example of a best case input
Just to put this in perspective, the best case occurs when the last value of the first array is less than the first value of the second array (or vice versa). This only needs n/2 comparisons.
n = 10
A = [0, 1, 2, 3, 4]
B = [5, 6, 7, 8, 9]
Comparisons during the merge:
0 5
1 5
2 5
3 5
4 5
No more comparisons are needed now, because the first list has no more values; only the values of the second list remain, but they don't need to be compared with anything and can be appended to the result in the order they currently occur.
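As an illustration (not part of the original answer), here is a small Python sketch that merges two sorted lists while counting comparisons; it reproduces the n-1 worst case and the n/2 best case from the examples above.

def merge_count_comparisons(a, b):
    merged = []
    comparisons = 0
    i = j = 0
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One list is exhausted; the rest is appended without further comparisons.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged, comparisons

print(merge_count_comparisons([0, 2, 4, 6, 8], [1, 3, 5, 7, 9])[1])  # 9 (worst case, n - 1)
print(merge_count_comparisons([0, 1, 2, 3, 4], [5, 6, 7, 8, 9])[1])  # 5 (best case, n / 2)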
Remark
The fact that this process is used as part of merge sort -- implementing a divide and conquer algorithm -- is just background info. Merging two sorted lists can be needed in other contexts that do not relate to merge sort.

Adding numbers with different priorities into a total value that does not exceed an allowed limit

Suppose we have an array of numbers; each number has its own priority and price (the price is the value of the number). How do I compute the sum of a set of these numbers, taken in decreasing order of priority, so that the sum does not exceed the allowed limit? Please tell me at least the name of the algorithm with which this can be done. Example: there are the numbers 2, 3, 9 with priorities 3, 1, 2, respectively. The constraint is 4, so the number 9 is cut off immediately, since 9 > 4. We can't add 2 and 3 together, since 5 > 4, so the choice is between the numbers 2 and 3, and since the number 2 has a higher priority, we add only it. The algorithm should work with any quantity of numbers.
It seems that you are looking for a greedy algorithm:
Order by priority
Scan this ordered collection from the beginning, adding an item if the total still meets the constraint(s), and skipping it if the constraint would be broken.
In your case:
2, 3, 9 with priorities 3, 1, 2 and a constraint total <= 4
After ordering we have
2, 9, 3
then we scan:
2 take (total == 2 meets the constraint)
9 skip (total == 2 + 9 == 11 > 4 doesn't meet the constraint)
3 skip (total == 2 + 3 == 5 > 4 doesn't meet the constraint)
So in the end we take only the item 2.
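A minimal Python sketch of that greedy scan (the function name and the (value, priority) tuple layout are just illustrative):

def greedy_by_priority(items, limit):
    # items: list of (value, priority) pairs
    taken, total = [], 0
    for value, priority in sorted(items, key=lambda x: x[1], reverse=True):
        if total + value <= limit:
            taken.append(value)
            total += value
        # otherwise skip: taking it would break the constraint
    return taken

print(greedy_by_priority([(2, 3), (3, 1), (9, 2)], 4))  # [2]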
Edit: you've dropped 9 since 9 > 4, and that's why 9 can't be in the solution. This process (when we drop items or, on the contrary, take items which are guaranteed to be in the solution) is called Kernelization.
In the general case, when you can skip a high-priority item in order to take, say, ten low-priority items, you have the Knapsack problem.

Practical algorithms for permuting external memory

On a spinning disk, I have N records that I want to permute. In RAM, I have an array of N indices that contain the desired permutation. I also have enough RAM to hold n records at a time. What algorithm can I use to execute the permutation on disk as quickly as possible, taking into account the fact that sequential disk access is a lot faster?
I have plenty of excess disk to use for intermediate files, if desired.
This is a known problem. Find the cycles in your permutation order. For instance, given five records to permute [1, 0, 3, 4, 2], you have cycles (0, 1) and (2, 3, 4). You do this by picking an unused starting position; follow the index pointers until you return to your starting point. The sequence of pointers describes a cycle.
You then permute the records with an internal temporary variable, one record long.
temp = disk[0]   # cycle (0, 1)
disk[0] = disk[1]
disk[1] = temp
temp = disk[2]   # cycle (2, 3, 4)
disk[2] = disk[3]
disk[3] = disk[4]
disk[4] = temp
Note that you can also perform the permutation as you traverse the pointers. You will also need some method to recall which positions have already been permuted, such as clearing the permutation index (set it to -1).
Can you see how to generalize that?
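As a sketch of one possible generalization (assuming, as in the assignments above, that perm[i] is the index of the record that should end up at position i, and marking already-permuted positions with -1 as suggested):

def permute_in_place(disk, perm):
    perm = list(perm)  # work on a copy so we can mark visited positions
    for start in range(len(perm)):
        if perm[start] == -1 or perm[start] == start:
            continue  # already handled, or a fixed point
        temp = disk[start]  # one record of temporary space
        i = start
        while perm[i] != start:
            disk[i] = disk[perm[i]]  # pull the record along the cycle
            next_i = perm[i]
            perm[i] = -1             # mark position i as done
            i = next_i
        disk[i] = temp
        perm[i] = -1

records = ['a', 'b', 'c', 'd', 'e']
permute_in_place(records, [1, 0, 3, 4, 2])
print(records)  # ['b', 'a', 'd', 'e', 'c']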
This is a problem of interval coordination. I'll simplify the notation slightly by changing the available memory to M records -- having upper- and lower-case N is a little confusing.
First, we re-cast the permutation as a series of intervals: the rotational span during which a record needs to reside in RAM. If a record needs to be written to a lower-numbered position, we increase the endpoint by the list size to indicate the wraparound -- we have to wait for the next disk rotation. For instance, using my earlier example, we expand the list:
[1, 0, 3, 4, 2]
0 -> 1
1 -> 0+5
2 -> 3
3 -> 4
4 -> 2+5
Now, we apply standard greedy scheduling resolution. First, sort by endpoint:
[0, 1]
[2, 3]
[3, 4]
[1, 5]
[4, 7]
Now, apply the algorithm for M-1 "lanes"; the extra one is needed for swap space. We fill each lane by appending the interval with the earliest endpoint whose start point doesn't overlap the lane's last interval:
[0, 1] [2, 3] [3, 4] [4, 7]
[1, 5]
We can do this in a total of 7 "ticks" if M >= 3. If M=2, we defer the second lane by 2 rotations to [11, 15].
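Here is a small illustrative sketch of the interval construction and the endpoint-sorted greedy lane filling (reading perm[i] as the position the record currently at i must move to, which matches the expansion above; deferring lanes when fewer than M-1 are available is not shown):

def build_intervals(perm):
    # Record at position i must stay in RAM from i until its destination,
    # wrapping by the list size when the destination is behind us.
    n = len(perm)
    return [(i, perm[i] if perm[i] > i else perm[i] + n) for i in range(n)]

def fill_lanes(intervals):
    # Sort by endpoint and append each interval to the first lane whose
    # last interval it does not overlap.
    lanes = []
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        for lane in lanes:
            if lane[-1][1] <= start:
                lane.append((start, end))
                break
        else:
            lanes.append([(start, end)])
    return lanes

print(fill_lanes(build_intervals([1, 0, 3, 4, 2])))
# [[(0, 1), (2, 3), (3, 4), (4, 7)], [(1, 5)]]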
Sneftal's nice example gives us more trouble, with deeper overlap:
[0, 4]
[1, 5]
[2, 6]
[3, 7]
[4, 0+8]
[5, 1+8]
[6, 2+8]
[7, 3+8]
This requires 4 "lanes" if available, deferring lanes as needed if M < 5.
The pathological case is where every record in the permutation needs to be copied back one position, such as [3, 0, 1, 2], with M=2.
[0, 3]
[1, 4]
[2, 5]
[3, 6]
In this case, we walk through the deferral cycle multiple times. At the end of every rotation, we have to defer all remaining intervals by one rotation, resulting in
[0, 3] [3, 6] [2+4, 5+4] [1+4+4, 4+4+4]
Does that get you moving, or do you need more detail?
I have an idea, which might need further improvement. But here it goes:
Suppose the hdd has the following structure:
5 4 1 2 3
And we want to write out this permutation:
2 3 5 1 4
Since the hdd is a circular buffer, and assuming it can only rotate in one direction, we can write the above permutation using shifts as follows:
5 >> 2
4 >> 3
1 >> 1
2 >> 2
3 >> 2
So let's put that in an array, and since we know it is a circular array, let's put its mirrors side by side:
| 2 3 1 2 2 | 2 3 1 2 2 | 2 3 1 2 2 | 2 3 1 2 2 | ... Inf
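A tiny sketch (names are mine) of computing those forward shift amounts on the circular disk:

def forward_shifts(current, desired):
    # How far each record must move forward (circularly) to reach its desired slot.
    n = len(current)
    desired_pos = {value: i for i, value in enumerate(desired)}
    return [(desired_pos[value] - i) % n for i, value in enumerate(current)]

print(forward_shifts([5, 4, 1, 2, 3], [2, 3, 5, 1, 4]))  # [2, 3, 1, 2, 2]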
Since we want to favor sequential reads (or writes), we can attach a cost function to the above series. Let the cost function be linear, i.e.:
0 1 2 3 4 5 6 7 8 9 10 ... Inf
Now, let us add the cost function to the above series, but how do we select the starting point?
The idea is to select the starting point such that you get the maximum congruent monotonically increasing sequence.
For example, if you select the 0 point to be on "3", you'll get
(1) | - 3 2 4 5 | 6 8 7 9 10 | ...
If you select the 0 point to be on "2", the one just right of "1", you'll get:
(2) | - - - 2 3 | 4 6 5 7 8 | ...
Since we are trying to favor consecutive reads, let's define our read-write function to work as follows:
f():
At the currently pointed hdd location, the function reads the currently pointed hdd file into available RAM (namely, total space - 1, because we want to save 1 slot for swap).
If no space is left in RAM for the read, the function asserts and the program halts.
At the current hdd location, if RAM holds the value that we want written in that hdd location, the function reads the current file into swap space, writes the wanted value from RAM to hdd, and destroys the value in RAM.
If a value is placed onto hdd, the function checks whether the sequence is complete. If it is, the program returns with success.
Now, we should note that if the following holds:
shift amount <= n - 1 (n : available memory we can hold)
We can traverse the hard disk in one pass using the above function. For example:
current: 4 5 6 7 0 1 2 3
we want: 0 1 2 3 4 5 6 7
n : 5
We can start anywhere we want, say from the initial "4". We read 4 items sequentially (RAM now holds 4 items), and then we start placing 0 1 2 3 (we can, because n = 5 in total: 4 slots are used and 1 is kept for swap). So the total is 4 consecutive reads and then 8 r-w operations.
Using that analogy, it becomes clear that if we subtract n-1 from equations (1) and (2), the positions whose value is <= 0 will be a better fit for the initial position, because the ones higher than zero will definitely require another pass.
So we select eq. (2) and subtract; for, say, n = 3, we subtract 2 from eq. (2):
(2) | - - - 0 1 | 2 4 3 5 6 | ...
Now it is clear that, using f(), and starting from 0, assuming n = 3, we will have a starting operation as such: r, r, r-w, r-w, ...
So, how do we do the rest and find minimum cost? We will place an array with initial minimum cost, just below equation (2). The positions in that array will signify where we want f() to be executed.
| - - - 0 1 | 2 4 3 5 6 | ...
| - - - 1 1 | 1 1 1 1 1 | ...
The second array, the one with 1's and 0's, tells the program where to execute f(). Note that if we guessed those locations wrong, f() will assert.
Before we start actually placing files onto the hdd, we of course want to check whether the f() positions are correct. If there are assertions, we will try to minimize cost whilst removing all assertions. So, e.g.:
(1) 1111000000000000001111
(2) 1111111000000000000000
(1) obviously has a higher cost than (2). So the question simplifies to finding the 1-0 array.
Some ideas on finding the best array:
The simplest solution is to write out all 1's and turn assertions into 0's (essentially a skip). This method is guaranteed to work.
Brute force: write an array as shown in (2) and start shifting 1's to the right, in an order that tries out every available permutation:
1111111100000000
1111111010000000
1111110110000000
...
Full random approach: plug in mt19937 and start permuting. Whenever you see a sharp drop in cost, stop executing and perform the hdd copy-paste. You won't find the global minimum, but you'll get a nice trade-off.
Genetic algorithms: For permutations where "shift count is much lower than n - 1", the methodology provided in this answer should (?) provide a global minimum and smooth gradients. This allows one to use genetic algorithms without relying on mutations too much.
One advantage I find in this approach is that, since the OP mentioned that this is a real-life problem, the method provides an easy(ier?) way to change cost functions. It is easier to detect the effect of, say, having lots of contiguous small files to be copied vs. having a single huge file. Or perhaps rrwwrrww is better than rrrrwwww?
Does any of this even make sense? We will have to try it out...

Count the coins including permutations of the same sequence

I've found code to count the number of possibilities to make change using given coins: How to count possible combination for coin problem. But how do I count it if we think about different permutations of the same sequence? I mean that, e.g., the amount is 12, and "4 4 2 2" and "4 2 4 2" should be counted as 2, not 1.
As you've mentioned in your question, you can count the possible combinations as stated in How to count possible combination for coin problem. But in order to include the permutations in your answer:
If you distinguish permutations that only swap equal numbers (e.g. the two orderings [1 7 7] and [1 7 7] where the 7s trade places), just count each sequence ([1 7 7] here) as n! (n = number of elements in the sequence) instead of 1.
Otherwise: multiply each sequence by n!/(m! l! ...), where m is the number of equal elements of type 1, l is the number of equal elements of type 2, and so on. For example, a sequence like [a b b c c c] should be counted as 6!/(2!*3!) instead of 1.
So use the algorithm inside that link, which I won't repeat here, but instead of counting each combination as 1, use the formula above (depending on the case you want).
(! is factorial.)
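For the case in the question (equal coins are indistinguishable, so "4 4 2 2" and "4 2 4 2" count as 2), here is a small illustrative sketch that enumerates the combinations and weights each one by n!/(m! l! ...):

from math import factorial
from collections import Counter

def count_ordered_ways(coins, amount):
    total = 0

    def combos(remaining, start, chosen):
        nonlocal total
        if remaining == 0:
            # Weight this combination by n! / (m! * l! * ...) as described above.
            perms = factorial(len(chosen))
            for c in Counter(chosen).values():
                perms //= factorial(c)
            total += perms
            return
        for i in range(start, len(coins)):
            if coins[i] <= remaining:
                combos(remaining - coins[i], i, chosen + [coins[i]])

    combos(amount, 0, [])
    return total

print(count_ordered_ways([4, 2], 12))  # 13 (counts "4 4 2 2" and "4 2 4 2" separately)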

Ad distribution problem: an optimal solution?

I'm asked to find a 2-approximate solution to this problem:
You’re consulting for an e-commerce site that receives a large number
of visitors each day. For each visitor i, where i ∈ {1, 2, ..., n}, the site
has assigned a value v[i], representing the expected revenue that can be
obtained from this customer.
Each visitor i is shown one of m possible ads A1, A2, ..., Am as they
enter the site. The site wants a selection of one ad for each customer so
that each ad is seen, overall, by a set of customers of reasonably large
total weight.
Thus, given a selection of one ad for each customer, we will
define the spread of this selection to be the minimum, over j = 1, 2, ..., m,
of the total weight of all customers who were shown ad Aj.
Example: Suppose there are six customers with values 3, 4, 12, 2, 4, 6, and
there are m = 3 ads. Then, in this instance, one could achieve a spread of
9 by showing ad A1 to customers 1, 2, 4, ad A2 to customer 3, and ad A3 to
customers 5 and 6.
The ultimate goal is to find a selection of an ad for each customer
that maximizes the spread.
Unfortunately, this optimization problem
is NP-hard (you don’t have to prove this).
So instead give a polynomial-time algorithm that approximates the maximum spread within a factor of 2.
The solution I found is the following:
Order the visitor values in descending order
Add the next visitor value (i.e. assign the visitor) to the ad with the current lowest total value
Repeat
This solution actually seems to always find the optimal solution; at least I can't find a counterexample.
Can you find one? Is this a non-polynomial solution and I just can't see it?
With:
v = [7, 6, 5, 3, 3]
m = 2
The optimal solution is:
A1: 6 + 3 + 3 = 12
A2: 5 + 7 = 12
Your solution gives:
A1: 7 + 3 + 3 = 13
A2: 6 + 5 = 11
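A small sketch (illustrative only, using a min-heap over the ad totals) that reproduces this counterexample:

import heapq

def greedy_spread(values, m):
    # Sort descending; always give the next customer to the ad with the
    # currently smallest total.
    lanes = [(0, j) for j in range(m)]  # (total, ad index)
    heapq.heapify(lanes)
    totals = [0] * m
    for v in sorted(values, reverse=True):
        total, j = heapq.heappop(lanes)
        totals[j] = total + v
        heapq.heappush(lanes, (totals[j], j))
    return min(totals), totals

print(greedy_spread([7, 6, 5, 3, 3], 2))  # (11, [13, 11]) -- spread 11, vs. the optimal 12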
