Understanding the concise DP solution for best time to buy and sell stocks IV

Understanding the concise DP solution for best time to buy and sell stocks IV - algorithm

The problem is the famous leetcode problem (or in similar other contexts), best to buy and sell stocks, with at most k transactions.
Here is the screenshot of the problem:
I am trying to make sense of this DP solution. You can ignore the first part of large k. I don't understand the dp part why it works.
class Solution(object):
def maxProfit(self, k, prices):
"""
:type k: int
:type prices: List[int]
:rtype: int
"""
# for large k greedy approach (ignore this part for large k)
if k >= len(prices) / 2:
profit = 0
for i in range(1, len(prices)):
profit += max(0, prices[i] - prices[i-1])
return profit
# Don't understand this part
dp = [[0]*2 for i in range(k+1)]
for i in reversed(range(len(prices))):
for j in range (1, k+1):
dp[j][True] = max(dp[j][True], -prices[i]+dp[j][False])
dp[j][False] = max(dp[j][False], prices[i]+dp[j-1][True])
return dp[k][True]
I was able to drive a similar solution, but that uses two rows (dp and dp2) instead of just one row (dp in the solution above). To me it looks like the solution is overwriting values on itself, which for this solution doesn't look right. However the answer works and passes leetcode.

Lets put words to it:
for i in reversed(range(len(prices))):
For each future price we already know in advance, after considering later prices.
for j in range (1, k+1):
For each possibility of considering this price as one of k two-price transactions.
dp[j][True] = max(dp[j][True], -prices[i]+dp[j][False])
If we consider this might be a purchase -- since we are going backwards in time, a purchase means a completed transaction -- we choose the best of (1) having considered the jth purchase already (dp[j][True]) or (2) subtract this price as a purchase and add the best result we have already that includes the jth sale (-prices[i] + dp[j][False]).
dp[j][False] = max(dp[j][False], prices[i]+dp[j-1][True])
Otherwise, we might consider this as a sale (the first half of a transaction since we're going backwards), so we choose the best of (1) the jth sale already considered (dp[j][False]), or (2) we add this price as a sale and add to that the best result we have so far for the first (j - 1) completed transactions (prices[i] + dp[j-1][True]).
Note that the first dp[j][False] is referring to the jth "half-transaction," the sale if you will, since we are going backwards in time, that would have been set in an earlier iteration on a later price. We then can possibly overwrite it with our consideration of this price as a jth sale.

Related

Algorithm for grouping train trips

Imagine you have a full calendar year in front of you. On some days you take the train, potentially even a few times in a single day and each trip could be to a different location (I.E. The amount you pay for the ticket can be different for each trip).
So you would have data that looked like this:
Date: 2018-01-01, Amount: $5
Date: 2018-01-01, Amount: $6
Date: 2018-01-04, Amount: $2
Date: 2018-01-06, Amount: $4
...
Now you have to group this data into buckets. A bucket can span up to 31 consecutive days (no gaps) and cannot overlap another bucket.
If a bucket has less than 32 train trips it will be blue. If it has 32 or more train trips in it, it will be red. The buckets will also get a value based on the sum of the ticket cost.
After you group all the trips the blue buckets get thrown out. And the value of all the red buckets gets summed up, we will call this the prize.
The goal, is to get the highest value for the prize.
This is the problem I have. I cant think of a good algorithm to do this. If anyone knows a good way to approach this I would like to hear it. Or if you know of anywhere else that can help with designing algorithms like this.

This can be solved by dynamic programming.
First, sort the records by date, and consider them in that order.
Let day (1), day (2), ..., day (n) be the days where the tickets were bought.
Let cost (1), cost (2), ..., cost (n) be the respective ticket costs.
Let fun (k) be the best prize if we consider only the first k records.
Our dynamic programming solution will calculate fun (0), fun (1), fun (2), ..., fun (n-1), fun (n), using the previous values to calculate the next one.
Base:
fun (0) = 0.
Transition:
What is the optimal solution, fun (k), if we consider only the first k records?
There are two possibilities: either the k-th record is dropped, then the solution is the same as fun (k-1), or the k-th record is the last record of a bucket.
Let us then consider all possible buckets ending with the k-th record in a loop, as explained below.
Look at records k, k-1, k-2, ..., down to the very first record.
Let the current index be i.
If the records from i to k span more than 31 consecutive days, break from the loop.
Otherwise, if the number of records, k-i+1, is at least 32, we can solve the subproblem fun (i-1) and then add the records from i to k, getting a prize of cost (i) + cost (i+1) + ... + cost (k).
The value fun (k) is the maximum of these possibilities, along with the possibility to drop the k-th record.
Answer: it is just fun (n), the case where we considered all the records.
In pseudocode:
fun[0] = 0
for k = 1, 2, ..., n:
fun[k] = fun[k-1]
cost_i_to_k = 0
for i = k, k-1, ..., 1:
if day[k] - day[i] > 31:
break
cost_i_to_k += cost[i]
if k-i+1 >= 32:
fun[k] = max (fun[k], fun[i-1] + cost_i_to_k)
return fun[n]
It is not clear whether we are allowed to split records on a single day into different buckets.
If the answer is no, we will have to enforce it by not considering buckets starting or ending between records in a single day.
Technically, it can be done by a couple of if statements.
Another way is to consider days instead of records: instead of tickets which have day and cost, we will work with days.
Each day will have cost, the total cost of tickets on that day, and quantity, the number of tickets.
Edit: as per comment, we indeed can not split any single day.
Then, after some preprocessing to get days records instead of tickets records, we can go as follows, in pseudocode:
fun[0] = 0
for k = 1, 2, ..., n:
fun[k] = fun[k-1]
cost_i_to_k = 0
quantity_i_to_k = 0
for i = k, k-1, ..., 1:
if k-i+1 > 31:
break
cost_i_to_k += cost[i]
quantity_i_to_k += quantity[i]
if quantity_i_to_k >= 32:
fun[k] = max (fun[k], fun[i-1] + cost_i_to_k)
return fun[n]
Here, i and k are numbers of days.
Note that we consider all possible days in the range: if there are no tickets for a particular day, we just use zeroes as its cost and quantity values.
Edit2:
The above allows us to calculate the maximum total prize, but what about the actual configuration of buckets which gets us there?
The general method will be backtracking: at position k, we will want to know how we got fun (k), and transition to either k-1 if the optimal way was to skip k-th record, or from k to i-1 for such i that the equation fun[k] = fun[i-1] + cost_i_to_k holds.
We proceed until i goes down to zero.
One of the two usual implementation approaches is to store par (k), a "parent", along with fun (k), which encodes how exactly we got the maximum.
Say, if par (k) = -1, the optimal solution skips k-th record.
Otherwise, we store the optimal index i in par (k), so that the optimal solution takes a bucket of records i to k inclusive.
The other approach is to store nothing extra.
Rather, we run a slight modification code which calculates fun (k).
But instead of assigning things to fun (k), we compare the right part of the assignment to the final value fun (k) we already got.
As soon as they are equal, we found the right transition.
In pseudocode, using the second approach, and days instead of individual records:
k = n
while k > 0:
k = prev (k)
function prev (k):
if fun[k] == fun[k-1]:
return k-1
cost_i_to_k = 0
quantity_i_to_k = 0
for i = k, k-1, ..., 1:
if k-i+1 > 31:
break
cost_i_to_k += cost[i]
quantity_i_to_k += quantity[i]
if quantity_i_to_k >= 32:
if fun[k] == fun[i-1] + cost_i_to_k:
writeln ("bucket from $ to $: cost $, quantity $",
i, k, cost_i_to_k, quantity_i_to_k)
return i-1
assert (false, "can't happen")

Simplify the challenge, but not too much, to make an overlookable example, which can be solved by hand.
That helps a lot in finding the right questions.
For example take only 10 days, and buckets of maximum length of 3:
For building buckets and colorizing them, we need only the ticket count, here 0, 1, 2, 3.
On Average, we need more than one bucket per day, for example 2-0-2 is 4 tickets in 3 days. Or 1-1-3, 1-3, 1-3-1, 3-1-2, 1-2.
But We can only choose 2 red buckets: 2-0-2 and (1-1-3 or 1-3-3 or 3-1-2) since 1-2 in the end is only 3 tickets, but we need at least 4 (one more ticket than max day span per bucket).
But while 3-1-2 is obviously more tickets than 1-1-3 tickets, the value of less tickets might be higher.
The blue colored area is the less interesting one, because it doesn't feed itself, by ticket count.

Place closest students as far as possible

You are in charge of a classroom which has n seats in a single row, numbered 0 through n-1
During the day students enter and leave the classroom for the exam.
In order to minimize the cheating, your task is to efficiently seat all incoming students.
You're given 2 types of queries: add_student(student_id) -> seat index and remove_student(student_id) -> void
The rules for seating the student is the following:
The seat must be unoccupied
The closest student must be as far away as possible
Ties can be resolved by choosing the lowest-numbered seat.
My Approach is to use Hashtable considering we have to assign seat index to each student which we can do using hash function. Is this approach correct?
If hashtable is the right approach then for - 'the closest student must be as far away as possible', how should I design efficient hash function?
Is there any better way to solve this problem?

This seemed like an interesting problem to me and so here is an algorithm that I have come up with, after giving it some thought.
So as the question says, it's only a single row numbered from 1 to n.
I was thinking of a greedy approach to this problem.
So here how it goes.
Person coming first will be seated at 1.
Person coming second will be seated at n, as this will create the farthest distance between them.
Now we start keeping the subsets, so we have one subset as (1,n).
Now when the third person comes in, he gets seated at mid of first subset i.e (1+n)/2. And after this we will have two subsets - (1,n/2) and (n/2,n).
Now when the fourth person arrives he can either sit in the mid of (1,n/2) and (n/2,n).
And this flow will go on.
Now suppose when a person leaves his seat and goes out of the class, then our subset range will change and then for the next incoming person, we will calculate new set of subsets for him.
Hope this helps. For this approach, I think an array of size n will also work nicely.

Update: This solution does not work because of the order the students can leave and open gaps.
So here's the solution I came up with. I would create an initial position array. So the position of the incoming student is just position[current_number_of_students-1].
Creating the array is the tricky part.
def position(number_of_chairs)
# handle edge cases
return [] if number_of_chairs.nil? || number_of_chairs <= 0
return [0] if number_of_chairs == 1
# initialize with first chair and the last chair
#position = [0, number_of_chairs - 1]
# We want to start adding the middle of row segments
# but in the correct order.
# The first segment is going to be the entire row
#segments = [0, number_of_chairs - 1]
while !#segments.empty?
current_segment = #segments.shift
mid = (current_segment[0] + current_segment[1]) / 2
if (mid > current_segment[0] && mid < current_segment[1])
#position << mid
# add the bottom half to the queue
#segments.push([current_segment[0], mid])
# add the top half to the queue
#segments.push([mid, current_segment[1]])
end
end
#position
end

How to find the minimum value of M?

I'm trying to solve this problem:
You have N relatives. You will talk to ith relative for exactly Ti
minutes. Each minute costs you 1 dollar . After the conversation,
they will add a recharge of Xi dollars in your mobile. Initially, you
have M dollars balance in your mobile phone.
Find the minimum value of M, that you must have initially, in your
phone, so that you don't run out of balance during any of the call
(encounter negative balance).
Note : You can call relatives in any order. Each relative will be
called exactly once.
Input:
N
T1 X1
T2 X2
2
1 1
2 1
Output:
2
This looks easy to me at first but I'm not able to find the exact solution.
My Initial thoughts:
We have no problem where Xi > Ti as it will not reduce our initial
balance. We need to take care of situation where where we will run
into loss i.e Ti > Xi.
But I am unable to make expression which will result in minimum
initial value.
Need guidance in approaching this problem to find optimal solution.

UPDATE:-
Binary Search approach seems to lead to wrong result (as proved by the
test case provided in the comment below by user greybeard.
So, this is another approach.We maintain the difference between call cost
and recharge amount.
Then we maintain two arrays/vectors.
If our recharge amount is strictly greater than cost of call, we put
the call in the first array ,else we put it in the second array.
Then we can sort the first array according to the cost and the second array
according to the recharge amount. We then update the diff by adding the
least amount of recharge from the call where our cost is greater than recharge
Then we can iterate through our first array and update our max
requirement,requirement for each call and current balance.Finally, our answer
will be the maximum between max requirement and the diff we have maintained.
Example :-
N = 2
T1 = 1 R1 = 1
T2 = 2 R2 = 1
Our first array contains nothing as all the calls have cost greater than
or equal to recharge amount. So, we place both calls in our second array
The diff gets updated to 2 before we sort the array. Then, we add the min
recharge we can get from the calls to our diff(i.e 1).Now, the diff stands
at 3.Then as our first array contains no elements, our answer is equal to
the diff i.e 3.
Time Complexity :- O(nlogn)
Working Example:-
#include<bits/stdc++.h>
using namespace std;
#define MAXN 100007
int n,diff;
vector<pair<int,int> > v1,v2;
int main(){
diff = 0;
cin>>n;
for(int i=0;i<n;i++){
int cost,recharge;
cin>>cost>>recharge;
if(recharge > cost){
v1.push_back(make_pair(cost,recharge));
}else{
v2.push_back(make_pair(recharge,cost));
}
diff += (cost-recharge);
}
sort(v1.begin(), v1.end());
sort(v2.begin(), v2.end());
if(v2.size() > 0)diff += v2[0].first;
int max_req = diff, req = 0,cur = 0;
for(int i=0; i<v1.size(); i++){
req = v1[i].first - cur;
max_req = max(max_req, req);
cur += v1[i].second-v1[i].first;
}
cout<<max(max_req,diff)<<endl;
return 0;
}

(This is a wiki post: you are invited to edit, and don't need much reputation to do so without involving a moderator.)
Working efficiently means accomplishing the task at hand, with no undue effort. Aspects here:
the OP asks for guidance in approaching this problem to find optimal solution - not for a solution (as this entirely similar, older question does).
the problem statement asks for the minimum value of M - not an optimal order of calls or how to find that.
To find the minimum balance initially required, categorise the relatives/(T, X)-pairs/calls (the order might have a meaning, if not for the problem as stated)
T < X Leaves X-T more for calls to follow. Do in order of increasing cost.
Start assuming an initial balance of 1. For each call, if you can afford it, subtract its cost, add its refund and be done accounting for it. If you can't afford it (yet), put it on hold/the back burner/in a priority queue. At the end of "rewarding calls", remove each head of the queue in turn, accounting for necassary increases in intitial balance.
This part ends with a highest balance, yet.
T = X No influence on any other call. Just do at top balance, in any order.
The top balance required for the whole sequence can't be lower than the cost of any single call, including these.
T > X Leaves T-X less for subsequent calls. Do in order of decreasing refund.
(This may, as any call, go to a balance of zero before refund.
As order of calls does not change the total cost, the ones requiring the least initial balance will be those yielding the lowest final one. For the intermediate balance required by this category, don't forget that least refund.)
Combine the requirements from all categories.
Remember the request for guidance.

Find the best way to buy p Product from limit x Vendors

I have to buy 100 Products ( or p Products) from 20 Vendors ( or v Vendors). Each Vendors have all of these Products, but they sell different Price.
I want to find the best price to get 100 Products. Asume that there is no Shipping Cost.
There are v^p ways. And I will get only one way that have best Price.
The problem seem to be easy if there is no requirement: LIMIT number of Vendors to x in the Orders because of Time Delivery ( or Some reasons).
So, the problem is: Find the best way to buy p Product from limit x Vendors ( There are v Vendors , x<=v).
I can generate all Combination of Vendors( There are C(v,x) combinations) and compare the Total Price. But There are so many combinations . (if there are 20 Vendors, there are around 185k combinations).
I stuck at this idea.
Someone has same problem , pls help me. Thank you very much.

This problem is equivalent to the non-metric k-center problem (cities = products, warehouses = vendors), which is NP-hard.
I would try mixed integer programming. Here's one formulation.
minimize c(i, j) y(i, j) # cost of all of the orders
subject to
for all i: sum over j of y(i, j) = 1 # buy each product once
for all i, j: y(i, j) <= z(j) # buy products only from chosen vendors
sum over j of z(j) <= x # choose at most x vendors
for all i, j: 0 <= y(i, j) <= 1
for all j: z(j) in {0, 1}
The interpretation of the variables is that i is a product, j is a vendor, c(i, j) is the cost of product i from vendor j, y(i, j) is 1 if we buy product i from vendor j and 0 otherwise, z(j) is 1 is we buy from vendor j at all and 0 otherwise.
There are many free mixed integer program solvers available.

Not Correct as shown by #Per the structure lacks optimal substructure
My assumptions are as follows, from the master table you need to create a sub list which has only "x" vendor columns, and "Best Price" is the "Sum" of all the prices.
Use a dynamic programming approach
What you do is define two functions, Picking (i,k) and NotPicking(i,k).
What it means is getting the best with ability to pick vendors from 1,.. i with maximum of k vendors.
Picking (1,_) = Sum(All prices)
NotPicking (1,_) = INF
Picking (_,0) = INF
NotPicking (_,0) = INF
Picking (i,k) = Min (Picking(i-1,k-1) + NotPicking(i-1,k-1)) - D (The difference you get because of having this vendor)
NotPicking (i,k) = Min (Picking(i-1,k) + NotPicking(i-1,k))
You just solve it for a i from 1 to V and k from 1 to X
You calculate the difference by maintaining for each picking the whole product list, and calculating the difference.

How about using a Greedy Approach. Since you have a limitation on the vendors ( you need to use at least x of the total v vendors). That means you need to choose at least 1 product from each vendor of the x ... And here's an example solution:
For each vendor in v, sort the products by price, then you will have "v" sets of sorted prices. Now you can pick the min of these sets and sort again, producing a new set of "v" products, containing only the cheapest ones.
Now, if p <= v, then pick the first p items and you are done, otherwise pick all v items and repeat the same logic until you reach p.

I haven't worked this out and verified, but I guess it might work. Try this:
Add two more columns called "Highest Price" and "Lowest Price" to the table and generate data for it: they should hold the highest and lowest price for each product amongst all vendors.
Also add another column, called "Range" which should hold the (highest price - lowest price).
Now do this 100 (p) times:
Pick the row with highest range. Buy the product with least price on
that row. Once bought, mark that cell as 'bought' (maybe set null).
Recalculate lowest price, range for that row (ignoring cells marked as 'bought').

EDIT: Hungarian algorithm is not the answer to your question unless you did not wanted to put a limit on vendors.
The algorithm you are looking for is Hungarian Algorithm.
There are many available implementations of it on the web.

Algorithm possible amounts (over)paid for a specific price, based on denominations

In a current project, people can order goods delivered to their door and choose 'pay on delivery' as a payment option. To make sure the delivery guy has enough change customers are asked to input the amount they will pay (e.g. delivery is 48,13, they will pay with 60,- (3*20,-)). Now, if it were up to me I'd make it a free field, but apparantly higher-ups have decided is should be a selection based on available denominations, without giving amounts that would result in a set of denominations which could be smaller.
Example:
denominations = [1,2,5,10,20,50]
price = 78.12
possibilities:
79 (multitude of options),
80 (e.g. 4*20)
90 (e.g. 50+2*20)
100 (2*50)
It's international, so the denominations could change, and the algorithm should be based on that list.
The closest I have come which seems to work is this:
for all denominations in reversed order (large=>small)
add ceil(price/denomination) * denomination to possibles
baseprice = floor(price/denomination) * denomination;
for all smaller denominations as subdenomination in reversed order
add baseprice + (ceil((price - baseprice) / subdenomination) * subdenomination) to possibles
end for
end for
remove doubles
sort
Is seems to work, but this has emerged after wildly trying all kinds of compact algorithms, and I cannot defend why it works, which could lead to some edge-case / new countries getting wrong options, and it does generate some serious amounts of doubles.
As this is probably not a new problem, and Google et al. could not provide me with an answer save for loads of pages calculating how to make exact change, I thought I'd ask SO: have you solved this problem before? Which algorithm? Any proof it will always work?

Its an application of the Greedy Algorithm http://mathworld.wolfram.com/GreedyAlgorithm.html (An algorithm used to recursively construct a set of objects from the smallest possible constituent parts)
Pseudocode
list={1,2,5,10,20,50,100} (*ordered *)
while list not null
found_answer = false
p = ceil(price) (* assume integer denominations *)
while not found_answer
find_greedy (p, list) (*algorithm in the reference above*)
p++
remove(first(list))
EDIT> some iterations are nonsense>
list={1,2,5,10,20,50,100} (*ordered *)
p = ceil(price) (* assume integer denominations *)
while list not null
found_answer = false
while not found_answer
find_greedy (p, list) (*algorithm in the reference above*)
p++
remove(first(list))
EDIT>
I found an improvement due to Pearson on the Greedy algorithm. Its O(N^3 log Z), where N is the number of denominations and Z is the greatest bill of the set.
You can find it in http://library.wolfram.com/infocenter/MathSource/5187/

You can generate in database all possible combination sets of payd coins and paper (im not good in english) and each row contains sum of this combination.
Having this database you can simple get all possible overpaid by one query,
WHERE sum >= cost and sum <= cost + epsilon
Some word about epsilon, hmm.. you can assign it from cost value? Maybe 10% of cost + 10 bucks?:
WHERE sum >= cost and sum <= cost * 1.10 + 10
Table structure must have number of columns representing number of coins and paper type.
Value of each column have number of occurences of this type of paid item.
This is not optimal and fastest solution of this problem but easy and simple to implement.
I think about better solution of this.
Other way you can for from cost to cost + epsilon and for each value calculate smallest possible number of paid items for each. I have algorithm for it. You can do this with this algorithm but this is in C++:
int R[10000];
sort(C, C + coins, cmp);
R[0]=0;
for(int i=1; i <= coins_weight; i++)
{
R[i] = 1000000;
for (int j=0; j < coins; j++)
{
if((C[j].weight <= i) && ((C[j].value + R[i - C[j].weight]) < R[i]))
{
R[i] = C[j].value + R[i - C[j].weight];
}
}
}
return R[coins_weight];

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio