Designing an algorithm to guarantee accepting an outcome with at least 1/4 probability - pseudocode

Here is a problem that I am stumped with.
There are n buyers in this auction, each with a distinct bid b_i > 0. The buyers appear in a random order and the seller must decide immediately whether to accept the current buyer's bid or not. If he accepts, he makes a profit of b_i and the auction terminates. Otherwise, the bid is withdrawn and the seller may see the next buyer.
Design a strategy (algorithm) that guarantees that the seller accepts the highest of the n bids with probability at least 1/4, regardless of n, which you may assume is even for simplicity.
So far, I've considered four cases depending on whether the highest and second-highest bids fall in the first or second half of the arrival order.
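For reference, here is a minimal sketch (my own, not from the original post) of the strategy that case analysis points to: reject the first n/2 bids, then accept the first bid that beats everything seen so far. Whenever the second-highest bid lands in the first half and the highest lands in the second half, which happens with probability n/(4(n-1)) >= 1/4, this accepts the maximum.

import random

def accept_first_record_after_halfway(bids):
    # Observe the first half without accepting, then take the first bid
    # that beats everything seen so far; fall back to the last bid.
    n = len(bids)
    threshold = max(bids[:n // 2])
    for b in bids[n // 2:]:
        if b > threshold:
            return b
    return bids[-1]

# Rough empirical check of the 1/4 guarantee (hypothetical bid values).
n, trials, wins = 20, 100000, 0
for _ in range(trials):
    bids = random.sample(range(1, 10 * n), n)   # distinct positive bids, random order
    if accept_first_record_after_halfway(bids) == max(bids):
        wins += 1
print(wins / trials)   # comfortably above 0.25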

Related

How to order a set of tasks where each has a starting cost and reward

Assume you start with T amount of money. You want to find internships, but because you have no experience you can't get into any, so you decided to buy into paid internships.
You are given a set of internships, each with a buying cost C_b to get you in first; once you are in, you get paid back some money C_p. C_b can be greater than, smaller than, or equal to C_p, so the net gain can be positive, negative, or zero.
Edit: The goal is to do all the internships.
Is there an algorithm (in O(n^2)) to find an order in which to do all the internships? It should also detect when there is no valid order, e.g. when the money left is not enough to buy you into any remaining internship. Thanks!
I'm really confused as to where to even start.
To restate the question: how can you find the most effective order in which to do buy-in internships, so as to maximize the number of internships you can do?
Given:
A set of internships, each with a buy-in cost (C_b) and payback amount (C_p)
C_b may be greater than or less than C_p
You have some amount of money that cannot go into the negative
The Algorithm
1. Sort the internships by net gain (C_p - C_b) in descending order.
2. Loop through this sorted set of internships: if money - C_b >= 0, then do the internship and set money = money + net gain.
3. Repeat step 2, looping over the skipped internships, until you complete a loop without doing another internship.
With this algorithm you will do the internships that pay you the most first, which increases your ability to buy in to more internships.
EDIT
I believe a more efficient algorithm would be, upon finding a suitable internship at step 2, to start the loop back from the beginning (in case you're now able to do one of the skipped higher-net-gain internships).
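A minimal sketch of the greedy described above (function and variable names are mine; the (C_b, C_p) list at the end is a made-up example):

def order_internships(internships, money):
    # internships: list of (C_b, C_p) pairs. Sweep them in descending net-gain
    # order, doing any internship that is currently affordable, and repeat the
    # sweep until a full pass makes no progress (step 3 above).
    remaining = sorted(internships, key=lambda i: i[1] - i[0], reverse=True)
    order = []
    progress = True
    while remaining and progress:
        progress = False
        for cb, cp in list(remaining):
            if money - cb >= 0:
                money += cp - cb              # net gain may be negative
                order.append((cb, cp))
                remaining.remove((cb, cp))
                progress = True
    return order if not remaining else None   # None: no way to do them all

print(order_internships([(5, 8), (12, 6), (7, 7)], 10))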

Probability - expectation Puzzle: 1000 persons and a door

You stand in an office by a door, with a measuring tape. Every time a person walks in you measure him or her and only keep a tally of the "record" tallest. If a new person is taller than everyone who came before (the current record), you count a new record. If later another person is taller still, you have another record, etc.
1000 persons pass through the door. How many records do you expect to have?
(Assume independence of height/arrival. Also note that the answer does not depend on any assumption about the probability distribution other than independence.)
PS - I'm able to come up with the answer (~7.5) with a brute-force approach (running this scenario over 1,000,000 times and taking the average), but here I'm looking for a theoretical approach.
Consider x_1 to x_1000 as the heights in order of arrival, and max(i) as the maximum of the sequence up to person i. The question reduces to finding the expected number of times max(i) changes.
for i = 0 to 999:
if x_{i+1} > max(i), then max(i) changes
Also, P(x_{i+1} > max(i)) = 1/(i+1), since each of the first i+1 people is equally likely to be the tallest among them.
Answer: the summation of 1/(i+1) for i from 0 to 999 (the harmonic number H_1000), which is approximately 7.49.
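A one-liner to check that sum (just the harmonic series, nothing problem-specific):

expected_records = sum(1 / k for k in range(1, 1001))   # H_1000
print(expected_records)   # ~7.485, matching the ~7.5 from simulation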

How do I calculate the most profit-dense combination in the most efficient way?

I have a combinations problem that's bothering me. I'd like someone to give me their thoughts and point out if I'm missing some obvious solution that I may have overlooked.
Let's say that there is a shop that buys all of its supplies from one supplier. The supplier has a list of items for sale. Each item has the following attributes:
size, cost, quantity, m, b
m and b are constants in the following equation:
sales = m * (price) + b
This line slopes downward. The equation tells me how many of that item I will be able to sell if I charge that particular price. Each item has its own m and b values.
Let's say that the shop has limited storage space, and limited funds. The shop wants to fill its warehouse with the most profit-dense items possible.
(By the way, profit density = profit/size. I'm defining profit density only with regard to the item's size. I could work with the density with regard to size and cost, but to do that I'd have to know the cost of warehouse space. That's not a number I know currently, so I'm just going to use size.)
The profit density of items drops the more you buy (see below.)
If I flip the line equation, I can see what price I'd have to charge to sell some given amount of the item in some given period of time.
price = (sales-b)/m
So if I buy n items and want to sell all of them, I'd have to charge
price = (n-b)/m
The revenue from this would be
price*n = n*(n-b)/m
The profit would be
price*n-n*cost = n*(n-b)/m - n*cost
and the profit-density would be
(n*(n-b)/m - n*cost)/(n*size)
or, equivalently
((n-b)/m - cost)/size
So let's say I have a table containing every available item, and each item's profit-density.
The question is, how many of each item do I buy in order to maximise the amount of money that the shop makes?
One possibility is to generate every possible combination of items within the bounds of cost and space, and choose the combo with the highest profitability. In a list of 1000 items, this takes too long. (I tried this and it took 17 seconds for a list of 1000. Horrible.)
Another option I tried (on paper) was to take the top two most profitable items on the list. Let's call the most profitable item A, the 2nd-most profitable item B, and the 3rd-most profitable item C. I buy as many of item A as I can until it's less profitable than item B. Then I repeat this process using B and C, for every item in the list.
It might be the case however, that after buying item B, item A is again the most profitable item, more so than C. So this would involve hopping from the current most profitable item to the next until the resources are exhausted. I could do this, but it seems like an ugly way to do it.
I considered dynamic programming, but since the profit-densities of the items change depending on the amount you buy, I couldn't come up with a resolution for this.
I've considered multiple-linear regression, and by 'consider' I mean I've said to myself "is multi-linear regression an option?" and then done nothing with it.
My spidey-sense tells me that there's a far more obvious method staring me in the face, but I'm not seeing it. Please help me kick myself and facepalm at the same time.
If you treat this as a simple exercise in multivariate optimization, where the controllable variables are the quantities bought, then you are optimizing a quadratic function subject to a linear constraint.
If you use a Lagrange multiplier and differentiate then you get a linear equation for each quantity variable involving itself and the Lagrange multiplier as the only unknowns, and the constraint gives you a single linear equation involving all of the quantities. So write each quantity as a linear function of the Lagrange multiplier and substitute into the constraint equation to get a linear equation in the Lagrange multiplier. Solve this and then plug the Lagrange multiplier into the simpler equations to get the quantities.
This gives you a solution if you are allowed to buy fractional and negative quantities of things if required. Clearly you are not, but you might hope that nothing is very negative and you can round the non-integer quantities to get a reasonable answer. If this isn't good enough for you, you could use it as a basis for branch and bound. If you make an assumption on the value of one of the quantities and solve for the others in this way, you get an upper bound on the possible best answer - the profit predicted neglecting real world constraints on non-negativity and integer values will always be at least the profit earned if you have to comply with these constraints.
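A rough sketch of that calculation for the space constraint alone (my own names; it ignores the non-negativity and integrality caveats above). With profit_i(n_i) = n_i*(n_i - b_i)/m_i - n_i*cost_i and the constraint sum(n_i * size_i) = S, the first-order conditions give n_i = (b_i + m_i*(cost_i + lam*size_i)) / 2, which is linear in the multiplier lam, so substituting into the constraint leaves one linear equation for lam:

def lagrange_quantities(items, space_limit):
    # items: list of (m, b, cost, size) per the question's attributes.
    # Each optimal n_i = (b_i + m_i*(cost_i + lam*size_i)) / 2; plug into
    # sum(size_i * n_i) = space_limit and solve the linear equation for lam.
    const_part = sum(s * (b + m * c) / 2 for m, b, c, s in items)
    lam_coeff = sum(m * s * s / 2 for m, b, c, s in items)
    lam = (space_limit - const_part) / lam_coeff
    quantities = [(b + m * (c + lam * s)) / 2 for m, b, c, s in items]
    return lam, quantities

# Hypothetical example: two items (m, b, cost, size), 100 units of space.
print(lagrange_quantities([(-2.0, 50.0, 3.0, 1.0), (-1.5, 40.0, 4.0, 2.0)], 100.0))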
You can treat this as a dynamic programming exercise, to make the best use of a limited resource.
As a simple example, consider just satisfying the constraint on space and ignoring that on cost. Then you want to find the items that generate the most profit for the available space. Choose units so that expressing the space used as an integer is reasonable, and then, for i = 1 to number of items, work out, for each integer value of space up to the limit, the selection of the first i items that gives the most return for that amount of space. As usual, you can work out the answers for i+1 from the answers for i: for each value from 0 up to the limit on space just consider all possible quantities of the i+1th item up to that amount of space, and work out the combined return from using that quantity of the item and then using the remaining space according to the answers you have already worked out for the first i items. When i reaches the total number of items you will be working out the best possible return for the problem you actually want to solve.
If you have constraints for both space and cost, then the state of the dynamic program is not the single variable (space) but a pair of variables (space, cost) but you can still solve it, although with more work. Consider all possible values of (space, cost) from (0, 0) up to the actual constraints - you have a 2-dimensional table of returns to compute instead of a single set of values from 0 to max-space. But you can still work from i=1 to N, computing the highest possible return for the first i items for each limit of (space, cost) and using the answers for i to compute the answers for i+1.
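A sketch of the space-only version of that dynamic program (my own names; quantities and sizes are assumed to be integers, and the per-item return uses the question's profit formula):

def best_profit_for_space(items, space_limit):
    # items: list of (m, b, cost, size) with integer size.
    # best[s] = highest profit achievable in s units of space using the
    # items considered so far (n = 0 of the new item is always allowed).
    def profit(m, b, cost, n):
        return n * (n - b) / m - n * cost
    best = [0.0] * (space_limit + 1)
    for m, b, cost, size in items:
        new_best = list(best)
        for s in range(space_limit + 1):
            for n in range(1, s // size + 1):     # every quantity that fits
                candidate = profit(m, b, cost, n) + best[s - n * size]
                if candidate > new_best[s]:
                    new_best[s] = candidate
        best = new_best
    return best[space_limit]

# Hypothetical example: two items (m, b, cost, size), 50 units of space.
print(best_profit_for_space([(-2, 60, 3, 2), (-1, 40, 5, 3)], 50))

Adding the cost constraint turns best into a 2-dimensional table indexed by (space, cost), exactly as described above.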

If you know the future prices of a stock, what's the best time to buy and sell?

Interview Question by a financial software company for a Programmer position
Q1) Say you have an array for which the ith element is the price of a given stock on day i.
If you were only permitted to buy one share of the stock and sell one share of the stock, design an algorithm to find the best times to buy and sell.
My Solution :
My solution was to make an array of the differences in stock prices between day i and day i+1 for arraysize-1 days, and then use Kadane's algorithm to return the sum of the largest contiguous subarray. I would then buy at the start of that largest contiguous subarray and sell at its end.
I am wondering if my solution is correct, and whether there are any better solutions out there?
Upon answering, I was asked a follow-up question, which I answered exactly the same way.
Q2) Given that you know the future closing prices of Company x for the next 10 days, design an algorithm to determine whether you should BUY, SELL or HOLD on every single day (you are allowed to make only one decision every day), with the aim of maximizing profit.
Eg: Day 1 closing price :2.24
Day 2 closing price :2.11
...
Day 10 closing price : 3.00
My Solution: Same as above
I would like to know whether there's any better algorithm out there to maximize profit, given that I can make a decision every single day.
Q1 If you were only permitted to buy one share of the stock and sell one share of the stock, design an algorithm to find the best times to buy and sell.
In a single pass through the array, determine the index i with the lowest price and the index j with the highest price. You buy at i and sell at j (selling before you buy, by borrowing stock, is in general allowed in finance, so it is okay if j < i). If all prices are the same you don't do anything.
Q2 Given that you know the future closing prices of Company x for the next 10 days, design an algorithm to determine whether you should BUY, SELL or HOLD on every single day (you are allowed to make only one decision every day), with the aim of maximizing profit.
There are only 10 days, and hence there are only 3^10 = 59049 different possibilities. Hence it is perfectly possible to use brute force. I.e., try every possibility and simply select the one which gives the greatest profit. (Even if a more efficient algorithm were found, this would remain a useful way to test the more efficient algorithm.)
Some of the solutions produced by the brute force approach may be invalid, e.g. it might not be possible to own (or owe) more than one share at once. Moreover, do you need to end up owning 0 stocks at the end of the 10 days, or are any positions automatically liquidated at the end of the 10 days? Also, I would want to clarify the assumption that I made in Q1, namely that it is possible to sell before buying to take advantage of falls in stock prices. Finally, there may be trading fees to be taken into consideration, including payments to be made if you borrow a stock in order to sell it before you buy it.
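For what it's worth, a brute-force sketch under one concrete set of those assumptions (my own choices: at most one share held at a time, you must buy before you sell, and you must end holding nothing):

from itertools import product

def brute_force(prices):
    # Try all 3^10 = 59049 decision sequences and keep the best valid one.
    # Valid here means: BUY only when flat, SELL only when holding a share.
    best_profit = 0.0
    best_moves = ("HOLD",) * len(prices)
    for moves in product(["BUY", "SELL", "HOLD"], repeat=len(prices)):
        profit, holding, valid = 0.0, False, True
        for move, price in zip(moves, prices):
            if move == "BUY":
                if holding:
                    valid = False
                    break
                profit -= price
                holding = True
            elif move == "SELL":
                if not holding:
                    valid = False
                    break
                profit += price
                holding = False
        if valid and not holding and profit > best_profit:
            best_profit, best_moves = profit, moves
    return best_profit, best_moves

print(brute_force([2.24, 2.11, 2.40, 2.05, 2.50, 2.30, 2.80, 2.75, 2.60, 3.00]))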
Once these assumptions are clarified, it could well be possible to design a more efficient algorithm. E.g., in the simplest case, if you can only own one share and you have to buy before you sell, then you would have a "buy" at the first minimum in the series, a "sell" at the last maximum, and buys and sells at any minima and maxima in between.
The more I think about it, the more I think these interview questions are as much about seeing how and whether a candidate clarifies a problem as they are about the solution to the problem.
Here are some alternative answers:
Q1) Work from left to right in the array provided. Keep track of the lowest price seen so far. When you see an element of the array, note down the difference between the price there and the lowest price so far, update the lowest price so far, and keep track of the highest difference seen. The answer is the trade given by that highest difference: sell at that point, having bought at the lowest price seen up to that time.
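A minimal sketch of that single pass (variable names are mine):

def best_buy_sell(prices):
    # Track the lowest price so far and the best (buy_day, sell_day) found.
    lowest, lowest_day = prices[0], 0
    best_profit, buy_day, sell_day = 0.0, 0, 0
    for day in range(1, len(prices)):
        price = prices[day]
        if price - lowest > best_profit:
            best_profit, buy_day, sell_day = price - lowest, lowest_day, day
        if price < lowest:
            lowest, lowest_day = price, day
    return buy_day, sell_day, best_profit

print(best_buy_sell([2.24, 2.11, 2.40, 2.05, 3.00]))   # hypothetical prices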
Q2) Treat this as a dynamic programming problem, where the state at any point in time is whether you own a share or not. Work from left to right again. At each point find the highest possible profit, given that you own a share at the end of that point in time, and given that you do not own a share at the end of that point in time. You can work this out from the result of the calculations of the previous time step: in one case, compare the options of buying a share (subtracting its price from the profit given that you did not own at the end of the previous point) and holding a share that you did own at the previous point. In the other case, compare the options of selling a share (adding its price to the profit given that you owned at the previous time) and standing pat with the profit given that you did not own at the previous time. As is standard with dynamic programming, you keep the decisions made at each point in time and recover the correct list of decisions at the end by working backwards.
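A sketch of that dynamic program (my own names), assuming at most one share is held at a time and you want to end holding nothing:

def best_decisions(prices):
    # Two states per day: best profit so far if you end the day holding a
    # share (own) or not (flat). Decisions are recovered backwards at the end.
    flat, own = 0.0, float("-inf")
    choice = []                              # (move into flat, move into own)
    for p in prices:
        new_flat, flat_move = (own + p, "SELL") if own + p > flat else (flat, "HOLD")
        new_own, own_move = (flat - p, "BUY") if flat - p > own else (own, "HOLD")
        choice.append((flat_move, own_move))
        flat, own = new_flat, new_own
    moves, state = [], "flat"                # finish with no share held
    for day in range(len(prices) - 1, -1, -1):
        move = choice[day][0] if state == "flat" else choice[day][1]
        moves.append(move)
        if move == "SELL":
            state = "own"
        elif move == "BUY":
            state = "flat"
    return flat, list(reversed(moves))

# Example sequence from the next answer: profit 6 via BUY, HOLD, SELL, BUY, SELL.
print(best_decisions([1, 3, 5, 4, 6]))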
Your answer to question 1 was correct.
Your answer to question 2 was not correct. To solve this problem you work backwards from the end, choosing the best option at each step. For example, given the sequence { 1, 3, 5, 4, 6 }, since 4 < 6 your last move is to sell. Since 5 > 4, the previous move to that is buy. Since 3 < 5, the move on 5 is sell. Continuing in the same way, the move on 3 is to hold and the move on 1 is to buy.
Your solution for the first problem is correct. Kadane's algorithm has O(n) runtime complexity and is an optimal solution for the maximum subarray problem. A benefit of using this algorithm is that it is easy to implement.
Your solution for the second problem is wrong, in my opinion. What you can do is store the left and right indices of the maximum-sum subarray you find. Once you have the maximum-sum subarray and its left and right indices, you can call the function again on the left part, i.e. 0 to left - 1, and on the right part, i.e. right + 1 to Array.size - 1. So this is basically a recursive process, and you can design the structure of this recursion with a base case to solve the problem. By following this process you can maximize profit.
Suppose the prices are the array P = [p_1, p_2, ..., p_n]
Construct a new array A = [p_1, p_2 - p_1, p_3 - p_2, ..., p_n - p_{n-1}]
i.e A[i] = p_{i+1} - p_i, taking p_0 = 0.
Now go find the maximum sum sub-array in this.
OR
Find a different algorithm, and solve the maximum sub-array problem!
The problems are equivalent.

Open-ended tournament pairing algorithm

I'm developing a tournament model for a virtual city commerce game (Urbien.com) and would love to get some algorithm suggestions. Here's the scenario and current "basic" implementation:
Scenario
Entries are paired up duel-style, like on the original Facemash or Pixoto.com.
The "player" is a judge, who gets a stream of dueling pairs and must choose a winner for each pair.
Tournaments never end; people can submit new entries at any time, and winners of the day/week/month/millennium are chosen based on the data at that date.
Problems to be solved
Rating algorithm - how to rate tournament entries and how to adjust their ratings after each match?
Pairing algorithm - how to choose the next pair to feed the player?
Current solution
Rating algorithm - the Elo rating system currently used in chess and other tournaments.
Pairing algorithm - our current algorithm recognizes two imperatives:
Give more duels to entries that have had less duels so far
Match people with similar ratings with higher probability
Given:
N = total number of entries in the tournament
D = total number of duels played in the tournament so far by all players
Dx = how many duels player x has had so far
To choose players x and y to duel, we first choose player x with probability:
p(x) = (1 - (Dx / D)) / N
Then choose player y the following way:
Sort the players by rating
Let the probability of choosing player j at index jIdx in the sorted list be:
p(j) = 0, if j == x
p(j) = n * r^abs(jIdx - xIdx), otherwise
where 0 < r < 1 is a coefficient to be chosen, and n is a normalization factor.
Basically the probabilities in either direction from x form a geometric series, normalized so they sum to 1.
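A sketch of that choice of y (my own names), with the geometric weights normalized so they sum to 1:

import random

def choose_opponent(sorted_players, x_idx, r=0.5):
    # sorted_players: entries sorted by rating; weight decays geometrically
    # with distance from x's index, and x itself gets weight 0.
    weights = [0.0 if i == x_idx else r ** abs(i - x_idx)
               for i in range(len(sorted_players))]
    n = 1.0 / sum(weights)                      # the normalization factor n
    return random.choices(sorted_players, weights=[n * w for w in weights])[0]

print(choose_opponent(["a", "b", "c", "d", "e"], x_idx=2))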
Concerns
Maximize informational value of a duel - pairing the lowest rated entry against the highest rated entry is very unlikely to give you any useful information.
Speed - we don't want to do massive amounts of calculations just to choose one pair. One alternative is to use something like the Swiss pairing system and pair up all entries at once, instead of choosing new duels one at a time. This has the drawback (?) that all entries submitted in a given timeframe will experience roughly the same amount of duels, which may or may not be desirable.
Equilibrium - Pixoto's ImageDuel algorithm detects when entries are unlikely to further improve their rating and gives them less duels from then on. The benefits of such detection are debatable. On the one hand, you can save on computation if you "pause" half the entries. On the other hand, entries with established ratings may be the perfect matches for new entries, to establish the newbies' ratings.
Number of entries - if there are just a few entries, say 10, perhaps a simpler algorithm should be used.
Wins/Losses - how does the player's win/loss ratio affect the next pairing, if at all?
Storage - what to store about each entry and about the tournament itself? Currently stored:
Tournament Entry: # duels so far, # wins, # losses, rating
Tournament: # duels so far, # entries
Instead of throwing in Elo and ad-hoc probability formulae, you could use a standard approach based on the maximum likelihood method.
The maximum likelihood method is a method for parameter estimation and it works like this (example). Every contestant (player) is assigned a parameter s[i] (1 <= i <= N where N is total number of contestants) that measures the strength or skill of that player. You pick a formula that maps the strengths of two players into a probability that the first player wins. For example,
P(i, j) = 1/(1 + exp(s[j] - s[i]))
which is the logistic curve (see http://en.wikipedia.org/wiki/Sigmoid_function). When you then have a table that shows the actual results between the users, you use global optimization (e.g. gradient descent) to find the strength parameters s[1] .. s[N] that maximize the probability of the actually observed match results. E.g. if you have three contestants and have observed two results:
Player 1 won over Player 2
Player 2 won over Player 3
then you find parameters s[1], s[2], s[3] that maximize the value of the product
P(1, 2) * P(2, 3)
Incidentally, it can be easier to maximize
log P(1, 2) + log P(2, 3)
Note that if you use something like the logistic curve, it is only the difference of the strength parameters that matters, so you need to anchor the values somewhere, e.g. choose arbitrarily
s[1] = 0
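A minimal sketch of that fit (my own code: plain gradient ascent on the log-likelihood, with the first strength anchored at 0; the two results are the example above, 0-indexed):

import math

def fit_strengths(results, num_players, steps=2000, lr=0.1):
    # results: list of (winner, loser) index pairs. Maximizes
    # sum of log P(winner, loser) with P(i, j) = 1/(1 + exp(s[j] - s[i])).
    s = [0.0] * num_players
    for _ in range(steps):
        grad = [0.0] * num_players
        for w, l in results:
            p = 1.0 / (1.0 + math.exp(s[l] - s[w]))   # predicted P(w beats l)
            grad[w] += 1.0 - p                        # d log P / d s[winner]
            grad[l] -= 1.0 - p                        # d log P / d s[loser]
        for i in range(1, num_players):               # keep s[0] anchored at 0
            s[i] += lr * grad[i]
    return s

# Player 1 beat player 2, player 2 beat player 3 (as indices 0, 1, 2).
# With only two results the likelihood has no finite maximum, so the strengths
# keep spreading apart; with real data they settle down.
print(fit_strengths([(0, 1), (1, 2)], 3))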
In order to have more recent matches "weigh" more, you can adjust the importance of the match results based on their age. If t measures the time since a match took place (in some time units), you can maximize the value of the sum (using the example)
e^-t log P(1, 2) + e^-t' log P(2, 3)
where t and t' are the ages of the matches 1-2 and 2-3, so that those games that occurred more recently weigh more.
The interesting thing in this approach is that when the strength parameters have values, the P(...) formula can be used immediately to calculate the win/lose probability for any future match. To pair contestants, you can pair those where the P(...) value is close to 0.5, and then prefer those contestants whose time-adjusted number of matches (sum of e^-t1 + e^-t2 + ...) for match ages t1, t2, ... is low. The best thing would be to calculate the total impact of a win or loss between two players globally and then prefer those matches that have the largest expected impact on the ratings, but that could require lots of calculations.
You don't need to run the maximum likelihood estimation / global optimization algorithm all the time; you can run it e.g. once a day as a batch run and use the results for the next day for matching people together. The time-adjusted match masses can be updated in real time anyway.
On the algorithm side, you can sort the players after the maximum likelihood run based on their s parameter, so it's very easy to find equal-strength players quickly.
