Maximize variable x and minimize variable y - algorithm

I could frame it like this: a group of people enter a pool willing to fund something. Each can offer funds at whatever interest rate they think is right, so essentially they make bids. I want to find the cheapest way to raise the amount requested by the person who created the pool: maximum funds at the minimum interest rate, while also minimizing the number of lenders that person will have to pay back.

You have 3 different things you are trying to maximize/minimize. You can use weighting variables to determine what is most important to your case.
Suppose α is the importance you want to put on total capital. β is the weight you want to put on the number of lenders. α & β are subject to the constraint α + β ≤ 1.
A general formula to maximize your pool (where positive numbers are desirable) is:
Result = α*(funds)+β*(-lenders)+(1-α-β)*(-interest rate)
These terms aren't apples-to-apples comparisons unless you normalize the funds, lender count, and interest rate to a common scale.
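A minimal sketch of that scoring formula in Python. The normalization bounds max_funds, max_lenders, and max_rate are assumptions for illustration, not part of the original formula:

```python
def score_allocation(funds, lenders, rate, alpha, beta,
                     max_funds, max_lenders, max_rate):
    """Weighted objective from the answer above: higher score is better.
    Each term is normalized to [0, 1] so the terms are comparable."""
    assert 0 <= alpha and 0 <= beta and alpha + beta <= 1
    f = funds / max_funds          # more funds: better
    l = lenders / max_lenders      # fewer lenders: better
    r = rate / max_rate            # lower rate: better
    return alpha * f + beta * (-l) + (1 - alpha - beta) * (-r)
```

With this in hand, candidate funding routes can simply be sorted by score and the best one chosen.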

Related

How to calculate the tax bracket needed to achieve a given budget income and tax rate

I have and assignment where I have a database of incomes of citizens of a country (one record - income of a single citizen). A tax is to be imposed on the citizens with a given tax rate of r (for instance 30%) on incomes over a certain level (just like normal tax brackets work). The government budget wants to earn a given tax income from this tax (let's say x). What is the tax bracket? I'm writing this in R, but I need help with the algorithm, not the coding itself.
To make it clearer: if citizen number 2342 earns an income i that is higher than the bracket y, then he pays (i-y)*r. If another citizen earns less than y, he pays no tax. I have, say, 10000 i's for 10000 citizens. All the taxes paid by all the citizens need to sum to a given value x, and the rate r is given. How do I calculate the bracket y?
Thanks for all the help
You have one more degree of freedom than will permit a unique solution. Let's consider the degenerate case of a single taxpayer, making 10,000 monetary units. Let's say that you need to collect 1,000 in taxes. Your equation is simply
1000 = (10000 - y) * r
You're asking for the algorithm to solve this for r. As you can see, this will result in a simple linear equation.
Now look at the problem with two taxpayers, making 10,000 and 9,000. Now you have:
1000 = (10000 - y) * r + max(0, (9000 - y)) * r
You still have a piecewise linear function, but with a change in slope at y = 9000, where the second taxpayer begins to contribute. Wherever you set that bracket, there is a value for r that yields the desired total income.
It's a minimization problem.
Pick a guess for y
Calculate how much tax will be collected
Check if collected tax is within a given tolerance of the desired tax. If that is the case then stop.
If the collected tax is too low, decrease y; if too high, increase y. Adjust y accordingly and repeat from step 2.
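The steps above amount to a bisection on y, since collected tax is monotonically decreasing in y. A minimal sketch (the tolerance handling is an assumption):

```python
def tax_bracket(incomes, r, target, tol=0.01):
    """Find the bracket y so that sum of r*(i - y) over incomes i > y
    hits `target`, by bisection on y."""
    def collected(y):
        return sum(r * (i - y) for i in incomes if i > y)

    lo, hi = 0.0, float(max(incomes))   # collected(hi) is always 0
    if collected(lo) < target:
        raise ValueError("target not reachable at this rate")
    while hi - lo > 1e-9:
        y = (lo + hi) / 2
        t = collected(y)
        if abs(t - target) <= tol:      # within tolerance: stop
            return y
        if t < target:
            hi = y                      # collected too low: decrease y
        else:
            lo = y                      # collected too high: increase y
    return (lo + hi) / 2
```

For example, with incomes 10000 and 9000, rate 0.1, and target 1000, the bracket comes out near 4500 (0.1*(10000-4500) + 0.1*(9000-4500) = 1000).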

Algorithms to deal with apportionment problems

I need an algorithm, or technique, or any guidance to optimize the following problem:
I have two companies:
Company A has 324 employees
Company B has 190 employees
The total of employees (A+B) is 514. I need to randomly select 28% of these 514 employees.
Ok, so let's do it: 28% of 514 is 143.92. Oh... this is bad; we are dealing with people here, so we cannot have decimal places. Ok then, I'll try rounding up or down.
If I round down, 143 is 27.82101167%, which is not good since I must have at least 28%, so I must round up to 144.
So now I know that 144 employees must be selected.
The main problem comes now... it's time to determine what percentage to apply to each company to reach the total of 144. How do I do that so that each company's percentage stays as close as possible to 28%?
I'll exemplify:
If I just apply 28% to each company I get:
Company A has 324 employees: 0.28 * 324 = 90.72
Company B has 190 employees: 0.28 * 190 = 53.2
Again, I end up with decimal places, so I must figure out which values to round up and which to round down to reach 144 in total.
Note: For this example I only used two companies, but in the real problem I have 30 companies.
There are many methods to perform apportionment, and no objective best method.
The following is in terms of states and seats rather than companies and people. Credit probably goes to Dr. Larry Bowen who is cited on the base site for the first link.
Hamilton’s Method
Also known as the Method of Largest Remainders and sometimes as Vinton's Method.
Procedure:
Calculate the Standard Divisor.
Calculate each state’s Standard Quota.
Initially assign each state its Lower Quota.
If there are surplus seats, give them, one at a time, to states in descending order of the fractional parts of their Standard Quota.
Here, the Standard Divisor can be found by dividing the total population (the sum of the population of each company) by the number of people you want to sample (144 in this case). The Standard Quota is the company's population divided by the Standard Divisor. The Lower Quota is this value rounded down. However, this method has some flaws.
Problems:
The Alabama Paradox: an increase in the total number of seats to be apportioned causes a state to lose a seat.
The Population Paradox: an increase in a state's population can cause it to lose a seat.
The New States Paradox: adding a new state with its fair share of seats can affect the number of seats due other states.
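Hamilton's Method as described above can be sketched directly (company populations stand in for state populations):

```python
import math

def hamilton(populations, seats):
    """Largest-remainders (Hamilton/Vinton) apportionment.
    Returns a list of integer allocations summing to `seats`."""
    divisor = sum(populations) / seats            # Standard Divisor
    quotas = [p / divisor for p in populations]   # Standard Quotas
    alloc = [math.floor(q) for q in quotas]       # Lower Quotas
    surplus = seats - sum(alloc)
    # hand surplus seats out in descending order of fractional part
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - alloc[i], reverse=True)
    for i in order[:surplus]:
        alloc[i] += 1
    return alloc
```

For the question's numbers, hamilton([324, 190], 144) allocates 91 and 53: the quotas are 90.770 and 53.230, and the single surplus seat goes to the larger fractional part.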
This is probably the simplest method to implement. Below are some other methods, with their procedures and drawbacks.
Jefferson’s Method Also known as the Method of Greatest Divisors and in Europe as the Method of d'Hondt or the Hagenbach-Bischoff Method.
Procedure:
Calculate the Standard Divisor.
Calculate each state’s Standard Quota.
Initially assign each state its Lower Quota.
Check to see if the sum of the Lower Quotas is equal to the correct number of seats to be apportioned.
If the sum of the Lower Quotas is equal to the correct number of seats to be apportioned, then apportion to each state the number of seats equal to its Lower Quota.
If the sum of the Lower Quotas is NOT equal to the correct number of seats to be apportioned, then, by trial and error, find a number, MD, called the Modified Divisor to use in place of the Standard Divisor so that when the Modified Quota, MQ, for each state (computed by dividing each State's Population by MD instead of SD) is rounded DOWN, the sum of all the rounded (down) Modified Quotas is the exact number of seats to be apportioned. (Note: The MD will always be smaller than the Standard Divisor.) These rounded (down) Modified Quotas are sometimes called Modified Lower Quotas. Apportion each state its Modified Lower Quota.
Problem:
Violates the Quota Rule. (However, it can only violate Upper Quota—never Lower Quota.)
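The trial-and-error search for the Modified Divisor can be automated, for instance with a bisection over candidate divisors (a sketch; exact ties between divisor intervals may need extra care):

```python
import math

def jefferson(populations, seats):
    """Jefferson / d'Hondt apportionment: find a Modified Divisor
    such that the rounded-down quotas sum to exactly `seats`."""
    lo, hi = 1e-9, float(sum(populations))
    for _ in range(200):
        md = (lo + hi) / 2
        alloc = [math.floor(p / md) for p in populations]
        s = sum(alloc)
        if s == seats:
            return alloc
        if s > seats:
            lo = md   # too many seats allocated: need a larger divisor
        else:
            hi = md   # too few: need a smaller divisor
    raise RuntimeError("no suitable divisor found")
```

Since allocated seats decrease monotonically as the divisor grows, the bisection is guaranteed to land inside a divisor interval that yields the exact seat total whenever one exists.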
Webster’s Method Also known as the Webster-Willcox Method as well as the Method of Major Fractions.
Procedure:
Calculate the Standard Divisor.
Calculate each state’s Standard Quota.
Initially assign a state its Lower Quota if the fractional part of its Standard Quota is less than 0.5. Initially assign a state its Upper Quota if the fractional part of its Standard Quota is greater than or equal to 0.5. [In other words, round down or up based on the arithmetic mean (average).]
Check to see if the sum of the Quotas (Lower and/or Upper from Step 3) is equal to the correct number of seats to be apportioned.
If the sum of the Quotas (Lower and/or Upper from Step 3) is equal to the correct number of seats to be apportioned, then apportion to each state the number of seats equal to its Quota (Lower or Upper from Step 3).
If the sum of the Quotas (Lower and/or Upper from Step 3) is NOT equal to the correct number of seats to be apportioned, then, by trial and error, find a number, MD, called the Modified Divisor to use in place of the Standard Divisor so that when the Modified Quota, MQ, for each state (computed by dividing each State's Population by MD instead of SD) is rounded based on the arithmetic mean (average), the sum of all the rounded Modified Quotas is the exact number of seats to be apportioned. Apportion each state its Modified Rounded Quota.
Problem:
Violates the Quota Rule. (However, violations are rare and are usually associated with contrived situations.)
Huntington-Hill Method Also known as the Method of Equal Proportions.
Current method used to apportion U.S. House
Developed around 1911 by Joseph A. Hill, Chief Statistician of the Bureau of the Census and Edward V. Huntington, Professor of Mechanics & Mathematics, Harvard
Preliminary terminology: The Geometric Mean
Procedure:
Calculate the Standard Divisor.
Calculate each state’s Standard Quota.
Initially assign a state its Lower Quota if its Standard Quota is less than the Geometric Mean of the two whole numbers that the Standard Quota is immediately between (for example, 16.47 is immediately between 16 and 17, with geometric mean ≈ 16.492). Initially assign a state its Upper Quota if its Standard Quota is greater than or equal to that Geometric Mean. [In other words, round down or up based on the geometric mean.]
Check to see if the sum of the Quotas (Lower and/or Upper from Step 3) is equal to the correct number of seats to be apportioned.
If the sum of the Quotas (Lower and/or Upper from Step 3) is equal to the correct number of seats to be apportioned, then apportion to each state the number of seats equal to its Quota (Lower or Upper from Step 3).
If the sum of the Quotas (Lower and/or Upper from Step 3) is NOT equal to the correct number of seats to be apportioned, then, by trial and error, find a number, MD, called the Modified Divisor to use in place of the Standard Divisor so that when the Modified Quota, MQ, for each state (computed by dividing each State's Population by MD instead of SD) is rounded based on the geometric mean, the sum of all the rounded Modified Quotas is the exact number of seats to be apportioned. Apportion each state its Modified Rounded Quota.
Problem:
Violates the Quota Rule.
For reference, the Quota Rule :
Quota Rule
An apportionment method that always allocates only lower and/or upper bounds follows the quota rule.
The problem can be framed as that of finding the closest integer approximation to a set of ratios. For instance, if you want to assign respectively A, B, C ≥ 0 members from 3 groups to match the ratios a, b, c ≥ 0 (with a + b + c = N > 0), where N = A + B + C > 0 is the total allocation desired, then you're approximating (a, b, c) by (A, B, C) with A, B and C restricted to integers.
One way to solve this may be to set it up as a least squares problem - that of minimizing |a - A|² + |b - B|² + |c - C|²; subject to the constraints A + B + C = N and A, B, C ≥ 0.
A necessary condition for the optimum is that it be a local optimum with respect to discrete unit changes. For instance, (A,B,C) → (A+1,B-1,C), if B > 0 ... which entails the condition (A - B ≥ a - b - 1 or B = 0).
For the situation at hand, the optimization problem is to minimize:
|A - a|² + |B - b|²
a = 144×324/(324+190) ≅ 90.770, b = 144×190/(324+190) ≅ 53.230
which leads to the conditions:
A - B ≥ a - b - 1 ≅ +36.541 or B = 0
B - A ≥ b - a - 1 ≅ -38.541 or A = 0
A + B = 144
Since they are integers the inequalities can be strengthened:
A - B ≥ +37 or B = 0
B - A ≥ -38 or A = 0
A + B = 144
The boundary cases A = 0 and B = 0 are ruled out, since they don't satisfy all three conditions. So, you're left with 37 ≤ A - B ≤ 38 or, since A + B = 144: 181 ≤ 2A ≤ 182 or A = 91 ... and B = 53.
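As a sanity check, the least-squares optimum can be brute-forced over all integer splits of the 144 selections (numbers from the example above):

```python
# Minimize (A - a)^2 + (B - b)^2 over integers A + B = 144, A, B >= 0,
# where a and b are the exact proportional targets for the two companies.
a = 144 * 324 / 514   # ~ 90.770
b = 144 * 190 / 514   # ~ 53.230
best = min(((A - a) ** 2 + ((144 - A) - b) ** 2, A) for A in range(145))
A = best[1]
B = 144 - A
```

The search confirms A = 91 and B = 53, matching the closed-form argument.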
It is quite possible that this way of framing the problem may be equivalent, in terms of its results, to one of the algorithms cited in an earlier reply.
My suggestion is to just take 28% of each company and round up to the nearest person.
In your case, you would go with 91 and 54. Admittedly, this does result in having a bit over 28%.
The most accurate method is as follows:
Calculate the exact number that you want.
Take 28% for each company and round down.
Sort the companies in descending order by the remainder.
Go through the list and choose the top n elements until you get exactly the number you want.
Since originally posting this question, I have come across a description of this exact problem in Martin Fowler's book "Patterns of Enterprise Application Architecture" (pages 489 and 490).
Martin talks about "Matt Foemmel's simple conundrum" of dividing 5 cents between two accounts while obeying a 70%/30% distribution. This describes my problem in a much simpler way.
Here are the solutions he presents in his book to that problem:
Perhaps the most common is to ignore it: after all, it's only a penny here and there. However, this tends to make accountants understandably nervous.
When allocating, you always do the last allocation by subtracting from what you've allocated so far. This avoids losing pennies, but you can get a cumulative amount of pennies on the last allocation.
Allow users of a Money class to declare the rounding scheme when they call the method. This permits a programmer to say that the 70% case rounds up and the 30% rounds down. Things can get complicated when you allocate across ten accounts instead of two. You also have to remember to round. To encourage people to remember, I've seen some Money classes force a rounding parameter into the multiply operation. Not only does this force the programmer to think about what rounding she needs, it also might remind her of the tests to write. However, it gets messy if you have a lot of tax calculations that all round the same way.
My favorite solution: have an allocator function on the money. The parameter to the allocator is a list of numbers, representing the ratio to be allocated (it would look something like aMoney.allocate([7,3])). The allocator returns a list of monies, guaranteeing that no pennies get dropped by scattering them across the allocated monies in a way that looks pseudo-random from the outside. The allocator has faults: you have to remember to use it, and any precise rules about where the pennies go are difficult to enforce.
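A minimal sketch of such an allocator, simplified in two ways: amounts are plain integer cents rather than a Money class, and the leftover pennies go to the first shares instead of being scattered pseudo-randomly:

```python
def allocate(amount_cents, ratios):
    """Split `amount_cents` according to `ratios` without losing a cent.
    Each share is first rounded down, then the leftover cents are
    handed out one per share starting from the front of the list."""
    total = sum(ratios)
    shares = [amount_cents * r // total for r in ratios]
    remainder = amount_cents - sum(shares)
    for i in range(remainder):      # distribute the leftover pennies
        shares[i] += 1
    return shares
```

For Foemmel's conundrum, allocate(5, [7, 3]) yields [4, 1]: nothing is dropped, and the split is as close to 70/30 as whole cents allow.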

How do I calculate the most profit-dense combination in the most efficient way?

I have a combinations problem that's bothering me. I'd like someone to give me their thoughts and point out if I'm missing some obvious solution that I may have overlooked.
Let's say that there is a shop that buys all of its supplies from one supplier. The supplier has a list of items for sale. Each item has the following attributes:
size, cost, quantity, m, b
m and b are constants in the following equation:
sales = m * (price) + b
This line slopes downward. The equation tells me how many of that item I will be able to sell if I charge that particular price. Each item has its own m and b values.
Let's say that the shop has limited storage space, and limited funds. The shop wants to fill its warehouse with the most profit-dense items possible.
(By the way, profit density = profit/size. I'm defining profit density only with regard to the item's size. I could work with the density with regard to size and cost, but to do that I'd have to know the cost of warehouse space. That's not a number I know currently, so I'm just going to use size.)
The profit density of items drops the more you buy (see below.)
If I flip the line equation, I can see what price I'd have to charge to sell some given amount of the item in some given period of time.
price = (sales-b)/m
So if I buy n items and want to sell all of them in that period, I'd have to charge
price = (n-b)/m
The revenue from this would be
price*n = n*(n-b)/m
The profit would be
price*n-n*cost = n*(n-b)/m - n*cost
and the profit-density would be
(n*(n-b)/m - n*cost)/(n*size)
or, equivalently
((n-b)/m - cost)/size
So let's say I have a table containing every available item, and each item's profit-density.
The question is, how many of each item do I buy in order to maximise the amount of money that the shop makes?
One possibility is to generate every possible combination of items within the bounds of cost and space, and choose the combo with the highest profitability. In a list of 1000 items, this takes too long. (I tried this and it took 17 seconds for a list of 1000. Horrible.)
Another option I tried (on paper) was to take the top two most profitable items on the list. Let's call the most profitable item A, the 2nd-most profitable item B, and the 3rd-most profitable item C. I buy as many of item A as I can until it's less profitable than item B. Then I repeat this process using B and C, for every item in the list.
It might be the case however, that after buying item B, item A is again the most profitable item, more so than C. So this would involve hopping from the current most profitable item to the next until the resources are exhausted. I could do this, but it seems like an ugly way to do it.
I considered dynamic programming, but since the profit-densities of the items change depending on the amount you buy, I couldn't come up with a resolution for this.
I've considered multiple-linear regression, and by 'consider' I mean I've said to myself "is multi-linear regression an option?" and then done nothing with it.
My spidey-sense tells me that there's a far more obvious method staring me in the face, but I'm not seeing it. Please help me kick myself and facepalm at the same time.
If you treat this as a simple exercise in multivariate optimization, where the controllable variables are the quantities bought, then you are optimizing a quadratic function subject to a linear constraint.
If you use a Lagrange multiplier and differentiate then you get a linear equation for each quantity variable involving itself and the Lagrange multiplier as the only unknowns, and the constraint gives you a single linear equation involving all of the quantities. So write each quantity as a linear function of the Lagrange multiplier and substitute into the constraint equation to get a linear equation in the Lagrange multiplier. Solve this and then plug the Lagrange multiplier into the simpler equations to get the quantities.
This gives you a solution if you are allowed to buy fractional and negative quantities of things if required. Clearly you are not, but you might hope that nothing is very negative and you can round the non-integer quantities to get a reasonable answer. If this isn't good enough for you, you could use it as a basis for branch and bound. If you make an assumption on the value of one of the quantities and solve for the others in this way, you get an upper bound on the possible best answer - the profit predicted neglecting real world constraints on non-negativity and integer values will always be at least the profit earned if you have to comply with these constraints.
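Under the stated assumptions (continuous quantities, space constraint only, per-item profit n*(n-b)/m - n*cost from the question), the Lagrange computation reduces to one linear equation for the multiplier. A sketch:

```python
def optimal_quantities(items, space):
    """Lagrange-multiplier sketch, space constraint only.
    Each item is (size, cost, m, b) with m < 0 (demand slopes down).
    Setting d/dn [n*(n-b)/m - n*cost - lam*size*n] = 0 gives
    n = (m*(cost + lam*size) + b) / 2, which is linear in lam, so the
    constraint sum(size*n) = space yields lam in closed form.
    Quantities may come out fractional or negative; round/clamp after,
    or use this as the bound inside branch and bound."""
    const = sum(s * (m * c + b) / 2 for s, c, m, b in items)
    slope = sum(s * s * m / 2 for s, c, m, b in items)
    lam = (space - const) / slope
    return [(m * (c + lam * s) + b) / 2 for s, c, m, b in items]
```

By construction the returned quantities exactly fill the available space; the rounding step afterwards is where the approximation (or branch and bound) comes in.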
You can treat this as a dynamic programming exercise, to make the best use of a limited resource.
As a simple example, consider just satisfying the constraint on space and ignoring that on cost. Then you want to find the items that generate the most profit for the available space. Choose units so that expressing the space used as an integer is reasonable, and then, for i = 1 to number of items, work out, for each integer value of space up to the limit, the selection of the first i items that gives the most return for that amount of space. As usual, you can work out the answers for i+1 from the answers for i: for each value from 0 up to the limit on space just consider all possible quantities of the i+1th item up to that amount of space, and work out the combined return from using that quantity of the item and then using the remaining space according to the answers you have already worked out for the first i items. When i reaches the total number of items you will be working out the best possible return for the problem you actually want to solve.
If you have constraints for both space and cost, then the state of the dynamic program is not the single variable (space) but a pair of variables (space, cost) but you can still solve it, although with more work. Consider all possible values of (space, cost) from (0, 0) up to the actual constraints - you have a 2-dimensional table of returns to compute instead of a single set of values from 0 to max-space. But you can still work from i=1 to N, computing the highest possible return for the first i items for each limit of (space, cost) and using the answers for i to compute the answers for i+1.
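The space-only version of the dynamic program described above can be sketched as follows (integer space units; profit_fn stands for the per-item profit formula from the question):

```python
def best_return(items, space):
    """Dynamic program over a single limited resource (space).
    `items` is a list of (size, profit_fn), where profit_fn(q) is the
    total profit from buying and selling q units of that item.
    best[s] = highest profit achievable with s units of space using
    the items processed so far."""
    best = [0.0] * (space + 1)
    for size, profit_fn in items:
        new = best[:]
        for s in range(space + 1):
            q = 1
            while q * size <= s:
                # q units of this item plus the best use of the rest
                new[s] = max(new[s], profit_fn(q) + best[s - q * size])
                q += 1
        best = new
    return best[space]
```

Extending this to the (space, cost) pair constraint means indexing `best` by two dimensions instead of one, exactly as the answer describes.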

Open-ended tournament pairing algorithm

I'm developing a tournament model for a virtual city commerce game (Urbien.com) and would love to get some algorithm suggestions. Here's the scenario and current "basic" implementation:
Scenario
Entries are paired up duel-style, like on the original Facemash or Pixoto.com.
The "player" is a judge, who gets a stream of dueling pairs and must choose a winner for each pair.
Tournaments never end, people can submit new entries at any time, and winners of the day/week/month/millennium are chosen based on the data at that date.
Problems to be solved
Rating algorithm - how to rate tournament entries and how to adjust their ratings after each match?
Pairing algorithm - how to choose the next pair to feed the player?
Current solution
Rating algorithm - the Elo rating system currently used in chess and other tournaments.
Pairing algorithm - our current algorithm recognizes two imperatives:
Give more duels to entries that have had less duels so far
Match people with similar ratings with higher probability
Given:
N = total number of entries in the tournament
D = total number of duels played in the tournament so far by all players
Dx = how many duels player x has had so far
To choose players x and y to duel, we first choose player x with probability:
p(x) = (1 - (Dx / D)) / N
Then choose player y the following way:
Sort the players by rating
Let the probability of choosing player j at index jIdx in the sorted list be:
p(j) = ...
0, if (j == x)
n*r^abs(jIdx - xIdx) otherwise
where 0 < r < 1 is a coefficient to be chosen, and n is a normalization factor.
Basically the probabilities in either direction from x form a geometric series, normalized so they sum to 1.
Concerns
Maximize informational value of a duel - pairing the lowest rated entry against the highest rated entry is very unlikely to give you any useful information.
Speed - we don't want to do massive amounts of calculations just to choose one pair. One alternative is to use something like the Swiss pairing system and pair up all entries at once, instead of choosing new duels one at a time. This has the drawback (?) that all entries submitted in a given timeframe will experience roughly the same amount of duels, which may or may not be desirable.
Equilibrium - Pixoto's ImageDuel algorithm detects when entries are unlikely to further improve their rating and gives them less duels from then on. The benefits of such detection are debatable. On the one hand, you can save on computation if you "pause" half the entries. On the other hand, entries with established ratings may be the perfect matches for new entries, to establish the newbies' ratings.
Number of entries - if there are just a few entries, say 10, perhaps a simpler algorithm should be used.
Wins/Losses - how does the player's win/loss ratio affect the next pairing, if at all?
Storage - what to store about each entry and about the tournament itself? Currently stored:
Tournament Entry: # duels so far, # wins, # losses, rating
Tournament: # duels so far, # entries
Instead of throwing in Elo and ad-hoc probability formulae, you could use a standard approach based on the maximum likelihood method.
The maximum likelihood method is a method for parameter estimation, and it works like this. Every contestant (player) is assigned a parameter s[i] (1 <= i <= N, where N is the total number of contestants) that measures the strength or skill of that player. You pick a formula that maps the strengths of two players to the probability that the first player wins. For example,
P(i, j) = 1/(1 + exp(s[j] - s[i]))
which is the logistic curve (see http://en.wikipedia.org/wiki/Sigmoid_function). When you have then a table that shows the actual results between the users, you use global optimization (e.g. gradient descent) to find those strength parameters s[1] .. s[N] that maximize the probability of the actually observed match result. E.g. if you have three contestants and have observed two results:
Player 1 won over Player 2
Player 2 won over Player 3
then you find parameters s[1], s[2], s[3] that maximize the value of the product
P(1, 2) * P(2, 3)
Incidentally, it can be easier to maximize
log P(1, 2) + log P(2, 3)
Note that if you use something like the logistic curve, only the difference of the strength parameters matters, so you need to anchor the values somewhere, e.g. choose arbitrarily
s[1] = 0
In order to have more recent matches "weigh" more, you can adjust the importance of the match results based on their age. If t measures the time since a match took place (in some time units), you can maximize the value of the sum (using the example)
e^-t log P(1, 2) + e^-t' log P(2, 3)
where t and t' are the ages of the matches 1-2 and 2-3, so that those games that occurred more recently weigh more.
The interesting thing in this approach is that when the strength parameters have values, the P(...) formula can be used immediately to calculate the win/lose probability for any future match. To pair contestants, you can pair those where the P(...) value is close to 0.5, and then prefer those contestants whose time-adjusted number of matches (sum of e^-t1 + e^-t2 + ...) for match ages t1, t2, ... is low. The best thing would be to calculate the total impact of a win or loss between two players globally and then prefer those matches that have the largest expected impact on the ratings, but that could require lots of calculations.
You don't need to run the maximum likelihood estimation / global optimization algorithm all the time; you can run it e.g. once a day as a batch job and use the results for the next day's matching. The time-adjusted match counts (the sums of e^-t terms above) can be updated in real time anyway.
On the algorithm side, you can sort the players after the maximum likelihood run based on their s parameter, so it's very easy to find equal-strength players quickly.
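A minimal sketch of the maximum-likelihood fit by gradient ascent, using the logistic model P(i, j) = 1/(1 + exp(s[j] - s[i])) from above. The learning rate and step count are arbitrary choices; a real implementation would add a convergence test or regularization, and the time-decay weights are omitted for brevity:

```python
import math

def fit_strengths(n_players, results, steps=2000, lr=0.1):
    """Maximize sum of log P(winner beats loser) over the observed
    (winner, loser) index pairs in `results` by gradient ascent.
    s[0] is anchored at 0, since only strength differences matter."""
    s = [0.0] * n_players
    for _ in range(steps):
        grad = [0.0] * n_players
        for w, l in results:
            p = 1.0 / (1.0 + math.exp(s[l] - s[w]))  # P(w beats l)
            grad[w] += 1 - p    # d log P / d s[w]
            grad[l] -= 1 - p    # d log P / d s[l]
        for i in range(1, n_players):   # skip i=0 to keep the anchor
            s[i] += lr * grad[i]
    return s
```

With the two observed results from the example (player 1 beat player 2, player 2 beat player 3), the fitted strengths come out ordered s[1] > s[2] > s[3], and the P(...) formula can then predict any future match.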

Calculating annual percentage rate when repayment amount is not constant

I need to calculate APR for loans. Constant-repayment loans were covered clearly here: Calculating annual percentage rate (need some help with inherited code)
My problem is where the repayment amount is not constant. The monthly repayments can differ and therefore Newton-Raphson does not seem applicable.
The formula is still 0 = loan amount - sum[Rp/(1+x)^p] where Rp is the repayment amount for repayment p. There are n repayments. Is there a way to solve this or is there a good way to make a good second guess to x based on the results of previous guesses?
It sounds like you're given the Rp values and want to calculate x. You can just use Newton-Raphson as before - the question you linked to showed you how to do that.
For this one, you just need to change your F(x) and F'(x) functions.
F(x) = loan amount - sum[Rp/(1+x)^p]
You'll have to write code with a little loop in it to do the sum.
F'(x) = sum[p*Rp/(1+x)^(p+1)]
A little loop there and you're set.
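Putting the pieces together, a Newton-Raphson sketch for the per-period rate x, with F and F' as above (the starting guess and iteration cap are arbitrary choices):

```python
def periodic_rate(loan, repayments, guess=0.05, tol=1e-10):
    """Newton-Raphson for x in 0 = loan - sum(R_p / (1+x)^p),
    where repayments[p-1] is the repayment R_p in period p."""
    x = guess
    for _ in range(100):
        f = loan - sum(r / (1 + x) ** p
                       for p, r in enumerate(repayments, start=1))
        fprime = sum(p * r / (1 + x) ** (p + 1)
                     for p, r in enumerate(repayments, start=1))
        step = f / fprime
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge")
```

For example, a loan of 100 repaid with a single payment of 110 one period later gives x = 0.10, as expected; uneven repayment schedules work the same way since the sums accept any list.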
