How to split payments equally when accounting for transaction fees - algorithm

I'm trying to figure out an algorithm for splitting a payment equally when there is a transaction fee involved in paying the parties.
Assume there are 5 parties, and 1 of them receives $1000 that needs to be split equally between the 5. How much should party 1 send each of the remaining 4 people, accounting for a hypothetical 1.9% + $0.10 fee, such that each of the 5 people has the same balance at the end? In this scenario, party 1 can only make N-1 transactions, i.e., 4 transactions.
Any help would be greatly appreciated!

Let's define some names:
n : the number of parties
r : the fee coefficient (a coefficient between 0 and 1)
c : the fee constant, applied after the coefficient is applied
a : the initial amount
g : the gross payment made in each transaction
p : the net amount that everyone will have in the end
We are asked to derive g for given n, r, c and a with the following constraints:
In a transaction, the received amount (p) is the paid amount (g) with fees applied:
      p = g(1 − r) − c
After making the n − 1 transactions (4 in the example), the payer is left with the same amount as the receivers have (p):
      p = a − g(n − 1)
This set of equalities can be resolved as follows:
      g(1 − r) − c = a − g(n − 1)
      ⇔ g(1 − r) + g(n − 1) = a + c
      ⇔ g(n − r) = a + c
      ⇔ g = (a + c) / (n − r)
For the given example, we have this input:
      n = 5
      r = 0.019
      c = 0.10
      a = 1000
The result is thus:
      g = (a + c) / (n − r) = (1000 + 0.10) / (5 − 0.019) = 200.78297530616342...
Verification of constraints:
      p = g(1 − r) − c = 200.78297530616342 * (1 − 0.019) − 0.10 = 196.8680987753463...
      p = a − g(n − 1) = 1000 − 200.78297530616342 * (5 − 1) = 196.8680987753463...
As we deal with dollars, we must round to the cent, and so there will be slight divergence.
The first party will pay 200.78 in 4 transactions and after these, each party will have 196.87, except the first party; they will have one cent more: 196.88
Snippet
Here is a little runnable code, where you can input the parameters and see the result:
const roundCents = x => Math.round(x * 100) / 100;

// Apply formula, but rounded to the cent
function solve (n, r, c, a) {
    const g = roundCents((a + c) / (n - r));
    const p1 = roundCents(g * (1 - r) - c);
    const p2 = roundCents(a - g * (n - 1));
    return [g, p1, p2];
}

// I/O management
document.addEventListener("input", refresh);
const inputs = document.querySelectorAll("input");
const outputs = document.querySelectorAll("span");

function refresh() {
    const [n, pct, c, a] = Array.from(inputs, input => +input.value);
    const results = solve(n, pct/100, c, a);
    outputs.forEach((output, i) => output.textContent = results[i].toFixed(2));
}

refresh();
input { width: 5em}
Number of parties: <input type="number" value="5" min="2"><br>
Fee percentage: <input type="number" value="1.9" min="0" max="100" step="0.1"><br>
Fee constant: <input type="number" value="0.10" min="0" step="0.01"><br>
Initial amount: <input type="number" value="1000" min="0" step="0.01"><br>
<hr>
Gross payments: $<span></span><br>
Net received: $<span></span><br>
Net remaining: $<span></span><br>
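For comparison, here is a minimal Python sketch of the same formula that does the rounding with the standard `decimal` module instead of binary floats. The function name `split_with_fees`, the helper `to_cents` and the choice of half-up rounding are assumptions made for illustration; a real payment provider may round its fees differently.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(x):
    """Round a Decimal to whole cents (half-up is an assumption)."""
    return x.quantize(CENT, rounding=ROUND_HALF_UP)


def split_with_fees(n, rate, fixed, amount):
    """Return (gross payment, net received, payer's remainder).

    n      -- number of parties
    rate   -- fee coefficient r, e.g. "0.019" for 1.9%
    fixed  -- fee constant c, e.g. "0.10"
    amount -- initial amount a
    """
    n = Decimal(n)
    rate, fixed, amount = Decimal(rate), Decimal(fixed), Decimal(amount)
    # g = (a + c) / (n - r), rounded to the cent
    gross = to_cents((amount + fixed) / (n - rate))
    # p = g(1 - r) - c, what each receiver ends up with
    received = to_cents(gross * (1 - rate) - fixed)
    # What the payer keeps after n - 1 payments of the rounded gross amount
    remaining = amount - gross * (n - 1)
    return gross, received, remaining


gross, received, remaining = split_with_fees(5, "0.019", "0.10", "1000")
print(gross, received, remaining)
```

Passing the fee parameters as strings keeps them exact; passing floats such as 0.019 would reintroduce binary rounding before the formula is even applied.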

Related

Why don't we include 0 matches while calculating jaccard distance between binary numbers?

I am working on a program based on Jaccard Distance, and I need to calculate the Jaccard Distance between two binary bit vectors. I came across the following on the net:
If p1 = 10111 and p2 = 10011,
The total number of each combination attributes for p1 and p2:
M11 = total number of attributes where p1 & p2 have a value 1,
M01 = total number of attributes where p1 has a value 0 & p2 has a value 1,
M10 = total number of attributes where p1 has a value 1 & p2 has a value 0,
M00 = total number of attributes where p1 & p2 have a value 0.
Jaccard similarity coefficient = J =
intersection/union = M11/(M01 + M10 + M11)
= 3 / (0 + 1 + 3) = 3/4,
Jaccard distance = J' = 1 - J = 1 - 3/4 = 1/4,
Or J' = 1 - (M11/(M01 + M10 + M11)) = (M01 + M10)/(M01 + M10 + M11)
= (0 + 1)/(0 + 1 + 3) = 1/4
Now, while calculating the coefficient, why was "M00" not included in the denominator? Can anyone please explain?
The Jaccard coefficient is a measure for asymmetric binary attributes, for example, a scenario where the presence of an item is more important than its absence.
Since M00 deals only with absence, we do not consider it while calculating the Jaccard coefficient.
For example, while checking for the presence/absence of a disease, the presence of the disease is the more significant outcome.
Hope it helps!
The Jaccard index of A and B is |A∩B|/|A∪B| = |A∩B|/(|A| + |B| - |A∩B|).
We have: |A∩B| = M11, |A| = M11 + M10, |B| = M11 + M01.
So |A∩B|/(|A| + |B| - |A∩B|) = M11 / (M11 + M10 + M11 + M01 - M11) = M11 / (M10 + M01 + M11).
This Venn diagram may help:
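As a quick numeric check of the set-based formula above, the following Python sketch counts M11, M10, M01 and M00 for two bit strings and computes the similarity from them. The function names `match_counts` and `jaccard_similarity` are made up for this illustration, and treating the all-zero case as an error is an assumption.

```python
from fractions import Fraction


def match_counts(p1, p2):
    """Count the four attribute combinations for two equal-length bit strings."""
    if len(p1) != len(p2):
        raise ValueError("bit vectors must have the same length")
    counts = {"11": 0, "10": 0, "01": 0, "00": 0}
    for a, b in zip(p1, p2):
        counts[a + b] += 1
    return counts


def jaccard_similarity(p1, p2):
    """M11 / (M01 + M10 + M11); M00 never enters the formula."""
    m = match_counts(p1, p2)
    union = m["11"] + m["10"] + m["01"]
    if union == 0:
        # Both vectors are all zeros: |A ∪ B| = 0, so the ratio is undefined.
        raise ValueError("Jaccard similarity is undefined for two empty sets")
    return Fraction(m["11"], union)


j = jaccard_similarity("10111", "10011")
print(j, 1 - j)  # similarity and distance for the example in the question
```

Appending any number of shared zeros to both vectors changes M00 only, so the result stays the same, which is exactly why M00 is left out of the denominator.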

Number of ways of distributing n identical balls into groups such that each group has atleast k balls?

I am trying to do this using recursion with memoization. I have identified the following base cases.
I) When n == k, there is only one group, with all the balls.
II) When k > n, no group can have at least k balls, hence zero.
I am unable to move forward from here. How can this be done?
As an illustration, when n = 6 and k = 2:
(2,2,2)
(4,2)
(3,3)
(6)
That is 4 different groupings can be formed.
This can be represented by the two dimensional recursive formula described below:
T(0, k) = 1
T(n, k) = 0                            if n < k, n != 0
T(n, k) = T(n-k, k) + T(n, k + 1)      otherwise
          ^           ^
          |           No box with k balls, advance to next k
          There is a box with k balls, put them
In the above, T(n,k) is the number of distributions of n balls such that each box gets at least k.
The trick is to think of k as the lowest possible number of balls and to separate the problem into two scenarios: either there is a box with exactly k balls (if so, place them and recurse with n-k balls), or there is not (and then recurse with a minimal value of k+1 and the same number of balls).
Example, to calculate your example: T(6,2) (6 balls, minimum 2 per box):
T(6,2) = T(4,2) + T(6,3)
T(4,2) = T(2,2) + T(4,3) = T(0,2) + T(2,3) + T(1,3) + T(4,4) =
= T(0,2) + T(2,3) + T(1,3) + T(0,4) + T(4,5) =
= 1 + 0 + 0 + 1 + 0
= 2
T(6,3) = T(3,3) + T(6,4) = T(0,3) + T(3,4) + T(2,4) + T(6,5)
= T(0,3) + T(3,4) + T(2,4) + T(1,5) + T(6,6) =
= T(0,3) + T(3,4) + T(2,4) + T(1,5) + T(0,6) + T(6,7) =
= 1 + 0 + 0 + 0 + 1 + 0
= 2
T(6,2) = T(4,2) + T(6,3) = 2 + 2 = 4
Using Dynamic Programming, it can be calculated in O(n^2) time.
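The recurrence above translates almost directly into a memoized function. This is a minimal Python sketch; the use of `functools.lru_cache` for the memo table is a choice made here, and k >= 1 is assumed (with k = 0 the first branch would never shrink n).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n, k):
    """Number of ways to split n identical balls into groups of at least k balls.

    Assumes k >= 1. T(0, k) counts the single "nothing left to place" outcome.
    """
    if n == 0:
        return 1
    if n < k:
        return 0
    # Either one group has exactly k balls, or every group has at least k + 1.
    return T(n - k, k) + T(n, k + 1)


print(T(6, 2))
```

Each distinct pair (n, k) is evaluated once, which matches the quadratic bound mentioned above.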
This case can be solved quite simply:
Number of buckets
The maximum-number of buckets b can be determined as follows:
b = roundDown(n / k)
Each valid distribution can use at most b buckets.
Number of distributions with x buckets
For a given number of buckets, the number of distributions can be found quite simply:
Distribute k balls to each bucket. Find the number of ways to distribute the remaining balls (r = n - k * x) to x buckets:
total_distributions(x) = bincoefficient(r + x - 1 , x - 1), with r = n - k * x
EDIT: this will only work if order matters. Since it doesn't for the question, we can use a few tricks here:
Each distribution can be mapped to a sequence of numbers, e.g. d = {d1 , d2 , ... , dx}. We can easily generate all of these sequences by starting with the "first" sequence {r , 0 , ... , 0} and subsequently moving 1s from the left to the right. So the next sequence would look like this: {r - 1 , 1 , ... , 0}. If only sequences matching d1 >= d2 >= ... >= dx are generated, no duplicates will be generated. This constraint can easily be used to optimize the search a bit: we can only move a 1 from da to db (with a = b - 1) if da - 1 >= db + 1 holds, since otherwise the constraint that the array is sorted is violated. The 1s to move are always the rightmost that can be moved. Another way to think of this would be to view r as a unary number and simply split that string into groups such that each group is at least as long as its successor.
countSequences(x)
    sequence[]
    sequence[0] = r
    sequenceCount = 1
    while true
        int i = findRightmostMoveable(sequence)
        if i == -1
            return sequenceCount
        sequence[i] -= 1
        sequence[i + 1] += 1
        sequenceCount += 1

findRightmostMoveable(sequence)
    for i in [length(sequence) - 1 , 0)
        if sequence[i - 1] > sequence[i] + 1
            return i - 1
    return -1
Actually findRightmostMoveable could be optimized a bit, if we look at the structure-transitions of the sequence (to be more precise the difference between two elements of the sequence). But to be honest I'm by far too lazy to optimize this further.
Putting the pieces together
range(1 , roundDown(n / k)).map(b -> countSequences(b)).sum()
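As a cross-check that does not depend on the enumeration order, the per-bucket count can also be computed directly: the non-increasing sequences for x buckets are exactly the partitions of r = n - k * x into at most x parts. The Python sketch below swaps in the standard partition recurrence for the sequence walk; the names `partitions_at_most` and `count_groupings` are assumptions. It seems worth having, because a hand trace of the "move the rightmost 1" rule for r = 6 and x = 3 reaches neither {3, 3, 0} nor {2, 2, 2}.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_at_most(r, x):
    """Ways to write r as a sum of at most x positive parts, order ignored."""
    if r == 0:
        return 1
    if r < 0 or x == 0:
        return 0
    # Either fewer than x parts are used, or all x parts are non-empty,
    # in which case one ball can be removed from each of them.
    return partitions_at_most(r, x - 1) + partitions_at_most(r - x, x)


def count_groupings(n, k):
    """Sum over the number of buckets x, after giving k balls to each bucket."""
    return sum(partitions_at_most(n - k * x, x) for x in range(1, n // k + 1))


print(count_groupings(6, 2))
```

For n = 6 and k = 2 the three terms are 1, 2 and 1 (for one, two and three buckets), matching the four groupings listed in the question.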

Given an integer N, in how many ways can we tile a 4 x N rectangle with 3 x 1 tiles?

I know tiling problems are not uncommon and they are usually solved with dynamic programming. I've also read a quite similar question here about tiling a 3xN rectangle with 2x1 tiles, but I still have a problem figuring out the recurrence relations.
Right now I know you can tile a 4x3 rectangle with 3x1 tiles in 3 ways. So there's a relation here like f(n) = 3*f(n-3) + [other]. I've been scratching my head for a while now to figure out what the 'other' part should be. Some help would be greatly appreciated.
I do not think you can get a closed-form solution to this easily. Instead, you can do a bitmask DP that takes the state of the last 4x3 block with the first column unfulfilled.
The reason you can't get a closed-form solution is because of tilings like these:
======
||===|
||===|
||===|
And depending on the next tile you place your bitmask will change and you will be able to get some sort of recursive algorithm. You can read more about bitmask DPs here
http://www.quora.com/Algorithms/How-can-we-cover-an-MxN-area-with-minimum-cost-from-a-set-of-tiles-having-different-dimensions-and-different-cost
Following the link you've posted, I've tried to reconstruct the recursive relation so that it fits 4xN rectangles and 3x1 tiles. This is what I've got:
******** AAA******* BBB****** A*******
******** = BBB******* + A******** + A*******
******** CCC******* A******** A*******
******** DDD******* A******** BBB*****
f(n) = f(n-3) + g(n-1) + g(n-1)
******** AB******* AAA****** ABBB******
******** AB******* BBB****** ACCC******
******** = AB******* + CCC****** + ADDD******
****** ******* ******* ********
g(n) = f(n-2) + h(n-2) + i(n-2)
******** AAA******
******* ********
******* = ********
******* ********
h(n) = g(n-1)
******** AAA******
****** *******
****** = *******
****** *******
i(n) = j(n-2)
****** ******* ********
******* A******* AAA******
******* = A******* + BBB******
******* A******* CCC******
j(n) = f(n-1) + i(n-1)
From that we get:
f(n) = f(n-3) + 2*g(n-1)
g(n) = f(n-2) + h(n-2) + i(n-2) ==> g(n) = f(n-2) + g(n-3) + i(n-2)
h(n) = g(n-1)
i(n) = j(n-2)
j(n) = f(n-1) + i(n-1)
And the stopping conditions for the functions are:
f(0) = 1, f(1) = 0, f(2) = 0
g(0) = 0, g(1) = 1, g(2) = 1
i(0) = 0, i(1) = 0
j(0) = 0
Hope this helps!
To prove that a recurrence is correct, you need to work out the cases as Ron did, but if you just want to know what it is experimentally, then a couple of terms (depending on the degree of the recurrence) may suffice. The first few are 1, 3, 13, 57, 249, 1087, 4745. Then you can solve for the coefficients with linear algebra.
[ 1  3  13] [x]   [  57]
[ 3 13  57] [y] = [ 249]
[13 57 249] [z]   [1087]
The solution is x = 1 and y = -3 and z = 5. We can now verify that 57 - 3*249 + 5*1087 = 4745, and OEIS confirms (another great resource) that the recurrence indeed is T(N) = 5 T(N - 1) - 3 T(N - 2) + T(N - 3). Here's the Python code I used.
import numpy

memo = {frozenset(): 1}

def memoized_ntilings(s, k=3):
    if (s in memo):
        return memo[s]
    (x, y) = min(s)
    n = 0
    h = frozenset((((x + i), y) for i in range(k)))
    if h.issubset(s):
        n += memoized_ntilings((s - h), k)
    v = frozenset(((x, (y + i)) for i in range(k)))
    if v.issubset(s):
        n += memoized_ntilings((s - v), k)
    memo[s] = n
    return n

def ntilings(n, m=4):
    return memoized_ntilings(frozenset(((x, y) for x in range(n) for y in range(m))))

def fibonacci(n):
    (a, b) = (0, 1)
    for i in range(n):
        (a, b) = (b, (a + b))
    return a

def guess_recurrence(callable):
    degree = 1
    while True:
        ab = numpy.array([[callable((i + j)) for i in range((degree + 2))] for j in range((degree + 5))])
        a = ab[:, :(- 1)]
        b = ab[:, (- 1)]
        result = numpy.linalg.lstsq(a, b)
        x = result[0]
        residuals = result[1]
        if (numpy.linalg.norm(residuals) < (1e-12 * numpy.linalg.norm(b))):
            return x
        degree += 1

if (__name__ == '__main__'):
    print(guess_recurrence(fibonacci))
    print(guess_recurrence((lambda n: ntilings((n * 3)))))
The output is the following.
[ 1. 1.]
[ 1. -3. 5.]
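The recurrence T(N) = 5 T(N - 1) - 3 T(N - 2) + T(N - 3) quoted above can also be checked against the listed terms with plain integer arithmetic, without NumPy. This is only a small sketch; the helper names `satisfies_recurrence` and `extend` are made up here, and N counts blocks of three columns as in the answer.

```python
# First terms quoted in the answer (number of tilings of a 4 x 3N rectangle).
TERMS = [1, 3, 13, 57, 249, 1087, 4745]


def satisfies_recurrence(terms):
    """Check T(N) = 5 T(N-1) - 3 T(N-2) + T(N-3) for every N >= 3."""
    return all(
        terms[i] == 5 * terms[i - 1] - 3 * terms[i - 2] + terms[i - 3]
        for i in range(3, len(terms))
    )


def extend(terms, count):
    """Append `count` further terms using the same recurrence."""
    out = list(terms)
    for _ in range(count):
        out.append(5 * out[-1] - 3 * out[-2] + out[-3])
    return out


print(satisfies_recurrence(TERMS))
print(extend(TERMS, 1)[-1])
```

A check like this only confirms that the coefficients fit the known terms; as noted above, proving the recurrence still requires working through the cases.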

Segmented Least Squares

Give an algorithm that takes a sequence of points in the plane (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) and an integer k as input and returns the best piecewise linear function f consisting of at most k pieces that minimizes the sum squared error. You may assume that you have access to an algorithm that computes the sum squared error for one segment through a set of n points in Θ(n) time. The solution should use O(n^2 k) time and O(nk) space.
Can anyone help me with this problem? Thank you so much!
(This is too late for your homework, but hope it helps anyway.)
First is dynamic programming in python / numpy for k = 4 only,
to help you understand how dynamic programming works;
once you understand that, writing a loop for any k should be easy.
Also, Cost[] is a 2d matrix, space O(n^2);
see the notes at the end for getting down to space O(n k)
#!/usr/bin/env python
""" split4.py: min-cost split into 4 pieces, dynamic programming k=4 """
from __future__ import division, print_function
import numpy as np

__version__ = "2014-03-09 mar denis"

#...............................................................................
def split4( Cost, verbose=1 ):
    """ split4.py: min-cost split into 4 pieces, dynamic programming k=4
    min Cost[0:a] + Cost[a:b] + Cost[b:c] + Cost[c:n]
    Cost[a,b] = error in least-squares line fit to xy[a] .. xy[b] *including b*
    or error in lsq horizontal lines, sum (y_j - av y) ^2 for each piece --
        o--
           o-
             o---
                 o----
        |  | |   |
        0  2 5   9
    (Why 4 ? to walk through step by step, then put in a loop)
    """
    # speedup: maxlen 2 n/k or so
    Cost = np.asanyarray(Cost)
    n = Cost.shape[1]
    # C2 C3 ... costs, J2 J3 ... indices of best splits
    J2 = - np.ones(n, dtype=int)  # -1, NaN mark undefined / bug
    C2 = np.ones(n) * np.nan
    J3 = - np.ones(n, dtype=int)
    C3 = np.ones(n) * np.nan

    # best 2-splits of the left 2 3 4 ...
    for nleft in range( 1, n ):
        J2[nleft] = j = np.argmin([ Cost[0,j-1] + Cost[j,nleft] for j in range( 1, nleft+1 )]) + 1
        C2[nleft] = Cost[0,j-1] + Cost[j,nleft]
        # an idiom for argmin j, min value c together

    # best 3-splits of the left 3 4 5 ...
    for nleft in range( 2, n ):
        J3[nleft] = j = np.argmin([ C2[j-1] + Cost[j,nleft] for j in range( 2, nleft+1 )]) + 2
        C3[nleft] = C2[j-1] + Cost[j,nleft]

    # best 4-split of all n --
    j4 = np.argmin([ C3[j-1] + Cost[j,n-1] for j in range( 3, n )]) + 3
    c4 = C3[j4-1] + Cost[j4,n-1]
    # backtrack: c4 uses C3[j4-1], so the first 3 pieces cover 0 .. j4-1,
    # and likewise the first 2 pieces cover 0 .. j3-1
    j3 = J3[j4-1]
    j2 = J2[j3-1]
    jsplit = np.array([ 0, j2, j3, j4, n ])
    if verbose:
        print("split4: len %s pos %s cost %.3g" % (np.diff(jsplit), jsplit, c4))
        print("split4: J2 %s C2 %s" % (J2, C2))
        print("split4: J3 %s C3 %s" % (J3, C3))
    return jsplit

#...............................................................................
if __name__ == "__main__":
    import random
    import sys
    import spread

    n = 10
    ncycle = 2
    plot = 0
    seed = 0

    # run this.py a=1 b=None c=[3] 'd = expr' ... in sh or ipython
    for arg in sys.argv[1:]:
        exec( arg )
    np.set_printoptions( 1, threshold=100, edgeitems=10, linewidth=100, suppress=True )
    np.random.seed(seed)
    random.seed(seed)

    print("\n" + 80 * "-")
    title = "Dynamic programming least-square horizontal lines %s n %d seed %d" % (
        __file__, n, seed)
    print(title)
    x = np.arange( n + 0. )
    y = np.sin( 2*np.pi * x * ncycle / n )
    # synthetic time series ?
    print("y: %s av %.3g variance %.3g" % (y, y.mean(), np.var(y)))

    print("Cost[j,k] = sum (y - av y)^2 --")  # len * var y[j:k+1]
    Cost = spread.spreads_allij( y )
    print(Cost)  # .round().astype(int)

    jsplit = split4( Cost )
    # split4: len [3 2 3 2] pos [ 0 3 5 8 10]

    if plot:
        import matplotlib.pyplot as pl
        title += "\n lengths: %s" % np.diff(jsplit)
        pl.title( title )
        pl.plot( y )
        for js, js1 in zip( jsplit[:-1], jsplit[1:] ):
            if js1 <= js: continue
            yav = y[js:js1].mean() * np.ones( js1 - js + 1 )
            pl.plot( np.arange( js, js1 + 1 ), yav )
        # pl.legend()
        pl.show()
Then, the following code does Cost[] for horizontal lines only, slope 0;
extending it to line segments of any slope, in time O(n), is left as an exercise.
""" spreads( all y[:j] ) in time O(n)
define spread( y[] ) = sum (y - average y)^2
e.g. spread of 24 hourly temperatures y[0:24] i.e. y[0] .. y[23]
around a horizontal line at the average temperature
(spread = 0 for constant temperature,
    24 c^2 for constant + [c -c c -c ...],
    24 * variance(y) )

How fast can one compute all 24 spreads
1 hour (midnight to 1 am), 2 hours ... all 24 ?

A simpler problem: compute all 24 averages in time O(n):
    N = np.arange( 1, len(y)+1 )
    allav = np.cumsum(y) / N
        = [ y0, (y0 + y1) / 2, (y0 + y1 + y2) / 3 ...]
An identity:
    spread(y) = sum(y^2) - n * (av y)^2
Voila: the code below, all spreads() in time O(n).

Exercise: extend this to spreads around least-squares lines
fit to [ y0, [y0 y1], [y0 y1 y2] ... ], not just horizontal lines.
"""
from __future__ import division
import sys
import numpy as np

#...............................................................................
def spreads( y ):
    """ [ spread y[:1], spread y[:2] ... spread y ] in time O(n)
    where spread( y[] ) = sum (y - average y )^2
        = n * variance(y)
    """
    N = np.arange( 1, len(y)+1 )
    return np.cumsum( y**2 ) - np.cumsum( y )**2 / N

def spreads_allij( y ):
    """ -> A[i,j] = sum (y - av y)^2, spread of y around its average
    for all y[i:j+1]
    time, space O(n^2)
    """
    y = np.asanyarray( y, dtype=float )
    n = len(y)
    A = np.zeros((n,n))
    for i in range(n):
        A[i,i:] = spreads( y[i:] )
    return A
So far we have an n x n cost matrix, space O(n^2).
To get down to space O( n k ),
look closely at the pattern of Cost[i,j] accesses in the dyn-prog code:
for nleft .. to n:
Cost_nleft = Cost[j,nleft ] -- time nleft or nleft^2
for k in 3 4 5 ...:
min [ C[k-1, j-1] + Cost_nleft[j] for j .. to nleft ]
Here Cost_nleft is one row of the full n x n cost matrix, ~ n segments, generated as needed.
This can be done in time O(n) for line segments.
But if "error for one segment through a set of n points takes O(n) time",
it seems we're up to time O(n^3). Comments anyone ?
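To make the "loop for any k" concrete, here is a hedged pure-Python sketch of the general dynamic program. It is not the code above: it fits sloped least-squares lines and uses prefix sums of x, y, x^2, xy and y^2 so that one segment's error costs O(1), which is an assumption that goes beyond the Θ(n) oracle in the problem statement. The name `segmented_least_squares` and the return format are made up for illustration; points are assumed sorted by x, with n >= 1 and k >= 1.

```python
def segmented_least_squares(points, k):
    """Return (total_error, boundaries) for the best fit with at most k pieces.

    Piece t covers points[boundaries[t]:boundaries[t + 1]].
    """
    n = len(points)
    # Prefix sums, so the error of any segment can be computed in O(1).
    sx, sy, sxx, sxy, syy = [0.0], [0.0], [0.0], [0.0], [0.0]
    for x, y in points:
        sx.append(sx[-1] + x)
        sy.append(sy[-1] + y)
        sxx.append(sxx[-1] + x * x)
        sxy.append(sxy[-1] + x * y)
        syy.append(syy[-1] + y * y)

    def cost(i, j):
        """Squared error of the least-squares line through points[i:j]."""
        m = j - i
        X, Y = sx[j] - sx[i], sy[j] - sy[i]
        XX, XY, YY = sxx[j] - sxx[i], sxy[j] - sxy[i], syy[j] - syy[i]
        denom = m * XX - X * X
        if m == 1 or abs(denom) < 1e-12:
            err = YY - Y * Y / m  # all x equal: best line passes through mean y
        else:
            slope = (m * XY - X * Y) / denom
            intercept = (Y - slope * X) / m
            err = YY - intercept * Y - slope * XY
        return max(err, 0.0)  # clamp tiny negative rounding errors

    INF = float("inf")
    # best[t][j]: min error for the first j points using exactly t pieces.
    best = [[INF] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]  # start index of the last piece
    best[0][0] = 0.0
    for t in range(1, k + 1):
        for j in range(t, n + 1):
            for i in range(t - 1, j):
                c = best[t - 1][i] + cost(i, j)
                if c < best[t][j]:
                    best[t][j], cut[t][j] = c, i

    # "At most k pieces": take the best piece count, then walk the cuts back.
    t_best = min(range(1, k + 1), key=lambda t: best[t][n])
    boundaries, j = [n], n
    for t in range(t_best, 0, -1):
        j = cut[t][j]
        boundaries.append(j)
    boundaries.reverse()
    return best[t_best][n], boundaries


pts = [(x, x) for x in range(5)] + [(x, 10 - 2 * (x - 5)) for x in range(5, 10)]
print(segmented_least_squares(pts, 2))
```

The two tables hold (k + 1) x (n + 1) entries each, and the triple loop does O(n^2 k) work. With a Θ(n) error oracle in place of the prefix sums, the same loops would cost O(n^3 k) unless the segment errors are generated row by row as described above.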
If you can do least squares for some segment in n^2, it's easy to do what you want in n^2 k^2 with dynamic programming. You might be able to optimize that to a single k only.

how many n digit numbers are there with product p

What algorithm should we use to count the n-digit numbers whose digits have product p? The special condition here is that none of the digits may be 1.
What I have thought of so far is to do a prime factorization of p. Say n=3 and p=24.
We first do a prime factorization of 24 to get: 2*2*2*3.
Now I have a problem determining the combinations of these, which are
4*2*3, 2*4*3, ... etc.
Even if I can do so, how will this scale when n is way smaller than the count of primes?
I am not too sure if that's the right direction. Any input is welcome.
First, you don't really need full prime decomposition, only decomposition into primes smaller than your base (I guess you mean 10 here, but the problem can be generalized to any base). So we only need factorization into the first 4 primes: 2, 3, 5 and 7. If the remaining factor (prime or not) is anything bigger than 1, then the problem has 0 solutions.
Now, let's assume that the number p is factored into:
p = 2^d1 * 3^d2 * 5^d3 * 7^d4
and that p is also the product of the n digits of the number:
d(n-1)d(n-2)...d2d1d0
Then, grouping equal digits, p will also be:
p = 2^q2 * 3^q3 * 4^q4 * 5^q5 * ... * 9^q9
where qi >= 0 and q2 + q3 + ... + q9 = n
and also (due to the factorization):
for prime=2: d1 = q2 + 2*q4 + q6 + 3*q8
for prime=3: d2 = q3 + q6 + 2*q9
for prime=5: d3 = q5
for prime=7: d4 = q7
So the q5 and q7 are fixed and we have to find all non-negative integer solutions to the equations:
(where the unknowns are the rest qi: q2, q3, q4, q6, q8 and q9)
d1 = q2 + 2*q4 + q6 + 3*q8
d2 = q3 + q6 + 2*q9
n - d3 - d4 = q2 + q3 + q4 + q6 + q8 + q9
For every one of the above solutions, there are several rearrangements of the digits, which can be found by the formula:
X = n! / ( q2! * q3! * ... q9! )
which have to be summed up.
There may be a closed formula for this, using generating functions, you could post it at Math.SE
Example for p=24, n=3:
p = 2^3 * 3^1 * 5^0 * 7^0
and we have:
d1=3, d2=1, d3=0, d4=0
The integer solutions to:
3 = q2 + 2*q4 + q6 + 3*q8
1 = q3 + q6 + 2*q9
3 = q2 + q3 + q4 + q6 + q8 + q9
are (q2, q3, q4, q6, q8, q9) =:
(2, 0, 0, 1, 0, 0)
(1, 1, 1, 0, 0, 0)
which give:
3! / ( 2! * 1! ) = 3
3! / ( 1! * 1! * 1! ) = 6
and 3+6 = 9 total solutions.
Example for p=3628800, n=10:
p = 2^8 * 3^4 * 5^1 * 7^1
and we have:
d1=8, d2=4, d3=1, d4=1
The integer solutions to:
8 = q2 + 2*q4 + q6 + 3*q8
4 = q3 + q6 + 2*q9
8 = q2 + q3 + q4 + q6 + q8 + q9
are (q2, q3, q4, q6, q8, q9) (along with the corresponding digits and the rearrangements per solution):
(5, 0, 0, 0, 1, 2) 22222899 57 10! / (5! 2!) = 15120
(4, 0, 2, 0, 0, 2) 22224499 57 10! / (4! 2! 2!) = 37800
(4, 1, 0, 1, 1, 1) 22223689 57 10! / (4!) = 151200
(3, 2, 1, 0, 1, 1) 22233489 57 10! / (3! 2!) = 302400
(4, 0, 1, 2, 0, 1) 22224669 57 10! / (4! 2!) = 75600
(3, 1, 2, 1, 0, 1) 22234469 57 10! / (3! 2!) = 302400
(2, 2, 3, 0, 0, 1) 22334449 57 10! / (3! 2! 2!) = 151200
(2, 4, 0, 0, 2, 0) 22333388 57 10! / (4! 2! 2!) = 37800
(3, 2, 0, 2, 1, 0) 22233668 57 10! / (3! 2! 2!) = 151200
(2, 3, 1, 1, 1, 0) 22333468 57 10! / (3! 2!) = 302400
(1, 4, 2, 0, 1, 0) 23333448 57 10! / (4! 2!) = 75600
(4, 0, 0, 4, 0, 0) 22226666 57 10! / (4! 4!) = 6300
(3, 1, 1, 3, 0, 0) 22234666 57 10! / (3! 3!) = 100800
(2, 2, 2, 2, 0, 0) 22334466 57 10! / (2! 2! 2! 2!) = 226800
(1, 3, 3, 1, 0, 0) 23334446 57 10! / (3! 3!) = 100800
(0, 4, 4, 0, 0, 0) 33334444 57 10! / (4! 4!) = 6300
which is 2043720 total solutions, if I haven't made any mistakes.
I don't think I'd start by tackling what is known to be a 'hard' problem, computing the prime decomposition. By "I don't think" I mean that my gut feeling, rather than any rigorous computation of complexity, tells me so.
Since you are ultimately only interested in the single-digit divisors of p I'd start by dividing p by 2, then by 3, then 4, all the way up to 9. Of course, some of these divisions won't produce an integer result in which case you can discard that digit from further consideration.
For your example of p = 24 you'll get {{2},12}, {{3},8}, {{4},6}, {{6},4}, {{8},3} (ie tuples of divisor and remainder). Now apply the approach again, though this time you are looking for the 2 digit numbers whose digits multiply to the remainder. That is, for {{2},12} you would get {{2,2},6},{{2,3},4},{{2,4},3},{{2,6},2}. As it happens all of these results deliver 3-digit numbers whose digits multiply to 24, but in general it is possible that some of the remainders will still have 2 or more digits and you'll need to trim the search tree at those points. Now go back to {{3},8} and carry on.
Note that this approach avoids having to separately calculate how many permutations of a set of digits you need to consider because it enumerates them all. It also avoids having to consider 2*2 and 4 as separate candidates for inclusion.
I expect you could speed this up with a little memoisation too.
Now I look forward to someone more knowledgeable in combinatorics telling us the closed-form solution to this problem.
You can use dynamic programming approach based on the following formula:
f[ n ][ p ] = 9 * ( 10^(n-1) - 9^(n-1) ),                                if p = 0
              0,                                                          if n = 1 and p >= 10
              1,                                                          if n = 1 and p < 10
              sum{ f[ n - 1 ][ p / k ] for 0 < k < 10, p mod k = 0 },     if n > 1
The first case is a separate case for p = 0. This case calculates in O(1), besides helps to exclude k = 0 values from 4th case.
The 2nd and 3rd cases are the dynamic base.
The 4th case k sequentially takes all possible values of the last digit, and we sum up quantities of numbers with product p with last digit k by reducing to the same problem of smaller size.
This will have O( n * p ) running time if you implement the DP with memoization.
PS: My answer is for more general problem than OP described. If condition that no digit must be equal to 1 must be satisfied, formulas can be adjusted as follows:
f[ n ][ p ] = 8 * ( 9^(n-1) - 8^(n-1) ),                                 if p = 0
              0,                                                          if n = 1 and (p >= 10 or p = 1)
              1,                                                          if n = 1 and 1 < p < 10
              sum{ f[ n - 1 ][ p / k ] for 1 < k < 10, p mod k = 0 },     if n > 1
For n-digit numbers where the product of the digits is p:
For example, if n = 3 and p = 24,
the number of arrangements (permutations) would be as follows:
= p! / (p-n)!
= 24! / (24-3)!
= (24 * 23 * 22 * 21!) / 21!
= 24 * 23 * 22
= 12144
So 12144 arrangements can be made.
And the number of combinations is as follows:
= p! / (n! * (p-n)!)
= 24! / (3! * (24-3)!)
= (24 * 23 * 22 * 21!) / (3! * 21!)
= (24 * 23 * 22) / 6
= 2024
I hope this helps.
The problem seems contrived, but in any case there are upper bounds to what you have seen. For example, p can have no prime divisor > 7, since each factor needs to be a single digit ("such that the product of its digits").
Hence suppose p = 1 * 2^a * 3^b * 5^c * 7^d.
2^a can come from ceil(a/3) to 'a' digits. 3^b can come from ceil(b/2) to 'b' digits. 5^c and 7^d can come from 'c' and 'd' digits respectively. The remaining digits can be filled with 1s.
Hence n can range from ceil(a/3)+ceil(b/2)+c+d to infinity while p has a set of fixed values.
Prime factorization feels like the right direction, though you don't need any prime greater than 7, so you can just divide by 2,3,5,7 repeatedly. (No solution if we don't get a prime, or get one > 7).
Once we have the prime factors, p % x and p / x can be implemented as constant time operations (you don't actually need p, you can just keep the prime factors).
My idea is, calculate the combinations with the algorithm below, and the permutations from there is easy.
getCombinations(map<int, int> primeCounts, int numSoFar, string str)
    if (numSoFar == n)
        if (primeCounts == allZeroes)
            addCombination(str);
        else
            ; // do nothing, too many digits
    else if (primeCounts[7] >= 1) // p % 7
        getCombinations(primeCounts - [7]->1, numSoFar-1, str + "7")
    else if (primeCounts[5] >= 1) // p % 5
        getCombinations(primeCounts - [5]->1, numSoFar-1, str + "5")
    else if (primeCounts[3] >= 2) // p % 9
        getCombinations(primeCounts - [3]->2, numSoFar-1, str + "9")
        getCombinations(primeCounts - [3]->2, numSoFar-2, str + "33")
    else if (primeCounts[2] >= 3) // p % 8
        getCombinations(primeCounts - [2]->3, numSoFar-1, str + "8")
        getCombinations(primeCounts - [2]->3, numSoFar-2, str + "24")
        getCombinations(primeCounts - [2]->3, numSoFar-3, str + "222")
    else if (primeCounts[3] >= 1 && primeCounts[2] >= 1) // p % 6
        getCombinations(primeCounts - {[2]->1,[3]->1}, numSoFar-1, str + "6")
        getCombinations(primeCounts - {[2]->1,[3]->1}, numSoFar-2, str + "23")
    else if (primeCounts[2] >= 2) // p % 4
        getCombinations(primeCounts - [2]->2, numSoFar-1, str + "4")
        getCombinations(primeCounts - [2]->2, numSoFar-2, str + "22")
    else if (primeCounts[3] >= 1) // p % 3
        getCombinations(primeCounts - [3]->1, numSoFar-1, str + "3")
    else if (primeCounts[2] >= 1) // p % 2
        getCombinations(primeCounts - [2]->1, numSoFar-1, str + "2")
    else
        ; // do nothing, too few digits
Given the order in which things are done, I don't think there would be duplicates.
Improvement:
You needn't look at p%7 again (deeper down the stack) once you've looked at p%5, since we know it can't be divisible by 7 any more, so a lot of those checks can be optimised away.
primeCounts needn't be a map, it can just be an array of length 4, and it needn't be copied, one can just increase and decrease the values appropriately. Something similar can be done with str as well (character array).
If there were too many digits for getCombinations(..., str + "8"), there's no point in checking "24" or "222". This and similar checks shouldn't be too difficult to implement (just have the function return a bool).

Resources