How to balance fill int into a symmetric matrix - algorithm

Suppose I have a symmetric matrix A, that is, A(i,j) = A(j,i).
The value of A(i,j) can be either i or j.
How can I fill the matrix so that the number of times each value appears is as close to equal as possible (that is, as balanced as possible)? Is there an algorithm that can handle this?
Example A:
A = 1 1 1 1
1 2 2 2
1 2 3 3
1 2 3 4
1 appears 7 times
2 appears 5 times
3 appears 3 times
4 appears 1 time
Example B:
A = 1 2 1 1
2 2 3 2
1 3 3 4
1 2 4 4
1 appears 5 times
2 appears 5 times
3 appears 3 times
4 appears 3 times
In example B the counts are (5,5,3,3), which are closer together than in example A (7,5,3,1).
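The two examples can be checked mechanically. The following is a minimal Python sketch, not part of the original question: it verifies that a filled matrix is symmetric, that every cell holds either its row or its column index (1-based), and counts how often each value appears. The name value_counts is made up for illustration.

```python
def value_counts(matrix):
    """Validate a filled matrix and count how often each value 1..n appears."""
    n = len(matrix)
    counts = [0] * n
    for i in range(n):
        for j in range(n):
            value = matrix[i][j]
            # Symmetry: A(i,j) must equal A(j,i).
            assert value == matrix[j][i], "matrix is not symmetric"
            # Each cell may only hold its (1-based) row or column index.
            assert value in (i + 1, j + 1), "cell holds neither i nor j"
            counts[value - 1] += 1
    return counts


example_a = [[1, 1, 1, 1], [1, 2, 2, 2], [1, 2, 3, 3], [1, 2, 3, 4]]
example_b = [[1, 2, 1, 1], [2, 2, 3, 2], [1, 3, 3, 4], [1, 2, 4, 4]]
print(value_counts(example_a))
print(value_counts(example_b))
```

The counts always sum to n*n, which is a quick sanity check on any hand-made tally.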
I am looking for a solution for an n×n matrix.
Extension
If the matrix is sparse, that is, some elements of the matrix cannot be filled, which algorithm can be used to handle this problem?
Thanks for your time.

Found one solution, but without a real algorithm...
1 2 3 1 1
2 2 3 4 2
3 3 3 4 5
1 4 4 4 5
1 2 5 5 5
Basically: 25/5=5, looked for how to fill with 5 of each 1-5.
for 5 - reversed L from corner,
then up and left one spot for 4s,
and for 3s.
got "creative" for 2s and 1s...
I guess it's kind of algorithm...

Here is a solution written in python based on Weighted Bipartite Matching (or the isomorphic Minimum Cost Flow problem.)
#!/usr/bin/env python3
"""
filename: mcf_matrix_assign.py
purpose: demonstrate the use of weighted bipartite matching (isomorphic to MCF
         with a suitable transform) to solve a matrix assignment problem with
         certain conditions and optimization goals.
"""
import networkx as nx

N = 5
K = N  # ensure K is large enough to satisfy flow, N <= K <= N*N
       # setting K larger simply means a longer runtime
G = nx.DiGraph()
total_demand = 0
for i in range(N*N):
    # assert a row-major linear indexing of the matrix
    row, col = i // N, i % N
    if row >= col:
        continue  # the diagonal and lower triangle are fixed by symmetry
    total_demand += 1
    G.add_node('s'+str(i), demand=-1)
    G.add_edge('s'+str(i), 'v'+str(row), weight=0, capacity=1)
    G.add_edge('s'+str(i), 'v'+str(col), weight=0, capacity=1)
G.add_node('sink', demand=total_demand)
# attach each 'value' to the sink with incrementally larger weight
for i in range(N):
    for j in range(K):
        dummy_node = 'v'+str(i)+'w'+str(j)
        G.add_edge('v'+str(i), dummy_node, weight=j, capacity=1)
        G.add_edge(dummy_node, 'sink', weight=0, capacity=1)
flow_dict = nx.min_cost_flow(G)
# decode the solution to get the matrix assignment reported by the MCF (or
# equivalently weighted bipartite matching)
solution = [-1] * (N*N)
for i in range(N*N):
    # assert a row-major linear indexing of the matrix
    row, col = i // N, i % N
    if row == col:
        solution[i] = row
        continue  # diagonal cells are fixed
    if row > col:
        solution[i] = solution[col*N+row]
        continue  # lower triangle mirrors the upper triangle
    adjacency = flow_dict['s'+str(i)]
    solution[i] = row if adjacency['v'+str(row)] == 1 else col
# print the solution
separator = '-' * (4*N+1)
for row in range(N):
    print(separator)
    print('| ' + ' | '.join(str(solution[row*N+col]+1) for col in range(N)) + ' |')
print(separator)
print('Histogram summary:')
counts = [(i+1, solution.count(i)) for i in range(N)]
for value, count in counts:
    print('Value', value, 'appears', count, 'times.')
This produces the solution:
---------------------
| 1 | 1 | 3 | 1 | 5 |
---------------------
| 1 | 2 | 2 | 4 | 2 |
---------------------
| 3 | 2 | 3 | 4 | 3 |
---------------------
| 1 | 4 | 4 | 4 | 5 |
---------------------
| 5 | 2 | 3 | 5 | 5 |
---------------------
Histogram summary:
Value 1 appears 5 times.
Value 2 appears 5 times.
Value 3 appears 5 times.
Value 4 appears 5 times.
Value 5 appears 5 times.
And here is the solution when N=4 in the script.
-----------------
| 1 | 2 | 1 | 4 |
-----------------
| 2 | 2 | 3 | 4 |
-----------------
| 1 | 3 | 3 | 3 |
-----------------
| 4 | 4 | 3 | 4 |
-----------------
Histogram summary:
Value 1 appears 3 times.
Value 2 appears 3 times.
Value 3 appears 5 times.
Value 4 appears 5 times.
It's fairly easy to prove that this will always find an optimal answer in polynomial time.
Explanation
It is probably easiest to explain what is happening by describing the graph construction for a small case. For this discussion, fix N=3.
In this case we have a matrix assignment with variables
X s0 s1
X X s2
X X X
where X denotes a fixed value and sk denotes the kth slot in the array to fill.
In this case we also have 3 available value assignments [1,2,3] for each of the slots sk. (This is where it is easy to make modifications to the "allowed" values for any sk.)
If we construct a bipartite graph between the slots sk and the value assignments v1,v2,v3 in a way that edges of capacity 1 and weight zero are used to connect sk to each legal vi assignment, we can then solve it easily using MCF.
For illustration, the appropriate graph for N=3 is shown below:
Once the minimum cost flow is computed, we can decode the assignment by checking which edges are used in the solution.
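As a rough stand-in for the picture, the sketch below lists the edges of the graph for N=3 using only the standard library. It mirrors the construction in the script (slot nodes named by their row-major index, so the slots called s0, s1, s2 in the text appear here as s1, s2, s5); the helper name build_edges and the tuple layout are illustrative assumptions. The j-th unit of flow through a value costs j, so a value used c times costs 0+1+...+(c-1) = c(c-1)/2 in total, which is what pushes the minimum-cost flow towards balanced counts.

```python
def build_edges(n, k=None):
    """Return the MCF edges as (tail, head, weight, capacity) tuples."""
    k = n if k is None else k
    edges = []
    for i in range(n * n):
        row, col = divmod(i, n)  # row-major linear index
        if row >= col:
            continue  # only upper-triangle cells are free slots
        slot = f"s{i}"
        # A slot may take the value of its row or of its column.
        edges.append((slot, f"v{row}", 0, 1))
        edges.append((slot, f"v{col}", 0, 1))
    for v in range(n):
        for j in range(k):
            dummy = f"v{v}w{j}"
            # The j-th use of value v costs j: a convex penalty on its count.
            edges.append((f"v{v}", dummy, j, 1))
            edges.append((dummy, "sink", 0, 1))
    return edges


for edge in build_edges(3):
    print(edge)
```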
A note on performance
networkx was used here purely for convenience; it is not an efficient choice. Its MCF implementation is slow, and I would not recommend trying to scale it up.
For serious applications, I would instead recommend the LEMON MCF library (its cost-scaling algorithm in particular is competitive), or Andrew Goldberg's implementation of cost scaling (which is hard to find but exists), which is probably quite efficient as well.

There is a special pattern to follow in order to get the best possible result. Starting from each column of the first row, fill the matrix diagonally (wrapping around at the edges) with the values 1, 2, ..., n, setting the corresponding symmetric slot at the same time, and stop as soon as you reach a slot that is already filled. At the end, you will have the best possible result.
#include <iostream>
using namespace std;
int main(){
int n = 4; //size of matrix
int values[n]; for(int i = 0; i < n; i++) values[i] = 0;
int matrix[n][n]; for(int i = 0; i < n; i++) for(int j = 0; j < n; j++) matrix[i][j] = -1;
for(int c = 0; c < n; c++){
int i = 0, j = c;
for(int x = 0; x < n; x++){
if(matrix[i][j] != -1) {
break;
}
matrix[i][j] = matrix[j][i] = x;
i = (i + 1) % n;
j = (j + 1) % n;
}
}
for(int i = 0; i < n; i++){
for(int j = 0; j < n; j++){
cout<<matrix[i][j] + 1<<" ";
values[matrix[i][j]]++;
}
cout<<endl;
}
cout<<endl;
for (int i = 0; i < n; i++) {
cout<<(i + 1)<<" appears "<<values[i]<<" times"<<endl;
}
return 0;
}
OUTPUT
1 1 1 4
1 2 2 2
1 2 3 3
4 2 3 4
1 appears 5 times
2 appears 5 times
3 appears 3 times
4 appears 3 times
You can test it here.
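The complexity is O(n²), since every cell of the matrix has to be filled.
When n is odd, the solution always has n occurrences of each number. When n is even this is impossible: each number appears once on the diagonal plus twice for every off-diagonal pair assigned to it, so every count is odd and cannot equal n.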
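To try the diagonal pattern for several sizes, here is a small Python sketch of the same fill. It is an illustrative port, not the author's code; diagonal_fill and counts are made-up names, and values are 0-based internally.

```python
def diagonal_fill(n):
    """Fill an n x n matrix along wrapped diagonals; values are 0-based."""
    matrix = [[-1] * n for _ in range(n)]
    for c in range(n):
        i, j = 0, c
        for x in range(n):
            if matrix[i][j] != -1:
                break  # this diagonal was already filled through symmetry
            matrix[i][j] = matrix[j][i] = x
            i, j = (i + 1) % n, (j + 1) % n
    return matrix


def counts(matrix):
    """Count occurrences of each value, checking each cell holds i or j."""
    n = len(matrix)
    result = [0] * n
    for i in range(n):
        for j in range(n):
            assert matrix[i][j] in (i, j)
            result[matrix[i][j]] += 1
    return result


for n in range(2, 8):
    print(n, counts(diagonal_fill(n)))
```

For odd n every count comes out as n; for even n the counts split into two groups that differ by 2.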

Related

How to solve M times prefix sum with better time complexity

The problem is to compute the prefix sum of an array of length N, repeating the process M times. For example:
Example N=3
M=4
array = 1 2 3
output = 1 6 21
Explanation:
Step 1 prefix Sum = 1 3 6
Step 2 prefix sum = 1 4 10
Step 3 prefix sum = 1 5 15
Step 4(M) prefix sum = 1 6 21
Example 2:
N=5
M=3
array = 1 2 3 4 5
output = 1 5 15 35 70
I was not able to solve the problem and kept getting time limit exceeded. I used dynamic programming to solve it in O(NM) time. I looked around and found the following general mathematical solution, but my math isn't good enough to understand it, so I am still not able to solve the problem. Can someone solve it with a better time complexity?
https://math.stackexchange.com/questions/234304/sum-of-the-sum-of-the-sum-of-the-first-n-natural-numbers
Hint: 3, 4, 5 and 6, 10, 15 are sections of diagonals on Pascal's Triangle.
JavaScript code:
function f(n, m) {
const result = [1];
for (let i = 1; i < n; i++)
result.push(result[i-1] * (m + i + 1) / i);
return result;
}
console.log(JSON.stringify(f(3, 4)));
console.log(JSON.stringify(f(5, 3)));
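The snippet above covers the input 1, 2, ..., N. For an arbitrary input array, the same Pascal's Triangle observation gives a closed form: after M rounds, the element d positions to the left of an output position contributes with weight C(M-1+d, d). The following Python sketch applies that directly in O(N²) time, independent of M. The function name repeated_prefix_sum is an assumption for illustration, and no modulus is applied, so it relies on Python's arbitrary-precision integers.

```python
def repeated_prefix_sum(values, m):
    """Apply the prefix-sum operation m times without looping m times."""
    n = len(values)
    # coeff[d] = C(m - 1 + d, d), built incrementally with exact division.
    coeff = [1] * n
    for d in range(1, n):
        coeff[d] = coeff[d - 1] * (m - 1 + d) // d
    # Output j is a weighted sum of inputs 0..j.
    return [sum(coeff[j - i] * values[i] for i in range(j + 1))
            for j in range(n)]


print(repeated_prefix_sum([1, 2, 3], 4))
print(repeated_prefix_sum([1, 2, 3, 4, 5], 3))
```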

Count the number of all the shortest paths for any two vertices in a graph

For any pair of different vertices in a given undirected graph G = (V, E), I want to find the number of shortest paths ("SP" for short) between them; it is not necessary to find or print the actual vertices on any path. For example, in the following graph, given in edge-list format, there are two SPs between vertices 1 and 2: (1,3,2) and (1,4,2).
vertex =
1 3
2 4
1 4
2 3
1 8
4 7
3 6
5 2
I want to implement this based on the Floyd-Warshall algorithm, a well-known dynamic-programming algorithm that finds the length of the shortest path for each pair of vertices in O(N^3). Say the result is a 2D array a[n][n], where n is the number of vertices. For the above graph, it is:
0 2 1 1 3 2 2 1
2 0 1 1 1 2 2 3
1 1 0 2 2 1 3 2
1 1 2 0 2 3 1 2
3 1 2 2 0 3 3 4
2 2 1 3 3 0 4 3
2 2 3 1 3 4 0 3
1 3 2 2 4 3 3 0
The code for constructing graph matrix G and solving for matrix a is as follows:
v = vertex(:,1);
t = vertex(:,2);
G = zeros( max(max(v),max(t)));
% Build the matrix for graph:
for i = 1:length(v)
G(v(i), t(i)) = G(v(i), t(i)) + 1;
G(t(i), v(i)) = G(v(i), t(i)); % comment here is input is bi-directional
end
a = G;
n = length(a);
a(a==0) = Inf;
a(1:n+1:n^2)=0; % diagonal element to be zero
for k = 1:n
for i= 1:n
for j= 1:n % for j=i+1:n
if a(i,j) > a(i,k) + a(k,j)
a(i,j) = a(i,k) + a(k,j);
% a(j,i) = a(i,j);
end
end
end
end
Now, let's define a 2D array b[n][n] holding the number of all SPs for each pair of vertices. For example, we expect b[1][2] = 2.
I wrote the following code in MATLAB (if you are not familiar with MATLAB, just treat it as pseudo-code). It gives correct values for most pairs but wrong values for some. For example, after running the code, b[5][8] = 0, which is wrong (the correct answer is 2).
%%
% find the number of ALL SP paths for ALL pairs based on the "a" array:
% b is a two-dim array, b(i,j) is the total number of SP for pair( i,j)
b = G;
for k=1:n
for i=1:n
for j= i+1:n
if(i==j)
continue; % b(i,i)=0
end
if (k==j) % the same as : G(k,j)==0
continue;
end
if(k==i && G(k,j)~=0)
b(i,j) = 1;
continue;
end
if(a(i,j) ~= a(i,k)+G(k,j)) % w(u,v)=G(u,v) in unweighted graph)
continue;
end
% sigma(s,v) = sigma(s,v) + sigma(s,u);
b(i,j) = b(i,j) + b(k,i);
end
end
end
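One common way to count shortest paths inside Floyd-Warshall itself is to carry a count matrix alongside the distance matrix: when going through k is strictly shorter, the count is replaced by the product of the two partial counts; when it ties, the product is added. The Python sketch below illustrates that idea under stated assumptions and is not a corrected version of the MATLAB code above. It assumes an unweighted graph without duplicate edges, uses 1-based vertex numbers, and the name count_shortest_paths is made up.

```python
INF = float("inf")


def count_shortest_paths(n, edges):
    """Return (dist, count) matrices for vertices 1..n of an undirected graph."""
    dist = [[INF] * (n + 1) for _ in range(n + 1)]
    count = [[0] * (n + 1) for _ in range(n + 1)]
    for v in range(1, n + 1):
        dist[v][v], count[v][v] = 0, 1
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
        count[u][v] = count[v][u] = 1
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if k == i or k == j or i == j:
                    continue
                through_k = dist[i][k] + dist[k][j]
                if through_k == INF:
                    continue  # k is not reachable from i or from j yet
                if through_k < dist[i][j]:
                    # Strictly shorter: older paths are no longer shortest.
                    dist[i][j] = through_k
                    count[i][j] = count[i][k] * count[k][j]
                elif through_k == dist[i][j]:
                    # Tie: paths whose highest intermediate vertex is k.
                    count[i][j] += count[i][k] * count[k][j]
    return dist, count


edges = [(1, 3), (2, 4), (1, 4), (2, 3), (1, 8), (4, 7), (3, 6), (5, 2)]
dist, count = count_shortest_paths(8, edges)
print(count[1][2], count[5][8], dist[5][8])
```

Each shortest path is counted exactly once, at the iteration of its highest-numbered intermediate vertex, which is why the tie case adds rather than replaces.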

Algorithm for generating strings of +/-s with a specific property

I am interested in writing a function generate(n,m) which exhaustively generates strings of length n(n-1)/2 consisting solely of +/- characters. Each string will then be transformed into an n × n symmetric (-1,0,1)-matrix in the following way:
toTriangle["+--+-+-++-"]
{{1, -1, -1, 1}, {-1, 1, -1}, {1, 1}, {-1}}
toMatrix[%, 0] // MatrixForm
| 0 1 -1 -1 1 |
| 1 0 -1 1 -1 |
matrixForm = |-1 -1 0 1 1 |
|-1 1 1 0 -1 |
| 1 -1 1 -1 0 |
Thus the given string represents the upper-right triangle of the matrix, which is then reflected to generate the rest of it.
Question: How can I generate all +/- strings such that the resulting matrix has precisely m -1's per row?
For example, generate(5,3) will give all strings of length 5(5-1)/2 = 10 such that each row contains precisely three -1's.
I'd appreciate any help with constructing such an algorithm.
This is the logic to generate every matrix for a given n and m. It's a bit convoluted, so I'm not sure how much faster than brute force an implementation would be; I assume the difference will become more pronounced for larger values.
(The following will generate an output of zeros and ones for convenience, where zero represents a plus and a one represents a minus.)
A square matrix where each row has m ones translates to a triangular matrix where these folded row/columns have m ones:
x 0 1 0 1 x 0 1 0 1 0 1 0 1
0 x 1 1 0 x 1 1 0 1 1 0
1 1 x 0 0 x 0 0 0 0
0 1 0 x 1 x 1 1
1 0 0 1 x x
Each of these groups overlaps with all the other groups; choosing values for the first k groups means that the vertical part of group k+1 is already determined.
We start by putting the number of ones required per row on the diagonal; e.g. for (5,2) that is:
2 . . . .
2 . . .
2 . .
2 .
2
Then we generate every bit pattern with m ones for the first group; there are (n-1 choose m) of these, and they can be efficiently generated, e.g. with Gosper's hack.
(4,2) -> 0011 0101 0110 1001 1010 1100
For each of these, we fill them in in the matrix, and subtract them from the numbers of required ones:
X 0 0 1 1
2 . . .
2 . .
1 .
1
and then recurse with the smaller triangle:
2 . . .
2 . .
1 .
1
If we come to a point where some of the numbers of required ones on the diagonal are zero, e.g.:
2 . . .
1 . .
0 .
1
then we can already put a zero in this column, and generate the possible bit patterns for fewer columns; in the example that would be (2,2) instead of (3,2), so there's only one possible bit pattern: 11. Then we distribute the bit pattern over the columns that have a non-zero required count under them:
2 . 0 . X 1 0 1
1 . . 0 . .
0 . 0 .
1 0
However, not all possible bit patterns will lead to valid solutions; take this example:
2 . . . . X 0 0 1 1
2 . . . 2 . . . 2 . . . X 0 1 1
2 . . 2 . . 2 . . 2 . . 2 . .
2 . 1 . 1 . 0 . 0 .
2 1 1 0 0
where we end up with a row that requires another 2 ones while both columns can no longer take any ones. The way to spot this situation is by looking at the list of required ones per column that is created by each option in the penultimate step:
pattern required
0 1 1 -> 2 0 0
1 0 1 -> 1 1 0
1 1 0 -> 1 0 1
If the first value in the list is x, then there must be at least x non-zero values after it; which is false for the first of the three options.
(There is room for optimization here: in a count list like 1,1,0,6,0,2,1,1 there are only 2 non-zero values before the 6, which means that the 6 will be decremented at most 2 times, so its minimum value when it becomes the first element will be 4; however, there are only 3 non-zero values after it, so at this stage you already know this list will not lead to any valid solutions. Checking this would add to the code complexity, so I'm not sure whether that would lead to an improvement in execution speed.)
So the complete algorithm for (n,m) starts with:
Create an n-sized list with all values set to m (count of ones required per group).
Generate all bit patterns of size n-1 with m ones; for each of these:
    Subtract the pattern from a copy of the count list (without the first element).
    Recurse with the pattern and the copy of the count list.
and the recursive steps after that are:
Receive the sequence so far, and a count list.
The length of the count list is n, and its first element is m.
Let k be the number of non-zero values in the count list (without the first element).
Generate all bit patterns of size k with m ones; for each of these:
    Create a 0-filled list sized n-1.
    Distribute the bit pattern over it, skipping the columns with a zero count.
    Add the value list to the sequence so far.
    Subtract the value list from a copy of the count list (without the first element).
    If the first value in the copy of the count list is greater than the number of non-zeros after it, skip this pattern.
    At the deepest recursion level, store the sequence, or else:
    Recurse with the sequence so far, and the copy of the count list.
Here's a code snippet as a proof of concept; in a serious language, and using integers instead of arrays for the bitmaps, this should be much faster:
function generate(n, m) {
// if ((n % 2) && (m % 2)) return; // to catch (3,1)
var counts = [], pattern = [];
for (var i = 0; i < n - 1; i++) {
counts.push(m);
pattern.push(i < m ? 1 : 0);
}
do {
var c_copy = counts.slice();
for (var i = 0; i < n - 1; i++) c_copy[i] -= pattern[i];
recurse(pattern, c_copy);
}
while (revLexi(pattern));
}
function recurse(sequence, counts) {
var n = counts.length, m = counts.shift(), k = 0;
for (var i = 0; i < n - 1; i++) if (counts[i]) ++k;
var pattern = [];
for (var i = 0; i < k; i++) pattern.push(i < m ? 1 : 0);
do {
var values = [], pos = 0;
for (var i = 0; i < n - 1; i++) {
if (counts[i]) values.push(pattern[pos++]);
else values.push(0);
}
var s_copy = sequence.concat(values);
var c_copy = counts.slice();
var nonzero = 0;
for (var i = 0; i < n - 1; i++) {
c_copy[i] -= values[i];
if (i && c_copy[i]) ++nonzero;
}
if (c_copy[0] > nonzero) continue;
if (n == 2) {
for (var i = 0; i < s_copy.length; i++) {
document.write(["+ ", "− "][s_copy[i]]);
}
document.write("<br>");
}
else recurse(s_copy, c_copy);
}
while (revLexi(pattern));
}
function revLexi(seq) { // reverse lexicographical because I had this lying around
var max = true, pos = seq.length, set = 1;
while (pos-- && (max || !seq[pos])) if (seq[pos]) ++set; else max = false;
if (pos < 0) return false;
seq[pos] = 0;
while (++pos < seq.length) seq[pos] = set-- > 0 ? 1 : 0;
return true;
}
generate(5, 2);
Here are the number of results and the number of recursions for values of n up to 10, so you can compare them to check correctness. When n and m are both odd numbers, there are no valid results; this is calculated correctly, except in the case of (3,1); it is of course easy to catch these cases and return immediately.
(n,m)                  results      number of recursions
(4,0)  (4,3)                 1              2              2
(4,1)  (4,2)                 3              6              7
(5,0)  (5,4)                 1              3              3
(5,1)  (5,3)                 0             12             20
(5,2)                       12             36
(6,0)  (6,5)                 1              4              4
(6,1)  (6,4)                15             48             76
(6,2)  (6,3)                70            226            269
(7,0)  (7,6)                 1              5              5
(7,1)  (7,5)                 0             99            257
(7,2)  (7,4)               465          1,627          2,313
(7,3)                        0          3,413
(8,0)  (8,7)                 1              6              6
(8,1)  (8,6)               105            422          1,041
(8,2)  (8,5)             3,507         13,180         23,302
(8,3)  (8,4)            19,355         77,466         93,441
(9,0)  (9,8)                 1              7              7
(9,1)  (9,7)                 0            948          4,192
(9,2)  (9,6)            30,016        119,896        270,707
(9,3)  (9,5)                 0      1,427,457      2,405,396
(9,4)                1,024,380      4,851,650
(10,0) (10,9)                1              8              8
(10,1) (10,8)              945          4,440         18,930
(10,2) (10,7)          286,884      1,210,612      3,574,257
(10,3) (10,6)       11,180,820     47,559,340     88,725,087
(10,4) (10,5)       66,462,606    313,129,003    383,079,169
I doubt that you really want all variants for large n and m; their number is tremendously large.
This problem is equivalent to generating m-regular graphs: if we replace all 1's with zeros and all -1's with ones, we get the adjacency matrix of a graph, and a graph is m-regular when every vertex has degree m.
Here we can see that the number of (18,4) regular graphs is about 10^9 and that it rises fast with n and m. The article contains a link to the program genreg, intended for generating such graphs. The FTP links to the code and executable don't work for me; perhaps they are too old.
Upd: Here is another link to the source (though from 1996 rather than the paper's 1999).
A simple approach to generating one instance of a regular graph is described here.
For small n and m you can also try brute force: fill the first row with m ones (the diagonal is fixed, so there are C(n-1,m) variants), then for every variant fill the free places in the second row, and so on.
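The row-by-row brute force can be sketched in a few lines of Python. This is an illustrative sketch only: generate_strings is a made-up name, '-' marks a -1 entry, and the output uses the upper-triangle, row-by-row string layout from the question. Each row picks exactly as many later columns as it still needs, and only columns that still need a minus themselves.

```python
from itertools import combinations


def generate_strings(n, m):
    """Yield every +/- string whose matrix has exactly m entries -1 per row."""
    def rec(row, need, prefix):
        if row == n - 1:
            # The last row has no free cells left; it must already be satisfied.
            if need[row] == 0:
                yield prefix
            return
        later = [c for c in range(row + 1, n) if need[c] > 0]
        # combinations() yields nothing if need[row] exceeds len(later).
        for chosen in combinations(later, need[row]):
            remaining = list(need)
            for c in chosen:
                remaining[c] -= 1
            part = "".join("-" if c in chosen else "+"
                           for c in range(row + 1, n))
            yield from rec(row + 1, remaining, prefix + part)

    yield from rec(0, [m] * n, "")


results = list(generate_strings(5, 2))
print(len(results))
print(results[0])
```

The result counts for small (n,m) can be compared against known counts of labelled m-regular graphs, for example 3 for (4,1) and 12 for (5,2).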
Written in Wolfram Mathematica.
generate[n_, m_] := Module[{},
x = Table[StringJoin["i", ToString[i], "j", ToString[j]],
{j, 1, n}, {i, 2, n}];
y = Transpose[x];
MapThread[(x[[#, ;; #2]] = y[[#, ;; #2]]) &,
{-Range[n - 1], Reverse@Range[n - 1]}];
Clear @@ Names["i*"];
z = ToExpression[x];
Clear[s];
s = Reduce[Join[Total@# == m & /@ z,
0 <= # <= 1 & /@ Union[Flatten@z]],
Union@Flatten[z], Integers];
Clear[t, u, v];
Array[(t[#] =
Partition[Flatten[z] /.
ToRules[s[[#]]], n - 1] /.
{1 -> -1, 0 -> 1}) &, Length[s]];
Array[Function[a,
(u[a] = StringJoin[Flatten[MapThread[
Take[#, 1 - #2] &,
{t[a], Reverse[Range[n]]}]] /.
{1 -> "+", -1 -> "-"}])], Length[s]];
Array[Function[a,
(v[a] = MapThread[Insert[#, 0, #2] &,
{t[a], Range[n]}])], Length[s]]]
Timing[generate[9, 4];]
Length[s]
{202.208, Null}
1024380
The program takes 202 seconds to generate 1,024,380 solutions. E.g. the last one
u[1024380]
----++++---++++-+-+++++-++++--------
v[1024380]
0 -1 -1 -1 -1 1 1 1 1
-1 0 -1 -1 -1 1 1 1 1
-1 -1 0 -1 1 -1 1 1 1
-1 -1 -1 0 1 1 -1 1 1
-1 -1 1 1 0 1 1 -1 -1
1 1 -1 1 1 0 -1 -1 -1
1 1 1 -1 1 -1 0 -1 -1
1 1 1 1 -1 -1 -1 0 -1
1 1 1 1 -1 -1 -1 -1 0
and the first ten strings
u /@ Range[10]
++++----+++----+-+-----+----++++++++
++++----+++----+-+------+--+-+++++++
++++----+++----+-+-------+-++-++++++
++++----+++----+--+---+-----++++++++
++++----+++----+---+--+----+-+++++++
++++----+++----+----+-+----++-++++++
++++----+++----+--+-----+-+--+++++++
++++----+++----+--+------++-+-++++++
++++----+++----+---+---+--+--+++++++

Number of n-element permutations with exactly k inversions

I am trying to efficiently solve SPOJ Problem 64: Permutations.
Let A = [a1,a2,...,an] be a permutation of integers 1,2,...,n. A pair
of indices (i,j), 1<=i<=j<=n, is an inversion of the permutation A if
ai>aj. We are given integers n>0 and k>=0. What is the number of
n-element permutations containing exactly k inversions?
For instance, the number of 4-element permutations with exactly 1
inversion equals 3.
To make the given example easier to see, here are the three 4-element permutations with exactly 1 inversion:
(1, 2, 4, 3)
(1, 3, 2, 4)
(2, 1, 3, 4)
In the first permutation, 4 > 3 and the index of 4 is less than the index of 3. This is a single inversion. Since the permutation has exactly one inversion, it is one of the permutations that we are trying to count.
For any given sequence of n elements, the number of permutations is factorial(n). Thus if I use the brute force n^2 way of counting the number of inversions for each permutation and then checking to see if they are equal to k, the solution to this problem would have the time complexity O(n! * n^2).
Previous Research
A subproblem of this problem was previously asked here on StackOverflow. An O(n log n) solution using merge sort was given which counts the number of inversions in a single permutation. However, if I use that solution to count the number of inversions for each permutation, I would still get a time complexity of O(n! * n log n) which is still very high in my opinion.
This exact question was also asked previously on Stack Overflow but it received no answers.
My goal is to avoid the factorial complexity that comes from iterating through all permutations. Ideally I would like a mathematical formula that yields the answer to this for any n and k but I am unsure if one even exists.
If there is no math formula to solve this (which I kind of doubt) then I have also seen people giving hints that an efficient dynamic programming solution is possible. Using DP or another approach, I would really like to formulate a solution which is more efficient than O(n! * n log n), but I am unsure of where to start.
Any hints, comments, or suggestions are welcome.
EDIT: I have answered the problem below with a DP approach to computing Mahonian numbers.
The solution needs some explanations.
Let's denote the number of permutations with n items having exactly k inversions
by I(n, k)
Now I(n, 0) is always 1. For any n there exist one and only one permutation which has 0
inversions i.e., when the sequence is increasingly sorted
Now I(0, k) is always 0 since we don't have the sequence itself
Now to find the I(n, k) let's take an example of sequence containing 4 elements
{1,2,3,4}
for n = 4 below are the permutations enumerated and grouped by number of inversions
|___k=0___|___k=1___|___k=2___|___k=3___|___k=4___|___k=5___|___k=6___|
| 1234 | 1243 | 1342 | 1432 | 2431 | 3421 | 4321 |
| | 1324 | 1423 | 2341 | 3241 | 4231 | |
| | 2134 | 2143 | 2413 | 3412 | 4312 | |
| | | 2314 | 3142 | 4132 | | |
| | | 3124 | 3214 | 4213 | | |
| | | | 4123 | | | |
| | | | | | | |
|I(4,0)=1 |I(4,1)=3 |I(4,2)=5 |I(4,3)=6 |I(4,4)=5 |I(4,5)=3 |I(4,6)=1 |
| | | | | | | |
Now to find the number of permutation with n = 5 and for every possible k
we can derive recurrence I(5, k) from I(4, k) by inserting the nth (largest)
element(5) somewhere in each permutation in the previous permutations,
so that the resulting number of inversions is k
for example, I(5,4) is nothing but the number of permutations of the sequence {1,2,3,4,5}
which has exactly 4 inversions each.
Looking at the I(4, k) table above, every permutation in the columns up to k = 4 has at most 4 inversions.
Now lets place the element 5 as shown below
|___k=0___|___k=1___|___k=2___|___k=3___|___k=4___|___k=5___|___k=6___|
| |5|1234 | 1|5|243 | 13|5|42 | 143|5|2 | 2431|5| | 3421 | 4321 |
| | 1|5|324 | 14|5|23 | 234|5|1 | 3241|5| | 4231 | |
| | 2|5|134 | 21|5|43 | 241|5|3 | 3412|5| | 4312 | |
| | | 23|5|14 | 314|5|2 | 4132|5| | | |
| | | 31|5|24 | 321|5|4 | 4213|5| | | |
| | | | 412|5|3 | | | |
| | | | | | | |
| 1 | 3 | 5 | 6 | 5 | | |
| | | | | | | |
Each of the above permutation which contains 5 has exactly 4 inversions.
So the total number of permutations with 4 inversions is I(5,4) = I(4,4) + I(4,3) + I(4,2) + I(4,1) + I(4,0)
= 5 + 6 + 5 + 3 + 1 = 20
Similarly for I(5,5) from I(4,k)
|___k=0___|___k=1___|___k=2___|___k=3___|___k=4___|___k=5___|___k=6___|
| 1234 | |5|1243 | 1|5|342 | 14|5|32 | 243|5|1 | 3421|5| | 4321 |
| | |5|1324 | 1|5|423 | 23|5|41 | 324|5|1 | 4231|5| | |
| | |5|2134 | 2|5|143 | 24|5|13 | 341|5|2 | 4312|5| | |
| | | 2|5|314 | 31|5|42 | 413|5|2 | | |
| | | 3|5|124 | 32|5|14 | 421|5|3 | | |
| | | | 41|5|23 | | | |
| | | | | | | |
| | 3 | 5 | 6 | 5 | 3 | |
| | | | | | | |
So the total number of permutations with 5 inversions is I(5,5) = I(4,5) + I(4,4) + I(4,3) + I(4,2) + I(4,1)
= 3 + 5 + 6 + 5 + 3 = 22
So I(n, k) = sum of I(n-1, k-i) over all i such that 0 <= i < n && k-i >= 0
Also, k can go up to n*(n-1)/2; this occurs when the sequence is sorted in decreasing order
https://secweb.cs.odu.edu/~zeil/cs361/web/website/Lectures/insertion/pages/ar01s04s01.html
http://www.algorithmist.com/index.php/SPOJ_PERMUT1
#include <stdio.h>
int dp[100][100];
int inversions(int n, int k)
{
if (dp[n][k] != -1) return dp[n][k];
if (k == 0) return dp[n][k] = 1;
if (n == 0) return dp[n][k] = 0;
int j = 0, val = 0;
for (j = 0; j < n && k-j >= 0; j++)
val += inversions(n-1, k-j);
return dp[n][k] = val;
}
int main()
{
int t;
scanf("%d", &t);
while (t--) {
int n, k, i, j;
scanf("%d%d", &n, &k);
for (i = 1; i <= n; i++)
for (j = 0; j <= k; j++)
dp[i][j] = -1;
printf("%d\n", inversions(n, k));
}
return 0;
}
It's one day later and I have managed to solve the problem using dynamic programming. I submitted it and my code was accepted by SPOJ, so I figure I'll share my knowledge here for anyone who is interested in the future.
After looking in the Wikipedia page which discusses inversion in discrete mathematics, I found an interesting recommendation at the bottom of the page.
Numbers of permutations of n elements with k inversions; Mahonian
numbers: A008302
I clicked on the link to OEIS and it showed me an infinite sequence of integers called the Triangle of Mahonian numbers.
1, 1, 1, 1, 2, 2, 1, 1, 3, 5, 6, 5, 3, 1, 1, 4, 9, 15, 20, 22, 20, 15,
9, 4, 1, 1, 5, 14, 29, 49, 71, 90, 101, 101, 90, 71, 49, 29, 14, 5, 1,
1, 6, 20, 49, 98, 169, 259, 359, 455, 531, 573, 573, 531, 455, 359,
259, 169, 98, 49, 20, 6, 1 . . .
I was curious about what these numbers were since they seemed familiar to me. Then I realized that I had seen the subsequence 1, 3, 5, 6, 5, 3, 1 before. In fact, this was the answer to the problem for several pairs of (n, k), namely (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6). I looked at what was on both sides of this subsequence and was amazed to see that it was all valid (i.e. greater than 0 permutations) answers for n < 4 and n > 4.
The formula for the sequence was given as:
coefficients in expansion of Product_{i=0..n-1} (1+x+...+x^i)
This was easy enough for me to understand and verify. I could basically take any n and plug into the formula. Then the coefficient for the x^k term would be the answer for (n, k).
I will show an example for n = 3.
(x^0)(x^0 + x^1)(x^0 + x^1 + x^2)
= (1)(1 + x)(1 + x + x^2)
= (1 + x)(1 + x + x^2)
= 1 + x + x + x^2 + x^2 + x^3
= 1 + 2x + 2x^2 + x^3
The final expansion was 1 + 2x + 2x^2 + x^3 and the coefficients of the x^k terms were 1, 2, 2, and 1 for k = 0, 1, 2, 3 respectively. This just happens to be all valid numbers of inversions for 3-element permutations.
1, 2, 2, 1 is the 3rd row of the Mahonian numbers when they are laid out in a table as follows:
1
1 1
1 2 2 1
1 3 5 6 5 3 1
etc.
So basically computing my answer came down to simply calculating the nth Mahonian row and taking the kth element with k starting at 0 and printing 0 if the index was out of range. This was a simple case of bottom-up dynamic programming since each ith row could be used to easily compute the i+1st row.
Given below is the Python solution I used which ran in only 0.02 seconds. The maximum time limit for this problem was 3 seconds for their given test cases and I was getting a timeout error before so I think this optimization is rather good.
def mahonian_row(n):
    '''Generates coefficients in expansion of
    Product_{i=0..n-1} (1+x+...+x^i)
    **Requires that n is a positive integer'''
    # Current max power of x i.e. x^0, x^0 + x^1, x^0 + x^1 + x^2, etc.
    # i + 1 is current row number we are computing
    i = 1
    # Initialize to answer for n = 1
    result = [1]
    while i < n:
        # Copy previous row of n into prev
        prev = result[:]
        # Get space to hold (i+1)st row
        result = [0] * (1 + (i + 1) * i // 2)
        # Initialize multiplier for this row
        m = [1] * (i + 1)
        # Multiply
        for j in range(len(m)):
            for k in range(len(prev)):
                result[k+j] += m[j] * prev[k]
        # Result now equals mahonian_row(i+1)
        # Possibly should be memoized?
        i = i + 1
    return result


def main():
    t = int(input())
    for _ in range(t):
        n, k = (int(s) for s in input().split())
        row = mahonian_row(n)
        if k < 0 or k > len(row) - 1:
            print(0)
        else:
            print(row[k])


if __name__ == '__main__':
    main()
I have no idea of the time complexity but I am absolutely certain this code can be improved through memoization since there are 10 given test cases and the computations for previous test cases can be used to "cheat" on future test cases. I will make that optimization in the future, but hopefully this answer in its current state will help anyone attempting this problem in the future since it avoids the naive factorial-complexity approach of generating and iterating through all permutations.
If there is a dynamic programming solution, there is probably a way to do it step by step, using the results for permutations of length n to help with the results for permutations of length n+1.
Given a permutation of length n - values 1-n, you can get a permutation of length n+1 by adding value (n+1) at n+1 possible positions. (n+1) is larger than any of 1-n so the number of inversions you create when you do this depends on where you add it - add it at the last position and you create no inversions, add it at the last but one position and you create one inversion, and so on - look back at the n=4 cases with one inversion to check this.
So consider the n+1 places where you can add (n+1). If you add it at place j, counting from the right with the last position as position 0, it creates j new inversions, so the number of permutations of length n+1 with K inversions that this produces is the number of permutations of length n with K-j inversions.
So if at each step you count the number of permutations with K inversions for all possible K, you can compute the counts for length n+1 from the counts for length n.
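The insertion argument can be written as a short bottom-up loop and checked against brute force for small n. This is a minimal Python sketch; the names inversion_counts_dp and brute_force are illustrative, and the brute-force check is only practical for very small n.

```python
from itertools import permutations


def inversion_counts_dp(n):
    """Return [I(n,0), I(n,1), ...] by inserting the largest element."""
    row = [1]  # one element: a single permutation with 0 inversions
    for size in range(2, n + 1):
        new = [0] * (len(row) + size - 1)
        for k, ways in enumerate(row):
            # Placing the new largest element j places from the right
            # adds exactly j inversions, for j = 0..size-1.
            for j in range(size):
                new[k + j] += ways
        row = new
    return row


def brute_force(n):
    """Count inversions of every permutation directly."""
    counts = [0] * (n * (n - 1) // 2 + 1)
    for perm in permutations(range(n)):
        inv = sum(1 for a in range(n) for b in range(a + 1, n)
                  if perm[a] > perm[b])
        counts[inv] += 1
    return counts


print(inversion_counts_dp(4))
print(inversion_counts_dp(5) == brute_force(5))
```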
A major problem in computing these coefficients is the degree of the resulting product. The polynomial Product_{i=1..n} (1+x+...+x^i) = (1+x)(1+x+x^2)...(1+x+x^2+...+x^n) has degree n*(n+1)/2, so it has n*(n+1)/2 + 1 coefficients. Consequently, this puts a restrictive computational limit on the process. If we compute the product for n from the stored result for n-1, we need to store (n-1)*n/2 + 1 integers for the previous row. It is possible to use a recursive process instead, which will be much slower, and again it is limited to n below roughly the square root of the maximum value of the integer type. The following is some rough and ready recursive code for this problem. The function mahonian(r,c) returns the c-th coefficient of the r-th product. It is extremely slow for products beyond r = 100 or so; running it makes clear that plain recursion is not the answer.
unsigned int numbertheory::mahonian(unsigned int r, unsigned int c)
{
unsigned int result=0;
unsigned int k;
if(r==0 && c==0)
return 1;
if( r==0 && c!=0)
return 0;
for(k=0; k <= r; k++)
if(r > 0 && c >=k)
result = result + mahonian(r-1,c-k);
return result;
}
As a matter of interest, I have included the following, which is a C++ version of Sashank's approach and is a lot faster than my recursive example. Note that I use the Armadillo library.
uvec numbertheory::mahonian_row(uword n){
uword i = 2;
uvec current;
current.ones(i);
uword current_size;
uvec prev;
uword prev_size;
if(n==0){
current.ones(1);
return current;
}
while (i <= n){ // increment through the rows
prev_size=current.size(); // reset prev size to current size
prev.set_size(prev_size); // set size of prev vector
prev= current; //copy contents of current to prev vector
current_size =1+ (i*(i+1)/2); // reset current_size
current.zeros(current_size); // reset current vector with zeros
for(uword j=0;j<i+1; j++) //increment through current vector
for(uword k=0; k < prev_size;k++)
current(k+j) += prev(k);
i++; //increment to next row
}
return current; //return current vector
}
uword numbertheory::mahonian_fast(uword n, uword c) {
// This function returns the coefficient of order c of row n of
// the Mahonian numbers.
// check for input errors
if(c >= 1+ (n*(n+1)/2)) {
cout << "Error. Invalid input parameters" << endl;
return 0; // coefficients beyond the degree of the row are zero
}
uvec mahonian;
mahonian.zeros(1+ (n*(n+1)/2));
mahonian = mahonian_row(n);
return mahonian(c);
}
We can make use of dynamic programming to solve this problem. We have n places to fill with the numbers from 1 to n, _ _ _ _ _ _ _ (take n=7). At the very first place we can create at most n-1 inversions and at least 0; similarly, at the second place we can create at most n-2 inversions and at least 0. In general, we can create at most n-i inversions at the ith index, irrespective of the numbers placed before it.
Our recursive formula will look like:
f(n,k) = f(n-1,k) + f(n-1,k-1) + f(n-1,k-2) + ... + f(n-1,max(0,k-(n-1)))
where the terms correspond, from left to right, to creating no inversion, one inversion, two inversions, ..., n-1 inversions at the current place.
We can create 0 inversions by placing the smallest of the remaining numbers from the set (1,n), 1 inversion by placing the second smallest, and so on.
The base conditions for our recursive formula will be:
if( n==0 && k==0 ) return 1 (valid permutation)
if( n==0 && k!=0 ) return 0 (invalid permutation)
If we draw the recursion tree we will see subproblems repeated multiple times, hence use memoization so that only O(n*k) distinct subproblems are evaluated.
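A memoized version of this recurrence might look like the following in Python. It is a sketch under the assumption that f(n, k) means "number of permutations of n elements with exactly k inversions"; functools.lru_cache is used here as the memo table, which is my choice and not something the answer prescribes.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(n, k):
    """Number of permutations of n elements with exactly k inversions."""
    if n == 0:
        # Base case: the empty permutation is valid only when no
        # inversions are left to account for.
        return 1 if k == 0 else 0
    # The current place can contribute j = 0 .. min(k, n-1) inversions.
    return sum(f(n - 1, k - j) for j in range(min(k, n - 1) + 1))


if __name__ == "__main__":
    print([f(4, k) for k in range(7)])
```

Each of the cached (n, k) states still sums up to n terms, so the memo removes the repeated subtrees but not the inner loop.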

nᵗʰ ugly number

Numbers whose only prime factors are 2, 3, or 5 are called ugly numbers.
Example:
1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ...
1 can be considered as 2^0.
I am working on finding nth ugly number. Note that these numbers are extremely sparsely distributed as n gets large.
I wrote a trivial program that computes if a given number is ugly or not. For n > 500 - it became super slow. I tried using memoization - observation: ugly_number * 2, ugly_number * 3, ugly_number * 5 are all ugly. Even with that it is slow. I tried using some properties of log - since that will reduce this problem from multiplication to addition - but, not much luck yet. Thought of sharing this with you all. Any interesting ideas?
Using a concept similar to Sieve of Eratosthenes (thanks Anon)
for (int i(2), uglyCount(0); ; i++) {
if (i % 2 == 0)
continue;
if (i % 3 == 0)
continue;
if (i % 5 == 0)
continue;
uglyCount++;
if (uglyCount == n - 1)
break;
}
i is the nth ugly number.
Even this is pretty slow. I am trying to find the 1500th ugly number.
A simple fast solution in Java. Uses approach described by Anon..
Here TreeSet is just a container capable of returning smallest element in it. (No duplicates stored.)
int n = 20;
SortedSet<Long> next = new TreeSet<Long>();
next.add((long) 1);
long cur = 0;
for (int i = 0; i < n; ++i) {
cur = next.first();
System.out.println("number " + (i + 1) + ": " + cur);
next.add(cur * 2);
next.add(cur * 3);
next.add(cur * 5);
next.remove(cur);
}
Since 1000th ugly number is 51200000, storing them in bool[] isn't really an option.
edit
As a recreation from work (debugging stupid Hibernate), here's completely linear solution. Thanks to marcog for idea!
int n = 1000;
int last2 = 0;
int last3 = 0;
int last5 = 0;
long[] result = new long[n];
result[0] = 1;
for (int i = 1; i < n; ++i) {
long prev = result[i - 1];
while (result[last2] * 2 <= prev) {
++last2;
}
while (result[last3] * 3 <= prev) {
++last3;
}
while (result[last5] * 5 <= prev) {
++last5;
}
long candidate1 = result[last2] * 2;
long candidate2 = result[last3] * 3;
long candidate3 = result[last5] * 5;
result[i] = Math.min(candidate1, Math.min(candidate2, candidate3));
}
System.out.println(result[n - 1]);
The idea is that to calculate a[i], we can use a[j]*2 for some j < i. But we also need to make sure that 1) a[j]*2 > a[i - 1] and 2) j is smallest possible.
Then, a[i] = min(a[j]*2, a[k]*3, a[t]*5).
I am working on finding nth ugly number. Note that these numbers are extremely sparsely distributed as n gets large.
I wrote a trivial program that computes if a given number is ugly or not.
This looks like the wrong approach for the problem you're trying to solve - it's a bit of a shlemiel algorithm.
Are you familiar with the Sieve of Eratosthenes algorithm for finding primes? Something similar (exploiting the knowledge that every ugly number is 2, 3 or 5 times another ugly number) would probably work better for solving this.
With the comparison to the Sieve I don't mean "keep an array of bools and eliminate possibilities as you go up". I am more referring to the general method of generating solutions based on previous results. Where the Sieve gets a number and then removes all multiples of it from the candidate set, a good algorithm for this problem would start with an empty set and then add the correct multiples of each ugly number to that.
My answer refers to the correct answer given by Nikita Rybak, so that one can see the transition from the idea of the first approach to that of the second.
from collections import deque
def hamming():
    h = 1
    next2, next3, next5 = deque([]), deque([]), deque([])
    while True:
        yield h
        next2.append(2*h)
        next3.append(3*h)
        next5.append(5*h)
        h = min(next2[0], next3[0], next5[0])
        if h == next2[0]: next2.popleft()
        if h == next3[0]: next3.popleft()
        if h == next5[0]: next5.popleft()
What's changed from Nikita Rybak's 1st approach is that, instead of adding the next candidates into a single data structure, i.e. a tree set, one can add each of them separately into 3 FIFO lists. This way, each list will be kept sorted all the time, and the next least candidate must always be at the head of one or more of these lists.
If we eliminate the use of the three lists above, we arrive at the second implementation in Nikita Rybak's answer. This is done by evaluating those candidates (to be contained in three lists) only when needed, so that there is no need to store them.
Simply put:
In the first approach, we put every new candidate into single data structure, and that's bad because too many things get mixed up unwisely. This poor strategy inevitably entails O(log(tree size)) time complexity every time we make a query to the structure. By putting them into separate queues, however, you will see that each query takes only O(1) and that's why the overall performance reduces to O(n)!!! This is because each of the three lists is already sorted, by itself.
I believe you can solve this problem in sub-linear time, probably O(n^{2/3}).
To give you the idea, if you simplify the problem to allow factors of just 2 and 3, you can achieve O(n^{1/2}) time starting by searching for the smallest power of two that is at least as large as the nth ugly number, and then generating a list of O(n^{1/2}) candidates. This code should give you an idea how to do it. It relies on the fact that the nth number containing only powers of 2 and 3 has a prime factorization whose sum of exponents is O(n^{1/2}).
def foo(n):
    # n is 1-based: foo(1) == 1
    p2 = 1  # current power of 2
    p3 = 1  # current power of 3
    e3 = 0  # exponent of current power of 3
    t = 1   # count of numbers 2^a * 3^b less than or equal to the current power of 2
    while t < n:
        p2 *= 2
        if p3 * 3 < p2:
            p3 *= 3
            e3 += 1
        t += 1 + e3
    candidates = [p2]
    c = p2
    for i in range(e3):
        c //= 2  # integer division keeps c an exact int
        c *= 3
        if c > p2: c //= 2
        candidates.append(c)
    # the candidates are the len(candidates) largest of the t numbers counted so far
    return sorted(candidates)[n - 1 - (t - len(candidates))]
The same idea should work for three allowed factors, but the code gets more complex. The sum of the powers of the factorization drops to O(n^{1/3}), but you need to consider more candidates, O(n^{2/3}) to be more precise.
A lot of good answers here, but I was having trouble understanding those, specifically how any of these answers, including the accepted one, maintained the axiom 2 in Dijkstra's original paper:
Axiom 2. If x is in the sequence, so is 2 * x, 3 * x, and 5 * x.
After some whiteboarding, it became clear that the axiom 2 is not an invariant at each iteration of the algorithm, but actually the goal of the algorithm itself. At each iteration, we try to restore the condition in axiom 2. If last is the last value in the result sequence S, axiom 2 can simply be rephrased as:
For some x in S, the next value in S is the minimum of 2x,
3x, and 5x, that is greater than last. Let's call this axiom 2'.
Thus, if we can find x, we can compute the minimum of 2x, 3x, and 5x in constant time, and add it to S.
But how do we find x? One approach is, we don't; instead, whenever we add a new element e to S, we compute 2e, 3e, and 5e, and add them to a minimum priority queue. Since this operation guarantees e is in S, simply extracting the top element of the PQ satisfies axiom 2'.
This approach works, but the problem is that we generate a bunch of numbers we may not end up using. See this answer for an example; if the user wants the 5th element in S (5), the PQ at that moment holds 6 6 8 9 10 10 12 15 15 20 25. Can we not waste this space?
Turns out, we can do better. Instead of storing all these numbers, we simply maintain three counters for each of the multiples, namely, 2i, 3j, and 5k. These are candidates for the next number in S. When we pick one of them, we increment only the corresponding counter, and not the other two. By doing so, we are not eagerly generating all the multiples, thus solving the space problem with the first approach.
Let's see a dry run for n = 8, i.e. the number 9. We start with 1, as stated by axiom 1 in Dijkstra's paper.
+---------+---+---+---+----+----+----+-------------------+
| # | i | j | k | 2i | 3j | 5k | S |
+---------+---+---+---+----+----+----+-------------------+
| initial | 1 | 1 | 1 | 2 | 3 | 5 | {1} |
+---------+---+---+---+----+----+----+-------------------+
| 1 | 1 | 1 | 1 | 2 | 3 | 5 | {1,2} |
+---------+---+---+---+----+----+----+-------------------+
| 2 | 2 | 1 | 1 | 4 | 3 | 5 | {1,2,3} |
+---------+---+---+---+----+----+----+-------------------+
| 3 | 2 | 2 | 1 | 4 | 6 | 5 | {1,2,3,4} |
+---------+---+---+---+----+----+----+-------------------+
| 4 | 3 | 2 | 1 | 6 | 6 | 5 | {1,2,3,4,5} |
+---------+---+---+---+----+----+----+-------------------+
| 5 | 3 | 2 | 2 | 6 | 6 | 10 | {1,2,3,4,5,6} |
+---------+---+---+---+----+----+----+-------------------+
| 6 | 4 | 2 | 2 | 8 | 6 | 10 | {1,2,3,4,5,6} |
+---------+---+---+---+----+----+----+-------------------+
| 7 | 4 | 3 | 2 | 8 | 9 | 10 | {1,2,3,4,5,6,8} |
+---------+---+---+---+----+----+----+-------------------+
| 8 | 5 | 3 | 2 | 10 | 9 | 10 | {1,2,3,4,5,6,8,9} |
+---------+---+---+---+----+----+----+-------------------+
Notice that S didn't grow at iteration 6, because the minimum candidate 6 had already been added previously. To avoid this problem of having to remember all of the previous elements, we amend our algorithm to increment all the counters whenever the corresponding multiples are equal to the minimum candidate. That brings us to the following Scala implementation.
import scala.annotation.tailrec
def hamming(n: Int): Seq[BigInt] = {
  // Advance index x until factor * xs(x) exceeds the last element of xs.
  @tailrec
  def next(x: Int, factor: Int, xs: IndexedSeq[BigInt]): Int = {
    val leq = factor * xs(x) <= xs.last
    if (leq) next(x + 1, factor, xs)
    else x
  }
  @tailrec
  def loop(i: Int, j: Int, k: Int, xs: IndexedSeq[BigInt]): IndexedSeq[BigInt] = {
    if (xs.size < n) {
      val a = next(i, 2, xs)
      val b = next(j, 3, xs)
      val c = next(k, 5, xs)
      val m = Seq(2 * xs(a), 3 * xs(b), 5 * xs(c)).min
      // Increment every counter whose multiple equals the chosen minimum.
      val x = a + (if (2 * xs(a) == m) 1 else 0)
      val y = b + (if (3 * xs(b) == m) 1 else 0)
      val z = c + (if (5 * xs(c) == m) 1 else 0)
      loop(x, y, z, xs :+ m)
    } else xs
  }
  loop(0, 0, 0, IndexedSeq(BigInt(1)))
}
Basically the search could be made O(n):
Consider that you keep a partial history of ugly numbers. Now, at each step you have to find the next one. It should be equal to a number from the history multiplied by 2, 3 or 5. Choose the smallest of them, add it to the history, and drop some numbers from it so that the smallest from the list multiplied by 5 would be larger than the largest.
It will be fast, because the search of the next number will be simple:
min(largest * 2, smallest * 5, one from the middle * 3),
that is larger than the largest number in the list. If they are scarce, the list will always contain few numbers, so the search for the number that has to be multiplied by 3 will be fast.
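To make the "partial history" idea concrete, here is a small Python sketch. It is deliberately naive: instead of picking the three candidates cleverly, it scans the whole window for the smallest multiple that exceeds the current largest number, so it illustrates the pruning rule rather than the O(n) bound. The function name nth_ugly_window is made up for the example.

```python
from collections import deque


def nth_ugly_window(n):
    """Return the n-th ugly number (1-based), keeping only a sliding
    window of recent ugly numbers as history."""
    history = deque([1])
    largest = 1
    for _ in range(n - 1):
        # The next ugly number is the smallest multiple (by 2, 3 or 5)
        # of a number in the history that is larger than the current largest.
        nxt = min(h * f for h in history for f in (2, 3, 5) if h * f > largest)
        history.append(nxt)
        largest = nxt
        # A number h with h * 5 <= largest can never produce a new
        # candidate again, so it is dropped from the history.
        while history[0] * 5 <= largest:
            history.popleft()
    return largest


if __name__ == "__main__":
    print(nth_ugly_window(1500))
```

The window always holds the ugly numbers in the range (largest/5, largest], which stays small compared with n.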
Here is a correct solution in ML. The function ugly() will return a stream (lazy list) of hamming numbers. The function nth can be used on this stream.
This uses the Sieve method, the next elements are only calculated when needed.
datatype stream = Item of int * (unit->stream);
fun cons (x,xs) = Item(x, xs);
fun head (Item(i,xf)) = i;
fun tail (Item(i,xf)) = xf();
fun maps f xs = cons(f (head xs), fn()=> maps f (tail xs));
fun nth(s,1)=head(s)
| nth(s,n)=nth(tail(s),n-1);
fun merge(xs,ys)=if (head xs=head ys) then
cons(head xs,fn()=>merge(tail xs,tail ys))
else if (head xs<head ys) then
cons(head xs,fn()=>merge(tail xs,ys))
else
cons(head ys,fn()=>merge(xs,tail ys));
fun double n=n*2;
fun triple n=n*3;
fun ij()=
cons(1,fn()=>
merge(maps double (ij()),maps triple (ij())));
fun quint n=n*5;
fun ugly()=
cons(1,fn()=>
merge((tail (ij())),maps quint (ugly())));
This was first year CS work :-)
To find the n-th ugly number in O (n^(2/3)), jonderry's algorithm will work just fine. Note that the numbers involved are huge so any algorithm trying to check whether a number is ugly or not has no chance.
Finding all of the n smallest ugly numbers in ascending order is done easily by using a priority queue in O (n log n) time and O (n) space: Create a priority queue of numbers with the smallest numbers first, initially including just the number 1. Then repeat n times: Remove the smallest number x from the priority queue. If x hasn't been removed before, then x is the next larger ugly number, and we add 2x, 3x and 5x to the priority queue. (If anyone doesn't know the term priority queue, it's like the heap in the heapsort algorithm). Here's the start of the algorithm:
1 -> 2 3 5
1 2 -> 3 4 5 6 10
1 2 3 -> 4 5 6 6 9 10 15
1 2 3 4 -> 5 6 6 8 9 10 12 15 20
1 2 3 4 5 -> 6 6 8 9 10 10 12 15 15 20 25
1 2 3 4 5 6 -> 6 8 9 10 10 12 12 15 15 18 20 25 30
1 2 3 4 5 6 -> 8 9 10 10 12 12 15 15 18 20 25 30
1 2 3 4 5 6 8 -> 9 10 10 12 12 15 15 16 18 20 24 25 30 40
Proof of execution time: We extract an ugly number from the queue n times. We initially have one element in the queue, and after extracting an ugly number we add three elements, increasing the number by 2. So after n ugly numbers are found we have at most 2n + 1 elements in the queue. Extracting an element can be done in logarithmic time. We extract more numbers than just the ugly numbers but at most n ugly numbers plus 2n - 1 other numbers (those that could have been in the sieve after n-1 steps). So the total time is less than 3n item removals in logarithmic time = O (n log n), and the total space is at most 2n + 1 elements = O (n).
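The priority-queue procedure above could be written roughly like this in Python, using the standard heapq module as the priority queue. The function name first_ugly_numbers is made up for the example, and duplicates are detected by comparing with the last number emitted, which works because the queue hands numbers out in ascending order.

```python
import heapq


def first_ugly_numbers(n):
    """Return the n smallest ugly numbers in ascending order."""
    heap = [1]
    result = []
    while len(result) < n:
        x = heapq.heappop(heap)
        if result and x == result[-1]:
            continue  # x was already removed once; skip the duplicate
        result.append(x)
        for factor in (2, 3, 5):
            heapq.heappush(heap, x * factor)
    return result


if __name__ == "__main__":
    print(first_ugly_numbers(11))
```

Every accepted number pushes three entries and pops one, which is where the 2n + 1 bound on the queue size comes from.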
I guess we can use Dynamic Programming (DP) and compute nth Ugly Number. Complete explanation can be found at http://www.geeksforgeeks.org/ugly-numbers/
#include <iostream>
#define MAX 1000
using namespace std;
// Find Minimum among three numbers
long int min(long int x, long int y, long int z) {
if(x<=y) {
if(x<=z) {
return x;
} else {
return z;
}
} else {
if(y<=z) {
return y;
} else {
return z;
}
}
}
// Actual Method that computes all Ugly Numbers till the required range
long int uglyNumber(int count) {
long int arr[MAX], val;
// index of last multiple of 2 --> i2
// index of last multiple of 3 --> i3
// index of last multiple of 5 --> i5
int i2, i3, i5, lastIndex;
arr[0] = 1;
i2 = i3 = i5 = 0;
lastIndex = 1;
while(lastIndex<=count-1) {
val = min(2*arr[i2], 3*arr[i3], 5*arr[i5]);
arr[lastIndex] = val;
lastIndex++;
if(val == 2*arr[i2]) {
i2++;
}
if(val == 3*arr[i3]) {
i3++;
}
if(val == 5*arr[i5]) {
i5++;
}
}
return arr[lastIndex-1];
}
// Starting point of program
int main() {
long int num;
int count;
cout<<"Which Ugly Number : ";
cin>>count;
num = uglyNumber(count);
cout<<endl<<num;
return 0;
}
We can see that it's quite fast; just change the value of MAX to compute higher ugly numbers.
Using 3 generators in parallel and selecting the smallest at each iteration, here is a C program to compute all ugly numbers below 2^128 in less than 1 second:
#include <limits.h>
#include <stdio.h>
#if 0
typedef unsigned long long ugly_t;
#define UGLY_MAX (~(ugly_t)0)
#else
typedef __uint128_t ugly_t;
#define UGLY_MAX (~(ugly_t)0)
#endif
int print_ugly(int i, ugly_t u) {
char buf[64], *p = buf + sizeof(buf);
*--p = '\0';
do { *--p = '0' + u % 10; } while ((u /= 10) != 0);
return printf("%d: %s\n", i, p);
}
int main() {
int i = 0, n2 = 0, n3 = 0, n5 = 0;
ugly_t u, ug2 = 1, ug3 = 1, ug5 = 1;
#define UGLY_COUNT 110000
ugly_t ugly[UGLY_COUNT];
while (i < UGLY_COUNT) {
u = ug2;
if (u > ug3) u = ug3;
if (u > ug5) u = ug5;
if (u == UGLY_MAX)
break;
ugly[i++] = u;
print_ugly(i, u);
if (u == ug2) {
if (ugly[n2] <= UGLY_MAX / 2)
ug2 = 2 * ugly[n2++];
else
ug2 = UGLY_MAX;
}
if (u == ug3) {
if (ugly[n3] <= UGLY_MAX / 3)
ug3 = 3 * ugly[n3++];
else
ug3 = UGLY_MAX;
}
if (u == ug5) {
if (ugly[n5] <= UGLY_MAX / 5)
ug5 = 5 * ugly[n5++];
else
ug5 = UGLY_MAX;
}
}
return 0;
}
Here are the last 10 lines of output:
100517: 338915443777200000000000000000000000000
100518: 339129266201729628114355465608000000000
100519: 339186548067800934969350553600000000000
100520: 339298130282929870605468750000000000000
100521: 339467078447341918945312500000000000000
100522: 339569540691046437734055936000000000000
100523: 339738624000000000000000000000000000000
100524: 339952965770562084651663360000000000000
100525: 340010386766614455386112000000000000000
100526: 340122240000000000000000000000000000000
Here is a version in Javascript usable with QuickJS:
import * as std from "std";
function main() {
var i = 0, n2 = 0, n3 = 0, n5 = 0;
var u, ug2 = 1n, ug3 = 1n, ug5 = 1n;
var ugly = [];
for (;;) {
u = ug2;
if (u > ug3) u = ug3;
if (u > ug5) u = ug5;
ugly[i++] = u;
std.printf("%d: %s\n", i, String(u));
if (u >= 0x100000000000000000000000000000000n)
break;
if (u == ug2)
ug2 = 2n * ugly[n2++];
if (u == ug3)
ug3 = 3n * ugly[n3++];
if (u == ug5)
ug5 = 5n * ugly[n5++];
}
return 0;
}
main();
Here is my code. The idea is to divide the number by 2 for as long as the remainder is 0, then do the same with 3 and 5. If at the end the number has become 1, it's an ugly number.
You can count and even print all ugly numbers up to n.
int count = 0;
for (int i = 2; i <= n; i++) {
int temp = i;
while (temp % 2 == 0) temp=temp / 2;
while (temp % 3 == 0) temp=temp / 3;
while (temp % 5 == 0) temp=temp / 5;
if (temp == 1) {
cout << i << endl;
count++;
}
}
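This problem can be done in O(1), if we count every number divisible by 2, 3, or 5 as ugly (a looser reading than the question's definition, since it also admits numbers such as 14).
If we remove 1 and look at the numbers from 2 through 30, we will notice that 22 of them are divisible by 2, 3, or 5.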
Now, for any number x in the 22 numbers above, there will be a number x + 30 in between 31 and 60 that is also ugly. Thus, we can find at least 22 numbers between 31 and 60. Now for every ugly number between 31 and 60, we can write it as s + 30. So s will be ugly too, since s + 30 is divisible by 2, 3, or 5. Thus, there will be exactly 22 numbers between 31 and 60. This logic can be repeated for every block of 30 numbers after that.
Thus, there will be 23 numbers in the first 30 numbers, and 22 for every 30 after that. That is, the first 23 uglies will occur between 1 and 30, 45 uglies will occur between 1 and 60, 67 uglies will occur between 1 and 90, etc.
Now, if I am given n, say 137, I can see that 137/22 = 6.22. The answer will lie between 6*30 and 7*30 or between 180 and 210. By 180, I will have 6*22 + 1 = 133rd ugly number at 180. I will have 154th ugly number at 210. So I am looking for 4th ugly number (since 137 = 133 + 4)in the interval [2, 30], which is 5. The 137th ugly number is then 180 + 5 = 185.
Another example: if I want the 1500th ugly number, I count 1500/22 = 68 blocks. Thus, I will have 22*68 + 1 = 1497th ugly at 30*68 = 2040. The next three uglies in the [2, 30] block are 2, 3, and 4. So our required ugly is at 2040 + 4 = 2044.
The point is that I can simply build a list of ugly numbers in [2, 30] and find the answer by doing lookups in O(1).
Here is another O(n) approach (Python solution) based on the idea of merging three sorted lists. The challenge is to find the next ugly number in increasing order. For example, we know the first seven ugly numbers are [1,2,3,4,5,6,8]. The ugly numbers are actually from the following three lists:
list 1: 1*2, 2*2, 3*2, 4*2, 5*2, 6*2, 8*2 ... ( multiply each ugly number by 2 )
list 2: 1*3, 2*3, 3*3, 4*3, 5*3, 6*3, 8*3 ... ( multiply each ugly number by 3 )
list 3: 1*5, 2*5, 3*5, 4*5, 5*5, 6*5, 8*5 ... ( multiply each ugly number by 5 )
So the nth ugly number is the nth number of the list merged from the three lists above:
1, 1*2, 1*3, 2*2, 1*5, 2*3 ...
def nthuglynumber(n):
    p2, p3, p5 = 0,0,0
    uglynumber = [1]
    while len(uglynumber) < n:
        ugly2, ugly3, ugly5 = uglynumber[p2]*2, uglynumber[p3]*3, uglynumber[p5]*5
        next = min(ugly2, ugly3, ugly5)
        if next == ugly2: p2 += 1 # multiply each number
        if next == ugly3: p3 += 1 # only once by each
        if next == ugly5: p5 += 1 # of the three factors
        uglynumber += [next]
    return uglynumber[-1]
STEP I: computing three next possible ugly numbers from the three lists
ugly2, ugly3, ugly5 = uglynumber[p2]*2, uglynumber[p3]*3, uglynumber[p5]*5
STEP II, find the one next ugly number as the smallest of the three above:
next = min(ugly2, ugly3, ugly5)
STEP III: moving the pointer forward if its ugly number was the next ugly number
if next == ugly2: p2+=1
if next == ugly3: p3+=1
if next == ugly5: p5+=1
note: not using if with elif nor else
STEP IV: adding the next ugly number into the merged list uglynumber
uglynumber += [next]

Resources