Iterating shuffled [0..n) without arrays

I know of a couple of routines that work as follows:
Xn+1 = Routine(Xn, max)
For example, something like a LCG generator:
Xn+1 = (a*Xn + c) mod m
There isn't enough parameterization in this generator to generate every sequence.
Dream Function:
Xn+1 = Routine(Xn, max, permutation number)
This routine, parameterized by an index into the set of all permutations, would return the next number in the sequence. The sequence may be arbitrarily large (so storing the array and using factoradic numbers is impractical).
Failing that, does anyone have pointers to similar functions that are either stateless or have a constant amount of state for arbitrary 'max', such that they will iterate a shuffled list?

There are n! permutations of n elements. Storing which one you're using requires at least log(n!) / log(2) bits. By Stirling's approximation, this takes roughly n log(n) / log(2) bits.
Explicitly storing one index takes log(n) / log(2) bits. Storing all n, as in an array of indices, takes n times as many, or again n log(n) / log(2). Information-theoretically, there is no better way than explicitly storing the permutation.
In other words, the index you pass in to say which permutation in the set you want takes the same asymptotic storage space as just writing out the permutation. If, for example, you limit the index of the permutation to 32-bit values, you can only handle permutations of up to 12 elements. 64-bit indices only get you up to 20 elements.
As the index takes the same space as the permutation would, either change your representation to just use the permutation directly, or accept unpacking into an array of size N.

From my response to another question:
It is actually possible to do this in space proportional to the number of elements selected, rather than the size of the set you're selecting from, regardless of what proportion of the total set you're selecting. You do this by generating a random permutation, then selecting from it like this:
Pick a block cipher, such as TEA or XTEA. Use XOR folding to reduce the block size to the smallest power of two larger than the set you're selecting from. Use the random seed as the key to the cipher. To generate an element n in the permutation, encrypt n with the cipher. If the output number is not in your set, encrypt that. Repeat until the number is inside the set. On average you will have to do less than two encryptions per generated number. This has the added benefit that if your seed is cryptographically secure, so is your entire permutation.
I wrote about this in much more detail here.
Of course, there's no guarantee that every permutation can be generated (and depending on your block size and key size, that may not even be possible), but the permutations you can get are highly random (if they weren't, it wouldn't be a good cipher), and you can have as many of them as you want.

If you want something that takes up less stack space, look into an iterative version rather than a recursive one. You can also use a data structure like a TreeMap, store it on disk, and read it on an as-needed basis.
X(n+1) = Routine(Xn, max, permutation number)
for(i = n; i > 0; i--)
{
int temp = Map.lookup(i)
otherfun(temp,max,perm)
}

Is it possible to index a set of permutations without previously computing and storing the whole thing in memory? I tried something like this before and didn't find a solution - I think it is impossible (in the mathematical sense).
Disclaimer: I may have misunderstood your question...

Code that uses an iterator-style interface. Time complexity is O(n^2). Space complexity has an overhead of: a copy of n (log n bits), an iteration variable (log n bits), keeping track of n-i (log n bits), a copy of the current value (log n bits), a copy of p (n log n bits), creation of the next value (log n bits), and a bit set of used values (n bits). You can't avoid an overhead of n log n bits. Timewise, this is also O(n^2), for setting the bits. This can be reduced a bit, but at the cost of using a decorated tree to store the used values.
This can be altered to use arbitrary precision integers and bit sets by using calls to the appropriate libraries instead, and the above bounds will actually start to kick in, rather than being capped at N=8, portably (an int can be the same as a short, and as small as 16 bits). 9! = 362880 > 65536 = 2^16
#include <math.h>
#include <stdio.h>
typedef signed char index_t;
typedef unsigned int permutation;
static index_t permutation_next(index_t n, permutation p, index_t value)
{
permutation used = 0;
for (index_t i = 0; i < n; ++i) {
index_t left = n - i;
index_t digit = p % left;
p /= left;
for (index_t j = 0; j <= digit; ++j) {
if (used & (1 << j)) {
digit++;
}
}
used |= (1 << digit);
if (value == -1) {
return digit;
}
if (value == digit) {
value = -1;
}
}
/* value not found */
return -1;
}
static void dump_permutation(index_t n, permutation p)
{
index_t value = -1;
fputs("[", stdout);
value = permutation_next(n, p, value);
while (value != -1) {
printf("%d", value);
value = permutation_next(n, p, value);
if (value != -1) {
fputs(", ", stdout);
}
}
puts("]");
}
static int factorial(int n)
{
int prod = 1;
for (int i = 1; i <= n; ++i) {
prod *= i;
}
return prod;
}
int main(int argc, char **argv)
{
const index_t n = 4;
const permutation max = factorial(n);
for (permutation p = 0; p < max; ++p) {
dump_permutation(n, p);
}
}

Code that unpacks a permutation index into an array, with a certain mapping from index to permutation. There are loads of others, but this one is convenient.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
typedef unsigned char index_t;
typedef unsigned int permutation;
static void permutation_to_array(index_t *indices, index_t n, permutation p)
{
permutation used = 0; /* bit set of used values; wider than index_t so it still works for n > 8 */
for (index_t i = 0; i < n; ++i) {
index_t left = n - i;
index_t digit = p % left;
for (index_t j = 0; j <= digit; ++j) {
if (used & (1 << j)) {
digit++;
}
}
used |= (1 << digit);
indices[i] = digit;
p /= left;
}
}
static void dump_array(index_t *indices, index_t n)
{
fputs("[", stdout);
for (index_t i = 0; i < n; ++i) {
printf("%d", indices[i]);
if (i != n - 1) {
fputs(", ", stdout);
}
}
puts("]");
}
static int factorial(int n)
{
int prod = 1;
for (int i = 1; i <= n; ++i) {
prod *= i;
}
return prod;
}
int main(int argc, char **argv)
{
const index_t n = 4;
const permutation max = factorial(n);
index_t *indices = malloc(n * sizeof (*indices));
for (permutation p = 0; p < max; ++p) {
permutation_to_array(indices, n, p);
dump_array(indices, n);
}
free(indices);
}

Related

Complexity of backtracking algorithm

I tried to solve this problem using backtracking but I am not sure about the complexity of the algorithm (and if the algorithm is correct) and what would be an algorithm with a better complexity.
Given 2 positive integers n and m, we call legal a sequence of integers if:
the length of the sequence is n
the elements in the sequence are between 1 and m
the element in position i of the sequence, 1 < i <= n, is a divisor of the element in position i-1
Count the number of legal sequences. Expected complexity of the algorithm is O(m² + nm)
This is my algorithm in c:
// n length of the sequence
// m maximum valid number
// l number of remaining positions in the sequence
// p previous number in the sequence
int legal(int n, int m, int l, int p) {
if (l == 0)
return 1;
int q=0;
for (int i=1; i <= m;i++) {
if (p%i == 0 || l == n)
q += legal(n,m,l-1,i);
}
return q;
}
int main() {
int n, m;
scanf("%d", &n);
scanf("%d", &m);
printf("%d\n", legal(n,m,n,0));
}
I think the complexity of my algorithm is O(nmS(n)) with S(n) = the number of legal sequences

You are correct that your program's running time is proportional to the size of the solution space. For this type of problem, that is sub-optimal for large input (say n = m = 100), because the solution space grows exponentially in relation to m and n. Here is a solution that uses memoization to avoid re-computations:
#include <cstdio>
#define LIMIT 101
#define DIRTY -1
long long cache[LIMIT][LIMIT];
void clear_cache() {
for (int i = 0; i < LIMIT; i++) {
for (int j = 0; j < LIMIT; j++) {
// marked all entries in cache as dirty
cache[i][j] = DIRTY;
}
}
}
long long legal_seqs(int curr_len, int prev_num, int seq_len, int max_num) {
// base case
if (curr_len == seq_len) return 1;
// if we haven't seen this sub-problem, compute it!
// this is called memoization
if (cache[curr_len][prev_num] == DIRTY) {
long long ways = 0;
// try every divisor of prev_num (prev_num == 0 means "no previous element", so every value is allowed)
for (int next_num = 1; next_num <= max_num; next_num++) {
if (prev_num % next_num == 0) {
ways += legal_seqs(curr_len + 1, next_num, seq_len, max_num);
}
}
cache[curr_len][prev_num] = ways;
}
return cache[curr_len][prev_num];
}
int main() {
int n, m;
scanf("%d%d", &n, &m);
clear_cache();
printf("%lld\n", legal_seqs(0, 0, n, m));
}
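The code above fills at most n * m cache entries and spends O(m) on each of them, so it runs in O(n * m²) time. That is polynomial, although still above the expected O(m² + nm).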

How to print values in memoization method - Dynamic programming

I know that a problem that can be solved using DP can be solved by either the tabulation (bottom-up) approach or the memoization (top-down) approach. Personally I find memoization the easier and even efficient approach (analysis is required just to get the recursive formula; once the recursive formula is obtained, a brute-force recursive method can easily be converted to store each sub-problem's result and reuse it). The only problem that I am facing with this approach is that I am not able to construct the actual result from the table which I filled on demand.
For example, in the Matrix Product Parenthesization problem (deciding in which order to perform the multiplications on matrices so that the cost of multiplication is minimal) I am able to calculate the minimum cost but not able to generate the order in the algorithm.
For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Here I am able to get the min-cost, 4500, but unable to get the order, which is (AB)C.
I used this. Suppose F[i, j] represents least number of multiplication needed to multiply Ai.....Aj and an array p[] is given which represents the chain of matrices such that the ith matrix Ai is of dimension p[i-1] x p[i]. So
\[
F[i,j] =
\begin{cases}
0 & \text{if } i = j \\
\min_{i \le k < j} \left( F[i,k] + F[k+1,j] + p_{i-1} \, p_k \, p_j \right) & \text{if } i < j
\end{cases}
\]
Below is the implementation that i have created.
#include<stdio.h>
#include<limits.h>
#include<string.h>
#define MAX 5 /* indices 1..4 are used for the 4 matrices, so the table needs 5 rows and columns */
int lookup[MAX][MAX];
int MatrixChainOrder(int p[], int i, int j)
{
if(i==j) return 0;
int min = INT_MAX;
int k, count;
if(lookup[i][j]==0){
// recursively calculate count of multiplcations and return the minimum count
for (k = i; k<j; k++) {
int gmin=0;
if(lookup[i][k]==0)
lookup[i][k]=MatrixChainOrder(p, i, k);
if(lookup[k+1][j]==0)
lookup[k+1][j]=MatrixChainOrder(p, k+1, j);
count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < min){
min = count;
printf("\n****%d ",k); // i think something has be done here to represent the correct answer ((AB)C)D where first mat is represented by A second by B and so on.
}
}
lookup[i][j] = min;
}
return lookup[i][j];
}
// Driver program to test above function
int main()
{
int arr[] = {2,3,6,4,5};
int n = sizeof(arr)/sizeof(arr[0]);
memset(lookup, 0, sizeof(lookup));
int width =10;
printf("Minimum number of multiplications is %d ", MatrixChainOrder(arr, 1, n-1));
printf("\n ---->");
for(int l=0;l<MAX;++l)
printf(" %*d ",width,l);
printf("\n");
for(int z=0;z<MAX;z++){
printf("\n %d--->",z);
for(int x=0;x<MAX;x++)
printf(" %*d ",width,lookup[z][x]);
}
return 0;
}
I know that with the tabulation approach printing the solution is much easier, but I want to do it with the memoization technique.
Thanks.

Your code correctly computes the minimum number of multiplications, but you're struggling to display the optimal chain of matrix multiplications.
There are two possibilities:
When you compute the table, you can store the best index found in another memoization array.
You can recompute the optimal splitting points from the results in the memoization array.
The first would involve creating the split points in a separate array:
int lookup_splits[MAX][MAX];
And then updating it inside your MatrixChainOrder function:
...
if (count < min) {
min = count;
lookup_splits[i][j] = k;
}
You can then generate the multiplication chain recursively like this:
void print_mult_chain(int i, int j) {
if (i == j) {
putchar('A' + i - 1);
return;
}
putchar('(');
print_mult_chain(i, lookup_splits[i][j]);
print_mult_chain(lookup_splits[i][j] + 1, j);
putchar(')');
}
You can call the function with print_mult_chain(1, n - 1) from main.
The second possibility is that you don't cache lookup_splits and recompute it as necessary.
int get_lookup_splits(int p[], int i, int j) {
int best = INT_MAX;
int k_best = i; /* initialized so the function never returns an indeterminate value */
for (int k = i; k < j; k++) {
int count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < best) {
best = count;
k_best = k;
}
}
return k_best;
}
This is essentially the same computation you did inside MatrixChainOrder, so if you go with this solution you should factor the code appropriately to avoid having two copies.
With this function, you can adapt print_mult_chain above to use it rather than the lookup_splits array. (You'll need to pass the p array in).
[None of this code is tested, so you may need to edit the answer to fix bugs].

Speed of two algorithms rotating a sequence. (from the book Programming Pearls)

In Column 2 of the book Programming Pearls there is a problem asking you to design an algorithm to rotate a string k positions to the left. For example, the string is "12345" and k=2, then the result is "34512".
The first algorithm is to simulate the exchanging process, i.e. put x[(i + k) % n] into x[i], and repeat until finishing.
The second algorithm uses the observation that we only need to exchange a="12" and b="345", i.e. the first k characters and the last n - k characters. We could first reverse a to a'="21" and b to b'="543", then reverse (a'b')' to get ba, which is the desired result.
Following is my code:
Algorithm 1:
#define NEXT(j) ((j + k) % n)
#define PREV(j) ((j + n - k) % n)
#include "stdio.h"
#include "stdlib.h"
int gcd(int a, int b) {
return (a % b == 0 ? b : gcd(b, a % b));
}
void solve(int *a, int n, int k) {
int len = gcd(n, k);
for (int i = 0; i < len; i++) {
int x = a[i];
int j = i;
do {
a[j] = a[NEXT(j)];
j = NEXT(j);
} while (j != i);
a[PREV(j)] = x;
}
}
int main(int argc, char const *argv[])
{
int n, k;
scanf("%d %d", &n, &k);
int *a = malloc(sizeof(int) * n);
for (int i = 0; i < n; i++) a[i] = i;
solve(a, n, k);
free(a);
return 0;
}
Algorithm 2:
#include "stdio.h"
#include "stdlib.h"
void swap(int *a, int *b) {
int t = *a;
*a = *b;
*b = t;
}
void reverse(int *a, int n) {
int m = n / 2;
for (int i = 0; i < m; i++) {
swap(a + i, a + (n - 1 - i));
}
}
void solve(int *a, int n, int k) {
reverse(a, k);
reverse(a + k, n - k);
reverse(a, n);
}
int main(int argc, char const *argv[])
{
int n, k;
scanf("%d %d", &n, &k);
int *a = malloc(sizeof(int) * n);
for (int i = 0; i < n; i++) a[i] = i;
solve(a, n, k);
free(a);
return 0;
}
where n is the length of the string, and k is the length to rotate.
I use n=232830359 and k=80829 to test the two algorithms. The result is, algorithm 1 takes 6.199s while algorithm 2 takes 1.970s.
However, I think the two algorithms both need to compute n exchanges. (Algorithm 1 is obvious, algorithm 2 takes k/2 + (n-k)/2 + n/2 = n exchanges).
My question is, why do their speeds differ so much?

Both of these algorithms are memory bound rather than CPU bound. That is why counting basic operations (like swaps or loop iterations) gives results that are quite different from the real running time. So we will use the external memory model instead of the RAM model. That is, we will analyze the number of cache misses. Let's assume that N is the array size, M is the number of blocks in the cache and B is the size of one block. Since N is big in your test, it is safe to assume that N > M * B (that is, the whole array cannot fit in the cache).
1) The first algorithm accesses array elements in the following order: i, (i + k) mod N, (i + 2 * k) mod N and so on. If k is large, then two consecutively accessed elements are not in the same block. So in the worst case two accesses yield two cache misses.
These two blocks will be loaded into the cache, but they might not be used for a long time after that! So by the time they are accessed again, they might already have been replaced by other blocks (because the cache is smaller than the array), and it will be a miss again. It can be shown that this algorithm can have O(N) cache misses in the worst case.
2) The second algorithm has a very different array access pattern: l, r, l + 1, r - 1, ....
If accessing the l-th element causes a miss, the entire block containing it is loaded into the cache, so accesses to l + 1, l + 2, ... up to the end of the block will not cause any misses. The same is true for r, r - 1 and so on (this is actually true only if the l and r blocks can be held in the cache at the same time, but this is a safe assumption because caches are usually not direct mapped). So this algorithm has O(N / B) cache misses in the worst case.
Taking into account that the block size of a real cache is larger than the size of one integer, it becomes clear why the second algorithm is significantly faster.
P.S. This is just a model of what's really going on, but in this particular case the external memory model works better than the RAM model (and the RAM model is just a model too, anyway).

Perfect minimal hash for mathematical combinations

First, define two integers N and K, where N >= K, both known at compile time. For example: N = 8 and K = 3.
Next, define a set of integers [0, N) (or [1, N] if that makes the answer simpler) and call it S. For example: {0, 1, 2, 3, 4, 5, 6, 7}
The number of subsets of S with K elements is given by the formula C(N, K).
My problem is this: Create a perfect minimal hash for those subsets. The size of the example hash table will be C(8, 3) or 56.
I don't care about ordering, only that there be 56 entries in the hash table, and that I can determine the hash quickly from a set of K integers. I also don't care about reversibility.
Example hash: hash({5, 2, 3}) = 42. (The number 42 isn't important, at least not here)
Is there a generic algorithm for this that will work with any values of N and K? I wasn't able to find one by searching Google, or my own naive efforts.

There is an algorithm to code and decode a combination into its number in the lexicographical order of all combinations with a given fixed K. The algorithm is linear in N for both coding and decoding the combination. What language are you interested in?
EDIT: here is example code in C++ (it finds the lexicographical number of a combination in the sequence of all combinations of n elements, as opposed to the ones with k elements, but it is a really good starting point):
typedef long long ll;
// Returns the number in the lexicographical order of all combinations of n numbers
// of the provided combination.
ll code(vector<int> a,int n)
{
sort(a.begin(),a.end());
int cur = 0;
int m = a.size();
ll res =0;
for(int i=0;i<a.size();i++)
{
if(a[i] == cur+1)
{
res++;
cur = a[i];
continue;
}
else
{
res++;
int number_of_greater_nums = n - a[i];
for(int j = a[i]-1,increment=1;j>cur;j--,increment++)
res += 1LL << (number_of_greater_nums+increment);
cur = a[i];
}
}
return res;
}
// Takes the lexicographical code of a combination of n numbers and returns the
// combination
vector<int> decode(ll kod, int n)
{
vector<int> res;
int cur = 0;
int left = n; // Out of how many numbers are we left to choose.
while(kod)
{
ll all = 1LL << left;// how many are the total combinations
for(int i=n;i>=0;i--)
{
if(all - (1LL << (n-i+1)) +1 <= kod)
{
res.push_back(i);
left = n-i;
kod -= all - (1LL << (n-i+1)) +1;
break;
}
}
}
return res;
}
I am sorry I do not have an algorithm for the exact problem you are asking about right now, but I believe it will be a good exercise to try to understand what I do above. Truth is this is one of the algorithms I teach in the course "Design and analysis of algorithms" and that is why I had it pre-written.

This is what you (and I) need:
hash() maps k-tuples from [1..n] onto the set 1..C(n,k)\subset N.
The effort is k subtractions (and O(k) is a lower bound anyway, see Strandjev's remark above):
// bino[n][k] is (n "over" k) = C(n,k) = {n \choose k}
// these are assumed to be precomputed globals
int hash(V a,int n, int k) {// V is assumed to be ordered, a_k<...<a_1
// hash(a_k,..,a_2,a_1) = (n k) - sum_(i=1)^k (n-a_i i)
// ii is "inverse i", runs from left to right
int res = bino[n][k];
int i;
for(unsigned int ii = 0; ii < a.size(); ++ii) {
i = a.size() - ii;
res = res - bino[n-a[ii]][i];
}
return res;
}

Permutation with repetition without allocate memory

I'm looking for an algorithm to generate all permutations with repetition of 4 elements in a list (length 2-1000).
Java implementation
The problem is that the algorithm from the link above allocates too much memory for the calculation. It creates an array whose length is the number of all possible combinations, e.g. 4^1000 for my example, so I get a heap space exception.
Thank you

Generalized algorithm for lazily-evaluated generation of all permutations (with repetition) of length X for a set of choices Y:
for I = 0 to (Y^X - 1):
list_of_digits = calculate the digits of I in base Y
a_set_of_choices = possible_choices[D] for each digit D in list_of_digits
yield a_set_of_choices

If there is no length limit on the repetitions of your 4 symbols, there is a very simple algorithm that will give you what you want. Just encode your string as a binary number where every 2-bit pattern encodes one of the four symbols. To get all possible permutations with repetition you just have to enumerate ("count") all possible numbers. That can take quite long (more than the age of the universe), as 1000 symbols will be 2000 bits long. Is it really what you want to do? The heap overflow may not be the only limit...
Below is a trivial C implementation that enumerates all repetitions of length exactly n (n limited to 16000 with 32 bits unsigned) without allocating memory. I leave to the reader the exercise of enumerating all repetitions of at most length n.
#include <stdio.h>
typedef unsigned char cell;
cell a[1000];
int npack = sizeof(cell)*4;
void decode(cell * a, int nbsym)
{
unsigned i;
for (i=0; i < nbsym; i++){
printf("%c", "GATC"[a[i/npack]>>((i%npack)*2)&3]);
}
printf("\n");
}
void enumerate(cell * a, int nbsym)
{
unsigned i, j = 0; /* j must be initialized: the outer while condition reads it before the first assignment */
for (i = 0; i < 1000; i++){
a[i] = 0;
}
while (j <= (nbsym / npack)){
j = 0;
decode(a, nbsym);
while (!++a[j]){
j++;
}
if ((j == (nbsym / npack))
&& ((a[j] >> ((nbsym-1)%npack)*2)&4)){
break;
}
}
}
int main(){
enumerate(a, 5);
}

You know how to count: add 1 to the ones spot, if you go over 9 jump back to 0 and add 1 to the tens, etc..
So, if you have a list of length N with K items in each spot:
int[] permutations = new int[N];
boolean addOne() { // Returns true when it advances, false _once_ when finished
int i = 0;
permutations[i]++;
while (permutations[i] >= K) {
permutations[i] = 0;
i += 1;
if (i>=N) return false;
permutations[i]++;
}
return true;
}
