Number of all longest increasing subsequences - algorithm

I'm practicing algorithms and one of my tasks is to count the number of all longest increasing sub-sequences for given 0 < n <= 10^6 numbers. Solution O(n^2) is not an option.
I have already implemented finding a LIS and its length (LIS Algorithm), but this algorithm switches numbers to the lowest possible. Therefore, it's impossible to determine if sub-sequences with a previous number (the bigger one) would be able to achieve the longest length, otherwise I could just count those switches, I guess.
Any ideas how to get this in about O(n log n)? I know that it should be solved using dynamic programming.
I implemented one solution and it works well, but it requires two nested loops (i in 1..n) x (j in 1..i-1).
So it's O(n^2), I think; nevertheless it's too slow.
I even tried to move the numbers from the array into a binary tree (because in each iteration i I look for all numbers smaller than number[i], going through elements i-1..1), but it was even slower.
Example tests:
1 3 2 2 4
result: 3 (1,3,4 | 1,2,4 | 1,2,4)
3 2 1
result: 3 (1 | 2 | 3)
16 5 8 6 1 10 5 2 15 3 2 4 1
result: 3 (5,8,10,15 | 5,6,10,15 | 1,2,3,4)
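For reference, the straightforward O(n^2) dynamic programming described in the question (two nested loops over i and j) can be sketched like this in Python; it is far too slow for n up to 10^6, but it is handy for validating faster solutions against the tests above:

```python
def count_lis_quadratic(a):
    """O(n^2) DP: length[i] and count[i] describe the LIS ending at index i."""
    n = len(a)
    if n == 0:
        return 0
    length = [1] * n  # length of the longest increasing subsequence ending at i
    count = [1] * n   # how many such subsequences end at i
    for i in range(n):
        for j in range(i):
            if a[j] < a[i]:
                if length[j] + 1 > length[i]:
                    length[i] = length[j] + 1
                    count[i] = count[j]
                elif length[j] + 1 == length[i]:
                    count[i] += count[j]
    best = max(length)
    return sum(c for l, c in zip(length, count) if l == best)
```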

Finding the number of all longest increasing subsequences
Full Java code of the improved LIS algorithm, which discovers not only the length of the longest increasing subsequence but also the number of subsequences of that length, is below. I prefer generics to allow not only integers but any comparable type.
@Test
public void testLisNumberAndLength() {
List<Integer> input = Arrays.asList(16, 5, 8, 6, 1, 10, 5, 2, 15, 3, 2, 4, 1);
int[] result = lisNumberAndLength(input);
System.out.println(String.format(
"This sequence has %s longest increasing subsequences of length %s",
result[0], result[1]
));
}
/**
* Body of improved LIS algorithm
*/
public <T extends Comparable<T>> int[] lisNumberAndLength(List<T> input) {
if (input.size() == 0)
return new int[] {0, 0};
List<List<Sub<T>>> subs = new ArrayList<>();
List<Sub<T>> tails = new ArrayList<>();
for (T e : input) {
int pos = search(tails, new Sub<>(e, 0), false); // row for a new sub to be placed
int sum = 1;
if (pos > 0) {
List<Sub<T>> pRow = subs.get(pos - 1); // previous row
int index = search(pRow, new Sub<T>(e, 0), true); // index of most left element that <= e
if (pRow.get(index).value.compareTo(e) < 0) {
index--;
}
sum = pRow.get(pRow.size() - 1).sum; // sum of tail element in previous row
if (index >= 0) {
sum -= pRow.get(index).sum;
}
}
if (pos >= subs.size()) { // add a new row
List<Sub<T>> row = new ArrayList<>();
row.add(new Sub<>(e, sum));
subs.add(row);
tails.add(new Sub<>(e, 0));
} else { // add sub to existing row
List<Sub<T>> row = subs.get(pos);
Sub<T> tail = row.get(row.size() - 1);
if (tail.value.equals(e)) {
tail.sum += sum;
} else {
row.add(new Sub<>(e, tail.sum + sum));
tails.set(pos, new Sub<>(e, 0));
}
}
}
List<Sub<T>> lastRow = subs.get(subs.size() - 1);
Sub<T> last = lastRow.get(lastRow.size() - 1);
return new int[]{last.sum, subs.size()};
}
/**
* Implementation of binary search in a sorted list
*/
public <T> int search(List<? extends Comparable<T>> a, T v, boolean reversed) {
if (a.size() == 0)
return 0;
int sign = reversed ? -1 : 1;
int right = a.size() - 1;
Comparable<T> vRight = a.get(right);
if (vRight.compareTo(v) * sign < 0)
return right + 1;
int left = 0;
int pos = 0;
Comparable<T> vPos;
Comparable<T> vLeft = a.get(left);
for(;;) {
if (right - left <= 1) {
if (vRight.compareTo(v) * sign >= 0 && vLeft.compareTo(v) * sign < 0)
return right;
else
return left;
}
pos = (left + right) >>> 1;
vPos = a.get(pos);
if (vPos.equals(v)) {
return pos;
} else if (vPos.compareTo(v) * sign > 0) {
right = pos;
vRight = vPos;
} else {
left = pos;
vLeft = vPos;
}
}
}
/**
* Class for 'sub' pairs
*/
public static class Sub<T extends Comparable<T>> implements Comparable<Sub<T>> {
T value;
int sum;
public Sub(T value, int sum) {
this.value = value;
this.sum = sum;
}
@Override public String toString() {
return String.format("(%s, %s)", value, sum);
}
@Override public int compareTo(Sub<T> another) {
return this.value.compareTo(another.value);
}
}
Explanation
As my explanation seems to be long, I will call the initial sequence "seq" and any of its subsequences a "sub". So the task is to calculate the count of the longest increasing subs that can be obtained from the seq.
As I mentioned before, the idea is to keep the counts of all possible longest subs obtained in previous steps. So let's create a numbered list of rows, where the number of each row equals the length of the subs stored in it. And let's store subs as pairs of numbers (v, c), where "v" is the "value" of the ending element and "c" is the "count" of subs of the given length that end with "v". For example:
1: (16, 1) // that means that so far we have 1 sub of length 1 which ends by 16.
We will build this list step by step, taking elements from the initial sequence in order. At every step we will try to add the element to the longest sub it can extend and record the changes.
Building a list
Let's build the list using sequence from your example, since it has all possible options:
16 5 8 6 1 10 5 2 15 3 2 4 1
First, take element 16. Our list is empty so far, so we just put one pair in it:
1: (16, 1) <= one sub that ends by 16
Next is 5. It cannot be added to a sub that ends by 16, so it will create new sub with length of 1. We create a pair (5, 1) and put it into line 1:
1: (16, 1)(5, 1)
Element 8 comes next. It cannot create the sub [16, 8] of length 2, but it can create the sub [5, 8]. This is where the algorithm comes in. First, we iterate the list rows, looking at the "values" of the last pair in each. If our element is greater than the values of all last elements in all rows, then we can add it to existing sub(s), increasing their length by one. So value 8 will create a new row of the list, because it is greater than the values of all last elements existing in the list so far (i.e. > 5):
1: (16, 1)(5, 1)
2: (8, ?) <=== need to resolve how many longest subs ending by 8 can be obtained
Element 8 can continue 5, but cannot continue 16. So we need to search through previous row, starting from its end, calculating the sum of "counts" in pairs which "value" is less than 8:
(16, 1)(5, 1)^ // sum = 0
(16, 1)^(5, 1) // sum = 1
^(16, 1)(5, 1) // value 16 >= 8: stop. count = sum = 1, so write 1 in pair next to 8
1: (16, 1)(5, 1)
2: (8, 1) <=== so far we have 1 sub of length 2 which ends by 8.
Why don't we store value 8 in the subs of length 1 (the first row)? Because we need subs of the maximum possible length, and 8 can continue some previous subs. So every next number greater than 8 will also continue such a sub, and there is no need to keep 8 as a sub of length less than it can be.
Next: 6. Searching the rows by their last "values":
1: (16, 1)(5, 1) <=== 5 < 6, go next
2: (8, 1)
1: (16, 1)(5, 1)
2: (8, 1 ) <=== 8 >= 6, so 6 should be put here
Found the room for 6, need to calculate a count:
take previous line
(16, 1)(5, 1)^ // sum = 0
(16, 1)^(5, 1) // 5 < 6: sum = 1
^(16, 1)(5, 1) // 16 >= 6: stop, write count = sum = 1
1: (16, 1)(5, 1)
2: (8, 1)(6, 1)
After processing 1:
1: (16, 1)(5, 1)(1, 1) <===
2: (8, 1)(6, 1)
After processing 10:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)
3: (10, 2) <=== count is 2 because both "values" 8 and 6 from previous row are less than 10, so we summarized their "counts": 1 + 1
After processing 5:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1) <===
3: (10, 2)
After processing 2:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 1) <===
3: (10, 2)
After processing 15:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 1)
3: (10, 2)
4: (15, 2) <===
After processing 3:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 1)
3: (10, 2)(3, 1) <===
4: (15, 2)
After processing 2:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 2) <===
3: (10, 2)(3, 1)
4: (15, 2)
If, when searching the rows by their last elements, we find an equal element, we calculate its "count" again based on the previous row and add it to the existing "count".
After processing 4:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 2)
3: (10, 2)(3, 1)
4: (15, 2)(4, 1) <===
After processing 1:
1: (16, 1)(5, 1)(1, 2) <===
2: (8, 1)(6, 1)(5, 1)(2, 2)
3: (10, 2)(3, 1)
4: (15, 2)(4, 1)
So what do we have after processing all initial sequence? Looking at the last row, we see that we have 3 longest subs, each consist of 4 elements: 2 end by 15 and 1 ends by 4.
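The row-building procedure walked through above can be simulated directly. Here is a sketch in Python (rather than the answer's Java) with a plain linear row search and per-row count summation, so it is not yet O(n log n); it reproduces the (value, count) rows, including the merge of equal tail values:

```python
def count_lis_rows(a):
    """Simulate the rows of (value, count) pairs; values in each row are non-increasing."""
    rows = []  # rows[L] holds tails of increasing subsequences of length L + 1
    for x in a:
        # find the first row whose last value is >= x (linear scan here)
        pos = 0
        while pos < len(rows) and rows[pos][-1][0] < x:
            pos += 1
        # count of longest subs ending with x: sum over previous row's values < x
        cnt = 1 if pos == 0 else sum(c for v, c in rows[pos - 1] if v < x)
        if pos == len(rows):
            rows.append([(x, cnt)])           # new, longer row
        elif rows[pos][-1][0] == x:
            v, c = rows[pos][-1]
            rows[pos][-1] = (v, c + cnt)      # equal tail: merge the counts
        else:
            rows[pos].append((x, cnt))
    return sum(c for v, c in rows[-1]) if rows else 0
```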
What about complexity?
On every iteration, when taking the next element from the initial sequence, we make 2 loops: first when iterating rows to find a room for the next element, and second when summing counts in the previous row. So for every element we make at most n iterations (worst cases: if the initial seq consists of elements in increasing order, we get a list of n rows with 1 pair in every row; if the seq is sorted in descending order, we get a list of 1 row with n pairs). But O(n²) complexity is not what we want.
First, it is obvious that in every intermediate state the rows are sorted by increasing order of their last "value". So instead of a brute-force loop, a binary search can be performed, whose complexity is O(log n).
Second, we don't need to sum the "counts" of subs by looping through the row elements every time. We can accumulate them as we go, when a new pair is added to the row, like:
1: (16, 1)(5, 2) <=== instead of 1, put 1 + "count" of previous element in the row
So the second number will show not the count of longest subs that end with the given value, but the total count of all longest subs that end with any element greater than or equal to the "value" from the pair.
Thus, "counts" are replaced by "sums". And instead of iterating the elements of the previous row, we just perform a binary search (possible because the pairs in any row are always ordered by their "values") and take the "sum" for the new pair as the "sum" of the last element in the previous row, minus the "sum" of the element left of the found position in the previous row, plus the "sum" of the previous element in the current row.
So when processing 4:
1: (16, 1)(5, 2)(1, 3)
2: (8, 1)(6, 2)(5, 3)(2, 5)
3: (10, 2)(3, 3)
4: (15, 2) <=== room for (4, ?)
search in row 3 by "values" < 4:
3: (10, 2)^(3, 3)
4 will be paired with (3-2+2): ("sum" from the last pair of previous row) - ("sum" from pair left to found position in previous row) + ("sum" from previous pair in current row):
4: (15, 2)(4, 3)
In this case, final count of all longest subs is "sum" from the last pair of the last row of the list, i. e. 3, not 3 + 2.
So, performing binary search to both row search and sum search, we will come with O(n*log n) complexity.
What about memory? After processing the whole array we have at most n pairs, so memory consumption with dynamic arrays is O(n). Besides, when using dynamic arrays or collections, some additional time is needed to allocate and resize them, but most operations take O(1) time, because we don't do any sorting or rearrangement during the process. So the complexity estimate seems final.
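Putting both optimizations together (binary search over the rows' last values, and prefix sums of counts within each row), a compact sketch of the whole O(n log n) algorithm might look like this in Python, using bisect instead of the hand-rolled search in the Java above. Values inside a row are stored negated so each row is non-decreasing and bisect applies:

```python
from bisect import bisect_left, bisect_right

def count_lis(a):
    """Count the longest strictly increasing subsequences in O(n log n)."""
    tails = []     # tails[L]: smallest tail value among subsequences of length L + 1
    neg_rows = []  # neg_rows[L]: negated tail values of row L (non-decreasing)
    psums = []     # psums[L]: prefix sums of counts, aligned with neg_rows[L]
    for x in a:
        pos = bisect_left(tails, x)         # row where x belongs
        if pos == 0:
            cnt = 1
        else:
            neg, ps = neg_rows[pos - 1], psums[pos - 1]
            i = bisect_right(neg, -x)       # first entry of previous row with value < x
            cnt = ps[-1] - (ps[i - 1] if i else 0)
        if pos == len(tails):               # open a new, longer row
            tails.append(x)
            neg_rows.append([-x])
            psums.append([cnt])
        else:                               # append to an existing row
            tails[pos] = x
            neg_rows[pos].append(-x)
            psums[pos].append(psums[pos][-1] + cnt)
    return psums[-1][-1] if psums else 0
```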

Sasha Salauyou's answer is great, but it is not clear to me why
sum -= pRow.get(index).sum;
here is my code based on the same idea
import java.math.BigDecimal;
import java.util.*;
class lisCount {
static BigDecimal lisCount(int[] a) {
class Container {
Integer v;
BigDecimal count;
Container(Integer v) {
this.v = v;
}
}
List<List<Container>> lisIdxSeq = new ArrayList<List<Container>>();
int lisLen, lastIdx;
List<Container> lisSeqL;
Container lisEle;
BigDecimal count;
int pre;
for (int i = 0; i < a.length; i++){
pre = -1;
count = new BigDecimal(1);
lisLen = lisIdxSeq.size();
lastIdx = lisLen - 1;
lisEle = new Container(i);
if(lisLen == 0 || a[i] > a[lisIdxSeq.get(lastIdx).get(0).v]){
// lis len increased
lisSeqL = new ArrayList<Container>();
lisSeqL.add(lisEle);
lisIdxSeq.add(lisSeqL);
pre = lastIdx;
}else{
int h = lastIdx;
int l = 0;
while(l < h){
int m = (l + h) / 2;
if(a[lisIdxSeq.get(m).get(0).v] < a[i]) l = m + 1;
else h = m;
}
List<Container> lisSeqC = lisIdxSeq.get(l);
if(a[i] <= a[lisSeqC.get(0).v]){
int hi = lisSeqC.size() - 1;
int lo = 0;
while(lo < hi){ // note: the original condition (hi < lo) was never true, making this loop dead code
int mi = (hi + lo) / 2;
if(a[lisSeqC.get(mi).v] < a[i]) lo = mi + 1;
else hi = mi;
}
lisSeqC.add(lo, lisEle);
pre = l - 1;
}
}
if(pre >= 0){
Iterator<Container> it = lisIdxSeq.get(pre).iterator();
count = new BigDecimal(0);
while(it.hasNext()){
Container nt = it.next();
if(a[nt.v] < a[i]){
count = count.add(nt.count);
}else break;
}
}
lisEle.count = count;
}
BigDecimal rst = new BigDecimal(0);
Iterator<Container> i = lisIdxSeq.get(lisIdxSeq.size() - 1).iterator();
while(i.hasNext()){
rst = rst.add(i.next().count);
}
return rst;
}
public static void main(String[] args) {
System.out.println(lisCount(new int[] { 1, 3, 2, 2, 4 }));
System.out.println(lisCount(new int[] { 3, 2, 1 }));
System.out.println(lisCount(new int[] { 16, 5, 8, 6, 1, 10, 5, 2, 15, 3, 2, 4, 1 }));
}
}

Patience sorting is also O(N*logN) if the inner linear scan below is replaced with a binary search (as written, the scan makes the worst case O(N^2)), but it is way shorter and simpler than the methods above; note that it finds only the length of the LIS, not the count:
static int[] input = {4, 5, 2, 8, 9, 3, 6, 2, 7, 8, 6, 6, 7, 7, 3, 6};
/**
* Every time a value is tested it either adds to the length of the LIS (by calling decs.add() with it), or reduces the remaining smaller cards that must be found before the LIS consists of smaller cards. This way all inputs/cards contribute in one way or another (except if they're equal to the biggest number in the sequence; if you want to include those in the sequence, replace 'card <= decs.get(decIndex)' with 'card < decs.get(decIndex)'). If they're bigger than all decs, they add to the length of the LIS (which is something we want), while if they're smaller than a dec, they replace it. We want this because the smaller the biggest dec is, the smaller an input we need before we can add onto the LIS.
*
* If we run into a decreasing sequence the input from this sequence will replace each other (because they'll always replace the leftmost dec). Thus this algorithm won't wrongfully register e.g. {2, 1, 3} as {2, 3}, but rather {2} -> {1} -> {1, 3}.
*
* WARNING: This can only be used to find length, not actual sequence, seeing how parts of the sequence will be replaced by smaller numbers trying to make their sequence dominate
*
* Due to bigger decs being added to the end/right of 'decs' and the leftmost decs always being the first to be replaced with smaller decs, the further a dec is to the right (the bigger it's index), the bigger it must be. Thus, by always replacing the leftmost decs, we don't run the risk of replacing the biggest number in a sequence (the number which determines if more cards can be added to that sequence) before a sequence with the same length but smaller numbers (thus currently equally good, due to length, and potentially better, due to less needed to increase length) has been found.
*/
static void patienceFindLISLength() {
ArrayList<Integer> decs = new ArrayList<>();
inputLoop: for (Integer card : input) {
for (int decIndex = 0; decIndex < decs.size(); decIndex++) {
if (card <= decs.get(decIndex)) {
decs.set(decIndex, card);
continue inputLoop;
}
}
decs.add(card);
}
System.out.println(decs.size());
}
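The leftmost-dec replacement in the loop above is exactly what a lower-bound binary search computes, which is how the O(N log N) bound is actually reached. A sketch of the same patience idea in Python (my own illustration, not the poster's code):

```python
from bisect import bisect_left

def lis_length(a):
    """Length of the longest strictly increasing subsequence via patience piles."""
    decs = []  # decs[k]: smallest tail of an increasing subsequence of length k + 1
    for card in a:
        i = bisect_left(decs, card)  # leftmost dec that is >= card
        if i == len(decs):
            decs.append(card)        # card extends the LIS
        else:
            decs[i] = card           # card replaces that dec
    return len(decs)
```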

C++ implementation of the above logic:
#include<bits/stdc++.h>
using namespace std;
#define pb push_back
#define pob pop_back
#define pll pair<ll, ll>
#define pii pair<int, int>
#define ll long long
#define ull unsigned long long
#define fori(a,b) for(i=a;i<b;i++)
#define forj(a,b) for(j=a;j<b;j++)
#define fork(a,b) for(k=a;k<b;k++)
#define forl(a,b) for(l=a;l<b;l++)
#define forir(a,b) for(i=a;i>=b;i--)
#define forjr(a,b) for(j=a;j>=b;j--)
#define mod 1000000007
#define boost std::ios::sync_with_stdio(false)
struct comp_pair_int_rev
{
bool operator()(const pair<int,int> &a, const int & b)
{
return (a.first > b);
}
bool operator()(const int & a,const pair<int,int> &b)
{
return (a > b.first);
}
};
struct comp_pair_int
{
bool operator()(const pair<int,int> &a, const int & b)
{
return (a.first < b);
}
bool operator()(const int & a,const pair<int,int> &b)
{
return (a < b.first);
}
};
int main()
{
int n,i,mx=0,p,q,r,t;
cin>>n;
int a[n];
vector<vector<pii > > v(100005);
vector<pii > v1(100005);
fori(0,n)
cin>>a[i];
v[1].pb({a[0], 1} );
v1[1]= {a[0], 1};
mx=1;
fori(1,n)
{
if(a[i]<=v1[1].first)
{
r=v1[1].second;
if(v1[1].first==a[i])
v[1].pob();
v1[1]= {a[i], r+1};
v[1].pb({a[i], r+1});
}
else if(a[i]>v1[mx].first)
{
q=upper_bound(v[mx].begin(), v[mx].end(), a[i], comp_pair_int_rev() )-v[mx].begin();
if(q==0)
{
r=v1[mx].second;
}
else
{
r=v1[mx].second-v[mx][q-1].second;
}
v1[++mx]= {a[i], r};
v[mx].pb({a[i], r});
}
else if(a[i]==v1[mx].first)
{
q=upper_bound(v[mx-1].begin(), v[mx-1].end(), a[i], comp_pair_int_rev() )-v[mx-1].begin();
if(q==0)
{
r=v1[mx-1].second;
}
else
{
r=v1[mx-1].second-v[mx-1][q-1].second;
}
p=v1[mx].second;
v1[mx]= {a[i], p+r};
v[mx].pob();
v[mx].pb({a[i], p+r});
}
else
{
p=lower_bound(v1.begin()+1, v1.begin()+mx+1, a[i], comp_pair_int() )-v1.begin();
t=v1[p].second;
if(v1[p].first==a[i])
{
v[p].pob();
}
q=upper_bound(v[p-1].begin(), v[p-1].end(), a[i], comp_pair_int_rev() )-v[p-1].begin();
if(q==0)
{
r=v1[p-1].second;
}
else
{
r=v1[p-1].second-v[p-1][q-1].second;
}
v1[p]= {a[i], t+r};
v[p].pb({a[i], t+r});
}
}
cout<<v1[mx].second;
return 0;
}

Although I completely agree with Alex, this can also be done quite easily using a segment tree.
Here is the logic to find the length of LIS using segment tree in NlogN.
https://www.quora.com/What-is-the-approach-to-find-the-length-of-the-strictly-increasing-longest-subsequence
Here is an approach that finds the number of LIS but takes N^2 complexity.
https://codeforces.com/blog/entry/48677
We use a segment tree (as in the first link) to optimize the approach given in the second.
Here is the logic:
First sort the array in ascending order (also keeping the original order), and initialise the segment tree with zeroes. For a given range, the segment tree should answer two queries (use a pair for this):
a. the max of first.
b. the sum of second over entries whose first equals that max.
Iterate through the sorted array. Let j be the original index of the current element; query the range (0 .. j-1) and update the j-th element (if the query result is (0,0), update it with (1,1)).
Here is my code in c++:
#include<bits/stdc++.h>
#define tr(container, it) for(typeof(container.begin()) it = container.begin(); it != container.end(); it++)
#define ll long long
#define pb push_back
#define endl '\n'
#define pii pair<ll int,ll int>
#define vi vector<ll int>
#define all(a) (a).begin(),(a).end()
#define F first
#define S second
#define sz(x) (ll int)x.size()
#define hell 1000000007
#define rep(i,a,b) for(ll int i=a;i<b;i++)
#define lbnd lower_bound
#define ubnd upper_bound
#define bs binary_search
#define mp make_pair
using namespace std;
#define N 100005
ll max(ll a, ll b)
{
if (a > b) return a;
else return b;
}
ll n,l,r;
vector< pii > seg(4*N);
pii query(ll cur,ll st,ll end,ll l,ll r)
{
if(l<=st&&r>=end)
return seg[cur];
if(r<st||l>end)
return mp(0,0); /* 2-change here */
ll mid=(st+end)>>1;
pii ans1=query(2*cur,st,mid,l,r);
pii ans2=query(2*cur+1,mid+1,end,l,r);
if(ans1.F>ans2.F)
return ans1;
if(ans2.F>ans1.F)
return ans2;
return make_pair(ans1.F,ans2.S+ans1.S); /* 3-change here */
}
void update(ll cur,ll st,ll end,ll pos,ll upd1, ll upd2)
{
if(st==end)
{
// a[pos]=upd; /* 4-change here */
seg[cur].F=upd1;
seg[cur].S=upd2; /* 5-change here */
return;
}
ll mid=(st+end)>>1;
if(st<=pos&&pos<=mid)
update(2*cur,st,mid,pos,upd1,upd2);
else
update(2*cur+1,mid+1,end,pos,upd1,upd2);
seg[cur].F=max(seg[2*cur].F,seg[2*cur+1].F);
if(seg[2*cur].F==seg[2*cur+1].F)
seg[cur].S = seg[2*cur].S+seg[2*cur+1].S;
else
{
if(seg[2*cur].F>seg[2*cur+1].F)
seg[cur].S = seg[2*cur].S;
else
seg[cur].S = seg[2*cur+1].S;
/* 6-change here */
}
}
int main()
{
ios_base::sync_with_stdio(false);
cin.tie(0);
cout.tie(0);
int TESTS=1;
// cin>>TESTS;
while(TESTS--)
{
int n ;
cin >> n;
vector< pii > arr(n);
rep(i,0,n)
{
cin >> arr[i].F;
arr[i].S = -i;
}
sort(all(arr));
update(1,0,n-1,-arr[0].S,1,1);
rep(i,1,n)
{
pii x = query(1,0,n-1,-1,-arr[i].S - 1 );
update(1,0,n-1,-arr[i].S,x.F+1,max(x.S,1));
}
cout<<seg[1].S;//answer
}
return 0;
}
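The same (length, count) merge can also be done with a Fenwick (binary indexed) tree over the compressed values instead of a segment tree. This is a hedged Python sketch of that variant, not a translation of the C++ above; prefix queries suffice here because "all strictly smaller values" is always a prefix of the sorted value range:

```python
def count_lis_bit(a):
    """Count longest strictly increasing subsequences with a Fenwick tree
    storing (best length, count of subsequences of that length) per prefix."""
    ranks = {v: i + 1 for i, v in enumerate(sorted(set(a)))}
    n = len(ranks)
    tree = [(0, 0)] * (n + 1)

    def merge(p, q):  # keep the longer; on a tie, add the counts
        if p[0] != q[0]:
            return p if p[0] > q[0] else q
        return (p[0], p[1] + q[1])

    def query(r):     # best (length, count) over value ranks 1..r
        best = (0, 0)
        while r > 0:
            best = merge(best, tree[r])
            r -= r & -r
        return best

    def update(r, val):
        while r <= n:
            tree[r] = merge(tree[r], val)
            r += r & -r

    total = (0, 0)
    for x in a:
        r = ranks[x]
        length, cnt = query(r - 1)           # strictly smaller values only
        cur = (length + 1, cnt if cnt else 1)
        update(r, cur)
        total = merge(total, cur)
    return total[1]
```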

Related

using binary search to count occurrences

assume I have sorted array A in length n so 1
I need to write pseudocode of a program that outputs all occurrences of each element.
The algorithm's runtime has to be at most k(c1+c2*log(n)) (k presumably being the number of distinct elements).
example - A=[1,1,2,2,2,5,5,5,5] ----> (1,2)(2,3)(5,4)
I thought about using binary search: the first element I want to count is A[1], and I need to find its last occurrence.
Then the next element is A[last occurrence index + 1], and so on.
I am having a bit of difficulty with the idea and with writing it down as pseudocode.
Thanks
Recursive algorithm: it gets the left and right positions and calculates the middle position, going deeper if there is a number change, an edge. Up to here it is simple binary search. But once it detects (at distance = 1) an edge, a change of numbers, it returns it as 4 values: 'what number sequence ended', 'at what position', 'what started', 'at what position'. The parent node then merges these 4 values from the left and right sides, and if it detects a complete sequence 'in the middle', it immediately prints it and passes along just the ending edge (from the left side) and the starting edge (from the right).
It is not possible to achieve that asymptotic complexity.
The reason is that no matter what the algorithm is, when all n elements are distinct it has to return all of them. That implies it has to read all of them, and that operation alone takes O(n).
You can count the number of occurrences of one entry in O(log(n)):
// Parameterized binary search: with '<' it returns the index just past the last element <= target
// (an upper bound); with '<=' it returns the index of the first element >= target (a lower bound).
// The difference of the two results is the count.
static int count(int[] array, int target, int start, int end, Func<int, int, bool> compare)
{
if (end < start) { return start; }
int m = (start + end) / 2;
if (compare(target, array[m])) { return count(array, target, start, m - 1, compare); }
else { return count(array, target, m + 1, end, compare); }
}
static void Main(string[] args)
{
int[] a = { 1, 3, 8, 12, 12, 12, 25, 88 };
int r1 = count(a, 12, 0, a.Length - 1, (x1, x2) =>
{
return x1 < x2;
});
int r2 = count(a, 12, 0, a.Length - 1, (x1, x2) =>
{
return x1 <= x2;
});
Console.Out.WriteLine("count=" + (r1 - r2).ToString());
}
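The questioner's own idea (binary-search for the last occurrence of the current element, then jump just past it) can be sketched in Python with bisect; each distinct value costs one O(log n) search, which matches the required k(c1+c2*log(n)) bound:

```python
from bisect import bisect_right

def count_occurrences(a):
    """Return (value, count) pairs for each distinct value in the sorted array a."""
    result = []
    i = 0
    while i < len(a):
        j = bisect_right(a, a[i])      # index just past the last occurrence of a[i]
        result.append((a[i], j - i))
        i = j                          # jump to the next distinct value
    return result
```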

Efficient method for generating this sequence

Problem:
I need to generate the following sequence. I have the order of the matrix as input.
Example:
I need to generate the sequence of position of its elements.
(0,0),(0,1),(1,0),(1,1) ->for order 2
(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2) -> for order 3.
I need to have function which does this for me. When I call this function it should calculate for me on the fly. I don't want to store the sequence in the memory.
Eg:
first_call - > return value (0,0)
second_call to function - > return value ( 0,1)
...and so on...
You can store these values in some global variables.
PS:
The function has to be thread-safe as the application is multi-threaded. I know this condition doesn't make difference. Just wanted to convey the entire problem.
Precision:
I have tried my own solution but I think it's inefficient. I am looking for an efficient method; you can just mention the steps, I don't need an implementation in any particular language. Please let me know if more info is required for the question.
Use a global variable to store the number of times you have called the function. Call it t. If order is order, then
f = (t div order, t mod order)
Where div is the integer division (e.g. 5 div 3 = 1) and mod is the modulus (i.e. remainder of the division). (e.g. 5 mod 3 = 2).
So in Java for example:
public class MyCounter {
private static int t = 0;
public static int[] myFunction(int order) {
return new int[] { t / order , t++ % order };
}
public static void main(String[] args) {
int order = 3;
for(int i=0; i<order*order; i++) {
int[] k = myFunction(order);
System.out.println("("+k[0]+", "+k[1]+")");
}
}
}
#define MATRIX_ORDER 3
void NextPosition(int aIndex, int& aRow, int& aColumn)
{
if (aColumn == MATRIX_ORDER - 1)
{
aRow++;
aColumn = 0;
} else {
aColumn++;
}
}
void SomeFunction()
{
int index = 0;
int row = 0;
int column = -1;
while (index < (MATRIX_ORDER * MATRIX_ORDER))
{
NextPosition(index, row, column);
printf("index: %d, row: %d, column: %d\n", index, row, column);
index++;
}
}
Output:
index: 0, row: 0, column: 0
index: 1, row: 0, column: 1
index: 2, row: 0, column: 2
index: 3, row: 1, column: 0
index: 4, row: 1, column: 1
index: 5, row: 1, column: 2
index: 6, row: 2, column: 0
index: 7, row: 2, column: 1
index: 8, row: 2, column: 2
Here's a Python solution using a generator:
def sequence_gen(order=2):
for i in range(order*order):
yield divmod(i, order)
for val in sequence_gen(2):
print(val)
#(0, 0)
#(0, 1)
#(1, 0)
#(1, 1)
for val in sequence_gen(3):
print(val)
#(0, 0)
#(0, 1)
#(0, 2)
#(1, 0)
#(1, 1)
#(1, 2)
#(2, 0)
#(2, 1)
#(2, 2)
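Since the question asks for thread safety, which the snippets above do not address directly, here is a hedged Python sketch (the class name PositionCounter is invented for illustration) that guards the shared counter with a lock and computes each position on the fly with divmod:

```python
import itertools
import threading

class PositionCounter:
    """Thread-safe on-the-fly (row, column) positions for an order-n matrix."""
    def __init__(self, order):
        self.order = order
        self._counter = itertools.count()  # monotonically increasing call index t
        self._lock = threading.Lock()      # serializes access from multiple threads

    def next_position(self):
        with self._lock:
            t = next(self._counter)
        return divmod(t, self.order)       # (t div order, t mod order)
```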

Find Second largest number in array at most n+log₂(n)−2 comparisons [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
The community reviewed whether to reopen this question 12 months ago and left it closed:
Original close reason(s) were not resolved
Improve this question
You are given as input an unsorted array of n distinct numbers, where n is a power of 2. Give an algorithm that identifies the second-largest number in the array, and that uses at most n+log₂(n)−2 comparisons.
Start with comparing elements of the n element array in odd and even positions and determining largest element of each pair. This step requires n/2 comparisons. Now you've got only n/2 elements. Continue pairwise comparisons to get n/4, n/8, ... elements. Stop when the largest element is found. This step requires a total of n/2 + n/4 + n/8 + ... + 1 = n-1 comparisons.
During previous step, the largest element was immediately compared with log₂(n) other elements. You can determine the largest of these elements in log₂(n)-1 comparisons. That would be the second-largest number in the array.
Example: array of 8 numbers [10,9,5,4,11,100,120,110].
Comparisons on level 1: [10,9] ->10 [5,4]-> 5, [11,100]->100 , [120,110]-->120.
Comparisons on level 2: [10,5] ->10 [100,120]->120.
Comparisons on level 3: [10,120]->120.
Maximum is 120. It was immediately compared with: 10 (on level 3), 100 (on level 2), 110 (on level 1).
Step 2 should find the maximum of 10, 100, and 110. Which is 110. That's the second largest element.
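The two steps above can be sketched in Python (an illustrative version of my own, assuming at least two elements; it also returns the comparison count so the n + log2(n) - 2 bound can be checked on the example):

```python
def second_largest(a):
    """Tournament: find the max while recording who lost to it, then the max of those."""
    comparisons = 0
    candidates = [(x, []) for x in a]          # (value, values it has beaten)
    while len(candidates) > 1:
        nxt = []
        for i in range(0, len(candidates) - 1, 2):
            (x, lx), (y, ly) = candidates[i], candidates[i + 1]
            comparisons += 1
            if x > y:
                lx.append(y)
                nxt.append((x, lx))
            else:
                ly.append(x)
                nxt.append((y, ly))
        if len(candidates) % 2:                # odd one out advances for free
            nxt.append(candidates[-1])
        candidates = nxt
    losers = candidates[0][1]                  # the values compared directly with the max
    second = losers[0]
    for v in losers[1:]:                       # plain scan for the max of the losers
        comparisons += 1
        if v > second:
            second = v
    return second, comparisons
```

For the 8-element example above this performs 7 + 2 = 9 comparisons, matching 8 + log2(8) - 2.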
sly s's answer is derived from this paper, but he didn't explain the algorithm, which means someone stumbling across this question has to read the whole paper, and his code isn't very sleek either. I'll give the crux of the algorithm from the aforementioned paper, complete with complexity analysis, and also provide an implementation in Python, the language I chose while working on these problems.
Basically, we do two passes:
Find the max, and keep track of which elements the max was compared to.
Find the max among the elements the max was compared to; the result is the second largest element.
In the first pass on the example array [10, 4, 5, 8, 7, 2, 12, 3, 1, 6, 9, 11] (originally illustrated with a picture of the tournament tree), 12 is the largest number in the array and was compared to 3, 1, 11, and 10. In the second pass, we find the largest among {3, 1, 11, 10}, which is 11, the second largest number in the original array.
Time Complexity:
All elements must be looked at, therefore, n - 1 comparisons for pass 1.
Since we divide the problem into two halves each time, there are at most log₂n recursive calls, for each of which, the comparisons sequence grows by at most one; the size of the comparisons sequence is thus at most log₂n, therefore, log₂n - 1 comparisons for pass 2.
Total number of comparisons <= (n - 1) + (log₂n - 1) = n + log₂n - 2
def second_largest(nums: Sequence[int]) -> int:
def _max(lo: int, hi: int, seq: Sequence[int]) -> Tuple[int, MutableSequence[int]]:
if lo >= hi:
return seq[lo], []
mid = lo + (hi - lo) // 2
x, a = _max(lo, mid, seq)
y, b = _max(mid + 1, hi, seq)
if x > y:
a.append(y)
return x, a
b.append(x)
return y, b
comparisons = _max(0, len(nums) - 1, nums)[1]
return _max(0, len(comparisons) - 1, comparisons)[0]
The first run for the given example is as follows:
lo=0, hi=1, mid=0, x=10, a=[], y=4, b=[]
lo=0, hi=2, mid=1, x=10, a=[4], y=5, b=[]
lo=3, hi=4, mid=3, x=8, a=[], y=7, b=[]
lo=3, hi=5, mid=4, x=8, a=[7], y=2, b=[]
lo=0, hi=5, mid=2, x=10, a=[4, 5], y=8, b=[7, 2]
lo=6, hi=7, mid=6, x=12, a=[], y=3, b=[]
lo=6, hi=8, mid=7, x=12, a=[3], y=1, b=[]
lo=9, hi=10, mid=9, x=6, a=[], y=9, b=[]
lo=9, hi=11, mid=10, x=9, a=[6], y=11, b=[]
lo=6, hi=11, mid=8, x=12, a=[3, 1], y=11, b=[9]
lo=0, hi=11, mid=5, x=10, a=[4, 5, 8], y=12, b=[3, 1, 11]
Things to note:
There are exactly n - 1=11 comparisons for n=12.
From the last line, y=12 wins over x=10, and the next pass starts with the sequence [3, 1, 11, 10], which has log₂(12)=3.58 ~ 4 elements, and will require 3 comparisons to find the maximum.
I have implemented this algorithm, answered by @Evgeny Kluev, in Java. The total comparisons are n+log2(n)−2. There is also a good reference:
Alexander Dekhtyar: CSC 349: Design and Analysis of Algorithms. This is similar to the top-voted algorithm.
public class op1 {
private static int findSecondRecursive(int n, int[] A){
int[] firstCompared = findMaxTournament(0, n-1, A); //n-1 comparisons;
int[] secondCompared = findMaxTournament(2, firstCompared[0]-1, firstCompared); //log2(n)-1 comparisons.
//Total comparisons: n+log2(n)-2;
return secondCompared[1];
}
private static int[] findMaxTournament(int low, int high, int[] A){
if(low == high){
int[] compared = new int[2];
compared[0] = 2;
compared[1] = A[low];
return compared;
}
int[] compared1 = findMaxTournament(low, (low+high)/2, A);
int[] compared2 = findMaxTournament((low+high)/2+1, high, A);
if(compared1[1] > compared2[1]){
int k = compared1[0] + 1;
int[] newcompared1 = new int[k];
System.arraycopy(compared1, 0, newcompared1, 0, compared1[0]);
newcompared1[0] = k;
newcompared1[k-1] = compared2[1];
return newcompared1;
}
int k = compared2[0] + 1;
int[] newcompared2 = new int[k];
System.arraycopy(compared2, 0, newcompared2, 0, compared2[0]);
newcompared2[0] = k;
newcompared2[k-1] = compared1[1];
return newcompared2;
}
private static void printarray(int[] a){
for(int i:a){
System.out.print(i + " ");
}
System.out.println();
}
public static void main(String[] args) {
//Demo.
System.out.println("Origial array: ");
int[] A = {10,4,5,8,7,2,12,3,1,6,9,11};
printarray(A);
int secondMax = findSecondRecursive(A.length,A);
Arrays.sort(A);
System.out.println("Sorted array(for check use): ");
printarray(A);
System.out.println("Second largest number in A: " + secondMax);
}
}
The problem is: in comparison level 1, the algorithm needs to remember all the array elements, because the largest is not yet known; then the second level, and finally the third. Keeping track of these elements via assignments incurs additional value assignments, and once the largest is known you must also account for tracing back through them. As a result, it will not be significantly faster than the simple 2N-2 comparison algorithm. Moreover, because the code is more complicated, you must also consider the potential debugging time.
E.g. in PHP, the running time ratio for a comparison vs. a value assignment is roughly: comparison: (11-19) to value assignment: 16.
I shall give an example for better understanding:
example 1:
>12 56 98 12 76 34 97 23
>>(12 56) (98 12) (76 34) (97 23)
>>> 56 98 76 97
>>>> (56 98) (76 97)
>>>>> 98 97
>>>>>> 98
The largest element is 98
Now compare with the ones that lost to the largest element 98; 97 will be the second largest.
nlogn implementation
public class Test {
public static void main(String...args){
int arr[] = new int[]{1,2,2,3,3,4,9,5, 100 , 101, 1, 2, 1000, 102, 2,2,2};
System.out.println(getMax(arr, 0, 16));
}
public static Holder getMax(int[] arr, int start, int end){
if (start == end)
return new Holder(arr[start], Integer.MIN_VALUE);
else {
int mid = ( start + end ) / 2;
Holder l = getMax(arr, start, mid);
Holder r = getMax(arr, mid + 1, end);
if (l.compareTo(r) > 0 )
return new Holder(l.high(), r.high() > l.low() ? r.high() : l.low());
else
return new Holder(r.high(), l.high() > r.low() ? l.high(): r.low());
}
}
static class Holder implements Comparable<Holder> {
private int low, high;
public Holder(int r, int l){low = l; high = r;}
public String toString(){
return String.format("Max: %d, SecMax: %d", high, low);
}
public int compareTo(Holder data){
if (high == data.high)
return 0;
if (high > data.high)
return 1;
else
return -1;
}
public int high(){
return high;
}
public int low(){
return low;
}
}
}
Why not use a plain linear scan over the given array[n]? It runs in c*n, where c is the constant time per check, and it does n comparisons.
int first = 0;
int second = 0;
for (int i = 0; i < n; i++) {
    if (array[i] > first) {
        second = first;
        first = array[i];
    } else if (array[i] > second) {
        second = array[i];
    }
}
Or do I just not understand the question?
In Python 2.7: the following code runs in roughly O(n log log n) because of the extra sort. Any optimizations?
def secondLargest(testList):
secondList = []
# Iterate through the list
while(len(testList) > 1):
left = testList[0::2]
right = testList[1::2]
if (len(testList) % 2 == 1):
right.append(0)
myzip = zip(left,right)
mymax = [ max(list(val)) for val in myzip ]
myzip.sort()
secondMax = [x for x in myzip[-1] if x != max(mymax)][0]
if (secondMax != 0 ):
secondList.append(secondMax)
testList = mymax
return max(secondList)
public static int FindSecondLargest(int[] input)
{
Dictionary<int, List<int>> dictWinnerLoser = new Dictionary<int, List<int>>();//Maps each winner to the elements it beat
List<int> lstWinners = null;
List<int> lstLoosers = null;
int winner = 0;
int looser = 0;
while (input.Count() > 1)//Runs till we get max in the array
{
lstWinners = new List<int>();//Keeps track of winners of each run, as we have to run with winners of each run till we get one winner
for (int i = 0; i < input.Count() - 1; i += 2)
{
if (input[i] > input[i + 1])
{
winner = input[i];
looser = input[i + 1];
}
else
{
winner = input[i + 1];
looser = input[i];
}
lstWinners.Add(winner);
if (!dictWinnerLoser.ContainsKey(winner))
{
lstLoosers = new List<int>();
lstLoosers.Add(looser);
dictWinnerLoser.Add(winner, lstLoosers);
}
else
{
lstLoosers = dictWinnerLoser[winner];
lstLoosers.Add(looser);
dictWinnerLoser[winner] = lstLoosers;
}
}
input = lstWinners.ToArray();//run the loop again with winners
}
List<int> loosersOfWinner = dictWinnerLoser[input[0]];//Gives all the elements that lost to the max of the array; input now has only one element, which is the max
winner = 0;
for (int i = 0; i < loosersOfWinner.Count(); i++)//The max among the elements that lost to the winner is the second largest
{
if (winner < loosersOfWinner[i])
{
winner = loosersOfWinner[i];
}
}
return winner;
}

How to search for closest value in a lookup table?

I have a simple one-dimensional array of integer values that represents a physical set of part values I have to work with. I then calculate an ideal value mathematically.
How could I write an efficient search algorithm that finds the entry with the smallest absolute difference from my ideal value in the array?
The array is predetermined and constant, so it can be sorted however I need.
Example
Lookup array:
100, 152, 256, 282, 300
Searching for an ideal value of 125 would find 100 in the array, whereas 127 would find 152.
The actual lookup array will be about 250 items long and never change.
Once the array is sorted, use binary search.
This is very similar to a binary search, except that if it does not find the exact key, it returns the index of a key very close to the provided one.
The logic is to search until the exact key is found, or until exactly one position remains between the high and the low index.
Consider an array n[] = {1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
if you search for the key: 2, then using below algorithm
Step 1: high=10, low=0, med=5
Step 2: high=5, low=0, med=2
Step 3: high=2, low=0, med=1 In this step the exact key is found. So it returns 1.
if you search for the key:3 (which is not present in the array), then using below algorithm
Step 1: high=10, low=0, med=5
Step 2: high=5, low=0, med=2
Step 3: high=2, low=0, med=1
Step 4: high=1, low=0, At this step high=low+1 i.e. no more element to search. So it returns med=1.
Hope this helps...
public static <T> int binarySearch(List<T> list, T key, Comparator<T> compare) {
int low, high, med, c;
T temp;
high = list.size();
low = 0;
med = (high + low) / 2;
while (high != low+1) {
temp = list.get(med);
c = compare.compare(temp, key);
if (c == 0) {
return med;
} else if (c < 0){
low = med;
}else{
high = med;
}
med = (high + low) / 2;
}
return med;
}
/** ------------------------ Example -------------------- **/
public static void main(String[] args) {
List<Integer> nos = new ArrayList<Integer>();
nos.addAll(Arrays.asList(new Integer[]{1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20}));
search(nos, 2); // Output Search:2 Key:1 Value:2
search(nos, 3); // Output Search:3 Key:1 Value:2
search(nos, 10); // Output Search:10 Key:5 Value:10
search(nos, 11); // Output Search:11 Key:5 Value:10
}
public static void search(List<Integer> nos, int search){
int key = binarySearch(nos, search, new IntComparator());
System.out.println("Search:"+search+"\tKey:"+key+"\tValue:"+nos.get(key));
}
public static class IntComparator implements Comparator<Integer>{
@Override
public int compare(Integer o1, Integer o2) {
return o1.compareTo(o2);
}
}
The binary search algorithm from Wikipedia is as below:
int binary_search(int A[], int key, int imin, int imax)
{
// continue searching while [imin,imax] is not empty
while (imax >= imin)
{
// calculate the midpoint for roughly equal partition
int imid = midpoint(imin, imax);
if(A[imid] == key)
// key found at index imid
return imid;
// determine which subarray to search
else if (A[imid] < key)
// change min index to search upper subarray
imin = imid + 1;
else
// change max index to search lower subarray
imax = imid - 1;
}
// key was not found
return KEY_NOT_FOUND;
}
The end condition in case a key is not found is that imax < imin.
In fact, this condition can locate the nearest match. The nearest match will lie between imax and imin (taking into account either might be outside the array bounds). Note again that imax < imin in the end case. Some solutions use abs to find the difference, but we know that A[imax] < key < A[imin] so:
if imax < 0 return 0
if imin > A.count - 1 return A.count - 1
if (key - A[imax]) < (A[imin] - key) return imax
return imin
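Putting the search loop and that post-processing together gives one runnable function. A Python sketch under the same assumptions (sorted input; nearest_index is a hypothetical name, not from the answer above):

```python
def nearest_index(a, key):
    """Binary search over sorted list a, returning the index whose value
    is closest to key (exact index if key is present)."""
    imin, imax = 0, len(a) - 1
    while imax >= imin:
        imid = (imin + imax) // 2
        if a[imid] == key:
            return imid              # exact match
        elif a[imid] < key:
            imin = imid + 1          # search upper subarray
        else:
            imax = imid - 1          # search lower subarray
    # Loop ended with imax < imin; when both are in bounds,
    # a[imax] < key < a[imin].
    if imax < 0:
        return 0                     # key below every element
    if imin > len(a) - 1:
        return len(a) - 1            # key above every element
    return imax if key - a[imax] < a[imin] - key else imin
```

On the lookup array from the question, nearest_index([100, 152, 256, 282, 300], 125) returns 0 (value 100) and 127 maps to index 1 (value 152).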
Python, brute force on an unsorted list (because it's fun writing Python), O(n):
table = (100, 152, 256, 282, 300)
value = 125
lookup_dict = dict((abs(value - x), x) for x in table)
closest_val = lookup_dict[min(lookup_dict.keys())]
And a proper implementation that uses binary search to find the value, O(log n):
import bisect
'''Returns the closest entry in the sorted list 'sorted' to 'value'
'''
def find_closest(sorted, value):
if (value <= sorted[0]):
return sorted[0]
if (value >= sorted[-1]):
return sorted[-1]
insertpos = bisect.bisect(sorted, value)
if (abs(sorted[insertpos-1] - value) <= abs(sorted[insertpos] - value)):
return sorted[insertpos-1]
else:
return sorted[insertpos]
Java has a Arrays.binarySearch function.
Given an array of [10, 20, 30] you would get these results:
Search for    Result
10            0
20            1
30            2
7             -1
9             -1
11            -2
19            -2
21            -3
29            -3
43            -4
Sample code:
import java.util.Arrays;
public class Solution {
public static void main(String[] args) {
int[] array = new int[]{10, 20, 30};
int[] keys = new int[]{10, 20, 30, 7, 9, 11, 19, 21, 29, 43};
for (int key: keys) {
System.out.println(Arrays.binarySearch(array, key));
}
}
}
Sample output:
0
1
2
-1
-1
-2
-2
-3
-3
-4
Basically the negative numbers provide two crucial pieces of information. A negative result means the exact match was not found, but a "close enough" match exists. The value also tells you where: Arrays.binarySearch returns -(insertion point) - 1, so -2 means array[0] < key < array[1] and -3 means array[1] < key < array[2].
-1 means it is smaller than the minimum value in the array.
Example based on sample data on the initial question:
public class Solution {
public static void main(String[] args) {
int[] array = new int[]{100, 152, 256, 282, 300};
int[] keys = new int[]{125, 127, 282, 4, 900, 126};
for (int key : keys) {
int index = Arrays.binarySearch(array, key);
if (index >= 0) {
System.out.println("Found " + key);
} else {
if (index == -1) {
//smaller than smallest in the array
System.out.println("Closest to " + key + " is " + array[0]);
} else if (-index > array.length) {
//larger than the largest in the array
System.out.println("Closest to " + key + " is " + array[array.length - 1]);
} else {
//in between
int before = array[0 - index - 2];
int after = array[0 - index - 1];
if (key - before < after - key) {
System.out.println("Closest to " + key + " is " + before);
} else if (key - before > after - key) {
System.out.println("Closest to " + key + " is " + after);
} else {
System.out.println("Same distance from " + key + " to " + before + " and " + after);
}
}
}
}
}
}
And the output:
Closest to 125 is 100
Closest to 127 is 152
Found 282
Closest to 4 is 100
Closest to 900 is 300
Same distance from 126 to 100 and 152
Just going through the array and computing abs(reference - array_value[i]) would take O(N); carry along the index with the smallest difference seen so far.
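A sketch of that linear scan in Python (closest is a hypothetical helper name; for a ~250-element table that never changes, O(n) is perfectly adequate):

```python
def closest(table, ideal):
    # O(n) scan: pick the entry with the smallest absolute difference.
    # On a tie, min keeps the earlier entry.
    return min(table, key=lambda x: abs(x - ideal))
```

With the question's data, closest((100, 152, 256, 282, 300), 125) gives 100, and 127 gives 152.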

Generate all combinations of arbitrary alphabet up to arbitrary length

Say I have an array of arbitrary size holding single characters. I want to compute all possible combinations of those characters up to an arbitrary length.
So lets say my array is [1, 2, 3]. The user-specified length is 2. Then the possible combinations are [11, 22, 33, 12, 13, 23, 21, 31, 32].
I'm having real trouble finding a suitable algorithm that allows arbitrary lengths and doesn't just permute the array. Oh, and while speed is not absolutely critical, it should be reasonably fast too.
Just do an add with carry.
Say your array contained 4 symbols and you want ones of length 3.
Start with 000 (i.e. each symbol in your word = alphabet[0])
Then add up:
000
001
002
003
010
011
...
The algorithm (given these indices) is just to increase the lowest number. If it reaches the number of symbols in your alphabet, increase the previous number (following the same rule) and set the current to 0.
C++ code:
int N_LETTERS = 4;
char alphabet[] = {'a', 'b', 'c', 'd'};
std::vector<std::string> get_all_words(int length)
{
std::vector<int> index(length, 0);
std::vector<std::string> words;
while(true)
{
std::string word(length, ' ');
for (int i = 0; i < length; ++i)
word[i] = alphabet[index[i]];
words.push_back(word);
for (int i = length-1; ; --i)
{
if (i < 0) return words;
index[i]++;
if (index[i] == N_LETTERS)
index[i] = 0;
else
break;
}
}
}
Code is untested, but should do the trick.
Knuth covers combinations and permutations in some depth in The Art of Computer Programming, vol. 1. Here is an implementation of one of his algorithms I wrote some years ago (don't hate on the style, it's ancient code):
#include <algorithm>
#include <iterator>
#include <vector>
#include <functional>
#include <iostream>
using namespace std;
template<class BidirectionalIterator, class Function, class Size>
Function _permute(BidirectionalIterator first, BidirectionalIterator last, Size k, Function f, Size n, Size level)
{
// This algorithm is adapted from Donald Knuth,
// "The Art of Computer Programming, vol. 1, p. 45, Method 1"
// Thanks, Donald.
for( Size x = 0; x < (n-level); ++x ) // rotate every possible value in to this level's slot
{
if( (level+1) < k )
// if not at max level, recurse down to twirl higher levels first
f = _permute(first,last,k,f,n,level+1);
else
{
// we are at highest level, this is a unique permutation
BidirectionalIterator permEnd = first;
advance(permEnd, k);
f(first,permEnd);
}
// rotate next element in to this level's position & continue
BidirectionalIterator rotbegin(first);
advance(rotbegin,level);
BidirectionalIterator rotmid(rotbegin);
rotmid++;
rotate(rotbegin,rotmid,last);
}
return f;
}
template<class BidirectionalIterator, class Function, class Size>
Function for_each_permutation(BidirectionalIterator first, BidirectionalIterator last, Size k, Function fn)
{
return _permute<BidirectionalIterator,Function,Size>(first, last, k, fn, distance(first,last), 0);
}
template<class Elem>
struct DumpPermutation : public std::binary_function<bool, Elem* , Elem*>
{
bool operator()(Elem* begin, Elem* end) const
{
cout << "[";
copy(begin, end, ostream_iterator<Elem>(cout, " "));
cout << "]" << endl;
return true;
}
};
int main()
{
int ary[] = {1, 2, 3};
const size_t arySize = sizeof(ary)/sizeof(ary[0]);
for_each_permutation(&ary[0], &ary[arySize], 2, DumpPermutation<int>());
return 0;
}
Output of this program is:
[1 2 ]
[1 3 ]
[2 3 ]
[2 1 ]
[3 1 ]
[3 2 ]
If you want your combinations to include repeated elements like [11] [22] and [33], you can generate your list of combinations using the algorithm above, and then append to the generated list new elements, by doing something like this:
for( size_t i = 0; i < arySize; ++i )
{
cout << "[";
for( int j = 0; j < k; ++j )
cout << ary[i] << " ";
cout << "]" << endl;
}
...and the program output now becomes:
[1 2 ]
[1 3 ]
[2 3 ]
[2 1 ]
[3 1 ]
[3 2 ]
[1 1 ]
[2 2 ]
[3 3 ]
One way to do it would be with a simple counter that you internally interpret as base N, where N is the number of items in the array. You then extract each digit from the base N counter and use it as an index into your array. So if your array is [1,2] and the user specified length is 2, you have
Counter = 0, indexes are 0, 0
Counter = 1, indexes are 0, 1
Counter = 2, indexes are 1, 0
Counter = 3, indexes are 1, 1
The trick here will be your base-10 to base-N conversion code, which isn't terribly difficult.
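That base-10 to base-N conversion is just repeated divmod. A Python sketch of the counter idea (combos_by_counter is a hypothetical name, not from the answer):

```python
def combos_by_counter(symbols, length):
    """Enumerate every word of the given length by running a counter from
    0 to N**length - 1 and reading its digits in base N as indexes."""
    n = len(symbols)
    words = []
    for counter in range(n ** length):
        word = []
        value = counter
        for _ in range(length):
            value, digit = divmod(value, n)   # peel off the lowest digit
            word.append(symbols[digit])
        words.append(''.join(reversed(word)))  # most-significant digit first
    return words
```

With symbols ['1', '2'] and length 2, the counter values 0..3 produce '11', '12', '21', '22', matching the index table above.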
If you know the length beforehand, all you need is some for loops. Say, for length = 3:
for ( i = 0; i < N; i++ )
for ( j = 0; j < N; j++ )
for ( k = 0; k < N; k++ )
you now have ( i, j, k ), or a_i, a_j, a_k
Now to generalize it, just do it recursively, each step of the recursion with one of the for loops:
recurse( int[] a, int[] result, int index)
if ( index == N ) base case, process result
else
for ( i = 0; i < N; i++ ) {
result[index] = a[i]
recurse( a, result, index + 1 )
}
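The recursive pseudocode above translates almost line for line into a runnable version. A minimal Python sketch (all_words is a hypothetical name):

```python
def all_words(a, length):
    """Recursive enumeration: each recursion level plays the role of one
    nested for loop, filling one slot of the result."""
    results = []

    def recurse(prefix):
        if len(prefix) == length:        # base case: one complete word
            results.append(list(prefix))
            return
        for x in a:                      # try every symbol in this slot
            prefix.append(x)
            recurse(prefix)
            prefix.pop()                 # backtrack

    recurse([])
    return results
```

For a = [1, 2, 3] and length 2 this yields the 9 words from the question, starting with [1, 1].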
Of course, if you simply want all combinations, you can just think of each word as a base-N number, counting from 0 to N^k - 1, where k is the length.
Basically you would get, in base N (for k = 4):
0000 // take the first element four times
0001 // take the first element three times, then the second element
0002
...
000(N-1) // take the first element three times, then take the N-th element
1000 // take the second element, then the first element three times
1001
..
(N-1)(N-1)(N-1)(N-1) // take the last element four times
Using Peter's algorithm works great; however, if your letter set is too large or your string size too long, attempting to put all of the permutations in an array and returning the array won't work. The size of the array will be the size of the alphabet raised to the length of the string.
I created this in perl to take care of the problem:
package Combiner;
#package used to grab all possible combinations of a set of letters. Gets one every call, allowing reduced memory usage and faster processing.
use strict;
use warnings;
#initiate to use nextWord
#arguments are an array reference for the list of letters and the number of characters to be in the generated strings.
sub new {
my ($class, $phoneList, $length) = @_;
my $self = bless {
phoneList => $phoneList,
length => $length,
N_LETTERS => scalar @$phoneList,
}, $class;
$self->init;
$self;
}
sub init {
my ($self) = shift;
$self->{lindex} = [(0) x $self->{length}];
$self->{end} = 0;
$self;
}
#returns all possible combinations of N phonemes, one at a time.
sub nextWord {
my $self = shift;
return 0 if $self->{end} == 1;
my $word = [('-') x $self->{length}];
$$word[$_] = ${$self->{phoneList}}[${$self->{lindex}}[$_]]
for(0..$self->{length}-1);
#treat the string like addition; loop through 000, 001, 002, 010, 020, etc.
for(my $i = $self->{length}-1;;$i--){
if($i < 0){
$self->{end} = 1;
return $word;
}
${$self->{lindex}}[$i]++;
if (${$self->{lindex}}[$i] == $self->{N_LETTERS}){
${$self->{lindex}}[$i] = 0;
}
else{
return $word;
}
}
}
Call it like this: my $c = Combiner->new(['a','b','c','d'],20);. Then call nextWord to grab the next word; if nextWord returns 0, it means it's done.
Here's my implementation in Haskell:
g :: [a] -> [[a]] -> [[a]]
g alphabet = concat . map (\xs -> [ xs ++ [s] | s <- alphabet])
allwords :: [a] -> [[a]]
allwords alphabet = concat $ iterate (g alphabet) [[]]
Load this script into GHCi. Suppose that we want to find all strings of length less than or equal to 2 over the alphabet {'a','b','c'}. The following GHCi session does that:
*Main> take 13 $ allwords ['a','b','c']
["","a","b","c","aa","ab","ac","ba","bb","bc","ca","cb","cc"]
Or, if you want just the strings of length equal to 2:
*Main> filter (\xs -> length xs == 2) $ take 13 $ allwords ['a','b','c']
["aa","ab","ac","ba","bb","bc","ca","cb","cc"]
Be careful with allwords ['a','b','c'] for it is an infinite list!
I wrote this myself; it may be helpful for you:
#include<stdio.h>
#include <unistd.h>
int main()
{
FILE *file;
int i=0,l1,l2,l3=0;
char set[]="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890!@#$%&*.!@#$%^&*()";
int size=sizeof(set)-1;
char per[]="000";
//check urs all entered details here//
printf("Set length=%d, generating combinations\n",size);
// writing permutation here for length of 3//
for(l1=0;l1<size;l1++)
//first for loop which control left most char printed in file//
{
per[0]=set[l1];
// second for loop which control all intermediate char printed in file//
for(l2=0;l2<size;l2++)
{
per[1]=set[l2];
//third for loop which control right most char printed in file//
for(l3=0;l3<size;l3++)
{
per[2]=set[l3];
//apend file (add text to a file or create a file if it does not exist.//
file = fopen("file.txt","a+");
//writes array per to file named file.txt//
fprintf(file,"%s\n",per);
///Writing to file is completed//
fclose(file);
i++;
printf("Generating combination %d\r",i);
fflush(stdout);
usleep(1);
}
}
}
printf("\n%d combinations have been generated from the entered set of length %d\n",i,size);
puts("No combination is left :) ");
puts("Press any button to exit");
getchar();
}

Resources