Algorithm to find a sequence of sub-sequences - algorithm

Consider the sequence [1,2,3]. This sequence has the following 6 different contiguous subsequences: [1], [2], [3], [1,2], [2,3] and [1,2,3].
Note: the initial sequence may be up to 100 digits long.
How can I generate all of these subsequences?
I also enjoy researching this kind of algorithm further. What is this type of algorithm called?

Here is a C program that prints all contiguous subsequences. The algorithm uses nested loops.
#include <stdio.h>

/* Print every contiguous subsequence of A[0..n-1], one per line. */
void seq_print(int A[], int n)
{
    for (int i = 0; i <= n - 1; i++)      /* i is the end index */
    {
        for (int j = 0; j <= i; j++)      /* j is the start index */
        {
            int k = j;
            while (k <= i)                /* print A[j..i] */
            {
                printf("%d", A[k]);
                k++;
            }
            printf("\n");
        }
    }
}

int main(void)
{
    int A[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
    int n = 10;
    seq_print(A, n);
    return 0;
}

Your problem can be reduced to the combination problem. There are already many solutions on Stack Overflow; you can check this one, it may be useful for you.
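For instance, here is a small Python sketch of that reduction (my own, not from the linked answer): every non-empty contiguous subsequence corresponds to a combination of a start index and an end index, so itertools.combinations over the index range enumerates them all.

from itertools import combinations

def contiguous_subsequences(seq):
    # every non-empty contiguous subsequence is seq[i:j] for some 0 <= i < j <= len(seq)
    return [seq[i:j] for i, j in combinations(range(len(seq) + 1), 2)]

print(contiguous_subsequences([1, 2, 3]))
# -> [[1], [1, 2], [1, 2, 3], [2], [2, 3], [3]]

A sequence of length n therefore has n * (n + 1) / 2 such subsequences.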

It is called a power set (in your case, the empty set is excluded).
To build a power set, start with a set that contains only the empty set; then, for each item in the input set, extend the power set with all the subsets accumulated so far with the current item included (in Python):
def powerset(lst):
    S = [[]]
    for item in lst:
        S += [subset + [item] for subset in S]
    return S
Example:
print(powerset([1, 2, 3]))
# -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
To avoid producing all subsets at once, a recursive definition could be used:
a power set of an empty set is a set with an empty set in it;
a power set of a set with n items contains all subsets from a power set of a set with n - 1 items, plus all these subsets with the n-th item included.
def ipowerset(lst):
    if not lst:  # empty list
        yield []
    else:
        item, *rest = lst
        for subset in ipowerset(rest):
            yield subset
            yield [item] + subset
Example:
print(list(ipowerset([1, 2, 3])))
# -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
Yet another way to generate a power set is to generate r-length subsequences (combinations) for all r from zero to the size of the input set (itertools recipe):
from itertools import chain, combinations

def powerset_comb(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))
Example:
print(list(powerset_comb([1, 2, 3])))
# -> [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
See also what's a good way to combinate through a set?.

Related

What is the time complexity of a function that generates combinations of length r from n elements without repetitions or permutations?

The following function takes in a list of elements src as well as a combination length r. It prints out all possible combinations of length r, without repetition of an element inside a combination or repetition of a combination in a different order (permutation).
void fn(List<dynamic> src, int r, List<dynamic> tmp) {
  for (var i = 0; i < src.length; i++) {
    tmp.add(src.removeAt(i));
    if (tmp.length == r) print(tmp.toString());
    else if (i < src.length) fn(src.sublist(i), r, tmp);
    src.insert(i, tmp.removeLast());
  }
}
So, given n = [1,2,3,4,5] and r = 3, it would print out
[1, 2, 3]
[1, 2, 4]
[1, 2, 5]
[1, 3, 4]
[1, 3, 5]
[1, 4, 5]
[2, 3, 4]
[2, 3, 5]
[2, 4, 5]
[3, 4, 5]
How would you describe the time complexity of this function in Big O notation? Clearly both the length of src as well as r have to be taken into account. If I am not mistaken, the time complexity of a similar function printing out all combinations with repetitions and permutations would be O(n^r). But what is it in this case?
According to Stef's comment, the time complexity seems to be O(r(n choose r)).
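As a rough sanity check (a sketch of my own, not from the thread): the function emits exactly (n choose r) combinations, each of length r, so just materializing the output already costs on the order of r * (n choose r) work, which matches the bound above. In Python:

from itertools import combinations
from math import comb

n, r = 5, 3
combos = list(combinations(range(1, n + 1), r))  # the same 10 combinations printed above
assert len(combos) == comb(n, r)                 # (n choose r) combinations...
assert all(len(c) == r for c in combos)          # ...each of length r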

Algorithm: Find out which objects hold a subset of the input array

We have some objects (about 100,000 of them); each object has a property containing some integers (each in the range 1 to 20,000, at most 20 elements, no duplicate elements):
For example:
object_1: [1, 4]
object_2: [1, 3]
object_3: [100]
The problem is: given an input array of integers (called A), find out which objects hold a subset of A.
For example:
when A = [1], the output should be []
when A = [1, 4], the output should be [object_1]
when A = [1, 3, 4], the output should be [object_1, object_2]
The problem can be described in python:
from typing import List

# problem description
class Object(object):
    def __init__(self, integers):
        self.integers = integers

    def size(self):
        return len(self.integers)

object_1 = Object([1, 4])
object_2 = Object([1, 3])
object_3 = Object([100])

def _find_subset_objects(integers):  # type: (List[int]) -> List[Object]
    raise NotImplementedError()

def test(find_subset_objects=_find_subset_objects):
    assert find_subset_objects([1]) == []
    assert find_subset_objects([1, 4]) == [object_1]
    assert find_subset_objects([1, 3, 4]) == [object_1, object_2]
Is there some algorithm or data structure aimed at solving this kind of problem?
Store the objects in an array. The indices will be 0 ... ~100K. Then create two helper arrays.
The first one holds the element count for every object. I will call this array obj_total. (This could be omitted by calling object.size or something similar if you wish.)
The second one is initialized with zeroes. I will call it current_object_count.
For every integer value p where 0 < p <= 20000, create a list of indices, where index i in the list means that the value p is contained in the i-th object.
It is getting messy and I'm getting lost in the names, so time for the example with the objects that you used in the question:
objects = [[1, 4], [1, 3], [100]]
obj_total = [2, 2, 1]
current_object_count = [0, 0, 0]
object_1_ref = [0, 1]
object_2_ref = []
object_3_ref = [1]
object_4_ref = [0]
object_100_ref = [2]
object_refs = [object_1_ref, object_2_ref, ..., object_100_ref]
# Note that you will want to build these references dynamically.
# I'm writing them out only for the sake of clarity.
Now we are given the input array, for example [1, 3, 4]. For every element i in the array, we look at object_i_ref. We then use the indices in the reference array to increase the values in the current_object_count array.
Whenever you increase a value in the current_object_count[x], you also check against the obj_total[x] array. If the values match, the object in objects[x] is a subset of the input array and we can note it down.
When you finish with the input array you have all the results.
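Here is a minimal Python sketch of the counting approach described above (my own code; the names value_refs and find_subset_objects are made up for illustration):

from collections import defaultdict

objects = [[1, 4], [1, 3], [100]]
obj_total = [len(set(o)) for o in objects]      # element count per object

# value_refs[p]: indices of the objects that contain the integer p
value_refs = defaultdict(list)
for idx, obj in enumerate(objects):
    for p in set(obj):
        value_refs[p].append(idx)

def find_subset_objects(A):
    current_object_count = [0] * len(objects)
    result = []
    for p in set(A):
        for idx in value_refs.get(p, []):
            current_object_count[idx] += 1
            if current_object_count[idx] == obj_total[idx]:
                result.append(idx)              # objects[idx] is a subset of A
    return sorted(result)

print(find_subset_objects([1]))         # -> []
print(find_subset_objects([1, 4]))      # -> [0]    (object_1)
print(find_subset_objects([1, 3, 4]))   # -> [0, 1] (object_1 and object_2)

The work per query is proportional to the number of references touched, not to the total number of objects.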

Finding all subsets of a given set of distinct integers recursively

Given a set of distinct integers, I want to find all the possible subsets (for [1,2,3], the code should print [1], [1,2], [1,3], [1,2,3], [2], [2,3], [3], not necessarily in that order).
There are a few solutions (like this one) out there, but what I want to do is to re-implement the solution below with a new recursion and no for loop, by passing around the indexes (start = 0):
public void forSolution(List<List<Integer>> res, int[] nums, List<Integer> list, int start) {
    for (int i = start; i < nums.length; i++) {
        List<Integer> tmp = new ArrayList<>(list);
        tmp.add(nums[i]);
        res.add(new ArrayList<>(tmp));
        forSolution(res, nums, tmp, i + 1);
    }
}
I thought I needed to pass two integers to the method, one to keep track of the index and the other to keep the start point, but I am having trouble deciding when to increment the index (versus the start).
Any help would be appreciated.
I think the algorithm gets easier if you don't bother with indices.
The basic idea is that for any given sublist, each element of the original list is either included or not included. The list of all possible sublists is simply all possible combinations of including / not including each element.
For a recursive implementation, we can consider two cases:
The input list is empty. The empty list only has a single sublist, which is the empty list itself.
The input list consists of a first element x and a list of remaining elements rest. Here we can call our function recursively to get a list of all sublists of rest. To implement the idea of both including and not including x in our results, we return a list consisting of
each element of sublists(rest) with x added at the front, representing all sublists of our original list that contain x, and
each element of sublists(rest) as is (without x), representing all sublists of our original list that don't contain x.
For example, if the list is [1, 2, 3], we have x = 1 and rest = [2, 3]. The recursive call sublists(rest) produces [2, 3], [2], [3], []. For each of those sublists we
prepend x (which is 1), giving [1, 2, 3], [1, 2], [1, 3], [1], and
don't prepend x, giving [2, 3], [2], [3], [].
Concatenating those parts gives our total result as [1, 2, 3], [1, 2], [1, 3], [1], [2, 3], [2], [3], [].
Sample implementation:
use strict;
use warnings;

sub sublists {
    if (!@_) {
        return [];
    }
    my $x = shift @_;
    my @r = sublists(@_);
    return (map [$x, @$_], @r), @r;
}

for my $sublist (sublists 1, 2, 3) {
    print "[" . join(", ", @$sublist) . "]\n";
}
Output:
[1, 2, 3]
[1, 2]
[1, 3]
[1]
[2, 3]
[2]
[3]
[]
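For readers who don't speak Perl, the same include/exclude recursion could be sketched in Python like this (my translation, not part of the original answer):

def sublists(lst):
    if not lst:
        return [[]]          # the only sublist of an empty list is the empty list
    x, *rest = lst
    tails = sublists(rest)   # all sublists of the remaining elements
    # sublists that contain x, followed by sublists that don't
    return [[x] + t for t in tails] + tails

print(sublists([1, 2, 3]))
# -> [[1, 2, 3], [1, 2], [1, 3], [1], [2, 3], [2], [3], []]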

Intersect each array in an array - Ruby

I want to take each pair of arrays inside an array and find their intersection.
The input is an array of arrays, e.g. list_arrays as used in the script below.
The filter is a limit that needs to be applied to the length of each observed intersection.
The output is expected to be an array like [[2, 4]].
list_arrays = [[1, 2, 3, 4], [2, 5, 6], [1, 5, 8], [8, 2, 4]]
filter = 2
first_element_array = Array.new
list_arrays.each_with_index do |each_array1, index1|
  list_arrays.each_with_index do |each_array2, index2|
    unless index1 < index2
      intersection = each_array1 & each_array2
      if intersection.length == filter.to_i
        first_element_array.push(intersection)
      end
    end
  end
end
puts first_element_array
The procedure above takes a long time to execute because my array of arrays is very large (millions of entries). I need a simpler, more Ruby-like way to handle this problem. Does anyone have an idea?
Deciphering your code, it seems what you are asking for is: "Return the intersections between pair combinations of a collection if that intersection has a certain size (2 in the example)". I'd write (functional approach):
list_arrays = [[1, 2, 3, 4], [2, 5, 6], [1, 5, 8], [8, 2, 4]]

list_arrays.combination(2).map do |xs, ys|
  zs = xs & ys
  zs.size == 2 ? zs : nil
end.compact
#=> [[2, 4]]
Proposed optimizations: 1) use sets; 2) use a custom abstraction Enumerable#map_compact (equivalent to map + compact, but discarding nils on the fly; write it yourself); 3) filter out subarrays which can't satisfy the predicate:
require 'set'

xss = list_arrays.select { |xs| xs.size >= 2 }.map(&:to_set)
xss.combination(2).map_compact do |xs, ys|
  zs = xs & ys
  zs.size == 2 ? zs : nil
end
#=> [#<Set: {2, 4}>]

Partitioning a superset and getting the list of original sets for each partition

Introduction
While trying to do some categorization of nodes in a graph (which will be rendered differently), I find myself confronted with the following problem:
The Problem
Given a superset of elements S = {0, 1, ..., M} and a number n of non-disjoint subsets T_i thereof, with 0 <= i < n, what is the best algorithm to find the partition P of the set S?
P is the union of disjoint parts P_j of the original superset S, with 0 <= j < M, such that all elements x in the same P_j have the same list of "parents" among the "original" sets T_i.
Example
S = [1, 2, 3, 4, 5, 6, 8, 9]
T_1 = [1, 4]
T_2 = [2, 3]
T_3 = [1, 3, 4]
So all P_js would be:
P_1 = [1, 4] # all elements x have the same list of "parents": T_1, T_3
P_2 = [2] # all elements x have the same list of "parents": T_2
P_3 = [3] # all elements x have the same list of "parents": T_2, T_3
P_4 = [5, 6, 8, 9] # all elements x have the same list of "parents": S (so they're not in any of the T_i)
Questions
What are good functions/classes in the Python packages to compute all P_js and the list of their "parents", ideally restricted to numpy and scipy? Perhaps there's already a function which does just that.
What is the best algorithm to find those partitions P_js and for each one, the list of "parents"? Let's note T_0 = S
I think the brute force approach would be to generate all 2-combinations of T sets and split each pair into at most 3 disjoint sets, which would be added back to the pool of T sets; then repeat the process until all resulting Ts are disjoint, at which point we have arrived at our answer, the set of P sets. Caching all the "parents" along the way could be a little problematic.
I suspect a dynamic programming approach could be used to optimize the algorithm.
Note: I would have loved to write the math parts in latex (via MathJax), but unfortunately this is not activated :-(
The following should be linear time (in the number of elements in the Ts).
from collections import defaultdict

S = [1, 2, 3, 4, 5, 6, 8, 9]
T_1 = [1, 4]
T_2 = [2, 3]
T_3 = [1, 3, 4]
Ts = [S, T_1, T_2, T_3]

# encode each element's set of "parents" as a bitmask
parents = defaultdict(int)
for i, T in enumerate(Ts):
    for elem in T:
        parents[elem] += 2 ** i

# group together the elements that share the same bitmask
children = defaultdict(list)
for elem, p in parents.items():
    children[p].append(elem)

print(list(children.values()))
Result (the groups are correct; their order may differ between Python versions):
[[5, 6, 8, 9], [1, 4], [2], [3]]
The way I'd do this is to construct an M × n boolean array In where In(i, j) = (S_i ∈ T_j). You can construct that in O(Σ_j |T_j|), provided you can map an element of S onto its integer index in O(1), by scanning all of the sets T and marking the corresponding bit in In.
You can then read the "signature" of each element i directly from In by concatenating row i into a binary number of n bits. The signature is precisely the equivalence relationship of the partition you are seeking.
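A possible numpy sketch of that boolean-array idea (my own, assuming S can be indexed as shown; not part of the original answer):

import numpy as np

S = [1, 2, 3, 4, 5, 6, 8, 9]
Ts = [[1, 4], [2, 3], [1, 3, 4]]

index = {elem: i for i, elem in enumerate(S)}   # element of S -> row index, O(1) lookup
In = np.zeros((len(S), len(Ts)), dtype=bool)    # In[i, j] = (S[i] in Ts[j])
for j, T in enumerate(Ts):
    for elem in T:
        In[index[elem], j] = True

# group the elements of S by their row "signature"
partition = {}
for i, elem in enumerate(S):
    partition.setdefault(tuple(In[i]), []).append(elem)

print(list(partition.values()))
# -> [[1, 4], [2], [3], [5, 6, 8, 9]]
# each key of `partition` is the signature, i.e. which parent sets T_j contain the element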
By the way, I'm in total agreement with you about Math markup. Perhaps it's time to mount a new campaign.

Resources