Algorithm for nesting sets - algorithm

I have a large collection of sets, some of which are subsets of each other, like:
[{1, 2, 3, 4}, {1, 2}, {1, 5}, {1, 2, 3, 4, 5}, {2, 6}]
I'd like to take this collection and output a DAG of the partial order given by the subset relation:
{1, 2, 3, 4, 5} >= {1, 2, 3, 4} >= {1, 2}
{1, 2, 3, 4, 5} >= {1, 5}
{2, 6}
Is there a way to do this other than comparing all pairs of sets (which is prohibitive when there is a large number of sets)? This seems close to a number of set cover problems, but I can't find a problem that this reduces to.
One optimization is to create an inverted index, which would help avoid comparing sets that have no element in common, like {2, 6} and {1, 5}.
This problem seems related to Topological sorting and Linear Extensions of a partial order.
This is nearly a duplicate of Generate a DAG from a poset using stricly functional programming, but I'm open to a solution that is not purely functional.
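The inverted-index idea from the question can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `subset_dag` is hypothetical, the input sets are assumed to be distinct, and the worst case is still quadratic when many sets share elements. For each set, the candidate supersets are the sets that appear in the posting list of every one of its elements; a second pass keeps only the minimal supersets, so the edges form the covering relation rather than every subset pair.

```python
from collections import defaultdict


def subset_dag(sets):
    """Map each index i to the indices of the minimal strict supersets of sets[i].

    Assumes the input sets are distinct (hypothetical helper, not from the question).
    """
    sets = [frozenset(s) for s in sets]

    # Inverted index: element -> indices of the sets that contain it.
    index = defaultdict(set)
    for i, s in enumerate(sets):
        for e in s:
            index[e].add(i)

    # A superset of s must appear in the posting list of every element of s.
    supersets = {}
    for i, s in enumerate(sets):
        if s:
            postings = sorted((index[e] for e in s), key=len)  # smallest first
            cands = set(postings[0]).intersection(*postings[1:])
        else:
            cands = set(range(len(sets)))  # the empty set is below everything
        cands.discard(i)
        supersets[i] = cands

    # Keep edge i -> j only if no other superset k of i lies strictly below j.
    return {
        i: sorted(j for j in sup if not any(j in supersets[k] for k in sup))
        for i, sup in supersets.items()
    }


example = [{1, 2, 3, 4}, {1, 2}, {1, 5}, {1, 2, 3, 4, 5}, {2, 6}]
dag = subset_dag(example)
```

With the example collection, `dag[1]` is `[0]` ({1, 2} is covered by {1, 2, 3, 4}) and `dag[4]` is empty, so {2, 6} is never compared against sets it shares no full element list with.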

Related

How to use stack and a queue to generate all possible subsets on n-element set nonrecursively?

This is a question from Data Structures and Algorithms in Java by Michael T. Goodrich and Roberto Tamassia. How can this be done? Any help is appreciated.
This is what I thought; correct me if I am wrong:
Store the elements in a stack. Pop the first element and store it in the queue; the remaining elements in the stack form a subset. Restore the stack, then pop the second element (pop the first into the queue, pop the second into the queue, and push back from the queue); the remaining elements in the stack form another subset. Similarly, pop the third element and then the fourth. Next, do the same with two elements at a time, and then three. Did I misunderstand the question and stretch it too far?
I have defined the ArrayStack() and ArrayQueue() classes
n = [1, 2, 3, 4, 5]
st = ArrayStack()
q = ArrayQueue()
q.enqueue(set())  # the empty set is a subset of every set
for i in range(len(n)):
    st.push(n[i])
while not st.is_empty():
    cur_el = st.pop()
    print('cur', cur_el)
    # Each subset already in the queue yields two: itself, and itself plus cur_el.
    for i in range(len(q)):
        a = q.dequeue()
        print('a', a)
        q.enqueue(a)
        b = a | {cur_el}
        q.enqueue(b)
        print('b', b)
while not q.is_empty():
    x = q.dequeue()
    print(x)
OUTPUT
cur 5
a set()
b {5}
cur 4
a set()
b {4}
a {5}
b {4, 5}
cur 3
a set()
b {3}
a {4}
b {3, 4}
a {5}
b {3, 5}
a {4, 5}
b {3, 4, 5}
cur 2
a set()
b {2}
a {3}
b {2, 3}
a {4}
b {2, 4}
a {3, 4}
b {2, 3, 4}
a {5}
b {2, 5}
a {3, 5}
b {2, 3, 5}
a {4, 5}
b {2, 4, 5}
a {3, 4, 5}
b {2, 3, 4, 5}
cur 1
a set()
b {1}
a {2}
b {1, 2}
a {3}
b {1, 3}
a {2, 3}
b {1, 2, 3}
a {4}
b {1, 4}
a {2, 4}
b {1, 2, 4}
a {3, 4}
b {1, 3, 4}
a {2, 3, 4}
b {1, 2, 3, 4}
a {5}
b {1, 5}
a {2, 5}
b {1, 2, 5}
a {3, 5}
b {1, 3, 5}
a {2, 3, 5}
b {1, 2, 3, 5}
a {4, 5}
b {1, 4, 5}
a {2, 4, 5}
b {1, 2, 4, 5}
a {3, 4, 5}
b {1, 3, 4, 5}
a {2, 3, 4, 5}
b {1, 2, 3, 4, 5}
set()
{1}
{2}
{1, 2}
{3}
{1, 3}
{2, 3}
{1, 2, 3}
{4}
{1, 4}
{2, 4}
{1, 2, 4}
{3, 4}
{1, 3, 4}
{2, 3, 4}
{1, 2, 3, 4}
{5}
{1, 5}
{2, 5}
{1, 2, 5}
{3, 5}
{1, 3, 5}
{2, 3, 5}
{1, 2, 3, 5}
{4, 5}
{1, 4, 5}
{2, 4, 5}
{1, 2, 4, 5}
{3, 4, 5}
{1, 3, 4, 5}
{2, 3, 4, 5}
{1, 2, 3, 4, 5}
Only half-facetiously:
Create the stack and ignore it;
Create the queue and ignore it;
Output (iteratively) the numbers from 0 to (2^N)-1 in binary form of length N, with leading zeros.
Any required alternate representation of the subsets is easily generated from the binary representation of the generated integers, by interpreting each 1-bit as inclusion, and each 0-bit as exclusion.
Technically this meets the criteria as it (a) creates a stack; (b) creates a queue; and (c) non-recursively generates all possible sub-sets of an N element set. The stack and queue are strictly redundant for an iterative solution, so why they are asked for without any further guidance/constraints eludes me.
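The counting approach described above can be sketched in a few lines of Python. The function name `subsets_by_counting` is made up for illustration; each integer from 0 to 2^N - 1 is read as a bit pattern, with bit i deciding whether the i-th element is included.

```python
def subsets_by_counting(items):
    """Yield every subset of items, one per integer in [0, 2**n)."""
    n = len(items)
    for mask in range(1 << n):
        # Bit i of mask set -> items[i] is included in this subset.
        yield {items[i] for i in range(n) if (mask >> i) & 1}


all_subsets = list(subsets_by_counting([1, 2, 3]))
```

Here `all_subsets` has 8 entries, starting with the empty set (mask 0) and ending with {1, 2, 3} (mask 7).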
Update:
Random access to the input set can be replaced by use of the queue, by treating it as a circular buffer. (Of course, it then cannot be used for anything else.) A marker element can be added to the queue to indicate when each cycle is complete, but more natural for the algorithm as presented above is to process N elements at a time, N being known in advance. As each element is dequeued it is processed (added to or left out of the current subset) and enqueued again.
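A rough Python sketch of the circular-buffer variant, assuming `collections.deque` stands in for the queue and using a hypothetical function name. Each pass dequeues exactly N elements and enqueues each one again, so the queue is back in its original order when the next integer is processed.

```python
from collections import deque


def subsets_with_circular_queue(items):
    """Generate all subsets using only dequeue/enqueue access to the input."""
    queue = deque(items)
    n = len(queue)
    result = []
    for mask in range(1 << n):
        subset = set()
        for i in range(n):
            x = queue.popleft()   # dequeue the next element
            if (mask >> i) & 1:   # bit i decides inclusion
                subset.add(x)
            queue.append(x)       # enqueue it again; order is restored after n steps
        result.append(subset)
    return result


circular_subsets = subsets_with_circular_queue(["a", "b", "c"])
```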
I think I have a reasonable solution which was stolen from here: http://arstechnica.com/civis/viewtopic.php?f=20&t=96354&sid=e74a29103e9297050680afbba6b72f32&start=40
So the idea is that the Queue will hold the subsets, and your Stack will hold your original set. The empty set is a subset of every set, so we initialize the Queue with it. Then we pop each element off the Stack in turn. For each subset on the Queue, we dequeue that subset and enqueue two copies: 1) one without the new element (i.e. the same as the original), 2) one with the new element. The tricky part is keeping track of when you need to pop the next element off the Stack (i.e. when you're done with the current element). One way to do this would be to check the entire Queue for a matching set (meaning you've already added the subset you constructed, so stop). But a nicer, cleaner way is to use the empty set as a marker.
Basically you have your typical recursive solution:
GenerateSubsets(Set set)
{
    // Base case: the only subset of the empty set is the empty set itself.
    if (set == Set.EmptySet)
        return new List<Set> { set };
    var elem = set.Remove();
    var subsets = GenerateSubsets(set);
    // Add all of the subsets that contain elem (i.e. partition all subsets
    // by whether they contain elem or do not contain elem).
    // Add is assumed to return a new set rather than modify subset in place.
    subsets.AddRange(subsets.Map(subset => subset.Add(elem)));
    return subsets;
}
And we use the exact same idea, where the Queue iteratively constructs the subsets and the original set is stored in the Stack.
GenerateSubsets(Stack originalSetAsStack)
{
    var queue = new Queue { new Set() };
    while (!originalSetAsStack.IsEmpty)
    {
        var elem = originalSetAsStack.Pop();
        // The empty set is at the front of the queue whenever a pass starts,
        // so it must be processed before the marker check below.
        do
        {
            var currSubset = queue.Dequeue();
            var subsetWithElem = currSubset.Clone();
            subsetWithElem.Add(elem);
            // Add back in current subset. This has to be first,
            // so that the empty set returns to the front of the queue
            // once every existing subset has been processed.
            queue.Enqueue(currSubset);
            queue.Enqueue(subsetWithElem);
        }
        // This is key. This is how we know when to start
        // the next iteration!
        // This also assumes that any two empty sets are equal, and that
        // Peek returns the front element without removing it.
        while (queue.Peek() != new Set());
    }
    return queue;
}
To see why this solution stems from the recursive one, observe that we construct the subsets iteratively:
- Start with the empty set => ({})
- Take some element from the Stack
- Now for each subset in the Queue, enqueue two: one without the current element, and one with it => ({}, {elem})
- Now take the next element and do the same => ({}, {elem}, {nextElem}, {elem, nextElem})
- Now take the third element and do the same => ({}, {elem}, {nextElem}, {elem, nextElem}, {thirdElem}, {thirdElem, elem}, {thirdElem, nextElem}, {elem, nextElem, thirdElem})
- ...
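For readers following the Python code in the question, the same stack-plus-queue idea with the empty set as a marker might look like the sketch below. It is an illustration only: a plain list plays the role of the stack, `collections.deque` plays the role of the queue, the function name is hypothetical, and peeking at the front via `queue[0]` is assumed to be allowed (a strict queue class would need an equivalent "first" operation).

```python
from collections import deque


def subsets_stack_queue(items):
    """Build all subsets with a stack of elements and a queue of subsets."""
    stack = list(items)            # list used as a stack (append/pop)
    queue = deque([frozenset()])   # seeded with the empty set, which is also the marker
    while stack:
        elem = stack.pop()
        while True:
            cur = queue.popleft()
            queue.append(cur)            # copy without elem (must be enqueued first)
            queue.append(cur | {elem})   # copy with elem
            if not queue[0]:             # empty set is at the front again: pass done
                break
    return list(queue)


pairs = subsets_stack_queue([1, 2])
```

After each pass the number of subsets in the queue doubles, so an n-element input ends with 2^n subsets and the empty set at the front.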

Fast extraction of elements from nested lists

This is a basic question on list manipulation in Mathematica.
I have a large list where each element has the following schematic form: {List1, List2, Number}. For example,
a = {{{1,2,3},{1,3,2},5},{{1,4,5},{1,0,2},10},{{4,5,3},{8,3,4},15}}.
I want to make new lists which contain only some parts of each sublist. For example, pick out the third element from each sublist to give {5,10,15} from the above, or drop the third element to return {{{1,2,3},{1,3,2}},{{1,4,5},{1,0,2}},{{4,5,3},{8,3,4}}}.
I can do this by using the Table command to construct new lists, e.g.,
Table[a[[i]][[3]],{i,1,Length[a]}]
but I was wondering if there is a much faster way which would work on large lists.
In Mathematica version 5 and higher, you can use the keyword All in multiple ways to specify a list traversal.
For instance, instead of your Table, you can write
a[[All,3]]
Here Mathematica converts All into all acceptable indices for the first dimension then takes the 3rd one of the next dimension.
It is usually more efficient to do this than to write a loop in the Mathematica programming language. It works well for homogeneous lists, where the things you want to pick or scan through always exist.
Another efficient notation and shortcut is the ;; syntax:
a[[ All, 1 ;; 2]]
will scan the first level of a and take everything from the 1st to the 2nd element of each sublist, exactly like your second case.
In fact All and ;; can be combined to any number of levels. ;; can even be used in a way similar to any iterator in Mathematica:
a[[ start;;end;;step ]]
will do the same things as
Table[ a[[i]], {i,start,end,step}]
and you can omit one of start, end or step; it is then filled with its default: 1, Length[(of the implicit list)], and 1, respectively.
Other things you might want to look up in Mathematica's Help are ReplacePart and MapAt, which allow programmatic replacement of parts of structured expressions. The key to using them efficiently is that in ReplacePart you can use patterns to specify the coordinates of the things to be replaced, and you can define functions to apply to them.
Example with your data
ReplacePart[a, {_, 3} -> 0]
will replace every 3rd part of every sublist with 0.
ReplacePart[a, {i : _, 3} :> 2*a[[i, 3]]]
will double every 3rd part of every sublist.
As the authors suggest, the approaches based on Part need well-formed data, but Cases is built for robust separation of Lists:
Using your a,
a = {{{1, 2, 3}, {1, 3, 2}, 5}, {{1, 4, 5}, {1, 0, 2},
10}, {{4, 5, 3}, {8, 3, 4}, 15}};
Cases[a,{_List,_List,n_}:>n,Infinity]
{5, 10, 15}
The other pieces of a record can be extracted by similar forms.
Part-based approaches will gag on ill-formed data like:
badA = {{{1, 2, 3}, {1, 3, 2}, 5}, {{1, 4, 5}, {1, 0, 2},
10}, {{4, 5, 3}, {8, 3, 4}, 15}, {baddata}, {{1, 2, 3}, 4}};
badA[[All,3]]
{{{1, 2, 3}, {1, 3, 2}, 5}, {{1, 4, 5}, {1, 0, 2},
10}, {{4, 5, 3}, {8, 3, 4}, 15}, {baddata}, {{1, 2, 3},
4}}[[All, 3]]
but Cases will skip over garbage, operating only on conforming data:
Cases[badA, {_List, _List, s_} :> s, Infinity]
{5, 10, 15}
hth,
Fred Klingener
You can use Part (shorthand [[...]]) for this:
a[[All, 3]]
a[[All, {1, 2}]]

How to create a Mathematica list of fewer columns from larger list of many columns

I have a list "data1":
{{1, 6, 4.5, 1, 141.793, 2.31634, 27.907}, {2, 7, 4.5, 1, 133.702,
2.28725, 26.7442}, {3, 5, 5, 1, 136.546, 2.33522, 25.5814}, {4, 8,
5, 1, 104.694, 2.27871, 24.4186}}
What I would like to do is to create a new table with only the first two columns of each element. So my new table would be:
{{1,6},{2,7},{3,5},{4,8}}
I tried
data1[[All, 1][All, 2]]
and other variations but I am not understanding how to capture the desired fields. Thank you for your help.
Just have a range or list of the indices you want as the second argument, like so:
In[71]:= data[[All, {1, 2}]]
Out[71]= {{1, 6}, {2, 7}, {3, 5}, {4, 8}}

Find all partitions from a list of subsets

Given a list of specific subsets like
S = [ {1, 2}, {3, 4}, {1}, {2, 3}, {4}, {3} ]
and a "universe" set like
U = {1, 2, 3, 4}
what elegant and simple algorithm can be used to find all the possible partitions of U made of sets from S? With this example, such partitions include
{1, 2} {3, 4}
{1, 2} {3} {4}
etc.
Use recursion.
Split the problem into two smaller problems based on whether the first element is used or not:
Partition using {1,2} and any of the remaining sets.
Partition without using {1,2} but using any of the remaining sets.
These two options cover all possibilities.
The first is solved by partitioning {3,4} using only [ {3, 4}, {1}, {2, 3}, {4}, {3} ].
The second is solved by partitioning {1,2,3,4} using only [ {3, 4}, {1}, {2, 3}, {4}, {3} ].
To see how to solve these smaller problems refer to this similar question.
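The use-it-or-skip-it recursion described above could be sketched in Python like this. The function name `partitions` and its argument layout are assumptions made for the example, and the running time is exponential in the number of candidate sets in the worst case.

```python
def partitions(universe, candidates):
    """Yield every partition of `universe` built from the sets in `candidates`."""
    universe = frozenset(universe)
    if not universe:
        yield []      # nothing left to cover: one (empty) partition
        return
    if not candidates:
        return        # elements remain but no sets are left: dead end
    first, rest = frozenset(candidates[0]), candidates[1:]
    # Option 1: use `first`, provided it lies entirely inside the uncovered part.
    if first and first <= universe:
        for tail in partitions(universe - first, rest):
            yield [set(first)] + tail
    # Option 2: do not use `first`.
    yield from partitions(universe, rest)


S = [{1, 2}, {3, 4}, {1}, {2, 3}, {4}, {3}]
U = {1, 2, 3, 4}
found = list(partitions(U, S))
```

For the example in the question this finds three partitions: {1, 2} {3, 4}, then {1, 2} {4} {3}, then {1} {2, 3} {4}.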

How to do Tally-like operation on list based on elements' total in Mathematica

For example, I have a list like:
{{1, 2, 3}, {6}, {4, 5}, {1, 6}, {2, 2, 3, 2}, {9}, {7}, {2, 5}}
And I want to get a tallied list based on the total of the lists' elements.
In this case, I want the output to be:
{{6, {{1, 2, 3}, {6}}}, {7, {{2, 5}, {1, 6}, {7}}}, {9, {{4, 5}, {2, 2, 3, 2}, {9}}}}
How to do this conveniently in Mathematica?
Thanks a lot.
Here's my attempt - a little simpler than Yoda's
lst = {{1, 2, 3}, {6}, {4, 5}, {1, 6}, {2, 2, 3, 2}, {9}, {7}, {2, 5}};
{Total@First@#, #} & /@ GatherBy[lst, Total]
If you don't want repeated elements, then you could use
{Total@First@#, Union[#]} & /@ GatherBy[lst, Total]
Or if you really wanted a tally-like operation
{Total@First@#, Tally[#]} & /@ GatherBy[lst, Total]
While I would probably do this just as @Simon did, let us not forget that Reap and Sow can be used as well:
Reap[Sow[#, Total[#]] & /@ lst, _, List][[2]]
where lst is the original list. This will be somewhat less efficient than the GatherBy-based code, but also pretty fast. One can speed up the above code about 1.5 times by rewriting it as
Reap[Sow @@@ Transpose[{lst, Total[lst, {2}]}], _, List][[2]]
in which case it becomes about 1.5 times slower than the code based on GatherBy. Note that the speed difference between the two methods is not very dramatic here, because the list is ragged and therefore not packed, and GatherBy does not have here the speed advantage it normally enjoys for packed arrays.
Don't overlook Tr. This is shorter and faster:
{Tr@#, {##}} & @@@ GatherBy[lst, Tr]
