Fill a nested structure with values from a linear supply stream - algorithm

I'm stuck on the following problem:
Imagine we have an array structure, any structure, but for this example let's use:
[
[ [1, 2], [3, 4], [5, 6] ],
[ 7, 8, 9, 10 ]
]
For convenience, I transform this structure into a flat array like:
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
Imagine that after certain operations our array looks like this:
[ 1, 2, 3, 4, 12515, 25125, 12512, 8, 9, 10]
NOTE: those values are the result of some operation; I just want to point out that the operation is independent of the structure and of the values' positions.
What I would like to know is... given the first array structure, how can I transform the last flat array into the same structure as the first? So it will look like:
[
[ [1, 2], [3, 4] , [12515, 25125] ],
[ 12512, 8, 9, 10]
]
Any suggestions? I was just hardcoding the positions into the given structure, but that's not dynamic.

Just recurse through the structure, and use an iterator to generate the values in order:
function fillWithStream(structure, iterator) {
    for (var i=0; i<structure.length; i++)
        if (Array.isArray(structure[i]))
            fillWithStream(structure[i], iterator);
        else
            structure[i] = getNext(iterator);
}
function getNext(iterator) {
    const res = iterator.next();
    if (res.done) throw new Error("not enough elements in the iterator");
    return res.value;
}
var structure = [
    [ [1, 2], [3, 4], [5, 6] ],
    [ 7, 8, 9, 10 ]
];
var seq = [1, 2, 3, 4, 12515, 25125, 12512, 8, 9, 10];
fillWithStream(structure, seq[Symbol.iterator]())
console.log(JSON.stringify(structure));
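For comparison, here is a minimal Python sketch of the same idea that builds a new structure instead of mutating the input. The names `fill_with_stream` and `refill` are made up for this illustration, and the sketch assumes that anything that is not a list counts as a leaf.

```python
def fill_with_stream(structure, values):
    """Return a copy of `structure` whose leaves are replaced, in order, by `values`."""
    it = iter(values)  # one shared iterator, consumed as leaves are visited

    def refill(node):
        if isinstance(node, list):
            # inner node: rebuild each child, left to right
            return [refill(child) for child in node]
        try:
            return next(it)  # leaf: take the next value from the supply
        except StopIteration:
            raise ValueError("not enough values for the structure") from None

    return refill(structure)


structure = [[[1, 2], [3, 4], [5, 6]], [7, 8, 9, 10]]
flat = [1, 2, 3, 4, 12515, 25125, 12512, 8, 9, 10]
print(fill_with_stream(structure, flat))
```

Because the iterator is shared by all recursive calls, no explicit position index has to be passed around.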

Here is a sketch in Scala. Whatever your language is, you first have to represent the tree-like data structure somehow:
sealed trait NestedArray
case class Leaf(arr: Array[Int]) extends NestedArray {
  override def toString = arr.mkString("[", ",", "]")
}
case class Node(children: Array[NestedArray]) extends NestedArray {
  override def toString =
    children
      .flatMap(_.toString.split("\n"))
      .map(" " + _)
      .mkString("[\n", "\n", "\n]")
}
object NestedArray {
  def apply(ints: Int*) = Leaf(ints.toArray)
  def apply(cs: NestedArray*) = Node(cs.toArray)
}
The only important part is the distinction between the leaf nodes, which hold arrays of integers, and the inner nodes, which hold their child nodes in arrays. The toString methods and extra constructors are not that important; they are mostly there for the little demo below.
Now you essentially want to build an encoder-decoder, where the encode part simply flattens everything, and decode part takes another nested array as argument, and reshapes a flat array into the shape of the nested array. The flattening is very simple:
def encode(a: NestedArray): Array[Int] = a match {
  case Leaf(arr) => arr
  case Node(cs) => cs flatMap encode
}
Restoring the structure isn't all that difficult either. I've decided to keep track of the position in the array by passing around an explicit int index:
def decode(
  shape: NestedArray,
  flatArr: Array[Int]
): NestedArray = {
  def recHelper(
    startIdx: Int,
    subshape: NestedArray
  ): (Int, NestedArray) = subshape match {
    case Leaf(a) => {
      val n = a.size
      val subArray = Array.ofDim[Int](n)
      System.arraycopy(flatArr, startIdx, subArray, 0, n)
      (startIdx + n, Leaf(subArray))
    }
    case Node(cs) => {
      var idx = startIdx
      val childNodes = for (c <- cs) yield {
        val (i, a) = recHelper(idx, c)
        idx = i
        a
      }
      (idx, Node(childNodes))
    }
  }
  recHelper(0, shape)._2
}
Your example:
val original = NestedArray(
  NestedArray(NestedArray(1, 2), NestedArray(3, 4), NestedArray(5, 6)),
  NestedArray(NestedArray(7, 8, 9, 10))
)
println(original)
Here is what it looks like as ASCII-tree:
[
 [
  [1,2]
  [3,4]
  [5,6]
 ]
 [
  [7,8,9,10]
 ]
]
Now reconstruct a tree of same shape from a different array:
val flatArr = Array(1, 2, 3, 4, 12515, 25125, 12512, 8, 9, 10)
val reconstructed = decode(original, flatArr)
println(reconstructed)
This gives you:
[
 [
  [1,2]
  [3,4]
  [12515,25125]
 ]
 [
  [12512,8,9,10]
 ]
]
I hope that should be more or less comprehensible for anyone who does some functional programming in a not-too-remote descendant of ML.

It turns out I already answered a very similar question a few months back.
The code there needs to be tweaked a little bit, to make it fit here. In Scheme:
(define (merge-tree-fringe vals tree k)
  (cond
    [(null? tree)
     (k vals '())]
    [(not (pair? tree))                 ; for each leaf:
     (k (cdr vals) (car vals))]         ;   USE the first of vals
    [else
     (merge-tree-fringe vals (car tree) (lambda (Avals r)      ; collect 'r' from car,
      (merge-tree-fringe Avals (cdr tree) (lambda (Dvals q)    ; collect 'q' from cdr,
       (k Dvals (cons r q))))))]))      ; return the last vals and the combined results
The first argument is a linear list of values; the second is the nested list whose structure is to be re-created. Making sure there are enough elements in the linear list of values is on you.
We call it as
> (merge-tree-fringe '(1 2 3 4 5 6 7 8) '(a ((b) c) d) (lambda (vs r) (list r vs)))
'((1 ((2) 3) 4) (5 6 7 8))
> (merge-tree-fringe '(1 2 3 4 5 6 7 8) '(a ((b) c) d) (lambda (vs r) r))
'(1 ((2) 3) 4)
There's some verbiage at the linked answer explaining what's going on. Long story short, it's written in CPS (continuation-passing style):
We process a part of the nested structure while substituting the leaves with the values from the linear supply; then we're processing the rest of the structure with the remaining supply; then we combine back the two results we got from processing the two sub-parts. For LISP-like nested lists, it's usually the "car" and the "cdr" of the "cons" cell, i.e. the tree's top node.
This is doing what Bergi's code is doing, essentially, but in a functional style.
In an imaginary pattern-matching pseudocode, which might be easier to read/follow, it is
merge-tree-fringe vals tree = g vals tree (vs r => r)
  where
  g vals [a, ...d] k = g vals a (avals r =>      -- avals: vals remaining after 'a'
                         g avals d (dvals q =>   -- dvals: remaining after 'd'
                           k dvals [r, ...q] ))  -- combine the results
  g vals []        k = k vals []                 -- empty
  g [v, ...vs] _   k = k vs v                    -- leaf: replace it
This computational pattern of threading a changing state through the computations is exactly what the State monad is about; with Haskell's do notation the above would be written as
merge_tree_fringe vals tree = evalState (g tree) vals
  where
  g [a, ...d] = do { r <- g a ; q <- g d ; return [r, ...q] }
  g []        = do { return [] }
  g _         = do { [v, ...vs] <- get ; put vs ; return v }  -- leaf: replace
put and get work with the state being manipulated, updated and passed around implicitly; vals being the initial state; the final state being silently discarded by evalState, like our (vs r => r) above also does, but explicitly so.
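To make the state threading concrete, here is a rough Python sketch in which the remaining supply is passed in and handed back explicitly, as a pair of (rebuilt tree, leftover values). The function name mirrors the one above, but the list-based representation is an assumption of this sketch, and the slicing makes it quadratic, so treat it as an illustration rather than an efficient implementation.

```python
def merge_tree_fringe(vals, tree):
    """Rebuild `tree` with leaves taken from `vals`; return (new_tree, leftover_vals)."""
    if isinstance(tree, list):
        rebuilt = []
        for child in tree:
            # the state (remaining vals) flows out of one child and into the next
            new_child, vals = merge_tree_fringe(vals, child)
            rebuilt.append(new_child)
        return rebuilt, vals
    # leaf: replace it with the first value, pass on the rest
    return vals[0], vals[1:]


new_tree, leftover = merge_tree_fringe([1, 2, 3, 4, 5, 6, 7, 8], ["a", [["b"], "c"], "d"])
print(new_tree, leftover)
```

Dropping `leftover` at the top level corresponds to what evalState and the (vs r => r) continuation do.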

Related

Failing to print a list in a sorted fashion without actually sorting the list

I am struggling to print a list in a sorted fashion without actually "sorting" the list.
arr = [3, 2, 1, 4, 5]
count = 0
current = arr[0]
prev = -1
while count < len(arr):
    for item in arr:
        if current < item > prev:
            current = item
    prev = current
    count = count + 1
    print(current)
Output:
5
5
5
5
5
I don't want to sort the list. Is there a way to print the items in a sorted fashion without sorting or otherwise changing the original list?
It's pretty unclear what you're trying to do. If you want a sorted copy, you could make a list containing the indices of the original objects ([0, 1, 2, ..., n - 1]), sort these by comparing the original values at those indices, and then map this sorted list back to the values from the first list.
But much simpler still is just to sort a shallow clone of the list.
If you read Javascript, here's a demonstration of that idea, using a simple range helper function to create the list of indices:
const arr = [8, 6, 7, 5, 3, 0, 9]
const range = (lo, hi) =>
  [...Array (hi - lo)] .map((_, i) => lo + i)
const indexSort = (ns) =>
  range (0, ns .length)
    .sort ((i, j) => ns [i] - ns [j])
    .map (x => ns [x])
console .log ('indexSort:', indexSort (arr))
console .log ('shallow clone:', [...arr] .sort ((a, b) => a - b))
console .log ('no mutation of original array:', arr)
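Since the question is in Python, the same index-sort idea might look like the sketch below. The helper names `sorted_indices` and `index_sorted` are invented for this example; the built-in `sorted` already returns a new list, so the original is never modified.

```python
def sorted_indices(values):
    """Indices of `values`, ordered by the value stored at each index."""
    return sorted(range(len(values)), key=lambda i: values[i])


def index_sorted(values):
    """Values in ascending order, read through the sorted index list."""
    return [values[i] for i in sorted_indices(values)]


arr = [8, 6, 7, 5, 3, 0, 9]
for item in index_sorted(arr):
    print(item)
print("original is untouched:", arr)
```

If the indices themselves are not needed, `sorted(arr)` gives the same values directly.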

Multiply collection and randomly merge with other - Apache Spark

I am given two collections (RDDs) and a number of samples. Let's say:
val v = sc.parallelize(List("a", "b", "c"))
val a = sc.parallelize(List(1, 2, 3, 4, 5))
val samplesCount = 2
I want to create two collections (samples) consisting of pairs where one value is from 'v' and the second one is from 'a'. Each collection must consist of all values from 'v' and random values from 'a'.
Example result would be:
(
(("a", 3), ("b", 5), ("c", 1)),
(("a", 4), ("b", 2), ("c", 5))
)
One more thing to add: the values from v or a can't repeat within a sample.
I can't think of any good way to achieve this.
You randomly shuffle the RDD to be sampled and then join the two RDDs by line index:
import org.apache.spark.rdd.RDD

// Attach a random key to every element, sort by that key, then drop the key.
def shuffle[A: reflect.ClassTag](a: RDD[A]): RDD[A] = {
  val randomized = a.map(util.Random.nextInt -> _)
  randomized.sortByKey().values
}
// Pair up the elements of both RDDs that share the same line index.
def joinLines[A: reflect.ClassTag, B](a: RDD[A], b: RDD[B]): RDD[(A, B)] = {
  val aNumbered = a.zipWithIndex.map { case (x, i) => (i, x) }
  val bNumbered = b.zipWithIndex.map { case (x, i) => (i, x) }
  aNumbered.join(bNumbered).values
}
val v = sc.parallelize(List("a", "b", "c"))
val a = sc.parallelize(List(1, 2, 3, 4, 5))
val sampled = joinLines(v, shuffle(a))
RDDs are immutable, so you don't need to "multiply" anything. If you want multiple samples just do:
val sampledRDDs: Seq[RDD[(String, Int)]] =
(1 to samplesCount).map(_ => joinLines(v, shuffle(a)))
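Outside of Spark, the shuffle-then-pair idea can be sketched in a few lines of plain Python. This is only an in-memory illustration, not Spark code: `sample_pairs` is a hypothetical helper, and `random.sample` stands in for "shuffle and keep the first len(v) elements".

```python
import random


def sample_pairs(v, a, samples_count, rng=random):
    """Build `samples_count` samples; each pairs every element of `v`
    with a distinct, randomly chosen element of `a`."""
    if len(a) < len(v):
        raise ValueError("'a' needs at least as many elements as 'v'")
    # rng.sample picks len(v) distinct positions of `a` in random order
    return [list(zip(v, rng.sample(a, len(v)))) for _ in range(samples_count)]


v = ["a", "b", "c"]
a = [1, 2, 3, 4, 5]
for sample in sample_pairs(v, a, 2):
    print(sample)
```

Each call draws independently, so different samples may reuse the same value of `a`, but no value repeats within one sample.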

Grouping adjacent elements in a list

Let's say I want to write a function that does this:
input: [1,1,3,3,4,2,2,5,6,6]
output: [[1,1],[3,3],[4],[2,2],[5],[6,6]]
It's grouping adjacent elements that are same.
What should the name of this method be? Is there a standard name for this operation?
In [1,1,3,3,4,2,2,5,6,6], a thing like [1,1] is very often referred to as a run (as in run-length encoding, see RLE in Scala). I'd therefore call the method groupRuns.
import scala.annotation.tailrec

@tailrec
def groupRuns[A](c: Seq[A], acc: Seq[Seq[A]] = Seq.empty): Seq[Seq[A]] = {
  c match {
    case Seq() => acc
    case xs =>
      // split off the leading run of elements equal to the head
      val (same, rest) = xs.span { _ == xs.head }
      groupRuns(rest, acc :+ same)
  }
}
scala> groupRuns(Vector(1, 1, 3, 3, 4, 2, 2, 5, 6, 6))
res7: Seq[Seq[Int]] = List(Vector(1, 1), Vector(3, 3), Vector(4), Vector(2, 2), Vector(5), Vector(6, 6))
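For readers coming from Python, the same operation is available through the standard library: `itertools.groupby` groups consecutive equal elements. The wrapper name `group_runs` below is just a label chosen for this sketch.

```python
from itertools import groupby


def group_runs(items):
    """Split `items` into lists of adjacent equal elements (runs)."""
    # groupby only merges neighbours, so equal values that are
    # separated by other values end up in different runs
    return [list(run) for _, run in groupby(items)]


print(group_runs([1, 1, 3, 3, 4, 2, 2, 5, 6, 6]))
```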

Looking for the best solution

Consider this list composed of objects which are instances of case classes:
A, B, Opt(A),C, Opt(D), F, Opt(C), G, Opt(H)
I want to normalize this list to get this result:
A, B, C, Opt(D), F, G, Opt(H)
As you can see, if there are elements A and Opt(A), I replace them with just A; put another way, I have to remove the Opt(A) element.
I would like:
the best solution in terms of performance
shortest solution
This might be a little more concise, as filtering is what you want ;-):
scala> List(1,2,3,Some(4),5,Some(5))
res0: List[Any] = List(1, 2, 3, Some(4), 5, Some(5))
scala> res0.filter {
| case Some(x) => !res0.contains(x)
| case _ => true
| }
res1: List[Any] = List(1, 2, 3, Some(4), 5)
edit: For large collections it might be good to use a toSet or directly use a Set.
Not the most efficient solution, but certainly a simple one.
scala> case class Opt[A](a: A)
defined class Opt
scala> val xs = List(1, 2, Opt(1), 3, Opt(4), 6, Opt(3), 7, Opt(8))
xs: List[Any] = List(1, 2, Opt(1), 3, Opt(4), 6, Opt(3), 7, Opt(8))
scala> xs flatMap {
| case o @ Opt(x) => if(xs contains x) None else Some(o)
| case x => Some(x)
| }
res5: List[Any] = List(1, 2, 3, Opt(4), 6, 7, Opt(8))
If you don't care about order then efficiency leads you to use a Set:
xs.foldLeft(Set.empty[Any])({ case (set, x) => x match {
  case Some(y) => if (set contains y) set else set + x
  case y => if (set contains Some(y)) set - Some(y) + y else set + y
}}).toList
Alternatively:
val (opts, ints) = xs.toSet.partition(_.isInstanceOf[Option[_]])
opts -- (ints map (Option(_))) ++ ints toList
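An order-preserving variant that still avoids the quadratic `contains` scan can be sketched in Python: collect the plain values into a set first, then filter in one pass. The `Opt` class and `normalize` function here are stand-ins invented for the example, and the sketch assumes all elements are hashable.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Opt:
    """Hypothetical stand-in for the question's Opt wrapper."""
    value: Any


def normalize(items):
    """Drop every Opt(x) whose x also appears unwrapped; keep the original order."""
    plain = {x for x in items if not isinstance(x, Opt)}  # set lookup is O(1) on average
    return [x for x in items if not (isinstance(x, Opt) and x.value in plain)]


xs = [1, 2, Opt(1), 3, Opt(4), 6, Opt(3), 7, Opt(8)]
print(normalize(xs))
```

This makes two linear passes over the list, one to build the set and one to filter.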

What algorithm can calculate the power set of a given set?

I would like to efficiently generate a unique list of combinations of numbers based on a starting list of numbers.
example start list = [1,2,3,4,5] but the algorithm should work for [1,2,3...n]
result =
[1],[2],[3],[4],[5]
[1,2],[1,3],[1,4],[1,5]
[1,2,3],[1,2,4],[1,2,5]
[1,3,4],[1,3,5],[1,4,5]
[2,3],[2,4],[2,5]
[2,3,4],[2,3,5]
[3,4],[3,5]
[3,4,5]
[4,5]
Note: I don't want duplicate combinations, although I could live with them. E.g. in the above example I don't really need the combination [1,3,2] because it is already present as [1,2,3].
Just count from 0 to 2^n - 1 and print the numbers according to the binary representation of your count. A 1 means you print that number and a 0 means you don't. Example:
set is {1, 2, 3, 4, 5}
count from 0 to 31:
count = 00000 => print {}
count = 00001 => print {1} (or 5, the order in which you do it really shouldn't matter)
count = 00010 => print {2}
00011 => print {1, 2}
00100 => print {3}
00101 => print {1, 3}
00110 => print {2, 3}
00111 => print {1, 2, 3}
...
11111 => print {1, 2, 3, 4, 5}
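The counting scheme above translates almost directly into code. Below is a small Python sketch of it; `power_set` is a name chosen for this example, and bit k of the counter decides whether the k-th element is included.

```python
def power_set(items):
    """All subsets of `items`, enumerated by counting from 0 to 2^n - 1."""
    items = list(items)
    n = len(items)
    return [
        # keep element k exactly when bit k of the counter is set
        [items[k] for k in range(n) if (count >> k) & 1]
        for count in range(1 << n)
    ]


for subset in power_set([1, 2, 3]):
    print(subset)
```

The result always has 2^n entries, so this is only practical for small n.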
There is a name for what you're asking. It's called the power set.
Googling for "power set algorithm" led me to this recursive solution.
Ruby Algorithm
def powerset!(set)
  return [set] if set.empty?
  p = set.pop
  subset = powerset!(set)
  subset | subset.map { |x| x | [p] }
end
Power Set Intuition
If S = (a, b, c) then the powerset(S) is the set of all subsets
powerset(S) = {(), (a), (b), (c), (a,b), (a,c), (b,c), (a,b,c)}
The first "trick" is to try to define recursively.
What would be a stop state?
S = () has what powerset(S)?
How to get to it?
Reduce set by one element
Consider taking an element out - in the above example, take out {c}
S = (a,b) then powerset(S) = {(), (a), (b), (a,b)}
What is missing?
powerset(S) = {(c), (a,c), (b,c), (a,b,c)}
hmmm
Notice any similarities? Look again...
powerset(S) = {(), (a), (b), (c), (a,b), (a,c), (b,c), (a,b,c)}
take any element out
powerset(S) = {(), (a), (b), (c), (a,b), (a,c), (b,c), (a,b,c)} is
powerset(S - {c}) = {(), (a), (b), (a,b)} unioned with
{c} U powerset(S - {c}) = { (c), (a,c), (b,c), (a,b,c)}
powerset(S) = powerset(S - {ei}) U ({ei} U powerset(S - {ei}))
where ei is an element of S (a singleton)
Pseudo-algorithm
1. Is the set passed empty? Done (Note that power set of {} is {{}})
2. If not, take an element out
   - recursively call method on the remainder of the set
   - return the set composed of the Union of
     1. the powerset of the set without the element (from the recursive call)
     2. this same set (i.e., 2.1) but with each element therein unioned with the element initially taken out
def power(a)
  (0..a.size).map {|x| a.combination(x).to_a}.flatten(1)
end
From a comment by OP (copy edited):
The example is simplified form of what I am actually doing. The numbers are objects which have a property "Qty", I want to sum the quantities for every possible combination then chose the combination that uses the most objects where the sum of the quantities N is within some other boundaries, e.g. x < N < y.
What you have is an optimization problem. What you have assumed is that the right way to approach this optimization problem is to decompose it into an enumeration problem (which is what you asked) and then a filtration problem (which presumably you know how to do).
What you don't yet realize is that this kind of solution only works either (a) for theoretical analysis, or (b) for very small values of n. The enumeration you're asking for is exponential in n, which means you'd end up with something that would take far too long to run in practice.
Therefore, figure out how to pose your optimization problem as such, write a new question, and edit this one to point to it.
Same as hobodave's answer, but iterative and faster (in Ruby). It also works with both Array and Set.
def Powerset(set)
  ret = set.class[set.class[]]
  set.each do |s|
    deepcopy = ret.map { |x| set.class.new(x) }
    deepcopy.map { |r| r << s }
    ret = ret + deepcopy
  end
  return ret
end
In my tests, IVlad's method doesn't work so well in Ruby.
Recursive and iterative solutions to calculate the power set in Scheme. Not fully tested, though.
(define (power_set set)
  (cond
    ((empty? set) (list '()))
    (else
      (let ((part_res (power_set (cdr set))))
        (append (map (lambda (s) (cons (car set) s)) part_res) part_res)))))
(define (power_set_iter set)
  (let loop ((res '(())) (s set))
    (if (empty? s)
        res
        (loop (append (map (lambda (i) (cons (car s) i)) res) res) (cdr s)))))
Here is a recursive solution, similar to the ones already posted. A few assertions are provided as a kind of unit test.
I didn't manage to use the Python "set" type to represent a set of sets. Python said "set objects are unhashable" when trying an expression like "s.add(set())".
See also solutions in many programming languages at http://rosettacode.org/wiki/Power_set
def generatePowerSet(s, niceSorting = True):
    """Generate power set of a given set.
    The given set, as well as the returned set of sets, are implemented
    as lists.
    "niceSorting" optionally sorts the powerset by increasing subset size.
    """
    import copy
    import functools
    def setCmp(a,b):
        """Compare two sets (implemented as lists) for nice sorting"""
        if len(a) < len(b):
            return -1
        elif len(a) > len(b):
            return 1
        else:
            if len(a) == 0:
                return 0
            else:
                if a < b:
                    return -1
                elif a > b:
                    return 1
                else:
                    return 0
    # Initialize the power set "ps" of set "s" as an empty set
    ps = list()
    if len(s) == 0:
        ps.append(list())
    else:
        # Generate "psx": the power set of "sx",
        # which is "s" without a chosen element "x"
        sx = copy.copy(s)
        x = sx.pop()
        psx = generatePowerSet(sx, False)
        # Include "psx" to "ps"
        ps.extend(psx)
        # Include to "ps" any set, which contains "x"
        # Such included sets are obtained by adding "x" to any subset of "sx"
        for y in psx:
            yx = copy.copy(y)
            yx.append(x)
            ps.append(yx)
    if niceSorting:
        # cmp_to_key adapts the comparison function; sort(cmp=...) is gone in Python 3
        ps.sort(key=functools.cmp_to_key(setCmp))
    return ps
assert generatePowerSet([]) == [[]]
assert generatePowerSet(['a']) == [[], ['a']]
assert generatePowerSet(['a', 'b']) == [[], ['a'], ['b'], ['a', 'b']]
assert generatePowerSet(['a', 'b','c']) == [[],
                                            ['a'], ['b'], ['c'],
                                            ['a', 'b'], ['a', 'c'], ['b', 'c'],
                                            ['a', 'b', 'c'] ]
assert generatePowerSet(['a', 'b','c'], False) == [ [],
                                                    ['a'],
                                                    ['b'],
                                                    ['a', 'b'],
                                                    ['c'],
                                                    ['a', 'c'],
                                                    ['b', 'c'],
                                                    ['a', 'b', 'c'] ]
print(generatePowerSet(list(range(4)), True))
My colleague created an elegant way to do it in Ruby. It uses IVlad's concept of the index set.
class Array
  def select_by_index(&block)
    # selects array element by index property
    n = []
    each_with_index do |e, i|
      if block.call(i)
        n << e
      end
    end
    n
  end
end
def pow(a)
  # power set of a
  max = (1 << a.length)
  (0...max).map { |i| a.select_by_index { |k| (1 << k) & i != 0 }}
end
