Dynamic list sorting in Tcl - sorting

I want to do some sorting based on a dynamic list. Let me explain below.
I am using Tcl version 8.4, which I cannot change.
list1 = {{a b c} {c b c a} {b b a}} ..... 1st input data
list1 is a Tcl list with 3 members, each of which is a sublist whose contents come in any order, and it changes every time. For example, next time list1 will be:
list1 = {{c} {b c a} {b c} {a c a c}} ..... 2nd input data (for next time consideration)
Now I want to process them (with a loop, lsort, string compare, or any other Tcl command) so that the new Tcl list contains one member per sublist, chosen by priority, much as lsort has ascending/descending order.
Notice that in both cases the lengths of the individual sublists vary, and the positions of a, b, and c within them keep rotating.
In my case I want "a" to have the highest priority, then "b", and then "c" (a->b->c).
So output after processing done for 1st iteration should be :
$> puts $new_list1
$> {a a a} # as out of 3 sublists a is present in them and it gets highest priority.
Similarly, output after processing done on 2nd iteration should be :
$> puts $new_list1
$> {c a b a} # as you can see that list1 1st element is c so it gets output as is, second sublist has b c and a so `a` gets outputted, 3rd sublist is b and c so `b` gets outputted
Let me know what your thoughts are.
Thanks in advance !

First, I'd look into constructing that data structure so that you don't have to sort all the sublists at all: for example, use something as simple as a binary search to linsert each element into a sorted index per sublist.
Second, I'd think about whether you need as much "optimization" as you might think you do. Often, the best solution (due to maintainability) is the most obvious one: sort the sublists, then use a loop, like so:
# construct a new list of sorted sublists
set presorted_list {}
foreach sublist $list1 {
    lappend presorted_list [lsort $sublist]
}
# given a reference to a list of sorted lists, simultaneously build (1) a list of
# each sublist's first element and (2) a list of each sublist's remaining
# elements, so that the former can be returned, and the latter can be set by
# reference for the next iteration (empty sublists are skipped on the next pass)
proc process_once {presorted_list_ref} {
    upvar 1 $presorted_list_ref presorted_list
    # initialize both results so the proc also works once every sublist is empty
    set returning_list {}
    set remaining_list {}
    foreach sublist $presorted_list {
        if {[llength $sublist] > 0} {
            lappend returning_list [lindex $sublist 0]
            lappend remaining_list [lrange $sublist 1 end]
        }
    }
    set presorted_list $remaining_list
    return $returning_list
}
set iter_1 [process_once presorted_list]
set iter_2 [process_once presorted_list]
I don't think there is a better way to do this if you cannot pre-process or construct your original list so that it begins with sorted sublists. Unless the sublists start out sorted, you cannot decide which item in each sublist must be output without examining all of its items, so you might as well sort once and then always take the first item of each sublist, as coded above.
In loop form, if you don't specifically need to retrieve one iteration at a time:
set presorted_list {}
foreach sublist $list1 {
    lappend presorted_list [lsort $sublist]
}
while {[llength $presorted_list] > 0} {
    # start each pass with empty lists; otherwise results pile up and the loop never ends
    set returning_list {}
    set remaining_list {}
    foreach sublist $presorted_list {
        if {[llength $sublist] > 0} {
            lappend returning_list [lindex $sublist 0]
            lappend remaining_list [lrange $sublist 1 end]
        }
    }
    #
    # do stuff with $returning_list
    #
    set presorted_list $remaining_list
}
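For comparison, the selection the question asks for (the highest-priority member of each sublist, with a before b before c) can be sketched outside Tcl as well. The Python below is only an illustration under that assumption; the names PRIORITY and pick_highest are made up for this sketch and do not come from the thread.

```python
# Assumed ranking taken from the question: "a" beats "b" beats "c".
PRIORITY = {"a": 0, "b": 1, "c": 2}

def pick_highest(sublists):
    """Return the highest-priority member of each non-empty sublist."""
    return [min(sub, key=PRIORITY.__getitem__) for sub in sublists if sub]

# The two inputs from the question.
print(pick_highest([["a", "b", "c"], ["c", "b", "c", "a"], ["b", "b", "a"]]))
print(pick_highest([["c"], ["b", "c", "a"], ["b", "c"], ["a", "c", "a", "c"]]))
```

Taking the minimum needs only one pass per sublist, so no sorting is required when just the top item of each sublist is wanted.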

Related

How to sort faster when adding few elements to sorted list?

I have a sorted list of ~10'000 elements into which I insert a few elements (1-10) at a time between popping the first. Measurements show the sort takes several milliseconds (~5), presumably because lsort sorts from scratch every time. It now takes up the majority of frame time, so I need to do something about it.
Is there some trick to merge a large sorted list with a small sorted list more efficiently?
Code for explaining the context:
while {true} {
    set work [lindex $frontier 0]
    set frontier [lreplace $frontier 0 0]
    if {[done $work]} break;
    set more_work [do work]; # about 1-10 elements, distribution is generally hard to predict
    lappend frontier {*}$more_work
    set frontier [lsort $frontier]; # when frontier is 10'000 elements time to sort is ~5ms
}
Trying my best to implement a Tcl proc doing merge-like sort, will post findings. :-)
This proc reduces time elapsed from ~5ms to ~1.2ms:
proc merge_insert {sorted1 sorted2} {
    set res {}
    set prevloc 0
    foreach insert $sorted2 {
        # find location of next element to insert
        set nextloc [lsearch -bisect -integer -index 1 $sorted1 [lindex $insert 1]]
        # append up to next loc
        lappend res {*}[lrange $sorted1 $prevloc $nextloc] $insert
        # put read location just beyond the inserted element
        set prevloc [expr {$nextloc + 1}]
    }
    # append whatever tail is left
    lappend res {*}[lrange $sorted1 $prevloc end]
    return $res
}
The sort key is an integer stored as the second element of each item, hence the -integer -index 1 options and lindex $insert 1.
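The same merge idea can be sketched in Python with the standard bisect module, for readers who want to compare. This is an illustration rather than a port of the measured Tcl code: the function name merge_insert is reused only for readability, the items are assumed to be (name, key) tuples sorted on the integer key at position 1, and the small list is assumed to be sorted already.

```python
import bisect

def merge_insert(sorted1, sorted2):
    """Merge a small sorted list into a large sorted list without re-sorting."""
    keys = [item[1] for item in sorted1]   # sort keys of the large list
    res = []
    prev = 0
    for ins in sorted2:
        # first position after every element whose key is <= the new key
        loc = bisect.bisect_right(keys, ins[1], prev)
        res.extend(sorted1[prev:loc])      # copy the untouched run
        res.append(ins)                    # then the inserted element
        prev = loc
    res.extend(sorted1[prev:])             # append whatever tail is left
    return res

frontier = [("a", 1), ("b", 3), ("c", 5)]
more_work = [("x", 0), ("y", 3), ("z", 9)]
print(merge_insert(frontier, more_work))
```

Each insertion costs one binary search plus a slice copy, so the whole merge is linear in the size of the large list instead of a full sort.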

Blocks as various data structures in Rebol

I gather that in Rebol one is expected to use a block for representing arbitrary structured data. Are there built-in or standard ways of treating blocks as data structures other than lists?
I am thinking of:
stacks
queues (possibly double-ended)
sets
maps aka. associative arrays
Rebol has three holders of arbitrary data that can all be treated the same way.
block! implemented as an array, for fast index (integer) referencing
list! implemented as a linked list, for fast inserting and removing data
hash! implemented as a hash referenced list, for fast lookup of both data and key
You operate on them in the same way with
insert append index? find poke select ...
but they differ a little in result and particularly in response time.
In your case use
block! for a stack
list! for queues (I think)
hash! for associative arrays
As mentioned all operate similarly (even the hash! can be referenced by index). Hence you can treat any of them as an associative array.
>> x: [a one b two c 33]
== [a one b two c 33]
>> x/a
== one
>> x/c
== 33
>> select x 'b
== two
>> pick x 4
== two
which would give exactly the same results for a hash! defined as x: make hash! [a one b two c 33]. So to add a new key value pair:
>> x: make hash! [ a 1 b 2 c 33]
== make hash! [a 1 b 2 c 33]
>> append x [ key value ]
== make hash! [a 1 b 2 c 33 key value]
>> x/key
== value
>> select x 'key
== value
>> pick x 8
== value
Note that Rebol has no notion of key value pairs; the hash! is just an ordered list of values that internally builds hash values for lookup. You can therefore just as well ask what follows the value 33 above:
>> select x 33
== key
To really use it for key value pairs, use the skip refinement
>> select/skip x 33 2
== none
For the associative arrays you can also use object! in case it does not need to have dynamic fields.

Algorithm to return all combinations of k out of n as well as corresponding complement list

This is related to this question, but I also want, in an efficient way using Python or C++, the corresponding ordered complement list that contains the elements not chosen. For example, with an ordered list list=(0,1,1,2,3,3), when we choose 3 elements, one possible return pair should be (0,1,3) and (1,2,3). Additionally, I want the function to return a total of C(n,k) terms, so the pair (0,1,3) and (1,2,3) should appear 4 times in the returned result.
Full example of a short input: define a function foo(list,k), then foo([0,1,1],1) should return a list r of length C(3,1)=3,
r(0)=[[0],[1,1]] (choose 0, complement list is [1,1])
r(1)=[[1],[0,1]] (choose first 1, complement list is [0,1])
r(2)=[[1],[0,1]] (choose second 1, complement list is [0,1])
You can use a basic combinations-algorithm, with the exception that it returns a tuple: the elements that are "in" and those that are "out". Then just recursively generate combinations for the rest of the list and add the first element to either the "in" or the "out" list, respectively.
Here's some Python code:
def comb_and_comp(lst, n):
    # no combinations
    if len(lst) < n:
        return
    # trivial 'empty' combination
    if n == 0 or lst == []:
        yield [], lst
    else:
        first, rest = lst[0], lst[1:]
        # combinations that contain the first element
        for in_, out in comb_and_comp(rest, n - 1):
            yield [first] + in_, out
        # combinations that do not contain the first element
        for in_, out in comb_and_comp(rest, n):
            yield in_, [first] + out
This will create both the "in" and "out" lists in one go, instead of creating the complement in a second pass.
Ok then use the algorithm here to generate all combinations of k over n and then use this algorithm to get the complementary list
(assume that the combination is given in indices and not actual values)
for your example this would be:
foo([0,1,1],1) // i.e indices are [0,1,2] with values [0,1,1]
r(0)=[[0],[1,2]] (choose 0, complement list is [1,2])
r(1)=[[1],[0,2]] (choose first 1, complement list is [0,2])
r(2)=[[1],[0,1]] (choose second 1, complement list is [0,1])
The algorithm is (adapted from this post):
def complement(n, k, combination):
    # assume the combination values/indices are given in ascending order (i.e lexicographic)
    result = []
    i = 0
    j = 0
    while i < n:
        if j >= k or i < combination[j]:
            result.append(i)
        else:
            j += 1
        i += 1
    return result
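The index-based approach can also be sketched with the standard library, which may be enough when raw speed is not critical. The helper name comb_with_complement is an assumption made for this sketch: it lets itertools.combinations enumerate index tuples and then maps both the chosen indices and the remaining ones back to values.

```python
from itertools import combinations

def comb_with_complement(values, k):
    """Yield (chosen, complement) value lists for every k-subset of positions."""
    n = len(values)
    for chosen in combinations(range(n), k):
        chosen_set = set(chosen)
        picked = [values[i] for i in chosen]
        rest = [values[i] for i in range(n) if i not in chosen_set]
        yield picked, rest

for picked, rest in comb_with_complement([0, 1, 1], 1):
    print(picked, rest)
```

Because the combinations are taken over positions rather than values, duplicates in the input produce repeated pairs, which gives the C(n,k) results the question asks for.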

Two indexes in Ruby for loop

Can you have a Ruby for loop that has two indexes?
i.e.:
for i,j in 0..100
do something
end
Can't find anything in google
EDIT: Adding in more details
I need to compare two different arrays like such
Index: Array1: Array2:
0 a a
1 a b
2 a b
3 a b
4 b b
5 c b
6 d b
7 d b
8 e c
9 e d
10 e d
11 e
12 e
But knowing that they both have the same items (abcde)
This is my logic in pseudo, lets assume this whole thing is inside a loop
# these two if statements handle the end-of-array cases
If Array1[index_a1] == nil
Errors += Array1[index_a1-1]
break
If Array2[index_a2] == nil
Errors += Array2[index_a2-1]
break
# this handles a mismatch
If Array1[index_a1] != Array2[index_a2]
Errors += Array1[index_a1-1] #of course, first entry of array will always be same
if Array1[index_a1] != Array1[index_a1 - 1]
index_a2++ until Array1[index_a1] == Array2[index_a2]
index_a2 -=1 (these two lines are for the loop's sake in next iteration)
index_a1 -=1
if Array2[index_a2] != Array2[index_a2 - 1]
index_a1++ until Array1[index_a1] == Array2[index_a2]
index_a2 -=1 (these two lines are for the loop's sake in next iteration)
index_a1 -=1
In a nutshell, in the example above,
Errors looks like this
a,b,e
As c and d are good.
You could iterate over two arrays using Enumerators instead of numerical indices. This example iterates over a1 and a2 simultaneously, echoing the first word in a2 that starts with the corresponding letter in a1, skipping duplicates in a2:
a1 = ["a", "b", "c", "d"]
a2 = ["apple", "angst", "banana", "clipper", "crazy", "dizzy"]
e2 = a2.each
a1.each do |letter|
  puts e2.next
  e2.next while e2.peek.start_with?(letter) rescue nil
end
(It assumes all letters in a1 have at least one word in a2 and that both are sorted -- but you get the idea.)
The for loop is not the best way to approach iterating over an array in Ruby. With the clarification of your question, I think you have a few possible strategies.
You have two arrays, a and b.
If both arrays are the same length:
a.each_index do |index|
  if a[index] == b[index]
    do something
  else
    do something else
  end
end
This also works if a is shorter than b.
If you don't know which one is shorter, you could write something like:
controlArray = a.length < b.length ? a : b to assign the controlArray, then use controlArray.each_index. Or you could use (0...[a.length, b.length].min).each{|index| ...} to accomplish the same thing (note the three-dot, exclusive range, so the index stops at the last element of the shorter array).
Looking over your edit to your question, I think I can rephrase it like this: given an array with duplicates, how can I obtain a count of each item in each array and compare the counts? In your case, I think the easiest way to do that would be like this:
a = [:a,:a,:a,:b,:b,:c,:c,:d,:e,:e,:e]
b = [:a,:a,:b,:b,:b,:c,:c,:c,:d,:e,:e,:e]
not_alike = []
a.uniq.each{|value| not_alike << value if a.count(value) != b.count(value)}
not_alike
Running that code gives me [:a,:b,:c].
If it is possible that a does not contain every symbol, then you will need an array that contains just the symbols and use that instead of a.uniq; another "and" clause in the conditional could deal with nil or 0 counts.
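The count comparison, including the case where an item appears in only one of the two arrays, can be sketched as follows. This is written in Python purely for illustration; the function name differing_items is hypothetical, and the sample data mirrors the a and b arrays above with strings standing in for Ruby symbols.

```python
from collections import Counter

def differing_items(a, b):
    """List items whose occurrence counts differ between a and b."""
    count_a, count_b = Counter(a), Counter(b)
    # dict.fromkeys keeps first-seen order and covers items found in either list
    seen = dict.fromkeys(list(a) + list(b))
    # a Counter returns 0 for missing items, so no nil handling is needed
    return [item for item in seen if count_a[item] != count_b[item]]

a = list("aaabbccdeee")
b = list("aabbbcccdeee")
print(differing_items(a, b))
```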
The two arrays are practically the same except for a few elements that I have to skip in either/or every once in a while
Instead of skipping during iterating, could you pre-select the non-skippable ones?
a.select{ ... }.zip( b.select{ ... } ).each do |a1,b1|
  # a1 is an entry from a's subset
  # b1 is the paired entry from b's subset
end

How do I efficiently (mem/time) modify all elements of a list in Tcl?

To operate on each element of a list, returning a modified list, various languages have explicit constructs.
In Perl there's map:
perl -e 'my @a = (1..4); print join(q( ), map { $_ * $_ } @a)'
1 4 9 16
In Python there're list comprehensions:
>>> a = (1,2,3,4)
>>> [el*el for el in a]
[1, 4, 9, 16]
What's the most efficient way to do this in Tcl?
I can come up with the usual foreach loop.
set l {}
foreach i {1 2 3 4} {
lappend l [expr $i * $i]
}
puts $l
1 4 9 16
Is this the fastest way?
Regarding mem efficiency this builds up a second list, one by one. If I don't need the list permanently is there a more efficient way?
And, finally, is there something that's shorter?
I couldn't find any information here or on http://wiki.tcl.tk
Answer:
As Donal Fellows has answered, most importantly for speed tests, things should be wrapped in a proc {} since Tcl then can optimize. For Tcl, a "map" function is discussed as a future enhancement. With this hint and further searching I found http://wiki.tcl.tk/12848
The most efficient method is this:
set idx 0
foreach item $theList {
    lset theList $idx [expr {$item * $item}]
    incr idx
}
If the list is short (e.g., a few hundred elements) the cost of allocating a new list is minimal though, so you can use this (simpler) version instead:
foreach item $theList {
    lappend newList [expr {$item * $item}]
}
Note that the foreach command is only fast if placed in a procedure (or lambda expression or method) and expressions are only fast if placed in {braces}. Also, don't speculate, measure: take care to use the time command to find out how fast your code really is.
Well, there is something shorter (using the tcllib struct::list package), but not necessarily faster.
package require struct::list
puts [struct::list mapfor x $data { expr {$x * $x} }]

Resources