Sort an array of strings by length in ColdFusion? - sorting

How would you sort an array of strings by length in ColdFusion?
In PHP, one can use usort as demonstrated here: PHP: Sort an array by the length of its values?
Does ArraySort() in CF10 support passing in a comparator function like usort?

The comparator passed to arraySort must return a number, not a boolean, so a comparator that returns len(a) < len(b) is wrong. Here is the correct way to use arraySort to sort by string length:
<cfscript>
data = [ "bb", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) - len(b);
});
</cfscript>
The comparator for this function should return a number either < 0 (less than), 0 (equal) or > 0 (greater than), not a boolean. Also see the arraySort docs.
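For readers coming from other languages, the same three-way comparator contract can be sketched in Python. This is only an illustration of the contract, not ColdFusion code; `by_length` is a name made up for the example, and `functools.cmp_to_key` is the standard-library adapter that turns such a comparator into a sort key.

```python
from functools import cmp_to_key

def by_length(a, b):
    # Negative: a sorts first. Zero: tie. Positive: b sorts first.
    return len(a) - len(b)

data = ["bb", "a", "dddd", "ccc"]
data.sort(key=cmp_to_key(by_length))
print(data)  # ['a', 'bb', 'ccc', 'dddd']
```

A comparator that returned True/False instead would only ever produce 1 or 0, so it could never say "a comes first", which is the error described above.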

I guess this is not going to be the most flexible or even the most efficient solution, but I was interested in the shortest version that uses built-in CFML sorting. Without comments it's just 13 lines of code :)
source = ["bb", "a", "ffff", "ccc", "dd", 22, 0];
lengths = {};
result = [];
// cache lengths of the values with index as key
for (i=1; i LTE ArrayLen(source); i++) {
lengths[i] = Len(source[i]);
}
// sort the values using 'numeric' type
sorted = StructSort(lengths, "numeric", "asc");
// populate results using sorted cache indexes
for (v in sorted) {
ArrayAppend(result, source[v]);
}
Result is ["a",0,"bb",22,"dd","ccc","ffff"]
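The same "cache the lengths by index, sort the indexes, rebuild the array" idea can be sketched in Python. This is a rough translation rather than CFML, and the variable names are just carried over from the snippet above. One assumption to flag: Python's `sorted` is stable, so values of equal length keep their original relative order, which may differ from the tie order StructSort produces.

```python
source = ["bb", "a", "ffff", "ccc", "dd", 22, 0]

# Cache the length of each value, keyed by its index in the source list.
lengths = {i: len(str(value)) for i, value in enumerate(source)}

# Sort the indexes by their cached length (stable for equal lengths).
order = sorted(lengths, key=lengths.get)

# Rebuild the list by walking the sorted indexes.
result = [source[i] for i in order]
print(result)  # ['a', 0, 'bb', 'dd', 22, 'ccc', 'ffff']
```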

You can use a quick sort algorithm along with your own custom comparator, similar to how Java's comparators work.
You can find a quickSort UDF here: http://cflib.org/udf/quickSort.
You'll need to define your own comparator to tell the function how it should do the sorting. Below is a working example. Note that you'll need to include the UDF in your page so that the quickSort function is available.
strings = ["bb", "a", "ccc"];
WriteOutput(ArrayToList(quickSort(strings, ascStringLenCompare)));
//outputs a,bb,ccc
WriteOutput(ArrayToList(quickSort(strings, descStringLenCompare)));
//outputs ccc,bb,a
//Ascending comparator
Numeric function ascStringLenCompare(required String s1, required String s2)
{
if (Len(s1) < Len(s2)){
return -1;
}else if (Len(s1) > Len(s2)) {
return 1;
}else{
return 0;
}
}
//Descending comparator
Numeric function descStringLenCompare(required String s1, required String s2)
{
if (Len(s1) < Len(s2)){
return 1;
}else if (Len(s1) > Len(s2)) {
return -1;
} else {
return 0;
}
}
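To make the role of the comparator concrete, here is a minimal quicksort sketch in Python that accepts a comparator the same way. It is not the cflib UDF (whose internals I haven't reproduced), and `quick_sort`, `asc_len` and `desc_len` are names invented for this illustration.

```python
def quick_sort(items, compare):
    """Return a new sorted list; compare(a, b) < 0 means a goes before b."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    before = [x for x in rest if compare(x, pivot) < 0]
    after = [x for x in rest if compare(x, pivot) >= 0]
    return quick_sort(before, compare) + [pivot] + quick_sort(after, compare)

def asc_len(s1, s2):
    # Shorter strings first.
    return len(s1) - len(s2)

def desc_len(s1, s2):
    # Longer strings first.
    return len(s2) - len(s1)

strings = ["bb", "a", "ccc"]
print(quick_sort(strings, asc_len))   # ['a', 'bb', 'ccc']
print(quick_sort(strings, desc_len))  # ['ccc', 'bb', 'a']
```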

In ColdFusion 10 or Railo 4, you can use the Underscore.cfc library to write this in an elegant and simple way:
_ = new Underscore(); // instantiate the library
// define an array of strings
arrayOfStrings = ['ccc', 'a', 'dddd', 'bb'];
// perform sort
sortedArray = _.sortBy(arrayOfStrings, function (string) {
return len(string);
});
// sortedArray: ['a','bb','ccc','dddd']
The iterator function is called for each value in the array, and that value is passed in as the first argument. The function should return the value that you wish to sort on. In this case, we return len(string). _.sortBy always sorts in ascending order.
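For comparison, the "return the value you want to sort on" style maps directly onto a key function in Python. The sketch below is not Underscore.cfc; `sort_by` is a hypothetical helper written only to mirror the shape of the call above.

```python
def sort_by(values, iterator):
    """Return a new list sorted ascending by iterator(value)."""
    return sorted(values, key=iterator)

array_of_strings = ["ccc", "a", "dddd", "bb"]
sorted_array = sort_by(array_of_strings, lambda string: len(string))
print(sorted_array)  # ['a', 'bb', 'ccc', 'dddd']
```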
(Disclaimer: I wrote Underscore.cfc)

In CF10 you can indeed use a closure with ArraySort().
eg1. sort by length alone.
<cfscript>
data = [ "bb", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) - len(b);
});
</cfscript>
data == [ "a", "bb", "ccc", "dddd" ]
eg2. sort by length and alphabetically when same length.
<cfscript>
data = [ "b", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) == len(b) ? compare( a, b ) : len(a) - len(b);
});
</cfscript>
data == [ "a", "b", "ccc", "dddd" ]
eg3. same, only reverse the order.
<cfscript>
data = [ "b", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) == len(b) ? compare( b, a ) : len(b) - len(a);
});
</cfscript>
data == [ "dddd", "ccc", "b", "a" ]
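The "length first, then alphabetical" ordering is a two-level sort key. As a cross-language sketch (Python, not CFML), a tuple key expresses the same tie-breaking without writing a comparator at all; the variable names are only illustrative.

```python
data = ["b", "a", "dddd", "ccc"]

# Tuples compare element by element: length first, then the string itself.
ascending = sorted(data, key=lambda s: (len(s), s))
descending = sorted(data, key=lambda s: (len(s), s), reverse=True)

print(ascending)   # ['a', 'b', 'ccc', 'dddd']
print(descending)  # ['dddd', 'ccc', 'b', 'a']
```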

Related

Bencode parser using stack

I am trying to use a stack-based approach to parse a bencoded string.
This link describes bencoding: https://www.bittorrent.org/beps/bep_0003.html
My pseudocode fails to handle nested lists: for example, [1, [2]] and [[1, 2]] both come back as [[1, 2]], even though the encodings are clearly different ("li1eli2eee" versus "lli1ei2eee").
Here is my pseudocode thus far
input: string
output: map/list/integer/string in a bencoded data structure
first, tokenize the string into valid tokens
Valid tokens "d, l, [words], [numbers], e, s (virtual token)"
Strings are tokenized as 4:spam becomes "s spam e" with s being a virtual token
Eg. li1ei22el4:spami2eee becomes [l i 1 e i 22 e l s spam e i 2 e e e]
Parsing:
make two stacks:
    stack1
    stack2
for token in tokens:
    if stack is empty
        return error
    if the token isn’t an “e”
        push token onto stack1
while the stack isn’t empty:
    elem = pop off the stack
    if elem is “i”
        elem2 = pop elem off stack2 and check if it can be converted to an int
        if not
            return error
        push elem2 onto stack2 again
    elif elem is “d”
        make a new dict
        while stack2 isn’t empty:
            key = pop off stack2
            if stack2 is empty:
                return error (because then we have an odd key value encoding)
            value = pop off stack2
            dict[key] = value
        push dict onto stack2
    elif elem is “l”
        make a new list
        while stack2 isn’t empty:
            append pop off stack2 to l
        push l onto stack2
    elif elem is “s”
        dont need to do anything :P
    else
        push elem onto stack2
if stack2 isn’t empty:
    ret = pop the lone element off stack2
if stack2 isn’t empty:
    return error
return ret
I don't quite follow the spec or the pseudocode, but it seems pretty straightforward to implement a subset of "Bencoding" to handle the two strings you've shown (lists and integers). Everything else is relatively trivial (dicts are the same as lists, more or less, and strings and other non-recursively defined data types are basically the same as ints) as far as I can tell.
My algorithm is as follows:
Make a stack and put an empty array into it.
For each index in the bencoded string:
If the current character is i, parse the integer and fast-forward the index to the e that closes the integer. Append the integer to the array at the top of the stack.
If the current character is l, push a new array onto the stack.
If the current character is e, pop the stack and push the popped array onto the array below it (i.e. the new top).
Return the only element in the stack.
Here it is in JS:
const tinyBencodeDecode = s => {
const stack = [[]];
for (let i = 0; i < s.length; i++) {
if (s[i] === "i") {
for (var j = ++i; s[j] !== "e"; j++);
stack[stack.length-1].push(+s.slice(i, j));
i = j;
}
else if (s[i] === "l") {
stack.push([]);
}
else if (s[i] === "e") {
stack[stack.length-2].push(stack.pop());
}
}
return stack[0];
};
[
"i1ei2e", // => [1, 2]
"lli1ei2eee", // => [[[1, 2]]]
"li1eli2eee", // => [[1, [2]]]
// [44, [1, [23, 561, [], 1, [78]]], 4]
"i44eli1eli23ei561elei1eli78eeeei4e",
].forEach(e => console.log(JSON.stringify(tinyBencodeDecode(e))));
No error handling is performed and everything is assumed to be well-formed, but error handling doesn't impact the fundamental algorithm; it's just a matter of adding a bunch of conditionals to check the index, stack and string as you work.
Here's an (admittedly lazy) example of how you could support the 4 datatypes. Again, error handling is omitted. The idea is basically the same as above except more fussing is needed to determine whether we're building a dictionary or a list. Since null doesn't appear to be a valid key per the spec, I'm using it as a placeholder to pair up value tokens with their corresponding key.
In both cases, minor adjustments will need to be made if it turns out that bencoding only has a single root element (list or dictionary). In that case, s = "i42ei43e" would be invalid on the top level and we'd start with an empty stack.
const back = (a, n=1) => a[a.length-n];
const append = (stack, data) => {
if (Array.isArray(back(stack))) {
back(stack).push(data);
}
else {
const emptyKey = Object.entries(back(stack))
.find(([k, v]) => v === null);
if (emptyKey) {
back(stack)[emptyKey[0]] = data;
}
else {
back(stack)[data] = null;
}
}
};
const bencodeDecode = s => {
const stack = [[]];
for (let i = 0; i < s.length; i++) {
if (s[i] === "i") {
for (var j = ++i; s[j] !== "e"; j++);
append(stack, +s.slice(i, j));
i = j;
}
else if (/\d/.test(s[i])) {
for (var j = i; s[j] !== ":"; j++);
const num = +s.slice(i, j++);
append(stack, s.slice(j, j + num));
i = j + num - 1;
}
else if (s[i] === "l") {
stack.push([]);
}
else if (s[i] === "d") {
stack.push({});
}
else if (s[i] === "e") {
append(stack, stack.pop());
}
}
return stack[0];
};
[
"i1ei2e", // => [1, 2]
"lli1ei2eee", // => [[[1, 2]]]
"li1eli2eee", // => [[1, [2]]]
"li1e4:spamli2eee", // => [[1, "spam", [2]]]
// [[1, "spam", {"cow": "moo", "spam": {"eggs": [6, "rice"]}}, [2]]]
"li1e4:spamd3:cow3:moo4:spamd4:eggsli6e4:riceeeeli2eee",
// [44, [1, [23, 561, [], 1, [78]]], 4]
"i44eli1eli23ei561elei1eli78eeeei4e",
].forEach(e => console.log(JSON.stringify(bencodeDecode(e))));
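As a rough sketch of the adjustments mentioned above (a single root element, plus basic error checks), here is the same stack idea in Python. Everything here is an assumption layered on the answer rather than a reference implementation: `bdecode` and `_NO_KEY` are made-up names, and the input is treated as a text string, whereas real bencoded data is bytes with byte-length string prefixes.

```python
_NO_KEY = object()  # sentinel: this frame is not holding a pending dict key

def bdecode(s):
    """Decode exactly one bencoded value from the text string s."""
    stack = []  # open containers, each stored as [container, pending_key]
    root = []   # receives the single finished top-level value

    def emit(value):
        # Attach a finished value to the innermost open container.
        if not stack:
            if root:
                raise ValueError("more than one root element")
            root.append(value)
            return
        frame = stack[-1]
        if isinstance(frame[0], list):
            frame[0].append(value)
        elif frame[1] is _NO_KEY:
            if not isinstance(value, str):
                raise ValueError("dictionary keys must be strings")
            frame[1] = value            # remember the key, wait for its value
        else:
            frame[0][frame[1]] = value  # pair the value with the pending key
            frame[1] = _NO_KEY

    i = 0
    while i < len(s):
        c = s[i]
        if c == "i":
            end = s.index("e", i)       # raises ValueError if unterminated
            emit(int(s[i + 1:end]))
            i = end + 1
        elif c.isdigit():
            colon = s.index(":", i)
            start = colon + 1
            stop = start + int(s[i:colon])
            if stop > len(s):
                raise ValueError("string runs past end of input")
            emit(s[start:stop])
            i = stop
        elif c == "l" or c == "d":
            stack.append([[] if c == "l" else {}, _NO_KEY])
            i += 1
        elif c == "e":
            if not stack:
                raise ValueError("unexpected 'e'")
            container, pending = stack.pop()
            if pending is not _NO_KEY:
                raise ValueError("dictionary key without a value")
            emit(container)
            i += 1
        else:
            raise ValueError(f"unexpected character {c!r} at index {i}")
    if stack or not root:
        raise ValueError("unterminated or empty input")
    return root[0]

print(bdecode("lli1ei2eee"))  # [[1, 2]]
print(bdecode("li1eli2eee"))  # [1, [2]]
```

Because there is no implicit root array here, the two encodings from the question decode to exactly the two different structures they describe, and an input such as "i1ei2e" is rejected.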

Tree Level-Order Traversal of Elements in a Vector

I am looking for an algorithm to take a list of x values and loop through them starting in the middle then the middle of the left then the middle of the right, then the middle of the middle of the left...like a tree.
I don't think recursion will work because it will traverse down one side completely before getting to the other side. I need to parse through evenly.
Pretend this is a list of 50 numbers:
.................................................. (50)
Need to find the 25th element first
........................1......................... (lvl1)
Then the 12th, then 38th
...........2.........................3............ (lvl2)
Then the 6,18 31,44
.....4...........5.............6...........7...... (lvl3)
Then the 3,9,15,21 28,34,41,48
..8.....9.....a......b.....c.......d.....e.....f.. (lvl4)
etc... until all the values have been traversed. So by the time lvl4 is hit, I've seen 1,2,3,4,5,6,7,8,9,a,b,c,d,e,f in that order.
All my attempts to do this iteratively have flopped.
Efficiency is not critical as it won't be run often.
Hopefully my question is clear. Thank you.
You can solve this via a queue data structure and some math.
Start by pushing in the tuple (0, 25, 49). This indicates that this is a node at position 25, splitting the range 0-49. So the queue should look like this:
[(0, 25, 49)]
Now at each point, remove the front of the queue, print the element at the index, and push in the descendants. So, for example, when you pop (0, 25, 49), how to track the descendants? The left descendant is the middle of the range 0-24, so you would push in (0, 12, 24). The right descendant is the middle of the range 26-49, so you would push in (26, 38, 49). So the queue should look like this:
[(0, 12, 24), (26, 38, 49)].
Et cetera.
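A minimal Python sketch of this queue approach follows. `level_order_midpoints` is a name invented for the example; it stores only the (low, high) range in the queue and computes the midpoint when the range is popped, rounding up so that the first three indexes match the 25, 12, 38 used above.

```python
from collections import deque

def level_order_midpoints(n):
    """Yield every index in 0..n-1, middle first, level by level."""
    if n <= 0:
        return
    queue = deque([(0, n - 1)])
    while queue:
        lo, hi = queue.popleft()
        mid = (lo + hi + 1) // 2      # upper middle: (0, 49) -> 25
        yield mid
        if lo <= mid - 1:
            queue.append((lo, mid - 1))   # left half, visited later
        if mid + 1 <= hi:
            queue.append((mid + 1, hi))   # right half, visited later

print(list(level_order_midpoints(7)))  # [3, 1, 5, 0, 2, 4, 6]
```

To walk a list in this order, index into it with each yielded value, for example `[values[i] for i in level_order_midpoints(len(values))]`.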
(The solution that follows is written in Swift, but I hope you can follow it and translate to your favourite language of choice, in case you wish to make use of it)
We can quite easily come up with a solution that works in the special case where the number of array values describes a full (proper) binary tree, i.e., if numElements = 2^lvl - 1, where lvl is the number of levels in your tree. See function printFullBinaryTree(...) below.
We can also, fairly easily, expand any array into one that describes a full binary tree; see expandToFullBinary.
By combining these two methods, we have a general method for input arrays of any size.
Expand any array into one that describes a full binary tree:
/* given 'arr', returns array expanded to full binary tree (if necessary) */
func expandToFullBinary(arr: [String], expandByCharacter: String = "*") -> [String] {
let binLength = Int(pow(2.0,Double(Int(log2(Double(arr.count)))+1)))-1
if arr.count == binLength {
return arr
}
else {
let diffLength = binLength - arr.count
var arrExpanded = [String](count: binLength, repeatedValue: expandByCharacter)
var j = 0
for i in 0 ..< arr.count {
if i < (arr.count - diffLength) {
arrExpanded[i] = arr[i]
}
else {
arrExpanded[i+j] = arr[i]
j = j+1
}
}
return arrExpanded
}
}
Print array (that describes a full binary tree) as a binary tree according to your question specifications:
/* assumes 'arr' describes a full binary tree */
func printFullBinaryTree(arr: [String]) {
var posVectorA : [Int] = [arr.count/2]
var posVectorB : [Int]
var splitSize : Int = arr.count/2
var elemCount = 0
if arr.count < 2 {
print("\(arr.first ?? "")")
}
else {
while elemCount < arr.count {
posVectorB = []
splitSize = splitSize/2
for i in posVectorA {
if elemCount == arr.count {
print("noo")
break
}
print(arr[i], terminator: " ")
elemCount = elemCount + 1
posVectorB.append(i-splitSize-1)
posVectorB.append(i+splitSize+1)
}
print("")
posVectorA = posVectorB
}
}
}
Example for a vector describing a full binary tree as well as one describing a non-full binary tree:
/* Example */
var arrFullBinary : [String] = ["8", "4", "9", "2", "a", "5", "b", "1", "c", "6", "d", "3", "e", "7", "f"]
var arrNonFullBinary : [String] = ["g", "8", "h", "4", "i", "9", "j", "2", "a", "5", "b", "1", "c", "6", "d", "3", "e", "7", "f"]
printFullBinaryTree(expandToFullBinary(arrFullBinary, expandByCharacter: ""))
/* 1
2 3
4 5 6 7
8 9 a b c d e f */
printFullBinaryTree(expandToFullBinary(arrNonFullBinary, expandByCharacter: ""))
/* 1
2 3
4 5 6 7
8 9 a b c d e f
g h i j */

Lua - Sort table and randomize ties

I have a table with two values, one is a name (string and unique) and the other is a number value (in this case hearts). What I want is this: sort the table by hearts but scramble randomly the items when there is a tie (e.g. hearts is equal). By a standard sorting function, in case of ties the order is always the same and I need it to be different every time the sorting function works.
This is an example:
tbl = {{name = "a", hearts = 5}, {name = "b", hearts = 2}, {name = "c", hearts = 6}, {name = "d", hearts = 2}, {name = "e", hearts = 2}, {name = "f", hearts = 7}}
sort1 = function (a, b) return a.hearts > b.hearts end
sort2 = function (a, b)
if a.hearts ~= b.hearts then return a.hearts > b.hearts
else return a.name > b.name end
end
table.sort(tbl, sort2)
local s = ""
for i = 1, #tbl do
s = s .. tbl[i].name .. "(" .. tbl[i].hearts .. ") "
end
print(s)
Now, with the function sort2 I think I have mostly pinned down the problem: what happens when a.hearts == b.hearts? In my code the ties are simply ordered by name, which is not what I want. I have three ideas:
1. First scramble randomly all the items in the table, then apply sort1.
2. Add a value to every element of the table, called rnd, that is a random number. Then in sort2, when a.hearts == b.hearts order the items by a.rnd > b.rnd.
3. In sort2, when a.hearts == b.hearts generate randomly true or false and return it. It doesn't work, and I understand that this happens because the random true/false makes the order function crash since there could be inconsistencies.
I don't like 1 (because I would like to do everything inside the sorting function) or 2 (since it requires adding a value); I would like to do something like 3, but working. The question is: is there a simple way to do this, and what is the optimal way? (Maybe method 1 or 2 is optimal and I just don't see it.)
Bonus question. Moreover, I need to fix an item and sort the others. For example, suppose we want "c" to be first. Is it good to make a separate table with only the items to sort, sort the table and then add the fixed items?
-- example table
local tbl = {
{ name = "a", hearts = 5 },
{ name = "b", hearts = 2 },
{ name = "c", hearts = 6 },
{ name = "d", hearts = 2 },
{ name = "e", hearts = 2 },
{ name = "f", hearts = 7 },
}
-- avoid same results on subsequent requests
math.randomseed( os.time() )
---
-- Randomly sort a table
--
-- @param tbl Table to be sorted
-- @param corrections Table with your corrections
--
function rnd_sort( tbl, corrections )
local rnd = corrections or {}
table.sort( tbl,
function ( a, b)
rnd[a.name] = rnd[a.name] or math.random()
rnd[b.name] = rnd[b.name] or math.random()
return a.hearts + rnd[a.name] > b.hearts + rnd[b.name]
end )
end
---
-- Show the values of our table for debug purposes
--
function show( tbl )
local s = ""
for i = 1, #tbl do
s = s .. tbl[i].name .. "(" .. tbl[i].hearts .. ") "
end
print(s)
end
for i = 1, 10 do
rnd_sort(tbl)
show(tbl)
end
rnd_sort( tbl, {c=1000000} ) -- now "c" will be the first
show(tbl)
Here's a quick function for shuffling (scrambling) numerically indexed tables:
function shuffle(tbl) -- shuffles numeric indices
local len, random = #tbl, math.random ;
for i = len, 2, -1 do
local j = random( 1, i );
tbl[i], tbl[j] = tbl[j], tbl[i];
end
return tbl;
end
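To show how a shuffle combines with a sort (idea 1 from the question), here is a sketch in Python rather than Lua. It leans on an assumption worth stating: Python's `sorted` is stable, so items with equal keys keep their shuffled order, whereas Lua's `table.sort` makes no stability guarantee. The `sort_with_random_ties` function and its `pinned` parameter are hypothetical names; `pinned` covers the bonus question by forcing the named items to the front.

```python
import random

players = [
    {"name": "a", "hearts": 5}, {"name": "b", "hearts": 2},
    {"name": "c", "hearts": 6}, {"name": "d", "hearts": 2},
    {"name": "e", "hearts": 2}, {"name": "f", "hearts": 7},
]

def sort_with_random_ties(items, pinned=()):
    """Return a new list: pinned names first, then hearts descending, ties random."""
    shuffled = list(items)
    random.shuffle(shuffled)  # randomise the relative order of everything
    # Stable sort: equal keys keep the shuffled (random) relative order.
    # False sorts before True, so pinned items come first.
    return sorted(shuffled, key=lambda it: (it["name"] not in pinned, -it["hearts"]))

for _ in range(3):
    print(" ".join(f'{p["name"]}({p["hearts"]})' for p in sort_with_random_ties(players)))

print([p["name"] for p in sort_with_random_ties(players, pinned={"c"})][0])  # c
```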
If you are free to introduce a new dependency, you can use lazylualinq to do the job for you (or check out how it sorts sequences, if you do not need the rest):
local from = require("linq")
math.randomseed(os.time())
tbl = {{name = "a", hearts = 5}, {name = "b", hearts = 2}, {name = "c", hearts = 6}, {name = "d", hearts = 2}, {name = "e", hearts = 2}, {name = "f", hearts = 7}}
from(tbl)
:orderBy("x => x.hearts")
:thenBy("x => math.random(-1, 1)")
:foreach(function(_, x) print(x.name, x.hearts) end)

Fastest way to get values from 2d array

I have a 2D array similar to this:
string[,] arr = {
{ "A", "A", "A", "A", "A", "A", "A", "D", "D", "D", "D", "D", "D", "D", "D" },
{ "1", "1", "1", "1", "1", "1", "1", "0", "0", "0", "0", "0", "0", "0", "0" },
{ "2", "2", "2", "2", "2", "2", "2", "00", "00", "00", "00", "00", "00", "00", "00" }
};
I am trying to get following result from above array:
A 1 2
A 1 2
A 1 2
A 1 2
A 1 2
A 1 2
Get all the "A" entries from the first row (index 0) of the array, then get their corresponding values from the other rows.
This is a big 2D array with over 6k values, but the design is exactly the same as described above. I have tried 2 ways so far:
1st method: using a for loop to go through all the values:
var myList = new List<string>();
var arrLength = arr.GetLength(1);
for (var i = 0; i < arrLength; i++)
{
if (arr[0, i].Equals("A"))
myList.Add(arr[0, i]);
}
2nd method: creating a list and then going through all values:
var myList = new List<string>();
var list = Enumerable.Range(0, arr.GetLength(1))
.Select(i => arr[0, i])
.ToList();
var index = Enumerable.Range(0, arr.GetLength(1))
.Where(i => arr[0, i].Contains("A"))
.ToArray();
var sI = index[0];
var eI = index[index.Length - 1];
// GetRange takes a count, so include the last matching index
myList.AddRange(list.GetRange(sI, eI - sI + 1));
They both seem to be slow, not efficient enough. Is there any better way of doing this?
I like to approach these kinds of algorithms in a way that my code ends up being self-documenting. Usually, describing the algorithm with your code, and not bloating it with code features, tends to produce pretty good results.
var matchingValues =
from index in Enumerable.Range(0, arr.GetLength(1))
where arr[0, index] == "A"
select Tuple.Create(arr[1, index], arr[2, index]);
Which corresponds to:
// find the tuples produced by
// mapping along one length of an array with an index
// filtering those items whose 0th item on the indexed dimension is "A"
// reducing index into the non-0th elements on the indexed dimension
This should parallelize extremely well, as long as you keep to the simple "map, filter, reduce" paradigm and refrain from introducing side-effects.
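The same "map along the index, filter on row 0, project the other rows" shape can be sketched outside LINQ as well. The Python below is only an illustration of the paradigm, using a nested list to stand in for the 2D array from the question; the names are made up for the example.

```python
# Stand-in for the 2D array: row 0 holds the tags, the other rows hold values.
arr = [
    ["A"] * 7 + ["D"] * 8,
    ["1"] * 7 + ["0"] * 8,
    ["2"] * 7 + ["00"] * 8,
]

# For every column whose row-0 entry is "A", collect the values from rows 1..n.
matching_values = [
    tuple(row[i] for row in arr[1:])
    for i, tag in enumerate(arr[0])
    if tag == "A"
]

print(len(matching_values))  # 7
print(matching_values[0])    # ('1', '2')
```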
Edit:
In order to return an arbitrary collection of the columns associated with an "A", you can:
var targetValues = new int[] { 1, 2, 4, 10 };
var matchingValues =
from index in Enumerable.Range(0, arr.GetLength(1))
where arr[0, index] == "A"
select targetValues.Select(x => arr[x, index]).ToArray();
To make it a complete collection, simply use:
var targetValues = Enumerable.Range(1, arr.GetLength(0) - 1).ToArray();
As "usr" said: back to the basics if you want raw performance. Also taking into account that the "A" values can start at an index > 0:
var startRow = -1; // "row" in the new array.
var endRow = -1;
var match = "D";
for (int i = 0; i < arr.GetLength(1); i++)
{
if (startRow == -1 && arr[0,i] == match) startRow = i;
if (startRow > -1 && arr[0,i] == match) endRow = i + 1;
}
var columns = arr.GetLength(0);
var transp = new String[endRow - startRow,columns]; // transposed array
for (int i = startRow; i < endRow; i++)
{
for (int j = 0; j < columns; j++)
{
transp[i - startRow,j] = arr[j,i];
}
}
Initializing the new array first (and then setting the cell values) is the main performance boost.

Linq intersect with sum

I have two collections that I want to intersect, and perform a sum operation on matching elements.
For example the collections are (in pseudo code):
col1 = { {"A", 5}, {"B", 3}, {"C", 2} }
col2 = { {"B", 1}, {"C", 8}, {"D", 6} }
and the desired result is:
intersection = { {"B", 4}, {"C", 10} }
I know how to use an IEqualityComparer to match the elements on their name, but how to sum the values while doing the intersection?
EDIT:
Neither of the starting collections contains two items with the same name.
Let's say your input data looks like this:
IEnumerable<Tuple<string, int>> firstSequence = ..., secondSequence = ...;
If the strings are unique in each sequence (i.e there can be no more than a single {"A", XXX} in either sequence) you can join like this:
var query = from tuple1 in firstSequence
join tuple2 in secondSequence on tuple1.Item1 equals tuple2.Item1
select Tuple.Create(tuple1.Item1, tuple1.Item2 + tuple2.Item2);
You might also want to consider using a group by, which would be more appropriate if this uniqueness doesn't hold:
var query = from tuple in firstSequence.Concat(secondSequence)
group tuple.Item2 by tuple.Item1 into g
select Tuple.Create(g.Key, g.Sum());
If neither is what you want, please clarify your requirements more precisely.
EDIT: After your clarification that these are dictionaries - your existing solution is perfectly fine. Here's another alternative with join:
var joined = from kvp1 in dict1
join kvp2 in dict2 on kvp1.Key equals kvp2.Key
select new { kvp1.Key, Value = kvp1.Value + kvp2.Value };
var result = joined.ToDictionary(t => t.Key, t => t.Value);
or in fluent syntax:
var result = dict1.Join(dict2,
kvp => kvp.Key,
kvp => kvp.Key,
(kvp1, kvp2) => new { kvp1.Key, Value = kvp1.Value + kvp2.Value })
.ToDictionary(a => a.Key, a => a.Value);
This will give the result, but there are some caveats. It takes the union of the two collections and then groups the items by letter. So if, for example, col1 contained two A elements, it would sum them together and, because the group now holds two A items, it would return them even though A does not appear in col2.
var col1 = new[] { new { L = "A", N = 5 }, new { L = "B", N = 3 }, new { L = "C", N = 2 } };
var col2 = new[] { new { L = "B", N = 1 }, new { L = "C", N = 8 }, new { L = "D", N = 6 } };
var res = col1.Concat(col2)
.GroupBy(p => p.L)
.Where(p => p.Count() > 1)
.Select(p => new { L = p.Key, N = p.Sum(q => q.N) })
.ToArray();
The best I came up with until now is (my collections are actually Dictionary<string, int> instances):
var intersectingKeys = col1.Keys.Intersect(col2.Keys);
var intersection = intersectingKeys
.ToDictionary(key => key, key => col1[key] + col2[key]);
I'm not sure if it will perform well, but at least it is readable.
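For comparison, the same "intersect the keys, then sum per key" idea is short in other languages too. Here is a Python sketch using the example data from the question; it assumes, as stated in the edit to the question, that names are unique within each collection, so plain dictionaries are enough.

```python
col1 = {"A": 5, "B": 3, "C": 2}
col2 = {"B": 1, "C": 8, "D": 6}

# Key views support set intersection; sum the two values for each shared key.
intersection = {key: col1[key] + col2[key] for key in col1.keys() & col2.keys()}

print(sorted(intersection.items()))  # [('B', 4), ('C', 10)]
```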
If your intersection algorithm will result in anonymous type, i.e. ...Select(new { Key = key, Value = value}) then you can easily sum it
result.Sum(e => e.Value);
If you want to sum the "while" doing the intersection, add the value to the accumulator value when adding to the result set.

Resources