What's the smart way to implement OrderBy / ThenBy?

What's the smart way to implement OrderBy / ThenBy? - linq

I'm implementing a LINQ clone in Lua, but that's not too relevant here, and I've got most features done (enumerable/queryable, not the precompiler yet), but can't think of a smart way to implement OrderBy's ThenBy.
Currently I sort once, then place in new lists and then sort those sub lists and finally merge the results again, but that seems very wasteful and inelegant, I'm sure someone has figured out a smart way to do this (better algorithm), but I have no idea what it is. Any clues as to how to implement OrderBy / Thenby in an efficient way?
Note: Language and Language constructs hopefully are not relevant here, I'm looking for the generalized algorithm, just as say a Binary Sort can be done in any language.
Edit: Currently I'm working on LINQ to Object, so any ideas how that would be done in particular would be great. I'm guessing OrberBy/ThenBy are 2 function calls, not one but I might be wrong.

Typically you would implement a multi-key sort by using a suitable compare method. For example, to sort a list of names by last name and then first name, you might use a compare function like this:
int compareNames(Name n1, Name n2)
{
if (n1.LastName < n2.LastName) {
return -1;
} else if (n1.LastName > n2.LastName) {
return 1;
} else if (n1.FirstName < n2.FirstName) {
return -1;
} else if (n1.FirstName > n2.FirstName) {
return 1;
} else {
return 0;
}
}
The key point here is that we don't look at the FirstName member unless we already know that the two LastName members are equal.

I think this also works:
function(lh,rh)
if lh.first < rh.first then
return true
elseif lh.second < rh.second then
return true
end
return false
end
which, if true, means this should work:
tests={}
tests[1]=function(lh,rh)
return lh.first < rh.first
end
tests[2]=function(lh,rh)
return lh.second < rh.second
end
function(lh,rh)
local res=true
local k,v
for k,v in ipairs(tests) do
res = v(lh,rh)
if res then break end
end
return res
end

Related

Can an array of strings of size 1 be used like a string?

I'm translating a program from asp to asp.net. The creator has a few function that I'm scratching my head over. It seems to be passing back arrays but the results of the function are used as if they're strings in some situations and arrays in others.
Mostly it operates as if it returns a string but sometimes it'll do for each on the result which indicates that it's actually an array of strings. I've searched the web to see if there's some weird corner case logic but nothing specific to this comes up.
function textvalue(myPar)
{
eval("var anotherArray=" + myPar);
anotherArray.sort();
if (eval("datatype" + myPar)=="datetime")
{
//if (eval(myPar)==null || eval(myPar)=="null" || eval(myPar)=="")
if (anotherArray.toString()==null || anotherArray.toString()=="null" || anotherArray.toString()=="")
{
return anotherArray;
}
else
{
return new Array(convertFromAODdatetime(anotherArray.toString()));
}
}
else
{
return anotherArray;
}
}
USE 1
Response.Write(...existing status: " & theStructure.textvalue("structureItemStatus") & "....")
USE 2
For Each datum In fileData.textValue("fileNomenclature")
Response.Write(...
I'm ultimately wondering if I need to do something unique w/ these functions or maybe the resulting datatype to replicate the logic of the function properly.

I don't believe that an array works that way. You might be able to replace the array with a list and it work more in the way you are expecting.
Another option is to just run an if statement to see if the array length is greater than 1 then run a foreach loop otherwise just grab arry[0].

Using the Haxe While Loop to Remove All of a Value from an Array

I'm wanting to remove all of a possibly duplicated value in an array. At the moment I'm using the remove(x:T):Bool function in a while loop, but I'm wondering about the expression part.
I've started by using:
function removeAll(array:Array<String>, element:String):Void
while (array.remove(element)) {}
but I'm wondering if any of these lines would be more efficient:
while (array.remove(element)) continue;
while (array.remove(element)) true;
while (array.remove(element)) 0;
or if it makes any kind of difference.
I'm guessing that using continue is less efficient because it actually has to do something, true and 0 are slightly more efficient, but still do something, and {} would probably be most efficient.
Does anyone have any background information on this?

While other suggested filter, it will create a new instance of list/array which may cause your other code to lose reference.
If you loop array.remove, it is going to loop through all the elements in the front of the array every time, which is not so performant.
IMO a better approach is to use a reverse while loop:
var i = array.length;
while(--i >= 0)
if(array[i] == element) array.splice(i, 1);

It doesn't make any difference. In fact, there's not even any difference in the generated code for the {}, 0 and false cases: they all end up generating {}, at least on the JS target.
However, you could run into issues if you have a large array with many duplicates: in that case, remove() would be called many times, and it has to iterate over the array each time (until it finds a match, that is). In that case, it's probably more efficient to use filter():
function removeAll(array:Array<String>, element:String):Array<String>
return array.filter(function(e) return e != element);
Personally, I also find this to be a bit more elegant than your while-loop with an empty body. But again, it depends on the use case: this does create a new array, and thus causes an allocation. Usually, that's not worth worrying about, but if you for instance do it in the update loop of a game, you might want to avoid it.

In terms of the expression part of the while loop, it seems that it's just set to empty brases ({}) when compiled so it doesn't really matter what you do.
In terms of performance, a much better solution is the Method 2 from the following:
class Test
{
static function main()
{
var thing:Array<String> = new Array<String>();
for (index in 0...1000)
{
thing.push("0");
thing.push("1");
}
var copy1 = thing.copy();
var copy2 = thing.copy();
trace("epoch");
while (copy1.remove("0")) {}
trace("check");
// Method 2.
copy2 = [
for (item in Lambda.filter(copy2, function(v)
{return v != "0";}))
item
];
trace("check");
}
}
which can be seen [here](https://try.haxe.org/#D0468"Try Haxe example."). For 200,000 one-character elements in an Array<String>, Method 2 takes 0.017s while Method 1 takes 44.544s.

For large arrays it will be faster to use a temporary array and then assign that back after populating ( method3 in try )?
OR
If you don't want to use a temp you can assign back and splice ( method4 in try )?
https://try.haxe.org/#5f80c
Both are more verbose codewise as I setup vars, but on mac seems faster at runtime, summary of my method3 approach:
while( i < l ) { if( ( s = copy[ i++ ] ) != '0' ) arr[ j++ ] = s;
copy = arr;
am I missing something obvious against these approaches?

Does Ruby Array#assoc use linear search?

Hmm..can't yet read this yet..but does Ruby Array#assoc use linear search?
rb_ary_assoc(VALUE ary, VALUE key)
{
long i;
VALUE v;
for (i = 0; i < RARRAY_LEN(ary); ++i) {
v = rb_check_array_type(RARRAY_PTR(ary)[i]);
if (!NIL_P(v) && RARRAY_LEN(v) > 0 &&
rb_equal(RARRAY_PTR(v)[0], key))
return v;
}
return Qnil;
}

Personally, I find the Rubinius source code much easier to read than the YARV source code. (Actually, I find all other Ruby implementations' source code easier to read than YARV or MRI.)
This is the implementation of Array#assoc from Rubinius:
def assoc(obj)
each do |x|
if x.kind_of? Array and x.first == obj
return x
end
end
nil
end
So, yes it is easy to see that it indeed does use a linear search.
But you don't really need to look at the source code to figure that out. What else could it be? There is no structure or order that could be exploited to speed it up, unlike with a search tree or a sorted array.

Yes; it's iterating over the array: RARRAY_PTR(ary)[i]
Which is the only thing that makes sense, given an array may or may not be sorted.
(Noting that Ruby 2 will introduce a bsearch, and there are at least 2-3 gems for binary searches, if you are concerned with speed. See https://stackoverflow.com/a/8672512/438992 for details.)

LINQ and CASE Sensitivity

I have this LINQ Query:
TempRecordList = new ArrayList(TempRecordList.Cast<string>().OrderBy(s => s.Substring(9, 30)).ToArray());
It works great and performs sorting in a way that's accurate but a little different from what I want. Among the the result of the query I see something like this:
Palm-Bouter, Peter
Palmer-Johnson, Sean
Whereas what I really need is to have names sorted like this:
Palmer-Johnson, Sean
Palm-Bouter, Peter
Basically I want the '-' character to be treated as being lower than the character so that names that contain it show up later in an ascending search.
Here is another example. I get:
Dias, Reginald
DiBlackley, Anton
Instead of:
DiBlackley, Anton
Dias, Reginald
As you can see, again, the order is switched due to how the uppercase letter 'B' is treated.
So my question is, what do I need to change in my LINQ query to make it return results in the order I specified. Any feedback would be greatly appreaciated.
By the way, I tried using s.Substring(9, 30).ToLower() but that didn't help.
Thank you!

To customize the sorting order you will need to create a comparer class that implements IComparer<string> interface. The OrderBy() method takes comparer as second parameter.
internal sealed class NameComparer : IComparer<string> {
private static readonly NameComparer DefaultInstance = new NameComparer();
static NameComparer() { }
private NameComparer() { }
public static NameComparer Default {
get { return DefaultInstance; }
}
public int Compare(string x, string y) {
int length = Math.Min(x.Length, y.Length);
for (int i = 0; i < length; ++i) {
if (x[i] == y[i]) continue;
if (x[i] == '-') return 1;
if (y[i] == '-') return -1;
return x[i].CompareTo(y[i]);
}
return x.Length - y.Length;
}
}
This works at least with the following test cases:
var names = new[] {
"Palmer-Johnson, Sean",
"Palm-Bouter, Peter",
"Dias, Reginald",
"DiBlackley, Anton",
};
var sorted = names.OrderBy(name => name, NameComparer.Default).ToList();
// sorted:
// [0]: "DiBlackley, Anton"
// [1]: "Dias, Reginald"
// [2]: "Palmer-Johnson, Sean"
// [3]: "Palm-Bouter, Peter"

As already mentioned, the OrderBy() method takes a comparer as a second parameter.
For strings, you don't necessarily have to implement an IComparer<string>. You might be fine with System.StringComparer.CurrentCulture (or one of the others in System.StringComparer).
In your exact case, however, there is no built-in comparer which will handle also the - after letter sort order.

OrderBy() returns results in ascending order.
e comes before h, thus the first result (remember you're comparing on a substring that starts with the character in the 9th position...not the beginning of the string) and i comes before y, thus the second. Case sensitivity has nothing to do with it.
If you want results in descending order, you should use OrderByDescending():
TempRecordList.Cast<string>
.OrderByDescending(s => s.Substring(9, 30)).ToArray());

You might want to just implement a custom IComparer object that will give a custom priority to special, upper-case and lower-case characters.
http://msdn.microsoft.com/en-us/library/system.collections.icomparer.aspx

nice way of doing 'if ALL items in collection pass test'

I mean this:
bool passed = true;
for(int i = 0; i < collection.Length; i++)
{
if(!PassesTest(collection[i]))
{
passed = false;
break;
}
}
if(passed){/*passed code*/}
requires extra variable, extra test
for(int i = 0; i < collection.Length; i++)
{
if(!PassesTest(collection[i]))
{
return;
}
}
{/*passed code*/}
neat, but requires this to be it's own function, if this it's self is inside a loop or something, not the most performant way of doing things. also, writing a whole extra function is a pain
if(passed){/*passed code*/}
for(int i = 0; i < collection.Length; i++)
{
if(!PassesTest(collection[i]))
{
goto failed;
}
}
{/*passed code*/}
failed: {}
great, but you have to screw around with label names and ugly label syntax
for(int i = 0; ; i++)
{
if(!(i < collection.Length))
{
{/*passed code*/}
break;
}
if(!PassesTest(collection[i]))
{
break;
}
}
probably the nicest, but still a bit manual, kinda wasting the functionality of the for loop construct, for instance, you can't do this with a foreach
what is the nicest way to handle this problem?
it seems to me something like this would be nice:
foreach(...)
{
...
}
finally{...} // only executed if loop ends conventionally (without break)
am I missing something? because this is a very common problem for me, and I don't really like any of the solutions I've come up with.
I use c++ and C#, so solutions in either would be great.
but would also be interested in solutions in other languages. (though a design principle that avoids this in any language would be ideal)

If your language doesn't have this feature, write a function "forall," which takes two arguments: a list and a boolean-valued function which is to be true for all elements of the list. Then you only have to write it once, and it matters very little how idiomatic it is.
The "forall" function looks exactly like your second code sample, except that now "collection" and "passesTest" are the arguments to that function.
Calling forall looks roughly like:
if (forall(myList,isGood)) {
which is readable.
As an added bonus, you could implement "exists" by calling "forall" on the negated boolean function, and negating its answer. That is, "exists x P(x)" is implemented as "not forall x not P(x)".

You can use Linq in .NET.
Here's an example in C#:
if(collection.All(item => PassesTest(item)))
{
// do my code
}

C++ and STL
if (std::all_of(collection.begin(), collection.end(), PassesTest))
{
/* passed code */
}

Ruby:
if collection.all? {|n| n.passes_test?}
# do something
end
Clojure:
(if (every? passes-test? collection)
; do something
)

Groovy:
if (!collection.find { !PassesTest(it) }) {
// do something
}

Scala:
def passesTest(i: Int) = i < 5 // example, would likely be idiomatically in-line
var seq = List(1,2,3,4);
seq.forall(passesTest) // => True
Most, if not all of the answers presented here are saying: higher-order constructs -- such as "passing functions" -- are really, really nice. If you are designing a language, don't forget about the last 60+ years of programming languages/designs.

Python (since v2.5):
if all(PassesTest(c) for c in collection):
do something
Notes:
we can iterate directly on collection, no need for an index and lookup
all() and any() builtin functions were added in Python 2.5
the argument to any(), i.e. PassesTest(c) for c in collection, is a generator expression. You could also view it as a list comprehension by adding brackets [(PassesTest(c) for c in collection]
... for c in collection causes iteration through a collection (or sequence/tuple/list). For a dict, use dict.keys(). For other datatypes, use the correct iterator method.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

What's the smart way to implement OrderBy / ThenBy? - linq

Related

Can an array of strings of size 1 be used like a string?

Using the Haxe While Loop to Remove All of a Value from an Array

Does Ruby Array#assoc use linear search?

LINQ and CASE Sensitivity

nice way of doing 'if ALL items in collection pass test'

Categories

Resources