nice way of doing 'if ALL items in collection pass test' - algorithm

I mean this:
bool passed = true;
for(int i = 0; i < collection.Length; i++)
{
if(!PassesTest(collection[i]))
{
passed = false;
break;
}
}
if(passed){/*passed code*/}
requires extra variable, extra test
for(int i = 0; i < collection.Length; i++)
{
if(!PassesTest(collection[i]))
{
return;
}
}
{/*passed code*/}
neat, but requires this to be its own function; if this is itself inside a loop or something, it's not the most performant way of doing things. also, writing a whole extra function is a pain
for(int i = 0; i < collection.Length; i++)
{
if(!PassesTest(collection[i]))
{
goto failed;
}
}
{/*passed code*/}
failed: {}
great, but you have to screw around with label names and ugly label syntax
for(int i = 0; ; i++)
{
if(!(i < collection.Length))
{
{/*passed code*/}
break;
}
if(!PassesTest(collection[i]))
{
break;
}
}
probably the nicest, but still a bit manual, kinda wasting the functionality of the for loop construct, for instance, you can't do this with a foreach
what is the nicest way to handle this problem?
it seems to me something like this would be nice:
foreach(...)
{
...
}
finally{...} // only executed if loop ends conventionally (without break)
am I missing something? because this is a very common problem for me, and I don't really like any of the solutions I've come up with.
I use c++ and C#, so solutions in either would be great.
but would also be interested in solutions in other languages. (though a design principle that avoids this in any language would be ideal)
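For what it's worth, Python's for loop takes an else clause that behaves much like the finally{...} wished for above: it runs only when the loop ends without a break. A minimal sketch, where all_pass and is_even are made-up names used purely for illustration:

```python
def all_pass(collection, passes_test):
    """Return True only if every item passes, using Python's for/else."""
    for item in collection:
        if not passes_test(item):
            break  # leaving early skips the else clause
    else:
        # Runs only when the loop finished without hitting break.
        return True
    return False


def is_even(n):  # hypothetical test, for illustration only
    return n % 2 == 0


print(all_pass([2, 4, 6], is_even))  # True
print(all_pass([2, 3, 6], is_even))  # False
```

No flag variable and no label is needed, and it works with any iterable rather than only indexable collections.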

If your language doesn't have this feature, write a function "forall," which takes two arguments: a list and a boolean-valued function which is to be true for all elements of the list. Then you only have to write it once, and it matters very little how idiomatic it is.
The "forall" function looks exactly like your second code sample, except that now "collection" and "passesTest" are the arguments to that function.
Calling forall looks roughly like:
if (forall(myList,isGood)) {
which is readable.
As an added bonus, you could implement "exists" by calling "forall" on the negated boolean function, and negating its answer. That is, "exists x P(x)" is implemented as "not forall x not P(x)".
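A rough Python sketch of the two helpers described above, in case it helps. The names forall, exists and is_good follow the wording of this answer; they are not from any library (Python already ships all() and any(), which do the same job):

```python
def forall(items, predicate):
    """True if predicate holds for every item (vacuously True when empty)."""
    for item in items:
        if not predicate(item):
            return False  # stop at the first failure
    return True


def exists(items, predicate):
    """'exists x P(x)' written as 'not forall x not P(x)'."""
    return not forall(items, lambda item: not predicate(item))


def is_good(n):  # hypothetical predicate, for illustration only
    return n > 0


my_list = [3, 1, 4]
if forall(my_list, is_good):
    print("all good")
```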

You can use LINQ in .NET.
Here's an example in C#:
if(collection.All(item => PassesTest(item)))
{
// do my code
}

C++ and STL (std::all_of lives in <algorithm>, C++11 and later):
if (std::all_of(collection.begin(), collection.end(), PassesTest))
{
/* passed code */
}

Ruby:
if collection.all? {|n| n.passes_test?}
# do something
end
Clojure:
(if (every? passes-test? collection)
; do something
)

Groovy:
if (collection.every { PassesTest(it) }) {
// do something
}

Scala:
def passesTest(i: Int) = i < 5 // example, would likely be idiomatically in-line
val seq = List(1, 2, 3, 4)
seq.forall(passesTest) // => true
Most, if not all of the answers presented here are saying: higher-order constructs -- such as "passing functions" -- are really, really nice. If you are designing a language, don't forget about the last 60+ years of programming languages/designs.

Python (since v2.5):
if all(PassesTest(c) for c in collection):
    pass  # do something
Notes:
we can iterate directly on collection, no need for an index and lookup
all() and any() builtin functions were added in Python 2.5
the argument to all(), i.e. PassesTest(c) for c in collection, is a generator expression. You could also pass a list comprehension by adding brackets, [PassesTest(c) for c in collection], but that builds the whole list first, so the test runs on every element even after one has failed
... for c in collection causes iteration through a collection (or sequence/tuple/list). For a dict, use dict.keys(). For other datatypes, use the correct iterator method.

Related

Using the Haxe While Loop to Remove All of a Value from an Array

I want to remove every occurrence of a possibly duplicated value from an array. At the moment I'm using the remove(x:T):Bool function in a while loop, but I'm wondering about the expression part.
I've started by using:
function removeAll(array:Array<String>, element:String):Void
while (array.remove(element)) {}
but I'm wondering if any of these lines would be more efficient:
while (array.remove(element)) continue;
while (array.remove(element)) true;
while (array.remove(element)) 0;
or if it makes any kind of difference.
I'm guessing that using continue is less efficient because it actually has to do something, true and 0 are slightly more efficient, but still do something, and {} would probably be most efficient.
Does anyone have any background information on this?
While others suggested filter, it will create a new list/array instance, so any other code still holding a reference to the old array won't see the change.
If you loop array.remove, it is going to loop through all the elements in the front of the array every time, which is not so performant.
IMO a better approach is to use a reverse while loop:
var i = array.length;
while(--i >= 0)
if(array[i] == element) array.splice(i, 1);
It doesn't make any difference. In fact, there's not even any difference in the generated code for the {}, 0 and false cases: they all end up generating {}, at least on the JS target.
However, you could run into issues if you have a large array with many duplicates: in that case, remove() would be called many times, and it has to iterate over the array each time (until it finds a match, that is). In that case, it's probably more efficient to use filter():
function removeAll(array:Array<String>, element:String):Array<String>
return array.filter(function(e) return e != element);
Personally, I also find this to be a bit more elegant than your while-loop with an empty body. But again, it depends on the use case: this does create a new array, and thus causes an allocation. Usually, that's not worth worrying about, but if you for instance do it in the update loop of a game, you might want to avoid it.
In terms of the expression part of the while loop, it seems that it's just set to empty braces ({}) when compiled so it doesn't really matter what you do.
In terms of performance, a much better solution is the Method 2 from the following:
class Test
{
static function main()
{
var thing:Array<String> = new Array<String>();
for (index in 0...1000)
{
thing.push("0");
thing.push("1");
}
var copy1 = thing.copy();
var copy2 = thing.copy();
trace("epoch");
while (copy1.remove("0")) {}
trace("check");
// Method 2.
copy2 = [
for (item in Lambda.filter(copy2, function(v)
{return v != "0";}))
item
];
trace("check");
}
}
which can be seen [here](https://try.haxe.org/#D0468 "Try Haxe example."). For 200,000 one-character elements in an Array<String>, Method 2 takes 0.017s while Method 1 takes 44.544s.
For large arrays it will be faster to use a temporary array and then assign that back after populating it (method3 in the try.haxe link below).
OR
If you don't want to use a temp you can assign back and splice (method4 in the same link).
https://try.haxe.org/#5f80c
Both are more verbose code-wise as I set up vars, but on Mac they seem faster at runtime. Summary of my method3 approach:
while( i < l ) { if( ( s = copy[ i++ ] ) != '0' ) arr[ j++ ] = s; }
copy = arr;
Am I missing something obvious against these approaches?
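To make the single-pass idea concrete, here is a small sketch in Python rather than Haxe. It compacts the kept values toward the front of the same list, so other references to the list stay valid and no second array is allocated. The name remove_all_in_place is invented for this example; treat it as an illustration of the technique, not a port of method3 or method4:

```python
def remove_all_in_place(items, element):
    """Remove every occurrence of element, keeping the same list object."""
    write = 0
    for value in items:  # single pass over the list
        if value != element:
            items[write] = value  # shift kept values toward the front
            write += 1
    del items[write:]  # drop the leftover tail in one step


data = ["0", "1", "0", "1"]
alias = data  # a second reference to the same list
remove_all_in_place(data, "0")
print(alias)  # ['1', '1']
```

The write index never gets ahead of the read position, so nothing is overwritten before it has been examined. Each element is visited once, unlike repeated remove() calls that rescan from the front every time.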

Pass parameters between method name

I was wondering if you know of any programming language in which we can pass parameters inside the method name. I'm guessing this could improve code readability. E.g.
Let's say I want to multiply two integers in a method. Normally my method declaration would be something like:
function multiply(int a, int b){
return a*b;
}
However, it may be nice to be able to define it this way also:
function multiply (int a) times (int b){
return a*b;
}
This way, we could make a more explicit call in the code by calling:
var c = multiply(4)times(2);
This could have a greater impact on more complicated code and programming syntax.
Do you know if something like this exists?
Of course, there is Smalltalk which is really expressive with its keyword messages...
n := collection size // 2.
head := collection copyFrom: 1 to: n.
Other than that, you will find that in Ada or Python, and probably some other languages, you can prefix each argument with a key (which should match the function's parameter names)...
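As a quick illustration of the named-argument style in Python: the bare * in the signature makes times keyword-only, so the call site has to spell it out. The function and parameter names just mirror the question's example:

```python
def multiply(a, *, times):
    """times is keyword-only, so callers must name it."""
    return a * times


c = multiply(4, times=2)
print(c)  # 8
```

It is not quite multiply(4)times(2), but the call reads close to it, and multiply(4, 2) is rejected with a TypeError.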

Parsing script loops in C#

I'm writing an application that will parse a script in a custom language (based slightly on C syntax and Allman style formatting) and am looking for a better (read: faster) way of parsing blocks of the script code into string arrays than the way I'm currently doing it (the current method will do, but it was more for debug than anything else).
The script contents are currently read from a file into a string array and passed to the method.
Here's a script block template:
loop [/* some conditional */ ]
{
/* a whole bunch of commands that are to be read into
* a List<string>, then converted to a string[] and
* passed to the next step for execution */
/* some command that has a bracket delimited set of
* properties or attributes */
{
/* some more commands to be acted on */
}
}
Basically, the curly bracket blocks can be nested (just like in any other C-based language), and I'm looking for the best way to find individual blocks like this.
The curly bracket delimited blocks will ALWAYS be formatted like this - the contents of the brackets will start on the line after the open bracket and will be followed by a bracket on the line after the final attribute/command/comment/whatever.
An example might be:
loop [ someVar <= 10 ]
{
informUser "Get ready to do something"
readValue
{
valueToLookFor = 0x54
timeout = 10 /* in seconds */
}
}
This would tell the app to loop whilst someVar is less than or equal to 10 (sorry for the sucking eggs comment). Each time round, we pass a message to the user and look for a specific value from somewhere (with a timeout of 10 seconds).
Here's how I'm doing it at the minute (note: the method that calls this passes the entire string[] containing the current script into it with an index to read from):
private string[] findEntireBlock(string[] scriptContents, int indexToReadFrom,
out int newIndex)
{
newIndex = 0;
int openBraceCount = 0; // for '{' char count
int closeBraceCount = 0; // for '}' char count
int openSquareCount = 0; // for '[' char count
int closeSquareCount = 0; // for ']' char count
List<string> fullblock = new List<string>();
for (int i = indexToReadFrom; i < scriptContents.Length; i++)
{
if (scriptContents[i].Contains('}'))
{
if (scriptContents[i].Contains("[") && fullblock.Count > 0)
{
//throw new exception, as we shouldn't expect to
//to find a line which starts with [ when we've already
}
else
{
if (scriptContents[i].Contains('{')) openBraceCount++;
if (scriptContents[i].Contains('}')) closeBraceCount++;
if (scriptContents[i].Contains('[')) openSquareCount++;
if (scriptContents[i].Contains(']')) closeSquareCount++;
newIndex = i;
fullblock.Add(scriptContents[i]);
break;
}
}
else
{
if (scriptContents[i].Contains("[") && fullblock.Count > 0)
{
//throw new exception, as we shouldn't expect to
//to find a line which starts with [ when we've already
}
else
{
if (scriptContents[i].Contains('{')) openBraceCount++;
if (scriptContents[i].Contains('}')) closeBraceCount++;
if (scriptContents[i].Contains('[')) openSquareCount++;
if (scriptContents[i].Contains(']')) closeSquareCount++;
fullblock.Add(scriptContents[i]);
}
}
}
if (openBraceCount == closeBraceCount &&
openSquareCount == closeSquareCount)
return fullblock.ToArray();
else
//throw new exception, the number of open brackets doesn't match
//the number of close brackets
}
I agree that this might be a slightly obtuse and slow method to follow, that's why I'm asking for any ideas on how to re-implement this for speed and clarity (if a balance can be met, that is).
I'm looking to stay away from RegEx, because I can't use it to maintain a bracket count and I'm not sure on whether you can write a RegEx statement (is that the correct term?) that can act recursively. I was thinking of working from the inside outward, but I'm convinced that would be quite slow.
I'm not looking for someone to re-write it for me, but a general idea on algorithms or techniques/libraries that I could use that would improve my method.
As a side question, how do compilers deal with multiple nested brackets in source code?
Let's Build a Compiler, by Jack Crenshaw, is a fantastic, easy-to-read introduction to building a basic compiler. The techniques discussed should help with what you're trying to do here.
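On the nested-brackets side question, one common technique is to keep a single depth counter: add one on every open brace, subtract one on every close brace, and the block ends when the counter returns to zero. A rough Python sketch of that idea follows. It assumes each brace sits alone on its own line, as in the examples in the question; find_block and the nextCommand line are invented for illustration:

```python
def find_block(lines, start):
    """Return (block_lines, end_index) for the block beginning at lines[start].

    Assumes each brace sits alone on its own line.
    """
    depth = 0
    opened = False
    for i in range(start, len(lines)):
        stripped = lines[i].strip()
        if stripped == "{":
            depth += 1
            opened = True
        elif stripped == "}":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unmatched '}}' on line {i}")
        if opened and depth == 0:
            return lines[start:i + 1], i  # outermost block just closed
    raise ValueError("block is never closed")


script = [
    "loop [ someVar <= 10 ]",
    "{",
    '    informUser "Get ready to do something"',
    "    readValue",
    "    {",
    "        valueToLookFor = 0x54",
    "    }",
    "}",
    "nextCommand",
]
block, end = find_block(script, 0)
print(end)  # 7
```

This is one pass over the lines with no per-bracket bookkeeping beyond the counter. The inner block can be handled by calling the same function again on the lines inside the outer one, which is roughly the recursive-descent approach the Crenshaw tutorial teaches.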

What is the correct way to convert from a for loop to a while loop?

I have a for loop of the form:
for (int i = from; i < to; i++) {
// do some code (I don't know exactly what, it is subject to change)
}
And I want to convert it to a while loop (mostly because I want to play with the value of i inside the loop to go backwards and forwards and my co-worker thinks that doing this in a for loop is prone to problems. I tend to agree with him). So I wrote something like this:
int i = from;
while (i < to) {
try {
// do some code (I don't know exactly what, it is subject to change)
} finally {
i++;
}
}
Which prompted some loud comments. My reasoning is that you don't know what the code inside the loop does - it may (and does) have multiple continue commands.
As a response he wrote this:
int i = from - 1;
while (++i < to) {
// do some code (I don't know exactly what, it is subject to change)
}
Granted, it's fewer lines, but I still think my code is more elegant - what do you think?
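To show why the continue statements matter here, a small Python sketch of the try/finally version (the visit function and the even/odd test are made up for the example). Without the finally, the continue would jump back to the loop test with i unchanged and the loop would never end:

```python
def visit(start, stop):
    """Collect odd indices; 'continue' must not skip the increment."""
    seen = []
    i = start
    while i < stop:
        try:
            if i % 2 == 0:
                continue  # jumps to the next iteration...
            seen.append(i)
        finally:
            i += 1  # ...but this still runs first
    return seen


print(visit(0, 6))  # [1, 3, 5]
```

One caveat with this shape: the finally block also runs when the body raises an exception, which a plain for loop's increment would not do.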
Playing with the value of your index while in a looping structure is prone to problems, no matter what the looping structure is.
It's not going to matter if it's a for loop or a while loop, the point is will the indexer eventually lead you to make a decision of loop termination?
If you're confident that your indexer will eventually cause your exit condition to be achieved, then that is all you should be concerned with, not whether to use a for or a while.
And I want to convert it to a while loop (mostly because I want to play with the value of i inside the loop to go backwards and forwards and my co-worker thinks that doing this in a for loop is prone to problems. I tend to agree with him).
This is perfectly acceptable in most languages. There is no reason to avoid a for loop.
It seems to me that it may be easier and more readable to convert it to:
while (condition == true) {
// do stuff
// set the condition flag appropriately
}
and thus separate the termination of the loop from the variable incrementation.
If I see a loop with a limit check (e.g. i < limit) I would tend to assume that there's a variable that is being modified in a (reasonably) consistent fashion. There's no reason why you can't do what you want, but I would lean towards the more readable and more expected behaviour.
Why bother with silly loops when you can do the same (and much more!) with the uber-powerful goto?
i = from;
my_loop:
//all your stuff here
i++;
if (i < to) goto my_loop;
If you are one of those faint-hearted programmers who look down on the goto, then you can try this:
i = from;
while(1) {
//your stuff here
if (++i >= to) break;
}
The easiest way to do this would be to not convert into a while loop, such as below.
for (int i = from; i < to; ) {
// do some code (I don't know exactly what, it is subject to change)
i += rand()*10;
}
To answer the question about which code I would select: I choose your longer code. It's MUCH easier to read the first (longer) loop. And yes, I can read the second, but even if you have lots of experience you have to look twice to know what that loop does. Plus the compiler will optimize the code well enough.

What's the smart way to implement OrderBy / ThenBy?

I'm implementing a LINQ clone in Lua, but that's not too relevant here, and I've got most features done (enumerable/queryable, not the precompiler yet), but can't think of a smart way to implement OrderBy's ThenBy.
Currently I sort once, then place the results in new lists, sort those sub-lists, and finally merge the results again, but that seems very wasteful and inelegant. I'm sure someone has figured out a smart way to do this (a better algorithm), but I have no idea what it is. Any clues as to how to implement OrderBy / ThenBy in an efficient way?
Note: Language and Language constructs hopefully are not relevant here, I'm looking for the generalized algorithm, just as say a Binary Sort can be done in any language.
Edit: Currently I'm working on LINQ to Objects, so any ideas on how that would be done in particular would be great. I'm guessing OrderBy/ThenBy are 2 function calls, not one, but I might be wrong.
Typically you would implement a multi-key sort by using a suitable compare method. For example, to sort a list of names by last name and then first name, you might use a compare function like this:
int compareNames(Name n1, Name n2)
{
if (n1.LastName < n2.LastName) {
return -1;
} else if (n1.LastName > n2.LastName) {
return 1;
} else if (n1.FirstName < n2.FirstName) {
return -1;
} else if (n1.FirstName > n2.FirstName) {
return 1;
} else {
return 0;
}
}
The key point here is that we don't look at the FirstName member unless we already know that the two LastName members are equal.
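The same idea can be made composable, so that ThenBy simply wraps an existing comparator and the data is still sorted only once. A hedged Python sketch: compare_by and then_by are names invented here, and this is one possible design rather than a description of how LINQ does it internally:

```python
from functools import cmp_to_key


def compare_by(key):
    """Build a three-way comparator (-1, 0, 1) from a key function."""
    def compare(a, b):
        ka, kb = key(a), key(b)
        return (ka > kb) - (ka < kb)
    return compare


def then_by(first, second):
    """Consult second only when first reports a tie."""
    def compare(a, b):
        return first(a, b) or second(a, b)
    return compare


people = [("Smith", "Zoe"), ("Jones", "Bob"), ("Smith", "Al")]
by_last_then_first = then_by(compare_by(lambda p: p[0]),
                             compare_by(lambda p: p[1]))
ordered = sorted(people, key=cmp_to_key(by_last_then_first))
print(ordered)
```

Because then_by returns an ordinary comparator, further keys can be chained by wrapping the result again. An alternative, when the sort is stable (Python's is), is to sort by the secondary key first and then by the primary key.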
I think this also works:
function(lh,rh)
if lh.first < rh.first then
return true
elseif lh.first > rh.first then
return false
end
return lh.second < rh.second
end
which, if true, means this should work:
tests={}
tests[1]=function(lh,rh)
return lh.first < rh.first
end
tests[2]=function(lh,rh)
return lh.second < rh.second
end
function(lh,rh)
for _,v in ipairs(tests) do
if v(lh,rh) then return true end
if v(rh,lh) then return false end
end
return false
end

Resources