Which of these two closure implementations is faster?

I'm writing some training material for the Groovy language, and I'm preparing an example that explains closures.
The example is a simple caching closure for "expensive" methods, withCache:
def expensiveMethod( Long a ) {
    withCache (a) {
        sleep(rnd())
        a*5
    }
}
So, now my question is: which of the two following implementations would be faster and more idiomatic in Groovy?
def withCache = { key, Closure operation ->
    if (!cacheMap.containsKey(key)) {
        cacheMap.put(key, operation())
    }
    cacheMap.get(key)
}
or
def withCache = { key, Closure operation ->
    def cached = cacheMap.get(key)
    if (cached) return cached
    def res = operation()
    cacheMap.put(key, res)
    res
}
I prefer the first example, as it doesn't use any variables, but I wonder whether calling the Map's get method is slower than returning the variable that holds the computed result.
Obviously the answer is "it depends on the size of the Map" but, out of curiosity, I would like to have the opinion of the community.
Thanks!

Firstly, I agree with OverZealous that worrying about two get operations is premature optimization. The second example is also not equivalent to the first: the first allows null values, while the second one uses Groovy truth in the if, which means that null evaluates to false, as does, for example, an empty list/array/map. So if you want to show calling a Closure, I would go with the first one. If you want something more idiomatic for your case, I would do this instead:
def expensiveMethod( Long a ) {
    sleep(rnd())
    a*5
}
def cache = [:].withDefault( this.&expensiveMethod )
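A quick usage sketch (the concrete values assume the expensiveMethod above; 2L is just an illustrative key):
// first access computes expensiveMethod( 2L ) and stores the result
assert cache[ 2L ] == 10
// subsequent accesses are served straight from the map
assert cache[ 2L ] == 10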

Related

Implicitly lazy gather/take not considered a "lazy" object

The documentation for gather/take mentions
Binding to a scalar or sigilless container will also force laziness.
However,
my \result = gather { for 1..3 { take $_² } };
say result.is-lazy # OUTPUT: «False␤»
The same happens if you use a scalar and bind with :=. Is there some way to create implicitly lazy gather/take statements?
Update: It's actually lazy, only it does not respond to the is-lazy method in the expected way:
my $result := gather { for 1..3 { say "Hey"; take $_² } };
say $result[0] # OUTPUT: «Hey␤1␤»
So the question is "What are the conditions for is-lazy to consider things actually lazy?"
I think the problem is really that you cannot actually tell what's going on inside a gather block. So that's why that Seq object tells you it is not lazy.
Perhaps it's more a matter of documentation: if is-lazy returns True, then you can be sure that the Seq (well, in fact its underlying Iterator) is not going to end by itself. If is-lazy returns False, it basically means that we cannot be sure.
One could argue that in that case is-lazy should return the Bool type object, which will also be interpreted as being false (as all type objects are considered to be False in boolean context). But that would at least give some indication that it is really undecided / undecidable.
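As an aside (a small sketch, not from the original answer): if you mark the gather explicitly with the lazy statement prefix, is-lazy does report True:
my $result := lazy gather { for 1..3 { take $_² } };
say $result.is-lazy; # OUTPUT: «True␤»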

Equivalent of enumerators in C++11?

In C#, you can define a custom enumeration very trivially, e.g.:
public IEnumerable<Foo> GetNestedFoos()
{
    foreach (var child in _SomeCollection)
    {
        foreach (var foo in child.FooCollection)
        {
            yield return foo;
        }
        foreach (var bar in child.BarCollection)
        {
            foreach (var foo in bar.MoreFoos)
            {
                yield return foo;
            }
        }
    }
    foreach (var baz in _SomeOtherCollection)
    {
        foreach (var foo in baz.GetNestedFoos())
        {
            yield return foo;
        }
    }
}
(This can be simplified using LINQ and better encapsulation but that's not the point of the question.)
In C++11, you can do similar enumerations but AFAIK it requires a visitor pattern instead:
template<typename Action>
void VisitAllFoos(const Action& action)
{
    for (auto& child : m_SomeCollection)
    {
        for (auto& foo : child.FooCollection)
        {
            action(foo);
        }
        for (auto& bar : child.BarCollection)
        {
            for (auto& foo : bar.MoreFoos)
            {
                action(foo);
            }
        }
    }
    for (auto& baz : m_SomeOtherCollection)
    {
        baz.VisitAllFoos(action);
    }
}
Is there a way to do something more like the first, where the function returns a range that can be iterated externally rather than calling a visitor internally?
(And I don't mean by constructing a std::vector<Foo> and returning it -- it should be an in-place enumeration.)
I am aware of the Boost.Range library, which I suspect would be involved in the solution, but I'm not particularly familiar with it.
I'm also aware that it's possible to define custom iterators to do this sort of thing (which I also suspect might be involved in the answer) but I'm looking for something that's easy to write, ideally no more complicated than the examples shown here, and composable (like with _SomeOtherCollection).
I would prefer something that does not require the caller to use lambdas or other functors (since that just makes it a visitor again), although I don't mind using lambdas internally if needed (but would still prefer to avoid them there too).
If I'm understanding your question correctly, you want to perform some action over all elements of a collection.
C++ has an extensive set of iterator operations, defined in the iterator header. Most collection structures, including the std::vector that you reference, have .begin and .end methods which take no arguments and return iterators to the beginning and the end of the structure. These iterators have some operations that can be performed on them manually, but their primary use comes in the form of the algorithm header, which defines several very useful iteration functions.
In your specific case, I believe you want the for_each function, which takes a range (as a beginning to end iterator) and a function to apply. So if you had a function (or function object) called action and you wanted to apply it to a vector called data, the following code would be correct (assuming all necessary headers are included appropriately):
std::for_each(data.begin(), data.end(), action);
Note that for_each is just one of many functions provided by the algorithm header. It also provides functions to search a collection, copy a set of data, sort a list, find a minimum/maximum, and much more, all generalized to work over any structure that has an iterator. And if even these aren't enough, you can write your own by reading up on the operations supported on iterators. Simply define a template function that takes iterators of varying types and document what kind of iterator you want.
template <typename BidirectionalIterator>
void function(BidirectionalIterator begin, BidirectionalIterator end) {
    // Do something
}
One final note is that all of the operations mentioned so far also operate correctly on arrays, provided you know the size. Instead of writing .begin and .end, you write + 0 and + n, where n is the size of the array. The trivial zero addition is often necessary in order to decay the type of the array into a pointer to make it a valid iterator, but array pointers are indeed random access iterators just like any other container iterator.
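For instance (a small sketch; arr, its size, and action are illustrative):
int arr[4] = {3, 1, 4, 1};
std::for_each(arr + 0, arr + 4, action); // arr + 0 decays to a pointer, which is a valid random access iterator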
What you can do is write your own adapter function and call it with different ranges of elements of the same type.
This is an untested solution that will probably need some tweaking to compile, but it should give you an idea. It uses variadic templates to move from one collection to the next.
// base case: ends the recursion once every range has been visited
void visitAllFoos() {}

template<typename Iterator, typename... Args>
void visitAllFoos(std::pair<Iterator, Iterator> collection, Args&&... args)
{
    std::for_each(collection.first, collection.second,
                  [](Foo& foo) { /* apply action */ });
    visitAllFoos(std::forward<Args>(args)...);
}

//you can call it with a sequence of begin/end iterators
visitAllFoos(std::make_pair(c1.begin(), c1.end()),
             std::make_pair(c2.begin(), c2.end()));
I believe, what you're trying to do can be done with Boost.Range, in particular with join and any_range (the latter would be needed if you want to hide the types of the containers and remove joined_range from the interface).
However, the resulting solution would not be very practical both in complexity and performance - mostly because of the nested joined_ranges and type erasure overhead incurred by any_range. Personally, I would just construct std::vector<Foo*> or use visitation.
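For the flat case, joining two containers with Boost.Range looks roughly like this (a sketch, not from the original answer; the nested, type-erased composition is where it becomes unwieldy):
#include <boost/range/join.hpp>

std::vector<Foo> a, b;
for (auto& foo : boost::join(a, b))
{
    // use foo
}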
You can do this with the help of boost::asio::coroutine; see examples at https://pubby8.wordpress.com/2014/03/16/multi-step-iterators-using-coroutines/ and http://www.boost.org/doc/libs/1_55_0/doc/html/boost_asio/overview/core/coroutine.html.

Faster implementation of Option.isEmpty?

The implementation of isEmpty in Option is straightforward - here's a sketch:
abstract class Option[+A] { def isEmpty: Boolean }
object None extends Option[Nothing] { def isEmpty = true }
final class Some[+A] extends Option[A] { def isEmpty = false }
isEmpty is used extremely heavily, including inside Option itself, so its performance is significant, even though it is so trivial.
I suspect it would be faster to implement it as:
abstract class Option[+A] { final def isEmpty = this eq None }
This implementation shouldn't require dereferencing the option or calling any methods on it, AFAIK - just a straightforward reference comparison.
I'd performance-test this, but JVM microbenchmarks are so tricky that I really have no confidence in my ability to create a meaningful result.
Are there any factors I'm overlooking?
Actually, you might be right. Using the following code:
sealed abstract class Opshun[+A] {
  final def isEmpty = this eq Nun
  def get: A
}
object Nun extends Opshun[Nothing] { def get = ??? }
case class Summ[+A](get: A) extends Opshun[A]
on the simplest possible test case (array of Option or Opshun), if all you do is test isEmpty, the pattern you suggested is 5x (!) faster, and you can verify this if you manually replace .isEmpty with eq None or a pattern match picking out None.
If you move to a more complex case where you test not-isEmpty and then get a stored value, the difference is less impressive (a third faster).
So the suggestion has merit; it's worth testing in a more official setting.
Note added in edit: this is with arrays large enough so that not everything fits in L2 cache. Either way is equally fast when it fits in L2.
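For testing in a more official setting, a JMH-style harness is the usual route. A minimal sketch (class and method names are illustrative; warmup/fork configuration is omitted):
import org.openjdk.jmh.annotations._

@State(Scope.Thread)
class IsEmptyBench {
  // half Some, half None, large enough to spill out of cache
  val opts: Array[Option[Int]] =
    Array.tabulate(100000)(i => if (i % 2 == 0) Some(i) else None)

  @Benchmark
  def viaIsEmpty: Int = {
    var n = 0
    var i = 0
    while (i < opts.length) {
      if (!opts(i).isEmpty) n += 1
      i += 1
    }
    n
  }

  @Benchmark
  def viaEqNone: Int = {
    var n = 0
    var i = 0
    while (i < opts.length) {
      if (opts(i) ne None) n += 1 // plain reference comparison
      i += 1
    }
    n
  }
}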
(Based on Eugene Zhulenev's comment)
It seems the HotSpot compiler will do this optimization automatically, as there are only two subclasses of Option:
Polymorphism Performance Mysteries Explained:
The Server HotSpot compiler, according to Cliff Click, deals with bi-morphism as a special case of poly-morphism: "Where the server compiler can prove only two classes reach a call site, it will insert a type-check and then statically call both targets (which may then further inline, etc)."

Algorithm for visiting nodes only once

Let's say I have an array of N elements. I call a recursive function something like this (no specific language here, just pseudocode):
recursive(myArray){
    // do something awesome and provide base case etc
    // also get mySecondArray based on myArray
    for(i=0; i<mySecondArray.length; i++){
        recursive(mySecondArray[i]);
    }
}
As you can see, I need to call this function on every element of another array, which is created inside based on some conditions and on other functions called on myArray.
The problem I am having is that mySecondArray always has some of the elements that were already in myArray. I do not want to call recursion again on those elements.
Q: What would be the best algorithm approach to solve this?
If you need more info just let me know (I didn't get into details since it gets more complicated)
Thanks
You can have a hashmap/set/dictionary/whatever-you-call-it to look up the elements.
Python solution:
def recursive(myArray, mySet=None):
    if mySet is None:
        mySet = {myArray}  # note: elements must be hashable (e.g. tuples rather than lists)
    else:
        mySet.add(myArray)
    for mySecondArray in myArray:
        if mySecondArray not in mySet:
            recursive(mySecondArray, mySet)  # recurse on the new element, not on myArray again
By the way, writing recursive functions like that is generally a bad idea. If possible, use a single function with an explicit stack of arguments instead, as sketched below.
P.S.: Your code was incomplete, by the way, but the idea is the same.
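A minimal iterative sketch of that stack-based approach (the names are illustrative, and it assumes the elements are hashable):
def visit_all(root):
    seen = {root}
    stack = [root]
    while stack:
        current = stack.pop()
        # do something awesome with current here
        for child in current:  # derive the next "arrays" from the current one
            if child not in seen:
                seen.add(child)
                stack.append(child)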

What's the gain I can have with blocks over regular methods?

I am a Java programmer, and I am learning Ruby...
But I don't get where those blocks of code can give me a gain... What's the purpose of passing a block as an argument? Why not have two specialized, reusable methods instead?
Why have some code in a block that cannot be reused?
I would love some code examples...
Thanks for the help !
Consider some of the things you would use anonymous classes for in Java: for example, they are often used for pluggable behaviour, such as event listeners, or to parametrize a method that has a general layout.
Imagine we want to write a method that takes a list and returns a new list containing the items from the given list for which a specified condition is true. In Java we would write an interface:
interface Condition {
    boolean f(Object o);
}
and then we could write:
public List select(List list, Condition c) {
    List result = new ArrayList();
    for (Object item : list) {
        if (c.f(item)) {
            result.add(item);
        }
    }
    return result;
}
and then if we wanted to select the even numbers from a list we could write:
List even = select(mylist, new Condition() {
    public boolean f(Object o) {
        return ((Integer) o) % 2 == 0;
    }
});
To write the equivalent in Ruby it could be:
def select(list)
  new_list = []
  # note: I'm avoiding 'each' here, so as not to illustrate blocks
  # using a method that itself needs a block
  for item in list
    # yield calls the block with the given parameters
    new_list << item if yield(item)
  end
  return new_list
end
and then we could select the even numbers with simply
even = select(list) { |i| i % 2 == 0 }
Of course, this functionality is already built into Ruby so in practice you would just do
even = list.select { |i| i % 2 == 0 }
As another example, consider code to open a file. You could do:
f = open(somefile)
# work with the file
f.close
but you then need to think about putting your close in an ensure block in case an exception occurs whilst working with the file. Instead, you can do
open(somefile) do |f|
  # work with the file here
  # ruby will close it for us when the block terminates
end
The idea behind blocks is that they are highly localized code for which it is useful to have the definition at the call site. You can also use an existing method as a block argument: just pass it as an additional argument, prefixed with an &.
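For example (a small sketch; divisible_by_two? is a hypothetical helper, and list is assumed to be an array of integers):
def divisible_by_two?(i)
  i % 2 == 0
end

even = list.select(&method(:divisible_by_two?))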
