What are nested functions? What are they for? - information-hiding

I've never used nested functions, but have seen references to them in several languages (as well as nested classes, which I assume are related).
What is a nested function?
Why?!?
What can you do with a nested function that you cannot do any other way?
What can you do with a nested function this is difficult or inelegant without nested functions?
I assume nested functions are simply an artifact of treating everything as an object, and if objects can contain other objects then it follows.
Do nested functions have scope (in general, I suppose languages differ on this) just as variables inside a function have scope?
Please add the language you are referencing if you're not certain that your answer is language agnostic.
-Adam

One popular use of nested functions is closures. In a lexically scoped language with first-class functions it's possible to use functions to store data. A simple example in Scheme is a counter:
(define (make-counter)
(let ((count 0)) ; used to store the count
(define (counter) ; this is the counter we're creating
(set! count (+ count 1)) ; increment the count
count) ; return the new count
counter)) ; return the new counter function
(define mycounter (make-counter)) ; create a counter called mycounter
(mycounter) ; returns 1
(mycounter) ; returns 2
In this example, we nest the function counter inside the function make-counter, and by returning this internal function we are able to access the data available to counter when it was defined. This information is private to this instance of mycounter - if we were to create another counter, it would use a different spot to store the internal count. Continuing from the previous example:
(define mycounter2 (make-counter))
(mycounter2) ; returns 1
(mycounter) ; returns 3

It's useful for recursion when there is only 1 method that will ever call it
string[] GetFiles(string path)
{
void NestedGetFiles(string path, List<string> result)
{
result.AddRange( files in the current path);
foreach(string subPath in FoldersInTheCurrentPath)
NestedGetFiles(subPath, result);
}
List<string> result = new List<string>();
NestedGetFiles(path, result);
return result.ToArray();
}
The above code is completely made up but is based on C# to give the idea of what I mean. The only method that can call NestedGetFiles is the GetFiles method.

Nested functions allow you to encapsulate code that is only relevant to the inner workings of one function within that function, while still allowing you to separate that code out for readability or generalization. In some implementations, they also allow access to outer scope. In D:
int doStuff() {
int result;
void cleanUpReturn() {
myResource1.release();
myResource2.release();
return result * 2 + 1;
}
auto myResource1 = getSomeResource();
auto myResource2 = getSomeOtherResource();
if(someCondition) {
return cleanUpReturn();
} else {
doSomeOtherStuff();
return cleanUpReturn();
}
}
Of course, in this case this could also be handled with RAII, but it's just a simple example.

A nested function is simply a function defined within the body of another function. Why? About the only reason I could think of off the top of my head is a helper or utility function.
This is a contrived example but bear with me. Let's say you had a function that had to act on the results two queries and fill an object with values from one of the queries. You could do something like the following.
function process(qryResult q1, qryResult q2) {
object o;
if (q1.someprop == "useme") {
o.prop1 = q1.prop1;
o.prop2 = q1.prop2;
o.prop3 = q1.prop3;
} else if (q2.someprop == "useme") {
o.prop1 = q2.prop1;
o.prop2 = q2.prop2;
o.prop3 = q2.prop3;
}
return o;
}
If you had 20 properties, you're duplicating the code to set the object over and over leading to a huge function. You could add a simple nested function to do the copy of the properties from the query to the object. Like this:
function process(qryResult q1, qryResult q2) {
object o;
if (q1.someprop == "useme") {
fillObject(o,q1);
} else if (q2.someprop == "useme") {
fillObject(o,q2);
}
return o;
function fillObject(object o, qryResult q) {
o.prop1 = q.prop1;
o.prop2 = q.prop2;
o.prop3 = q.prop3;
}
}
It keeps things a little cleaner. Does it have to be a nested function? No, but you may want to do it this way if the process function is the only one that would have to do this copy.

(C#) :
I use that to simplify the Object Browser view, and to structure my classes better.
As class Wheel nested in Truck class.
Don't forget this detail :
"Nested types can access private and protected members of the containing type, including any inherited private or protected members."

They can also be useful if you need to pass a function to another function as an argument. They can also be useful for making factory functions for factory functions (in Python):
>>> def GetIntMaker(x):
... def GetInt():
... return x
... return GetInt
...
>>> GetInt = GetIntMaker(1)
>>> GetInt()
1

A nested function is just a function inside another function.
Yes, it is a result of everything being an object. Since you can have variables only visible in the function's scope and variables can point to functions you can have a function that is referenced by a local variable.
I don't think there is anything that you can do with a nested function that you absolutely couldn't do without. A lot of the times it makes sense, though. Namely, whenever a function is a "sub-function" of some other function.
A common use-case for me is when a function performs a lot of complicated logic but what the function computes/returns is easy to abstract for all the cases dictated by the logic.

Related

Removing a std::function<()> from a vector c++

I'm building a publish-subscribe class (called SystermInterface), which is responsible to receive updates from its instances, and publish them to subscribers.
Adding a subscriber callback function is trivial and has no issues, but removing it yields an error, because std::function<()> is not comparable in C++.
std::vector<std::function<void()> subs;
void subscribe(std::function<void()> f)
{
subs.push_back(f);
}
void unsubscribe(std::function<void()> f)
{
std::remove(subs.begin(), subs.end(), f); // Error
}
I've came down to five solutions to this error:
Registering the function using a weak_ptr, where the subscriber must keep the returned shared_ptr alive.
Solution example at this link.
Instead of registering at a vector, map the callback function by a custom key, unique per callback function.
Solution example at this link
Using vector of function pointers. Example
Make the callback function comparable by utilizing the address.
Use an interface class (parent class) to call a virtual function.
In my design, all intended classes inherits a parent class called
ServiceCore, So instead of registering a callback function, just
register ServiceCore reference in the vector.
Given that the SystemInterface class has a field attribute per instance (ID) (Which is managed by ServiceCore, and supplied to SystemInterface by constructing a ServiceCore child instance).
To my perspective, the first solution is neat and would work, but it requires handling at subscribers, which is something I don't really prefer.
The second solution would make my implementation more complex, where my implementation looks as:
using namespace std;
enum INFO_SUB_IMPORTANCE : uint8_t
{
INFO_SUB_PRIMARY, // Only gets the important updates.
INFO_SUB_COMPLEMENTARY, // Gets more.
INFO_SUB_ALL // Gets all updates
};
using CBF = function<void(string,string)>;
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, vector<CBF>>;
using REQINF_SUBS = map<string, INFO_SUBTREE>; // It's keyed by an iterator, explaining it goes out of the question scope.
using INFSRC_SUBS = map<string, INFO_SUBTREE>;
using WILD_SUBS = INFO_SUBTREE;
REQINF_SUBS infoSubrs;
INFSRC_SUBS sourceSubrs;
WILD_SUBS wildSubrs;
void subscribeInfo(string info, INFO_SUB_IMPORTANCE imp, CBF f) {
infoSubrs[info][imp].push_back(f);
}
void subscribeSource(string source, INFO_SUB_IMPORTANCE imp, CBF f) {
sourceSubrs[source][imp].push_back(f);
}
void subscribeWild(INFO_SUB_IMPORTANCE imp, CBF f) {
wildSubrs[imp].push_back(f);
}
The second solution would require INFO_SUBTREE to be an extended map, but can be keyed by an ID:
using KEY_T = uint32_t; // or string...
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, map<KEY_T,CBF>>;
For the third solution, I'm not aware of the limitations given by using function pointers, and the consequences of the fourth solution.
The Fifth solution would eliminate the purpose of dealing with CBFs, but it'll be more complex at subscriber-side, where a subscriber is required to override the virtual function and so receives all updates at one place, in which further requires filteration of the message id and so direct the payload to the intended routines using multiple if/else blocks, which will increase by increasing subscriptions.
What I'm looking for is an advice for the best available option.
Regarding your proposed solutions:
That would work. It can be made easy for the caller: have subscribe() create the shared_ptr and corresponding weak_ptr objects, and let it return the shared_ptr.
Then the caller must not lose the key. In a way this is similar to the above.
This of course is less generic, and then you can no longer have (the equivalent of) captures.
You can't: there is no way to get the address of the function stored inside a std::function. You can do &f inside subscribe() but that will only give you the address of the local variable f, which will go out of scope as soon as you return.
That works, and is in a way similar to 1 and 2, although now the "key" is provided by the caller.
Options 1, 2 and 5 are similar in that there is some other data stored in subs that refers to the actual std::function: either a std::shared_ptr, a key or a pointer to a base class. I'll present option 6 here, which is kind of similar in spirit but avoids storing any extra data:
Store a std::function<void()> directly, and return the index in the vector where it was stored. When removing an item, don't std::remove() it, but just set it to std::nullptr. Next time subscribe() is called, it checks if there is an empty element in the vector and reuses it:
std::vector<std::function<void()> subs;
std::size_t subscribe(std::function<void()> f) {
if (auto it = std::find(subs.begin(), subs.end(), std::nullptr); it != subs.end()) {
*it = f;
return std::distance(subs.begin(), it);
} else {
subs.push_back(f);
return subs.size() - 1;
}
}
void unsubscribe(std::size_t index) {
subs[index] = std::nullptr;
}
The code that actually calls the functions stored in subs must now of course first check against std::nullptrs. The above works because std::nullptr is treated as the "empty" function, and there is an operator==() overload that can check a std::function against std::nullptr, thus making std::find() work.
One drawback of option 6 as shown above is that a std::size_t is a rather generic type. To make it safer, you might wrap it in a class SubscriptionHandle or something like that.
As for the best solution: option 1 is quite heavy-weight. Options 2 and 5 are very reasonable, but 6 is, I think, the most efficient.

Adding value to generic collection in class not allowed because of scope

I'm having trouble adding elements to an object that keeps a collection of generic-typed values. I tried a Minimal Working Example that causes the error:
class OneElementQueue {
type eltType;
var elements : [0..0] eltType;
//initializer
proc init(type eltType) {
this.eltType = eltType;
}
proc add(element : eltType) {
this.elements[0] = element;
}
proc remove() : eltType {
return this.elements[0];
}
} //end of OneElementQueue
class Monkey {
var name: string;
var age: int;
proc init(name : string, age : int) {
this.name = name;
this.age = age;
}
} //end of class Monkey
var q = new owned OneElementQueue(Monkey);
var m = new owned Monkey("Kyle", 6);
q.add(m);
When I try to compile all of this, I get an error:
$ chpl BadQueue.chpl
BadQueue.chpl:12: In function 'add':
BadQueue.chpl:13: error: Scoped variable would outlive the value it is set to
BadQueue.chpl:12: note: consider scope of element
$
What is the correct way to go about adding something to a generic data structure like this? How am I going about this the wrong way?
There are two possible approaches you can take here, depending on what behavior you want:
"I want to have my collection take ownership of the Monkey objects"
In this case, you'll want to instantiate your OneElementQueue collection to store owned Monkey objects rather than simply [borrowed] Monkey objects, which is the default for class types. You can do this with the one line change (Try it Online):
var q = new owned OneElementQueue(owned Monkey);
In this approach, passing an owned Monkey to your add() method will pass the ownership to the argument and eventually to the collection, making the original object reference invalid (nil).
"I want to have my collection borrow the existing Monkey objects without taking ownership of them"
In this case, you'll need to tell the add() method that the argument passed into it will outlive the argument itself (and then be sure not to lie about it). In Chapel version 1.19, this can be done via lifetime annotations:
proc add(element : eltType) lifetime element > this {
where the annotation lifetime element > this asserts that the actual argument passed through element will outlive the this collection itself, so the compiler should not fear that the borrow will cease to exist once the formal argument has.
Lifetime annotations were not available in Chapel 1.18, so if you're using that version you need to use a slightly bigger hammer and apply pragma "unsafe" to the method. Note that pragmas are not an officially supported feature and may change in the future, so for this case, served as a stopgap until lifetime annotations had been implemented (Try it Online):
pragma "unsafe"
proc add(element : eltType) {

Equivalent of enumerators in C++11?

In C#, you can define a custom enumeration very trivially, eg:
public IEnumerable<Foo> GetNestedFoos()
{
foreach (var child in _SomeCollection)
{
foreach (var foo in child.FooCollection)
{
yield return foo;
}
foreach (var bar in child.BarCollection)
{
foreach (var foo in bar.MoreFoos)
{
yield return foo;
}
}
}
foreach (var baz in _SomeOtherCollection)
{
foreach (var foo in baz.GetNestedFoos())
{
yield return foo;
}
}
}
(This can be simplified using LINQ and better encapsulation but that's not the point of the question.)
In C++11, you can do similar enumerations but AFAIK it requires a visitor pattern instead:
template<typename Action>
void VisitAllFoos(const Action& action)
{
for (auto& child : m_SomeCollection)
{
for (auto& foo : child.FooCollection)
{
action(foo);
}
for (auto& bar : child.BarCollection)
{
for (auto& foo : bar.MoreFoos)
{
action(foo);
}
}
}
for (auto& baz : m_SomeOtherCollection)
{
baz.VisitAllFoos(action);
}
}
Is there a way to do something more like the first, where the function returns a range that can be iterated externally rather than calling a visitor internally?
(And I don't mean by constructing a std::vector<Foo> and returning it -- it should be an in-place enumeration.)
I am aware of the Boost.Range library, which I suspect would be involved in the solution, but I'm not particularly familiar with it.
I'm also aware that it's possible to define custom iterators to do this sort of thing (which I also suspect might be involved in the answer) but I'm looking for something that's easy to write, ideally no more complicated than the examples shown here, and composable (like with _SomeOtherCollection).
I would prefer something that does not require the caller to use lambdas or other functors (since that just makes it a visitor again), although I don't mind using lambdas internally if needed (but would still prefer to avoid them there too).
If I'm understanding your question correctly, you want to perform some action over all elements of a collection.
C++ has an extensive set of iterator operations, defined in the iterator header. Most collection structures, including the std::vector that you reference, have .begin and .end methods which take no arguments and return iterators to the beginning and the end of the structure. These iterators have some operations that can be performed on them manually, but their primary use comes in the form of the algorithm header, which defines several very useful iteration functions.
In your specific case, I believe you want the for_each function, which takes a range (as a beginning to end iterator) and a function to apply. So if you had a function (or function object) called action and you wanted to apply it to a vector called data, the following code would be correct (assuming all necessary headers are included appropriately):
std::for_each(data.begin(), data.end(), action);
Note that for_each is just one of many functions provided by the algorithm header. It also provides functions to search a collection, copy a set of data, sort a list, find a minimum/maximum, and much more, all generalized to work over any structure that has an iterator. And if even these aren't enough, you can write your own by reading up on the operations supported on iterators. Simply define a template function that takes iterators of varying types and document what kind of iterator you want.
template <typename BidirectionalIterator>
void function(BidirectionalIterator begin, BidirectionalIterator end) {
// Do something
}
One final note is that all of the operations mentioned so far also operate correctly on arrays, provided you know the size. Instead of writing .begin and .end, you write + 0 and + n, where n is the size of the array. The trivial zero addition is often necessary in order to decay the type of the array into a pointer to make it a valid iterator, but array pointers are indeed random access iterators just like any other container iterator.
What you can do is writing your own adapter function and call it with different ranges of elements of the same type.
This is a non tested solution, that will probably needs some tweaking to make it compile,but it will give you an idea. It uses variadic templates to move from a collection to the next one.
template<typename Iterator, Args...>
visitAllFoos(std::pair<Iterator, Iterator> collection, Args&&... args)
{
std::for_each(collection.first, collection.second, {}(){ // apply action });
return visitAllFoos(std::forward<Args>(args)...);
}
//you can call it with a sequence of begin/end iterators
visitAllFoos(std::make_pair(c1.begin(), c1,end()), std::make_pair(c2.begin(), c2,end()))
I believe, what you're trying to do can be done with Boost.Range, in particular with join and any_range (the latter would be needed if you want to hide the types of the containers and remove joined_range from the interface).
However, the resulting solution would not be very practical both in complexity and performance - mostly because of the nested joined_ranges and type erasure overhead incurred by any_range. Personally, I would just construct std::vector<Foo*> or use visitation.
You can do this with the help of boost::asio::coroutine; see examples at https://pubby8.wordpress.com/2014/03/16/multi-step-iterators-using-coroutines/ and http://www.boost.org/doc/libs/1_55_0/doc/html/boost_asio/overview/core/coroutine.html.

Cast Exception when performing LINQ on IEnumerable

Let's say I have two classes,
class A
{
}
class B : A
{
}
I have a method which accepts a parameter foo of type IEnumerable<A>;
void AMethod(IEnumerable<A> foo)
{
}
but instead pass in a value of type IEnumerable<B>.
AMethod(new[] { new B() });
This compiles and executes, though at execution foo has been implicitly cast to IEnumerable<B>. Now let's say my IEnumerable<B> contains objects of type A (I don't believe it matters whether they're mixed or all the same). When I call foo.Any() I get an exception:
Unable to cast object of type 'A' to type 'B'
I understand that I can't convert a base class to a subclass, but that's not what I'm trying to do (in fact at this point in execution, I don't even care what type it is). I guess LINQ is inherently trying to make this conversion, probably based on the fact that foo is type IEnumerable<B>. So, it seems as though I need to write two separate methods, one which handles IEnumerable<A> and one which handles IEnumerable<B>. I don't understand why I would need to do this. Any thoughts?
EDIT:
There's some dynamic casting and transformation going on that manages to spit out a IEnumerable<B> populated with 'A's, which until now I thought was impossible too. I'll do my best to translate what's happening leading up to this method call:
protected void SetDynamicData(dynamic data)
{
_data = data;
IsB = typeof(IBInterface).IsAssignableFrom(_data.DataType);
}
...
var foo = IsB ? _data.Cast<B>() : data.Cast<A>();
return BuildJqGridData<A, B, int>(gridModel, foo);
An IEnumerable<B> cannot contain objects of type A because and A is not a B.
I can write this code,
IEnumerable<B> dodgy = (new[] { new A() }).Cast<B>();
It will compile, despite being obviously wrong. The compiler assumes I know what I'm doing. Remember that no item in the IEnumerable sequence has yet been evaluated.
When I write code that evaluates a member of dodgy I get exception because an A is not a B and does not support casting to B.
I can write
IEnumerable<A> fine = new[] { new B() };
no cast is required and everything works fine because a B is an A.
If I do,
var hungrilyDodgy = (new[] { new A() }).Cast<B>().ToList();
the Tolist() will force enumeration and evaluation of the IEnumerable, therefore the exception will be thrown much sooner.

What's the gain I can have with blocks over regular methods?

I am a Java programmer, and I am learning Ruby...
But I don't get where those blocks of codes can give me gain... like what's the purpose of passing a block as an argument ? why not have 2 specialized methods instead that can be reused ?
Why have some code in a block that cannot be reused ?
I would love some code examples...
Thanks for the help !
Consider some of the things you would use anonymous classes for in Java. e.g. often they are used for pluggable behaviour such as event listeners or to parametrize a method that has a general layout.
Imagine we want to write a method that takes a list and returns a new list containing the items from the given list for which a specified condition is true. In Java we would write an interface:
interface Condition {
boolean f(Object o);
}
and then we could write:
public List select(List list, Condition c) {
List result = new ArrayList();
for (Object item : list) {
if (c.f(item)) {
result.add(item);
}
}
return result;
}
and then if we wanted to select the even numbers from a list we could write:
List even = select(mylist, new Condition() {
public boolean f(Object o) {
return ((Integer) o) % 2 == 0;
}
});
To write the equivalent in Ruby it could be:
def select(list)
new_list = []
# note: I'm avoid using 'each' so as to not illustrate blocks
# using a method that needs a block
for item in list
# yield calls the block with the given parameters
new_list << item if yield(item)
end
return new_list
end
and then we could select the even numbers with simply
even = select(list) { |i| i % 2 == 0 }
Of course, this functionality is already built into Ruby so in practice you would just do
even = list.select { |i| i % 2 == 0 }
As another example, consider code to open a file. You could do:
f = open(somefile)
# work with the file
f.close
but you then need to think about putting your close in an ensure block in case an exception occurs whilst working with the file. Instead, you can do
open(somefile) do |f|
# work with the file here
# ruby will close it for us when the block terminates
end
The idea behind blocks is that it is a highly localized code where it is useful to have the definition at the call site. You can use an existing function as a block argument. Just pass it as an additional argument, and prefix it with an &

Resources