Why do we need boost::thread_specific_ptr? - boost

Why do we need boost::thread_specific_ptr, or in other words what can we not easily do without it?
I can see why pthread provides pthread_getspecific() etc. These functions are useful for cleaning up after dead threads, and handy to call from C-style functions (the obvious alternative being to pass a pointer everywhere that points to some memory allocated before the thread was created).
In contrast, the constructor of boost:thread takes a callable class by value, and everything non-static in that class becomes thread local once it is copied. I cannot see why I would want to use boost::thread_specific_ptr in preference to a class member any more than I would want to use a global variable in OOP code.
Do I horribly misunderstand anything? A very brief example would help, please. Many thanks.

thread_specific_ptr simply provides portable thread local data access. You don't have to be managing your threads with Boost.Thread to get value from this. The canonical example is the one cited in the Boost docs for this class:
One example is the C errno variable,
used for storing the error code
related to functions from the Standard
C library. It is common practice (and
required by POSIX) for compilers that
support multi-threaded applications to
provide a separate instance of errno
for each thread, in order to avoid
different threads competing to read or
update the value.

Related

Is it bad to have many global functions?

I'm relatively new to software development, and I'm on my way to completing my first app for the iPhone.
While learning Swift, I learned that I could add functions outside the class definition, and have it accessible across all views. After a while, I found myself making many global functions for setting app preferences (registering defaults, UIAppearance, etc).
Is this bad practice? The only alternate way I could think of was creating a custom class to encapsulate them, but then the class itself wouldn't serve any purpose and I'd have to think of ways to passing it around views.
Global functions: good (IMHO anyway, though some disagree)
Global state: bad (fairly universally agreed upon)
By which I mean, it’s probably a good practice to break up your code to create lots of small utility functions, to make them general, and to re-use them. So long as they are “pure functions”
For example, suppose you find yourself checking if all the entries in an array have a certain property. You might write a for loop over the array checking them. You might even re-use the standard reduce to do it. Or you could write a re-useable function, all, that takes a closure that checks an element, and runs it against every element in the array. It’s nice and clear when you’re reading code that goes let allAboveGround = all(sprites) { $0.position.y > 0 } rather than a for…in loop that does the same thing. You can also write a separate unit test specifically for your all function, and be confident it works correctly, rather than a much more involved test for a function that includes embedded in it a version of all amongst other business logic.
Breaking up your code into smaller functions can also help avoid needing to use var so much. For example, in the above example you would probably need a var to track the result of your looping but the result of the all function can be assigned using let. Favoring immutable variables declared with let can help make your program easier to reason about and debug.
What you shouldn’t do, as #drewag points out in his answer, is write functions that change global variables (or access singletons which amount to the same thing). Any global function you write should operate only on their inputs and produce the exact same results every time regardless of when they are called. Global functions that mutate global state (i.e. make changes to global variables (or change values of variables passed to them as arguments by reference) can be incredibly confusing to debug due to unexpected side-effects they might cause.
There is one downside to writing pure global functions,* which is that you end up “polluting the namespace” – that is, you have all these functions lying around that might have specific relevance to a particular part of your program, but accessible everywhere. To be honest, for a medium-sized application, with well-written generic functions named sensibly, this is probably not an issue. If a function is purely of use to a specific struct or class, maybe make it a static method. If your project really is getting too big, you could perhaps factor out your most general functions into a separate framework, though this is quite a big overhead/learning exercise (and Swift frameworks aren’t entirely fully-baked yet), so if you are just starting out so I’d suggest leaving this for now until you get more confident.
* edit: ok two downsides – member functions are more discoverable (via autocomplete when you hit .)
Updated after discussion with #AirspeedVelocity
Global functions can be ok and they really aren't much different than having type methods or even instance methods on a custom type that is not actually intended to contain state.
The entire thing comes down mostly to personal preference. Here are some pros and cons.
Cons:
They sometimes can cause unintended side effects. That is they can change some global state that you or the caller forgets about causing hard to track down bugs. As long as you are careful about not using global variables and ensure that your function always returns the same result with the same input regardless of the state of the rest of the system, you can mostly ignore this con.
They make code that uses them difficult to test which is important once you start unit testing (which is a definite good policy in most circumstances). It is hard to test because you can't mock out the implementation of a global function easily. For example, to change the value of a global setting. Instead your test will start to depend on your other class that sets this global setting. Being able to inject a setting into your class instead of having to fake out a global function is generally preferable.
They sometimes hint at poor code organization. All of your code should be separable into small, single purpose, logical units. This ensures your code will remain understandable as your code base grows in size and age. The exception to this is truly universal functions that have very high level and reusable concepts. For example, a function that lets you test all of the elements in a sequence. You can also still separate global functions into logical units by separating them into well named files.
Pros:
High level global functions can be very easy to test. However, you cannot ignore the need to still test their logic where they are used because your unit test should not be written with knowledge of how your code is actually implemented.
Easily accessible. It can often be a pain to inject many types into another class (pass objects into an initializer and probably store it as a property). Global functions can often remove this boiler plate code (even if it has the trade off of being less flexible and less testable).
In the end, every code architecture decision is a balance of trade offs each time you go to use it.
I have a Framework.swift that contains a set of common global functions like local(str:String) to get rid of the 2nd parameter from NSLocalize. Also there are a number of alert functions internally using local and with varying number of parameters which makes use of NSAlert as modal dialogs more easy.
So for that purpose global functions are good. They are bad habit when it comes to information hiding where you would expose internal class knowledge to some global functionality.

Go lang global variables without goroutines overwriting

I'm writing a CMS in Go and have a session type (user id, page contents to render, etc). Ideally I'd like that type to be a global variable so I'm not having to propagate it through all the nested functions, however having a global variable like that would obviously mean that each new session would overwrite it's predecessor, which, needlessly to say, would be an epic fail.
Some languages to offer a way of having globals within threads that are preserved within that thread (ie the value of that global is sandboxed within that thread). While I'm aware that Goroutines are not threading, I just wondered if there was a similar method at my disposal or if I'd have to pass a local pointer of my session type down through the varies nested routines.
I'm guessing channels wouldn't do this? From what I can gather (and please correct me if I'm wrong here), but they're basically just a safe way of sharing global variables?
edit: I'd forgotten about this question! Anyhow, an update for anyone who is curious. This question was written back when I was new to Go and the CMS was basically my first project. I was coming from a C background with familiarity with POSIX thread but I quickly realised a better approach was to write the code in a mode functional design with session objects passed down as pointers in function parameters. This gave me both the context-sensitive local scope I was after while also minimizing the amount to data I was copying about. However being a 7 year old project and one that was at the start of my transition to Go, it's fair to say the project could do with a major rewrite anyway as there are a lot of mistakes made. That's a concern for another day though - currently it works and I have enough other projects on the go at.
You'll want to use something like a Context:
http://blog.golang.org/context
Basically, the pattern is to create a Context for each unique thing you want to do. (A web request in your case.) Use context.WithValue to embed multiple variables in the context. Then always pass it as the first parameter to other methods that are doing further work in other goroutines.
Getting the variable you need out of the context is a matter of calling context.Value from within any goroutine. From the above link:
A Context is safe for simultaneous use by multiple goroutines. Code can pass a single Context to any number of goroutines and cancel that Context to signal all of them.
I had an implementation where I was explicitly sending variables as method parameters, and I discovered that embedding these variables using contexts significantly cleaned up my code.
Using a Context also helps because it provides ways to end long-running tasks by using channels, select, and a concept called a "done channel." See this article for a great basic review and implementation:
http://blog.golang.org/pipelines
I'd recommend reading the pipelines article first for a good flavor of how to manage communication among goroutines, then the context article for a better idea of how to level-up and start embedding variables to pass around.
Good luck!
Don't use global variables. Use Go goroutine-local variables.
go-routine Id..
There are already goroutine-local variables: they are called function
arguments, function return values, and local variables.
Russ
If you have more than one user, then wouldn't you need that info for each connection? So I would think that you'd have a struct per connected user. It would be idiomatic Go to pass a pointer to that struct when setting up the worker goroutine, or passing the pointer over a channel.

Is "net/http"'s use of global variables considered a good practice in golang?

The golang package "net/http" uses the global variable DefaultServeMux to register handlers. Is this considered a good practice or even an golang idiom? Is it a global variable after all?
The two main reasons not to use global variables are AFAIK 1) that they add to complexity and 2) are problematic in concurrent programs.
Maybe 1) is not considered important in this case because the developer can choose not to use DefaultServerMux? What about 2)? Are global variables always thread/goroutine safe in Go? Still, I'm surprised that it's used in Go's standard library. I've never seen such practice in other languages / standard libraries.
Is it a global variable after all?
Yes. The variable is defined on root level, which makes it global throughout the package.
However, this is not a global variable which stores all the sensible information
of the net/http package. It is merely a convenience setup which uses the content of
the net/http package to provide an quickstart opportunity to the user.
This also means, that is does not add much complexity.
Is this considered a good practice or even an golang idiom?
IMO, it is good practice to aid the user with the usage of a package.
If you're finding that you could save the user some time by providing a
good default configuration, do so.
However, you should be careful when you're about to export variables.
They should be made ready for concurrent access.
The DefaultServeMux (or better, the underlying ServeMux), for example, is using a mutex to be thread safe.
Are global variables always thread/goroutine safe in Go?
No. Without proper synchronization (mutex, channel, ...), everything that is accessed concurrently is problematic and will most certainly blow everything to bits and pieces.
I've never seen such practice in other languages / standard libraries.
Python's logging module, for example, provides a function to retrieve the root logging object, which one can call methods on to customize the logging behaviour. This could be seen as a global object, as it is mutable and defined in the module.
The globvar is, in this case, as safe and as good choice as the analogue seen in e.g package "log" is.
IOW, claim 1 is as vague as it can get and claim 2 is constrained: sometime/somewhere true, otherwise false == doesn't hold in general even though used just like that.

Reentrancy in Boost

When working with multithreading, I need to make sure that the boost classes I use are reentrant (even when each thread uses its own object instance).
I'm having hard time finding in the documentation of Boost's classes a statement about the reentrancy of the class. Am I missing something here? Are all the classes of Boost reentrant unless explicitly stated otherwise in the documentation? Or is Boost's documentation on reentrancy a gray area?
For example, I couldn't find anywhere in the documentation a statement on the reentrancy of the boost::numeric::ublas∷matrix class. So can I assume it's reentrant or not?
Thanks!
Ofer
Most of Boost is similar to most of the STL and the C++ standard library in that:
Creating two instances of a type in two threads at the same time is OK.
Using two instances of a type in two threads at the same time is OK.
Using a single object in two threads at the same time is often not OK.
But doing read-only operations on one object in two threads is often OK.
There is usually no particular effort taken to "enhance" thread safety, except where there is a particular need to do so, like shared_ptr, Asio, Signals2 (but not Signals), and so on. Parts of Boost that look like value types (such as your matrix example) probably do not have any special thread safety support at all, leaving it up to the application.

Object must be locked to be used?

I was pondering language features and I was wondering if the following feature had been implemented in any languages.
A way of declaring that an object may only be accessed within a Mutex. SO for example in java you would only be able to access an object if it was in a synchrnoised block and in C# a Lock.
A compiler error would ensue if the object was used outside of a Mutex block.
Any thoughts?
UPDATE
I think some people have misunderstood the question, I'm not asking if you can lock objects, I'm asking if there is a mechanism to state at declaration of an object that it may only be accessed from within a lock/synchronised statement.
There are two ways to do that.
Your program either refuses to run a method unless the protecting mutex is locked by the calling thread (that's a runtime check); or it refuses to compile (that's a compile time check).
First way is what C# lock does.
Second method requires a compiler able to evaluate every execution path possible. It's hardly feasible.
In Java you can add the synchronized keyword to a method, but that is only syntactic sugar to wrapping the entire method body in a synchronized(this)-block (for non-static methods).
So for Java there is no language construct that enforces that behavior. You can try to .wait() on this with a zero timeout to ensure that the calling code has acquired the monitor, but that's just checking after-the-fact
In Objective-C, you can use the #property and #synthesize directives to let the compiler generate the code for accessors. By default they are protected by mutex.
Demanding locks on everything as you describe would create the potential for deadlocks, as one might be forced to take a lock sooner than one would otherwise.
That said, there are approaches similar to what you describe - Software Transactional Memory, in particular, avoids the deadlock issue by allowing rollbacks and retries.

Resources