Can goroutines, like Erlang's spawned processes, run across multiple hosts transparently? - go

It is said that if Erlang is configured with the right cookie setting, its processes can be spawned across different machines, and this is transparent to the caller.
Is it possible for goroutines to run like this?

This is not a feature of the language, no. However, because the language gives you no way to inspect goroutines (there is no goroutine ID, and you can't control one goroutine from another the way some other languages allow), nothing in a program depends on where a goroutine actually runs; so as long as you can set up transparent communication mechanisms (for example, channels that work over the network), you can create a similar effect. In fact, Rob Pike, one of the creators of Go, has toyed in the past with a package he called "netchan" to do exactly this, but he couldn't get the semantics right and so hasn't published a finalized version yet. It's definitely something he's still interested in, though, and it would certainly be consistent with the Go approach to abstraction.
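To make the idea concrete, here's a minimal sketch (not Rob Pike's netchan, just an illustration using only the standard library) of what a channel over the network could look like: values sent on a local channel are gob-encoded over a TCP connection and delivered on a channel on the other host. The Msg type, the addresses, and the single-connection handling are illustrative assumptions.

// netchanlike is a toy sketch, not an existing package.
package netchanlike

import (
	"encoding/gob"
	"net"
)

// Msg is a stand-in for whatever you'd want to send between hosts.
type Msg struct {
	From string
	Body string
}

// Send dials addr and forwards every value received on in over the wire.
func Send(addr string, in <-chan Msg) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	enc := gob.NewEncoder(conn)
	for m := range in {
		if err := enc.Encode(m); err != nil {
			return err
		}
	}
	return nil
}

// Receive listens on addr and delivers decoded values on the returned channel.
func Receive(addr string) (<-chan Msg, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	out := make(chan Msg)
	go func() {
		defer close(out)
		defer ln.Close()
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		dec := gob.NewDecoder(conn)
		for {
			var m Msg
			if err := dec.Decode(&m); err != nil {
				return
			}
			out <- m
		}
	}()
	return out, nil
}

Callers on either side just see ordinary channels, which is roughly the transparency the question asks about; what this sketch glosses over (and probably part of why the real semantics are hard to get right) is reconnection, backpressure, and what closing a channel should mean across a network boundary.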

Related

Where is Fyne's thread safety defined?

I was attracted to Fyne (and hence Go) by a promise of thread safety. But now that I'm getting better at reading Go, I'm seeing things that make me believe that the API as a whole is not thread safe and perhaps was never intended to be. So I'm trying to determine what "thread safe" means in Fyne.
I'm looking specifically at
func (l *Label) SetText(text string) {
l.Text = text
l.textProvider.SetText(text) // calls refresh
}
and noting that l.Text is also a string. Assignments in Go are not thread safe, so it seems obvious to me that if two threads fight over the text of a label and both call label.SetText at the same time, I can expect memory corruption.
"But you wouldn't do that", one might say. No, but I am worried about the case of someone editing the content of an Entry while an app thread decides it needs to replace all the Entry's text - this is entirely possible in my app because it supports simultaneous editing by multiple users over a network, so updates to all sorts of widgets come in asynchronously. (Note I don't care what happens if two people edit the same Entry at the same time; someone's changes will be lost and I don't care who's. But it must not result in memory corruption.) Note that one approach I could take would be to have the background thread create an entirely new Entry widget, which would then replace the one in the current Box. But is that thread safe?
It's not that I don't know how to serialize things with channels. But I was hoping that Fyne would eliminate the need for it (a blog post claims it does); and even using channels I can't convince myself that a user meddling with a widget in various ways while some background thread is altering it, hiding it, etc, isn't going to result in crashes. Maybe all that is serialized under the covers and is perfectly safe, but I don't want to find out the hard way that it isn't, because I'll have no way to fix it.
Fyne is clearly pretty new and seems to have tons of promise, but documentation seems light on details. Is more information available somewhere? Have people tried this successfully?
You have found some race conditions here. There are plans to improve this, but the 1.2 release was required to get the new "BaseWidget" in place first - and that was only released a few weeks ago.
Setting fields directly is primarily for setup purposes and so not expected to be used in the way you illustrate. That said, we do want to support it. The base widget will soon introduce something akin to SetFieldsAndRefresh(func()) which will ensure the safety of the code passed and refresh the widget afterward.
There is indeed a race currently within Refresh(). The internal use of channels was designed to remove this, but there are some corners, such as multiple goroutines calling it, that it doesn't cover. This is the area our new BaseWidget code can help with, as it can lock internally and automatically. With that approach, widgets will be thread safe with no changes required of the developer in a future release.
The API so far has made it possible for developers to not worry about threading and work from any goroutine - we do need to work internally to make it safer - you are quite right. https://github.com/fyne-io/fyne/issues/506
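Until that lands, one workaround is to funnel every widget mutation through a single goroutine so that no two goroutines ever touch a widget at the same time. A minimal sketch, assuming you're willing to route all updates yourself; the uiUpdates channel and the Enqueue helper are hypothetical, not part of the Fyne API:

package ui

// uiUpdates carries widget mutations as closures; only one goroutine drains it.
var uiUpdates = make(chan func(), 64)

// StartUpdater drains the queue. Run exactly one of these at startup.
func StartUpdater() {
	go func() {
		for update := range uiUpdates {
			update() // e.g. label.SetText(...), entry.SetText(...)
		}
	}()
}

// Enqueue schedules a widget mutation from any goroutine.
func Enqueue(update func()) {
	uiUpdates <- update
}

A network goroutine would then call ui.Enqueue(func() { label.SetText("updated remotely") }) instead of touching the label directly, where label is whatever widget needs changing.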

Is there parallelism in Elm?

Is it possible to write parallel code in Elm? Elm is purely functional, so no locking is needed. Of course, I could use the JavaScript FFI, spawn workers there, and do it on my own. But I want a more user-friendly way of doing this.
Short answer
No, not currently. But the next release (0.15) will have new ways to handle effects inside Elm, so you will need ports + JavaScript code less often. There may well be a way to spawn workers inside Elm in the next version.
More background
If you're feeling adventurous, try reading the published paper on Elm (or the longer original thesis), which shows that the original flavour of FRP that Elm uses is well suited for fine-grained concurrency. There is also an async construct which can potentially make part of the program run separately in a more coarse-grained manner. That could be supported with OS-level threads (like JS Web Workers) for parallelism.
There have been earlier experiments with Webworkers. There is certainly an interest in concurrency within the community, but JavaScript doesn't offer (m)any great options for concurrency.
For reading tips on the paper, here's a post of mine from the elm-discuss mailing list:
If you want to know more about signals and opt-in async, I suggest you try Evan's PLDI paper on Elm. Read from the introduction (1) up to building GUIs (4). You can skip the type system (3.2) and functional evaluation (3.3.1), which may save you some time. Most of what's in and after building GUIs (4) is probably stuff you know already. Figure 8 is probably the best overview of what the async keyword does (note that the async keyword is not implemented in the current Elm compiler).

What's the best Erlang approach to being able to identify a process's identity from its process id?

When I'm debugging, I'm usually looking at about 5000 processes, each of which could be one of about 100 gen_servers, fsms, etc. If I want to know WHAT an Erlang process is, I can do:
process_info(pid(0,1,0), initial_call).
And get a result like:
{initial_call,{proc_lib,init_p,5}}
...which is all but useless.
More recently, I hit upon the idea (brace yourselves) of registering each process with a name that told me WHO that process represented. For example, player_1150 is the player process that represents player 1150. Yes, I end up making a couple million atoms over the course of a week-long run. (And I would love to hear comments on the drawbacks of boosting the limit to 10,000,000 atoms when my system runs with about 8GB of real memory unused, if there are any.) Doing this meant that I could, at the console of a live system, query all processes for how long their message queue was, find the top offenders, then check to see if those processes were registered and print out the atom they were registered with.
I've hit a snag with this: I'm moving processes from one node to another. Now a player process can have 3 different names: player_1158, player_1158_deprecating, player_1158_replacement. And I have to make absolutely sure I register and unregister these names with precision timing to make sure that a process is always named and that the appropriate names always exist, AND that I don't try to register a name that some dying process already holds. There is some slop room, since this is only used for console debugging of a live system. Nonetheless, the moment I started feeling like this mechanism was affecting how I develop the system (the one that moves processes around), I felt like it was time to do something else.
There are two ideas on the table for me right now. The first is an ets table that associates process ids with their descriptions:
ets:insert(descriptions, {self(), {player, 1158}}).  % 'descriptions' being whatever named table you keep for this
I don't really like that one because I have to manually keep the table clean. When a player exits (or crashes), someone is responsible for making sure that his data are removed from the ets table.
The second alternative is to use the process dictionary, storing similar information. When my exploration of a live system leads me to wonder who a process is, I can just look at its process dictionary using process_info.
I realize that none of these solutions is functionally clean, but given that the system itself is never, EVER the consumer of these data, I'm not too worried about it. I need certain debugging tools to work quickly and easily, so the behavior described is not open for debate. Are there any convincing arguments to go one way or another (other than the academic "don't use the process dictionary, it's evil" canned garbage)? I'd be happy to hear other suggestions and their justifications.
You should try out gproc; it's a very convenient application for keeping process metadata.
A process can be registered with several names, and you can associate arbitrary properties with a process (where the key and value can be any Erlang term). gproc also monitors the registered processes and unregisters them automatically if they crash.
If you're debugging gen_servers and gen_fsms while they're still running, I would implement the handle_info functions for these behaviors. When you send each process a {get_info, ReplyPid} tuple, the process in question can send back a term describing its own state, what it is, etc. That way you don't have to keep track of this information outside of the process itself.
Isac mentions there is already a built-in way to do this.

Best form of IPC for a decentralized roguelike?

I've got a project to create a roguelike that in some way abstracts the UI from the engine and the engine from map creation, line-of-sight, etc. To narrow the focus, I first want to just get the UI (the player's client) and engine working.
My current idea is to make the client basically a program that decides what one character (player or monster) will do for its turn and then waits until it can move again. So each monster has a client, and so does the player. The player's client prints the map, waits for input, sends it to the engine, and tells the player what happened. A monster's client does the same, except without printing the map and using AI instead of keyboard input.
Before I go any further: if this seems a somewhat obfuscated way of doing things, my goal is to learn, not to write a roguelike. It's the journey, not the destination.
And so I need to choose what form of IPC fits this model best.
My first attempt used pipes, because they're simplest. I wrote a UI for the player and a program to pipe in instructions such as where to put the map and player. While this works, it only allows one client, since everything communicates through stdin and stdout.
I've also thought about making the engine a daemon that watches a spool directory, where clients, when started, create unique-per-client temp files to give instructions to the engine and receive feedback.
Lastly, I've done a little introductory programming with sockets. They seem like they might be the way to go, and would allow the game to perhaps someday be run over a network. But I'd like to use a simpler solution if possible, since I'm unfamiliar with sockets and that makes them more error prone for me.
I'm always open to suggestions.
I've been playing around with using these combinations for a similar problem (multiple clients talking via a single daemon on the local box, with much of the intelligence shoved off into the clients).
- mmap for sharing large data blobs, with Unix domain sockets, message queues, or named pipes for notification
- the same, but using individual files per blob instead of munging them all together in an mmap
- the same, but without the files or mmap (in other words, more like conventional messaging)
In general I like the idea of breaking things up into separate executables this way -- it certainly makes testing easier, for instance. I think the choice of method comes down to usage patterns -- how large are messages, how persistent does the data in them need to be, can you afford the cost of multiple trips through the network stack for a socket-based message, that sort of thing. The fact that you're sticking to Linux makes things easy in terms of what's available -- you don't need to worry about portability of message queues, for instance.
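If the socket option wins out, it needn't be much code. Here's a minimal sketch in Go (the question doesn't pin down a language) of an engine daemon listening on a Unix domain socket with one connection per client; the socket path and the line-oriented protocol are illustrative assumptions:

package main

import (
	"bufio"
	"fmt"
	"log"
	"net"
)

const socketPath = "/tmp/roguelike.sock"

func main() {
	ln, err := net.Listen("unix", socketPath)
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Print(err)
			continue
		}
		// Each client (the player's UI or a monster's AI) gets its own connection.
		go handleClient(conn)
	}
}

// handleClient reads one command per line and replies with the result of that turn.
func handleClient(conn net.Conn) {
	defer conn.Close()
	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		cmd := scanner.Text() // e.g. "move north"
		// ...apply the command to the engine's state here...
		fmt.Fprintf(conn, "ok: %s\n", cmd)
	}
}

Swapping "unix" for "tcp" (and the path for a host:port) is all it takes to get the over-the-network play mentioned in the question, which is the usual argument for sockets over spool files.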
This one's also applicable: https://stackoverflow.com/a/1428542/1264797

NSThread or Python's threading module in PyObjC?

I need to make some network-bound calls (e.g., fetch a website) and I don't want them to block the UI. Should I be using NSThread or Python's threading module if I am working in PyObjC? I can't find any information on how to choose one over the other. Note, I don't really care about Python's GIL since my tasks are not CPU bound at all.
It will make no difference; you will get the same behavior with slightly different interfaces. Use whichever fits best into your system.
Learn to love the run loop. Use Cocoa's URL-loading system (or, if you need plain sockets, NSFileHandle) and let it call you when the response (or failure) comes back. Then you don't have to deal with threads at all (the URL-loading system will use a thread for you).
Pretty much the only time to create your own threads in Cocoa is when you have a large task (>0.1 sec) that you can't break up.
(Someone might say NSOperation, but NSOperationQueue is broken and RAOperationQueue doesn't support concurrent operations. Fine if you already have a bunch of NSOperationQueue code or really want to prepare for a working NSOperationQueue, but if you need concurrency now, it's the run loop or threads.)
I'm more fond of the native Python threading solution since I can join threads and keep references to them. AFAIK, NSThread doesn't support joining and cancelling, and you can get a variety of things done with Python threads.
Also, it's a bummer that NSThread entry points can't take multiple arguments, and though there are workarounds for this (like using NSDictionary and NSArray objects), it's still not as elegant or as simple as invoking a thread with its arguments laid out in order as corresponding parameters.
But yeah, if the situation demands that you use NSThread, there shouldn't be any problem at all. Otherwise, it's cool to stick with native Python threads.
I have a different suggestion, mainly because Python threading is just plain awful due to the GIL (Global Interpreter Lock), especially when you have more than one CPU core. There is a video presentation that goes into this in excruciating detail, but I cannot find it right now - it was done by a Google employee.
Anyway, you may want to think about using the subprocess module instead of threading (have a helper program that you can execute, or use another binary on the system). Or use NSThread; it should give you more performance than what you can get with CPython threads.
