Look at this statement, taken from an article discussing the examples from Tony Hoare's seminal 1978 paper:
Go's design was strongly influenced by Hoare's paper. Although Go differs significantly from the example language used in the paper, the examples still translate rather easily. The biggest difference apart from syntax is that Go models the conduits of concurrent communication explicitly as channels, while the processes of Hoare's language send messages directly to each other, similar to Erlang. Hoare hints at this possibility in section 7.3, but with the limitation that "each port is connected to exactly one other port in another process", in which case it would be a mostly syntactic difference.
I'm confused.
Processes in Hoare's language communicate directly with each other. Goroutines also communicate with each other, but they do so via channels.
So what impact does this limitation have in Go? What is the real difference?
The answer requires a fuller understanding of Hoare's work on CSP. The progression of his work can be summarised in three stages:
Based on Dijkstra's semaphores, Hoare developed monitors. Monitors are what Java uses, except that Java's implementation contains a mistake (see Welch's article "Wot, No Chickens?"). It's unfortunate that Java ignored Hoare's later work.
CSP grew out of this. Initially, CSP required direct exchange from process A to process B. This rendezvous approach is used by Ada and Erlang.
CSP was completed by 1985, when his book was first published. This final version of CSP includes channels as used in Go. Along with Hoare's team at Oxford, David May concurrently developed Occam, a language deliberately intended to blend CSP into a practical programming language. CSP and Occam influenced each other (for example in The Laws of Occam Programming). For years, Occam was only available on the Transputer processor, whose architecture was tailored to suit CSP. More recently, Occam has been developed to target other processors and has also absorbed the Pi calculus, along with other general synchronisation primitives.
So, to answer the original question, it is probably helpful to compare Go with both CSP and Occam.
Channels: CSP, Go and Occam all have the same semantics for channels. In addition, Go makes it easy to add buffering into channels (Occam does not).
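To make the difference concrete, here is a minimal Go sketch (the names are arbitrary): an unbuffered channel forces the CSP/Occam rendezvous, while a buffered channel lets the sender run ahead by up to the buffer size.

    package main

    import "fmt"

    func main() {
        // Unbuffered channel: a send blocks until a receiver is ready (the CSP/Occam rendezvous).
        rendezvous := make(chan string)
        go func() { rendezvous <- "hello" }()
        fmt.Println(<-rendezvous)

        // Buffered channel: the sender may run ahead by up to the buffer size before blocking.
        buf := make(chan int, 2)
        buf <- 1
        buf <- 2 // both sends complete without any receiver running yet
        fmt.Println(<-buf, <-buf)
    }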
Choices: CSP defines both the internal and external choice. However, both Go and Occam have a single kind of selection: select in Go and ALT in Occam. The fact that there are two kinds of CSP choice proved to be less important in practical languages.
Occam's ALT allows condition guards, but Go's select does not (there is a workaround: channel aliases can be set to nil to imitate the same behaviour).
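A rough sketch of that workaround, with invented channel names: assigning nil to a local copy of the channel disables its case in the select, which imitates an Occam-style condition guard.

    package main

    import "fmt"

    func main() {
        a := make(chan int, 1)
        b := make(chan int, 1)
        a <- 1
        b <- 2

        allowA := false // the "guard": while false, the a-case must not be taken
        for i := 0; i < 2; i++ {
            aCase := a
            if !allowA {
                aCase = nil // a nil channel never communicates, so this case is effectively disabled
            }
            select {
            case v := <-aCase:
                fmt.Println("from a:", v)
            case v := <-b:
                fmt.Println("from b:", v)
                allowA = true
            }
        }
    }

The first iteration can only take the b case; once the guard is enabled, the second iteration takes the a case.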
Mobility: Go allows channel ends to be sent (along with other data) via channels. This creates a dynamically-changing topology and goes beyond what is possible in CSP, but Milner's Pi calculus was developed (out of his CCS) to describe such networks.
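As a small illustration of that mobility (all names invented), a reply channel is sent inside a request message, so the receiver learns where to send its answer at run time:

    package main

    import "fmt"

    // request carries the channel on which the answer should be sent back.
    type request struct {
        x     int
        reply chan int
    }

    func main() {
        requests := make(chan request)

        // The server learns where to send its answer from the message itself.
        go func() {
            for r := range requests {
                r.reply <- r.x * r.x
            }
        }()

        reply := make(chan int)
        requests <- request{x: 7, reply: reply}
        fmt.Println(<-reply) // 49
    }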
Processes: A goroutine is a forked process; it terminates when it wants to and it doesn't have a parent. This is less like CSP / Occam, in which processes are compositional.
An example will help here: firstly Occam (n.b. indentation matters)
SEQ
  PAR
    processA()
    processB()
  processC()
and secondly Go
go processA()
go processB()
processC()
In the Occam case, processC doesn't start until both processA and processB have terminated. In Go, processA and processB fork very quickly, then processC runs straightaway.
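If the Occam-style behaviour is wanted in Go (processC must not start until processA and processB have both terminated), it has to be expressed explicitly. One common sketch uses sync.WaitGroup; the trivial stand-in bodies below are just for illustration.

    package main

    import (
        "fmt"
        "sync"
    )

    func processA() { fmt.Println("A") }
    func processB() { fmt.Println("B") }
    func processC() { fmt.Println("C") }

    func main() {
        var wg sync.WaitGroup
        wg.Add(2)
        go func() { defer wg.Done(); processA() }()
        go func() { defer wg.Done(); processB() }()
        wg.Wait() // corresponds to the end of the Occam PAR
        processC()
    }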
Shared data: CSP is not really concerned with data directly. But it is interesting to note there is an important difference between Go and Occam concerning shared data. When multiple goroutines share a common set of data variables, race conditions are possible; Go's excellent race detector helps to eliminate problems. But Occam takes a different stance: shared mutable data is prevented at compilation time.
Aliases: related to the above, Go allows many pointers to refer to each data item. Such aliases are disallowed in Occam, so reducing the effort needed to detect race conditions.
The latter two points are less about Hoare's CSP and more about May's Occam. But they are relevant because they directly concern safe concurrent coding.
That's exactly the point: in the example language used in Hoare's initial paper (and also in Erlang), process A talks directly to process B, while in Go, goroutine A talks to channel C and goroutine B listens to channel C. I.e. in Go the channels are explicit while in Hoare's language and Erlang, they are implicit.
See this article for more info.
Recently I've been working quite intensively with Go's channels, and I've worked with concurrency and parallelism for many years, although I could never profess to know everything about this.
I think what you're asking is what's the subtle difference between sending a message to a channel and sending directly to each other? If I understand you, the quick answer is simple.
Sending to a channel gives the opportunity for parallelism / concurrency on both sides of the channel. Beautiful, and scalable.
We live in a concurrent world. Sending a long continuous stream of messages from A to B (asynchronously) means that B will need to process the messages at pretty much the same pace as A sends them, unless more than one instance of B has the opportunity to process a message taken from the channel, hence sharing the workload.
The good thing about channels is that you can have a number of producer/receiver goroutines which are able to push messages onto the queue, or consume from the queue and process them accordingly.
If you think linearly, like a single-core CPU, concurrency is basically like having a million jobs to do. A single-core CPU can only do one thing at a time, and yet it gives the illusion that lots of things are happening at once. When executing some code, the OS often has to wait for something to come back from the network, disk, keyboard, mouse, etc., or for a process that sleeps for a while; that wait gives the OS the opportunity to do something else in the meantime. This all happens extremely quickly, creating the illusion of parallelism.
Parallelism, on the other hand, is different in that a job can run on a completely different CPU, independent of what's going on with the other CPUs, and therefore doesn't run under the same constraints (although most OSes do a pretty good job of distributing workloads evenly across all of their CPUs, with the possible exception of CPU-hungry, uncooperative code that never yields to the OS, and even then the OS tames it).
The point is, having multi-core CPUs means more parallelism and more concurrency can occur.
Imagine a single queue at a bank which fans-out to a number of tellers who can help you. If no customers are being served by any teller, one teller elects to handle the next customer and becomes busy, until they all become busy. Whenever a customer walks away from a teller, that teller is able to handle the next customer in the queue.
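In Go terms, the teller analogy is roughly a worker pool reading from one shared channel; here is a minimal sketch with invented names and sizes.

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        customers := make(chan int)
        var wg sync.WaitGroup

        // Three "tellers" share one queue; whichever is free takes the next customer.
        for teller := 1; teller <= 3; teller++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                for c := range customers {
                    fmt.Printf("teller %d serves customer %d\n", id, c)
                }
            }(teller)
        }

        for c := 1; c <= 10; c++ {
            customers <- c
        }
        close(customers)
        wg.Wait()
    }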
Does the order in which goroutines block on a channel determine the order in which they will unblock? I'm not concerned with the order of the messages that are sent (they're guaranteed to be ordered), but with the order in which the goroutines unblock.
Imagine an empty channel ch shared between multiple goroutines (1, 2, and 3), with each goroutine trying to receive a message on ch. Since ch is empty, each goroutine will block. When I send a message to ch, will goroutine 1 unblock first? Or could 2 or 3 possibly receive the first message? (Or vice versa, with the goroutines trying to send.)
I have a playground that seems to suggest that the order in which Goroutines block is the order in which they are unblocked, but I'm not sure if this is an undefined behavior because of the implementation.
This is a good question - it touches on some important issues when doing concurrent design. As has already been stated, the answer to your specific question is, according to the current implementation, FIFO based. It's unlikely ever to be different, except perhaps if the implementers decided, say, a LIFO was better for some reason.
There is no guarantee, though. So you should avoid creating code that relies on a particular implementation.
The broader question concerns non-determinism, fairness and starvation.
Perhaps surprisingly, non-determinism in a CSP-based system does not come from things happening in parallel. It is made possible by concurrency, but it is not caused by concurrency. Instead, non-determinism arises when a choice is made. In the formal algebra of CSP, this is modelled mathematically. Fortunately, you don't need to know the maths to be able to use Go. But formally, two goroutines could execute in parallel and the outcome could still be deterministic, provided all the choices are eliminated.
Go allows choices that introduce non-determinism explicitly via select and implicitly via ends of channels being shared between goroutines. If you have point-to-point (one reader, one writer) channels, the second kind does not arise. So if it's important in a particular situation, you have a design choice you can make.
Fairness and starvation are typically opposite sides of the same coin. Starvation is one of those dynamic problems (along with deadlock, livelock and race conditions) that result perhaps in poor performance, more likely in wrong behaviour. These dynamic problems are un-testable (more on this) and need some level of analysis to solve. Clearly, if part of a system is unresponsive because it is starved of access to certain resources, then there is a need for greater fairness in governing those resources.
Shared access to channel ends may well provide a degree of fairness because of the current FIFO behaviour and this may appear sufficient. But if you want it guaranteed (regardless of implementation uncertainties), it is possible instead to use a select and a bundle of point-to-point channels in an array. Fair indexing is easy to achieve by always preferring them in an order that puts the last-selected at the bottom of the pile. This solution can guarantee fairness, but probably with a small performance penalty.
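Here is one rough sketch of that idea (not the only way to do it): poll an array of channels round-robin, starting just after the channel that was served last, so no channel can be starved. For brevity it busy-waits, which a real implementation would avoid.

    package main

    import "fmt"

    // fairReceive polls the channels round-robin, starting just after the one
    // that was served last. It spins until some channel has a value.
    func fairReceive(chans []chan int, last int) (value, index int) {
        for {
            for i := 1; i <= len(chans); i++ {
                idx := (last + i) % len(chans)
                select {
                case v := <-chans[idx]:
                    return v, idx
                default: // not ready, try the next channel
                }
            }
        }
    }

    func main() {
        chans := make([]chan int, 3)
        for i := range chans {
            chans[i] = make(chan int, 1)
            chans[i] <- i * 10 // every channel is ready
        }
        last := -1
        for n := 0; n < 3; n++ {
            v, idx := fairReceive(chans, last)
            fmt.Printf("got %d from channel %d\n", v, idx)
            last = idx
        }
    }

Even though every channel is ready on every call, each one gets its turn.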
(aside: see "Wot No Chickens" for a somewhat-amusing discovery made by researchers in Canterbury, UK concerning a fairness flaw in the Java Virtual Machine - which has never been rectified!)
I believe it's unspecified because the memory model document only says "A send on a channel happens before the corresponding receive from that channel completes." The spec sections on send statements and the receive operator don't say anything about what unblocks first. Right now the gc toolchain uses an orderly FIFO queue to control which goroutine unblocks, but I don't see any promises in the spec that it must always be so.
(Just for general background note that Playground code runs with GOMAXPROCS=1, i.e., on one core, so some types of concurrency-related unpredictability just won't come up.)
The order is not specified, but current implementations use a FIFO queue for waiting goroutines.
The authoritative document is the Go Memory Model. The memory model does not define a happens-before relationship for two goroutines sending to the same channel, therefore the order is not specified. Ditto for receive.
I have a number of processes (of the order of 100 to 1000), and each of them has to send some data to some (say about 10) of the other processes. (Typically, but not necessarily always, if A sends to B, B also sends to A.) Every process knows how much data it has to receive from which process.
So I could just use MPI_Alltoallv, with many or most of the message lengths zero.
However, I heard that for performance reasons it would be better to use several MPI_send and MPI_recv communications rather than the global MPI_Alltoallv.
What I do not understand: if a series of send and receive calls are more efficient than one Alltoallv call, why is Alltoallv not just implemented as a series of sends and receives?
It would be much more convenient for me (and others?) to use just one global call. Also I might have to be concerned about not running into a deadlock situation with several Send and Recv (fixable by some odd-even strategy or more complex? or by using buffered send/recv?).
Would you agree that MPI_Alltoallv is necessarily slower than, say, 10 MPI_Send and MPI_Recv calls; and if yes, why and by how much?
Usually the default advice with collectives is the opposite: use a collective operation when possible instead of coding your own. The more information the MPI library has about the communication pattern, the more opportunities it has to optimize internally.
Unless special hardware support is available, collective calls are in fact implemented internally in terms of sends and receives. But the actual communication pattern will probably not be just a series of sends and receives. For example, using a tree to broadcast a piece of data can be faster than having the same rank send it to a bunch of receivers. A lot of work goes into optimizing collective communications, and it is difficult to do better.
Having said that, MPI_Alltoallv is somewhat different. It can be difficult to optimize for all irregular communication scenarios at the MPI level, so it is conceivable that some custom communication code can do better. For example, an implementation of MPI_Alltoallv might be synchronizing: it could require that all processes "check in", even if they have to send a 0-length message. I thought that such an implementation was unlikely, but here is one in the wild.
So the real answer is "it depends". If the library implementation of MPI_Alltoallv is a bad match for the task, custom communication code will win. But before going down that path, check if the MPI-3 neighbor collectives are a good fit for your problem.
I am creating a simple intrusion detection system for an Information Security course using jpcap.
One of the features will be remote OS detection, in which I must implement an algorithm that detects when a host sends 5 packets within 20 seconds that have different ACK, SYN, and FIN combinations.
What would be a good method of detecting these different "combinations"? A brute-force algorithm would be time-consuming to implement, but I can't think of a better method.
Notes: jpcap's API allows one to know if the packet is ACK, SYN, and/or FIN. Also note that one doesn't need to know what ACK, SYN, and FIN are in order to understand the problem.
Thanks!
I built my own data structure based on vectors that hold "records" about the type of packet.
You need to keep state for each session, using hashtables. Keep each SYN, ACK and FIN/FIN-ACK. I wrote an open-source IDS sniffer a few years ago that does this; feel free to look at the code. It should be very easy to write an algorithm to do passive OS detection (google it). My open-source code is here: dnasystem
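The question is about jpcap in Java, but the idea is language-agnostic; here is a hedged sketch in Go of one way to avoid brute force: encode each packet's SYN/ACK/FIN bits as a 3-bit value and keep a per-host set of the distinct values seen within a 20-second window (the window handling is simplified to a fixed window per host, and all names are invented).

    package main

    import (
        "fmt"
        "time"
    )

    // flagCombo encodes the SYN/ACK/FIN bits of one packet as a small integer (0..7).
    func flagCombo(syn, ack, fin bool) int {
        c := 0
        if syn {
            c |= 1
        }
        if ack {
            c |= 2
        }
        if fin {
            c |= 4
        }
        return c
    }

    type hostState struct {
        windowStart time.Time
        combos      map[int]bool // distinct combinations seen in the current window
    }

    var hosts = map[string]*hostState{}

    // observe returns true when a host has shown 5 distinct flag combinations within 20 seconds.
    func observe(srcIP string, syn, ack, fin bool, now time.Time) bool {
        h := hosts[srcIP]
        if h == nil || now.Sub(h.windowStart) > 20*time.Second {
            h = &hostState{windowStart: now, combos: map[int]bool{}}
            hosts[srcIP] = h
        }
        h.combos[flagCombo(syn, ack, fin)] = true
        return len(h.combos) >= 5
    }

    func main() {
        now := time.Now()
        fmt.Println(observe("10.0.0.1", true, false, false, now)) // SYN
        fmt.Println(observe("10.0.0.1", true, true, false, now))  // SYN+ACK
        fmt.Println(observe("10.0.0.1", false, true, false, now)) // ACK
        fmt.Println(observe("10.0.0.1", false, true, true, now))  // FIN+ACK
        fmt.Println(observe("10.0.0.1", false, false, true, now)) // FIN -> fifth distinct combo, returns true
    }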
What kind of programming problems are state machines most suited for?
I have read about parsers being implemented using state machines, but would like to find out about problems that scream out to be implemented as a state machine.
The easiest answer is probably that they are suited for practically any problem. Don't forget that a computer itself is also a state machine.
Regardless of that, state machines are typically used for problems where there is some stream of input and the activity that needs to be done at a given moment depends on the last elements seen in that stream at that point.
Examples of this stream of input: some text file in the case of parsing, a string for regular expressions, events such as player entered room for game AI, etc.
Examples of activities: be ready to read a number (after another number followed by a + has appeared in the input in a parser for a calculator), turn around (after the player approached and then sneezed), perform a jumping kick (after the player pressed left, left, right, up, up).
A good resource is this free State Machine EBook. My own quick answer is below.
When your logic must contain information about what happened the last time it was run, it must contain state.
So a state machine is simply any code that remembers (or acts on) information that can only be gained by understanding what happened before.
For instance, I have a cellular modem that my program must use. It has to perform the following steps in order:
reset the modem
initiate communications with the modem
wait for the signal strength to indicate a good connection with a tower
...
Now I could block the main program and simply go through all these steps in order, waiting for each to run, but I want to give my user feedback and perform other operations at the same time. So I implement this as a state machine inside a function, and run this function 100 times a second.
enum states {reset, initsend, initresponse, waitonsignal, dial, ppp, ...};

modemfunction()
{
    static enum states currentstate = reset;
    enum states nextstate;

    switch (currentstate)
    {
    case reset:
        /* do the reset */
        if (reset succeeded) nextstate = initsend; else nextstate = reset;
        break;
    case initsend:
        send "ATD";
        nextstate = initresponse;
        break;
    ...
    }
    currentstate = nextstate;
}
More complex state machines implement protocols. For instance, an ECU diagnostics protocol I used can only send 8-byte packets, but sometimes I need to send bigger packets. The ECU is slow, so I need to wait for a response. Ideally, when I send a message I use one function and then don't care what happens, but somewhere my program must monitor the line, send and respond to these messages, breaking them up into smaller pieces and reassembling the pieces of received messages into the final message.
Stateful protocols such as TCP are often represented as state machines. However it's rare that you should want to implement anything as a state machine proper. Usually you will use a corruption of one, i.e. have it carrying out a repeated action while sitting in one state, logging data while it transitions, or exchanging data while remaining in one state.
Objects in games are often represented as state machines. An AI character might be:
Guarding
Aggressive
Patrolling
Asleep
So you can see these might model some simple but effective states. Of course you could probably make a more complex continuous system.
Another example would be a process such as making a purchase on Google Checkout. Google gives a number of states for Financial and Order, and then informs you of transitions such as the credit card clearing or getting rejected, and allows you to inform it that the order has been shipped.
Regular expression matching, Parsing, Flow control in a complex system.
Regular expressions are a simple form of state machine, specifically finite automata. They have a natural representation as such, although it is possible to implement them using mutually recursive functions.
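For instance, the regular expression ab*c corresponds to a tiny automaton; here is a sketch of running it directly as a state machine (illustrative only, not a general regex engine).

    package main

    import "fmt"

    // matches reports whether s is accepted by the automaton for the pattern ab*c.
    func matches(s string) bool {
        state := 0 // 0: expect 'a', 1: expect 'b' or 'c', 2: accepted
        for _, r := range s {
            switch {
            case state == 0 && r == 'a':
                state = 1
            case state == 1 && r == 'b':
                state = 1
            case state == 1 && r == 'c':
                state = 2
            default:
                return false // no transition: reject
            }
        }
        return state == 2
    }

    func main() {
        fmt.Println(matches("abbbc")) // true
        fmt.Println(matches("ac"))    // true
        fmt.Println(matches("abc x")) // false
    }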
State machines, when implemented well, are very efficient.
There is an excellent state machine compiler for a number of target languages, if you want to make a readable state machine.
http://research.cs.queensu.ca/~thurston/ragel/
It also allows you to avoid the dreaded 'goto'.
AI in games is very often implemented using State Machines.
Helps create discrete logic that is much easier to build and test.
Workflow (see WF in .net 3.0)
They have many uses, parsers being a notable one. I have personally used simplified state machines to implement complex multi-step task dialogs in applications.
A parser example. I recently wrote a parser that takes a binary stream from another program. The meaning of the current element parsed indicates the size/meaning of the next elements. There are a (small) finite number of elements possible. Hence a state machine.
They're great for modelling things that change status, and have logic that triggers on each transition.
I'd use finite state machines for tracking packages by mail, or to keep track of the different states of a user during the registration process, for example.
As the number of possible status values goes up, the number of transitions explodes. State machines help a lot in that case.
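A rough sketch of the package-tracking idea with an explicit transition table (the states and transitions are invented for illustration):

    package main

    import "fmt"

    type status string

    // allowed lists the legal transitions; anything not listed is rejected.
    var allowed = map[status][]status{
        "created":    {"shipped"},
        "shipped":    {"in transit"},
        "in transit": {"delivered", "returned"},
    }

    func transition(from, to status) error {
        for _, next := range allowed[from] {
            if next == to {
                return nil
            }
        }
        return fmt.Errorf("illegal transition %q -> %q", from, to)
    }

    func main() {
        fmt.Println(transition("created", "shipped"))   // <nil>
        fmt.Println(transition("created", "delivered")) // illegal transition
    }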
Just as a side note, you can implement state machines with proper tail calls like I explained in the tail recursion question.
In that example, each room in the game is considered one state.
Also, Hardware design with VHDL (and other logic synthesis languages) uses state machines everywhere to describe hardware.
Any workflow application, especially with asynchronous activities. You have an item in the workflow in a certain state, and the state machine knows how to react to external events by placing the item in a different state, at which point some other activity occurs.
The concept of state is very useful for applications that need to "remember" the current context of your system and react properly when a new piece of information arrives. Any non-trivial application has that notion embedded in the code through variables and conditionals.
So if your application has to react differently every time it receives a new piece of information, because of the context you are in, you could model your system with a state machine. An example would be how to interpret the keys on a calculator, which depends on what you are processing at that point in time.
On the contrary, if your computation does not depend on the context but solely on the input (like a function adding two numbers), you will not need a state machine (or, better said, you will have a state machine with zero states).
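As a toy sketch of the calculator point (entirely invented): the same digit keys mean different things depending on whether the machine is building the first or the second operand.

    package main

    import "fmt"

    // state of a very small calculator that only understands "<digits> + <digits> =".
    type calcState int

    const (
        firstOperand  calcState = iota // digits extend the first number
        secondOperand                  // digits extend the second number
    )

    func main() {
        state := firstOperand
        a, b := 0, 0
        for _, key := range "12+34=" {
            switch {
            case key >= '0' && key <= '9' && state == firstOperand:
                a = a*10 + int(key-'0')
            case key >= '0' && key <= '9' && state == secondOperand:
                b = b*10 + int(key-'0')
            case key == '+':
                state = secondOperand // the same digit keys now mean something different
            case key == '=':
                fmt.Println(a + b) // prints 46
            }
        }
    }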
Some people design the whole application in terms of state machines, since they capture the essential things to keep in mind in your project, and then use some procedure or autocoders to make them executable. It takes a paradigm change to program in this way, but I found it very effective.
Things that comes to mind are:
Robot/Machine manipulation... those robot arms in factories
Simulation Games, (SimCity, Racing Game etc..)
Generalizing: when you have a stream of inputs and handling any one of them requires knowledge of the previous inputs; in other words, when processing any single input requires knowledge of what came before (that is, the system needs to have "states").
Not much that I know of that isn't reducible to a parsing problem though.
If you need a simple stochastic process, you might use a Markov chain, which can be represented as a state machine (given the current state, at the next step the chain will be in state X with a certain probability).
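For example, here is a minimal sketch of a two-state Markov chain driven as a state machine (the states and probabilities are made up):

    package main

    import (
        "fmt"
        "math/rand"
    )

    // next picks the following state according to the transition probabilities
    // of the current state.
    func next(state string, r *rand.Rand) string {
        transitions := map[string]map[string]float64{
            "sunny": {"sunny": 0.8, "rainy": 0.2},
            "rainy": {"sunny": 0.4, "rainy": 0.6},
        }
        p := r.Float64()
        for s, prob := range transitions[state] {
            if p < prob {
                return s
            }
            p -= prob
        }
        return state
    }

    func main() {
        r := rand.New(rand.NewSource(1))
        state := "sunny"
        for i := 0; i < 5; i++ {
            state = next(state, r)
            fmt.Println(state)
        }
    }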