dead-lock free vs. starvation free - algorithm

Can it happen that a mutual exclusion algorithm doesn't maintain the deadlock-freedom property, but does maintain starvation-freedom?
Thank you

Starvation-freedom can be defined as: Whatever the process p, each invocation of acquire_mutex() issued by p eventually terminates.
OR: Any process trying to enter the critical section will eventually enter the critical section.
Deadlock-freedom: Whatever the time T , if before T one or several processes
have invoked the operation acquire_mutex() and none of them has terminated its
invocation at time T , then there is a time T' > T at which a process that has invoked acquire_mutex() terminates its invocation.[Raynal, Concurrent Programming: Algorithms, Principles, and Foundations]
OR: If a process is trying to enter the critical section, then some process, not necessarily the same one, will eventually enter the critical section. OR: At least one always wins.
Notice that deadlock-freedom says that some process will make progress, but others might be stuck (starving) while trying to get into the critical section. It sounds weird at first, but it is so: not all threads are stuck, so there is no deadlock, i.e. deadlock-freedom holds.
On the other hand, starvation-freedom says that every process trying to get into the critical section will eventually do so. No process will ever starve.
This makes starvation-freedom a much stronger property than deadlock-freedom: every starvation-free algorithm is also deadlock-free.
So the answer to your question is NO.
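To make the difference concrete, here is a minimal Go sketch of two spin locks. The names TASLock, TicketLock and countWith are made up for this illustration. The test-and-set lock is deadlock-free (whenever the lock is free, some caller wins the compare-and-swap), but nothing stops the same caller from winning again and again. The ticket lock serves callers in the order they took a ticket, so it is starvation-free, assuming every holder eventually unlocks and the scheduler keeps running every goroutine.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// TASLock: deadlock-free, but an unlucky caller can lose the race forever.
type TASLock struct{ held uint32 }

func (l *TASLock) Lock() {
	for !atomic.CompareAndSwapUint32(&l.held, 0, 1) {
		runtime.Gosched() // let other goroutines run while we spin
	}
}

func (l *TASLock) Unlock() { atomic.StoreUint32(&l.held, 0) }

// TicketLock: callers enter in ticket order, so nobody is overtaken forever.
type TicketLock struct {
	next    uint32 // next ticket to hand out
	serving uint32 // ticket currently allowed into the critical section
}

func (l *TicketLock) Lock() {
	ticket := atomic.AddUint32(&l.next, 1) - 1
	for atomic.LoadUint32(&l.serving) != ticket {
		runtime.Gosched()
	}
}

func (l *TicketLock) Unlock() { atomic.AddUint32(&l.serving, 1) }

// countWith has `workers` goroutines each increment a shared counter
// `iters` times inside the critical section and returns the final value.
func countWith(l sync.Locker, workers, iters int) int {
	counter := 0
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				l.Lock()
				counter++
				l.Unlock()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWith(&TASLock{}, 8, 1000))
	fmt.Println(countWith(&TicketLock{}, 8, 1000))
}
```

Both locks give mutual exclusion, so both counts come out the same. The difference is only in who gets in next, which a counter cannot show.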

No. Every reasonable definition of starvation includes deadlock, so an algorithm that is starvation-free is also deadlock-free.

Related

How to synchronize multiple goroutines on one critical section?

I need to synchronize a number of goroutines 1..n that run code in an interpreted language (written in Go) on a critical section of interpreted code running in a goroutine k, which may be any of 1..n at runtime. I.e., if the code in k is running in the critical section, all other goroutines i (i≠k; 1≤k,i≤n) need to be suspended until k is finished. The solution also needs to deal with erroneous nesting of critical sections, either by processing at the outermost level or by signalling an error.
The problem is that each of these goroutines may run code in a tight loop and I have no direct control over this code - it's interpreted, supplied by the user. Things I've considered:
1. Mess with the interpreter to include some kind of yield or checks. This seems like a very inefficient way of dealing with it. The language allows for recursion, so I'd have to check every function call.
2. Suspend goroutines i while they are running, from k in a non-cooperative way? Is it possible to suspend a goroutine in Go "from the outside (=k)"?
3. Supply a way in the target language to co-operatively check whether k is running and halt if needed. I'm reluctant to do this, since it basically is a leaky abstraction for channels. It would also make programming very complicated and I want my interpreted language to remain simple.
What would you recommend? Is Option 2 even possible? Is there another way to do this? Any help is greatly appreciated!

Grafcet - synchronous machine behaviour

My goal is to implement a control algorithm written in Grafcet on a PLC. I am struggling with the difference between Grafcet as a multi-process synchronous language and the single-core, sequential PLC. Below is an example. What is the outcome of the Grafcet in the first cycle after the upper transition has fired? (a=1,x=1) or (a=1,x=0)?
I know that in SFC, it depends on the implementation of the engineering tool (e.g. Codesys, Multiprog) how actions are evaluated, typically from left to right. So for an SFC, (a=1,x=1) would be the answer. But since everything happens at the same time in Grafcet, I do not know how to handle this case.
Bonus points if someone can point out how I can learn more about the challenges of implementing languages like Grafcet on sequential machines.
Not all Grafcet variants support conditional actions, but when they do, the behavior goes like this: as long as the step is active, turn on x while a is on.
If that's what you meant (though we may never find a conditional action formatted the way yours is), x will be turned on within an infinitely short time after the two simultaneous steps are activated (at least that's my understanding based on the Grafcet evolution rules). So the fact that the initial value of x is unpredictable, assuming that the two concurrent steps are activated at the very same time, should actually be no problem.
Moreover, as soon as the Grafcet is "implemented" in the real world (i.e. on your single-core PLC), whether it's compiled directly by the engineering tool or converted into a ladder diagram, an order of evaluation is necessarily chosen, as you said, and everything becomes deterministic. So your question is not a real problem when it comes to "implementing languages like Grafcet on sequential machines". You can find out what the real challenges are by studying the canonical procedure for converting SFCs to ladder logic (detailed documentation is easily found on the web).
PLCs are single core, as you said, so two steps never execute at the same moment in time.
What you have there is a simultaneous branch, so both steps WILL execute, but clearly one after the other. By default, the leftmost one always goes first. Please note that some PLCs allow you to change the order (I never tried it for simultaneous branches, but tools such as RSLogix5000 surely allow it for divergent ones).
A simultaneous branch is like an AND: you are telling the processor to execute the first step AND the second step. If you are familiar with Ladder Logic, I am sure this will be clear to you.
In the end, it should be a=1;x=1.
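The two candidate outcomes can be reproduced with a tiny Go simulation. This is only a sketch: the diagram from the question is not available here, so it assumes the left step sets a and the right step carries the conditional action "x while a". The function names are made up. scanSequential models a PLC that evaluates the actions left to right within one scan; scanSnapshot models the idealised synchronous reading, where every action sees the values from before the cycle.

```go
package main

import "fmt"

// state holds the two variables from the question.
type state struct{ a, x bool }

// scanSequential runs the left action first; the right action
// already sees the new value of a within the same scan.
func scanSequential(s state) state {
	s.a = true // left step: set a
	s.x = s.a  // right step: x while a
	return s
}

// scanSnapshot evaluates both actions against the values that were
// valid at the start of the cycle, then commits them together.
func scanSnapshot(prev state) state {
	next := prev
	next.a = true   // left step: set a
	next.x = prev.a // right step: x while a, using the old a
	return next
}

func main() {
	fmt.Println(scanSequential(state{}))             // first scan, left-to-right order
	fmt.Println(scanSnapshot(state{}))               // first cycle, snapshot semantics
	fmt.Println(scanSnapshot(scanSnapshot(state{}))) // second cycle, snapshot semantics
}
```

With left-to-right evaluation the first scan already ends with a and x both set. With snapshot semantics x follows one cycle later.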
Also note that for other steps that are not simultaneous, there is a one-scan delay before the next transition is evaluated, which is a great thing. This is the most commonly omitted detail when implementing an SFC in Ladder (and it can lead to problems that are impossible to troubleshoot if you are not aware of it). I've seen this "bug" in about 50% of projects with a ladder implementation, over hundreds of projects so far. Example: if you have 10 consecutive transitions that are all true, you go from step 1 to step 10 in a single scan. Troubleshoot why the motor didn't start :)
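The missing scan delay is easy to reproduce outside a PLC. The Go sketch below is a simplified model rather than real ladder code, and naiveScan, delayedScan and always are names invented here. Each loop iteration stands for one rung "if step N is active and its transition is true, activate step N+1". The naive version updates the active step in place, so later rungs see the result of earlier rungs in the same scan.

```go
package main

import "fmt"

// always is a transition condition that is true for every step.
func always(int) bool { return true }

// naiveScan updates the active step in place, so a chain of true
// transitions is followed all the way within a single scan.
func naiveScan(active, steps int, cond func(step int) bool) int {
	for step := 1; step < steps; step++ {
		if active == step && cond(step) {
			active = step + 1
		}
	}
	return active
}

// delayedScan decides from the step that was active when the scan
// started, so at most one transition fires per scan.
func delayedScan(active, steps int, cond func(step int) bool) int {
	atStart := active
	for step := 1; step < steps; step++ {
		if atStart == step && cond(step) {
			active = step + 1
		}
	}
	return active
}

func main() {
	fmt.Println(naiveScan(1, 10, always))   // jumps to the last step in one scan
	fmt.Println(delayedScan(1, 10, always)) // advances by one step per scan
}
```

In the naive version, any action attached to steps 2 through 9 never gets a scan in which its step is active, which is the "motor didn't start" situation.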
Tip: You can always use dummy steps in simultaneous branches to add a delay of one scan. So, if you want the other outcome (a=1,x=0), you can put a dummy step before the left step.

Order of Goroutine Unblocking on Single Channel

Does order in which the Goroutines block on a channel determine the order they will unblock? I'm not concerned with the order of the messages that are sent (they're guaranteed to be ordered), but the order of the Goroutines that'll unblock.
Imagine an empty channel ch shared between multiple Goroutines (1, 2, and 3), with each Goroutine trying to receive a message on ch. Since ch is empty, each Goroutine will block. When I send a message to ch, will Goroutine 1 unblock first? Or could 2 or 3 possibly receive the first message? (Or vice versa, with the Goroutines trying to send.)
I have a playground that seems to suggest that the order in which Goroutines block is the order in which they are unblocked, but I'm not sure if this is an undefined behavior because of the implementation.
This is a good question - it touches on some important issues when doing concurrent design. As has already been stated, the answer to your specific question is, according to the current implementation, FIFO based. It's unlikely ever to be different, except perhaps if the implementers decided, say, a LIFO was better for some reason.
There is no guarantee, though. So you should avoid creating code that relies on a particular implementation.
The broader question concerns non-determinism, fairness and starvation.
Perhaps surprisingly, non-determinism in a CSP-based system does not come from things happening in parallel. Concurrency makes it possible, but concurrency is not what causes it. Instead, non-determinism arises when a choice is made. In the formal algebra of CSP, this is modelled mathematically. Fortunately, you don't need to know the maths to be able to use Go. But formally, two goroutines could execute in parallel and the outcome could still be deterministic, provided all the choices are eliminated.
Go allows choices that introduce non-determinism explicitly via select and implicitly via ends of channels being shared between goroutines. If you have point-to-point (one reader, one writer) channels, the second kind does not arise. So if it's important in a particular situation, you have a design choice you can make.
Fairness and starvation are typically opposite sides of the same coin. Starvation is one of those dynamic problems (along with deadlock, livelock and race conditions) that result perhaps in poor performance, more likely in wrong behaviour. These dynamic problems are un-testable (more on this) and need some level of analysis to solve. Clearly, if part of a system is unresponsive because it is starved of access to certain resources, then there is a need for greater fairness in governing those resources.
Shared access to channel ends may well provide a degree of fairness because of the current FIFO behaviour and this may appear sufficient. But if you want it guaranteed (regardless of implementation uncertainties), it is possible instead to use a select and a bundle of point-to-point channels in an array. Fair indexing is easy to achieve by always preferring them in an order that puts the last-selected at the bottom of the pile. This solution can guarantee fairness, but probably with a small performance penalty.
(aside: see "Wot No Chickens" for a somewhat-amusing discovery made by researchers in Canterbury, UK concerning a fairness flaw in the Java Virtual Machine - which has never been rectified!)
I believe it's unspecified because the memory model document only says "A send on a channel happens before the corresponding receive from that channel completes." The spec sections on send statements and the receive operator don't say anything about what unblocks first. Right now the gc toolchain uses an orderly FIFO queue to control which goroutine unblocks, but I don't see any promises in the spec that it must always be so.
(Just for general background note that Playground code runs with GOMAXPROCS=1, i.e., on one core, so some types of concurrency-related unpredictability just won't come up.)
The order is not specified, but current implementations use a FIFO queue for waiting goroutines.
The authoritative document is the Go Memory Model. The memory model does not define a happens-before relationship for two goroutines sending to the same channel, therefore the order is not specified. Ditto for receive.

Ant colony behavior using genetic programming

I'm looking at evolving ants capable of food foraging behaviour using genetic programming, as described by Koza here. Each time step, I loop through each ant, executing its computer program (the same program is used by all ants in the colony). Currently, I have defined simple instructions like MOVE-ONE-STEP, TURN-LEFT, TURN-RIGHT, etc. But I also have a function PROGN that executes arguments in sequence. The problem I am having is that because PROGN can execute instructions in sequence, it means an ant can do multiple actions in a single time step. Unlike nature, I cannot run the ants in parallel, meaning one ant might go and perform several actions, manipulating the environment whilst all of the other ants are waiting to have their turn.
I'm just wondering, is this how it is normally done, or is there a better way? Koza does not seem to mention anything about it. Thing is, I want to expand the scenario to have other agents (e.g. enemies), which might rely on things occurring only once in a single time step.
I am not familiar with Koza's work, but I think a reasonable approach is to give each ant its own instruction queue that persists across time steps. By doing this, you can get the ants to execute PROGN functions one instruction per time step. For instance, the high-level logic for the time step of an ant can be:
Do-Time-Step(ant):
    if ant.Q is empty:  // then put the next instruction(s) into the queue
        instructions <- ant.Get-Next-Instructions()
        for instruction in instructions:
            ant.Q.enqueue(instruction)
        end for
    end if
    instruction <- ant.Q.dequeue()  // get the next instruction in the queue
    ant.execute(instruction)  // have that ant do its job
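For reference, a runnable Go version of that time step might look like the sketch below. The Ant type, its fields and the way the program is stored (a list of instruction groups, where a group with several entries plays the role of a PROGN) are assumptions made for the example, not part of Koza's setup.

```go
package main

import "fmt"

// Ant keeps its own instruction queue across time steps.
type Ant struct {
	program [][]string // each group is what one program node expands to
	pc      int        // index of the next group to fetch
	queue   []string   // instructions still waiting to be executed
	done    []string   // record of what the ant actually executed
}

// nextInstructions returns the next group, wrapping around at the end.
func (a *Ant) nextInstructions() []string {
	group := a.program[a.pc%len(a.program)]
	a.pc++
	return group
}

// Step executes exactly one instruction, refilling the queue if needed.
func (a *Ant) Step() {
	if len(a.queue) == 0 {
		a.queue = append(a.queue, a.nextInstructions()...)
	}
	if len(a.queue) == 0 {
		return // the program produced an empty group; do nothing this step
	}
	instruction := a.queue[0]
	a.queue = a.queue[1:]
	a.done = append(a.done, instruction) // stand-in for really moving the ant
}

func main() {
	ant := &Ant{program: [][]string{
		{"MOVE-ONE-STEP"},
		{"TURN-LEFT", "MOVE-ONE-STEP"}, // a PROGN with two arguments
	}}
	for t := 0; t < 4; t++ {
		ant.Step()
	}
	fmt.Println(ant.done)
}
```

Every ant now performs one primitive action per time step, no matter how many instructions a PROGN contains.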
Another, similar approach to queuing instructions would be to preprocess the set of instructions and expand instances of PROGN to the set of component instructions. This would have to be done recursively if you allow PROGNs to invoke other PROGNs. The downside to this is that the candidate programs get a bit bloated, but this is only at runtime. On the other hand, it is quick, easy to do, and pretty easy to debug.
Example:
Say PROGN1 = {inst-p1 inst-p2}
Then the candidate program would start off as {inst1 PROGN1 inst2} and would be expanded to {inst1 inst-p1 inst-p2 inst2} when it was ready to be evaluated in simulation.
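A recursive expansion like that is only a few lines of Go. The Node type below is a hypothetical tree representation chosen for the example; a real GP system will have its own.

```go
package main

import "fmt"

// Node is either a primitive instruction (no Args) or a PROGN
// whose Args are executed in sequence.
type Node struct {
	Op   string
	Args []Node
}

// flatten expands every PROGN, including nested ones, into the
// sequence of primitive instructions it stands for.
func flatten(nodes []Node) []string {
	var out []string
	for _, n := range nodes {
		if n.Op == "PROGN" {
			out = append(out, flatten(n.Args)...)
		} else {
			out = append(out, n.Op)
		}
	}
	return out
}

func main() {
	program := []Node{
		{Op: "inst1"},
		{Op: "PROGN", Args: []Node{{Op: "inst-p1"}, {Op: "inst-p2"}}},
		{Op: "inst2"},
	}
	fmt.Println(flatten(program))
}
```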
It all depends on your particular GP implementation.
In my GP kernel, programs are either evaluated repeatedly or in parallel, but always as a whole, i.e. the 'atomic' operation in this scenario is a single program evaluation.
So either each individual in the population is executed n times in a row before the next program is evaluated, or all individuals are executed once each and that whole pass is repeated n times.
I've had pretty nice results with virtual agents using this level of concurrency.
It is definitely possible to break it down even more, however at that point you'll reduce the scalability of your algorithm:
While it is easy to distribute the evaluation of programs amongst several CPUs or cores it'll be next to worthless doing the same with per-node evaluation just due to the amount of synchronization required between all programs.
Given the rapidly increasing number of CPUs/cores in modern systems (even smartphones) and the 'CPU-hunger' of GP you might want to rethink your approach - do you really want to include move/turn instructions in your programs?
Why not redesign it to use primitives that store away direction and speed parameters in some registers/variables during program evaluation?
The simulation step then takes these parameters to actually move/turn your agents based on the instructions stored away by the programs.
evaluate programs (in parallel)
execute simulation
repeat for n times
evaluate fitness, selection, ...
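As a rough Go sketch of that two-phase loop: programs only write their intent into per-agent registers, and a separate simulation step applies it. The Agent fields, the Program type and the grid-style movement are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Agent has a position, a heading and two "registers" that the
// program writes during evaluation.
type Agent struct {
	X, Y   int
	DX, DY int // heading as a unit vector
	Turn   int // register: -1 = left, 0 = none, +1 = right
	Speed  int // register: cells to move this tick
}

// Program may read its agent but should only write Turn and Speed.
type Program func(a *Agent)

// evaluateAll runs the program for every agent in parallel. This is
// safe because each goroutine touches only its own agent.
func evaluateAll(agents []*Agent, prog Program) {
	var wg sync.WaitGroup
	for _, a := range agents {
		wg.Add(1)
		go func(a *Agent) {
			defer wg.Done()
			prog(a)
		}(a)
	}
	wg.Wait()
}

// simulate applies the stored intents one agent at a time and
// clears the registers for the next tick.
func simulate(agents []*Agent) {
	for _, a := range agents {
		switch a.Turn {
		case -1:
			a.DX, a.DY = -a.DY, a.DX // rotate 90 degrees left
		case 1:
			a.DX, a.DY = a.DY, -a.DX // rotate 90 degrees right
		}
		a.X += a.DX * a.Speed
		a.Y += a.DY * a.Speed
		a.Turn, a.Speed = 0, 0
	}
}

func main() {
	agents := []*Agent{{DX: 1}, {X: 5, DX: 1}}
	turnLeftAndRun := func(a *Agent) { a.Turn, a.Speed = -1, 2 }
	for tick := 0; tick < 3; tick++ {
		evaluateAll(agents, turnLeftAndRun)
		simulate(agents)
	}
	fmt.Println(*agents[0], *agents[1])
}
```

Because the world only changes inside simulate, each agent acts exactly once per tick, and the expensive program evaluation can still be spread over all cores.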
Cheers,
Jay

chain of events analysis and reasoning

My boss said the logs in their current state are not acceptable for the customer. If there is a fault, a dozen different modules of the device report their own errors and they all land in the logs. The original reason for the fault may be buried somewhere in the middle of the list, may not appear on the list at all (the given module being too damaged to report), or may appear very late, after everything else has finished reporting problems that result from the original fault. Anyway, there are few people outside the system developers who can properly interpret the logs and come up with what actually happened.
My current task is writing a module that does a customer-friendly fault-reporting. That is, gather all the events that were reported over the last ~3 seconds (which is about the max interval between origin of the fault occurring and the last resulting after-effects), do some magic processing of this data, and come up with one clear, friendly line what is broken and needs to be fixed.
The problem is the magic part: how, given a number of fault reports, to come up with the original source of the fault. There is no simple cause-effect list. There are just commonly occurring chains of events displaying certain regularities.
Examples:
short circuit detected, resulting in limited operation mode, the limited operation does not remove the fault, so emergency state is escalated, total output power disconnected.
safety line got engaged. No module reported engaging it within 3s since it was engaged, so an "unknown-source or interference" is attributed as the reason of system halt.
most output modules report no output voltage. About 1s later the power supply monitoring module reports power is out, which is the original reason.
an output module reports no output voltage in all of its output lines. No report from power supply module. The reason is a power line disconnected from the module.
an output module reports no output voltage in one of its output lines. No other faults reported. The reason is a burnt fuse.
an output module did not report back that it applied the received state. Shortly after, the control module reports an illegal state of output lines (resulting from the output module really not updating the state in a timely manner). The cause is the output module (which introduced the fault), not the control module (which halted the system due to the fault detected).
a fault of input module switches the device to backup-failsafe mode. An output module not used so far, which was faulty gets engaged in this mode and the fault mode gets escalated to critical. The original reason is not the input, which is allowed to report false-positives concerning faults, but the broken backup output which aborted the operation.
there is no activity of any kind from an output module, for the last 2 seconds. This means it's broken and a fault mode must be entered.
There is no comprehensive list of rules as to what causes what. The rules will be added as new kinds of faults occur "in the wild" and are diagnosed and fixed. Some of them are heuristics - if this error is accompanied by these errors, then the fault is most likely this. Some faults will not be solved - a bland list of module reports will have to suffice. Some answers will be ambiguous: one set of symptoms may suggest two different faults. This is more of a "best effort" than a "guaranteed solution" one.
Now for the (overly general and vague) question: how to solve this? Are there specific algorithms, methods or generalized solutions to this kind of problem? How to write the generalized rulesets and match against them? How to do the soft-matching? (say, an input module broke right in the middle of an emergency halt, it's a completely unrelated event to be ignored.) Help please?
In all honesty, I would just write a series of simple rules and be done with it. It will be a pain maintenance-wise, but getting anything more sophisticated right may be time-consuming and brittle.
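A "series of simple rules" could be as small as the Go sketch below. Everything in it is hypothetical: the token names, the Rule type and the tie-breaking choice (the matching rule with the most conditions wins) are assumptions, with the three rules loosely modelled on the power supply, disconnected line and burnt fuse examples from the question.

```go
package main

import "fmt"

// Rule matches when every token in All was reported within the
// window and no token in None was.
type Rule struct {
	All, None []string
	Diagnosis string
}

var exampleRules = []Rule{
	{All: []string{"out.no_voltage.all_lines", "psu.power_out"},
		Diagnosis: "power supply is out"},
	{All: []string{"out.no_voltage.all_lines"}, None: []string{"psu.power_out"},
		Diagnosis: "power line to the output module is disconnected"},
	{All: []string{"out.no_voltage.one_line"}, None: []string{"psu.power_out"},
		Diagnosis: "burnt fuse on an output line"},
}

const fallback = "unknown fault, see the module reports"

func matches(r Rule, seen map[string]bool) bool {
	for _, t := range r.All {
		if !seen[t] {
			return false
		}
	}
	for _, t := range r.None {
		if seen[t] {
			return false
		}
	}
	return true
}

// diagnose returns the diagnosis of the most specific matching rule,
// or the fallback text when no rule matches.
func diagnose(rules []Rule, events []string) string {
	seen := make(map[string]bool, len(events))
	for _, e := range events {
		seen[e] = true
	}
	best, bestSize := fallback, -1
	for _, r := range rules {
		if size := len(r.All) + len(r.None); matches(r, seen) && size > bestSize {
			best, bestSize = r.Diagnosis, size
		}
	}
	return best
}

func main() {
	fmt.Println(diagnose(exampleRules, []string{"out.no_voltage.all_lines", "psu.power_out"}))
	fmt.Println(diagnose(exampleRules, []string{"out.no_voltage.one_line"}))
	fmt.Println(diagnose(exampleRules, []string{"input.glitch"}))
}
```

Unrelated events in the window are simply ignored unless a rule lists them, which covers the "unrelated event during an emergency halt" case in a crude way.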
If you insist, I would approach this by having each error drop some sort of symbol/token for its error code - you'll make this much harder if you try to do some bag-of-words/keyword matching. You would then feed the emitted tokens into some sort of classifier.
At heart, you need some sort of rules engine - be it fuzzy or exact. The first thing that comes to mind is a hand-built Bayesian network. This would allow for fuzzy matching as you would calculate the most probable 'report' as a function of the tokens you receive. It also allows you to set a threshold for token groups that aren't really indicative of anything by specifying the minimum probability to return an answer.
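To show the thresholding idea in code, here is a deliberately simplified Go sketch. It uses naive Bayes scoring (every token treated as independent given the fault) as a stand-in for a full hand-built Bayesian network, and it only looks at tokens that were reported, not at ones that are missing. The fault names, token names and all probabilities are invented for the example.

```go
package main

import "fmt"

// Fault carries a hand-assigned prior and, per token, the probability
// of seeing that token when this fault is the real cause.
type Fault struct {
	Name       string
	Prior      float64
	Likelihood map[string]float64
}

// unlisted is the probability assumed for a token a fault does not list.
const unlisted = 0.01

var exampleFaults = []Fault{
	{Name: "power_out", Prior: 0.5,
		Likelihood: map[string]float64{"no_output_voltage": 0.9, "psu_report": 0.9}},
	{Name: "burnt_fuse", Prior: 0.5,
		Likelihood: map[string]float64{"no_output_voltage": 0.9, "psu_report": 0.05}},
}

// mostProbable returns the fault with the highest posterior given the
// reported tokens, or ok=false when that posterior is below threshold.
func mostProbable(faults []Fault, tokens []string, threshold float64) (name string, ok bool) {
	scores := make([]float64, len(faults))
	total := 0.0
	for i, f := range faults {
		score := f.Prior
		for _, t := range tokens {
			p, listed := f.Likelihood[t]
			if !listed {
				p = unlisted
			}
			score *= p
		}
		scores[i] = score
		total += score
	}
	if total == 0 {
		return "", false
	}
	best := 0
	for i := range scores {
		if scores[i] > scores[best] {
			best = i
		}
	}
	if scores[best]/total < threshold {
		return "", false // too ambiguous, fall back to the raw report list
	}
	return faults[best].Name, true
}

func main() {
	fmt.Println(mostProbable(exampleFaults, []string{"no_output_voltage", "psu_report"}, 0.8))
	fmt.Println(mostProbable(exampleFaults, []string{"no_output_voltage"}, 0.8))
}
```

With both tokens present, the power supply fault clearly dominates. With only the voltage token the two faults are equally likely, so nothing is returned and the plain list of module reports would be shown instead.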
You could also train a Bayes net or another type of classifier, but you'll need quite a bit of data that you've manually labeled (token1,token2,token3->faultxyz), and it might be more accurate to build it by hand.
