What is an approach for designing complex FSMs? - complexity-theory

At work, we use FSMs. Recently, I had to design an FSM for a problem that I deem "a little too complex for a simple FSM". Why? Because the problem has about 6 different data dimensions, and many permutations of this data impact the behaviour of the solution significantly. My brain thinks "6 data attributes means 2^6 = 64 combinations of this data" if it were all boolean data. Furthermore, there are about 8 inputs that can happen at any given time.
This problem made me aware that my FSM creating skills stop at simple problems used in my hobby projects. At work, we are constrained to use FSMs. That means, I cannot just say "this problem is outside of the scope of FSMs. I'll use something else." Indeed, the FSM platform we have in place does provide a lot of power for our solutions.
Question: What is an approach for designing an FSM when the problem is sufficiently complex? I've researched a bit on this and found a few papers which, honestly, didn't help me much. I hope there are some best practices for this, and all I'm asking for is one. Please and thanks.

I suppose that you might be experiencing the usual "state-transition explosion", which is the known problem of traditional "flat" FSMs. Traditional FSMs "explode" because they force you to repeat the same reactions in many states; FSMs lack any mechanism to capture commonalities of behavior among states. The long-known solution is to use Hierarchical State Machines (a.k.a. Harel statecharts or UML state machines). HSMs support the concept of state nesting, in which substates inherit behavior from the surrounding superstate(s). When used correctly, state nesting eliminates the repetitions and counteracts the "explosion" problem. Most non-trivial problems are not really tractable with FSMs, but are quite manageable with HSMs.
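To make state nesting concrete, here is a minimal sketch in Java. The heater example and all names are invented for illustration and are not taken from any particular HSM framework: a substate handles what it can and delegates everything else to its superstate, so a common reaction like POWER_OFF is written exactly once.

    // Substates delegate unhandled events upward, so common reactions
    // live in one place instead of being repeated in every state.
    interface State {
        State onEvent(String event); // next state, or null if unhandled
    }

    class OperationalSuperstate implements State {
        public State onEvent(String event) {
            if (event.equals("POWER_OFF")) return new OffState(); // defined once
            return null;                                          // unhandled here
        }
    }

    class HeatingSubstate extends OperationalSuperstate {
        public State onEvent(String event) {
            if (event.equals("TARGET_REACHED")) return new IdleSubstate();
            return super.onEvent(event); // inherit superstate behavior
        }
    }

    class IdleSubstate extends OperationalSuperstate {
        public State onEvent(String event) {
            if (event.equals("TOO_COLD")) return new HeatingSubstate();
            return super.onEvent(event);
        }
    }

    class OffState implements State {
        public State onEvent(String event) {
            return event.equals("POWER_ON") ? new IdleSubstate() : null;
        }
    }

In a flat FSM, both HeatingSubstate and IdleSubstate would each need their own POWER_OFF transition; with nesting they inherit it, and that saving multiplies quickly as the number of states and common reactions grows.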

Related

What is a collaborative algorithm?

What is a collaborative algorithm? Is there a scientifically citable reference?
Details:
I found many articles about collaborative algorithms, but no articles (or other websites) with a definition.
I am actually looking for a term to describe distributed algorithms where each instance has all information at the beginning and can complete the whole task on its own, but the instances help each other whenever they have solved a sub-problem, so the other instances do not have to redo the work (hence "collaboration"). I picked up this terminology in A Collaborative Approach for Multi-Threaded SAT Solving. Do you think the term "collaborative algorithm" is suitable for this? If not, do you know of a better term?
No, there is no scientifically citable reference.
All parallel/distributed programming is "collaborative" in the sense that several threads/nodes are collaborating on the same big task.
distributed algorithms where.. instances help each other whenever they have solved a sub-problem - even some web application clusters fit your description: individual cluster nodes "solve subproblems" and store the "solutions" in a distributed in-RAM storage (such as memcached or cassandra or many others) thus helping each other.
I think the term "collaborative algorithm" is not formal.
Actually, the term "algorithm" itself is not really formal either, as far as I remember. I guess an algorithm can be formalized as "a program which runs on a Turing machine"; I think I've seen this definition somewhere.
So yes, all in all I guess the term you coined makes sense, but you need to define it somehow yourself (either formally or informally).
Not sure what your background is, but in scientific papers different authors sometimes use the same terms/concepts to denote different things, and sometimes they use different terms to denote the same thing.
Also, even though computer science papers are scientific, not all terms in them are formally defined. So I wouldn't draw too many conclusions based on these papers unless I were familiar to a decent extent with all of them, or unless some of them were considered really remarkable and widely accepted as a de facto standard in a particular sub-field or field.

How do programmers test their algorithm in TopCoder or other competitions?

Good programmers who write programs of moderate to higher difficulty in competitions like TopCoder or ACM ICPC have to ensure the correctness of their algorithm before submission.
Although they are provided with some sample test cases to check for correct output, how does that guarantee that the program will behave correctly? They can write some test cases of their own, but it won't always be possible to know the correct answer through manual calculation. How do they do it?
Update: As it seems, it is not quite possible to analyze and guarantee the outcome of an algorithm given the tight constraints of a competitive environment. However, if there are any common manual practices adopted while solving such problems, those should be enough to answer the question. Something like best practices.
In competitions, the top programmers have enough experience to read the question, and think of some test cases that should catch most of the possibilities for input.
It catches most of the bugs usually - but it is NOT 100% safe.
However, in real-life critical applications (critical systems on airplanes or nuclear reactors, for example) there are methods to PROVE that some piece of code does what it is supposed to do.
This is the field of formal verification - which is way too complex and time consuming to be done during a contest, but for some systems it is used because mistakes could not be tolerated.
Some additional information:
Formal verification basically consists of 2 parts:
Manual verification - here we use proving systems such as Hoare logic and manually prove that the program does what we want it to do.
Automatic model checking - modeling the problem as a state machine and using model checking tools to verify that the module does what it is supposed to do (or does not do something "bad").
Specifying "what it should do" is usually done with temporal logic.
This is often used to verify the correctness of hardware models as well. For example, Intel uses it to ensure they won't get a floating point bug (like the infamous Pentium FDIV bug) again.
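Just to illustrate the core idea (real model checkers such as SPIN or NuSMV are vastly more sophisticated), here is a toy explicit-state checker in Java. It enumerates every reachable state of a made-up two-traffic-light model and verifies the safety property "the two lights are never green at the same time":

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Queue;
    import java.util.Set;

    // Toy explicit-state "model checker": enumerate every reachable state
    // of a tiny model and verify a safety property. The two-traffic-light
    // model is invented purely for illustration.
    public class ToyModelChecker {
        record Lights(boolean aGreen, boolean bGreen) {}

        // Transition relation: a light may turn green only if the other
        // one is red, and a green light may always turn red.
        static List<Lights> successors(Lights s) {
            List<Lights> next = new ArrayList<>();
            if (s.aGreen()) next.add(new Lights(false, s.bGreen()));
            else if (!s.bGreen()) next.add(new Lights(true, false));
            if (s.bGreen()) next.add(new Lights(s.aGreen(), false));
            else if (!s.aGreen()) next.add(new Lights(false, true));
            return next;
        }

        public static void main(String[] args) {
            Set<Lights> seen = new HashSet<>();
            Queue<Lights> frontier = new ArrayDeque<>();
            Lights initial = new Lights(false, false);
            seen.add(initial);
            frontier.add(initial);
            while (!frontier.isEmpty()) {
                Lights s = frontier.poll();
                if (s.aGreen() && s.bGreen()) {        // safety property violated
                    System.out.println("Counterexample state: " + s);
                    return;
                }
                for (Lights n : successors(s)) {
                    if (seen.add(n)) frontier.add(n);  // explore each state once
                }
            }
            System.out.println("Property holds in all " + seen.size() + " reachable states.");
        }
    }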
Picture this: imagine you are a top programmer. Meaning you know a bunch of algorithms and wouldn't think twice while implementing them. You know how to modify an already known algorithm to suit the problem's needs. You are strong at estimating time and complexity, and you expect that in the worst case your tailored algorithm will run within the time and memory constraints.
At this level you simply think and use a scratchpad for about five to ten minutes and have a super clear algorithm before you start to code. Once you finish coding, you hit compile and there is usually no compilation error, because the code is so intuitive to you.
Then, based on the algorithm and data structures used, you expect that there might be one of the following issues:
a corner case
an overflow problem
A corner case is when you have coded for the general case, but when, say, N=1, the answer is different from the others, so you generally write it as a special case.
An overflow is when intermediate values or results overflow a data type's limits.
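As a made-up illustration of both pitfalls in Java (the tasks and numbers are invented):

    // Two classic contest pitfalls, illustrated with invented examples.
    public class Pitfalls {
        // Overflow: n * (n - 1) / 2 pairs among n items. For n around
        // 100000 the intermediate product (~1e10) exceeds the int range
        // (~2.1e9), so the multiplication must be done in long arithmetic.
        static long pairCount(int n) {
            return (long) n * (n - 1) / 2;  // cast BEFORE multiplying
        }

        // Corner case: "longest gap between consecutive elements" is
        // undefined for a single element, so N = 1 needs special handling.
        static int longestGap(int[] sorted) {
            if (sorted.length < 2) return 0;  // special-case N = 1
            int best = 0;
            for (int i = 1; i < sorted.length; i++) {
                best = Math.max(best, sorted[i] - sorted[i - 1]);
            }
            return best;
        }

        public static void main(String[] args) {
            System.out.println(pairCount(100000));        // 4999950000, too big for int
            System.out.println(longestGap(new int[]{7})); // 0, not a crash
        }
    }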
You make note of any problems which arise at this point, and use this data during the Challenge phase (as in TopCoder).
Once you have checked against these two, you hit Submit.
There's a time element to TopCoder, so it's not possible to test every combination within that constraint. They probably do the best they can and rely on experience for the rest, just as one does in real life. I don't know that it's ever possible to guarantee that a significant piece of code is error-free forever.

Any math approaches to state management of complex objects?

I usually use ASP.NET WebForms for GUIs, maybe one of the most "stateful" technologies, but this applies to any technology which has state. Sometimes forms are tricky and complex, with >30 elements and >3 states per element. The intuitive way of designing such a form usually works for 90% of it; the other 10% is usually found by testers or end-users :).
The problem as I see it is that we have to imagine a lot of scenarios on the same object, which is much harder than a sequence of independent operations.
From functional programming courses I know that the best way is to avoid state management and use pure functions, passing variables by value and all that stuff, which is well formalized. But sometimes we cannot avoid state.
Do you use any math formalisms and approaches to state management of complex objects? Not like monads in Haskell, but which can be used in more traditional business applications and languages - Java, C#, C++.
It need not be a Turing-complete formalism; 99% coverage would be great too :).
Sorry if it is just another tumbleweed question:)
Use message-passing as an abstraction. Advantages:
The difficulty with complex state is complex interactions, which are especially hairy in concurrent systems like typical GUIs. Message-passing, by eliminating shared state, stops the complexity of state in one process from being infectious.
Message-passing concurrency has nice foundational models: e.g., the Actor model, CSP, both of which influenced Erlang.
It integrates well with functional programming: check out Erlang again. Peter van Roy's book Concepts, Techniques, and Models of Computer Programming is an excellent text that shows the fundamental ingredients of programming languages, such as pure functions and message-passing, and how they can be combined. The text is available as a free PDF.
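Here is a minimal actor-style sketch of the idea in Java, using nothing beyond the standard library; the counter "actor" is invented for illustration. One thread owns the state, and everyone else talks to it only through a message queue:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Minimal actor-style sketch: one thread owns the state, and other
    // threads interact with it only by sending messages, never by
    // touching the state directly.
    public class CounterActor {
        enum Msg { INCREMENT, PRINT, STOP }

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<Msg> mailbox = new ArrayBlockingQueue<>(100);

            Thread actor = new Thread(() -> {
                int count = 0;  // state is private to this thread
                while (true) {
                    try {
                        switch (mailbox.take()) {
                            case INCREMENT -> count++;
                            case PRINT -> System.out.println("count = " + count);
                            case STOP -> { return; }
                        }
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            });
            actor.start();

            // Any number of senders may post messages; no locks on the state.
            for (int i = 0; i < 5; i++) mailbox.put(Msg.INCREMENT);
            mailbox.put(Msg.PRINT);  // prints "count = 5"
            mailbox.put(Msg.STOP);
            actor.join();
        }
    }

Because no other thread can reach the counter except through the mailbox, all the tricky interleavings collapse into one serialized stream of messages.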
It need not be a Turing-complete formalism; 99% coverage would be great too :).
Sorry, but I'd rather provide an NP-complete solution :)
My quick answer would be: a test-driven approach. But read further for more.
The problem as I see it is that we have to imagine a lot of scenarios on the same object, which is much harder than a sequence of independent operations.
In such cases decomposition (not only in the computer science sense, but in the mathematical sense too) is very useful.
You decompose a complex scenario into many simpler ones, which in turn can still be complex themselves and can be decomposed further.
As a result of such a process you should end up with a number of simple functions (tasks), mostly independent of each other.
This is very important because then you can UNIT TEST those simple scenarios.
Additionally, it is much easier and better to follow a test-first approach, which lets you see the decomposition at the very beginning of the development process.
Do you use any math formalisms and approaches to state management of complex objects?
To continue what I said, for me the most important thing is to make a good decomposition so that I can ensure quality and easily reproduce errors in an automated manner.
To give you an abstract example:
You have a complex scenario A. You always need to write at least 3 tests for each scenario: correct input, incorrect input and corner case(s).
Starting to write first test (correct input) I realize that the test becomes too complex.
As a result, I decompose scenario A into less complex A1, A2, A3. Then I start writing tests for each of them again (I should end up with at least 3*3=9 tests).
I realise that A1 is still too complex to test, so I decompose it again into A1-1 and A1-2. Now I have 4 different scenarios (A1-1, A1-2, A2, A3) and 3*4=12 potential tests. I continue writing the tests.
After I am done, I start the implementation, making all my tests pass. After that you have 12 proofs that scenario A (more precisely, its parts) works correctly. Additionally, you might write another 3 tests for scenario A that combine all of its decomposed parts; this kind of testing can often (but not always!) be seen as integration testing.
Then let's assume a bug is found in scenario A. You are not sure which part it belongs to, but you suspect that it is related to A1-2 or A3. So you write 2 more tests, one for each suspect scenario, to reproduce the bug (that is, a test that fails because your expectations are not met). After you have reproduced the bug, you fix it and make ALL tests pass.
Now you have 2 more proofs of a correctly working system, ensuring that all the previous functionality works the same way.
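To make the abstract A/A1/A2 example concrete, here is roughly what the decomposed tests might look like with JUnit 5. The registration scenario, the Validator class and all names here are hypothetical:

    import static org.junit.jupiter.api.Assertions.*;
    import org.junit.jupiter.api.Test;

    // Hypothetical decomposition: scenario A = "register user",
    // A1 = validate e-mail, A2 = validate password. Each sub-scenario
    // gets correct-input, incorrect-input and corner-case tests.
    class RegistrationTest {
        @Test void emailAccepted() { assertTrue(Validator.isValidEmail("a@b.com")); }      // A1, correct
        @Test void emailRejected() { assertFalse(Validator.isValidEmail("no-at-sign")); }  // A1, incorrect
        @Test void emailEmpty()    { assertFalse(Validator.isValidEmail("")); }            // A1, corner case
        @Test void passwordOk()    { assertTrue(Validator.isValidPassword("s3cret!!")); }  // A2, correct
        @Test void passwordShort() { assertFalse(Validator.isValidPassword("abc")); }      // A2, incorrect
        @Test void passwordNull()  { assertFalse(Validator.isValidPassword(null)); }       // A2, corner case
        // ...plus a few tests driving the whole registration (scenario A) end to end.
    }

    class Validator {
        static boolean isValidEmail(String s) {
            return s != null && s.matches("[^@\\s]+@[^@\\s]+\\.[^@\\s]+");
        }
        static boolean isValidPassword(String s) {
            return s != null && s.length() >= 8;
        }
    }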
There are 2 major problems with this approach, IMO.
You need to write a lot of tests and support them. Many developers just do not want to do that.
Additionally, the process of decomposition is more art than science. Good decomposition will result in a good structure, tests and supportability while a bad one will result in a lot of pain and wasted time. And it is hard to tell if the decomposition is good or bad at first.
This process is called Test-Driven Development. I find it to be the closest "formalization" of the development process that plays nicely between science and the real world.
So I do not really talk about state here but rather behavior and proving it works correctly.
From personal experience, I should mention that ASP.NET WebForms is technically VERY hard to test.
To overcome that, I would suggest applying the MVP pattern to ASP.NET WebForms.
As opposed to WebForms, ASP.NET MVC is much easier to test.
But still, you should have a set of so-called "services" (our scenarios) and unit test them separately, then test the UI integration in an environment close to integration tests.

What are algorithms and data structures in layman’s terms?

I currently work with PHP and Ruby on Rails as a web developer. My question is why would I need to know algorithms and data structures? Do I need to learn C, C++ or Java first? What are the practical benefits of knowing algorithms and data structures? What are algorithms and data structures in layman’s terms? (As you can tell unfortunately I have not done a CS course.)
Please provide as much information as possible and thank you in advance ;-)
Data structures are ways of storing stuff, just like you can put stuff in stacks, queues, heaps and buckets - you can do the same thing with data.
Algorithms are recipes or instructions, the quick start manual for your coffee maker is an algorithm to make coffee.
Algorithms are, quite simply, the steps by which you do something. For instance, the Coffee Maker Algorithm would run something like:
Turn on Coffee Maker
Grind Coffee Beans
Put in filter and place coffee in filter
Add Water
Start brewing process
Drink coffee
A data structure is a means by which we store information in an organized fashion. For further info, check out the Wikipedia article.
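If you want to see the "stacks and queues" picture in running code, here is a tiny Java demo (purely illustrative): the same three items come back in a different order depending on the structure that stores them.

    import java.util.ArrayDeque;

    // A stack gives things back in reverse order (last in, first out);
    // a queue gives them back in arrival order (first in, first out).
    // Same data, different structure, different behavior.
    public class StacksAndQueues {
        public static void main(String[] args) {
            ArrayDeque<String> stack = new ArrayDeque<>();
            ArrayDeque<String> queue = new ArrayDeque<>();
            for (String s : new String[] {"a", "b", "c"}) {
                stack.push(s);  // pile plates on top
                queue.add(s);   // join the back of the line
            }
            System.out.println(stack.pop() + stack.pop() + stack.pop());    // cba
            System.out.println(queue.poll() + queue.poll() + queue.poll()); // abc
        }
    }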
An algorithm is a list of instructions and data structures are ways to represent information. If you're writing computer programs then you're already using algorithms and data structures even if you don't know what the words mean.
I think the biggest advantages in knowing standard algorithms and data structures are:
You can communicate with other programmers using a common language.
Other people will be able to understand your code once you've left.
You will also learn better methods for solving common problems. You could probably solve these problems eventually anyway even without knowing the standard way to do it, but you will spend a lot of time reinventing the wheel and it's unlikely your solutions will be as good as those that thousands of experts have worked on and improved over the years.
An algorithm is a sequence of well-defined steps leading to the solution of a type of problem.
A data structure is a way to store and organize data to facilitate access and modifications.
The benefit of knowing standard algorithms and data structures is that they are mostly better than what you could develop yourself. They are the result of months or even years of work by people who are far more intelligent than the majority of programmers. Knowing a range of data structures and algorithms allows you to fit a problem roughly to a data structure and/or algorithm and tweak it as required.
In the classic "cooking/baking equivalent", algorithms are recipes and data structures are your measuring cups, your baking sheets, your cookie cutters, mixing bowls and essentially any other tool you would be using (your cooker is your compiler/interpreter, though).
[book cover image] (source: mit.edu)
This book is the bible on algorithms. In general, data structures relate to how you organize your data to access it in memory, and algorithms are methods/small programs to solve problems (e.g., sorting a list).
The reason you should care is first to understand what can go wrong in your code; poorly implemented algorithms can perform very badly compared to "proven" ones. Knowing classic algorithms and what performance to expect from them helps in knowing how good your code can be, and whether you can/should improve it.
Then there is no need to reinvent the wheel, and rewrite a buggy or sub-optimal implementation of a well-known structure or algorithm.
An algorithm is a representation of the process involved in a computation.
If you wanted to add two numbers then the algorithm might go:
Get first number;
Get second number;
Add first number to second number;
Return result.
At its simplest, an algorithm is just a structured list of things to do - its use in computing is that it allows people to see the intent behind the code and makes logical (as opposed to syntactical) errors easier to spot.
e.g. if step three above said multiply instead of add then someone would be able to point out the error in the logic without having to debug code.
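Translated into code (a trivial Java sketch, just to mirror the steps above), the logic error would be equally visible:

    public class Adder {
        // Steps 1 and 2: the numbers arrive as parameters.
        static int add(int first, int second) {
            // Step 3: add them. Writing `first * second` here would be the
            // logic error from the text, spottable without running anything.
            return first + second; // Step 4: return the result.
        }

        public static void main(String[] args) {
            System.out.println(add(2, 3)); // prints 5
        }
    }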
A data structure is a representation of how a system's data should be referenced. It might match a table structure exactly or may be de-normalised to make data access easier. At its simplest it should show how the entities in a system are related.
It is too large a topic to go into in detail but there are plenty of resources on the web.
Data structures are critical the second your software has more than a handful of users. Algorithms are a broad topic, and you'll want to study them if a good knowledge of data structures doesn't fix your performance problems.
You probably don't need a new programming language to benefit from data structures knowledge, though PHP (and other high level languages) will make a lot of it invisible to you, unless you know where to look. Java is my personal favorite learning language for stuff like this, but that's pretty subjective.
My question is why would I need to know algorithms and data structures?
If you are doing any non-trivial programming, it is a good idea to understand the classic data structures and algorithms and their uses in order to avoid reinventing the wheel. For example, if you need to put an array of things in order, you need to understand the various ways of sorting so that you can choose the most appropriate one for the task at hand. If you choose the wrong approach, you can end up with a program that is grossly inefficient in some circumstances.
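As a rough, machine-dependent illustration of that inefficiency in Java: a hand-rolled O(n^2) selection sort against the library's O(n log n) sort. On 100,000 random ints the gap is typically a few milliseconds versus several seconds.

    import java.util.Arrays;
    import java.util.Random;

    // Compares a quadratic selection sort with the library sort on the
    // same data. Exact timings vary by machine; the ratio is what matters.
    public class SortCost {
        static void selectionSort(int[] a) {
            for (int i = 0; i < a.length; i++) {
                int min = i;
                for (int j = i + 1; j < a.length; j++) {
                    if (a[j] < a[min]) min = j;
                }
                int t = a[i]; a[i] = a[min]; a[min] = t;
            }
        }

        public static void main(String[] args) {
            int[] data = new Random(42).ints(100_000).toArray();
            int[] copy = data.clone();

            long t0 = System.nanoTime();
            selectionSort(data);
            long t1 = System.nanoTime();
            Arrays.sort(copy);
            long t2 = System.nanoTime();

            System.out.printf("selection sort: %d ms, Arrays.sort: %d ms%n",
                    (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        }
    }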
Do I need to learn C, C++ or Java first?
You need to know how to program in some language in order to understand what the algorithms and data structures do.
What are the practical benefits of knowing algorithms and data structures?
The main practical benefits are:
to avoid having to reinvent the wheel all of the time,
to avoid the problem of square wheels.

How to cultivate algorithm intuition?

When faced with a problem in software I usually see a solution right away. Of course, what I see is usually somewhat off, and I always need to sit down and design (admittedly, I usually don't design enough), but I get a certain intuition right away.
My problem is I don't get that same intuition when it comes to advanced algorithms. I feel much more up to the task of building another Facebook than building another Google search, or a Music Genome Project. It's probably because I've been building software for quite some time, but I have little experience with composing algorithms.
I would like the community's advice on what to read and what projects to undertake to be better at composing algorithms.
(This question has nothing to do with Algorithmic composition. Well, almost nothing)
+1 To whoever said experience is the best teacher.
There are several online portals which have a lot of programming problems, that you can submit your own solutions to, and get an automated pass/fail indication.
http://www.spoj.pl/
http://uva.onlinejudge.org/
http://www.topcoder.com/tc
http://code.google.com/codejam/contests.html
http://projecteuler.net/
https://codeforces.com
https://leetcode.com
The USACO training site is the training program that all USA computing olympiad participants go through. It goes step by step, introducing more and more complex algorithms as you go.
You might find it helpful to perform algorithms physically. For example, when you're studying sorting algorithms, practice doing each one with a deck of cards. That will activate different parts of your brain than reading or programming alone will.
Steve Yegge referred to "The Algorithm Design Manual" in one of his rants. I haven't seen it myself, but it sounds like it's just the ticket from his description.
My absolute favorite for this kind of interview preparation is Steven Skiena's The Algorithm Design Manual. More than any other book it helped me understand just how astonishingly commonplace (and important) graph problems are – they should be part of every working programmer's toolkit. The book also covers basic data structures and sorting algorithms, which is a nice bonus. But the gold mine is the second half of the book, which is a sort of encyclopedia of 1-pagers on zillions of useful problems and various ways to solve them, without too much detail. Almost every 1-pager has a simple picture, making it easy to remember. This is a great way to learn how to identify hundreds of problem types.
problem domain
First you must understand the problem domain. An elegant solution to the wrong problem is no good, nor, in most cases, is an inefficient solution to the right problem. Solution quality, in other words, is often relative. A simple scheduling problem that has a deterministic solution taking ten minutes to run may be fine if schedules are recalculated once per week, but if schedules change several times a day, then a genetic algorithm solution that converges in a few seconds may be required.
decomposition and mapping
Second, decompose the problem into sub-problems and known/unknown elements that correspond to elements of the solution. Sometimes this is obvious, e.g. to count widgets you need a way of identifying widgets, an incrementable counter, and a way of storing the count. Sometimes it is not so obvious. Sometimes you have to decompose the problem, the domain, and possible solutions at the same time and try several different mappings between them to find one that leads to the correct results [this is the general method].
model
Model the solution, in your head at least, and walk through it to see if it works correctly. Adjust as necessary (See decomposition and mapping, above).
composition/interfaces
Many times you can find elements of the problem and elements of the solution that map to each other and produce partial results that are useful. This composition and interface construction provides the kernel of the solution, and also serves to reduce the scope of the remaining problem. So then you just loop back to the top with a smaller initial problem and go through it again.
experience
Experience is the best teacher, of course, but reading about different kinds of problems and solutions will also be helpful. Studying some of the well-known algorithms and their applications is likewise very helpful, e.g. Dijkstra, Bresenham, Unification, and of course, graph theory.
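Since Dijkstra's algorithm comes up constantly, here is a compact textbook version in Java (adjacency lists plus a priority queue, skipping stale queue entries); the tiny graph in main is made up:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.PriorityQueue;

    // Standard Dijkstra shortest paths: O((V + E) log V) with a binary heap.
    public class Dijkstra {
        record Edge(int to, int weight) {}

        static long[] shortestPaths(List<List<Edge>> graph, int source) {
            long[] dist = new long[graph.size()];
            Arrays.fill(dist, Long.MAX_VALUE);
            dist[source] = 0;
            // Queue entries: {distance, vertex}, smallest distance first.
            PriorityQueue<long[]> pq = new PriorityQueue<>((x, y) -> Long.compare(x[0], y[0]));
            pq.add(new long[] {0, source});
            while (!pq.isEmpty()) {
                long[] top = pq.poll();
                long d = top[0];
                int v = (int) top[1];
                if (d > dist[v]) continue;  // stale entry, skip
                for (Edge e : graph.get(v)) {
                    if (d + e.weight() < dist[e.to()]) {
                        dist[e.to()] = d + e.weight();
                        pq.add(new long[] {dist[e.to()], e.to()});
                    }
                }
            }
            return dist;
        }

        public static void main(String[] args) {
            // Tiny 4-vertex example: 0->1 (1), 0->2 (4), 1->2 (2), 2->3 (1).
            List<List<Edge>> g = new ArrayList<>();
            for (int i = 0; i < 4; i++) g.add(new ArrayList<>());
            g.get(0).add(new Edge(1, 1));
            g.get(0).add(new Edge(2, 4));
            g.get(1).add(new Edge(2, 2));
            g.get(2).add(new Edge(3, 1));
            System.out.println(Arrays.toString(shortestPaths(g, 0))); // [0, 1, 3, 4]
        }
    }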
I am not sure intuition can be cultivated, but I think I know what you are asking. The more problems you solve, the more information and experience you have at your disposal for future problems. So, I say just practice. Practice programming real world applications and you run into plenty of problems. Sometimes, solving puzzles can be very educational as well.
I try to find physical analogues when I'm looking at a complex problem.

Resources