In Non-Deterministic Finite Automata (NFA), how is the next branch/transition selected when there are two or more transitions?

For an NFA, when there are two or more possible transitions from the current state, how does the machine decide which transition to take?
I was only able to find the "guess and verify" methodology, where we consider the machine to be clairvoyant and assume it always picks the correct path through the tree of possible computations.
Is this the only method? Could we also consider the machine as taking both transitions and existing in both states simultaneously?

You can consider it as trying all possible options: an NFA accepts a word if there is at least one path from an initial state to an accepting state labeled by that word.
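The "exist in both states simultaneously" view is in fact how NFAs are usually simulated in practice: track the set of all states the machine could currently be in. A minimal sketch in Python, with a hypothetical example automaton:

def nfa_accepts(transitions, start, accepting, word):
    """transitions: dict mapping (state, symbol) -> set of next states."""
    current = {start}                      # every state reachable so far
    for symbol in word:
        current = {nxt
                   for state in current
                   for nxt in transitions.get((state, symbol), set())}
        if not current:                    # every branch died
            return False
    return bool(current & accepting)       # accept if ANY branch accepts

# Example: accepts binary strings ending in "01".
transitions = {
    ("q0", "0"): {"q0", "q1"},             # the nondeterministic choice
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
print(nfa_accepts(transitions, "q0", {"q2"}, "10101"))  # True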

Related

Pushdown Automata for the Language {ww^R | w ∈ {0,1}*}

I am currently enrolled in the undergraduate version of Theory of Computation at my university and we are discussing Pushdown Automata, Turing Machines, and the Church-Turing Thesis.
In our homework problem sets, my professor asked the following question:
Construct a PDA for the language {ww^R | w ∈ {0,1}*}.
However, in our class lecture slides, he wrote the following
Not every nondeterministic PDA has an equivalent deterministic PDA. For example, there exists a nondeterministic PDA recognizing the following language, but no deterministic PDA can recognize it: {ww^R | w ∈ {0,1}*}.
So, my question is whether or not it is possible to write a deterministic PDA for this language. I have tried researching it online for the past two hours, but have found nothing that discusses this problem specifically.
There is no deterministic PDA for this language, which is the same as saying that there is no deterministic context-free grammar for it. A non-deterministic grammar, in ABNF meta-syntax, is:
a = ["0" a "0" | "1" a "1"]
The inability to build a deterministic PDA comes from the fact that the decision to accept or reject depends on the length of the input (the midpoint of the palindrome, in this case). A PDA has no machinery for making decisions based on length; it cannot detect that "half of the input has been consumed by now".
Imagine that for every input character "0"/"1" you push "0"/"1" onto the PDA's stack and remain in the same state. Then, whenever the input character is the same as the previous one, you face a decision: push it onto the stack and remain in the same state, or start recognizing the reverse of the previously read characters. There is no way to make that decision based only on the current input character, the character on top of the stack (which equals the current input character at this moment), and the current state of the PDA. You would need to know the length of the input: if you are at the midpoint you can start to accept the reverse; otherwise you remain in the same state.
There is nothing you can re-arrange to make a deterministic decision possible. The only way that remains is to explore both decision paths every time the choice arises, which is precisely nondeterminism.
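To make "explore both decision paths" concrete, here is a minimal brute-force simulation in Python (a sketch of the PDA's choices, not a formal PDA; the function names are mine): at each position the machine either keeps pushing or guesses that the midpoint has been reached and starts matching the input against the stack.

def accepts(s):
    """Nondeterministic check for s in { w w^R : w in {0,1}* }."""
    def explore(i, stack, matching):
        if i == len(s):
            return not stack                       # accept iff stack emptied
        paths = []
        if not matching:
            # Choice 1: keep reading w -- push the symbol.
            paths.append((i + 1, stack + s[i], False))
            # Choice 2: guess the midpoint -- start popping w^R.
            if stack and stack[-1] == s[i]:
                paths.append((i + 1, stack[:-1], True))
        elif stack and stack[-1] == s[i]:
            # Once matching, each symbol must equal the stack top.
            paths.append((i + 1, stack[:-1], True))
        return any(explore(*p) for p in paths)
    return explore(0, "", False)

print(accepts("0110"))   # True:  w = "01"
print(accepts("0111"))   # False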

In Reinforcement learning using feature approximation, does one have a single set of weights or a set of weights for each action?

This question is an attempt to reframe an earlier question of mine to make it clearer.
This slide shows an equation for Q(state, action) in terms of a single set of weights and feature functions: Q(s, a) = Σ_i w_i f_i(s, a).
These discussions (The Basic Update Rule and Linear Value Function Approximation) show a set of weights for each action.
The reason they are different is that the first slide assumes you can anticipate the result of performing an action and then find features for the resulting states. (Note that the feature functions are functions of both the current state and the anticipated action.) In that case, the same set of weights can be applied to all the resulting features.
But in some cases, one can't anticipate the effect of an action. Then what does one do? Even if one has perfect weights, one can't apply them to the results of applying the actions if one can't anticipate those results.
My guess is that the second pair of slides deals with that problem. Instead of performing an action and then applying weights to the features of the resulting states, compute features of the current state and apply possibly different weights for each action.
Those are two very different ways of doing feature-based approximation. Are they both valid? The first one makes sense in situations, such as Taxi, in which one can effectively simulate what the environment will do for each action. But in some cases, e.g. cart-pole, that's not possible or feasible. Then it would seem you need a separate set of weights for each action.
Is this the right way to think about it, or am I missing something?
Thanks.
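For concreteness, a minimal Python sketch of the two parameterizations being contrasted; the feature functions and dimensions here are hypothetical placeholders:

import numpy as np

# Style 1: one shared weight vector; features depend on (state, action).
# Q(s, a) = w . f(s, a) -- usable when features of the state-action pair
# (e.g. of the anticipated next state) can be computed.
def q_shared(w, features_sa, state, action):
    return w @ features_sa(state, action)

# Style 2: one weight vector per action; features of the state only.
# Q(s, a) = w_a . f(s) -- no model of the action's effect is needed.
def q_per_action(W, features_s, state, action):
    return W[action] @ features_s(state)

n_features, n_actions = 4, 2
rng = np.random.default_rng(0)
w = rng.normal(size=n_features)                 # shared weights
W = rng.normal(size=(n_actions, n_features))    # one row per action

features_s = lambda s: np.asarray(s, dtype=float)
features_sa = lambda s, a: np.asarray(s, dtype=float) * (a + 1)  # toy

state = [0.1, -0.3, 0.7, 0.2]
print(q_shared(w, features_sa, state, 1))
print(q_per_action(W, features_s, state, 1))

Note that the second style is a special case of the first: per-action weights are equivalent to a single weight vector over features f(s, a) that place f(s) in a block indexed by a, with zeros elsewhere.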

Using a genetic algorithm to generate test sequences based on extended finite state machines

I want to generate test sequences based on an extended finite state machine (EFSM) using a genetic algorithm. EFSM-based testing faces the problem of infeasible paths. My coverage criterion is transition coverage. I have an EFSM model of a system that has input parameters and guards on the transitions from one state to another. Using this EFSM model, I want to generate test sequences, but I am confused about how to start - that is, how to generate the initial population.
My research is about EFSM-based test case generation. I have a model of an ATM machine. This model consists of states and transitions; the transitions have guards and actions over the input parameters. Now I want to generate test cases for this machine, i.e. model-based testing. For this task it is essential that there are no infeasible paths - every transition should be covered by some test case. For this purpose I need to generate test sequences. A genetic algorithm is good for path optimization, but I don't know how to use my model specification in a genetic algorithm to generate test sequences.
Given the ramifications, I would simplify the random creation of the initial population by using a random walk in the graph of the FSM, ignoring the boolean guard constraints for now - this is like generating examples from a regex (or transforming your FSM into a transducer that copies its input to its output and walking through it). Once you have generated many random examples of sufficient length, you go through a process of validating them against the full EFSM. Since many of them will probably be invalid, you may consider some "fixing" strategy: repairing the individuals that fail validation but are not far from being correct (a heuristic you have to come up with on your own). Each individual is then a set of sequences (so you evolve a population of sets), and your evaluation metric is coverage at the set level. Additionally, I would either not use a crossover operator or ensure that only valid points and individuals cross. Mutation would be choosing a point in the graph and randomly taking a different path. That's a sketch of a solution (I successfully solved a similar problem with a GA); the initialization step is sketched below.
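A minimal sketch of that initialization in Python, assuming the FSM is given as an adjacency structure; the is_feasible check is a hypothetical placeholder for validation against the full EFSM (guards and actions):

import random

def random_walk(graph, start, max_len):
    """graph: dict mapping state -> list of (transition_label, next_state)."""
    state, path = start, []
    for _ in range(max_len):
        if not graph.get(state):
            break                                  # dead end: stop the walk
        label, state = random.choice(graph[state])
        path.append(label)
    return path

def initial_population(graph, start, pop_size, max_len, is_feasible):
    population = []
    while len(population) < pop_size:
        candidate = random_walk(graph, start, max_len)
        # Keep only walks the EFSM guards allow; infeasible ones could
        # instead be "repaired" by a problem-specific heuristic.
        if is_feasible(candidate):
            population.append(candidate)
    return population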

How to handle multiple optimal edit paths when implementing the Needleman-Wunsch algorithm?

I am trying to implement the Needleman-Wunsch algorithm for comparing biological sequences. In some circumstances there exist multiple optimal edit paths.
What is the common practice in sequence-comparison tools for handling this? Is there any priority or preference among substitution/insertion/deletion?
If I want to keep multiple edit paths in memory, is there a recommended data structure? Or, more generally, how does one store paths with branches and merges?
Any comments appreciated.
If two paths have identical scores, that means they are equally likely no matter which kinds of operations they used. Any priority for substitutions vs. insertions or deletions has already been accounted for in computing that score. So if two scores are the same, common practice is to break the tie arbitrarily.
You should be able to handle this by recording, in your traceback matrix, all the cells from which you could have arrived at the current one. Then, during traceback, start a separate branch whenever you come to a branching point. To allow for merges too, store some additional data about each cell (how will depend on the language you're using) indicating how many different paths left it; then, during traceback, wait at a given cell until that number of paths have arrived back at it, and merge them into one. You can follow the different branches with true parallel processing, or just alternate which one you advance. A sketch of the branching traceback follows below.
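A sketch of that scheme in Python with illustrative scoring parameters: each cell of the traceback matrix records every predecessor that achieves the optimal score, and the traceback branches at ties to enumerate all optimal alignments (without the merge bookkeeping described above):

def all_optimal_alignments(a, b, match=1, mismatch=-1, gap=-1):
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    back = [[[] for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):                      # first column: gaps in b
        score[i][0], back[i][0] = i * gap, [(i - 1, 0)]
    for j in range(1, m + 1):                      # first row: gaps in a
        score[0][j], back[0][j] = j * gap, [(0, j - 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            up, left = score[i-1][j] + gap, score[i][j-1] + gap
            best = score[i][j] = max(diag, up, left)
            if diag == best: back[i][j].append((i - 1, j - 1))
            if up == best:   back[i][j].append((i - 1, j))
            if left == best: back[i][j].append((i, j - 1))

    def walk(i, j):                                # branch at every tie
        if i == 0 and j == 0:
            yield ([], [])
            return
        for pi, pj in back[i][j]:
            ca = a[pi] if pi < i else "-"
            cb = b[pj] if pj < j else "-"
            for xs, ys in walk(pi, pj):
                yield (xs + [ca], ys + [cb])

    return [("".join(x), "".join(y)) for x, y in walk(n, m)]

print(all_optimal_alignments("GAT", "GT"))         # [('GAT', 'G-T')]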
Unless you have a reason to prefer one input sequence over the other in advance, it should not matter.
Otherwise, you might take seq_a as the vertical axis and seq_b as the horizontal axis, and always step in your preferred direction when there is a tie to break. But I'm not convinced this makes any difference to the alignment, even assuming one favors one of the starting sequences over the other.
Like a lot of similar algorithms, Needleman-Wunsch is essentially the task of finding the shortest path through a graph (a square grid in this case). So I would use A* to find an alignment, and store the possible paths as a dictionary keyed by the nodes passed through.

Intelligent purely functional sets

Set computations composed of unions, intersections and differences can often be expressed in many different ways. Are there any theories or concrete implementations that try to minimize the amount of computation required to reach a given answer?
For example, I first came across a practical application of this when trying to decompose the atoms in a simulation of an amorphous material into neighbor shells, where the first shell consists of the immediate neighbors of some given origin atom and each subsequent shell consists of the atoms that are neighbors of the previous shell but lie in neither of the two preceding shells:
nth 0 = singleton i
nth 1 = neighbors i
nth n = reduce union (map neighbors (nth (n-1))) - nth (n-1) - nth (n-2)
There are many different ways to compute this. You can incrementally test for membership in each set while composing the result, or you can compute a union spanning three neighbor shells and use set difference to remove the previous two shells, leaving the outermost one. In practice, solutions that require the construction of large intermediate sets are slower.
Presumably an intelligent set implementation could compose the expression that was to be evaluated and then optimize it (e.g. to reduce the size of intermediate sets) before evaluating it in order to improve performance. Do such set implementations exist?
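For reference, a direct (and hedged) Python rendering of the recurrence above, where neighbors is a hypothetical adjacency function mapping an atom to the set of its immediate neighbors; note that it eagerly builds exactly the intermediate union in question:

def shell(neighbors, i, n):
    """n-th neighbor shell around atom i, per the recurrence above."""
    prev2, prev1 = frozenset(), frozenset({i})     # nth(-1), nth(0)
    for _ in range(n):
        # The large intermediate union:
        expanded = frozenset().union(*(neighbors(a) for a in prev1))
        prev2, prev1 = prev1, expanded - prev1 - prev2
    return prev1

adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}  # path graph 0-1-2-3-4
print(shell(adj.get, 2, 2))                        # frozenset({0, 4})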
Your question immediately reminded me of Haskell's stream fusion, described in this paper. The general principle can be summarized quite easily: Instead of storing a list, you store a way to build a list. Then the list transformation functions operate directly on the list generator, meaning that all the operations fuse into a single generation of the data without any intermediate structures. Then when you are done composing operations you run the generator and produce the data.
So I think the answer to your question is that if you wanted some similarly intelligent mechanism that fused computations and eliminated intermediate data structures, you'd need to find a way to transform a set into a "co-structure" (that's what the paper calls it) that generates a set and operate directly on that, then actually generate the set when you are done.
I think there's a very deep theory behind this concept that the paper hints at but never spells out, and if somebody else here knows what it is, please let me know, because this is very relevant to something else I am doing, too!
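In the same spirit, a rough Python sketch of the fused approach (only an illustration, not the stream-fusion machinery from the paper): a set expression is kept as a membership predicate plus a candidate stream, and elements are materialized only once, at the very end.

from itertools import chain

class LazySet:
    def __init__(self, contains, candidates):
        self.contains = contains          # membership predicate
        self.candidates = candidates      # thunk yielding candidate elements

    def __or__(self, other):              # union: fuse the two predicates
        return LazySet(lambda x: self.contains(x) or other.contains(x),
                       lambda: chain(self.candidates(), other.candidates()))

    def __sub__(self, other):             # difference: fuse, build nothing
        return LazySet(lambda x: self.contains(x) and not other.contains(x),
                       self.candidates)

    def evaluate(self):                   # the single, final materialization
        return {x for x in self.candidates() if self.contains(x)}

def from_set(s):
    return LazySet(lambda x: x in s, lambda: iter(s))

a, b = from_set({1, 2, 3, 4}), from_set({3, 4, 5})
print(((a | b) - from_set({1, 5})).evaluate())     # {2, 3, 4}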
