Marker positioning in Linear Bounded Automata

So, we recently discussed the topic of Linear Bounded Automata in class, but we didn't exactly define them formally. After looking up definitions online, I am still confused about the positioning of the markers. Are they always supposed to lie at either end of the input string? Can they be further to the left/right, just as long as their distance is at most some constant multiple of the length of the input? If so, do we get to choose their position for every input string separately (for example, making sure the left marker is always right before the input string)?

A linear bounded automaton is a restricted form of a nondeterministic Turing machine. The restriction is that the tape is finite: it is bounded at both ends by markers. That is all there is to it.
How you place these markers is irrelevant to the automaton, as long as the distance between the left and right markers (i.e., the length of the tape) is a linear function of the input's length.
A Turing machine starts with a tape that has the input written on it "somewhere", with the tape head pointing at its first symbol. The LBA adds no new restriction here, so it remains: the input is somewhere. Because of the bounding markers, that somewhere is between the two markers; you may place the input's first symbol just after the left marker, or the input's last symbol just before the right marker. Neither placement is forbidden by the definition of an LBA.
How the automaton's states make use of a particular position of the input on the tape is another topic. In other words, you might need a specific placement of the input on the tape, depending on the automaton's states and transitions, or you will not get acceptance.
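For concreteness, here is one common formalisation, sketched in LaTeX; the marker symbols and the constant c are conventional choices, and textbooks differ on the details:

    \[
      M = (Q, \Sigma, \Gamma, \delta, q_0, \vdash, \dashv, F),
      \qquad \text{initial tape for } w = a_1 a_2 \cdots a_n :\quad
      \vdash a_1 a_2 \cdots a_n \dashv
    \]
    % The head may never move left of $\vdash$ nor right of $\dashv$,
    % and neither marker may be overwritten. Equivalently, $M$ may use
    % at most $c \cdot n$ tape cells for some fixed constant $c$: by
    % enlarging the tape alphabet $\Gamma$, any constant number of
    % cells can be packed into one, so the two definitions coincide.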

Related

Generate or find a shortest text given list of words

Let's say I have a list of 1000+ words and I would like to generate a text that includes these words from the list. I would like to use as few extra words outside of the list as possible. How would one tackle such a problem? Or alternatively, is there a way to efficiently search for a smaller portion of text containing these words the most, given some larger text (millions of words)? Basically, the resulting text from the search should be optimized to be shortest but to contain all the words from the list.
I am not sure how you'd like the text to be generated, so I'll attempt to answer the second question:
Is there a way to efficiently search for a smaller portion of text containing these words the most, given some larger text (millions of words)? Basically, the resulting text from the search should be optimized to be shortest but to contain all the words from the list.
This is obviously a computationally demanding endeavour, so I'll assume you are alright with spending, say, a gigabyte of RAM on this and some time (but maybe not too long). Since you are looking for the shortest continuous text which satisfies some condition, one can conclude the following:
If the text satisfies the condition, you want to shorten it.
If it doesn't, you want to make it longer so that hopefully it will start satisfying the condition.
Now, when it comes to the condition, it is whatever predicate that will say whether the continuous section of the text is "good enough" or not, based on some relatively simple statistics. For instance, the predicate could check if some cumulative index based on what ratio of the words from your list are included in the section, modified by the number of words from outside the list, is greater than some expected value.
What my mind races to when I see something like this is the sliding window technique, described in this article. I haven't read it closely, but at a glance it seems decent. The technique is also known as the caterpillar method, a particularly common name for it in Poland.
Basically, you have two pointers, a left pointer and a right pointer. Suppose you are looking for the shortest continuous fragment of a larger text such that the fragment satisfies some condition, and suppose the condition is monotone: if it is met for a fragment, it is also met for any larger fragment containing it. Then you advance the right pointer as long as the condition is unmet, and once it is met, you advance the left pointer until the condition stops holding. This repeats until the pointers reach the end of the text.
This is a neat technique: it iterates over the whole text exactly once, in linear time. An algorithm linear in the length of the text is clearly desirable in your case.
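A minimal sketch in Python, assuming a monotone predicate; the names and the statistics hooks are placeholders, not a prescription:

    def shortest_good_fragment(words, is_good):
        """Caterpillar / sliding-window scan over `words`.
        `is_good(left, right)` judges the inclusive fragment words[left..right]
        and must be monotone: once a fragment is good, every fragment
        containing it is good too. Returns the (left, right) bounds of the
        shortest good fragment, or None if none exists."""
        best = None
        left = 0
        for right in range(len(words)):
            # (update your statistics here for words[right] entering the window)
            while left <= right and is_good(left, right):
                if best is None or right - left < best[1] - best[0]:
                    best = (left, right)
                # (update your statistics here for words[left] leaving the window)
                left += 1
        return best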
Now, we have to consider the statistics you will be collecting. You will probably want to know how many words from the list, and how many words from outside of the list are present in a continuous fragment. An extra condition for these statistics is that they will need to be relatively easily modifiable (preferably in constant time, but that will be hard to achieve) every time one of the pointers advances.
In order to keep track of the words, we will use a hashmap from words to ordered collections of indices. In Java the hashmap is a HashMap, in C++ an unordered_map. The keys are strings representing words; the values are the positions at which each word appears in the text. One caveat: with Java's TreeSet or C++'s std::set, counting the elements between two given values is not actually logarithmic, since neither structure supports order statistics. But because you scan the text left to right, each word's positions arrive in increasing order, so a plain sorted array per word plus binary search gives the count of occurrences within a fragment in logarithmic time. Lookup in a hashmap is linear in the length of the key, which we can treat as constant since most words are under ten characters. So getting the number of times a word appears in a fragment of the text is easy and fast. Whether a word belongs to the given list can likewise be checked with a hashmap (or a hashset).
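The same bookkeeping in Python might look like the sketch below; text_words and word_list are placeholders for your tokenised text and your word list. Because positions are appended in scan order, each list stays sorted and bisect gives the logarithmic range count:

    from bisect import bisect_left, bisect_right
    from collections import defaultdict

    text_words = ["the", "cat", "sat", "on", "the", "mat"]   # placeholder text
    word_list = ["cat", "mat"]                               # placeholder list

    # positions[w] = sorted list of indices at which word w occurs
    positions = defaultdict(list)
    for i, w in enumerate(text_words):
        positions[w].append(i)           # appended in increasing order => sorted

    wanted = set(word_list)              # O(1) membership tests

    def count_in_fragment(w, left, right):
        """Occurrences of w in text_words[left..right], in O(log k)."""
        idx = positions.get(w, [])
        return bisect_right(idx, right) - bisect_left(idx, left)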
So let's get back to the statistics. Say you want to keep track of the number of words from inside and from outside your list in a given fragment. This can be achieved very simply:
Every time you add a word to the fragment by advancing its right end, you check if it appears in the list in constant time and if so, you add one to the "good words" number, and otherwise, you add one to the "bad words" number.
Every time you remove a word from the fragment by advancing the left end, you do the same but you decrement the counters instead.
Now if you want to track how many unique words from inside and from outside the list there are in the fragment, you will need to check, at every step, the number of times a given word occurs in the fragment. We established earlier that this can be done in logarithmic time, so the trick is simple: you only modify the counters if the number of appearances of a word in the fragment either
rose from 0 to 1 when advancing the right pointer, or
fell from 1 to 0 when advancing the left pointer.
Otherwise, you ignore the word, not changing the counters.
Additional memory optimisations include removing indices from the sets of indices when they are out of scope of the fragment and removing hashmap entries from the hashmap if a set of indices becomes empty.
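Putting the two counter rules together (reusing count_in_fragment and wanted from the sketch above), the per-step updates might look like this:

    def on_add(w, left, right, stats):
        """Window just grew to text_words[left..right]; w = text_words[right]."""
        in_list = w in wanted
        stats['good' if in_list else 'bad'] += 1
        # the unique counter moves only on a 0 -> 1 transition
        if in_list and count_in_fragment(w, left, right) == 1:
            stats['unique_good'] += 1

    def on_remove(w, left, right, stats):
        """Window just shrank to text_words[left..right]; w was at left - 1."""
        in_list = w in wanted
        stats['good' if in_list else 'bad'] -= 1
        # the unique counter moves only on a 1 -> 0 transition
        if in_list and count_in_fragment(w, left, right) == 0:
            stats['unique_good'] -= 1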
It is now up to you to perhaps find a better heuristic: other statistical values that you can track cheaply and feed into whatever predicate you intend to check. It is important, though, that whenever a fragment meets your condition, any bigger fragment containing it must meet it too.
In the case described above you could keep track of all the fragments which had at least... I don't know... 90% of the words from your list and from those choose the shortest one or the one with the fewest foreign words.

Chess programming: minimax, detecting repeats, transposition tables

I'm building a database of chess evaluations (essentially a map from a chess position to an evaluation), and I want to use this to come up with a good move for given positions. The idea is to do a kind of "static" minimax, i.e.: for each position, use the stored evaluation if evaluations for child nodes (positions after next ply) are not available, otherwise use max (white to move)/min (black to move) evaluations of child nodes (which are determined in the same way).
The problem is, of course, loops in the graph, i.e. repeating positions. I can't fathom how to deal with this without making this infinitely less efficient.
The ideas I have explored so far are:
assume an evaluation of 0 for any position that can be reached in a game with fewer moves than are currently evaluated. This is an invalid assumption because, for example, if White plays A, it might not be desirable for Black to follow up with x; but if White plays B, then y -> A -> x -> -B -> -y might be the best line, resulting in the same position as A -> x without any repetitions (-m denoting the inverse of move m here; lower case: Black moves, upper case: White moves).
having one instance for each possible way a position can be reached solves the loop problem, but this yields a bazillion instances for some positions and is therefore not practical
the fact that there is a loop from a position back to itself doesn't mean that it's a draw by repetition, because playing the repeating line may not be the best choice
I've tried iterating through the loops a few times to see if the overall evaluation would become stable. It doesn't, because in some cases, assuming the repeat is the best line means it no longer is, and then it flips back to the draw being the best line, and so on.
I know that chess engines use transposition tables to detect positions already reached before, but I believe this doesn't address my problem, and I actually wonder if there isn't an issue with them: a position may be reachable through two paths in the search tree - one of them going through the same position before, so it's a repeat, and the other path not doing that. Then the evaluation for path 1 would have to be 0, but the one for path 2 wouldn't necessarily be (path 1 may not be the best line), so whichever evaluation the transposition table holds may be wrong, right?
I feel sure this problem must have a "standard / best practice" solution, but Google failed me. Any pointers / ideas would be very welcome!
I don't understand what the problem is. A minimax evaluation, unless we've added randomness to it, will have exactly the same result for any given board position combined with whose turn it is and the other key info. If we have the space available to store common board_position+whose_turn+castling+en_passant+draw_related tuples (or hashes thereof), go right ahead. When reaching such a tuple in any other evaluation, just return the stored value, or rely on its more detailed record for more complex evaluations (if the search that yielded the record was not exhaustive, different evaluations may interpret it differently). If the program also plays chess with time limits on the game, an additional time dimension (maybe a few broad blocks) would probably be needed in the memoisation as well.
(I assume you've read common public info about transposition tables.)
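A minimal sketch of the memoisation described above, in Python; position_key, legal_moves, apply_move and stored_eval are hypothetical helpers standing in for your database and move generator, and a real transposition table would also record the search depth that produced each value:

    # Hypothetical helpers: position_key must fold in castling rights, the
    # en passant square and draw-relevant counters, or two genuinely
    # different states would collide; side to move is added to the key below.
    cache = {}

    def evaluate(pos, white_to_move, depth):
        key = (position_key(pos), white_to_move)
        if key in cache:
            return cache[key]
        if depth == 0:
            return stored_eval(pos)              # fall back to the database value
        children = [apply_move(pos, m) for m in legal_moves(pos)]
        if not children:                         # mate/stalemate: use stored value
            return stored_eval(pos)
        pick = max if white_to_move else min
        value = pick(evaluate(c, not white_to_move, depth - 1) for c in children)
        cache[key] = value
        return value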

Can you construct a deterministic infinite RNG with just a few known parameters?

I'm currently using a cone-based random walk with reflections at the boundaries (denoted R[n]), with the following properties:
R[n] is always in a user-defined range (called the boundaries) [a, b] (or [-a, a] if that's easier)
R[0] is defined by the user
|R[n]-R[n-1]|<d for some d <= b - a (this is the cone property)
If the generated R[n] falls outside the boundaries, reflect it across the nearest boundary so that no probability mass is accumulated at the edge
[Figure: a sample walk; R[0] is labelled "R", the red points are reflections, and the dashed lines represent the "cone".]
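A sketch of the process as described, in Python; the uniform step distribution is an assumption, since the question only constrains the step size to be below d:

    import random

    def reflect(x, a, b):
        """Fold x back into [a, b], mirroring across whichever boundary it crossed."""
        while x < a or x > b:
            x = 2 * a - x if x < a else 2 * b - x
        return x

    def cone_walk(r0, a, b, d, n, rng=random):
        """R[0..n]: each step moves by less than d, then reflects at the boundaries."""
        walk = [r0]
        for _ in range(n):
            walk.append(reflect(walk[-1] + rng.uniform(-d, d), a, b))
        return walk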
This is a very nice process for several reasons:
It uniformly walks the range
It has a well-defined expected value, namely (a+b)/2, the midpoint of the range
It's not quite as chaotic as Uni[a, b], which is nice for modelling real-world drifts in, e.g., sensor error.
However, the one flaw of this method is that to reconstruct the walk, you need to record every single point of the walk. I'd like to have a process that has these properties, but can also be regenerated using just a few initial parameters.
Is this possible?
You can do it with a "few" parameters provided at least one of those parameters has an infinite number of bits: for an infinite PRNG, the generator must be capable of an infinite number of possible states.
Given that your computer only has finite memory, you will have to content yourself with a large, but finite, number of states. Once the PRNG has cycled through all possible states it will start to repeat, because it is a deterministic machine.
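Concretely, within that finite-state caveat, seeding a PRNG already gives you this: the walk from the earlier sketch is then fully determined by (seed, R0, a, b, d, n), so no points need to be stored. A sketch:

    import random

    rng1 = random.Random(12345)   # the seed is one of the "few" parameters
    rng2 = random.Random(12345)
    w1 = cone_walk(0.0, -1.0, 1.0, 0.25, 1000, rng1)
    w2 = cone_walk(0.0, -1.0, 1.0, 0.25, 1000, rng2)
    assert w1 == w2               # bit-identical reconstruction from parameters alone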

Why isn't the prior state vector in the forward-backward algorithm the eigenvector of the transition matrix that has an eigenvalue of 1?

Wikipedia says you have no knowledge of what the first state is, so you have to assign each state equal probability in the prior state vector. But you do know what the transition probability matrix is, and the eigenvector of that matrix with an eigenvalue of 1 is the frequency of each state in the HMM (I think), so why don't you go with that vector for the prior state vector instead?
This is really a modelling decision. Your suggestion is certainly possible: it pretty much corresponds to prefixing the observations with a long stretch in which the hidden states are not observed at all or have no effect, which gives the state distribution time to settle down to the equilibrium distribution, whatever the initial states were.
But if you have a stretch of observations with a delimited start, such as a segment of speech that starts when the speaker starts, or a segment of text that starts at the beginning of a sentence, there is no particular reason to believe that the distribution of the very first state is the same as the equilibrium distribution: I doubt very much if 'e' is the most common character at the start of a sentence, whereas it is well known to be the most common character in English text.
It may not matter very much what you choose, unless you have a lot of very short sequences of observations that you are processing together. Most of the time I would only worry if you wanted to set one of the state probabilities to zero, because the EM algorithm or Baum-Welch algorithm often used to optimise HMM parameters can be reluctant to re-estimate parameters away from zero.
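For reference, a small numpy sketch of the suggestion in the question: the prior proposed there is the left eigenvector of the transition matrix with eigenvalue 1 (the equilibrium distribution), versus the uniform prior Wikipedia describes. The matrix here is made up for illustration:

    import numpy as np

    # Toy transition matrix: P[i, j] = probability of moving from state i to j.
    P = np.array([[0.9, 0.1],
                  [0.5, 0.5]])

    # The equilibrium distribution pi satisfies pi @ P = pi, i.e. pi is a
    # left eigenvector of P (an eigenvector of P.T) with eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.isclose(vals, 1.0))])
    pi /= pi.sum()                       # normalise to a probability vector

    uniform = np.full(len(P), 1.0 / len(P))
    print(pi, uniform)                   # ~[0.833 0.167] vs [0.5 0.5]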

On the bounding sets in RFC 5104

In section 3.5.4.2 of RFC 5104, an algorithm is given for deriving the bounding set of a set of lines. Each line is of the form y = mx + b, and the objective is to find the points of intersection that identify the convex hull (equivalently, to determine which receivers are relevant for bitrate adaptation in RTP media sessions). The following observation is taken from the RFC:
These observations lead to the conclusion that when processing the TMMBR tuples to select the initial bounding set, one should sort and process the tuples by order of increasing overhead. Once a particular tuple has been added to the bounding set, all tuples not already selected and having lower overhead can be eliminated, because the next side of the bounding polygon has to be steeper (i.e., the corresponding TMMBR must have higher overhead) than the latest added tuple.
I don't think this is correct. Assume you have a line such as the one labelled 'a' in Figure 1 of the RFC. It is possible to draw a line with a larger slope, such as a line 'b', that starts on the Y-axis below line 'a'. In other words, if line 'b' has a lower intercept on the Y-axis, you should consider line 'b' first. But if this is true, the rest of the algorithm does not work.
I believe the key here, and indeed the reason why the greedy approach works at all, is the sort ordering. While your intuitive claim is correct (that is, a line exists that would invalidate the proof of this solution), the sort order guarantees it is never encountered in a situation where it would be problematic.
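To make the sort-then-eliminate idea concrete, here is a generic sketch of the envelope computation in Python. It is the standard technique that the RFC's greedy instantiates, not the RFC's own pseudocode, and depending on the orientation of the RFC's figure you may need the lower rather than the upper envelope (flip the inequality):

    def bounding_envelope(lines):
        """lines: (m, b) pairs for y = m*x + b. Returns the subset forming the
        upper envelope, in order of increasing slope: each retained line is
        steeper than the previous one, mirroring the RFC's observation that
        the next side of the bounding polygon must be steeper."""
        lines = sorted(lines)                 # by slope, then by intercept
        hull = []
        for m, b in lines:
            if hull and hull[-1][0] == m:
                hull.pop()                    # equal slopes: the larger intercept wins
            # pop lines that are nowhere on the envelope once (m, b) arrives
            while len(hull) >= 2:
                (m1, b1), (m2, b2) = hull[-2], hull[-1]
                if (b2 - b) * (m2 - m1) <= (b1 - b2) * (m - m2):
                    hull.pop()
                else:
                    break
            hull.append((m, b))
        return hull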
Wikipedia has a great set of references on solutions to this problem. For a more elegant, closed-form treatment of the problem and its variants, consider asking about the finer points more abstractly on http://cs.stackexchange.com or http://math.stackexchange.com.
Best of luck with retrofitting RTP to meet your needs.
