How to formally describe this algorithm for a Turing machine? - algorithm

Warning: This task was given by my professor, who is 80 y/o, and nobody understands what he sometimes wants. I do not expect a more or less standard approach to this problem, not just because the problem is difficult, but because my professor is an old-school ex-USSR crazy guy ;) (he likes to make easy things complicated; I mention this just to explain why this is posted here).
This task is purely theoretical, but I do not know how to formalize it in words.
Problem:
A 9-bit binary code is given as input. We have to print "0" as output
if the number of bits with value "1" is half the number of
bits with value "0"; if this condition is false, we have to print
"1" as output.
What I proposed in my description is to introduce a counter, count the bits that have value 1, and then produce the output based on this counter. But I was called an idiot and told that there is a way without a counter and that I chose the hardest way. Does someone know a better way to determine what to output?
Thanks in advance, and sorry if the description looks messy.

As the TM reads the input bits, the state must capture two things: the number of bits seen so far, from 0 through 9, so that we can recognize when we reach the end; and the number of 1 bits seen, with the relevant cases being 0, 1, 2, 3, and >=4.
At most 10*5=50 states are required to encode all the relevant possibilities (fewer in practice, since some combinations, such as four 1s after two bits, can never occur). When the machine enters one of the states indicating that 9 input bits have been seen, it writes a 0 if that state indicates that exactly three 1s have been seen, or a 1 otherwise, then stops.
Note that we didn't need to use the tape for storage -- the input language is regular so it can be decided with a finite state machine and unbounded storage is unnecessary.
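As a rough illustration of the state encoding described above, here is a minimal Python sketch that simulates the finite-state control. The function name run_machine and the tuple representation of a state are assumptions made for this example; a real TM description would list the states and transitions explicitly.

```python
def run_machine(bits: str) -> str:
    """Simulate the finite-state control on a 9-bit input string.

    A state is the pair (bits_seen, ones_seen), with ones_seen capped at 4,
    so at most 10 * 5 = 50 distinct states can ever be needed.
    """
    if len(bits) != 9 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 9 characters, each '0' or '1'")

    state = (0, 0)  # start state: no bits seen, no 1s seen
    for b in bits:
        seen, ones = state
        # Transition: one more bit seen; count 1s but never beyond ">= 4".
        state = (seen + 1, min(ones + (b == "1"), 4))

    # In a "9 bits seen" state: write 0 iff exactly three 1s (so six 0s).
    return "0" if state[1] == 3 else "1"
```

For example, run_machine("000111000") returns "0", because the input has three 1s and six 0s.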

While Matt is correct, you can generalize this problem to arbitrary input sizes using storage.
1. Go to the beginning of the tape.
2. Move forward looking for an unmarked 1. Mark it. If you can't find an unmarked 1, go to step 7.
3. Go back to the beginning of the tape.
4. Look for an unmarked 0. Mark it. If you can't find an unmarked 0, go to step 9.
5. Look for another unmarked 0. Mark it. If you can't find an unmarked 0, go to step 9.
6. Go to step 1.
7. Go to the beginning of your input.
8. Look for an unmarked 0. If you don't find one, output 0. Halt.
9. Output 1. Halt.
This will work for any input size. Intuitively, we're looking for 2 0s for every 1 in the input, making sure there are twice as many 0 bits as 1 bits.
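The marking procedure can be sketched in Python as follows. This is only a simulation of the idea, not a Turing machine: the list cells stands in for the tape, the symbol "X" stands in for a mark, and the names twice_as_many_zeros and mark are made up for this example.

```python
def twice_as_many_zeros(tape: str) -> str:
    """Return "0" if the input has exactly two 0s per 1, else "1"."""
    cells = list(tape)  # stand-in for the tape contents

    def mark(symbol: str) -> bool:
        """Scan from the left, mark the first unmarked `symbol` if any."""
        for i, c in enumerate(cells):
            if c == symbol:
                cells[i] = "X"
                return True
        return False

    while True:
        if not mark("1"):
            # No unmarked 1 remains: any leftover unmarked 0 means
            # there were too many 0s.
            return "1" if "0" in cells else "0"
        # Each 1 must be paid for with two 0s.
        if not mark("0") or not mark("0"):
            return "1"
```

Each pass rescans from the start of the tape, so the simulation takes quadratic time in the input length, which mirrors what a single-tape machine would do with this strategy.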

Related

Is there a better algorithm for finding the longest sequence of a same letter in a string?

I've been challenging myself to look at algorithms and try to change them in order to make them as fast as I can. Recently I tried an algorithm which searches for the longest sequence of any one letter in a string. The naive answer looks at all letters, and whenever the current sequence becomes longer than the longest sequence found so far, the current one becomes the new longest. Example:
With C for current sequence and M for maximum sequence, order of letters checked and variables updates goes like this:
AAAACCDDD -> A(C=1,M=1) -> A(C=2,M=2) -> A(C=3,M=3) -> A(C=4,M=4) -> C(C=1,M=4) -> C(C=2,M=4) -> D(C=1,M=4) -> D(C=2,M=4) -> D(C=3,M=4)
Answer: 4
It can be made faster by stopping when there is no way to get a new longest sequence, given M, your position in the string, and the string size.
I've tried and came up with an algorithm which usually accesses fewer elements of the string. I think it will be easier to explain like this:
Instead of advancing 1 by 1, you jump as far as would be necessary to have a new longest sequence if all letters across the jump were the same. So for example, after you read AAAB, you would jump 3 spots because you suppose the next 3 letters are all B (AAABBBB). Of course they might not be, and that is why you now go backwards counting consecutive B's right behind your position. Your next "jump" will be shorter depending on how many B's you've found. So for instance:
AAABCBBBBD: after the jump you are at the third B. You go backwards and find one B; you go backwards again and, finding a C, you stop. Now you already know you have a sequence of 2, so your next jump can't be 3, or you might miss a sequence of 4 B's. So you jump 2 and land on a B. Go backwards one and find a B. The next backwards position is where you started, so you know that you found a sequence of 4.
In that example it didn't make much of a difference, but if you instead use a string like AAABBBBCDDECEE you can see that after you jump from the first C to the last C you only need to backtrack once: after seeing that the letter behind you is E, you no longer care about what was across that jump.
I've coded both methods and the second one has been 2 to 3 times faster. Now I'm really curious to know: is there a faster way to find it?
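The two approaches described above might look roughly like this in Python. This is one possible reading of the jump idea, not the asker's actual code; the names longest_run_naive and longest_run_jump and the start/end bookkeeping are assumptions of this sketch.

```python
def longest_run_naive(s: str) -> int:
    """Scan every character, tracking the current (C) and maximum (M) run."""
    best = current = 0
    prev = None
    for ch in s:
        current = current + 1 if ch == prev else 1
        best = max(best, current)
        prev = ch
    return best


def longest_run_jump(s: str) -> int:
    """Jump ahead by the record length and verify backwards."""
    n = len(s)
    if n == 0:
        return 0
    best = 1
    start = end = 0  # s[start..end] is a verified run that begins at start
    while start + best < n:
        p = start + best  # a run from start that beats `best` must reach p
        j = p
        # Walk backwards over characters equal to s[p], but never into the
        # part that is already verified.
        while j - 1 > end and s[j - 1] == s[p]:
            j -= 1
        if j - 1 == end and s[end] == s[p]:
            # s[start..p] is uniform: new record. Extend it forwards.
            while p + 1 < n and s[p + 1] == s[p]:
                p += 1
            best = p - start + 1
            start = end = p + 1
        else:
            # The run containing p begins at j; shorter runs in between
            # cannot beat the record, so they are skipped.
            start, end = j, p
    return best
```

The loop ends as soon as start + best reaches the end of the string, which is the early-stopping rule mentioned for the naive version.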

Binary Number Having same number of 0s and 1s [duplicate]

Given a binary string or a binary number (one is free to take it either way), I need to find the next smaller binary number that retains the number of 0s and 1s of the original binary string or number.
For example:
If the given binary number or string was 11100000, the required output would be 11010000.
If the given binary number or string was 11010000, the required output would be 11001000.
Of course, I can do this with a brute-force approach, but I need a better solution. What would be an optimal way of doing it? I was wondering if someone can help me reach a solution in O(1) using bitwise operations.
This is an elaboration on Setzer22's answer, which was close but which lacked one vital piece.
FindNextSmallestWithSameNumberOfBits(string[1...n])
1. for i = n - 1 to 1 do
2.     if string[i+1] = 0 and string[i] = 1 then
3.         string[i] := 0
4.         string[i+1] := 1
5.         sort(string[i+2...n], descending)
6.         return string[1...n]
7. return "no solution"
This is an O(n) algorithm, which is a provably optimal asymptotic bound for this problem when the input size is unrestricted; while this is "bitwise" in the sense that it operates on bits, it clearly doesn't use what one would typically think of as "bitwise operations." Luckily, for inputs which can be of arbitrary length, there can be no asymptotic advantage to using traditional "bitwise operations" over this method. For inputs of fixed length, to which asymptotic analysis does not readily apply, one might do better using a technique such as those linked to by Asuka in the other answer to this question.
Note, based on comments, that sorting on line 5 can be replaced with simply reversing the string. The reason for this is that this substring is guaranteed to be of the form 0...01...1 (that is, any 0s to the left of any 1s) since, if it weren't, we'd have already found an occurrence of the string 10 and satisfied the condition on line 2.
The key that was missing in Setzer22's answer is that, once you right-shift the rightmost 1 that has a 0 immediately to its right, you then need to left-shift all the 1s that are even further right as far left as they will go. The reason for this is that the 1 bit shifted to the right is more significant than the bits to the right of it, so left-shifting any 1s which are less significant will give a larger number, but not large enough to undo the effect of reducing the more significant bit.
Clarification based on comments: notice that in line 7 of the pseudocode presented above, it's possible that the algorithm won't return a valid string. The reason for this is that, sometimes, there is no string with the same number of 1s which represents a smaller number. This occurs if and only if the string 10 does not appear as a substring in the input string (in which case the condition on line 2 is never satisfied).
This isn't the clearest explanation of all time, so please let me know if it needs more work. Here's an example:
10011 // input
01011 // right-shift the right-most 1 bit with a 0 to the right of it
01110 // left-shift all 1 bits to the right of the right-shifted bit as far as possible
1010100011 // input
1010010011 // right-shift the right-most 1 bit with a 0 to the right of it
1010011100 // left-shift all 1 bits to the right of the right-shifted bit as far as possible
One way to clarify this which just occurred to me: right-shifting the 1 bit guarantees that the result will be smaller than the original number; left-shifting the 1s to the right of it guarantees that the result will be no smaller than is necessary.
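For reference, the pseudocode above could be written in Python roughly as follows, using the reversal shortcut instead of a sort. The function name is hypothetical, the indices are 0-based rather than 1-based, and None plays the role of "no solution".

```python
from typing import Optional


def next_smaller_same_bits(bits: str) -> Optional[str]:
    """Largest binary string below `bits` with the same number of 1s."""
    s = list(bits)
    # Look for the rightmost "10" pair, scanning from the right.
    for i in range(len(s) - 2, -1, -1):
        if s[i] == "1" and s[i + 1] == "0":
            # Right-shift that 1 by one position ...
            s[i], s[i + 1] = "0", "1"
            # ... then move the remaining 1s as far left as possible.
            # The tail has the form 0...01...1, so reversing it sorts it
            # in descending order.
            s[i + 2:] = reversed(s[i + 2:])
            return "".join(s)
    return None  # no "10" substring: nothing smaller exists
```

For the inputs used in the examples, next_smaller_same_bits("10011") gives "01110" and next_smaller_same_bits("1010100011") gives "1010011100".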
Maybe this is what you are looking for:
https://github.com/hcs0/Hackers-Delight/blob/master/snoob.c.txt
The functions snoob(), snoob1(), snoob2(), snoob3(), snoob4() and next_set_of_n_elements() are various implementations.
The following are helper functions called by the functions above:
ntz() stands for "number of trailing zeros"
nlz() stands for "number of leading zeros"
pop() stands for "population count" (the number of set bits, i.e. the number of "1"s, in the string)
This is very efficient but only works on fixed-size integers (e.g. 32-bit, 64-bit).
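A hedged sketch of how such a bit trick could be adapted to this question: as far as I know, snoob() in Hacker's Delight returns the next larger number with the same number of 1 bits, whereas the question asks for the next smaller one. Within a fixed width, one way to bridge the gap is to complement the input, take the next larger value, and complement back. The function names and the width parameter below are assumptions of this example, not part of the linked file.

```python
from typing import Optional


def next_larger_same_popcount(x: int) -> int:
    """Next larger integer with the same number of 1 bits (requires x > 0)."""
    smallest = x & -x                      # lowest set bit
    ripple = x + smallest                  # carry through the low block of 1s
    ones = ((x ^ ripple) >> 2) // smallest # leftover 1s, moved to the bottom
    return ripple | ones


def next_smaller_same_popcount(x: int, width: int) -> Optional[int]:
    """Next smaller `width`-bit integer with the same number of 1 bits."""
    mask = (1 << width) - 1
    y = ~x & mask                          # complement within the fixed width
    if y == 0:
        return None                        # x is all 1s: nothing smaller
    z = next_larger_same_popcount(y)
    if z > mask:
        return None                        # x is already the smallest arrangement
    return ~z & mask
```

For instance, next_smaller_same_popcount(0b11100000, 8) yields 0b11010000, matching the first example in the question.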

"Charge changing" algorithm

First of all, I'm not sure how to name this problem. If anyone has a better idea, feel free to change the title or tell me to do so.
Let's say I have two strings s1, s2 containing '+' and '-', which mean positive and negative charge.
s1 is our initial input, s2 is the pattern we want to get from s1. Our only operation is to change a charge into its opposite. But when we do so, not only the chosen charge is changed but also the charges next to it (left and right; the first character has no left neighbour and the last has no right neighbour).
1. When is it not possible to get from s1 to s2?
2. How do we find the minimum number of charge changes needed to transform s1 into s2?
For point 1, I believe the only case is when the string has length 2 and the total number of '+' (or '-') is odd. For instance
in:"+-"
pattern:"++"
Otherwise it's possible, but a proof would be appreciated. As for point 2, I have no idea; any hints are welcome.
Your intuition for when the problem is solvable isn't quite right. Half of all instances are insoluble whenever n = 2 (mod 3). One way to see this is by doing a few steps of reducing the appropriate system of equations (mod 2). Another way to see that there's some redundancy is to see that flipping the first, fourth, seventh, ... (n-1)st affects exactly the same set of characters as flipping the second, fifth, eighth, ... nth.
As for an algorithm for solving these problems: There are two possible choices for the first flip. Once you've decided whether to flip around the first character, the value of the first character tells you whether you need to flip around the second character. Then the value of the second character tells you whether to flip around the third character. And so forth. So just try both possibilities. If neither one works, the problem's insoluble; if one works, report it; if both work, report the one that required fewer flips.
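A minimal Python sketch of this "try both choices for the first flip" idea follows. The name min_charge_flips is made up for this example, and None is used to signal an insoluble instance.

```python
from typing import Optional


def min_charge_flips(s1: str, s2: str) -> Optional[int]:
    """Minimum number of flips turning s1 into s2, or None if impossible.

    Flipping around position i toggles characters i-1, i and i+1
    (where those positions exist).
    """
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    n = len(s1)
    if n == 0:
        return 0
    # diff[i] is True where character i must end up toggled.
    diff = [a != b for a, b in zip(s1, s2)]

    best = None
    for first in (False, True):          # the two choices for the first flip
        flips = [first]
        for i in range(n - 1):
            prev = flips[i - 1] if i > 0 else False
            # Character i is touched by flips i-1, i and i+1; the first two
            # are fixed already, so flip i+1 is forced.
            flips.append(diff[i] ^ prev ^ flips[i])
        # The last character has no right neighbour: check it is consistent.
        prev = flips[n - 2] if n > 1 else False
        if (prev ^ flips[n - 1]) == diff[n - 1]:
            count = sum(flips)
            best = count if best is None else min(best, count)
    return best
```

With the example from the question, min_charge_flips("+-", "++") returns None, while min_charge_flips("+++", "---") returns 1 (flip around the middle character).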

Fewest toggles to create an alternating chain

I'm trying to solve this problem on SPOJ : http://www.spoj.pl/problems/EDIT/
I'm trying to get a decent recursive description of the algorithm, but I'm failing as my thoughts keep spinning in circles! Can you guys help me out with this one? I'll try to describe what approach I'm trying to solve this.
Basically I want to solve a problem of size j-i where i is the starting index and j is the ending index. Now, there should be two cases. If j-i is even then both the starting and the ending letters have to be the same case, and they have to be the opposite case when j-i is odd. I also want to reduce the problem of a lower size (j-i-1 or j-i-2), but I feel that if I know a solution to a smaller problem, then constructing a solution of a just bigger problem should also take into account the starting and ending letter cases of the smaller problem. This is exactly where I'm getting confused. Can you guys put my thoughts on the right track?
I think recursion is not the best way to go with this problem. It can be solved quite fast if we take a different approach!
Let us consider binary strings. Say an uppercase char is 1 and a lowercase one is 0. For example
AaAaB -> 10101
ABaa -> 1100
a -> 0
a "correct" alternating chain is either 10101010.. or 010101010..
We call the minimum number of substitutions required to change one string into the other the Hamming distance between the strings. What we have to find is the minimum Hamming distance between the input binary string and one of the two alternating chains of the same length.
It's not difficult: we XOR the input with each of the two alternating chains and then count the number of 1s in each result. For example, let's consider the following string: ABaa.
We convert it in binary:
ABaa -> 1100
We generate the only two alternating chains of length 4:
1010
0101
We XOR them with the input:
1100 XOR 1010 = 0110
1100 XOR 0101 = 1001
We count the 1s in the results and take the minimum. In this case, it's 2.
I coded this procedure in Java with some minor optimizations (buffered I/O, no real need to generate the alternating chains) and it got accepted (in 0.60 seconds).
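The answer's Java code isn't shown, so here is a small Python sketch of the XOR-and-count procedure as described. The function name min_toggles is an assumption, and for clarity the two alternating chains are generated explicitly, which the answer notes is not strictly necessary.

```python
def min_toggles(s: str) -> int:
    """Fewest case toggles needed to make `s` an alternating chain."""
    n = len(s)
    if n == 0:
        return 0
    # Uppercase -> 1, lowercase -> 0, read as one binary number.
    value = int("".join("1" if c.isupper() else "0" for c in s), 2)
    mask = (1 << n) - 1
    chain_upper_first = int(("10" * n)[:n], 2)   # 1010...
    chain_lower_first = chain_upper_first ^ mask # 0101...
    # Hamming distance = number of 1s in the XOR.
    return min(
        bin(value ^ chain_upper_first).count("1"),
        bin(value ^ chain_lower_first).count("1"),
    )
```

For the worked example, min_toggles("ABaa") returns 2.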
Given any string s of length n, there are only two possible "alternating chains".
These 2 variants are fully determined by setting the case of the first letter (if the first is upper then the second is lower, the third is upper, ...).
A simple linear algorithm would be to make 2 simple assumptions about the first letter:
1. The first letter is uppercase
2. The first letter is lowercase
For each assumption, run a simple edit distance algorithm and you are done.
You can do it recursively, but you'll need to pass and return a lot of state information between functions, which I think is not worthwhile when this problem can be solved by a simple loop.
As the others say, there are two possible "desired result" strings: one starts with an uppercase letter (let's call it result_U) and one starts with a lowercase letter (result_L). We want the smaller of EditDistance(input, result_U) and EditDistance(input, result_L).
Also observe that, to calculate EditDistance(input, result_U), we do not need to generate result_U, we just need to scan input 1 character at a time, and each character that is not the expected case will need 1 edit to make it the correct case, i.e. adds 1 to the edit distance. Ditto for EditDistance(input, result_L).
Also, we can combine the two loops so that we scan input only once. In fact, this can be done while reading each input string.
A naive approach would look like this:
Pseudocode:
EditDistance_U = 0
EditDistance_L = 0
Repeat until end of string:
    Read a character
    To arrive at result_U, does this character need editing?
        Yes => EditDistance_U += 1
        No => Do nothing
    To arrive at result_L, does this character need editing?
        Yes => EditDistance_L += 1
        No => Do nothing
EditDistance = min(EditDistance_U, EditDistance_L)
There are obvious optimizations that can be done to the above also, but I'll leave it to you.
Hint 1: Do we really need 2 conditionals in the loop? How are they related to each other?
Hint 2: What is EditDistance_U + EditDistance_L?

How to compute palindrome from a stream of characters in sub-linear space/time?

I don't even know if a solution exists or not. Here is the problem in detail. You are a program that is accepting an infinitely long stream of characters (for simplicity you can assume characters are either 1 or 0). At any point, I can stop the stream (let's say after N characters have passed through) and ask you whether the string received so far is a palindrome. How can you do this using sub-linear space and/or time?
Yes. The answer is about two-thirds of the way down http://rjlipton.wordpress.com/2011/01/12/stringology-the-real-string-theory/
EDIT: Some people have asked me to summarize the result, in case the link dies. The link gives some details about a proof of the following theorem: There is a multi-tape Turing machine that can recognize initial non-trivial palindromes in real-time. (A summary, also provided by the article linked: Suppose the machine has read x1, x2, ..., xk of the input. Then it has only constant time to decide if x1, x2, ..., xk is a palindrome.)
A multitape Turing machine is just one with several side-by-side tapes that it can read and write to; in a very specific sense it is exactly equivalent to a standard Turing machine.
A real-time computation is one in which a Turing machine must read a character from input at least once every M steps (for some bounded constant M). It is readily seen that any real-time algorithm should be linear-time, then.
There is a paper on the proof which is around 10 pages which is available behind an institutional paywall here which I will not repost elsewhere. You can contact the author for a more detailed explanation if you'd like; I just had read this recently and realized it was more or less what you were looking for.
You could use a rolling hash, or more rolling hashes for accuracy. Incrementally compute the hash of the characters read so far, in the order they were read, and in reverse order of reading.
If your hash function is x_1*3^(k-1)+x_2*3^(k-2)+...+x_k*3^0 for example, where x_i is the i-th character you read and k is the number of characters read so far, this is how you'd do it:
hLeftRight = 0
hRightLeft = 0
k = 0
while there are characters in the stream
    x = stream.Get()
    hLeftRight = 3*hLeftRight + x.Value
    hRightLeft = hRightLeft + 3^k*x.Value
    if (x.QueryPalindrome = true)
        yield hLeftRight == hRightLeft
    k = k + 1
Obviously you'd have to calculate the hashes modulo something, probably a prime or a power of two. And of course, this could lead to false positives.
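Here is a minimal Python sketch of the two-hash idea, reduced modulo a prime as suggested. The class name, the method names and the particular modulus 2^61 - 1 are assumptions of this example; it expects the characters '0' and '1' as in the question. A "yes" answer is probabilistic in general, since two different hashes can collide once the values wrap around the modulus.

```python
class StreamingPalindrome:
    """Keep a forward and a reversed polynomial hash of the stream so far."""

    def __init__(self, base: int = 3, modulus: int = (1 << 61) - 1):
        self.base = base
        self.modulus = modulus
        self.forward = 0    # hash of the characters in reading order
        self.backward = 0   # hash of the characters in reverse order
        self.power = 1      # base**k mod modulus, k = characters read

    def feed(self, ch: str) -> None:
        """Consume one character ('0' or '1') in constant time."""
        value = int(ch)
        self.forward = (self.forward * self.base + value) % self.modulus
        self.backward = (self.backward + self.power * value) % self.modulus
        self.power = (self.power * self.base) % self.modulus

    def looks_like_palindrome(self) -> bool:
        """True if the two hashes agree (may be a false positive)."""
        return self.forward == self.backward
```

A typical use would be to call feed() for each incoming character and query looks_like_palindrome() whenever the stream is paused; the state is a constant number of bounded integers regardless of how many characters have been read.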
Round 2
As I see it, with each new character, there are three cases:
1. Character breaks potential symmetry, for example aab -> aabc
2. Character extends the middle, for example aab -> aabb
3. Character continues symmetry, for example aab -> aaba
Assume you have a pointer that moves along the string and points to the last character that continued a potential palindrome.
(I am going to use parentheses to indicate the pointed-at character.)
Let's say you are starting with aa(b) and get:
an 'a' (case 3): you move the pointer to the left and check if it's an 'a' (it is). You now have a(a)b.
a 'c' (case 1): you are not expecting a 'c', so in this case you start back at the beginning and you now have aab(c).
The really tricky case is 2, because somehow you have to know that the character you just got isn't affecting symmetry, it is just extending the middle. For this, you have to hold an additional pointer that tracks where the plateau's (middle's) edge lies. For example, you have (b)baabb and you just got another 'b', in this case you have to know to reset the pointer to the base of the middle plateau here: bbaa(b)bb. Since we are going for constant time, you have to hold a pointer here to begin with (you can't afford the time to search for the plateau's edge). Now if you get another 'b', you know that you are still on the edge of that plateau and you keep the pointer where it is, so bbaa(b)bb -> bbaa(b)bbb. Now, if you get an 'a', you know that the 'b's are not part of the extended middle and you reset both pointers (The tracking pointer and the edge pointer) so you now have bbaabbbb((a)).
With these three cases, I think all bases are covered. If you ever want to check if the current string is a palindrome, check if the first pointer (not the plateau's edge pointer) is at index 0.
This might help you:
http://arxiv.org/pdf/1308.3466v1.pdf
If you store the last k input symbols, you can easily find palindromes up to a length of k.
If you use the algorithms of the paper, you can find the midpoints of palindromes and an estimate of their length.

Resources