I have this homework.
I want an explanation of the question, not a solution.
Write an algorithm to read a text and print whether or not it is a palindrome, that is, the text
reads the same backward as forward.
I don't understand it!
Words like rotor are palindromes, meaning that reversing the letters of the word produces the same word.
In other words, the letters of the word are symmetric.
http://en.wikipedia.org/wiki/Palindrome would help
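The definition above can be sketched as a one-line check (a minimal Python sketch, assuming we compare the raw text as-is, without stripping case or punctuation):

```python
def is_palindrome(text: str) -> bool:
    # A palindrome reads the same backward as forward,
    # so the text must equal its own reverse.
    return text == text[::-1]

print(is_palindrome("rotor"))  # True
print(is_palindrome("motor"))  # False
```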
I am trying to find out if there is an algorithm that exists that is capable of the following:
Given a list of strings:
{"56B99Z", "78K80F", "50B49J", "28F11F"}
And given an input string of:
"??B?9?"
Then the algorithm should output:
{"56B99Z", "50B49J"}
Where ? are unknown characters.
I think some sort of trie-tree with additional links between nodes could work, but I don't want to re-invent the wheel if this has been done before.
Your question is really vague and you need to be more specific: do the strings all have the same length? If so, you can just compare the positions that aren't question marks in your search string against each candidate string. If you are looking for string-matching algorithms in general, I suggest you read about the KMP algorithm, which has linear complexity for the given input => https://en.m.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm
Use a regular expression that matches the known characters literally and treats each ? (positions 1, 2, 4, and 6 here) as a \w.
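The regex idea can be sketched like this (a minimal Python sketch, assuming the strings contain only plain alphanumeric characters, so each ? can simply become the regex wildcard .):

```python
import re

def match_wildcard(pattern: str, candidates):
    # '?' means "any single character", so translate it to the regex '.'.
    # Assumes the rest of the pattern has no regex metacharacters.
    regex = re.compile(pattern.replace("?", "."))
    # fullmatch requires the whole string to match, not just a prefix.
    return [s for s in candidates if regex.fullmatch(s)]

strings = ["56B99Z", "78K80F", "50B49J", "28F11F"]
print(match_wildcard("??B?9?", strings))  # ['56B99Z', '50B49J']
```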
There is something I could not figure out about the two shifting rules (bad character and good suffix) in this algorithm. Do they work together, and what exactly decides which one to apply for each shift? This comprehensive explanation ended with a SIMPLE EXAMPLE that confused me. My question here: if the algorithm moves backward, why would it need the good-suffix shift to move to the right? I am sure I am missing something here. Would you help me understand the aforementioned example?
The missing point is that the algorithm moves backward over the pattern, not the string, so the comparison starts from the character at index n (where n is the pattern length), not from index 1. The following visual example is very helpful to clarify that.
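A minimal sketch of just the bad-character rule (Python; the full algorithm also uses the good-suffix rule and takes the larger of the two shifts) shows how comparing the pattern right-to-left still produces shifts of the pattern to the right:

```python
def bad_character_search(text: str, pattern: str):
    # Rightmost index of each character in the pattern.
    last = {c: i for i, c in enumerate(pattern)}
    n, m = len(text), len(pattern)
    i = 0  # alignment of the pattern's left end within the text
    matches = []
    while i <= n - m:
        j = m - 1  # start comparing at the pattern's LAST character
        while j >= 0 and pattern[j] == text[i + j]:
            j -= 1  # comparison moves backward over the pattern
        if j < 0:
            matches.append(i)
            i += 1  # full match; a good-suffix rule would shift further
        else:
            # Shift the pattern right so the mismatched text character
            # lines up with its rightmost occurrence in the pattern.
            i += max(1, j - last.get(text[i + j], -1))
    return matches

print(bad_character_search("HERE IS A SIMPLE EXAMPLE", "EXAMPLE"))  # [17]
```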
What are the main differences between the Knuth-Morris-Pratt search algorithm and the Boyer-Moore search algorithm?
I know KMP searches for Y in X, trying to define a pattern in Y, and saves the pattern in a vector. I also know that BM works better for small alphabets, like DNA (ACTG).
What are the main differences in how they work? Which one is faster? Which one is less computer-greedy? In which cases?
Moore's UTexas webpage walks through both algorithms in a step-by-step fashion (he also provides various technical sources):
Knuth-Morris-Pratt
Boyer-Moore
According to the man himself,
The classic Boyer-Moore algorithm suffers from the phenomenon that it
tends not to work so efficiently on small alphabets like DNA. The skip
distance tends to stop growing with the pattern length because
substrings re-occur frequently. By remembering more of what has
already been matched, one can get larger skips through the text. One
can even arrange "perfect memory" and thus look at each character at
most once, whereas the Boyer-Moore algorithm, while linear, may
inspect a character from the text multiple times. This idea of
remembering more has been explored in the literature by others. It
suffers from the need for very large tables or state machines.
However, there have been some modifications of BM that have made small-alphabet searching viable.
As a rough explanation:
Boyer-Moore's approach is to try to match the last character of the pattern instead of the first one, on the assumption that if there's no match at the end, there is no need to try to match at the beginning. This allows for "big jumps", so BM works better when the pattern and the text you are searching resemble "natural text" (i.e. English).
Knuth-Morris-Pratt searches for occurrences of a "word" W within a main "text string" S by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters. (Source: Wiki)
This means KMP is better suited for small alphabets like DNA (ACTG).
Boyer-Moore matches the characters from right to left and works well on long patterns.
Knuth-Morris-Pratt matches the characters from left to right and works fast on short patterns.
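The left-to-right half of that comparison can be sketched as a minimal KMP implementation in Python, using the prefix (failure) table so that previously matched characters are never re-examined:

```python
def kmp_search(text: str, pattern: str):
    # fail[j] = length of the longest proper prefix of pattern[:j+1]
    # that is also a suffix of it.
    m = len(pattern)
    fail = [0] * m
    k = 0
    for j in range(1, m):
        while k > 0 and pattern[j] != pattern[k]:
            k = fail[k - 1]
        if pattern[j] == pattern[k]:
            k += 1
        fail[j] = k
    # Scan the text left to right; on a mismatch, fall back in the
    # pattern via the table instead of rewinding the text pointer.
    matches = []
    k = 0
    for i, c in enumerate(text):
        while k > 0 and c != pattern[k]:
            k = fail[k - 1]
        if c == pattern[k]:
            k += 1
        if k == m:
            matches.append(i - m + 1)
            k = fail[k - 1]  # allow overlapping matches
    return matches

print(kmp_search("ACTGACTGACT", "ACTGACT"))  # [0, 4]
```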
The general idea is to do two for loops: take every character from string1 and compare it to every character from string2; if all of them are found, that indicates inclusion.
So we need to loop over all the characters of string1 and, for each one, look at all the characters of string2, which gives O(n²) running time.
The interviewer said this is not a good idea.
Afterwards I kept thinking about it, but I cannot come up with an idea that avoids the two loops.
Perhaps I can first take all the characters from string1, convert them to ASCII codes, and build those numbers into a tree, so that the comparison against string2 becomes a fast search.
Or does anyone have a better idea?
For example, if string1 is abc and string2 is cbattt, then every character of string1 is included in string2.
Not substring.
As iccthedral says, Boyer-Moore is probably what the interviewer was looking for.
Searching a text for a given pattern (pattern matching) is a very well-known problem. Known solutions:
KMP
witness table
boyer-moore
suffix tree
All solutions vary in some minor aspects, such as whether they can be generalized for 2D pattern matching, whether they need pre-processing, whether they can be generalized for an unbounded alphabet, running time, etc.
EDIT:
If you just want to know whether all the letters of some string appear in some other string, why not use a table the size of your alphabet indicating whether a given char can be found in the string? If the alphabet is unbounded or extremely large (more than O(1)), use a hash table.
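The hash-table idea can be sketched in Python with a set as the lookup table, which brings the total cost down to O(len(s1) + len(s2)):

```python
def contains_all_chars(s1: str, s2: str) -> bool:
    # One pass over s2 to build a constant-time lookup table,
    # then one pass over s1 to check membership: no nested loops.
    chars = set(s2)
    return all(c in chars for c in s1)

print(contains_all_chars("abc", "cbattt"))  # True
print(contains_all_chars("abd", "cbattt"))  # False ('d' is missing)
```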
I just came across this interesting question online and am quite stumped as to how to even progress on it.
Write a function that finds all the different ways you can split up a word into a
concatenation of two other words.
Is this something that Suffix Trees are used for?
I'm not looking for code, just conceptual way to move forward with this.
Some pseudocode:
foreach place you can split the word:
split the word.
check if both sides are valid words.
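The pseudocode above can be sketched in Python (the set of valid words is an assumption here, passed in as a parameter, since the question doesn't define what a valid word is):

```python
def split_into_two_words(word: str, dictionary: set):
    # Try every split point; keep the splits where both halves
    # appear in the dictionary of valid words.
    splits = []
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in dictionary and right in dictionary:
            splits.append((left, right))
    return splits

# Hypothetical dictionary just for illustration.
words = {"book", "keeper", "bookkeeper", "be", "ok"}
print(split_into_two_words("bookkeeper", words))  # [('book', 'keeper')]
```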
If you are looking for a nice answer then please let us know your definition of a valid word.
Assuming a word is a string defined over an alphabet with length greater than zero, you can use suffix trees.
Below is a simplified algorithm which takes just O(n) time.
Convert the word into a character array.
Traverse through the length of the array, and for each i take the two strings (0 to i) and
(i+1 to length of the array - 1).
Remember to cover the base conditions, such as length greater than zero.
The total number of different ways to do it can be greater than one if and only if this condition holds:
-> one of the two words must be a multiple of the other, e.g. "abcd" and "abcdabcd".
Using these two words you can form the string "abcdabcdabcdabcd" in many different ways.
So first check this condition.
Then check whether the string can be written from the two words at all; after that, simple math should give you the answer.