I have written this recursive function to check whether a string is a palindrome.
def palindrome(string):
    print("palindrome called with:"+string)
    if(len(string)<=3):
        return string[0]==string[-1]
    else:
        res=palindrome(string[1:-1])
        print("palindrome returned:"+str(res))
        return res
I now have to find the time complexity of this algorithm.
My questions:
Is my base case right? It is len<=3.
I'm unable to relate this to the classic Fibonacci and factorial examples that are everywhere on the internet.
Yes, the fact is that only your base case is correct.
What you should be doing here is checking that the first and last characters are the same, and then checking that the remaining string is also a palindrome.
But at no point are you checking that.
So, with minimal changes to your code, the following solution will work. Note that it will fail if an empty string is passed as an argument.
def palindrome(string):
    print("palindrome called with:"+string)
    if(len(string)<=3):
        return string[0]==string[-1]
    else:
        if string[0] == string[-1]:
            res=palindrome(string[1:-1])
            print("palindrome returned:"+str(res))
            return res
        else:
            return False
Definitely, there are better ways of writing this.
def palindrome(s):
    return s == '' or (s[0]==s[-1] and palindrome(s[1:-1]))
All I have done is reduce your base case further, down to the empty string, at the cost of up to two more recursive calls.
Now, coming to the time complexity, which is the same for both versions.
In one function call, we do an O(1) operation: comparing the first and last characters. This recursive call is made at most n/2 times, because in a string of length n we remove 2 characters in each call. Thus, the overall complexity is O(n). (Note that this ignores the string copying done by the slice in every recursive call; if you count it, each slice copies O(n) characters and the total becomes O(n^2).)
Finally, you should avoid recursing on slices like this, since we create a new string (when slicing) before every recursive call.
def palindrome(s):
    def _palindrome(string, i, j):
        if i >= j:
            return True
        return string[i] == string[j] and _palindrome(string, i + 1, j - 1)
    return _palindrome(s, 0, len(s) - 1)
This does not make a copy at every call, so it is definitely an O(n) solution.
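To get a feel for what the slicing costs, here is a rough sketch that mimics the slicing-based one-liner with a loop and counts the calls and the characters copied. The function name count_slicing_work is made up for this illustration, and the count assumes that each slice copies the whole remaining string.

```python
def count_slicing_work(s):
    """Return (calls, characters_copied) for the slicing-based recursion.

    Each loop iteration stands for one call of the one-line palindrome(s).
    """
    calls = 0
    copied = 0
    while True:
        calls += 1
        # Same short-circuit logic as: s == '' or (s[0] == s[-1] and ...)
        if s == '' or s[0] != s[-1]:
            return calls, copied
        s = s[1:-1]          # the slice builds a new string
        copied += len(s)     # assumed cost: one copy per remaining character


if __name__ == "__main__":
    print(count_slicing_work("a" * 1000))
```

For a string of 1000 identical characters this reports about n/2 calls but roughly n^2/4 copied characters, which is why the index-based version is preferable.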
I just posted a mathematical question at math.stackexchange, but I'll ask people here for a programmatic, recursive algorithm.
The problem: fill in each blank with a number from 1 to 9 (each number used once and only once) to complete the equation.
Additional conditions:
1. Operator precedence DOES matter.
2. All numbers (including intermediate results) should be integers.
Which means every division must be exact (e.g. 9 mod 3 = 0 is OK, 8 mod 3 != 0 is not OK).
3. For those who don't know (as one person on the original question did not), the operations in the diagram are:
+ = plus; : = divide; X = multiply; - = minus.
There should be more than one answer. I'd like a recursive algorithm that finds all the solutions.
Original question
PS: I'd like to learn about recursive algorithms and performance improvement. I tried to solve the problem using brute force, and my PC froze for quite a while.
You have to find the right permutations
9! = 362880
This is not a big number and you can do your calculations the following way:
isValid(elements)
    //return true if and only if the permutation of elements yields the expected result
end isValid
isValid is the validator, which checks whether a given permutation is correct.
calculate(elements, depth)
    //End sign
    if (depth >= 9) then
        //if valid, then store
        if (isValid(elements)) then
            store(elements)
        end if
        return
    end if
    //iterate elements
    for element = 1 to 9
        //exclude elements already in the set
        if (not contains(elements, element)) then
            calculate(union(elements, element), depth + 1)
        end if
    end for
end calculate
Call calculate as follows, with depth counting the elements chosen so far (starting at 1 would stop after only 8 elements):
calculate(emptySet, 0)
Here's a solution using PARI/GP:
div(a,b)=if(b&&a%b==0,a/b,error())
f(v)=
{
iferr(
v[1]+div(13*v[2],v[3])+v[4]+12*v[5]-v[6]-11+div(v[7]*v[8],v[9])-10==66
, E, 0)
}
for(i=0,9!-1,if(f(t=numtoperm(9,i)),print(t)))
The function f defines the particular function here. I used a helper function div which throws an error if the division fails (producing a non-integer or dividing by 0).
The program could be made more efficient by splitting out the blocks which involve division and aborting early if they fail. But since this takes only milliseconds to run through all 9! permutations I didn't think it was worth it.
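For readers who want the recursive search in a more familiar language, here is a minimal Python sketch of the same idea. The equation is copied from the PARI/GP function f above; the names solve, is_valid and calculate are my own and only loosely follow the pseudocode.

```python
def solve():
    """Collect every arrangement of 1..9 that satisfies the equation."""
    solutions = []

    def is_valid(v):
        a, b, c, d, e, f, g, h, i = v
        # Both divisions must be exact (condition 2 of the question).
        if (13 * b) % c != 0 or (g * h) % i != 0:
            return False
        return a + 13 * b // c + d + 12 * e - f - 11 + g * h // i - 10 == 66

    def calculate(chosen, used):
        # All nine blanks filled: validate and store.
        if len(chosen) == 9:
            if is_valid(chosen):
                solutions.append(tuple(chosen))
            return
        for digit in range(1, 10):
            if not used[digit]:          # skip digits already placed
                used[digit] = True
                chosen.append(digit)
                calculate(chosen, used)
                chosen.pop()             # backtrack
                used[digit] = False

    calculate([], [False] * 10)
    return solutions


if __name__ == "__main__":
    for solution in solve():
        print(solution)
```

The used list avoids rescanning chosen on every step. Checking the first division as soon as the third blank is filled would prune the search further, but for 9! candidates it hardly matters.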
"1. The program will calculate the values using repetitive execution (loops)."
Recursion is repetitive execution, but I do not think it's a loop. Do you think using recursion would follow the guideline above?
No. In fact, it looks like the assignment is specifically asking for the "opposite" of recursion, iteration.
Loops are fundamentally about iteration, which is different from recursion. The main difference is that an iteration typically uses a constant amount of space, whereas recursion uses more space the deeper the recursion goes. For example, here are iterative and recursive procedures to compute the sum of the numbers from 1 to n:
def sum_iter(n):
    x = 0
    for i in range(1,n+1):
        x += i
    return x
def sum_recursive(n):
    if n == 0:
        return 0
    else:
        return n + sum_recursive(n-1)
If you run these with a very large argument, you will run out of stack space (a "stack overflow") on the recursive version, but the iterative version will work fine.
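In CPython specifically, the interpreter stops the recursion with a RecursionError once the depth passes sys.getrecursionlimit(), rather than crashing outright. A small sketch of that behaviour follows; the wrapper recursive_sum_or_none is a hypothetical helper added only for this demonstration.

```python
import sys


def sum_iter(n):
    x = 0
    for i in range(1, n + 1):
        x += i
    return x


def sum_recursive(n):
    if n == 0:
        return 0
    return n + sum_recursive(n - 1)


def recursive_sum_or_none(n):
    """Return the recursive sum, or None if the recursion limit is hit."""
    try:
        return sum_recursive(n)
    except RecursionError:
        return None


if __name__ == "__main__":
    deep = sys.getrecursionlimit() * 2
    print(sum_iter(deep))               # the loop handles this without trouble
    print(recursive_sum_or_none(deep))  # None: the recursion was cut off
```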
There is a special kind of recursion called tail recursion, which is where the function doesn't have to do anything with the value from a recursive call except pass it to the caller. In this case you don't need to keep track of the stack - you can just jump straight to the top. This is called tail call optimization. A tail recursive function to calculate the sum of the integers 1 to n looks like
def sum_tailrec(n):
    def helper(s,i):
        if i == 0:
            return s
        else:
            return helper(s+i, i-1)
    return helper(0, n)
In this case people often refer to the function helper as an iterative recursion, because (with tail call optimization) it is only using a constant amount of space.
This is all a bit moot, because Python doesn't have tail call optimization, but some languages do.
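Since Python will not do this for you, the rewrite can be done by hand. The sketch below shows roughly what tail call optimization would turn helper into: the recursive call becomes a reassignment of the parameters followed by a jump back to the top of the loop. The name sum_tail_as_loop is mine.

```python
def sum_tail_as_loop(n):
    # s and i play the role of helper's parameters.
    s, i = 0, n
    while i != 0:
        # Equivalent of "return helper(s + i, i - 1)": update the
        # parameters and go round again instead of pushing a new frame.
        s, i = s + i, i - 1
    return s


if __name__ == "__main__":
    print(sum_tail_as_loop(100000))
```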
I'm new to Ruby and just started to pick up the language a couple of days back. As an exercise, I tried to implement a simple quicksort
class Sort
  def swap(i,j)
    @data[i], @data[j] = @data[j], @data[i]
  end
  def quicksort(lower=0, upper = @data.length - 1)
    return nil if lower >= upper
    m = lower
    i = 0
    ((lower+1)..upper).each do |i|
      swap(++m, i) if @data[i] < @data[lower]
    end
    swap(m, lower)
    quicksort(lower, m -1)
    quicksort(m+1, upper)
  end
end
Calling quicksort on say 10000 integers gives me a stack-level error. After googling, I figured out that tail-recursion isn't supported yet in Ruby (kind of). But then I found the following snippet (from here)
def qs(v)
  return v if v.nil? or v.length <= 1
  less, more = v[1..-1].partition { |i| i < v[0] }
  qs(less) + [v[0]] + qs(more)
end
Running the second snippet works perfectly well even with a million integers. Yet, as far as I can tell, there's tail recursion at the end. So what am I not understanding here?
Neither of the methods you've shown is tail recursive (well, technically the first one is half tail-recursive: the second recursive call is a tail call, but the first one is not; the second method is not tail recursive at all).
The reason that the first method overflows the stack but the second one does not is that the first method recurses much deeper than the second (linearly instead of logarithmically, on typical input) because of a bug: ++m just applies the unary + operator to m twice, so it does not actually do anything to m.
When given a large enough array both versions will overflow (and would do so even if Ruby did perform TCO), but without the bug 10000 elements is not nearly large enough.
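The effect of the bug on recursion depth can be measured directly. Below is a rough Python transcription of the first method (Python's ++m is a no-op in the same way), with a hypothetical buggy flag and a depth counter added for the experiment; neither exists in the original Ruby code.

```python
def quicksort_depth(data, buggy):
    """Sort data in place and return the deepest recursion level reached."""
    max_depth = 0

    def sort(lower, upper, depth):
        nonlocal max_depth
        if lower >= upper:
            return
        max_depth = max(max_depth, depth)
        m = lower
        for i in range(lower + 1, upper + 1):
            if data[i] < data[lower]:
                if not buggy:
                    m += 1  # what ++m was meant to do
                data[m], data[i] = data[i], data[m]
        data[m], data[lower] = data[lower], data[m]
        sort(lower, m - 1, depth + 1)
        sort(m + 1, upper, depth + 1)

    sort(0, len(data) - 1, 1)
    return max_depth


if __name__ == "__main__":
    import random
    values = random.sample(range(500), 500)
    print(quicksort_depth(list(values), buggy=True))
    print(quicksort_depth(list(values), buggy=False))
```

With the bug, m never moves, so each call only places the minimum at lower and recurses on everything else: the depth grows as n - 1. With the increment in place, shuffled input gives a far shallower recursion.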
I have two strings. How can I determine whether the first string is composed only of letters given by the second string?
For example:
A = abcd
B = abbcc
should return false, since d is not in the second string.
A = aab
B = ab
should return true.
If the program most of the time returns false, how can I optimize this program? If it returns true most of the time, how can I optimize it then?
Sort both strings. Then go through A, and have a pointer going through B. If the character in A is the same as what the B pointer points to, keep looking through A. If the character in A is later in the alphabet than what the B pointer points to, advance the B pointer. If the character in A is earlier in the alphabet than what the B pointer points to, return false. If you run out of B before A, return false. If you run out of A, return true.
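A minimal sketch of this sorted two-pointer walk, assuming plain Python strings; the function name is_composed_of_sorted is made up for the example.

```python
def is_composed_of_sorted(a, b):
    sa, sb = sorted(a), sorted(b)
    j = 0  # pointer into the sorted B
    for ch in sa:
        # Advance the B pointer while it points at an earlier character.
        while j < len(sb) and sb[j] < ch:
            j += 1
        # Ran out of B, or B has skipped past ch: ch is not in B.
        if j == len(sb) or sb[j] != ch:
            return False
    return True


if __name__ == "__main__":
    print(is_composed_of_sorted("abcd", "abbcc"))
    print(is_composed_of_sorted("aab", "ab"))
```

The sorting makes this O(n log n), so it is mainly of interest when the strings are already sorted.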
[How do I] determine [if] the first string [is composed of characters that appear in] the second string?
Here's a quick algorithm:
Treat the first and second strings as two sets of characters S and T.
Perform the set difference S - T. Call the result U.
If U is nonempty, return false. Otherwise, return true.
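In Python the three steps map almost directly onto the built-in set type. This is only a sketch, and the function name is an assumption of mine:

```python
def is_composed_of_sets(first, second):
    s = set(first)
    t = set(second)
    u = s - t          # characters of the first string missing from the second
    return not u       # an empty difference means "true"


if __name__ == "__main__":
    print(is_composed_of_sets("abcd", "abbcc"))
    print(is_composed_of_sets("aab", "ab"))
```

The same test can also be written as set(first) <= set(second).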
Here is one simple way.
def isComposedOf(A, B):
    bset = set(B)
    for c in A:
        if c not in bset:
            return False
    return True
This algorithm walks each string once, so it runs in O(len(A) + len(B)) time.
When the answer is yes, you cannot do better than len(A) comparisons even in the best case, because no matter what you must check every letter. And in the worst case one of the characters in A is hidden very deep in B. So O(len(A) + len(B)) is optimal, as far as worst-case performance is concerned.
Similarly: when the answer is no, you cannot do better than len(B) comparisons even in the best case; and in the worst case the character that isn't in B is hidden very deep in A. So O(len(A) + len(B)) is again optimal.
You can reduce the constant factor by using a better data structure for bset.
You can avoid scanning all of B in some (non-worst) cases where the answer is yes by building it lazily, scanning more of B each time you find a character in A that you haven't seen before.
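One way the lazy construction could look, as a sketch under the assumption that B can be consumed through a single iterator; the name is_composed_of_lazy is hypothetical.

```python
def is_composed_of_lazy(a, b):
    seen = set()       # characters of B scanned so far
    b_iter = iter(b)   # remembers how far into B we have read
    for ch in a:
        if ch in seen:
            continue
        # Read more of B until ch turns up.
        for other in b_iter:
            seen.add(other)
            if other == ch:
                break
        else:
            # B is exhausted and ch never appeared.
            return False
    return True


if __name__ == "__main__":
    print(is_composed_of_lazy("aab", "ab" + "z" * 1000000))
```

In the usage line only the first two characters of B are ever read, because every character of A is found early.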
> If the program is always return false, how to optimize this program?
return false
> If it is always return true, how to optimize it?
return true
EDIT: Seriously, it's a good question what algorithm optimizes for the case of failure, and what algorithm optimizes for the case of success. I don't know what algorithm strstr uses, it may be a generally-good algorithm which isn't optimal for either of those assumptions.
Maybe you'll catch someone here who knows offhand. If not, this looks like a good place to start reading: Exact String Matching Algorithms
Assuming the strings contain only lowercase letters, you can use a bit vector and set one bit per letter, based on its position: pos = str1[i] - 'a'. To set it you would do
bitVector |= (1<<pos). Then, for each character of str2, check whether its bit is set in bitVector; if all of them are set return true, otherwise return false.
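A sketch of the bit-vector idea in Python, using the question's convention that A must be built from the letters of B. It assumes both strings contain only the letters 'a' to 'z'; the function name is mine.

```python
def is_composed_of_bits(a, b):
    mask = 0
    for ch in b:
        # One bit per letter that occurs in B.
        mask |= 1 << (ord(ch) - ord('a'))
    # Every letter of A must have its bit set.
    return all((mask >> (ord(ch) - ord('a'))) & 1 for ch in a)


if __name__ == "__main__":
    print(is_composed_of_bits("abcd", "abbcc"))
    print(is_composed_of_bits("aab", "ab"))
```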
I got curious by Jon Limjap's interview mishap and started to look for efficient ways to do palindrome detection. I checked the palindrome golf answers and it seems to me that in the answers are two algorithms only, reversing the string and checking from tail and head.
def palindrome_short(s):
    length = len(s)
    for i in range(0, length // 2):
        if s[i] != s[(length-1)-i]: return False
    return True
def palindrome_reverse(s):
    return s == s[::-1]
I think neither of these methods are used in the detection of exact palindromes in huge DNA sequences. I looked around a bit and didn't find any free article about what an ultra efficient way for this might be.
A good way might be parallelizing the first version in a divide-and-conquer approach, assigning a pair of char arrays 1..n and length-1-n..length-1 to each thread or processor.
What would be a better way?
Do you know any?
Given only one palindrome, you will have to do it in O(N), yes. You can get more efficiency with multi-processors by splitting the string as you said.
Now say you want to do exact DNA matching. These strings are thousands of characters long, and they are very repetitive. This gives us the opportunity to optimize.
Say you split a 1000-char long string into 5 pairs of 100,100. The code will look like this:
isPal(w[0:100],w[-100:]) and isPal(w[100:200], w[-200:-100]) ...
etc... The first time you do these matches, you will have to process them. However, you can add all results you've done into a hashtable mapping pairs to booleans:
isPal = {("ATTAGC", "CGATTA"): True, ("ATTGCA", "CAGTAA"): False}
etc... this will take way too much memory, though. For pairs of 100,100, the hash map will have 2*4^100 elements. Say that you only store two 32-bit hashes of strings as the key, you will need something like 10^55 megabytes, which is ridiculous.
Maybe if you use smaller strings, the problem can be tractable. Then you'll have a huge hashmap, but at least palindrome for let's say 10x10 pairs will take O(1), so checking if a 1000 string is a palindrome will take 100 lookups instead of 500 compares. It's still O(N), though...
Another variant of your second function: there is no need to also compare the right halves of the original and reversed strings.
def palindrome_reverse(s):
    l = len(s) // 2
    return s[:l] == s[:-l-1:-1]
Obviously, you're not going to be able to get better than O(n) asymptotic efficiency, since each character must be examined at least once. You can get better multiplicative constants, though.
For a single thread, you can get a speedup using assembly. You can also do better by examining data in chunks larger than a byte at a time, but this may be tricky due to alignment considerations. You'll do even better to use SIMD, if you can examine chunks as large as 16 bytes at a time.
If you wanted to parallelize it, you could use N processors on a string of length L, and have processor i compare the segment [i*L/(2N), (i+1)*L/(2N)) with the mirrored segment [L-(i+1)*L/(2N), L-i*L/(2N)).
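The partitioning can be sketched with Python's concurrent.futures. This only illustrates how the work is split: because of CPython's global interpreter lock, threads are unlikely to make it faster in practice. The names chunk_is_mirrored and is_palindrome_chunked and the default of 4 workers are assumptions of mine.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_is_mirrored(s, start, stop):
    """Compare s[start:stop] with the mirrored segment at the other end."""
    length = len(s)
    return s[start:stop] == s[length - stop:length - start][::-1]


def is_palindrome_chunked(s, workers=4):
    half = len(s) // 2
    # Split the first half into `workers` contiguous segments.
    bounds = [half * k // workers for k in range(workers + 1)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda k: chunk_is_mirrored(s, bounds[k], bounds[k + 1]),
            range(workers),
        )
        return all(results)


if __name__ == "__main__":
    print(is_palindrome_chunked("abcdedcba"))
    print(is_palindrome_chunked("abcdefgh"))
```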
There isn't, unless you do a fuzzy match, which is what they probably do in DNA (I've done EST searching in DNA with Smith-Waterman, but that is obviously much harder than matching for a palindrome or reverse-complement in a sequence).
They are both in O(N) so I don't think there is any particular efficiency problem with any of these solutions. Maybe I am not creative enough but I can't see how would it be possible to compare N elements in less than N steps, so something like O(log N) is definitely not possible IMHO.
Parallelism might help, but it still wouldn't change the big-O class of the algorithm, since it is equivalent to running it on a faster machine.
Comparing from the center is always much more efficient, since you can bail out early on a miss, but it also allows you to do a faster max palindrome search, regardless of whether you are looking for the maximal radius or all non-overlapping palindromes.
The only real parallelization is if you have multiple independent strings to process. Splitting into chunks will waste a lot of work for every miss, and there are always many more misses than hits.
On top of what others said, I'd also add a few pre-check criteria for really large inputs :
quick check whether tail-character matches
head character
    if NOT, just early exit by returning Boolean-False
if (input-length < 4) {
    # The quick check just now already confirmed it's palindrome
    return Boolean-True
} else if (200 < input-length) {
    # adjust this parameter to your preferences
    #
    # e.g. make it 20 for longer than 8000 etc
    # or make it scale to input size,
    # either logarithmically, or a fixed ratio like 2.5%
    #
    reverse last ( N = 4 ) characters/bytes of the input
    if that **DOES NOT** match first N chars/bytes {
        return boolean-false # early exit
                             # no point to reverse rest of it
                             # when head and tail don't even match
    } else {
        if N was substantial
            trim out the head and tail of the input
            you've confirmed; avoid duplicated work
            remember to also update the variable(s)
            you've elected to track the input size
    }
    [optional 1 : if that substring of N characters you've
                  just checked happened to all contain the
                  same character, let's call it C,
                  then locate the index position, P, for the first
                  character that isn't C
                  if P == input-size
                      then you've already proven
                      the entire string is a nonstop repeat
                      of one single character, which, by def,
                      must be a palindrome
                      then just return Boolean-True
                  but the P is more than the half-way point,
                      you've also proven it cannot possibly be a
                      palindrome, because second half contains a
                      component that doesn't exist in first half,
                      then just return Boolean-False ]
    [optional 2 : for extremely long inputs,
                  like over 200,000 chars,
                  take the N chars right at the middle of it,
                  and see if the reversed one matches
                  if that fails, then do early exit and save time ]
}
if all pre-checks passed,
then simply do it BAU style :
    reverse second-half of it,
    and see if it's same as first half
With Python, short code can be faster since it puts the load into the faster internals of the VM (And there is the whole cache and other such things)
def ispalin(x):
    return all(x[a]==x[-a-1] for a in range(len(x)>>1))
You can put the characters into a hashtable and keep a counter that increases every time you find a character that is not in the table. If you find a character that is already in the table, decrease the counter.
For a string of odd length the counter should end at 1, and for even length it should end at 0. Note that this only shows that the letters could be rearranged into a palindrome; it does not check their order, so it is a necessary condition rather than a complete test.
See the snippet below.
s refers to the string,
e.g. String s = "abbcaddc";
// requires: import java.util.Hashtable;
Hashtable<Character,Integer> textMap = new Hashtable<Character,Integer>();
char charA[] = s.toCharArray();
int count = 0;               // characters seen an odd number of times so far
int length = charA.length;
for (int i = 0; i < charA.length; i++)
{
    if (!textMap.containsKey(charA[i]))
    {
        textMap.put(charA[i], ++count);
    }
    else
    {
        // drop the entry so a third occurrence counts as unpaired again
        textMap.remove(charA[i]);
        --count;
    }
}
if (length % 2 != 0)
{
    if (count == 1)
        System.out.println("(odd case:PALINDROME)");
    else
        System.out.println("(odd case:not palindrome)");
}
else if (length % 2 == 0)
{
    if (count == 0)
        System.out.println("(even case:palindrome)");
    else
        System.out.println("(even case :not palindrome)");
}
public class Palindrome{
    private static boolean isPalindrome(String s){
        if(s == null)
            return false; //uninitialized String? return false
        if(s.isEmpty()) //an empty String is a palindrome
            return true;
        //we want to check characters on opposite sides of the string
        //and stop in the middle
        int left = 0; //left-most char
        int right = s.length() - 1; //right-most char
        while(left < right){ //this also handles strings of odd length
            if(s.charAt(left) != s.charAt(right)) //left char must equal
                return false; //right char, else it's not a palindrome
            left++; //converge left to right
            right--;//converge right to left
        }
        return true; //every pair matched
    }
}
Though we are doing n/2 comparisons, it's still O(n).
This can also be done using threads, but the calculations get messy, so it's best to avoid that. This doesn't handle special characters and is case sensitive. I have code that does, but this code can easily be modified to handle that.
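One possible modification along those lines, sketched in Python rather than Java to match the other examples here. It assumes that "special characters" means anything that is not a letter or digit; the function name is my own.

```python
def is_palindrome_normalized(s):
    left, right = 0, len(s) - 1
    while left < right:
        if not s[left].isalnum():
            left += 1                     # skip punctuation/space on the left
        elif not s[right].isalnum():
            right -= 1                    # skip punctuation/space on the right
        elif s[left].lower() != s[right].lower():
            return False                  # case-insensitive mismatch
        else:
            left += 1
            right -= 1
    return True


if __name__ == "__main__":
    print(is_palindrome_normalized("A man, a plan, a canal: Panama"))
    print(is_palindrome_normalized("race a car"))
```

Skipping characters inside the two-pointer loop avoids building a cleaned copy of the string first.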