What is the Big O notation for this function? [closed] - algorithm

I have written a function and I need to know the big O notation for it.
I have tried to solve this myself and I get O(N^2); however, I have been told that this is not the correct answer.
Can someone please tell me what the correct notation is and also a step by step explanation of how they came to that answer?
The function is below.
Thanks in advance
public static string Palindrome(string input)
{
    string current = string.Empty;
    string longest = string.Empty;
    int left;
    int center;
    int right;
    if (input == null || input == string.Empty || input.Length == 1) { return input; }
    for (center = 1; center < input.Length - 1; center++)
    {
        left = center - 1;
        right = center + 1;
        if (input[left] == input[center])
        {
            left--;
        }
        while (0 <= left && right < input.Length)
        {
            if (input[left] != input[right])
            {
                break;
            }
            current = input.Substring(left, (right - left + 1));
            longest = current.Length > longest.Length ? current : longest;
            left--;
            right++;
        }
    }
    return longest;
}

This is an O(n^3) algorithm.
This part takes O(n^2):
// O(n) times for while loop
while (0 <= left && right < input.Length)
{
if (input[left] != input[right])
{
break;
}
// taking substring is O(n)
current = input.Substring(left, (right - left + 1));
longest = current.Length > longest.Length ? current : longest;
left--;
right++;
}
Also, there is an outer for loop that runs O(n) times, which gives O(n * n^2) = O(n^3) overall.
You can improve your algorithm by changing these lines:
    current = input.Substring(left, (right - left + 1));
    longest = current.Length > longest.Length ? current : longest;
to:
    currentLength = right - left + 1;
    if (currentLength > longestLength)
    {
        longestLength = currentLength;
        longestLeft = left;
        longestRight = right;
    }
and finally return the substring from longestLeft to longestRight (with longestLength initialized to 0). This avoids calling Substring on every iteration.

The if (input[left] != input[right]) statement is executed O(n^2) times, and so are the several assignments following it, in particular:
current = input.Substring(left, (right - left + 1));
In typical implementations of substring functions, a sequence of characters is copied from the string to a new string object. The copy is an O(n) operation, leading to O(n^3) time for the loops and substring operation.
One can fix the problem by moving the assignments to current and longest to after the closing bracket of the while construct. But note that left--; and right++; will then have executed one time more than in the existing code, so the assignment to current becomes
current = input.Substring(left+1, (right-1 - (left+1) + 1));
or
current = input.Substring(left+1, (right-left-1));
Thus, the O(n) substring operation is done at most O(n) times.
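To illustrate the idea shared by both answers (track indices inside the loops and copy the substring only once), here is a minimal Python sketch. It is not a line-for-line port of the C# function above: the name longest_palindrome is made up for this example, and it tries both an odd and an even center at every position, which is an assumption about the intended behaviour.

```python
def longest_palindrome(text: str) -> str:
    """Expand around each center, remembering only indices (no per-step copies)."""
    if not text:
        return text
    n = len(text)
    best_left, best_right = 0, 0  # inclusive bounds of the best palindrome so far
    for center in range(n):
        # Try an odd-length center (center, center) and an even one (center, center + 1).
        for left, right in ((center, center), (center, center + 1)):
            while left >= 0 and right < n and text[left] == text[right]:
                left -= 1
                right += 1
            # The loop overshoots by one on each side, so step back in.
            left += 1
            right -= 1
            if right - left > best_right - best_left:
                best_left, best_right = left, right
    # A single O(n) copy, done once at the very end.
    return text[best_left:best_right + 1]
```

The two nested loops do O(n^2) character comparisons in the worst case, and the only substring copy happens after they finish, so the total stays at O(n^2).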

Related

Number flower pots in an arrangement

It's a Google interview question. There's a list of "T" and "F" only. Each entry denotes a position: T means the position is occupied by a flower pot, and F means no pot is there, so you can put another pot at this position. Find the number of pots that can be placed in a given arrangement such that no two pots are adjacent to each other (they can be adjacent in the given arrangement). If a position at the beginning is unoccupied, then a pot can be placed if the second position is also unoccupied, and if the last position is unoccupied, then a pot can be placed if the second-to-last position is also unoccupied. For example:
TFFFTFFTFFFFT - returns 2
FFTTFFFFFTTFF - returns 4
I tried solving it by looking at adjacent values for every position with value F. Increased the counter if both adjacent positions were F and set this position as T. I need a better solution or any other solution(if any).
Let's analyse what has to be done.
So first we probably need to visit and examine each place. That suggests a loop of some sort, e.g.:
for (int i = 0; i < myPlaces.Length; ++i)
When we are at a spot, we have to check whether it is unoccupied:
if (myPlaces[i] == 'F')
but that's not enough to place the flower pot there. We also have to check that the previous and next places are free:
myPlaces[i-1]
myPlaces[i+1]
If all three contain F, you can put the flower pot there and move to the next field.
Now, we also have some exceptions to the rule: the beginning and the end of the list. You have to deal with them separately, e.g.:
if (i == 0)
{
// only check current position and next position
}
if (i == myPlaces.Length - 1) // minus 1 because indexing usually starts from 0
{
// only check current position and previous position
}
After that you can perform the checks mentioned previously.
Now let's think of the input data. Generally, it's a good habit not to modify the input data but to make a copy and work on the copy. Also, some data structures work better than others for different tasks. Here you can use a simple string to keep the entry values. But I would say an array of chars would be a better option because then, when you find a place where you can put a flower pot, you can actually replace the F with a T in the array. Then, when you move to a new spot, your data structure knows that there is already a pot in the previous position, so your algorithm won't put an adjacent one.
You would not be able to do that with string as strings are immutable and you would need to generate a new string each time.
Note that it's only a naive algorithm with a lot of scope for improvement and optimization. But my goal was rather to give some idea how to approach this kind of problems in general. I'll leave implementing of the details to you as an afternoon exercise before targeting a job at Google.
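The naive approach described above can be sketched in a few lines of Python. The function name count_new_pots and the choice of a plain string as input are assumptions made for this illustration; the logic is just the walk-and-mark idea from the answer, working on a copy of the input.

```python
def count_new_pots(arrangement: str) -> int:
    """Greedy left-to-right scan over a mutable copy of the arrangement."""
    places = list(arrangement)  # copy, so the caller's data is not modified
    count = 0
    last = len(places) - 1
    for i in range(len(places)):
        if places[i] != "F":
            continue
        # A missing neighbour (at either end of the list) counts as free.
        left_free = i == 0 or places[i - 1] == "F"
        right_free = i == last or places[i + 1] == "F"
        if left_free and right_free:
            places[i] = "T"  # mark it, so the next position sees a pot here
            count += 1
    return count
```

Run on the two arrangements from the question, this gives 2 for TFFFTFFTFFFFT and 4 for FFTTFFFFFTTFF.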
You may be able to do this with a modified Mergesort. Consider the flowerpots that can be placed in the singletons, then the flowerpots that can be placed in the doubleton merges of those singletons, up the tree to the full arrangement. It would complete in O(n lg n) for a list of n flowerpots.
There is certainly a way to do this with a modified Rod Cutting algorithm with complexity O(n^2). The subproblem is whether or not an open "false set" exists in the substring being considered. The "closed false sets" already have some maximum value computed for them. So, when a new character is added, it either increases the amount of flowerpots that can be inserted, or "locks in" the maximum quantity of available flowerpots for the substring.
Also, you know that the number of flowerpots that can be placed in a set of n open positions bounded by closed positions is at most n - 2 (or n - 1 if it is bracketed on only one side, i.e. the string begins or ends with a "false set"). The base condition (the first position is open, or the first position is closed) can be calculated upon reaching the second flowerpot.
So, we can build up to the total number of flowerpots that can be inserted into the whole arrangement in terms of the maximum number of flowerpots that can be inserted into smaller subarrangements that have been previously calculated. By storing our previous calculations in an array, we reduce the amount of time necessary to calculate the maximum for the next subarrangement to a single array lookup and some constant-time calculations. This is the essence of dynamic programming!
EDIT: I updated the answer to provide a description of the Dynamic Programming approach. Please consider working through the interactive textbook I mentioned in the comments! http://interactivepython.org/runestone/static/pythonds/index.html
I would approach the problem like this. You need FFF to have one more pot, FFFFF for two pots, etc. To handle the end cases, add an F at each end.
Because this is very similar to a 16-bit integer, the algorithm should use tricks like binary arithmetic operations.
Here is an implementation in Python 3 that uses bit masking (value & 1), bit shifting (value >>= 1) and integer math ((zeros - 1) // 2) to count empty slots and calculate how many flower pots could fit.
#value = 0b1000100100001
value = 0b0011000001100
width = 13
print(bin(value))
pots = 0   # number of flower pots possible
zeros = 1  # number of zero bits in a row, start with one leading zero
for i in range(width):
    if value & 1:  # bit is one, count the number of zeros
        if zeros > 0:
            pots += (zeros - 1) // 2
        zeros = 0
    else:  # bit is zero, increment the number found
        zeros += 1
    value >>= 1  # shift the bits to the right
zeros += 1  # add one trailing zero
pots += (zeros - 1) // 2
print(pots, "flower pots")
The solution is really simple: check the previous and current value of the position, mark the position as plantable (or puttable) and increment the count. Then read the next value; if it is already planted, backtrack, change the previous value and decrement the count. The complexity is O(n). What we really want to check is the occurrence of 1001. Following is the implementation of the algorithm in Java.
public boolean canPlaceFlowers(List<Boolean> flowerbed, int numberToPlace) {
Boolean previous = false;
boolean puttable = false;
boolean prevChanged = false;
int planted = 0;
for (Boolean current : flowerbed) {
if (previous == false && current == false) {
puttable = true;
}
if (prevChanged == true && current == true) {
planted--;
}
if (puttable) {
previous = true;
prevChanged = true;
planted++;
puttable = false;
} else {
previous = current;
prevChanged = false;
}
}
if (planted >= numberToPlace) {
return true;
}
return false;
}
private static void canPlaceOneFlower(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 1);
System.out.println("Can place 1 flower");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
private static void canPlaceTwoFlowers(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 2);
System.out.println("Can place 2 flowers");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
private static void canPlaceThreeFlowers(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 3);
System.out.println("Can place 3 flowers");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
private static void canPlaceFourFlowers(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 4);
System.out.println("Can place 4 flowers");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
public static void main(String[] args) {
List<Boolean> flowerbed = makeBed(new int[] { 0, 0, 0, 0, 0, 0, 0 });
FlowerBed fb = new FlowerBed();
canPlaceFourFlowers(flowerbed, fb);
canPlaceThreeFlowers(flowerbed, fb);
flowerbed = makeBed(new int[] { 0, 0, 0, 1, 0, 0, 0 });
canPlaceFourFlowers(flowerbed, fb);
canPlaceThreeFlowers(flowerbed, fb);
canPlaceTwoFlowers(flowerbed, fb);
flowerbed = makeBed(new int[] { 1, 0, 0, 1, 0, 0, 0, 1 });
canPlaceFourFlowers(flowerbed, fb);
canPlaceThreeFlowers(flowerbed, fb);
canPlaceTwoFlowers(flowerbed, fb);
canPlaceOneFlower(flowerbed, fb);
}
My solution using dynamic programming.
ar is array in the form of ['F','T','F'].
import numpy as np
def pot(ar):
    s = len(ar)
    rt = np.zeros((s,s))
    for k in range(0,s):
        for i in range(s-k):
            for j in range(i,i+k+1):
                left = 0
                right = 0
                if ar[j] != 'F':
                    continue
                if j-1 >= i and ar[j-1] == 'T':
                    continue
                else:
                    left = 0
                if j+1 <= i+k and ar[j+1] == 'T':
                    continue
                else:
                    right = 0
                if j-2 >= i:
                    left = rt[i][j-2]
                if j+2 <= i+k:
                    right = rt[j+2][i+k]
                rt[i][i+k] = max(rt[i][i+k], left+right+1)
    return rt[0][len(ar)-1]
My solution written in C#
private static int CheckAvailableSlots(string str)
{
int counter = 0;
char[] chrs = str.ToCharArray();
if (chrs.FirstOrDefault().Equals('F'))
if (chrs.Length == 1)
counter++;
else if (chrs.Skip(1).FirstOrDefault().Equals('F'))
counter++;
if (chrs.LastOrDefault().Equals('F') && chrs.Reverse().Skip(1).FirstOrDefault().Equals('F'))
counter++;
for (int i = 1; i < chrs.Length - 2; i++)
{
if (chrs[i - 1].Equals('T'))
continue;
else if (chrs[i].Equals('F') && chrs[i + 1].Equals('F'))
{
chrs[i] = 'T';
counter++;
i++;
}
else
i++;
}
return counter;
}
// 1='T'
// 0='F'
int[] flowerbed = new int[] {1,0,0,0,0,1};
public boolean canPlaceFlowers(int[] flowerbed, int n) {
int tg = 0;
for (int i = 0, g = 1; i < flowerbed.length && tg < n; i++) {
g += flowerbed[i] == 0 ? flowerbed.length - 1 == i ? 2 : 1 : 0;
if (flowerbed[i] == 1 || i == flowerbed.length - 1) {
tg += g / 2 - (g % 2 == 0 ? 1 : 0);
g = 0;
}
}
return tg >= n;
}
Most of these answers (unless they alter the array or traverse a copy) don't consider the situation where the first 3 (or last 3) pots are empty. These solutions will incorrectly determine that FFFT contains 2 spaces, rather than just one. We therefore need to start at the third element (rather than the second) and end at index length - 3 (rather than length - 2). Also, while looping through the array, if an eligible index is found, the index must be incremented by 2; otherwise TTFFFFT would give 2 available plots instead of one. This is true unless you alter the array while looping or use a copy of the array and alter it.
Edit: this holds true unless the question is how many spaces are available for planting, rather than how many total plants can be added

Return the number of elements of an array that is the most "expensive"

I recently stumbled upon an interesting problem, and I am wondering if my solution is optimal.
You are given an array of zeros and ones. The goal is to return the
number of zeros and the number of ones in the most expensive sub-array.
The cost of an array is the number of 1s divided by the number of 0s. In
case there are no zeros in the sub-array, the cost is zero.
At first I tried brute-forcing, but for an array of 10,000 elements it was far too slow and I ran out of memory.
My second idea was, instead of creating those sub-arrays, to remember the start and the end of the sub-array. That way I saved a lot of memory, but the complexity was still O(n^2).
My final solution that I came up is I think O(n). It goes like this:
Start at the beginning of the array, for each element, calculate the cost of the sub-arrays starting from 1, ending at the current index. So we would start with a sub-array consisting of the first element, then first and second etc. Since the only thing that we need to calculate the cost, is the amount of 1s and 0s in the sub-array, I could find the optimal end of the sub-array.
The second step was to start from the end of the sub-array from step one, and repeat the same to find the optimal beginning. That way I am sure that there is no better combination in the whole array.
Is this solution correct? If not, is there a counter-example that will show that this solution is incorrect?
Edit
For clarity:
Let's say our input array is 0101.
There are 10 subarrays:
0,1,0,1,01,10,01,010,101 and 0101.
The cost of the most expensive subarray would be 2 since 101 is the most expensive subarray. So the algorithm should return 1,2
Edit 2
There is one more thing that I forgot, if 2 sub-arrays have the same cost, the longer one is "more expensive".
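When hunting for a counter-example, it helps to have a slow but obviously correct reference to compare against. The following is a minimal O(n^2) brute-force sketch in Python, assuming the rules exactly as stated in the question (cost is ones divided by zeros, cost is zero when there are no zeros, and ties go to the longer sub-array). The name most_expensive and the (zeros, ones) return order are choices made for this example only.

```python
from fractions import Fraction

def most_expensive(bits):
    """Return (zeros, ones) of the most expensive sub-array, checking every one."""
    best_key = (Fraction(-1), 0)  # (cost, length); any real sub-array beats this
    best_counts = (0, 0)
    n = len(bits)
    for start in range(n):
        zeros = ones = 0
        for end in range(start, n):
            # Extend the sub-array by one element and update the running counts.
            if bits[end] == 0:
                zeros += 1
            else:
                ones += 1
            cost = Fraction(ones, zeros) if zeros else Fraction(0)
            key = (cost, end - start + 1)  # tuple comparison: cost first, then length
            if key > best_key:
                best_key = key
                best_counts = (zeros, ones)
    return best_counts
```

Exact fractions are used instead of floats so that equal costs compare as equal and the length tie-break is applied reliably. Any faster candidate can then be checked against this function on random inputs.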
Let me sketch a proof for my assumption:
(a = whole array, *=zero or more, +=one or more, {n}=exactly n)
Cases a=0* and a=1+ : c=0
Cases a=01+ and a=1+0 : conforms to 1*0{1,2}1*, a is optimum
For the normal case, a contains one or more 0s and 1s.
This means there is some optimum sub-array of non-zero cost.
(S) Assume s is an optimum sub-array of a.
It contains one or more zeros. (Otherwise its cost would be zero).
(T) Let t be the longest `1*0{1,2}1*` sequence within s
(and among the equally long ones, the one with the most 1s).
(Note: There is always one such, e.g. `10` or `01`.)
Let N be the number of 1s in t.
Now, we prove that always t = s.
By showing it is not possible to add adjacent parts of s to t if (S).
(E) Assume t shorter than s.
We cannot add 1s at either side, otherwise not (T).
For each 0 we add from s, we have to add at least N more 1s
later to get at least the same cost as our `1*0+1*`.
This means: We have to add at least one run of N 1s.
If we add some run of N+1, N+2 ... somewhere, then not (T).
If we add consecutive zeros, we need to compensate
with longer runs of 1s, thus not (T).
This leaves us with the only option of adding single zeros and runs of N 1s each.
This would give (symmetry) `1{n}*0{1,2}1{m}01{n+m}...`
If m>0 then `1{m}01{n+m}` is longer than `1{n}0{1,2}1{m}`, thus not (T).
If m=0 then we get `1{n}001{n}`, thus not (T).
So assumption (E) must be wrong.
Conclusion: The optimum sub-array must conform to 1*0{1,2}1*.
Here is my O(n) impl in Java according to the assumption in my last comment (1*01* or 1*001*):
public class Q19596345 {
public static void main(String[] args) {
try {
String array = "0101001110111100111111001111110";
System.out.println("array=" + array);
SubArray current = new SubArray();
current.array = array;
SubArray best = (SubArray) current.clone();
for (int i = 0; i < array.length(); i++) {
current.accept(array.charAt(i));
SubArray candidate = (SubArray) current.clone();
candidate.trim();
if (candidate.cost() > best.cost()) {
best = candidate;
System.out.println("better: " + candidate);
}
}
System.out.println("best: " + best);
} catch (Exception ex) { ex.printStackTrace(System.err); }
}
static class SubArray implements Cloneable {
String array;
int start, leftOnes, zeros, rightOnes;
// optimize 1*0*1* by cutting
void trim() {
if (zeros > 1) {
if (leftOnes < rightOnes) {
start += leftOnes + (zeros - 1);
leftOnes = 0;
zeros = 1;
} else if (leftOnes > rightOnes) {
zeros = 1;
rightOnes = 0;
}
}
}
double cost() {
if (zeros == 0) return 0;
else return (leftOnes + rightOnes) / (double) zeros +
(leftOnes + zeros + rightOnes) * 0.00001;
}
void accept(char c) {
if (c == '1') {
if (zeros == 0) leftOnes++;
else rightOnes++;
} else {
if (rightOnes > 0) {
start += leftOnes + zeros;
leftOnes = rightOnes;
zeros = 0;
rightOnes = 0;
}
zeros++;
}
}
public Object clone() throws CloneNotSupportedException { return super.clone(); }
public String toString() { return String.format("%s at %d with cost %.3f with zeros,ones=%d,%d",
array.substring(start, start + leftOnes + zeros + rightOnes), start, cost(), zeros, leftOnes + rightOnes);
}
}
}
If we can show that the max array is always 1+0+1+, 1+0, or 01+ (in regular expression notation), then we can work from the lengths of the runs.
So for the array (010011), we have (always starting with a run of 1s)
0,1,1,2,2
so the ratios are (0, 1, 0.3, 1.5, 1), which leads to an array of 10011 as the final result, ignoring the one runs
Cost of the left edge is 0
Cost of the right edge is 2
So in this case, the right edge is the correct answer -- 011
I haven't yet been able to come up with a counterexample, but the proof isn't obvious either. Hopefully we can crowd source one :)
The degenerate cases are simpler
All 1's and 0's are obvious, as they all have the same cost.
A string of just 1+,0+ or vice versa is all the 1's and a single 0.
How about this? As a C# programmer, I am thinking we can use something like Dictionary of <int,int,int>.
The first int would be use as key, second as subarray number and the third would be for the elements of sub-array.
For your example
key|Sub-array number|elements
1|1|0
2|2|1
3|3|0
4|4|1
5|5|0
6|5|1
7|6|1
8|6|0
9|7|0
10|7|1
11|8|0
12|8|1
13|8|0
14|9|1
15|9|0
16|9|1
17|10|0
18|10|1
19|10|0
20|10|1
Then you can run through the dictionary and store the highest in a variable.
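var maxcost = 0.0;
var zeros = 0;
var ones = 0;
var cost = 0.0;
for (var i = 1; i <= 20; i++)
{
    // a change in sub-array number means a new sub-array starts here: reset the counters
    if (i == 1 || dictionary.arraynumber[i] != dictionary.arraynumber[i - 1])
    {
        zeros = 0;
        ones = 0;
    }
    if (dictionary.values[i] == 0)
    {
        zeros++;
    }
    else
    {
        ones++;
    }
    // the cost is defined as zero when the sub-array has no zeros
    cost = zeros == 0 ? 0.0 : (double)ones / zeros;
    if (cost > maxcost)
    {
        maxcost = cost;
    }
}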
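Note that the table holds every element of every sub-array, n(n+1)(n+2)/6 rows in total (20 rows for n = 4), so this approach needs O(n^3) time and memory rather than anything logarithmic.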
I think we can modify the maximal subarray problem to fit to this question. Here's my attempt at it:
void FindMaxRatio(int[] array, out int maxNumOnes, out int maxNumZeros)
{
    maxNumOnes = 0;
    maxNumZeros = 0;
    int numOnes = 0;
    int numZeros = 0;
    double maxSoFar = 0;
    double maxEndingHere = 0;
    for (int i = 0; i < array.Length; i++)
    {
        if (array[i] == 0) numZeros++;
        if (array[i] == 1) numOnes++;
        if (numZeros == 0) maxEndingHere = 0;
        else maxEndingHere = numOnes / (double)numZeros;
        // a ratio below 1 cannot be part of the best answer: start over
        if (maxEndingHere < 1 && maxEndingHere > 0)
        {
            numZeros = 0;
            numOnes = 0;
        }
        if (maxSoFar < maxEndingHere)
        {
            maxSoFar = maxEndingHere;
            maxNumOnes = numOnes;
            maxNumZeros = numZeros;
        }
    }
}
I think the key is that if the ratio is less than 1, we can disregard that subsequence because
there will always be a subsequence 01 or 10 whose ratio is 1. This seemed to work for 010011.

Convert string to palindrome string with minimum insertions

In order to find the minimal number of insertions required to convert a given string(s) to palindrome I find the longest common subsequence of the string(lcs_string) and its reverse. Therefore the number of insertions to be made is length(s) - length(lcs_string)
What method should be employed to find the equivalent palindrome string on knowing the number of insertions to be made?
For example :
1) azbzczdzez
Number of insertions required : 5
Palindrome string : azbzcezdzeczbza
Multiple palindrome strings may exist for the same input string, but I only want to find one of them.
Let S[i, j] represents a sub-string of string S starting from index i and ending at index j (both inclusive) and c[i, j] be the optimal solution for S[i, j].
Obviously, c[i, j] = 0 if i >= j.
In general, we have the recurrence:
To elaborate on VenomFangs answer, there is a simple dynamic programming solution to this one. Note that I'm assuming the only operation allowed here is insertion of characters (no deletion, updates). Let S be a string of n characters. The simple recursion function P for this is:
P[i..j] = P[i+1..j-1],                     if S[i] = S[j]
P[i..j] = min(P[i..j-1], P[i+1..j]) + 1,   otherwise
If you'd like more explanation on why this is true, post a comment and I'd be happy to explain (though it's pretty easy to see with a little thought). This, by the way, is the complement of the LCS function you use, hence validating that your solution is in fact optimal. Of course it's wholly possible I bungled; if so, someone do let me know!
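As a concrete illustration of the recurrence, here is a minimal bottom-up sketch in Python. It uses 0-based indices rather than the 1-based ones in the text, and the function name min_insertions is invented for this example.

```python
def min_insertions(s: str) -> int:
    """Fewest insertions that turn s into a palindrome, via the P[i..j] recurrence."""
    n = len(s)
    if n == 0:
        return 0
    # p[i][j] covers s[i..j] inclusive; entries with i >= j stay 0 (base case).
    p = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # fill shorter ranges first
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                p[i][j] = p[i + 1][j - 1]   # ends already match, nothing to insert
            else:
                p[i][j] = min(p[i][j - 1], p[i + 1][j]) + 1
    return p[0][n - 1]
```

For the string from the question, azbzczdzez, this returns 5, which agrees with length(s) - length(lcs_string).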
Edit: To account for the palindrome itself, this can be easily done as follows:
As stated above, P[1..n] would give you the number of insertions required to make this string a palindrome. Once the above two-dimensional array is built up, here's how you find the palindrome:
Start with i=1, j=n. Now,
string output = "";
while (i < j)
{
    if (S[i] == S[j]) // the ends match, so no insertion is made at this point
    {
        output = output + S[i];
        i++;
        j--;
    }
    else if (P[i][j] == P[i+1][j] + 1) // a copy of S[i] is inserted after S[j]
    {
        output = output + S[i];
        i++;
    }
    else // a copy of S[j] is inserted before S[i]
    {
        output = output + S[j];
        j--;
    }
}
// output now holds the left half; reverse() stands for a helper returning the reversed string.
// For an odd-sized palindrome (i == j), S[i] is the middle character.
if (i == j)
    cout << output << S[i] << reverse(output);
else
    cout << output << reverse(output);
Does that make better reading?
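For readers who want something they can run, the walk back through the table can be sketched in Python as follows. This is one possible reading of the procedure above, not the answerer's exact code: it uses 0-based indices, compares S[i] with S[j] directly to detect the "no insertion" case, and handles the odd-length middle character explicitly. The name make_palindrome is an assumption for this example.

```python
def make_palindrome(s: str) -> str:
    """Build one shortest palindrome that contains s as a subsequence."""
    n = len(s)
    # Same table as the recurrence: p[i][j] = fewest insertions for s[i..j].
    p = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                p[i][j] = p[i + 1][j - 1]
            else:
                p[i][j] = min(p[i][j - 1], p[i + 1][j]) + 1
    left = []  # the left half, built outside-in; the right half is its mirror
    i, j = 0, n - 1
    while i < j:
        if s[i] == s[j]:
            left.append(s[i])   # both ends are kept, nothing inserted
            i += 1
            j -= 1
        elif p[i + 1][j] <= p[i][j - 1]:
            left.append(s[i])   # mirror s[i]: a copy is inserted on the right
            i += 1
        else:
            left.append(s[j])   # mirror s[j]: a copy is inserted on the left
            j -= 1
    middle = s[i] if i == j else ""  # one unpaired character for odd lengths
    half = "".join(left)
    return half + middle + half[::-1]
```

When both neighbours in the table tie, this sketch arbitrarily prefers mirroring s[i]; other tie-breaks give different but equally short palindromes.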
The solution looks to be a dynamic programming solution.
You may be able to find your answer in the following post: How can I compute the number of characters required to turn a string into a palindrome?
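PHP solution (greedy: it always produces a palindrome, but not necessarily with the minimum number of insertions, e.g. "abb" becomes "bbabb" instead of "abba"; each array_merge insertion is itself O(n))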
function insertNode(&$arr, $idx, $val) {
$arr = array_merge(array_slice($arr, 0, $idx), array($val), array_slice($arr, $idx));
}
function createPalindrome($arr, $s, $e) {
$i = 0;
while(true) {
if($s >= $e) {
break;
} else if($arr[$s] == $arr[$e]) {
$s++; $e--; // shrink the queue from both sides
continue;
} else {
insertNode($arr, $s, $arr[$e]);
$s++;
}
}
echo implode("", $arr);
}
$arr = array('b', 'e', 'a', 'a', 'c', 'd', 'a', 'r', 'e');
echo createPalindrome ( $arr, 0, count ( $arr ) - 1 );
Simple. See below :)
String pattern = "abcdefghgf";
boolean isPalindrome = false;
int i=0,j=pattern.length()-1;
int mismatchCounter = 0;
while(i<=j)
{
//reverse matching
if(pattern.charAt(i)== pattern.charAt(j))
{
i++; j--;
isPalindrome = true;
continue;
}
else if(pattern.charAt(i)!= pattern.charAt(j))
{
i++;
mismatchCounter++;
}
}
System.out.println("The pattern string is :"+pattern);
System.out.println("Minimum number of characters required to make this string a palindrome : "+mismatchCounter);

Control flow graph & cyclomatic complexity

I have to find the control flow graph and cyclomatic complexity for this code and then suggest some white box test cases and black box test cases. But I am having trouble making a CFG for the code.
Would appreciate some help on test cases as well.
private void downShift(int index)
{
// index of "child", which will be either index * 2 or index * 2 + 1
int childIndex;
// temp storage for item at index where shifting begins
Comparable temp = theItems[index];
// shift items, as needed
while (index * 2 <= theSize)
{
// set childIndex to "left" child
childIndex = index * 2;
// move to "right" child if "right" child < "left" child
if (childIndex != theSize && theItems[childIndex + 1].compareTo(theItems[childIndex]) < 0)
childIndex++;
if (theItems[childIndex].compareTo(temp) < 0)
{
// shift "child" down if child < temp
theItems[index] = theItems[childIndex];
}
else
{
// shifting complete
break;
}
// increment index
index = childIndex;
}
// position item that was originally at index where shifting began
theItems[index] = temp;
}
The basic cyclomatic complexity here is 4: while + if + if + 1. If you consider extended cyclomatic complexity as is done by Understand or CMTJava, you also need to add 1 for the conjuncts, so it will be 5. Unconditional control statements such as break do not affect the cyclomatic complexity value.

How can I compute the number of characters required to turn a string into a palindrome?

I recently found a contest problem that asks you to compute the minimum number of characters that must be inserted (anywhere) in a string to turn it into a palindrome.
For example, given the string: "abcbd" we can turn it into a palindrome by inserting just two characters: one after "a" and another after "d": "adbcbda".
This seems to be a generalization of a similar problem that asks for the same thing, except characters can only be added at the end - this has a pretty simple solution in O(N) using hash tables.
I have been trying to modify the Levenshtein distance algorithm to solve this problem, but haven't been successful. Any help on how to solve this (it doesn't necessarily have to be efficient, I'm just interested in any DP solution) would be appreciated.
Note: This is just a curiosity. Dav proposed an algorithm which can be modified to DP algorithm to run in O(n^2) time and O(n^2) space easily (and perhaps O(n) with better bookkeeping).
Of course, this 'naive' algorithm might actually come in handy if you decide to change the allowed operations.
Here is a 'naive'ish algorithm, which can probably be made faster with clever bookkeeping.
Given a string, we guess the middle of the resulting palindrome and then try to compute the number of inserts required to make the string a palindrome around that middle.
If the string is of length n, there are 2n+1 possible middles (Each character, between two characters, just before and just after the string).
Suppose we consider a middle which gives us two strings L and R (one to left and one to right).
If we are using inserts, I believe the Longest Common Subsequence algorithm (which is a DP algorithm) can now be used to create a 'super' string which contains both L and the reverse of R; see Shortest common supersequence.
Pick the middle which gives you the smallest number of inserts.
This is O(n^3), I believe. (Note: I haven't tried proving that it is true.)
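A small Python sketch of this middle-guessing idea is given below. It only counts insertions; it does not build the palindrome. The helper names are invented for the example, and the count for each middle relies on the shortest-common-supersequence relation |L| + |R| - 2 * LCS(L, reverse(R)), which is an interpretation of the description above rather than something the answer spells out.

```python
def lcs_length(a: str, b: str) -> int:
    """Classic O(len(a) * len(b)) longest-common-subsequence length, row by row."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for k, other in enumerate(b, start=1):
            if ch == other:
                cur.append(prev[k - 1] + 1)
            else:
                cur.append(max(prev[k], cur[k - 1]))
        prev = cur
    return prev[-1]

def min_insertions_by_middle(s: str) -> int:
    """Try every middle and keep the cheapest one."""
    n = len(s)
    best = n  # safe upper bound; every candidate below is at most n
    for k in range(n + 1):
        left = s[:k]
        # Middle between two characters: the right part starts at s[k].
        # Middle on the character s[k]: the right part starts after it.
        rights = [s[k:]] + ([s[k + 1:]] if k < n else [])
        for right in rights:
            common = lcs_length(left, right[::-1])
            best = min(best, len(left) + len(right) - 2 * common)
    return best
```

There are O(n) middles and each LCS call is O(n^2), which matches the O(n^3) estimate.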
My C# solution looks for repeated characters in a string and uses them to reduce the number of insertions. In a word like program, I use the 'r' characters as a boundary. Inside of the 'r's, I make that a palindrome (recursively). Outside of the 'r's, I mirror the characters on the left and the right.
Some inputs have more than one shortest output: output can be toutptuot or outuputuo. My solution only selects one of the possibilities.
Some example runs:
radar -> radar, 0 insertions
esystem -> metsystem, 2 insertions
message -> megassagem, 3 insertions
stackexchange -> stegnahckexekchangets, 8 insertions
First I need to check if an input is already a palindrome:
public static bool IsPalindrome(string str)
{
for (int left = 0, right = str.Length - 1; left < right; left++, right--)
{
if (str[left] != str[right])
return false;
}
return true;
}
Then I need to find any repeated characters in the input. There may be more than one. The word message has two most-repeated characters ('e' and 's'):
private static bool TryFindMostRepeatedChar(string str, out List<char> chs)
{
chs = new List<char>();
int maxCount = 1;
var dict = new Dictionary<char, int>();
foreach (var item in str)
{
int temp;
if (dict.TryGetValue(item, out temp))
{
dict[item] = temp + 1;
maxCount = Math.Max(maxCount, temp + 1);
}
else
dict.Add(item, 1);
}
foreach (var item in dict)
{
if (item.Value == maxCount)
chs.Add(item.Key);
}
return maxCount > 1;
}
My algorithm is here:
public static string MakePalindrome(string str)
{
List<char> repeatedList;
if (string.IsNullOrWhiteSpace(str) || IsPalindrome(str))
{
return str;
}
//If an input has repeated characters,
// use them to reduce the number of insertions
else if (TryFindMostRepeatedChar(str, out repeatedList))
{
string shortestResult = null;
foreach (var ch in repeatedList) //"program" -> { 'r' }
{
//find boundaries
int iLeft = str.IndexOf(ch); // "program" -> 1
int iRight = str.LastIndexOf(ch); // "program" -> 4
//make a palindrome of the inside chars
string inside = str.Substring(iLeft + 1, iRight - iLeft - 1); // "program" -> "og"
string insidePal = MakePalindrome(inside); // "og" -> "ogo"
string right = str.Substring(iRight + 1); // "program" -> "am"
string rightRev = Reverse(right); // "program" -> "ma"
string left = str.Substring(0, iLeft); // "program" -> "p"
string leftRev = Reverse(left); // "p" -> "p"
//Shave off extra chars in rightRev and leftRev
// When input = "message", this loop converts "meegassageem" to "megassagem",
// ("ee" to "e"), as long as the extra 'e' is an inserted char
while (left.Length > 0 && rightRev.Length > 0 &&
left[left.Length - 1] == rightRev[0])
{
rightRev = rightRev.Substring(1);
leftRev = leftRev.Substring(1);
}
//piece together the result
string result = left + rightRev + ch + insidePal + ch + right + leftRev;
//find the shortest result for inputs that have multiple repeated characters
if (shortestResult == null || result.Length < shortestResult.Length)
shortestResult = result;
}
return shortestResult;
}
else
{
//For inputs that have no repeated characters,
// just mirror the characters using the last character as the pivot.
for (int i = str.Length - 2; i >= 0; i--)
{
str += str[i];
}
return str;
}
}
Note that you need a Reverse function:
public static string Reverse(string str)
{
string result = "";
for (int i = str.Length - 1; i >= 0; i--)
{
result += str[i];
}
return result;
}
C# Recursive solution adding to the end of the string:
There are 2 base cases: when the length is 1, or when the length is 2 and both characters are equal. Recursive case: if the extremes are equal, then
make the inner string (without the extremes) a palindrome and return that with the extremes around it.
If the extremes are not equal, then add the first character to the end, make the
inner string (including the previous last character) a palindrome, and return that.
public static string ConvertToPalindrome(string str) // By only adding characters at the end
{
if (str.Length == 1) return str; // base case 1
if (str.Length == 2 && str[0] == str[1]) return str; // base case 2
else
{
if (str[0] == str[str.Length - 1]) // keep the extremes and call
return str[0] + ConvertToPalindrome(str.Substring(1, str.Length - 2)) + str[str.Length - 1];
else //Add the first character at the end and call
return str[0] + ConvertToPalindrome(str.Substring(1, str.Length - 1)) + str[0];
}
}
