Complexity of the word break algorithm - algorithm

My question is similar to one asked on Stack Overflow in the past, and concerns this problem:
http://www.geeksforgeeks.org/dynamic-programming-set-32-word-break-problem/
I do not understand the solution I wrote: it does not use DP, so how can it be handling the overlapping subproblems? I think it is not. Can someone clarify?
My dictionary is {"cat", "catdog", "dog", "mouse"} and my test string is "catdogmouse".
Here is the method I wrote:
public static boolean recursiveWordBreak2(String s, int start) {
System.out.println("s is:"+s.substring(start));
if (s.isEmpty() || start >= s.length()) {
return true;
}
for (int i = start; i <= s.length(); i++) {
String str = s.substring(start, i);
System.out.println("substr:" + str);
if (dictSet.contains(str)) {
return recursiveWordBreak2(s, i);
}
}
return false;
}

Your solution uses recursion only. Recognising that the problem is a DP problem allows you to memoize (remember) previous results so that you can reuse them without doing the recursion again.
Using the approach in the link you provided, if the dictionary is {a,b,c,d,e} and the input is "abcde", you would need to check whether "cde" is valid twice with recursive code, whereas a DP solution would remember that "cde" is valid and only have to check it once.
edit: the dictionary {a,b,c,d,e} should be {a, b, ab, cde} to illustrate checking "cde" twice (once after "a" + "b" and once after "ab")
edit2 (see the comment about the algorithm having a logic issue):
if (dictSet.contains(str)) {
return recursiveWordBreak2(s, i);
}
should be
if (dictSet.contains(str) && recursiveWordBreak2(s, i)) { return true; }
That way, if contains is true but recursiveWordBreak2 returns false, the loop continues and tries a longer prefix instead of returning false immediately.
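A memoized version of the recursion could look roughly like the sketch below. It is only an illustration under a few assumptions: the class name WordBreakMemo, the method name canBreak, and passing the dictionary as a parameter (instead of the dictSet field used in the question) are all made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: top-down word break with memoization.
final class WordBreakMemo {
    static boolean canBreak(String s, Set<String> dict) {
        return canBreak(s, 0, dict, new HashMap<>());
    }

    // memo maps a start index to whether s.substring(start) can be segmented.
    private static boolean canBreak(String s, int start, Set<String> dict,
                                    Map<Integer, Boolean> memo) {
        if (start == s.length()) {
            return true; // the whole string has been consumed
        }
        Boolean cached = memo.get(start);
        if (cached != null) {
            return cached; // this suffix was already solved once
        }
        boolean result = false;
        // Try every non-empty prefix of the remaining suffix.
        for (int end = start + 1; end <= s.length() && !result; end++) {
            result = dict.contains(s.substring(start, end))
                    && canBreak(s, end, dict, memo);
        }
        memo.put(start, result);
        return result;
    }
}
```

Each start index is solved at most once, so there are at most n distinct subproblems, each trying at most n prefixes.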

Related

Binary Search: Program doesn't terminate

I've been trying to learn algorithms and as part of this I have been trying to code binary search and the logic seems fine. The code doesn't terminate and the IDE stays idle forever. I don't understand what I'm doing wrong. Any help is appreciated. Thanks in advance!
public class BinarySearch {
public static void main(String[] args) {
int[] arr = {1, 2, 3, 4, 5};
int no = 5;
System.out.print(binSearch(arr, no, 0, arr.length - 1));
}
private static boolean binSearch(int[] arr, int no, int start, int end) {
while(start <= end) {
int mid = (start + end) / 2;
if (arr[mid] == no) {
return true;
} else if (no > arr[mid]) {
binSearch(arr, no, mid + 1, end);
} else if(no < arr[mid]) {
binSearch(arr, no, start, mid - 1);
}
}
return false;
}
}
You are missing the return on the two recursive calls:
private static boolean binSearch(int[] arr, int no, int start, int end) {
while(start <= end) {
int mid = (start + end) / 2;
if (arr[mid] == no) {
return true;
} else if (no > arr[mid]) {
return binSearch(arr, no, mid + 1, end);
} else if(no < arr[mid]) {
return binSearch(arr, no, start, mid - 1);
}
}
return false;
}
You could also consider writing it in a non-recursive loop.
Okay, so I think we should review recursion a bit. Here is the structure of your method, simplified:
binSearch(arr, no, start, end) {
    while (start <= end) {
        int mid = (start + end) / 2;
        if (arr[mid] == no) {
            return true; // when it finally matches, return true
        }
        else if (arr[mid] > no) {
            binSearch(arr, no, start, mid - 1); // calls binSearch with new bounds; the result is discarded
        }
    }
}
Just to illustrate recursion, imagine we want some value B for an input A. Now imagine a node, or some point of origin, that represents our input A. Every point or node that follows A is a step we take towards finding the value B.
Once we find the value we want, the structure of our approach can be illustrated as a single directed graph: A --> C --> D --> B
That is essentially how recursion works. Now, first, let's take a look at your else-if statements. When your parameters meet one of the else-if conditions, you make a call to your binSearch method.
What this does is basically create a new point of origin rather than working from the initial one. So let's say that at iteration number 3 you finally meet your boolean condition and it returns true. But where does it return true to?
Only to the most recent caller of binSearch. Let's call it iteration 2.
Now, iteration 2 ignores that returned value and simply moves on to the next block of code, which brings us back to your while loop. The only way your code can move on to the block after it (which returns false) is to break out of the while loop, i.e. to have your start value be greater than your end value.
But remember, we are in iteration 2, and iteration 2 was given values for start and end that satisfy the while condition and never change, so it loops again. Whatever else-if branch iteration 2 landed on before the final iteration that returned true, it will keep repeating indefinitely.
The obvious solution, as mentioned above, is to put 'return' before each recursive call, so that the result is passed all the way back to the original call to binSearch.
Also, the while loop is not necessary unless you are doing it without recursion.
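To make the last two points concrete, here is a minimal sketch of both variants: a loop with no recursion, and recursion with no loop. The class and method names are invented for this example, and the array is assumed to be sorted in ascending order.

```java
// Hypothetical sketch: two equivalent ways to write binary search.
final class BinarySearchSketch {
    // Iterative: the loop narrows [start, end] itself, so no recursion is needed.
    static boolean containsIterative(int[] sorted, int target) {
        int start = 0;
        int end = sorted.length - 1;
        while (start <= end) {
            int mid = start + (end - start) / 2; // avoids int overflow of start + end
            if (sorted[mid] == target) {
                return true;
            } else if (sorted[mid] < target) {
                start = mid + 1;
            } else {
                end = mid - 1;
            }
        }
        return false;
    }

    // Recursive: each call returns the result of the call it makes, so no loop is needed.
    static boolean containsRecursive(int[] sorted, int target, int start, int end) {
        if (start > end) {
            return false; // empty range: the value is not present
        }
        int mid = start + (end - start) / 2;
        if (sorted[mid] == target) {
            return true;
        }
        if (sorted[mid] < target) {
            return containsRecursive(sorted, target, mid + 1, end);
        }
        return containsRecursive(sorted, target, start, mid - 1);
    }
}
```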

Complexity of |= in Ruby

What is the complexity (Big O), for this operation:
my_array |= [new_element]
Is it O(n) because it needs to go through the existing array checking if new_element exists?
Let's expand upon Wand Maker's comment.
Take a look at
http://ruby-doc.org/core-2.2.3/Array.html#method-i-7C
https://github.com/ruby/ruby/blob/trunk/array.c
Source for rb_ary_or
static VALUE
rb_ary_or(VALUE ary1, VALUE ary2)
{
VALUE hash, ary3;
long i;
ary2 = to_ary(ary2);
hash = ary_make_hash(ary1);
for (i=0; i<RARRAY_LEN(ary2); i++) {
VALUE elt = RARRAY_AREF(ary2, i);
if (!st_update(RHASH_TBL_RAW(hash), (st_data_t)elt, ary_hash_orset, (st_data_t)elt)) {
RB_OBJ_WRITTEN(hash, Qundef, elt);
}
}
ary3 = rb_hash_values(hash);
ary_recycle_hash(hash);
return ary3;
}
I would say that the answer to your question is "yes" (at best; refer to @cliffordheath's comment), as it seems we have O(n1) for ary_make_hash(ary1) and O(n2) for the for loop, where n1 and n2 are the lengths of ary1 and ary2.
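For readers who do not read C, the same two-phase shape (hash the first array, then add the second array's elements) can be sketched in Java. This is only an analogue written for illustration, not Ruby's implementation; the class and method names are invented, and the O(n1 + n2) figure assumes hash operations take expected constant time.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical analogue of a hash-based array union that keeps first-seen order.
final class ArrayUnionSketch {
    static <T> List<T> union(List<T> first, List<T> second) {
        Set<T> seen = new LinkedHashSet<>(first); // expected O(n1): hash every element of first
        seen.addAll(second);                      // expected O(n2): add elements not yet seen
        return new ArrayList<>(seen);             // values in insertion order
    }
}
```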

does eight queens backtracking solution require any sorting at all?

Hi, I have just sat my final-year programming exam, and I was asked the question:
What sorting and searching algorithms are used to solve the 8 queens problem?
Correct me if I am wrong, but there is no sorting at all...
I understand that a basic level of searching is needed when placing a queen and during backtracking, but where does sorting come into this, if at all?
Below is what I have been looking at, and I just can't see it.
public class Queens
{
int[] positions;
int counter = 0;
boolean isFinished = false;
public Queens()
{
positions = new int[8];
placeQueens(0);
}
public boolean canPlaceQueen(int row, int column)
{
for (int i = 0; i < row; i++)
{
if (positions[i] == column || (i - row)== (positions[i]-column) || (i-row)== (column - positions[i]))
{
return false;
}
}
return true;
}
public void placeQueens(int row)
{
counter++;
printQueens();
for (int column = 0; column < 8; column++)
{
if (canPlaceQueen(row, column))
{
positions[row] = column;
if (row == 8-1)
{
System.out.println("FINAL " +counter);
isFinished = true;
printQueens();
}
else if(!isFinished)
{
placeQueens(row+1);
}
}
}
}
public void printQueens()
{
for (int i = 0; i < 8; i++)
{
for (int j = 0; j< 8; j++)
{
if (positions[i] == j)
{
System.out.print("Q ");
}
else
{
System.out.print("* ");
}
}
System.out.println();
}
System.out.println();
}
}
I think in this case you're misinterpreting what "sorting" means. For backtracking to work, you need to analyze positions in some predictable order. If your algorithm does not analyze positions in a set order, then when you "prune" a set of positions you may prune a valid one. Without this ordering, or tree-like structure of positions, backtracking does not work. You do not, however, need to pre-sort a set of positions or anything like that. That would, in fact, defeat the purpose of backtracking.
The idea is that some combinations of positions never even have to be built. Once a conflict is found, all combinations that extend it are no longer considered. It is the order in which these combinations are built that matters, not sorting them ahead of time. All combinations must be built and considered in a proper order. This lets us know, when we give up on a "branch", that every combination built on that branch would have been just as invalid as the option we just rejected; otherwise you can "over-prune" your result set and miss a valid combination.
But no N log N sorting algorithms are required, at least not for the n-queens problem. In fact, if you pre-built all positions and sorted them, you would be throwing away the pruning that lets backtracking speed up this computation considerably.
http://en.wikipedia.org/wiki/Backtracking
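The ordering and pruning described above can be seen in a small sketch that counts solutions instead of printing boards. It is an illustration only; the names QueensCount, countSolutions, place and isSafe are invented here and are not taken from the question's code. Rows are filled top to bottom and columns are tried left to right, and a branch is abandoned as soon as a placement conflicts, without any sorting step.

```java
// Hypothetical sketch: counting n-queens solutions with plain backtracking.
final class QueensCount {
    static int countSolutions(int n) {
        return place(0, new int[n]);
    }

    // columns[r] holds the column of the queen already placed in row r.
    private static int place(int row, int[] columns) {
        int n = columns.length;
        if (row == n) {
            return 1; // every row has a queen: one complete solution
        }
        int count = 0;
        for (int col = 0; col < n; col++) { // fixed left-to-right order
            if (isSafe(row, col, columns)) {
                columns[row] = col;
                count += place(row + 1, columns);
            }
            // An unsafe column is skipped, which prunes its whole subtree.
        }
        return count;
    }

    private static boolean isSafe(int row, int col, int[] columns) {
        for (int r = 0; r < row; r++) {
            int c = columns[r];
            if (c == col || Math.abs(c - col) == row - r) {
                return false; // same column or same diagonal
            }
        }
        return true;
    }
}
```

For the standard 8x8 board, countSolutions(8) returns 92.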

Convert string to palindrome string with minimum insertions

To find the minimum number of insertions required to convert a given string s into a palindrome, I find the longest common subsequence (lcs_string) of the string and its reverse. The number of insertions to be made is then length(s) - length(lcs_string).
Knowing the number of insertions, what method should be used to find the resulting palindrome itself?
For example:
1) azbzczdzez
Number of insertions required: 5
Palindrome string: azbzcezdzeczbza
Multiple palindromes may exist for the same string, but I only want to find one of them.
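Let S[i, j] represent the substring of S starting at index i and ending at index j (both inclusive), and let c[i, j] be the optimal solution (the minimum number of insertions) for S[i, j].
Obviously, c[i, j] = 0 if i >= j.
In general, we have the recurrence:
c[i, j] = c[i+1, j-1], if S[i] = S[j]
c[i, j] = min(c[i+1, j], c[i, j-1]) + 1, otherwise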
To elaborate on VenomFangs' answer, there is a simple dynamic programming solution to this one. Note that I'm assuming the only operation allowed here is insertion of characters (no deletions or updates). Let S be a string of n characters. The simple recurrence P for this is:
P[i..j] = 0, if i >= j
P[i..j] = P[i+1..j-1], if S[i] = S[j]
P[i..j] = min(P[i..j-1], P[i+1..j]) + 1, otherwise
If you'd like more explanation of why this is true, post a comment and I'd be happy to explain (though it's pretty easy to see with a little thought). This is, by the way, the complement of the LCS function you use, which confirms that your solution is in fact optimal. Of course, it's wholly possible I bungled it; if so, someone do let me know!
Edit: To recover the palindrome itself, you can proceed as follows.
As stated above, P[1..n] gives you the number of insertions required to make this string a palindrome. Once the two-dimensional array above is built, here's how you find the palindrome (indices are 1-based, as in the recurrence):
Start with i=1, j=n. Now,
string output = ""; // left half of the palindrome
while (i < j)
{
    if (S[i] == S[j]) // the ends match, so no insertion is made at this point
    {
        output = output + S[i];
        i++;
        j--;
    }
    else if (P[i][j] == P[i+1][j] + 1) // cheapest to mirror S[i] on the right
    {
        output = output + S[i];
        i++;
    }
    else // otherwise mirror S[j] on the left
    {
        output = output + S[j];
        j--;
    }
}
// If one character is left over (i == j), it is the middle of an odd-length palindrome.
string middle = (i == j) ? string(1, S[i]) : "";
string right(output.rbegin(), output.rend()); // mirror image of the left half
cout << output << middle << right;
Does that make better reading?
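For anyone who wants to try this end to end, here is a self-contained Java sketch of the same idea: fill the table, then walk it from both ends to build one minimal palindrome. It is a sketch only. The class and method names are invented, indices are 0-based, and when several minimal palindromes exist it simply returns whichever one the tie-breaking order produces.

```java
// Hypothetical sketch: minimum insertions to make a palindrome, plus one resulting palindrome.
final class PalindromeInsertions {
    // cost[i][j] = minimum insertions needed for s[i..j] (inclusive); 0 when i >= j.
    static int[][] costTable(String s) {
        int n = s.length();
        int[][] cost = new int[n][n];
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len - 1 < n; i++) {
                int j = i + len - 1;
                if (s.charAt(i) == s.charAt(j)) {
                    cost[i][j] = (len == 2) ? 0 : cost[i + 1][j - 1];
                } else {
                    cost[i][j] = 1 + Math.min(cost[i + 1][j], cost[i][j - 1]);
                }
            }
        }
        return cost;
    }

    static int minInsertions(String s) {
        return s.isEmpty() ? 0 : costTable(s)[0][s.length() - 1];
    }

    static String buildPalindrome(String s) {
        int[][] cost = costTable(s);
        StringBuilder left = new StringBuilder(); // left half of the result
        int i = 0;
        int j = s.length() - 1;
        while (i < j) {
            if (s.charAt(i) == s.charAt(j)) {
                left.append(s.charAt(i)); // ends already match
                i++;
                j--;
            } else if (cost[i][j] == cost[i + 1][j] + 1) {
                left.append(s.charAt(i)); // s[i] gets a mirrored copy on the right
                i++;
            } else {
                left.append(s.charAt(j)); // s[j] gets a mirrored copy on the left
                j--;
            }
        }
        String middle = (i == j) ? String.valueOf(s.charAt(i)) : "";
        String right = new StringBuilder(left).reverse().toString();
        return left + middle + right;
    }
}
```

For the question's example, minInsertions("azbzczdzez") is 5 and buildPalindrome returns a 15-character palindrome; it need not be the same one quoted in the question.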
The solution looks to be a dynamic programming solution.
You may be able to find your answer in the following post: How can I compute the number of characters required to turn a string into a palindrome?
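PHP solution (greedy). Note that it always produces a palindrome, but not necessarily with the minimum number of insertions: for "abb" it yields "bbabb", whereas "abba" needs only one insertion. Each array_merge call is itself O(n), so the whole thing is not O(n) either.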
function insertNode(&$arr, $idx, $val) {
$arr = array_merge(array_slice($arr, 0, $idx), array($val), array_slice($arr, $idx));
}
function createPalindrome($arr, $s, $e) {
$i = 0;
while(true) {
if($s >= $e) {
break;
} else if($arr[$s] == $arr[$e]) {
$s++; $e--; // shrink the queue from both sides
continue;
} else {
insertNode($arr, $s, $arr[$e]);
$s++;
}
}
echo implode("", $arr);
}
$arr = array('b', 'e', 'a', 'a', 'c', 'd', 'a', 'r', 'e');
echo createPalindrome ( $arr, 0, count ( $arr ) - 1 );
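Simple. See below :) (Note that this greedy scan gives an upper bound rather than the true minimum: for "aab" it reports 2, while one insertion, giving "baab", is enough.)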
String pattern = "abcdefghgf";
boolean isPalindrome = false;
int i=0,j=pattern.length()-1;
int mismatchCounter = 0;
while(i<=j)
{
//reverse matching
if(pattern.charAt(i)== pattern.charAt(j))
{
i++; j--;
isPalindrome = true;
continue;
}
else if(pattern.charAt(i)!= pattern.charAt(j))
{
i++;
mismatchCounter++;
}
}
System.out.println("The pattern string is :"+pattern);
System.out.println("Minimum number of characters required to make this string a palindrome : "+mismatchCounter);

Longest common prefix for n string

Given n strings of maximum length m, how can we find the longest common prefix shared by at least two of them?
Example: ['flower', 'flow', 'hello', 'fleet']
Answer: fl
I was thinking of building a trie from all the strings and then finding the deepest node (which satisfies "longest") that branches out to two or more substrings (which satisfies "common"). This takes O(n*m) time and space. Is there a better way to do this?
Why use a trie (which takes O(mn) time and O(mn) space)? Just use the basic brute-force approach. The first loop finds the shortest string, minStr, in O(n) time. The second loop compares each string with minStr and keeps a variable holding the length of the prefix matched so far; this loop takes O(mn), where m is the length of the shortest string. Note that this finds the prefix common to all of the strings. The code is below:
public String longestCommonPrefix(String[] strs) {
if(strs.length==0) return "";
String minStr=strs[0];
for(int i=1;i<strs.length;i++){
if(strs[i].length()<minStr.length())
minStr=strs[i];
}
int end=minStr.length();
for(int i=0;i<strs.length;i++){
int j;
for( j=0;j<end;j++){
if(minStr.charAt(j)!=strs[i].charAt(j))
break;
}
if(j<end)
end=j;
}
return minStr.substring(0,end);
}
There is an O(|S|*n) solution to this problem using a trie, where n is the number of strings and |S| is the length of the longest string:
(1) Put all the strings in a trie.
(2) Do a DFS from the root until you find the first vertex with more than one outgoing edge (or a vertex that marks the end of a string).
(3) The path from the root to the vertex found in (2) is the longest common prefix.
No asymptotically faster solution is possible: in the worst case all your strings are identical, and you need to read all of them to know it.
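The three steps above can be sketched as follows. This is a minimal illustration with invented names (TriePrefix, Node, endOfWord), and it computes the prefix common to all of the strings, which is what the steps describe. Since every internal vertex lies on a single chain until the first branch, the "DFS" reduces to a simple walk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: longest prefix common to all strings, using a trie.
final class TriePrefix {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean endOfWord; // true if some input string ends at this node
    }

    static String longestCommonPrefix(String[] strings) {
        if (strings.length == 0) {
            return "";
        }
        // Step (1): insert every string into the trie.
        Node root = new Node();
        for (String s : strings) {
            Node node = root;
            for (char c : s.toCharArray()) {
                node = node.children.computeIfAbsent(c, k -> new Node());
            }
            node.endOfWord = true;
        }
        // Step (2): follow the single chain until a branch or the end of a string.
        StringBuilder prefix = new StringBuilder();
        Node node = root;
        while (node.children.size() == 1 && !node.endOfWord) {
            Map.Entry<Character, Node> only = node.children.entrySet().iterator().next();
            prefix.append(only.getKey());
            node = only.getValue();
        }
        // Step (3): the path walked so far is the common prefix.
        return prefix.toString();
    }
}
```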
I would sort them, which you can do with O(n lg n) comparisons. Then any strings with a common prefix will be right next to each other. In fact, you should be able to keep a pointer to the index you're currently looking at and work your way down for a pretty speedy computation.
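A rough sketch of the sorting idea, read literally as "longest prefix shared by at least two strings": after sorting, the pair with the longest shared prefix is always adjacent, so one pass over neighbours is enough. The names PairwisePrefix, longestSharedPrefix and commonLength are invented for this example. Each comparison can cost up to O(m), so the sort is O(m n lg n) character comparisons in the worst case.

```java
import java.util.Arrays;

// Hypothetical sketch: longest prefix shared by at least two strings, via sorting.
final class PairwisePrefix {
    static String longestSharedPrefix(String[] strings) {
        String[] sorted = strings.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        String best = "";
        for (int i = 1; i < sorted.length; i++) {
            // Only neighbours need comparing once the array is sorted.
            int len = commonLength(sorted[i - 1], sorted[i]);
            if (len > best.length()) {
                best = sorted[i].substring(0, len);
            }
        }
        return best;
    }

    // Number of leading characters that a and b have in common.
    static int commonLength(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int k = 0;
        while (k < limit && a.charAt(k) == b.charAt(k)) {
            k++;
        }
        return k;
    }
}
```

On the question's sample input this returns "flow", the prefix shared by 'flower' and 'flow', which is longer than the "fl" shared by three of the strings.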
As a completely different answer from my other answer...
You can, with one pass, bucket every string based on its first letter.
With another pass you can sort each bucket based on its second letter. (This is known as radix sort, which is O(n*m) overall and O(n) for each pass.) This gives you a baseline prefix length of 2.
You can safely remove from your dataset any elements that do not share a prefix of length 2 with another element.
You can continue the radix sort, removing elements without a shared prefix of length p, as p approaches m.
This gives you the same O(n*m) time as the trie approach, but it will be faster than the trie in practice: the trie must look at every character in every string (as it enters the structure), while this approach is only guaranteed to look at 2 characters per string, at which point it culls much of the dataset.
The worst case is still that every string is identical, which is why it shares the same big-O bound, but in every non-worst case it uses fewer comparisons, since there are characters that never need to be visited.
public String longestCommonPrefix(String[] strs) {
if (strs == null || strs.length == 0)
return "";
char[] c_list = strs[0].toCharArray();
int len = c_list.length;
int j = 0;
for (int i = 1; i < strs.length; i++) {
for (j = 0; j < len && j < strs[i].length(); j++)
if (c_list[j] != strs[i].charAt(j))
break;
len = j;
}
return new String(c_list).substring(0, len);
}
It happens that the bucket sort (radix sort) described by corsiKa can be extended so that every string eventually ends up alone in a bucket, and at that point the LCP for such a lonely string is known. Further, the shustring of each string is also known; it is one character longer than the LCP. The bucket sort is de facto the construction of a suffix array, but only partially so. The comparisons that are not performed (as described by corsiKa) represent the portions of the suffix strings that are not added to the suffix array. Finally, this method allows one to determine not just the LCP and the shustrings, but also to easily find the subsequences that are not present within the string.
Since the world is obviously begging for an answer in Swift, here's mine ;)
func longestCommonPrefix(strings:[String]) -> String {
var commonPrefix = ""
guard !strings.isEmpty else { return commonPrefix } // an empty input would otherwise loop forever
var indices = strings.map { $0.startIndex}
outerLoop:
while true {
var toMatch: Character = "_"
for (whichString, f) in strings.enumerate() {
let cursor = indices[whichString]
if cursor == f.endIndex { break outerLoop }
indices[whichString] = cursor.successor()
if whichString == 0 { toMatch = f[cursor] }
if toMatch != f[cursor] { break outerLoop }
}
commonPrefix.append(toMatch)
}
return commonPrefix
}
Swift 3 Update:
func longestCommonPrefix(strings:[String]) -> String {
var commonPrefix = ""
guard !strings.isEmpty else { return commonPrefix } // an empty input would otherwise loop forever
var indices = strings.map { $0.startIndex}
outerLoop:
while true {
var toMatch: Character = "_"
for (whichString, f) in strings.enumerated() {
let cursor = indices[whichString]
if cursor == f.endIndex { break outerLoop }
indices[whichString] = f.characters.index(after: cursor)
if whichString == 0 { toMatch = f[cursor] }
if toMatch != f[cursor] { break outerLoop }
}
commonPrefix.append(toMatch)
}
return commonPrefix
}
What's interesting to note:
this runs in O(n x m), where n is the number of strings and m is the length of the shortest one.
this uses the String.Index data type and thus deals with grapheme clusters, which the Character type represents.
And given the function I needed to write in the first place:
/// Takes an array of Strings representing file system objects absolute
/// paths and turn it into a new array with the minimum number of common
/// ancestors, possibly pushing the root of the tree as many level downwards
/// as necessary
///
/// In other words, we compute the longest common prefix and remove it
func reify(fullPaths:[String]) -> [String] {
let lcp = longestCommonPrefix(fullPaths)
return fullPaths.map {
return $0[lcp.endIndex ..< $0.endIndex]
}
}
here is a minimal unit test:
func testReifySimple() {
let samplePaths:[String] = [
"/root/some/file"
, "/root/some/other/file"
, "/root/another/file"
, "/root/direct.file"
]
let expectedPaths:[String] = [
"some/file"
, "some/other/file"
, "another/file"
, "direct.file"
]
let reified = PathUtilities().reify(samplePaths)
for (index, expected) in expectedPaths.enumerate(){
XCTAssert(expected == reified[index], "failed match, \(expected) != \(reified[index])")
}
}
Perhaps a more intuitive solution: feed the prefix found in the previous iteration, together with the next string, into the next step, i.e. [[[w1, w2], w3], w4] and so on, where [a, b] denotes the LCP of two strings.
public String findPrefixBetweenTwo(String A, String B){
String ans = "";
for (int i = 0, j = 0; i < A.length() && j < B.length(); i++, j++){
if (A.charAt(i) != B.charAt(j)){
return i > 0 ? A.substring(0, i) : "";
}
}
// One string is a prefix of the other, or they are the same: return the shorter one.
return (A.length() > B.length()) ? B : A;
}
public String longestCommonPrefix(ArrayList<String> A) {
if (A.isEmpty()) return ""; // avoids A.get(0) failing on an empty list
String prefix = A.get(0);
for (int i = 1; i < A.size(); i++){
prefix = findPrefixBetweenTwo(prefix, A.get(i)); // chain the earlier prefix
}
return prefix;
}

Resources