Merge Sort confusion, stack level too deep? - ruby

So I'm learning merge sort (in Ruby) and I understand most of how it works, so I was just writing the beginning of the "mergeSort" functionality and splitting the arrays up:
array = [5,1,8,3,4,6,11,2]
def mergeSort(arr,beginIndex,endIndex)
  if endIndex > beginIndex
    mid = (beginIndex+endIndex)/2
    puts mid
    puts "#{arr[0..mid]}"
    puts "#{arr[mid+1..-1]}"
    mergeSort(arr,0,mid) #comment out here?
    mergeSort(arr,mid+1,arr.length-1) #or here?
  end
end
mergeSort(array,0,array.length-1)
So I understand what's happening, and what's printing out works correctly. However, the strange thing is that if I comment out either one of the nested mergeSort calls (where I say "#comment out here?" or "#or here?") I get the printout I would expect. If I leave both in, it goes on forever until I get a "stack level too deep" error, which is weird because with either one commented out it stops when the array is down to length 1.
Why is this happening?

The second call should be mergeSort(arr,mid+1,endIndex).
It works fine if you leave out the second call because you narrow in to the beginning. It works fine if you leave out the first call because you narrow in to the end. It fails if you have both because you go to sort the first half and find yourself on the second call still trying to sort the whole array...recursively.
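A minimal sketch of the corrected split step, assuming the goal at this stage is only to visit every sub-range once. The name split_ranges and the visited accumulator are hypothetical, introduced for illustration. Besides the fix named above (endIndex in the second call), this sketch also starts the first call at the current begin index rather than 0, which a full merge sort needs so that each call stays inside its own range.

```ruby
# Recursively split arr[begin_index..end_index] in half, recording each
# sub-array visited. Both recursive calls stay inside the current range,
# so the range shrinks on every call and the recursion terminates.
def split_ranges(arr, begin_index, end_index, visited = [])
  visited << arr[begin_index..end_index]
  if end_index > begin_index
    mid = (begin_index + end_index) / 2
    split_ranges(arr, begin_index, mid, visited)    # left half
    split_ranges(arr, mid + 1, end_index, visited)  # right half, bounded by end_index
  end
  visited
end

array = [5, 1, 8, 3, 4, 6, 11, 2]
split_ranges(array, 0, array.length - 1).each { |part| p part }
```

Each call works on a strictly smaller range than its caller, which is what the original second call (always ending at arr.length-1) failed to guarantee.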

Related

Selecting key words in a string (that are included in an Array) to change their format in Ruby

I have a big string (text) and an Array of strings (key_words) as below:
text = 'So in this election, we cannot sit back and hope that everything works out for the best. We cannot afford to be tired or frustrated or cynical. No, hear me. Between now and November, we need to do what we did eight years ago and four years ago…'
key_words = ['frustrated', 'tired', 'hope']
My objective is to print each word in ‘text’ while changing the colour and case of the words that are included in key_words. I’ve been able to do that by doing:
require 'colorize'
text.split(/\b/).each do |x|
  if key_words.include?(x.downcase)
    print "#{x}".colorize(:red)
  else
    print "#{x}"
  end
end
However, since I don’t want to include many words in key_words, I want to make the selection more flexible than an exact match. For example:
key_words = ['frustrat', 'tire', 'hope'] => the algorithm would select 'Frustration' and 'Frustrated', 'Tiring' and 'Tired', and 'Hope' and 'Hopeful'.
I’ve tried playing with word lengths in both the string and the array as below, but it seems a very inefficient solution and I’m getting very confused about how to use the .any? and .include? methods in this scenario.
key_words = ['frustrated', 'tired', 'hope']
key_words_abb = []
key_words.each { |x| key_words_abb << x.downcase[0][0..x.length-2]}
text.split(/\b/).each do |x|
  if key_words_abb.include?(x.downcase[0][0..x.length-2]); print "#{x}".colorize(:red)
  else print x
  end
end
Since I can’t find a specific solution online I would appreciate your help.
It's worth noting that when doing repeated substitutions on strings, especially longer ones, you'll want your substitution method to be as efficient as possible. Spinning through an array of things to switch out is painfully expensive, especially as that list grows.
Here's a variation on your approach:
replacement = Regexp.new('\b%s\b' % [ Regexp.union(key_words) ])
replaced = text.gsub(replacement) do |s|
  s.colorize(:red)
end
puts replaced
If you're using that substitution repeatedly you should persist the Regexp object into a constant. That avoids having to compile it for each string you're adjusting. If the list changes based on factors hard to predict, leave it like this and produce it dynamically.
One thing to note about using Ruby is it's often best to express your code as a series of transformations with output as a final step. Putting things like print in the middle of a loop complicates things unnecessarily. If you want to add an additional step to your loop you have to do a lot of extra work to move that print to a later stage. With the approach here you can just chain on the end and do whatever you want.
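The pattern above matches whole words exactly and is case-sensitive, so it does not yet cover the stem matching asked about ('frustrat' selecting 'Frustrated'). One possible variation, sketched here under assumptions: the helper names stem_pattern and highlight are hypothetical, and the block passed to highlight stands in for s.colorize(:red) so the sketch runs without the colorize gem.

```ruby
# Build a single case-insensitive pattern: a word boundary, any one of the
# stems, then whatever word characters follow the stem.
def stem_pattern(stems)
  /\b(?:#{Regexp.union(stems).source})\w*/i
end

# Replace every word that starts with one of the stems by the block's result.
def highlight(text, stems)
  text.gsub(stem_pattern(stems)) { |word| yield word }
end

text = 'We cannot afford to be tired or Frustrated. Tiring, but hopeful.'
puts highlight(text, %w[frustrat tir hope]) { |word| word.upcase }
```

With the colorize gem loaded, the block could be { |word| word.colorize(:red) } instead. Note that the stem 'tire' from the question cannot match 'Tiring'; a shorter stem such as 'tir' does, at the cost of also matching unrelated words like 'tirade'.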

strange behavior from ruby caesar cipher implementation

I am getting unpredictable responses from this code. If the phrase has only one word the output is as predicted, and if the shift is 0 it passes things straight through. But when there are spaces in the phrase the shifting seems unpredictable. Can someone see the pattern, or enlighten me on what the methods are doing behind the scenes? (Using the new1 array as the receiver yielded an unaltered array.)
def caesar(phrase, shift)
  alpha=('a'..'z').to_a
  #new1=[]
  temp=phrase.downcase.split('')
  temp.each{|x| (temp[temp.index(x)]=alpha[(alpha.index(x)+shift)%26]) if alpha.include?(x)}
  p temp.join
end
caesar("abcde fghijklmnopqrs tu,,..vwxyz", 1)
caesar("Frank", 1)
caesar("Frank is a willy munching wombat.", 1)
results:
"abcde fghijklmnopqrs tu,,..vwxyz"
"gsbol"
"hucpo js b xjnlz mvodjing xombbt."
Apologies if this is answered elsewhere, to be honest I didn't even know what to ask.
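This breaks as soon as a letter repeats itself, because temp.index(x) always returns the position of the first matching element, not the one currently being visited: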
temp[temp.index(x)]=...
Use each_with_index instead!
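A sketch of what that suggestion could look like. Because the array is modified in place, an already shifted letter near the front can equal a later letter, so the first-match lookup rewrites the wrong slot; writing back by the loop index avoids that. The ALPHABET constant and the decision to return the string instead of printing it are choices made for this sketch, not part of the original code.

```ruby
ALPHABET = ('a'..'z').to_a.freeze

# Shift every letter by `shift` positions, leaving spaces and punctuation alone.
def caesar(phrase, shift)
  chars = phrase.downcase.chars
  chars.each_with_index do |char, i|
    pos = ALPHABET.index(char)                        # nil for non-letters
    chars[i] = ALPHABET[(pos + shift) % 26] if pos    # write back to slot i only
  end
  chars.join
end

p caesar("abcde fghijklmnopqrs tu,,..vwxyz", 1)
p caesar("Frank", 1)
```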

Struggling to make sense of an array

So I am trying to make the transition from PHP to Ruby (finally). I am attempting to complete the rubymonk challenges but I am stuck on the third challenge.
The challenge itself is easy and I've already found a solution, but I can't figure out what type of data I'm looking at or how to process it properly.
The challenge simply wants you to create a method that takes an array containing some strings and returns the length of each string in the same position, so ["I","suck","at","ruby"] == ["1","4","2","4"].
That part is easy, but I can't for the life of me figure out how to process the input properly.
It gives you a shell of a method and tells you to complete it:
def length_finder(input_array)
  #I added the print input_array
  print input_array #=> ["I","am","genius"]["things","are","","awesome"]
end
Is this a multidimensional array?
I've tried to replicate this in IRB with
input_array = ["I","am","genius"]["things","are","","awesome"]
but it returns an error
input_array = [["I","am","genius"],["things","are","","awesome"]]
works, but that is clearly not the same.
Because of this I am struggling to traverse the array to process that data properly.
I can't get anything like input_array.flatten to work, or input_array[0] which returns "Ithings".
This is confusing me. Am I looking at a single array? A multidimensional array? Clearly it can't be a string. Why does it skip "am" when accessing input_array[0]?
Ha, like Justin Ko suggested in his comment above, what you're seeing is the stdout of running the function twice.
Since you used print, there's no newline. Use puts instead.
This should help you see it more clearly:
def length_finder(input_array)
  puts '*** '+input_array.inspect
  return 0
end
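The run-together output from the question can be reproduced with a small sketch. It assumes the challenge calls the method twice with the two arrays shown; capture_stdout, run_with_print and run_with_p are hypothetical helpers written only for this illustration.

```ruby
require 'stringio'

# Hypothetical helper: collect whatever the block writes to $stdout.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

# print writes each array with no trailing newline, so two calls run together.
def run_with_print(inputs)
  capture_stdout { inputs.each { |input_array| print input_array } }
end

# p writes the inspected array followed by a newline, one call per line.
def run_with_p(inputs)
  capture_stdout { inputs.each { |input_array| p input_array } }
end

inputs = [["I", "am", "genius"], ["things", "are", "", "awesome"]]
puts run_with_print(inputs)
puts run_with_p(inputs)
```

So each call receives an ordinary one-dimensional array of strings; only the printed output of the two calls was glued together.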

Trying to write sort method

Trying to sort an array by writing my own sort method using recursion (Pine's book). Saw some other examples on stackoverflow, but my code looks different from them. Two things I don't understand so far:
What is a wrapper method, and why do I need one? (I put one in the code, I think).
How to fix the "stack level too deep" error.
EDIT: New code updated, working but not correct.
Here's what I have so far:
def word_sorter unsorted, sorted
  if unsorted[1] == nil
    sorted.push unsorted[0]
    words_put(sorted)
  elsif unsorted[0] <= unsorted[1]
    sorted.push unsorted[0]
    unsorted.shift
    word_sorter(unsorted, sorted)
  else
    unsorted.push unsorted[0]
    unsorted.shift
    word_sorter(unsorted, sorted)
  end
end
def words_put sorted
  puts 'these are your words now organized.'
  sorted.compact!
  puts sorted.join(', ')
  Process.exit
end
unsorted = Array.new
sorted = Array.new
puts 'list as many words as you want. I will sort them... I think'
while unsorted.last != ''
  unsorted.push gets.chomp
  if unsorted.last == ''
    unsorted.pop
    word_sorter(unsorted, sorted)
  end
end
Thanks!
1) There is nothing special going on here. We are using plain English (albeit metaphorically). A wrapper method is a method which is a wrapper. A wrapper is a thing which wraps. You are wrapping the word_sorter method with the sort method. You "need" it for convenience: it would be strange for the sort method to expect an empty list for its second parameter when you call it from outside. The wrapping takes into account the fact that the obvious interface for the recursion differs from the obvious interface for the outside world.
2) Take a close look at how the code for handling unsorted[0] >= unsorted[1] differs from the else case (i.e. when unsorted[0] < unsorted[1]).
3) Try describing your algorithm in English first. And then try putting out a few playing cards and testing your algorithm by following it, to the letter.
4) A working sort algorithm will only need to be called once. So work out a proper sorting algorithm, and then only call it once - outside the loop, after you've read in all the values to sort. You might also want to actually call words_put.
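To make points 1 and 4 concrete, here is one possible shape for a wrapper plus a recursive worker. It is a sketch of a simple selection-style sort, not the asker's algorithm; the names sort_words and recursive_sort are hypothetical, and Array#min is used only for brevity.

```ruby
# Wrapper: callers pass just the list; the empty "sorted" list is supplied here.
def sort_words(words)
  recursive_sort(words.dup, [])   # dup so the caller's array is left untouched
end

# Recursive worker: move the smallest remaining word from unsorted to sorted.
def recursive_sort(unsorted, sorted)
  return sorted if unsorted.empty?                # base case ends the recursion

  smallest = unsorted.min
  unsorted.delete_at(unsorted.index(smallest))    # unsorted shrinks by one per call
  recursive_sort(unsorted, sorted << smallest)
end

puts sort_words(%w[pear apple fig]).join(', ')
```

Because every call removes exactly one element from unsorted, the recursion depth is bounded by the length of the list, and the method only needs to be called once after all input has been read.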
You should first try your code by some simple examples. E.g. use the list [3,2,1] as input.
In the first call it will match the 3>=2 condition. Thus now sorted=[2].
There are already two issues with this:
2 should not be the first entry in the sorted list, so the algorithm cannot sort arbitrary input correctly.
The unsorted array isn't changed at all, so the code will loop on it forever, yielding sorted=[2,2,2,2,2.....].
"Stack level too deep" implies that you have infinite recursion going on. It doesn't look like the unsorted list gets shorter in any of your branches in word_sorter, so it will keep running forever.

Clone detection algorithm

I'm writing an algorithm that detects clones in source code. E.g. if there is a block like:
for(int i = o; i <5; i++){
    doSomething(abc);
}
...and if this block is repeated somewhere else in the source code it will be detected as a clone. The method I am using at the moment is to create hashes for lines/blocks and compare them with hashes of other lines/blocks in the same source to see if there are any matches.
Now, if the same block as above was to be repeated somewhere with only the argument of doSomething different, it would not be detected as a clone even though it would appear very much like a clone to you and me. My algorithm detects exact matches but doesn't detect matching blocks where only the argument is different.
Could anyone suggest any ways of getting around this issue? Thanks!
Here's a super-simple way, which might go too far in erasing information (i.e., might produce too many false positives): replace every identifier that isn't a keyword with some fixed name. So you'd get
for (int DUMMY = DUMMY; DUMMY<5; DUMMY++) {
    DUMMY(DUMMY);
}
(assuming you really meant o rather than 0 in the initialization part of the for-loop).
If you get a huge number of false positives with this, you could then post-process them by, for instance, looking to see what fraction of the DUMMYs actually correspond to the same identifier in both halves of the match, or at least to identifiers that are consistent between the two.
To do much better you'll probably need to parse the code to some extent. That would be a lot more work.
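A rough sketch of the identifier-replacement idea, written in Ruby for illustration. The KEYWORDS list is a deliberately incomplete assumption, the tokenizer is a crude regular expression rather than a real lexer, and normalize and fingerprint are hypothetical names.

```ruby
require 'digest'

# Assumed, incomplete keyword list for a C-like language.
KEYWORDS = %w[for while if else return int void].freeze

# Split the block into crude tokens (identifiers, numbers, single symbols),
# replace every identifier that is not a keyword with DUMMY, and rejoin.
def normalize(block)
  tokens = block.scan(/[A-Za-z_]\w*|\d+|\S/)
  tokens.map { |tok|
    (tok.match?(/\A[A-Za-z_]/) && !KEYWORDS.include?(tok)) ? 'DUMMY' : tok
  }.join(' ')
end

# Hash the normalized text so blocks differing only in names collide.
def fingerprint(block)
  Digest::SHA1.hexdigest(normalize(block))
end

a = "for(int i = o; i <5; i++){\n doSomething(abc);\n}"
b = "for(int j = o; j < 5; j++){\n doSomething(xyz);\n}"
puts fingerprint(a) == fingerprint(b)
```

Since whitespace and names are both discarded, blocks that differ only in layout or identifiers get the same fingerprint, which is also where the extra false positives mentioned above come from.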
Well, if you're going to do something else then you're going to have to parse the code at least a bit. For example, you could detect methods and then ignore the method arguments in your hash. Anyway, I think it's always true that you need your program to understand the code better than 'just text blocks', and that might get awfully complicated.
