Checking if a string has balanced parentheses - ruby

I am currently working on a Ruby problem quiz, but I'm not sure if my solution is right. After running the check, it shows that the compilation was successful, but I'm worried it is not the right answer.
The problem:
A string S consisting only of characters '(' and ')' is called properly nested if:
S is empty,
S has the form "(U)" where U is a properly nested string,
S has the form "VW" where V and W are properly nested strings.
For example, "(()(())())" is properly nested and "())" isn't.
Write a function
def nesting(s)
that given a string S returns 1 if S is properly nested and 0 otherwise.
Assume that the length of S does not exceed 1,000,000. Assume that S consists only of characters '(' and ')'.
For example, given S = "(()(())())" the function should return 1 and given S = "())" the function should return 0, as explained above.
Solution:
def nesting ( s )
# write your code here
if s == '(()(())())' && s.length <= 1000000
return 1
elsif s == ' ' && s.length <= 1000000
return 1
elsif
s == '())'
return 0
end
end

Here are descriptions of two algorithms that should accomplish the goal. I'll leave it as an exercise to the reader to turn them into code (unless you explicitly ask for a code solution):
Start with a variable set to 0 and loop through each character in the string: when you see a '(', add one to the variable; when you see a ')', subtract one from the variable. If the variable ever goes negative, you have seen too many ')' and can return 0 immediately. If you finish looping through the characters and the variable is not exactly 0, then you had too many '(' and should return 0.
Remove every occurrence of '()' in the string (replace with ''). Keep doing this until you find that nothing has been replaced (check the return value of gsub!). If the string is empty, the parentheses were matched. If the string is not empty, it was mismatched.
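Since the two descriptions above are short, here is a minimal sketch of each. The method name nesting comes from the problem statement; nesting_by_removal is a name made up for this illustration.

```ruby
# Algorithm 1: keep a running count of currently open parentheses.
def nesting(s)
  depth = 0
  s.each_char do |c|
    depth += (c == "(" ? 1 : -1)
    return 0 if depth < 0 # a ')' appeared with nothing open
  end
  depth.zero? ? 1 : 0 # leftover '(' means the string is unbalanced
end

# Algorithm 2: delete every "()" until nothing more can be removed.
# (nesting_by_removal is a hypothetical name, not part of the quiz.)
def nesting_by_removal(s)
  rest = s.dup # gsub! mutates its receiver, so work on a copy
  loop { break unless rest.gsub!("()", "") } # gsub! returns nil when nothing was replaced
  rest.empty? ? 1 : 0
end
```

The counter version makes a single pass over the string. The removal version rescans the string once per nesting level, so it is likely to be slower on deeply nested input.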

You're not supposed to just enumerate the given examples. You're supposed to solve the problem generally. You're also not supposed to check that the length is below 1,000,000; you're allowed to assume that.
The most straightforward solution to this problem is to iterate through the string and keep track of how many parentheses are open right now. If you ever see a closing parenthesis when no parentheses are currently open, the string is not well-balanced. If any parentheses are still open when you reach the end, the string is not well-balanced. Otherwise it is.
Alternatively, you could turn the specification directly into a regex pattern using the recursive regex feature of Ruby 1.9, if you were so inclined.
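For the recursive-regex route, a sketch might look like the following. It relies on the subexpression-call syntax \g<name> available in Ruby 1.9 and later; the constant and method names are made up for this illustration, and very deeply nested input may hit the regex engine's recursion limits, so treat it as a curiosity rather than the recommended solution.

```ruby
# Zero or more groups, each an opening parenthesis, any number of
# nested groups (a recursive call to the same named group), and a closing one.
PROPERLY_NESTED = /\A(?<group>\(\g<group>*\))*\z/

# nesting_regex is a hypothetical name for this sketch.
def nesting_regex(s)
  (s =~ PROPERLY_NESTED) ? 1 : 0
end
```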

My algorithm would use a stack for this purpose. Stacks are well suited to problems like this.
Algorithm
Define a hash that maps each opening bracket to its closing bracket, for instance {"(" => ")", "{" => "}", and so on...}
Declare a stack (in our case, an array), i.e. brackets = []
Loop through the string using each_char; if the character is one of the hash's keys (an opening bracket), push it onto brackets
Within the same loop, if the character is one of the hash's values (a closing bracket), pop the top of brackets and check that it is the matching opening bracket; if it is not, or the stack was empty, the string is unbalanced
In the end, if the brackets stack is empty, the brackets are balanced.
def brackets_balanced?(string)
  # Maps each opening bracket to its closing partner.
  brackets_hash = {"(" => ")", "{" => "}", "[" => "]"}
  brackets = []
  string.each_char do |x|
    if brackets_hash.key?(x)
      brackets.push(x)
    elsif brackets_hash.value?(x)
      # The closer must match the most recent unmatched opener.
      # Popping an empty stack gives nil, which never matches.
      return false unless brackets_hash[brackets.pop] == x
    end
  end
  brackets.empty?
end

You can also approach this problem theoretically, by using a grammar like this:
S ← LSRS | ε
L ← (
R ← )
A string that this grammar generates is easy to recognize with a recursive algorithm.
That would be the most elegant solution. Otherwise, as already mentioned here, count the open parentheses.
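A minimal recursive-descent sketch of the grammar idea follows. It assumes the grammar is read as S → ( S ) S | ε, so that the empty string and concatenations such as "()()" are accepted, as the problem statement requires. The method names are made up for this illustration, and the recursion depth grows with the nesting depth, so a pathological 1,000,000-character input could exhaust the stack.

```ruby
# Consumes the longest properly nested run starting at index i.
# Returns the index just past it, or nil if a '(' is never closed.
def parse_nested(s, i = 0)
  while s[i] == "("
    j = parse_nested(s, i + 1) # the inner S
    return nil if j.nil? || s[j] != ")"
    i = j + 1 # continue with the trailing S
  end
  i
end

# The whole string is properly nested if the parser consumes all of it.
def nested_by_grammar?(s)
  parse_nested(s) == s.length
end
```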

Here's a neat way to do it using inject:
class String
def valid_parentheses?
valid = true
self.gsub(/[^\(\)]/, '').split('').inject(0) do |counter, parenthesis|
counter += (parenthesis == '(' ? 1 : -1)
valid = false if counter < 0
counter
end.zero? && valid
end
end
> "(a+b)".valid_parentheses? # => true
> "(a+b)(".valid_parentheses? # => false
> "(a+b))".valid_parentheses? # => false
> "(a+b))(".valid_parentheses? # => false

You're right to be worried; I think you've got the wrong end of the stick and are reading the problem too literally. The note that the string doesn't exceed 1,000,000 characters is just there to stop people worrying about how slowly their code would run if the length were 100 times that, and the examples are just that, examples, not the definitive list of strings you can expect to receive.
I'm not going to do your homework for you (by writing the code), but I will give you a pointer to a solution that occurs to me:
The string is correctly nested if every left bracket has a matching right bracket to the right of it, with either nothing or a correctly nested set of brackets between them. So how about a recursive function, or a loop, that removes every "()" it finds in the string? When you run out of matches, what are you left with? Nothing? Then that was a properly nested string. Something else (like ')' or ')(', etc.) would mean it was not correctly nested in the first place.

Define method:
def check_nesting str
pattern = /\(\)/
while str =~ pattern do
str = str.gsub pattern, ''
end
str.length == 0
end
And test it:
>ruby nest.rb (()(())())
true
>ruby nest.rb (()
false
>ruby nest.rb ((((()))))
true
>ruby nest.rb (()
false
>ruby nest.rb (()(((())))())
true
>ruby nest.rb (()(((())))()
false

Your solution only returns the correct answer for the strings "(()(())())" and "())". You surely need a solution that works for any string!
As a start, how about counting the number of occurrences of ( and ), and seeing if they are equal?

Related

How to verify that the last character in a string is a number

I need to check if the last character in a string is a digit, and if so, increment it.
I have a directory structure of /u01/app/oracle/... and that's where it goes off the rails. Sometimes it ends with the version number, sometimes it ends with dbhome_1 (or 2, or 3), and sometimes, I have to assume, it will take some other form. If it ends with dbhome_X, I need to parse that and bump that final digit, if it is a digit.
I use split to split the directory structure on '/', and use include? to check if the final element is something like "dbhome". As long as my directory structure ends with dbhome_X it seems to work. As I was testing, though, I tried a path that ended with dbhome, and found that my check for the last character being a digit didn't work.
db_home = '/u01/app/oracle/product/11.2.0/dbhome'
if db_home.split('/')[-1].include?('dbhome')
homedir=db_home.split('/')[-1]
if homedir[-1].to_i.is_a? Numeric
homedir=homedir[0...-1]+(homedir[-1].to_i+1).to_s
new_path="/"+db_home.split('/')[1...-1].join("/")+"/"+homedir.to_s
end
else
new_path=db_home+"/dbhome_1"
end
puts new_path
I did not expect the output to be /u01/app/oracle/product/11.2.0/dbhom1 - it seems to have fallen into the if block that added 1 to the final character.
If I set the initial path to /u01/app/.../dbhome_1, I get the expected /u01/app/.../dbhome_2 as the output.
You could use a regular expression to make matching a tad bit easier
if db_home[/dbhome.*\z/] ..
You could use a regex:
/[0-9]$/.match("How3").nil? # false here, because "How3" ends in a digit
I need to check if the last character in a string is a digit, and if
so, increment it.
This is one option:
s = 'string9'
s[-1].then { |last| last.to_i.to_s == last ? [s[0..-2], last.to_i+1].join : s }
#=> "string10"
'/u01/app/11.2.0/dbhome'.sub(/\d\z/) { |s| s.succ }
#=> "/u01/app/11.2.0/dbhome"
'/u01/app/11.2.0/dbhome9'.sub(/\d\z/) { |s| s.succ }
#=> "/u01/app/11.2.0/dbhome10"
This is a starting point if you're running Ruby v2.6+:
fname = 'filename1'
fname[/\d+$/].then { |digits|
fname[/\d+$/] = digits.to_i.next.to_s if digits
}
fname # => "filename2"
And it's safe if the filename doesn't end with a digit:
fname = 'filename'
fname[/\d+$/].then { |digits|
fname[/\d+$/] = digits.to_i.next.to_s if digits
}
fname # => "filename"
I'm not sure if I like doing it that way better than the more traditional way which works with much older Rubies:
digits = fname[/\d+$/]
fname[/\d+$/] = digits.to_i.next.to_s if digits
except for the fact that digits gets stuck into the variable space after only being used once. There are probably worse things that happen in my code though.
This is taking advantage of String's [] and []= methods.
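For the original dbhome question, it may help to see why the check in the question always passes: String#to_i returns 0 for a non-numeric string, and 0 is still Numeric. Below is a minimal sketch that only bumps a trailing dbhome_N component. The method name next_db_home is made up for this illustration, and leaving a bare dbhome unchanged is an assumption about the desired behavior.

```ruby
# Why the original test is always true: to_i never fails, it returns 0.
"e".to_i                # => 0
"e".to_i.is_a?(Numeric) # => true

# next_db_home is a hypothetical helper name.
def next_db_home(path)
  last = File.basename(path)
  if last =~ /\Adbhome_(\d+)\z/
    # Ends in dbhome_N: replace the last component with dbhome_(N+1).
    File.join(File.dirname(path), "dbhome_#{$1.to_i + 1}")
  elsif last.include?("dbhome")
    path # assumption: a dbhome component without a number is left as-is
  else
    File.join(path, "dbhome_1")
  end
end
```

Anchoring the pattern on dbhome_ keeps a path that ends in a version number such as 11.2.0 from being incremented by accident.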

Regex: text before multiple matches

Idea. Given the string, return all the matches (with overlaps) and the text before these matches.
Example. For the text atatgcgcatatat and the query atat there are three matches, and the desired output is atat, atatgcgcatat and atatgcgcatatat.
Problem. I use Ruby 2.2 and String#scan method to get multiple matches. I've tried to use lookahead, but the regex /(?=(.*?atat))/ returns every substring that ends with atat. There must be some regex magic to solve this problem, but I can't figure out the right spell.
I believe this is at least better than the OP's answer:
text = "atatgcgcatatat"
query = "atat"
res = []
text.scan(/(?=#{query})/){res.push($` + query)} #`
res # => ["atat", "atatgcgcatat", "atatgcgcatatat"]
Given the nature and purpose of regex, there is no way to do that. When a regex matches text, there is no way to include the same text in another match. Therefore, the best option that I can think of is to use a look-behind to find the ending position of each match:
(?<=atat)
With your example input of atatgcgcatatat, that would return the following three matches:
Position 4, Length 0
Position 12, Length 0
Position 14, Length 0
You could then loop through those results, get the position for each one, and then get the sub-string that starts at the beginning of the input string and ends at that position. If you don't know how to get the positions of each match, you may find the answers to this question helpful.
You could do this:
str = 'atatgcgcatatat'
target = 'atat'
[].tap do |a|
str.gsub(/(?=#{target})/) { a << str[0, $~.end(0)+target.size] }
end
#=> ["atat", "atatgcgcatat", "atatgcgcatatat"]
Notice that the string returned by gsub is discarded.
It seems there's no way to solve the problem in just one go.
One possible solution is to use this knowledge to get indices of matches when using String#scan, and then return the array of sliced strings:
def find_by_end text, query
res = []
n = query.length
text.scan( /(?=(#{query}))/ ) do |m|
res << text.slice(0, $~.offset(0).first + n)
end
res
end
find_by_end "atatgcgcatatat", "atat" #=> ["atat", "atatgcgcatat", "atatgcgcatatat"]
A slightly different solution was proposed by @StevenDoggart. Here's a nice and short piece of code which uses this hack to solve the problem:
"atatgcatatat".to_enum(:scan, /(?<=atat)/).map { $` } #`
#=> ["atat", "atatgcatat", "atatgcatatat"]
As @CasimiretHippolyte notes, reversing the string might help to solve the problem. It actually does, but it's hardly the prettiest solution:
"atatgcatatat".reverse.scan(/(?=(tata.*))/).flatten.map(&:reverse).reverse
#=> ["atat", "atatgcatat", "atatgcatatat"]
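If the query is a literal string rather than a pattern, the same result can be had without any regex at all. This sketch swaps in plain String#index, restarting the search one character past each hit so that overlapping matches are found; the method name is made up for this illustration.

```ruby
# Returns, for every (possibly overlapping) occurrence of query,
# the text from the start of the string through the end of that occurrence.
def prefixes_through_matches(text, query)
  results = []
  start = text.index(query)
  while start
    results << text[0, start + query.length]
    start = text.index(query, start + 1) # step by one to allow overlaps
  end
  results
end

prefixes_through_matches("atatgcgcatatat", "atat")
#=> ["atat", "atatgcgcatat", "atatgcgcatatat"]
```

Because the query is never interpolated into a regex, characters such as "." or "*" in it need no escaping.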

Classic ASP InStr() Evaluates True on Empty Comparison String

I ran into an issue with the Classic ASP VBScript InStr() function. As shown below, the second call to InStr() returns 1 when searching for an empty string in a non-empty string. I'm curious why this is happening.
' InStr Test
Dim someText : someText = "So say we all"
Dim emptyString : emptyString = ""
'' I expect this to be true
If inStr(1,someText,"so",1) > 0 Then
Response.write ( "I found ""so""<br />" )
End If
'' I expect this to be false
If inStr(1, someText, emptyString, 1) > 0 Then
Response.Write( "I found an empty string<br />" )
End If
EDIT:
Some additional clarification: The reason for the question came up when debugging legacy code and running into a situation like this:
Function Go(value)
If InStr(1, "Option1|Option2|Option3", value, 1) > 0 Then
' Do some stuff
End If
End Function
In some cases function Go() can get called with an empty string. The original developer's intent was not to check whether value was empty, but rather whether value was equal to one of the pipe-delimited values (Option1, Option2, etc.).
Thinking about this further it makes sense that every string is created from an empty string, and I can understand why a programming language would assume a string with all characters removed still contains the empty string.
What doesn't make sense to me is why programming languages are implementing this. Consider these 2 statements:
InStr("so say we all", "s") '' evaluates to 1
InStr("so say we all", "") '' evaluates to 1
The InStr() function will return the position of the first occurrence of one string within another. In both of the above cases, the result is 1. However, position 1 always contains the character "s", not an empty string. Furthermore, using another string function like Len() or LenB() on an empty string alone will result in 0, indicating a character length of 0.
It seems that there is some inconsistency here. The empty string contained in all strings is not actually a character, but the InStr() function is treating it as one when other string functions are not. I find this to be unintuitive and unnecessary.
The Empty String is the Identity Element for Strings:
The identity element I (also denoted E, e, or 1) of a group or related
mathematical structure S is the unique element such that Ia=aI=a for
every element a in S. The symbol "E" derives from the German word for
unity, "Einheit." An identity element is also called a unit element.
If you add 0 to a number n the result is n; if you add/concatenate "" to a string s the result is s:
>> WScript.Echo CStr(1 = 1 + 0)
>> WScript.Echo CStr("a" = "a" & "")
>>
True
True
So every String and SubString contains at least one "":
>> s = "abc"
>> For p = 1 To Len(s)
>> WScript.Echo InStr(p, s, "")
>> Next
>>
1
2
3
and InStr() reports that faithfully. The docs even state:
InStr([start, ]string1, string2[, compare])
...
The InStr function returns the following values:
...
string2 is zero-length: InStr returns start
WRT your
However, position 1 always contains the character "s", not an empty
string.
==>
Position 1 always contains the character "s", and therefore an empty
string too.
I'm puzzled why you think this behavior is incorrect. To the extent that asking Does 'abc' contain ''? even makes sense, the answer has to be "yes": All strings contain the empty string as a trivial case. So the answer to your "why is this happening" question is because it's the only sane thing to do.
It is correct, IMHO. At least, I would expect the empty string to be part of any other string. But maybe this is a philosophical question. ASP does it this way, so live with it. Practically speaking, if you need different behavior, write your own function, InStrNotEmpty or something, which returns false for an empty search string.

Ruby Truncate Words + Long Text

I have the following function, which accepts text and a word count; if the number of words in the text exceeds the word count, the text is truncated with an ellipsis.
#Truncate the passed text. Used for headlines and such
def snippet(thought, wordcount)
thought.split[0..(wordcount-1)].join(" ") + (thought.split.size > wordcount ? "..." : "")
end
However what this function doesn't take into account is extremely long words, for instance...
"Helloooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
world!"
I was wondering if there's a better way to approach what I'm trying to do so it takes both word count and text size into consideration in an efficient way.
Is this a Rails project?
Why not use the following helper:
truncate("Once upon a time in a world far far away", :length => 17)
If not, just reuse the code.
This is probably a two step process:
Truncate the string to a max length (no need for regex for this)
Using regex, find a max words quantity from the truncated string.
Edit:
Another approach is to split the string into words, loop through the array adding up
the lengths. When you find the overrun, join 0 .. index just before the overrun.
Hint: regex ^(\s*.+?\b){5} will match first 5 "words"
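The "loop through the words adding up the lengths" approach could be sketched like this. The method name, the parameter names, and the decision to hard-cut a single overlong first word are all assumptions made for this illustration.

```ruby
# Keeps whole words until either the word budget or the character
# budget would be exceeded, then appends the omission marker.
def snippet_by_budget(text, max_words, max_chars, omission = "...")
  words = text.split
  kept = []
  used = 0
  words.each do |word|
    cost = word.length + (kept.empty? ? 0 : 1) # +1 for the joining space
    break if kept.size >= max_words || used + cost > max_chars
    kept << word
    used += cost
  end
  # Assumption: an overlong first word is cut hard so the result is never empty.
  kept << words.first[0, max_chars] if kept.empty? && words.any?
  result = kept.join(" ")
  result == words.join(" ") ? result : result + omission
end

snippet_by_budget("Once upon a time in a world far far away", 5, 100)
#=> "Once upon a time in..."
```

Here max_chars limits the text before the omission marker is appended, so the returned string can be up to omission.length characters longer than max_chars.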
The logic for checking both word and char limits becomes too convoluted to clearly express as one expression. I would suggest something like this:
def snippet str, max_words, max_chars, omission='...'
  # Need room for at least one character plus the omission marker.
  max_chars = 1 + omission.size if max_chars <= omission.size
  words = str.split
  omit = words.size > max_words || str.length > max_chars ? omission : ''
  snip = words[0...max_words].join ' '
  # Leave room for the omission marker when cutting by characters.
  snip = snip[0...(max_chars - omission.size)] if snip.length > max_chars
  snip + omit
end
As others have pointed out, Rails' String#truncate offers almost the functionality you want (truncate to fit in a length at a natural boundary), but it doesn't let you independently state a max char length and a word count.
First 20 characters:
>> "hello world this is the world".gsub(/.+/) { |m| m[0...20] + (m.size > 20 ? '...' : '') }
=> "hello world this is ..."
First 5 words:
>> "hello world this is the world".gsub(/.+/) { |m| m.split[0...5].join(' ') + (m.split.size > 5 ? '...' : '') }
=> "hello world this is the..."

How do I convert a Ruby string with brackets to an array?

I would like to convert the following string into an array/nested array:
str = "[[this, is],[a, nested],[array]]"
newarray = # this is what I need help with!
newarray.inspect # => [['this','is'],['a','nested'],['array']]
You'll get what you want with YAML.
But there is a little problem with your string. YAML expects that there's a space behind the comma. So we need this
str = "[[this, is], [a, nested], [array]]"
Code:
require 'yaml'
str = "[[this, is],[a, nested],[array]]"
### transform your string in a valid YAML-String
str.gsub!(/(\,)(\S)/, "\\1 \\2")
YAML::load(str)
# => [["this", "is"], ["a", "nested"], ["array"]]
You could also treat it as almost-JSON. If the strings really are only letters, like in your example, then this will work:
JSON.parse(yourarray.gsub(/([a-z]+)/,'"\1"'))
If they could have arbitrary characters (other than [ ] , ), you'd need a little more:
JSON.parse("[[this, is],[a, nested],[array]]".gsub(/, /,",").gsub(/([^\[\]\,]+)/,'"\1"'))
For a laugh:
ary = eval("[[this, is],[a, nested],[array]]".gsub(/(\w+?)/, "'\\1'") )
=> [["this", "is"], ["a", "nested"], ["array"]]
Disclaimer: You definitely shouldn't do this, as eval is a terrible idea, but it is fast and has the useful side effect of throwing an exception if your nested arrays aren't valid.
Looks like a basic parsing task. Generally the approach you are going to want to take is to create a recursive function with the following general algorithm
base case (input doesn't begin with '[') return the input
recursive case:
split the input on ',' (you will need to find commas only at this level)
for each sub string call this method again with the sub string
return array containing the results from this recursive method
The only slightly tricky part here is splitting the input only on top-level commas. You could write a separate function for this that scans through the string and keeps a count of the open brackets minus the closed brackets seen so far, then splits on a comma only when that count is zero.
Make a recursive function that takes the string and an integer offset, and "reads" out an array. That is, have it return an array or string (that it has read) and an integer offset pointing after the array. For example:
s = "[[this, is],[a, nested],[array]]"
yourFunc(s, 1) # returns ['this', 'is'] and 11.
yourFunc(s, 2) # returns 'this' and 6.
Then you can call it with another function that provides an offset of 0, and makes sure that the finishing offset is the length of the string.
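A minimal sketch of that offset-passing reader follows. read_value plays the role of yourFunc above, and parse_nested_array is the wrapper that starts at offset 0; both names are made up for this illustration, and the sketch assumes elements are bare words containing none of the characters [ ] and ,.

```ruby
# Reads one value (an array or a bare word) starting at offset i.
# Returns [value, offset just past the value].
def read_value(s, i)
  if s[i] == "["
    items = []
    i += 1
    until s[i] == "]"
      raise ArgumentError, "unterminated array" if i >= s.length
      item, i = read_value(s, i)
      items << item
      i += 1 while s[i] == "," || s[i] == " " # skip separators
    end
    [items, i + 1] # step past the closing bracket
  else
    j = i
    j += 1 while j < s.length && !"[],".include?(s[j])
    [s[i...j].strip, j]
  end
end

# Parses the whole string and checks that nothing is left over.
def parse_nested_array(s)
  value, offset = read_value(s, 0)
  raise ArgumentError, "trailing characters" unless offset == s.length
  value
end

s = "[[this, is],[a, nested],[array]]"
read_value(s, 1)      #=> [["this", "is"], 11]
read_value(s, 2)      #=> ["this", 6]
parse_nested_array(s) #=> [["this", "is"], ["a", "nested"], ["array"]]
```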

Resources