ruby multiple regexp gsub with array.inject - ruby

I have to work with a long text and make several regexp substitutions in it.
So far I have written the following code:
text = File.read(file)
replacements = [
[/```([\w]+)/, "\n\\1>"],
[/```\n/, "\n\n"],
[/pattern not found/, 'sub'],
[/pattern that should be found/, 'no sub'],
]
replacements.inject(text) do |text, (k,v)|
if text =~ k
text.gsub!(k,v)
end
end
File.write('name', text)
If every regexp matches something in my document, everything works fine, but if one pattern is not found, none of the subsequent replacements are carried out.
I added the if text =~ k check, but it still fails in the same way.
Any ideas?

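The reason is that String#gsub! returns nil if no substitutions were made, and the modified string if there were. In your code the failed if has the same effect: an if without an else evaluates to nil when its condition is false, so the block returns nil and that nil becomes text in the next iteration. Another glitch is that you match twice: the text =~ k check is redundant.
I would go with the non-in-place version, gsub: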
result = replacements.inject(text) do |text, (k, v)|
text.gsub(k, v)
end
That should do the trick.
If you still want to use in-place substitution, you can return text itself when gsub! makes no substitution:
result = replacements.inject(text) do |text, (k, v)|
text.gsub!(k, v) || text
end
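To see the difference in isolation, here is a minimal sketch. The method name apply_replacements and the sample strings are made up for illustration; it shows gsub! returning nil on a miss and the inject chain surviving a pattern that matches nothing.

```ruby
# gsub! returns nil when nothing was replaced; gsub always returns a string.
p "abc".dup.gsub!(/z/, "-")  # nil
p "abc".gsub(/z/, "-")       # "abc"

# Hypothetical helper: apply every [pattern, replacement] pair in order.
def apply_replacements(text, replacements)
  replacements.inject(text) do |memo, (pattern, replacement)|
    memo.gsub(pattern, replacement)
  end
end

replacements = [
  [/pattern not found/, "sub"],  # matches nothing, must not break the chain
  [/cat/, "dog"]
]
puts apply_replacements("the cat sat", replacements)  # the dog sat
```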

Each inject iteration should return the memo (in your case text) to the next iteration. Try this code:
replacements.inject(text) do |text, (k,v)|
if text =~ k
text.gsub!(k,v)
end
text
end

The block passed to inject must return the memo value, so you could change your code to this:
replacements.inject(text) do |text, (k,v)|
text.gsub(k,v)
end
When the if text =~ k test failed in your code, the block's value was nil, hence the issue.
Alternatively, you can use with_object:
replacements.each.with_object(text) do |(k,v), text|
text.gsub!(k,v)
end
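Note that the with_object variant only works because gsub! mutates the string in place; the block's return value is ignored. A small sketch with made-up sample data, assuming the caller is happy to have the original string modified:

```ruby
text = "one two three".dup  # dup so the string literal itself is never mutated
replacements = [[/missing/, "x"], [/two/, "2"], [/three/, "3"]]

result = replacements.each.with_object(text) do |(pattern, replacement), memo|
  memo.gsub!(pattern, replacement)  # may return nil; the return value is not used
end

puts result               # one 2 3
puts result.equal?(text)  # true: the very same String object was modified
```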

Related

eval returns string instead of value

Can someone explain why eval is returning the string, rather than the result of the expression?
perms=["12+2","22","-2+"]
perms.each { |line|
matches=/^[\d]+[+-\/\*]{1}[\d]+$/.match(line)
s=matches.to_a
puts s
puts eval(s.to_s)
}
s = matches.to_a is the array ["12+2"], so s.to_s is the source text of that array literal, and eval(s.to_s) returns the array ["12+2"] again. When you pass that array to puts, it prints each element, so you see 12+2.
You should be evaluating the element of the array instead, in this case s[0] or s.first.
Fix it like this:
perms=["12+2","22","-2+"]
perms.each do |line|
matches=/^[\d]+[+-\/\*]{1}[\d]+$/.match(line)
if matches
s = matches.to_a
puts eval(s.first)
end
end
matches can be nil if there was no match, so call eval only after checking if matches.
You could further simplify the code and write something like this:
perms=["12+2","22","-2+"]
perms.each do |line|
puts eval(line) if line =~ /^[\d]+[+-\/\*]{1}[\d]+$/
end
Instead of iterating through inputs, one might directly map them to outputs:
perms.map do |p| # map inputs to outputs
eval(p) if p =~ /\A\d+[+-\/*]\d+\z/ # return eval’ed or nil
end.compact # get rid of nils
#⇒ [14]
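Since eval executes arbitrary code, it may be worth avoiding it altogether when the input format is this simple. The sketch below is one possible alternative, not something taken from the answers above: the method name calculate is hypothetical, and it assumes non-negative integer operands and only the operators + - * /. It captures the operands and the operator and dispatches with public_send. Note also that inside a character class, +-\/ is a range from + to /, which additionally admits the , and . characters; listing the hyphen first avoids that.

```ruby
# Hypothetical eval-free evaluator for "<int><op><int>" strings.
def calculate(expression)
  match = /\A(\d+)([-+*\/])(\d+)\z/.match(expression)
  return nil unless match  # not a simple binary expression

  left, operator, right = match.captures
  # Integer#+, #-, #* and #/ are ordinary methods, so the operator
  # string can be used directly as the method name.
  left.to_i.public_send(operator, right.to_i)
end

perms = ["12+2", "22", "-2+"]
p perms.map { |line| calculate(line) }.compact  # [14]
```

Division here is integer division, and dividing by zero raises ZeroDivisionError, just as it would with eval.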

Taking a string and returning it with vowels removed

I'm attempting to write a function that takes a string and returns it with all vowels removed. Below is my code.
def vowel(str)
result = ""
new = str.split(" ")
i = 0
while i < new.length
if new[i] == "a"
i = i + 1
elsif new[i] != "a"
result = new[i] + result
end
i = i + 1
end
return result
end
When I run the code, it returns the exact string that I entered for (str). For example, if I enter "apple", it returns "apple".
This was my original code. It had the same result.
def vowel(str)
result = ""
new = str.split(" ")
i = 0
while i < new.length
if new[i] != "a"
result = new[i] + result
end
i = i + 1
end
return result
end
What am I doing wrong with this approach?
Finding the bug
Let's see what's wrong with your original code by executing your method's code in IRB:
$ irb
irb(main):001:0> str = "apple"
#=> "apple"
irb(main):002:0> new = str.split(" ")
#=> ["apple"]
Bingo! ["apple"] is not the expected result. What does the documentation for String#split say?
split(pattern=$;, [limit]) → anArray
Divides str into substrings based on a delimiter, returning an array of these substrings.
If pattern is a String, then its contents are used as the delimiter when splitting str. If pattern is a single space, str is split on whitespace, with leading whitespace and runs of contiguous whitespace characters ignored.
Our pattern is a single space, so split returns an array of words. This is definitely not what we want. To get the desired result, i.e. an array of characters, we could pass an empty string as the pattern:
irb(main):003:0> new = str.split("")
#=> ["a", "p", "p", "l", "e"]
"split on empty string" feels a bit hacky and indeed there's another method that does exactly what we want: String#chars
chars → an_array
Returns an array of characters in str. This is a shorthand for str.each_char.to_a.
Let's give it a try:
irb(main):004:0> new = str.chars
#=> ["a", "p", "p", "l", "e"]
Perfect, just as advertised.
Another bug
With the new method in place, your code still doesn't return the expected result (I'm going to omit the IRB prompt from now on):
vowel("apple") #=> "elpp"
This is because
result = new[i] + result
prepends the character to the result string. To append it, we have to write
result = result + new[i]
Or even better, use the append method String#<<:
result << new[i]
Let's try it:
def vowel(str)
result = ""
new = str.chars
i = 0
while i < new.length
if new[i] != "a"
result << new[i]
end
i = i + 1
end
return result
end
vowel("apple") #=> "pple"
That looks good, "a" has been removed ("e" is still there, because you only check for "a").
Now for some refactoring.
Removing the explicit loop counter
Instead of a while loop with an explicit loop counter, it's more idiomatic to use something like Integer#times:
new.length.times do |i|
# ...
end
or Range#each:
(0...new.length).each do |i|
# ...
end
or Array#each_index:
new.each_index do |i|
# ...
end
Let's apply the latter:
def vowel(str)
result = ""
new = str.chars
new.each_index do |i|
if new[i] != "a"
result << new[i]
end
end
return result
end
Much better. We don't have to worry about initializing the loop counter (i = 0) or incrementing it (i = i + 1) any more.
Avoiding character indices
Instead of iterating over the character indices via each_index:
new.each_index do |i|
if new[i] != "a"
result << new[i]
end
end
we can iterate over the characters themselves using Array#each:
new.each do |char|
if char != "a"
result << char
end
end
Removing the character array
We don't even have to create the new character array. Remember the documentation for chars?
This is a shorthand for str.each_char.to_a.
String#each_char passes each character to the given block:
def vowel(str)
result = ""
str.each_char do |char|
if char != "a"
result << char
end
end
return result
end
The return keyword is optional. We could just write result instead of return result, because a method's return value is the last expression that was evaluated.
Removing the explicit string
Ruby even allows you to pass an object into the loop using Enumerator#with_object, thus eliminating the explicit result string:
def vowel(str)
str.each_char.with_object("") do |char, result|
if char != "a"
result << char
end
end
end
with_object passes "" into the block as result and returns it (after the characters have been appended within the block). It is also the last expression in the method, i.e. its return value.
You could also use if as a modifier, i.e.:
result << char if char != "a"
Alternatives
There are many different ways to remove characters from a string.
Another approach is to filter out the vowel characters using Enumerable#reject (it returns a new array containing the remaining characters) and then join the characters (see Nathan's answer for a version to remove all vowels):
def vowel(str)
str.each_char.reject { |char| char == "a" }.join
end
For basic operations like string manipulation however, Ruby usually already provides a method. Check out the other answers for built-in alternatives:
str.delete('aeiouAEIOU') as shown in Gagan Gami's answer
str.tr('aeiouAEIOU', '') as shown in Cary Swoveland's answer
str.gsub(/[aeiou]/i, '') as shown in Avinash Raj's answer
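As a quick check that the three built-ins listed above agree, here is a small sketch with a made-up sample string:

```ruby
sample = "Apple pie IS tasty"

by_delete = sample.delete("aeiouAEIOU")
by_tr     = sample.tr("aeiouAEIOU", "")   # an empty second argument deletes
by_gsub   = sample.gsub(/[aeiou]/i, "")

p by_delete                                # "ppl p S tsty"
p by_delete == by_tr && by_tr == by_gsub   # true
```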
Naming things
Cary Swoveland pointed out that vowel is not the best name for your method. Choose the names for your methods, variables and classes carefully. It's desirable to have a short and succinct method name, but it should also communicate its intent.
vowel(str) obviously has something to do with vowels, but it's not clear what it is. Does it return a vowel or all vowels from str? Does it check whether str is a vowel or contains a vowel?
remove_vowels or delete_vowels would probably be a better choice.
Same for variables: new is an array of characters. Why not call it characters (or chars if space is an issue)?
Bottom line: read the fine manual and get to know your tools. Most of the time, an IRB session is all you need to debug your code.
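Putting the refactoring steps and the naming advice together, a version that removes all vowels rather than just "a" might look like the sketch below. The constant VOWELS and the choice to treat only a, e, i, o, u (in both cases) as vowels are assumptions made for this example.

```ruby
VOWELS = "aeiouAEIOU"  # assumption: only these ten characters count as vowels

def remove_vowels(str)
  # "".dup gives a fresh, unfrozen result string to append to.
  str.each_char.with_object("".dup) do |char, result|
    result << char unless VOWELS.include?(char)
  end
end

puts remove_vowels("apple")        # ppl
puts remove_vowels("Hello World")  # Hll Wrld
```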
I would use a regex:
str.gsub(/[aeiou]/i, "")
> string= "This Is my sAmple tExt to removE vowels"
#=> "This Is my sAmple tExt to removE vowels"
> string.delete 'aeiouAEIOU'
#=> "Ths s my smpl txt t rmv vwls"
You can create a method like this:
def remove_vowel(str)
result = str.delete 'aeiouAEIOU'
return result
end
remove_vowel("Hello World, This is my sample text")
# output : "Hll Wrld, Ths s my smpl txt"
Assuming you're trying to learn about the basics of programming, rather than finding the quickest one-liner to do this (which would be to use a regular expression as Avinash has said), you have a number of problems with your code you need to change.
new = str.split(" ")
This line is likely the culprit, because it splits the string based on spaces. So your input string would have to be "a p p l e" to have the effect you're looking for.
new = str.split("")
You should also remove the duplicate i = i+1 once you've changed that.
As others have already identified the problems with the OP's code, I will merely suggest an alternative; namely, you could use String#tr:
"Now is the time for all good people...".tr('aeiouAEIOU', '')
#=> "Nw s th tm fr ll gd ppl..."
If regex is not allowed, you can do it this way:
def remove_vowels(string)
string.split("").delete_if { |letter| %w[a e i o u].include? letter }.join
end

Search and replace multiple words in file via Ruby

Good afternoon!
I am pretty new to Ruby and want to code a basic search and replace function in Ruby.
When you call the function, you can pass parameters (search pattern, replacing word).
This works like this: multiedit(pattern1, replacement1, pattern2, replacement2, ...)
Now, I want my function to read a text file, search for pattern1 and replace it with replacement1, search for pattern2 and replace it with replacement2 and so on. Finally, the altered text should be written to another text file.
I've tried to do this with an until loop, but only the very first pattern is replaced while all the following patterns are ignored (in this example, only apple is replaced with fruit). I think the problem is that I always reread the original unaltered text? But I can't figure out a solution. Can you help me? Calling the function the way I am doing it is important for me.
def multiedit(*_patterns)
return puts "Number of search patterns does not match number of replacement strings!" if (_patterns.length % 2 > 0)
f = File.open("1.txt", "r")
g = File.open("2.txt", "w")
i = 0
until i >= _patterns.length do
f.each_line {|line|
output = line.sub(_patterns[i], _patterns[i+1])
g.puts output
}
i+=2
end
f.close
g.close
end
multiedit("apple", "fruit", "tomato", "veggie", "steak", "meat")
Can you help me out?
Thank you very much in advance!
Regards
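Your loops are inside out. After the first pass f is at the end of the file, so the later passes read nothing and only the first pattern is ever applied. Iterate over the lines in the outer loop and over the patterns in the inner loop instead: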
f.each_line do |line|
_patterns.each_slice 2 do |a, b|
line.sub! a, b
end
g.puts line
end
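A self-contained variation of the same idea, operating on an array of lines rather than on 1.txt and 2.txt so that it can be run as is. The name multiedit_lines is made up; sub (first occurrence per line) is kept from the question.

```ruby
# Hypothetical helper: apply each (search, replacement) pair to every line.
def multiedit_lines(lines, *patterns)
  raise ArgumentError, "patterns and replacements must come in pairs" if patterns.length.odd?

  lines.map do |line|
    # Feed the result of each substitution into the next one.
    patterns.each_slice(2).inject(line) do |current, (search, replacement)|
      current.sub(search, replacement)
    end
  end
end

lines = ["apple and tomato", "steak with tomato"]
puts multiedit_lines(lines, "apple", "fruit", "tomato", "veggie", "steak", "meat")
# fruit and veggie
# meat with veggie
```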
Perhaps the most efficient way to evaluate all the patterns for every line is to build a single regexp from all the search patterns and use the hash replacement form of String#gsub:
def multiedit(*patterns)
raise ArgumentError, "Number of search patterns does not match number of replacement strings!" if patterns.length.odd?
replacements = Hash[*patterns]
regexp = Regexp.new(replacements.keys.map { |k| Regexp.quote(k) }.join('|'))
File.open("2.txt", "w") do |out|
IO.foreach("1.txt") do |line|
out.puts line.gsub regexp, replacements
end
end
end
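The hash form can be tried without any files. In this sketch the method name multi_replace is hypothetical, and Regexp.union is used as a shorthand for escaping and joining the keys with |.

```ruby
# Hypothetical string-only version of the single-regexp approach.
def multi_replace(text, *patterns)
  raise ArgumentError, "patterns and replacements must come in pairs" if patterns.length.odd?

  replacements = Hash[*patterns]
  regexp = Regexp.union(replacements.keys)  # escapes each key, joins with |
  text.gsub(regexp, replacements)           # each match is looked up in the hash
end

puts multi_replace("apple, tomato and steak",
                   "apple", "fruit", "tomato", "veggie", "steak", "meat")
# fruit, veggie and meat

puts multi_replace("a b", "a", "b", "b", "c")
# b c
```

The second call shows one design difference from applying the pairs one after another: the text is scanned once, so the output of one replacement is never matched again by a later pattern.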
An easier and better method might be to use ERB:
http://apidock.com/ruby/ERB

How could I check to see if a word exists in a string, and return false if it doesn't, in ruby?

Say I have a string str = "Things to do: eat and sleep."
How could I check if "do: " exists in str, case insensitive?
Like this:
puts "yes" if str =~ /do:/i
To return a boolean value (from a method, presumably), compare the result of the match to nil:
def has_do(str)
(str =~ /do:/i) != nil
end
Or, if you don’t like the != nil then you can use !~ instead of =~ and negate the result:
def has_do(str)
not str !~ /do:/i
end
But I don’t really like double negations …
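On Ruby 2.4 and later, String#match? returns true or false directly, which avoids both the comparison with nil and the double negation. A minimal sketch (the method name has_do? simply follows the answer above with the usual predicate suffix):

```ruby
def has_do?(str)
  str.match?(/do:/i)  # true or false, never nil or an index
end

p has_do?("Things to do: eat and sleep.")  # true
p has_do?("Things to DO: eat and sleep.")  # true
p has_do?("Nothing planned.")              # false
```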
In Ruby 1.9 and later you can do this:
str.downcase.match("do: ") do
puts "yes"
end
It's not exactly what you asked for, but I noticed a comment to another answer. If you don't mind using regular expressions when matching the string, perhaps there is a way to skip the downcase part to get case insensitivity.
For more info, see String#match
You could also do this:
str.downcase.include? "Some string".downcase
If all I'm looking for is a case-insensitive substring match I usually use:
str.downcase['do: ']
9 times out of 10 I don't care where in the string the match is, so this is nice and concise.
Here's what it looks like in IRB:
>> str = "Things to do: eat and sleep." #=> "Things to do: eat and sleep."
>> str.downcase['do: '] #=> "do: "
>> str.downcase['foobar'] #=> nil
Because it returns nil if there is no hit it works in conditionals too.
"Things to do: eat and sleep.".index(/do: /i)
index returns the position where the match starts, or nil if not found.
You can learn more about the index method here:
http://ruby-doc.org/core/classes/String.html
Or about regex here:
http://www.regular-expressions.info/ruby.html

Create regular expression from string

Is there any way to create the regex /func:\[sync\] displayPTS/ from string func:[sync] displayPTS?
The story behind this question is that I have several string patterns to search for in a text file and I don't want to write the same thing again and again.
File.open($f).readlines.reject {|l| not l =~ /"#{string1}"/}
File.open($f).readlines.reject {|l| not l =~ /"#{string2}"/}
Instead, I want to have a function to do the job:
def filter string
#build the reg pattern from string
File.open($f).readlines.reject {|l| not l =~ pattern}
end
filter string1
filter string2
s = "func:[sync] displayPTS"
# => "func:[sync] displayPTS"
r = Regexp.new(s)
# => /func:[sync] displayPTS/
r = Regexp.new(Regexp.escape(s))
# => /func:\[sync\]\ displayPTS/
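Applied to the filter function from the question, this could look like the sketch below. It works on an array of lines instead of the $f file so that it runs on its own; the name filter_lines and the sample lines are made up. The last line shows what goes wrong without escaping: [sync] is then read as a character class.

```ruby
# Hypothetical helper: keep only the lines containing the literal string.
def filter_lines(lines, string)
  pattern = Regexp.new(Regexp.escape(string))
  lines.grep(pattern)  # grep keeps the elements that match the pattern
end

lines = ["func:[sync] displayPTS 42", "func:s displayPTS 7", "unrelated"]

p filter_lines(lines, "func:[sync] displayPTS")
# ["func:[sync] displayPTS 42"]

p lines.grep(Regexp.new("func:[sync] displayPTS"))  # unescaped
# ["func:s displayPTS 7"]
```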
I like Bob's answer, but just to save the time on your keyboard:
string = 'func:\[sync] displayPTS'
/#{string}/
If the strings are just strings, you can combine them into one regular expression, like so:
targets = [
"string1",
"string2",
].collect do |s|
Regexp.escape(s)
end.join('|')
target = Regexp.new(targets)
And then:
lines = File.readlines('/tmp/bar').reject do |line|
line !~ target
end
s !~ regexp is equivalent to not s =~ regexp, but easier to read.
Avoid using File.open without closing the file. The file will remain open until the discarded file object is garbage collected, which could be long enough that your program will run out of file handles. If you need to do more than just read the lines, then:
File.open(path) do |file|
# do stuff with file
end
Ruby will close the file at the end of the block.
You might also consider whether using find_all and a positive match would be easier to read than reject and a negative match. The fewer negatives the reader's mind has to go through, the clearer the code:
lines = File.readlines('/tmp/bar').find_all do |line|
line =~ target
end
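The pieces of this answer can be combined into one short sketch. The method name matching_lines is hypothetical, Regexp.union stands in for the escape-and-join step, and a Tempfile replaces /tmp/bar so the example is runnable anywhere. File.foreach reads line by line and closes the file when it is done.

```ruby
require "tempfile"

# Hypothetical helper: lines of the file that contain any of the target strings.
def matching_lines(path, targets)
  target = Regexp.union(targets)   # escapes each string, joins with |
  File.foreach(path).grep(target)  # positive match, no reject/negation needed
end

Tempfile.create("bar") do |file|
  file.write("has string1\nnothing here\nhas string2\n")
  file.flush
  p matching_lines(file.path, ["string1", "string2"])
end
```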
How about using %r{}:
my_regex = "func:[sync] displayPTS"
File.open($f).readlines.reject { |l| not l =~ %r{#{Regexp.escape(my_regex)}} }
