How to substitute multiple grep results with str.sub in Ruby? - ruby

I am a Ruby beginner and I ran into a problem; I would like to know if there is a more 'Ruby way'
to solve it.
My problem is:
I have a string like this:
str = "<div class=\"yui-u first\">\r\n\t\t\t\t\t<h1>Jonathan Doe</h1>\r\n
\t\t\t\t\t<h2>Web Designer, Director</h2>\r\n\t\t\t\t</div>"
# Now I want to replace the text inside <h1>...</h1> and <h2>...</h2> with
# these two strings: "fooo" and "barr".
Here is what I did:
# first, i got the exactly matched substrings of str:
r = str.scan(/(?<=<h\d>).*?(?=<\/h\d>)/)
# then, i create a hash table to set the corresponding replace strings
h = {r[0] => 'fooo', r[1] => 'barr'}
# finally, using str.gsub to replace those matched strings
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/, h)
# or like this
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/) {|v| h[v]}
PS: The substrings in <h1> </h1> and <h2> </h2> are not fixed, so I have
to get them FIRST in order to build the hash table. But I
really don't like the code above (because I wrote two almost identical lines);
I think there must be a more elegant way to do this. I have tried something like this:
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/) { ['fooo', 'barr'].each {|v| v}}
but this didn't work, because the block returns ['fooo', 'barr'] EVERY TIME!
If there is a way to make this block (or something else) return one element at a time (return 'fooo' the first time, then 'barr' the second), my problem will be solved!
Thank you!

You really have no business parsing HTML with a regexp; a library like Nokogiri makes this significantly easier because you can modify the DOM directly. That said, the mistake you're making is presuming that the iterator will execute only once per substitution and that the block will return only one value. The block is run afresh for every match, and each returns the object being iterated (the whole array), not a single element.
Here's a way to avoid all the Regexp insanity:
require 'rubygems'
gem 'nokogiri'
require 'nokogiri'
str = "<div class=\"yui-u first\">\r\n\t\t\t\t\t<h1>Jonathan Doe</h1>\r\n
\t\t\t\t\t<h2>Web Designer, Director</h2>\r\n\t\t\t\t</div>"
html = Nokogiri::HTML(str)
h1 = html.at_css('h1')
h1.content = 'foo'
h2 = html.at_css('h2')
h2.content = 'bar'
puts html.to_s
If you want to do multiple substitutions where each gets a different value, the simple way is to shift values off the front of an array:
subs = %w[ foo bar baz ]
string = "x x x"
string.gsub!(/x/) do |s|
  subs.shift
end
puts string.inspect
# => "foo bar baz"
Keep in mind that subs is consumed here. A more efficient approach would be to increment some kind of index variable and use that value instead, but this is a trivial modification.
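As a rough sketch of that index-based variant: the helper name replace_in_order is made up for illustration, and keeping the original text when the list of replacements runs out is my own assumption, not something the question asked for.

```ruby
# Hypothetical helper: replaces successive matches with successive
# replacements, without consuming the replacements array.
def replace_in_order(string, pattern, replacements)
  index = -1
  string.gsub(pattern) do |match|
    index += 1
    # Assumption: keep the original match if we run out of replacements.
    replacements.fetch(index, match)
  end
end

html = "<h1>Jonathan Doe</h1>\n<h2>Web Designer, Director</h2>"
subs = %w[fooo barr]
result = replace_in_order(html, /(?<=<h\d>).*?(?=<\/h\d>)/, subs)
puts result
# <h1>fooo</h1>
# <h2>barr</h2>
```

Because the block only reads from the array, subs still holds both values afterwards and can be reused.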

Related

Swap part of a string in Ruby

What's the easiest way in Ruby to interchange a part of a string with another value. Let's say that I have an email, and I want to check it on two domains, but I don't know which one I'll get as an input. The app I'm building should work with #gmail.com and #googlemail.com domains.
Example:
swap_string 'user#gmail.com' # >>user#googlemail.com
swap_string 'user#googlemail.com' # >>user#gmail.com
If you're looking to substitute a part of a string with something else, gsub works quite well.
Link to Gsub docs
It lets you match a part of a string with regex, and then substitute just that part with another string. Naturally, in place of regex, you can just use a specific string.
Example:
"user#gmail.com".gsub(/#gmail/, '#googlemail')
is equal to
user#googlemail.com
In my example I used #gmail and #googlemail instead of just gmail and googlemail. The reason for this is to make sure it's not an account with gmail in the name. It's unlikely, but could happen.
Don't match the .com either, as that can change depending on where the user's email is.
Assuming googlemail.com and gmail.com are the only two possibilities, you can use sub to replace a pattern with a given replacement:
def swap_string(str)
  if str =~ /gmail\.com$/
    str.sub("gmail.com","googlemail.com")
  else
    str.sub("googlemail.com","gmail.com")
  end
end
swap_string 'user#gmail.com'
# => "user#googlemail.com"
swap_string 'user#googlemail.com'
# => "user#gmail.com"
You can try Ruby's gsub, e.g.:
"user#gmail.com".gsub("gmail.com","googlemail.com")
Since you need a function that takes a string parameter, this should do:
def swap_mails(str)
  if str =~ /gmail\.com$/
    str.sub('gmail.com','googlemail.com')
  else
    str.sub('googlemail.com','gmail.com')
  end
end
swap_mails "vgmail#gmail.com" # => "vgmail#googlemail.com"
swap_mails "vgmail#googlemail.com" # => "vgmail#gmail.com"
My addition :
def swap_domain str
  str[/.+#/] + [ 'gmail.com', 'googlemail.com' ].detect do |d|
    d != str.split('#')[1]
  end
end
swap_domain 'user#gmail.com'
#=> user#googlemail.com
swap_domain 'user#googlemail.com'
#=> user#gmail.com
And this is bad code, imo.
String has a neat trick up its sleeve in the form of String#[] (and String#[]=):
def swap_string(string, lookups = {})
  string.tap do |s|
    lookups.each { |find, replace| s[find] = replace and break if s[find] }
  end
end
# Example Usage
lookups = {"googlemail.com"=>"gmail.com", "gmail.com"=>"googlemail.com"}
swap_string("user#gmail.com", lookups) # => user#googlemail.com
swap_string("user#googlemail.com", lookups) # => user#gmail.com
Allowing lookups to be passed to your method makes it more reusable but you could just as easily have that hash inside of the method itself.
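For comparison, here is a sketch that leaves the argument untouched by passing a hash as the replacement to String#sub. The DOMAIN_SWAP constant and the method name swap_domain_with_hash are made up for illustration, and the separator is written as # only because that is how the addresses appear in this thread.

```ruby
# Hypothetical lookup table: each domain maps to its counterpart.
DOMAIN_SWAP = {
  'gmail.com'      => 'googlemail.com',
  'googlemail.com' => 'gmail.com'
}.freeze

# Returns a new string; the argument is not modified.
# Only a domain directly after the separator and at the very end is swapped.
def swap_domain_with_hash(address)
  address.sub(/(?<=#)(?:gmail|googlemail)\.com\z/, DOMAIN_SWAP)
end

puts swap_domain_with_hash('user#gmail.com')
puts swap_domain_with_hash('user#googlemail.com')
```

Anchoring the pattern to the separator and the end of the string means an account name that happens to contain gmail.com is left alone, and an address on any other domain comes back unchanged.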

How to click a variable link?

Reference to Željko Filipin's answer on
How do I retrieve a custom attribute in watir?
I have a number of links such as:
<a href="//stackoverflow.com"
title="professional and enthusiast programmers">Stack Overflow</a>
<a href="//programmers.stackexchange.com"
title="professional programmers interested in conceptual questions about software development">Programmers</a>
I have stored the links in an array; I need to click on the individual link.
browser.link(:href => /stackoverflow/).click
Instead of "stackoverflow", I want to run through my array (i.e. replace with array variable):
browser.link(:href => /array[i]/).click
Can anyone enlighten me how I can achieve this?
Is that what you want?
array.each { |link|
  browser.link(:href => /#{Regexp.escape(link)}/).click
}
Here's a contrived example that will cycle through a list of href attributes and interpolate each into a loop that will click each link:
require 'watir-webdriver'
b = Watir::Browser.new
b.goto('http://www.iana.org/domains/reserved')
arr = %W(/domains /numbers /protocols) # an array of strings
arr.each { |el| b.link(href: "#{el}").click}
The trick here is the interpolation syntax: #{}. When placed within a double-quoted string, the code within #{} is evaluated and inserted into the string. For instance:
# interpolate a string
str_to_insert = "world"
puts "hello #{str_to_insert}"
#=> hello world
# evaluate code before interpolation
puts "one plus two equals #{1 + 2}"
#=> one plus two equals 3
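The same #{} syntax works inside a regexp literal, which is what the /array[i]/ attempt in the question was reaching for. Below is a small sketch without a browser; the fragments and hrefs arrays are sample data taken from the links in the question, and Regexp.escape is added on the assumption that the stored strings should be matched literally.

```ruby
# Sample data: substrings to look for, and the hrefs from the question.
fragments = %w[stackoverflow programmers.stackexchange]
hrefs     = ['//stackoverflow.com', '//programmers.stackexchange.com']

# Build one regexp per fragment; Regexp.escape keeps "." from matching any character.
patterns = fragments.map { |fragment| /#{Regexp.escape(fragment)}/ }

patterns.each do |pattern|
  matching = hrefs.select { |href| href.match?(pattern) }
  puts "#{pattern.inspect} matches #{matching.inspect}"
  # In a Watir script, the pattern would be passed the same way as in the
  # question: browser.link(:href => pattern).click
end
```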

Replace characters from string Ruby

I have the following string, which has an array element in it, and I would like to move the quotes from around the array's elements to the outside of the array:
"date":"2014-05-04","name":"John","products":["12","14","45"],"status":"completed"
Is there a way to remove the double quotes in [] and add double quotes to the start and end of []? Results:
"date":"2014-05-04","name":"John","products":"[12,14,45]","status":"completed"
Can that be done in ruby or is there a command line that I can use?
Your string looks like a json hash to me:
json = '{"date":"2014-05-04","name":"John","products":["12","14","45"],"status":"completed"}'
require 'json'
hash = JSON.load(json)
hash.update('products' => hash['products'].map(&:to_i))
puts hash.to_json
# => {"date":"2014-05-04","name":"John","products":[12,14,45],"status":"completed"}
Or, if you really want the array represented as a string (so products is no longer a JSON array):
hash.update('products' => hash['products'].map(&:to_i).to_json) # note .to_json here; .to_s would give "[12, 14, 45]" with spaces
puts hash.to_json
# => {"date":"2014-05-04","name":"John","products":"[12,14,45]","status":"completed"}
The answer by @spickermann is pretty good, and the best way I can think of, but since I had fun trying to find an alternative without using json, here it goes:
def string_to_result(str)
str.match(/(?:\[)((?:")+(.)+(?:")+)+(?:\])/)
str.gsub($1, "#{$1.split(',').map{ |num| num.gsub('"', '') }.join(',')}").gsub(/\[/, '"[').gsub(/\]/, ']"').gsub(/String/, 'Results')
end
It's ugly as hell, but it works :P
I tried to do it in a single step, but that was too hard for my regexp skills.
Anyway, you should never parse something structured such as JSON or XML using only regexps; this is merely for fun.
[EDIT] I had the bracket-adjacent quotes wrong, sorry. Fixed.
Also, one more thing: this fails A LOT! An empty array, or an array elsewhere in the string, are just a few of the cases where it would fail.
You could use the form of String#gsub that takes a block:
str = '"2014-05-04","name":"John","products":["12","14","45"],"status":"completed"'
puts str.gsub(/\["(\d+)","(\d+)","(\d+)"\]/) { "\"[#{$1},#{$2},#{$3}]\"" }
#"2014-05-04","name":"John","products":"[12,14,45]","status":"completed"
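The pattern above is tied to exactly three elements. As a sketch of how the same block form could cope with any number of quoted integers, assuming the arrays contain only digits in double quotes (the method name quote_number_arrays is made up for illustration):

```ruby
# Hypothetical helper: turns ["12","14"] into "[12,14]" for arrays of
# quoted integers of any length (one element or more).
def quote_number_arrays(str)
  str.gsub(/\[("\d+"(?:,"\d+")*)\]/) do
    # Group 1 is the array body, e.g. "12","14","45"; drop its quotes.
    inner = Regexp.last_match(1).delete('"')
    "\"[#{inner}]\""
  end
end

str = '"date":"2014-05-04","name":"John","products":["12","14","45"],"status":"completed"'
puts quote_number_arrays(str)
# "date":"2014-05-04","name":"John","products":"[12,14,45]","status":"completed"
```

Empty arrays and arrays holding anything other than quoted digits do not match the pattern and are left as they are.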

How do I break up a string around "{tags}"?

I am writing a function which can have two potential forms of input:
This is {a {string}}
This {is} a {string}
I call the sub-strings wrapped in curly-brackets "tags". I could potentially have any number of tags in a string, and they could be nested arbitrarily deep.
I've tried writing a regular expression to grab the tags, which of course fails on the nested tags, grabbing {a {string}, missing the second curly bracket. I can see it as a recursive problem, but after staring at the wrong answer too long I feel like I'm blind to seeing something really obvious.
What can I do to separate out the potential tags into parts so that they can be processed and replaced?
The More Complicated Version
def parseTags( oBody, szText )
  if szText.match(/\{(.*)\}/)
    szText.scan(/\{(.*)\}/) do |outers|
      outers.each do |blah|
        if blah.match(/(.*)\}(.*)\{(.*)/)
          blah.scan(/(.*)\}(.*)\{(.*)/) do |inners|
            inners.each do |tags|
              szText = szText.sub("\{#{tags}\}", parseTags( oBody, tags ))
            end
          end
        else
          szText = szText.sub("\{#{blah}\}", parseTags( oBody, blah ))
        end
      end
    end
  end
  if szText.match(/(\w+)\.(\w+)(?:\.([A-Za-z0-9.\[\]": ]*))/)
    func = $1+"_"+$2
    begin
      szSub = self.send func, oBody, $3
    rescue Exception=>e
      szSub = "{Error: Function #{$1}_#{$2} not found}"
      $stdout.puts "DynamicIO Error Encountered: #{e}"
    end
    szText = szText.sub("#{$1}.#{$2}#{$3!=nil ? "."+$3 : ""}", szSub)
  end
  return szText
end
This was the result of tinkering too long. It's not clean, but it did work for a case similar to "1" - {help.divider.red.sys.["{pc.login}"]} is replaced with ---------------[ Duwnel ]---------------. However, {pc.attr.str.dotmode} {ansi.col.red}|{ansi.col.reset} {pc.attr.pre.dotmode} {ansi.col.red}|{ansi.col.reset} {pc.attr.int.dotmode} implodes brilliantly, with random streaks of red and swatches of missing text.
To explain, anything marked {ansi.col.red} marks an ansi red code, reset escapes the color block, and {pc.attr.XXX.dotmode} displays a number between 1 and 10 in "o"s.
As others have noted, this is a perfect case for a parsing engine. Regular expressions don't tend to handle nested pairs well.
Treetop is an awesome PEG parser that you might be interested in taking a look at. The main idea is that you define everything that you want to parse (including whitespace) inside rules. The rules allow you to recursively parse things like bracket pairs.
Here's an example grammar for creating arrays of strings from nested bracket pairs. Usually grammars are defined in a separate file, but for simplicity I included the grammar at the end and loaded it with Ruby's DATA constant.
require 'treetop'
Treetop.load_from_string DATA.read
parser = BracketParser.new
p parser.parse('This is {a {string}}').value
#=> ["This is ", ["a ", ["string"]]]
p parser.parse('This {is} a {string}').value
#=> ["This ", ["is"], " a ", ["string"]]
__END__
grammar Bracket
  rule string
    (brackets / not_brackets)+
    {
      def value
        elements.map{|e| e.value }
      end
    }
  end
  rule brackets
    '{' string '}'
    {
      def value
        elements[1].value
      end
    }
  end
  rule not_brackets
    [^{}]+
    {
      def value
        text_value
      end
    }
  end
end
I would recommend instead of fitting more complex regular expressions to this problem, that you look into one of Ruby's grammar-based parsing engines. It is possible to design recursive and nested grammars in most of these.
parslet might be a good place to start for your problem. The erb-alike example, although it does not demonstrate nesting, might be closest to your needs: https://github.com/kschiess/parslet/blob/master/example/erb.rb
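If pulling in a parser library is more than the problem needs, the same nested structure can be produced by a small hand-written recursive-descent parser on top of StringScanner from the standard library. This is a different technique from the Treetop and parslet suggestions above, offered only as a sketch; the method names parse_tags and parse_level are made up, and raising ArgumentError on unbalanced braces is my own assumption about the desired behaviour.

```ruby
require 'strscan'

# Hypothetical helper: returns nested arrays, one array level per {...} pair.
def parse_tags(text)
  scanner = StringScanner.new(text)
  parts = parse_level(scanner)
  # Anything left over at the top level must be a stray closing brace.
  raise ArgumentError, "unmatched '}'" unless scanner.eos?
  parts
end

# Reads plain text and nested tags until a '}' or the end of the input.
def parse_level(scanner)
  parts = []
  until scanner.eos?
    if (plain = scanner.scan(/[^{}]+/))
      parts << plain
    elsif scanner.scan(/\{/)
      parts << parse_level(scanner) # recurse for the tag body
      raise ArgumentError, "unmatched '{'" unless scanner.scan(/\}/)
    else
      break # next character is '}', which closes the current level
    end
  end
  parts
end

p parse_tags('This is {a {string}}')
#=> ["This is ", ["a ", ["string"]]]
p parse_tags('This {is} a {string}')
#=> ["This ", ["is"], " a ", ["string"]]
```

Each inner array is one tag, so the innermost tags can be processed and replaced first by walking the structure depth-first.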

Copy a file with the variables substituted

I have a file containing substituted variables (#{...}) and I would like to copy it into another file, with the variables substituted by their values.
Here's what I have
file = File.open(@batch_file_name, "w+")
script=File.open("/runBatch.script","r")
script.each do |line|
  file.puts(line)
end
But this is apparently not the right way to do that. Any suggestion ?
Instead of #{...} in your file use ERB files.
No, this isn't the right way to do it. You can't expect Ruby to magically interpret any #{} it encounters anywhere in your data as variable interpolation. This would (amongst other terrible side effects) yield massive security problems everywhere.
If you want to interpolate data into a string you'll need to eval it, which has its own security risks:
str = 'The value of x is #{x}'
puts str # The value of x is #{x}
x = "123"
puts eval "\"#{str}\"" # The value of x is 123
It's not clear which variables you're trying to interpolate into your data. This is almost certainly the wrong way to go about doing whatever it is you're doing.
Ok say you have a file named tmp.file that has the following text:
This is #{foobar}!
Then you can easily do the following:
str = ""
File.open("tmp.file", "r") do |f|
  str = f.read
end
foobar = "Sparta"
puts eval('"' + str + '"')
And your result would be This is Sparta!
But as already suggested you should go with a real template solution like ERB. Then you would use your files like views in Rails. Instead of This is #{foobar}. you would have This is <%= foobar %>.
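A minimal sketch of that ERB route, using the standard library's ERB class and its result_with_hash method (available since Ruby 2.5). The helper name render_template, the .erb file name, and the temporary directory are made up for illustration; in the question the target would be the batch file and the source the script template.

```ruby
require 'erb'
require 'tmpdir'

# Hypothetical helper: reads an ERB template, fills in the given variables,
# and writes the rendered text to target_path.
def render_template(source_path, target_path, variables)
  template = ERB.new(File.read(source_path))
  File.write(target_path, template.result_with_hash(variables))
end

# Demo in a throwaway directory so nothing on disk is overwritten.
Dir.mktmpdir do |dir|
  source = File.join(dir, 'runBatch.script.erb')
  target = File.join(dir, 'runBatch.script')
  File.write(source, "This is <%= foobar %>!\n")

  render_template(source, target, foobar: 'Sparta')
  puts File.read(target)
  # This is Sparta!
end
```

Unlike the eval approach, only the variables named in the hash are visible to the template, although ERB still executes whatever Ruby the template contains, so templates should come from a trusted source.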

Resources