How can I get the content in between "{ }" in Ruby? For example,
I love {you}
How can I fetch the element "you"? If I want to replace the content, say change "you" to "her", how should I do that? Probably using gsub?
replacements = {
'you' => 'her',
'angels' => 'demons',
'ice cream' => 'puppies',
}
my_string = "I love {you}.\nYour voice is like {angels} singing.\nI would love to eat {ice cream} with you sometime!"
replacements.each do |source, replacement|
my_string.gsub! "{#{source}}", replacement
end
puts my_string
# => I love her.
# => Your voice is like demons singing.
# => I would love to eat puppies with you sometime!
The simple way to get the content from the inside of the {...} is:
str = 'I love {you}'
str[/{(.+)}/, 1] # => "you"
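That basically says, "grab everything from a leading { to a trailing }." It's not very sophisticated: it can be fooled by nested {} pairs, and because .+ is greedy it will also run from the first { to the last } when a line holds several pairs.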
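When a line holds more than one placeholder, a negated character class combined with scan is one way around the greedy match. This is only a sketch; the sample string is made up for illustration and is not from the question:

```ruby
# Hypothetical sample input with two placeholders.
str = 'I love {you} and {ice cream}'

# Greedy .+ runs from the first "{" to the last "}".
greedy = str[/{(.+)}/, 1]

# [^{}]* cannot cross a brace, so each pair is matched on its own.
first = str[/\{([^{}]*)\}/, 1]
all   = str.scan(/\{([^{}]*)\}/).flatten

p greedy  # => "you} and {ice cream"
p first   # => "you"
p all     # => ["you", "ice cream"]
```

This still does not handle nested braces; it simply refuses to match across them.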
Replacing the target string can be done various ways:
replace_str = 'her'
'I love {you}'.sub('you', replace_str) # => "I love {her}"
A simple sub will replace the first occurrence of the target string with the replacement text.
You could use a regex instead of the string:
'I love you {you}'.sub(/you/, replace_str) # => "I love her {you}"
If there are multiple occurrences of the target string then use a bit more text to locate it. This uses the wrapping delimiters to locate it, and then replaces them also. There are other ways to do this, but I'd do it like:
'I love you {you}'.sub(/{.+}/, "{#{ replace_str }}") # => "I love you {her}"
Alex Wayne's answer came close but didn't go all the way: Ruby's gsub has a really nice feature, where you can pass it a regex and a hash, and it will replace all the occurrences of the regex matches with the values in the hash:
hash = {
'I' => 'She',
'love' => 'loves',
'you' => 'me'
}
str.gsub(Regexp.union(hash.keys), hash) # => "She loves {me}"
That's really powerful when you want to take a template and quickly replace all the placeholders in it.
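As a rough sketch of that template idea, the pattern can match whole placeholders instead of bare words. The template and values below are hypothetical; note that the hash keys then have to include the braces, because gsub looks up the full matched text:

```ruby
# Hypothetical template and values, for illustration only.
template = 'I love {you}. Your voice is like {angels} singing.'
values = {
  '{you}'    => 'her',
  '{angels}' => 'demons'
}

# The pattern matches whole placeholders, braces included, so the keys
# carry the braces too. A match with no key is replaced by an empty string.
filled = template.gsub(/\{[^{}]*\}/, values)

puts filled
# => I love her. Your voice is like demons singing.
```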
You can always use .index:
a = 'I love {bill gates}'
a[a.index('{')+1..a.index('}')-1]
The last line just says get 'a' from right after the first occurrence of '{' and right before the first occurrence of '}'. It is important to note, however, that this will only get the text between the first occurrences of {}. So it will work for your above example.
I would use indexing also to add something new between the {}s.
That would look something like:
a[0..a.index('{')] + 'Steve Jobs' + a[a.index('}')..-1]
Again this only works for the first occurrence of '{' and '}'.
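One thing to watch with the index approach: if either brace is missing, index returns nil and the arithmetic raises NoMethodError. A small guard could look like the sketch below, where between_braces is a hypothetical helper name, not something from the answer:

```ruby
# Hypothetical helper: returns the text between the first "{" and the
# next "}", or nil when either brace is missing.
def between_braces(text)
  start = text.index('{')
  stop  = start && text.index('}', start + 1)
  return nil unless stop

  text[(start + 1)...stop]
end

p between_braces('I love {bill gates}')  # => "bill gates"
p between_braces('no braces here')       # => nil
```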
Michael G.
Why not use a template engine such as Mustache: https://github.com/defunkt/mustache
Note that Ruby can do this itself for %{}:
"foo = %{foo}" % { :foo => 'bar' }
#=> "foo = bar"
And finally, do not forget to check the existing Ruby template engines; do not reinvent the wheel!
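The %{} substitution shown above also works through Kernel#format with several named keys. A minimal sketch, with a made-up template; a key that is missing from the hash raises KeyError rather than being left in place:

```ruby
# Hypothetical template with two named references.
template = 'Dear %{name}, your order %{id} has shipped.'

message = format(template, name: 'Ann', id: 42)
puts message
# => Dear Ann, your order 42 has shipped.

# A missing key is reported instead of silently ignored.
missing_key_raised =
  begin
    format(template, name: 'Ann')
    false
  rescue KeyError
    true
  end
p missing_key_raised  # => true
```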
Regular expressions are the way to go with gsub. Something like:
existingString.gsub(/\{(.*?)\}/) { "her" }
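The block form can also use the captured name to pick a replacement. The sketch below assumes a replacements hash like the one in the first answer and leaves unknown placeholders untouched; the sample text is invented:

```ruby
replacements = { 'you' => 'her', 'angels' => 'demons' }
text = 'I love {you}, not {them}.'  # hypothetical input

result = text.gsub(/\{(.*?)\}/) do |whole|
  # Regexp.last_match(1) is the text between the braces; fall back to the
  # whole match (braces included) when there is no replacement for it.
  replacements.fetch(Regexp.last_match(1), whole)
end

puts result
# => I love her, not {them}.
```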
I am using Ruby 1.9.
I have a hash:
Hash_List={"ruby"=>"fun to learn","the rails"=>"It is a framework"}
I have a string like this:
test_string="I am learning the ruby by myself and also the rails."
I need to check if test_string contains words that match the keys of Hash_List. And if it does, replace the words with the matching hash value.
I used this code to check, but it returns an empty hash:
another_hash=Hash_List.select{|key,value| key.include? test_string}
OK, hold onto your hat:
HASH_LIST = {
"ruby" => "fun to learn",
"the rails" => "It is a framework"
}
test_string = "I am learning the ruby by myself and also the rails."
keys_regex = /\b (?:#{Regexp.union(HASH_LIST.keys).source}) \b/x # => /\b (?:ruby|the\ rails) \b/x
test_string.gsub(keys_regex, HASH_LIST) # => "I am learning the fun to learn by myself and also It is a framework."
Ruby's got some great tricks up its sleeve, one of which is how we can throw a regular expression and a hash at gsub, and it'll search for every match of the regular expression, look up the matching "hits" as keys in the hash, and substitute the values back into the string:
gsub(pattern, hash) → new_str
...If the second argument is a Hash, and the matched text is one of its keys, the corresponding value is the replacement string....
Regexp.union(HASH_LIST.keys) # => /ruby|the\ rails/
Regexp.union(HASH_LIST.keys).source # => "ruby|the\\ rails"
Note that the first returns a regular expression and the second returns a string. This is important when we embed them into another regular expression:
/#{Regexp.union(HASH_LIST.keys)}/ # => /(?-mix:ruby|the\ rails)/
/#{Regexp.union(HASH_LIST.keys).source}/ # => /ruby|the\ rails/
The first can quietly break what you think is a simple search: the embedded (?-mix:...) group carries its own flag settings, and those override whatever flags you put on the enclosing pattern.
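A small sketch of how that goes wrong, using a case-insensitive outer pattern as the example (the keys array is a stand-in for HASH_LIST.keys):

```ruby
keys  = ['ruby', 'the rails']  # stand-in for HASH_LIST.keys
union = Regexp.union(keys)

# Interpolating the Regexp embeds (?-mix:...), which switches /i off
# again for the embedded part.
with_flags = /#{union}/i

# Interpolating the source string embeds only the pattern text, so the
# outer /i applies to it.
without_flags = /#{union.source}/i

p 'RUBY' =~ with_flags     # => nil
p 'RUBY' =~ without_flags  # => 0
```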
The Regexp documentation covers all this well.
This capability is the core of an extremely fast templating routine in Ruby.
You could do that as follows:
Hash_List.each_with_object(test_string.dup) { |(k,v),s| s.sub!(/#{k}/, v) }
#=> "I am learning the fun to learn by myself and also It is a framework."
First, follow naming conventions. Variables are snake_case, and names of classes are CamelCase.
hash = {"ruby" => "fun to learn", "rails" => "It is a framework"}
words = test_string.split(' ') # => ["I", "am", "learning", ...]
another_hash = hash.select{|key,value| words.include?(key)}
Answering your question: split your test string in words with #split and then check whether words include a key.
To check whether a string is a substring of another string, use String#[] with a string argument:
another_hash = hash.select{|key, value| test_string[key]}
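Both checks have limits: splitting on spaces leaves "rails." with its period attached and can never match a two-word key such as "the rails", while a plain substring test would also find "ruby" inside "rubyist". One possible middle ground, sketched here with the hash from the question, is a word-boundary regex built per key:

```ruby
hash = { 'ruby' => 'fun to learn', 'the rails' => 'It is a framework' }
test_string = 'I am learning the ruby by myself and also the rails.'

# Regexp.escape protects the space (and any other special characters)
# in the key; \b keeps the match on whole words.
found = hash.select do |key, _value|
  test_string =~ /\b#{Regexp.escape(key)}\b/
end

p found.keys  # => ["ruby", "the rails"]
```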
I have a big text file. Within this text file, I want to replace all mentions of the word 'pizza' with 'spinach', 'Pizza' with 'Spinach', and 'pizzing' with 'spinning' -- unless those words occur anywhere within curly braces. So {pizza}, {giant.pizza} and {hot-pizza-oven} should remain unchanged.
My best proposed solution so far is to iterate over the file line-by-line, issuing a regex that detects everything before an { or after an }, and using regexes on each of those strings. But that gets really complex and unwieldy and I want to know if there's a proper solution for this problem.
This can be done in a few steps. I'd iterate through the file line by line, and pass each line to this method:
def spinachize line
# list of words to swap
swaps = {
'pizza' => 'spinach',
'Pizza' => 'Spinach',
'pizzing' => 'spinning'
}
# random placeholder for bracketed text
placeholder = 'fdjfafdlskdsfajkldfas'
# save all instances of bracketed text
bracketed_text = line.scan(/\{.*?\}/)
# remove bracketed text from line
line.gsub!(/\{.*?\}/, placeholder)
# replace all swaps
swaps.each do |original_text, new_text|
line.gsub!(original_text, new_text)
end
# re-insert bracketed text
line.gsub(placeholder){bracketed_text.shift}
end
The comments above explain things as we go. Here are a couple of examples:
spinachize "Pizza is good, but more pizza is better"
=> "Spinach is good, but more spinach is better"
spinachize "Leave bracketed instances of {pizza} or {this.pizza} alone"
=> "Leave bracketed instances of {pizza} or {this.pizza} alone"
As you can see, you can specify the items you want swapped, or modify the method to pull the list from a database or flat file somewhere. The placeholder just needs to be something unique that wouldn't come up in the source file naturally.
The process is this: remove bracketed text from the original line, and remember it for later. Swap all text that needs swapping, then add back the bracketed text. It's not a one-liner, but it works well and is readable and easy to update.
The last line of the method might need some clarification. Not many people know that the "gsub" method can take a block instead of a second parameter. That block then determines what gets put in place of the original text. In this case, every time the block is called I remove the first item off our saved bracket list, and use that.
rules = {'pizza' => 'spinach','Pizza' => 'Spinach','pizzing' => 'spinning'}
regexp = /\{[^{}]*\}|#{rules.keys.join('|')}/m
puts(file.read.gsub(regexp) { |s| rules[s] || s })
This constructs a regular expression that matches either bracketed strings or the strings to replace. We then run it through a block that replaces strings with the given value and leaves bracketed strings unchanged. The negated class [^{}] already matches newlines, so brackets that span several lines are handled; the /m flag only changes how . matches, and since this pattern contains no ., the flag has no effect and can be dropped. Either way, there is no need to iterate line by line.
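Here is the same idea as a self-contained sketch that runs on a string instead of a file. The sample text is invented, Regexp.union is used in place of join('|') so that keys with special characters would be escaped, and /m is left out because the pattern contains no dot:

```ruby
rules = { 'pizza' => 'spinach', 'Pizza' => 'Spinach', 'pizzing' => 'spinning' }

# Bracketed text is tried first, so anything inside {} is consumed as one
# match before the individual words get a chance.
regexp = /\{[^{}]*\}|#{Regexp.union(rules.keys).source}/

# Hypothetical input; the second pair of braces spans a newline.
text = "Pizza and pizzing, but not {hot-pizza-oven} or {giant.\npizza}."

# Bracketed matches have no entry in rules, so they are returned as-is.
result = text.gsub(regexp) { |s| rules[s] || s }

puts result
# => Spinach and spinning, but not {hot-pizza-oven} or {giant.
# => pizza}.
```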
str = "Pizza {pizza} with spinach is not pizzing."
swaps = {'{pizza}'   => '{pizza}',
         '{Pizza}'   => '{Pizza}',
         '{pizzing}' => '{pizzing}',
         'pizza'     => 'spinach',
         'Pizza'     => 'Spinach',
         'pizzing'   => 'spinning'}
regex = Regexp.union(swaps.keys)
p str.gsub(regex, swaps) # => "Spinach {pizza} with spinach is not spinning."
I would call the following method for each line of the file.
Code
def doit(line)
  replace = {'pizza'=>'spinach', 'Pizza'=>'Spinach', 'pizzing'=>'spinning'}
  r = /\{.*?\}/
  # A limit of -1 keeps trailing empty strings, so a line that ends with "}" still works
  arr = line.split(r, -1).map { |str|
    str.gsub(/\b(?:pizza|Pizza|pizzing)\b/, replace) }
  line.scan(r).each_with_object(arr.shift) { |str,res|
    res << str << arr.shift }
end
Examples
doit("Pizza Primastrada's {pizza} is the best {pizzing} pizza in town.")
#=> "Spinach Primastrada's {pizza} is the best {pizzing} spinach in town."
doit("{Pizza Primastrada}'s pizza is the best pizzing {pizza} in town.")
#=> "{Pizza Primastrada}'s spinach is the best spinning {pizza} in town."
Explanation
line = "Pizza Primastrada's {pizza} is the best {pizzing} pizza in town."
replace = {'pizza'=>'spinach', 'Pizza'=>'Spinach', 'pizzing'=>'spinning'}
r = /\{.*?\}/
a = line.split(r)
#=> ["Pizza Primastrada's ", " is the best ", " pizza in town."]
b = a.map { |str| str.gsub(/\b(?:pizza|Pizza|pizzing)\b/, replace) }
#=> ["Spinach Primastrada's ", " is the best ", " spinach in town."]
keepers = line.scan(r)
#=> ["{pizza}", "{pizzing}"]
keepers.each_with_object(b.shift) { |str,res| res << str << b.shift }
#=> "Spinach Primastrada's {pizza} is the best {pizzing} spinach in town."
Nested braces
If you wish to permit nested braces, change the regex to:
r = /\{[^{}]*?(?:\{.*?\})*?[^{}]*?\}/
doit("Pizza Primastrada's {{great {great} pizza} is the best pizza.")
#=> "Spinach Primastrada's {{great {great} pizza} is the best spinach."
You referred to the string
{words,salad,#{1,2,3} pizza|}
in a comment. If that is part of a string enclosed in single quotes, not a problem. If enclosed in double quotes, however, # will raise a syntax error. Again, no problem, if the pound character is escaped (\#).
I have a string "FooFoo2014".
I want the result to be => "Foo Foo 2014"
Any idea?
This works fine:
puts "FooFoo2014".scan(/(\d+|[A-Z][a-z]+)/).join(' ')
# => Foo Foo 2014
Of course, this assumes that every word starts with a capital letter followed by lowercase letters, and that the numbers are plain runs of digits.
"FooFoo2014"
.gsub(/(?<=\d)(?=\D)|(?<=\D)(?=\d)|(?<=[a-z])(?=[A-Z])/, " ")
# => "Foo Foo 2014"
Your example is a little generic, so this might be guessing in the wrong direction. That being said, it seems like you want to reformat the string a little:
"FooFoo2014".scan(/^([A-Z].*)([A-Z]\D*)(\d+)$/).flatten.join(" ") # => "Foo Foo 2014"
As "FooFoo2014" is a string with some internal structure important to you, you need to come up with the right regular expression yourself.
From your question, I extract two tasks:
1. Split the FooFoo at the capital letter: /([A-Z].*)([A-Z].*)/ would do that, given you only have standard Latin letters.
2. Split the letters from the digits: /(.*\D)(\d+)/ achieves that.
The result of scan is an array in my version of ruby. Please verify that in your setup.
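On the shape of that result: scan returns an array of arrays when the pattern has capture groups, and a flat array of strings when it has none. A short sketch using the pattern from the first answer:

```ruby
# With a capture group, each match becomes a one-element array.
with_group = 'FooFoo2014'.scan(/(\d+|[A-Z][a-z]+)/)

# Without a group, each match is a plain string.
without_group = 'FooFoo2014'.scan(/\d+|[A-Z][a-z]+/)

p with_group               # => [["Foo"], ["Foo"], ["2014"]]
p without_group            # => ["Foo", "Foo", "2014"]
p without_group.join(' ')  # => "Foo Foo 2014"
```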
If you think that regular expressions are too complicated for this, I suggest that you take a good look into ActiveSupport. http://api.rubyonrails.org/v3.2.1/ might help you.
If it's only letters followed by only digits:
target = "FooFoo2014"
match_data = target.match(/([A-Za-z]+)(\d+)/)
p match_data[1] # => "FooFoo"
p match_data[2] # => "2014"
If it is two words each made of one capitalized letter then lowercase letters, then digits:
target = "FooBar2014"
match_data = target.match(/([A-Z][a-z]+)([A-Z][a-z]+)(\d+)/)
p match_data[1] # => "Foo"
p match_data[2] # => "Bar"
p match_data[3] # => "2014"
Better regexes are probably possible.
I want to create a 'swearscan' that scans user text and swaps the swear words out for 'censored'. I thought I had coded it properly, but obviously not, as the output described below shows. Someone please help!
And since this is Stack Overflow, we'll substitute other words for the swear words:
puts "Input your sentence here: "
text = gets.downcase.strip
swear_words = {'cat' => 'censored', 'dog' => 'censored', 'cow' => 'censored'}
clean_text = swear_words.each do |word, clean|
text.gsub(word,clean)
end
puts clean_text
When I ran this program (with the actual swearwords) all it would return is the hash like so: catcensoreddogcensoredcowcensored. What is wrong with my code that it's returning the hash and not the clean_text with everything substituted out?
This works for me:
puts "Input your sentence here: "
text = gets.downcase.strip
swear_words = {'cat' => 'censored', 'dog' => 'censored', 'cow' => 'censored'}
swear_words.each do |word, clean| # No need to copy here
text.gsub!(word,clean) # Changed from gsub
end
puts text # Changed from clean_text
The problem is that gsub does not change the original string, but you are expecting it to. Using gsub! changes the string in place. You are also wrong to expect each to return the substituted text: each returns the hash it was called on, which is why clean_text printed as the hash. Just refer to text at the end to get the replaced string.
By the way, if the replacement strings are all the same 'censored', then it does not make sense to use a hash there. You should just have an array of the swear words, and put the replacement string in the gsub! method directly (or define it as a constant in some other place).
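A sketch of that array-based variant follows. The constant names and the censor method are hypothetical, and the word boundaries and case-insensitive flag are additions of this sketch (so that "catalog" is left alone and no downcase is needed), not part of the original code:

```ruby
SWEAR_WORDS = %w[cat dog cow].freeze
REPLACEMENT = 'censored'

# One pattern for all words; (?:...) keeps \b applied to every alternative.
SWEAR_REGEX = /\b(?:#{Regexp.union(SWEAR_WORDS).source})\b/i

# Hypothetical helper: returns a new string and leaves the input untouched.
def censor(text)
  text.gsub(SWEAR_REGEX, REPLACEMENT)
end

puts censor('My Cat chased a dog into the catalog.')
# => My censored chased a censored into the catalog.
```

A single gsub over one combined pattern also avoids re-scanning the text once per word.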
I'd like to do something like:
string.gsub(/(whatever)/,'\n\1\n')
But I don't want "whatever" to be replaced with the literal "\nwhatever\n"
I want the \n to actually correspond to a new line.
I think you need double quotes:
string.gsub(/(whatever)/,"\n\\1\n")
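To see the difference between the two quoting styles side by side, here is a short sketch with an invented input string:

```ruby
string = 'say whatever now'  # hypothetical input

# Single quotes: \n stays a backslash followed by "n"; \1 is still
# expanded by gsub as a backreference.
single = string.gsub(/(whatever)/, '\n\1\n')

# Double quotes: \n is a real newline; the backslash in front of 1 must
# be doubled so that gsub still receives \1.
double = string.gsub(/(whatever)/, "\n\\1\n")

p single  # => "say \\nwhatever\\n now"
p double  # => "say \nwhatever\n now"
```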
\n is a newline; that's what it means.
Depending on how you print it, it will give you a new line, so:
puts "\nwhatever\n".inspect
=> "\nwhatever\n"
however:
puts "\nwhatever\n"
=>
=> whatever
=>
Unless I misunderstand the question.
If you wanted to split it into a list, do this:
puts "\nwhatever\n".split(?\n).inspect
=> ["", "whatever"]