How do I use gsub to search and replace using a regex? - ruby

I want to filter tags out of a description string, and want to make them into anchor tags. I am not able to return the value of the tag.
My input is:
a = "this is a sample #tag and the string is having a #second tag too"
My output should be:
a = "this is a sample <a href='/tags/tag'>#tag</a> and the string is having a <a href='/tags/second'>#second</a> tag too"
So far I am able to do some minor stuff, but I am not able to achieve the final output. This pattern:
a.gsub(/#\S+/i, "<a href='/tags/\0'>\0</a>")
returns:
"this is a sample <a href='/tags/\u0000'>\u0000</a> and the string is having a <a href='/tags/\u0000'>\u0000</a> tag too"
What do I need to do differently?

You can do it like this:
a.gsub(/#(\S+)/, '<a href="/tags/\1">\0</a>')
The reason your replacement doesn't work is that inside double quotes "\0" is the NUL character (hence the \u0000 in your output), so the backslash must be doubled:
a.gsub(/#(\S+)/, "<a href='/tags/\\1'>\\0</a>")
Note that the /i modifier is not needed here.
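To make the escaping difference concrete, here is a minimal sketch that reuses the string from the question. The variable names (nul, single, doubled, linked) are only illustrative. It compares the three ways of writing the backreference and then runs the double-quoted replacement:

```ruby
a = "this is a sample #tag and the string is having a #second tag too"

# "\0" in double quotes is a single NUL character, not a backreference.
nul = "\0"
# '\0' and "\\0" are the same two characters: a backslash and a zero.
single  = '\0'
doubled = "\\0"

same_text = (single == doubled)
nul_size  = nul.length

# \1 is the first capture group (the tag name), \0 is the whole match.
linked = a.gsub(/#(\S+)/, "<a href='/tags/\\1'>\\0</a>")
puts linked
```

Here gsub only sees a backreference when the replacement string really contains a backslash, which is why the single-quoted and the doubled forms behave the same.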

You need to give gsub a block if you want to do something with the match from the regex:
a.gsub(/#(\S+)/i) { "<a href='/tags/#{$1}'>##{$1}</a>" }
$1 is a special variable that Ruby automatically sets to the first capture group of the most recent match.

Try this:
a.gsub(/(?<a>#\w+)/, '<a href="/tags/\k<a>">\k<a></a>')

Related

Laravel: how to replace words in a string

I am trying to change words in a string. I know Laravel's Str::replace() function, but it only replaces the exact word you give it. This is what I want to do:
$string = " this is #username and this is #username2";
I want to find usernames that start with "#" and wrap them in a <strong> HTML tag, like this:
$newstring = " this is <strong>#username</strong> and this is <strong>#username2</strong>";
I couldn't find a way to replace a dynamic value with Str::replace().
Thanks
Use preg_replace
$string = 'this is #username and this is #username2';
$newString = preg_replace('/(\#\w+)/', '<strong>$1</strong>', $string);
Docs: https://www.php.net/manual/en/function.preg-replace.php

Pull multiple values from a string using RegEx

I have the string "{:name=>\"entry 1\", :description=>\"description 1\"}"
I'm using regex to get the values of name and description...
string = "{:name=>\"entry 1\", :description=>\"description 1\"}"
name = /\:name=>\"(.*?)\,/.match(string)
description = /\:description=>\"(.*?)\,/.match(string)
This however only returns name as #<MatchData ":name=>\"entry 1\"," 1:"entry 1\""> and description comes back as nil.
What I ideally want is for name to return "entry 1" and description come back as "description 1"
I'm not sure where I'm going wrong... any ideas?
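The problem is the trailing comma in /\:description=>\"(.*?)\,/: the description is the last entry, so no comma follows it. Match up to the closing quote instead, with /\:description=>\"(.*?)\"/ or /\:description=>\"([^"]+)/
You can also use this method: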
def extract_value_from_string(string, key)
%r{#{key}=>\"([^"]+)}.match(string)[1]
end
extract_value_from_string(string, 'description')
=> "description 1"
extract_value_from_string(string, 'name')
=> "entry 1"
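Try this regex to retrieve both name and description in one step:
(?<=name=>\\"|description=>\\")[^\\]+
The demo I used runs on PCRE, but I've also tested it on http://rubular.com/ and it works fine.
If you want to get them separately, use (?<=name=>\\")[^\\]+ to extract name and (?<=description=>\\")[^\\]+ for description.
Note that these patterns expect a literal backslash before each quote, as in the escaped text pasted into a regex tester. The Ruby string in the question contains plain quotes, so there you would drop the \\ and use [^"]+ instead.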
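As an alternative to running one regex per key, here is a minimal sketch that collects every key and value in one pass with String#scan. It assumes the values never contain an embedded double quote, and the names pairs and values are only illustrative:

```ruby
string = "{:name=>\"entry 1\", :description=>\"description 1\"}"

# Each scan result is a [key, value] pair; [^"]* stops at the closing quote.
pairs  = string.scan(/:(\w+)=>"([^"]*)"/)
values = pairs.to_h

name        = values["name"]
description = values["description"]
puts name, description
```

Because the pattern anchors on the closing quote rather than on a comma, the last entry is picked up the same way as the first.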

just capture the text and remove the email with a regex

I'm trying to write a regex that removes "email: toto#toto.com" from my text.
Example: I request information on your project email: toto#free.fr
So I wrote this, which should capture "email: toto#toto.com":
message ="I request information on your project email. toto#free.fr"
message.gsub!("/(email: [-a-z0-9_+\.]+\#([-a-z0-9]+\.)+[a-z0-9]{2,4}$)/i")
It returns nothing, and I would like only the message text to remain.
thanks
Try this. It should work for uppercase and lowercase, and for emails that appear in the middle of the string.
email = /[A-Za-z]{5}:\s[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/
s = "I request information on your project email: toto#free.fr"
s.match(email).pre_match #=> "I request information on your project "
s2 = "This email: blah#bLAH.com is in the middle"
s2.match(email).pre_match #=> "This "
s2.match(email).post_match #=> " is in the middle"
But there are more cases that this does not cover, e.g. "email:" followed by several spaces.
Your code has several problems:
You are looking for "email: ...", but your message has "email. ...".
You call gsub! with one parameter, which is not the classic use case and returns an Enumerator. The classic use case expects a second parameter, the replacement for the matches found:
Performs the substitutions of String#gsub in place, returning str, or
nil if no substitutions were performed. If no block and no replacement
is given, an enumerator is returned instead.
You pass a string to gsub! - "/(email: [-a-z0-9_+\.]+\#([-a-z0-9]+\.)+[a-z0-9]{2,4}$)/i", which is different from passing a regex. To pass a regex, you need to drop the quotes around it: /(email: [-a-z0-9_+\.]+\#([-a-z0-9]+\.)+[a-z0-9]{2,4}$)/i
So a fix to your code would look like this:
message ="I request information on your project email: toto#free.fr"
message.gsub(/(email: [-a-z0-9_+\.]+\#([-a-z0-9]+\.)+[a-z0-9]{2,4}$)/i, '')
# => "I request information on your project "
Also note I changed your code to use gsub instead of gsub!, since gsub! changes the underlying string instead of creating a new one, and unless you have a good reason to do that, mutating the input arguments is discouraged.
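The return values described above can be checked with a short sketch. The pattern below is a simplified, hypothetical stand-in for the email regex, and the variable names are illustrative only:

```ruby
original = "I request information on your project email: toto#free.fr"
pattern  = /email: \S+\z/i   # simplified, hypothetical pattern for the demo

# gsub returns a new string and leaves the receiver untouched.
cleaned = original.gsub(pattern, '')

# gsub! mutates the receiver; it returns nil when nothing was substituted.
copy = original.dup
first_call  = copy.gsub!(pattern, '')   # returns copy, now modified
second_call = copy.gsub!(pattern, '')   # nothing left to replace, so nil

# With a single argument and no block, an Enumerator comes back instead.
enum = original.gsub(pattern)

puts cleaned
```

The nil from the second gsub! call is the usual reason to avoid chaining further methods onto gsub!.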
If you want to remove the email from the text, use String#sub (here the in-place variant, sub!). Note that the pattern expects "email:" with a colon:
message = "I request information on your project email: toto#free.fr"
message.sub!(/[A-Za-z]{5}:\s[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/, '')
# => "I request information on your project "

Replacing scan by gsub in Ruby: how to allow code in gsub block?

I am parsing a Wiki text from an XML dump, for a string named 'section' which includes templates in double braces, including some arguments, which I want to reorganize.
This has an example named TextTerm:
section="Sample of a text with a first template {{TextTerm|arg1a|arg2a|arg3a...}} and then a second {{TextTerm|arg1b|arg2b|arg3b...}} etc."
I can use scan and a regex to get each template and work on it on a loop using:
section.scan(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/i).each { |item| puts "1=" + item[1] } # arg1a etc.
With this, I have been able to build a database of the templates' first arguments.
Now I also want to replace the name of the template with "NewTextTerm" and reorganize its arguments by putting the second argument in place of the first.
Can I do it in the same loop? For example, by replacing scan with gsub(regexp) { block }:
section.gsub!(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/) { |item| '{{NewTextTerm|\2|\1}}'}
I get:
"Sample of a text with a first template {{NewTextTerm|\\2|\\1}} and then a second {{NewTextTerm|\\2|\\1}} etc."
meaning that the arguments of the regexp are not recognized. Even if it worked, I would like to have some place within the gsub block to work on the arguments. For example, I can't have a puts in the gsub block similar to the scan().each block but only a string to be substituted.
Any ideas are welcome.
PS: Some editing: braces and "section= added", code is complete.
When you have the replacement as a string argument, you can use '\1', etc. like this:
string.gsub!(regex, '...\1...\2...')
When you have the replacement as a block, you can use "#$1", etc. like this:
string.gsub!(regex){"...#$1...#$2..."}
You are mixing the uses. Stick to either one.
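Here is a minimal side-by-side sketch of the two forms. It uses a shortened sample text with two arguments per template, which is an assumption made for the demo, and the seen array is a hypothetical stand-in for whatever extra work you want to do inside the block:

```ruby
section = "A first template {{TextTerm|arg1a|arg2a}} and a second {{TextTerm|arg1b|arg2b}}."
regex   = /\{\{TextTerm\|(.*?)\|(.*?)\}\}/

# String replacement: backreferences are written \1, \2 in single quotes.
by_string = section.gsub(regex, '{{NewTextTerm|\2|\1}}')

# Block replacement: captures come from $1, $2. The block may run any code,
# and its last expression becomes the replacement text.
seen = []
by_block = section.gsub(regex) do
  first, second = $1, $2
  seen << first
  "{{NewTextTerm|#{second}|#{first}}}"
end

puts by_block
```

Both calls produce the same string. The block form additionally leaves the first arguments in seen, so the extraction and the rewrite can happen in the same pass.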
Yes, changing the single quotes to double quotes isn't enough; #$1 is the answer. Here is the complete code:
section="Sample of a text with a first template {{TextTerm|arg1a|arg2a|arg3a...}} and then a second {{TextTerm|arg1b|arg2b|arg3b...}} etc."
section.gsub(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/) { |item| "{{New#$1|#$3|#$2}}"}
"Sample of a text with a first template {{NewTextTerm|arg2a|arg3a...|arg1a}} and then a second {{NewTextTerm|arg2b|arg3b...|arg1b}} etc."
Thus, it works. Thanks.
But now I have to replace the string literal with a "function" that returns the changed string:
def stringreturn(arg1,arg2,arg3) strr = "{{New"+arg1 + arg3 +arg2 + "}}"; return strr ; end
and
section.gsub(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/) { |item| stringreturn("#$1","|#$2","|#$3") }
will return:
"Sample of a text with a first template {{NewTextTerm|arg2a|arg3a...|arg1a}} and then a second {{NewTextTerm|arg2b|arg3b...|arg1b}} etc."
Thanks to all!
There is probably a better way to manipulate arguments in MediaWiki templates using Ruby.

Ruby: replace a given URL in an HTML string

In Ruby, I want to replace a given URL in an HTML string.
Here is my unsuccessful attempt:
escaped_url = url.gsub(/\//,"\/").gsub(/\./,"\.").gsub(/\?/,"\?")
path_regexp = Regexp.new(escaped_url)
html.gsub!(path_regexp, new_url)
Note: url is actually a Google Chart request URL I wrote, which will not have more special characters than /?|.=%:
The gsub method can take a string or a Regexp as its first argument, same goes for gsub!. For example:
>> 'here is some ..text.. xxtextxx'.gsub('..text..', 'pancakes')
=> "here is some pancakes xxtextxx"
So you don't need to bother with a regex or escaping at all, just do a straight string replacement:
html.gsub!(url, new_url)
Or better, use an HTML parser to find the particular node you're looking for and do a simple attribute assignment.
I think you're looking for something like:
path_regexp = Regexp.new(Regexp.escape(url))
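To see why the original attempt fails and why both suggestions work, here is a minimal sketch. The URL, the replacement path, and the HTML snippet are all hypothetical values made up for the demo:

```ruby
url     = "http://example.com/chart?cht=p3&chl=Hello|World"   # hypothetical
new_url = "/images/chart.png"                                  # hypothetical
html    = "<img src=\"#{url}\">"

# "\/" in double quotes is just "/", so the hand-rolled escaping is a no-op.
hand_escaped = url.gsub(/\//, "\/").gsub(/\./, "\.").gsub(/\?/, "\?")
unchanged = (hand_escaped == url)

# Unescaped, "?" and "|" act as regex metacharacters, so the literal URL
# is not matched as a whole.
naive = html.gsub(Regexp.new(url), new_url)

# Either a plain string pattern or Regexp.escape matches the URL literally.
by_string = html.gsub(url, new_url)
by_regexp = html.gsub(Regexp.new(Regexp.escape(url)), new_url)

puts by_string
```

The string pattern is the simpler choice when the whole URL is known. Regexp.escape is mainly useful when the escaped URL has to be embedded in a larger pattern.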
