I've got a string like this one below:
My first LINK
and my second LINK
How do I substitute all the links in this string from href="URL" to href="/redirect?url=URL" so that it becomes
My first LINK
and my second LINK
Thanks!
Given your case, we can construct the following regex:
re = /
href= # Match attribute we are looking for
[\'"]? # Optionally match opening single or double quote
\K # Discard everything matched so far, as we don't need it in the result
([^\'" >]+) # Capture group of characters except quotes, space and close bracket
/x
Now you can replace the captured group with the string you need (use \1 to refer to the group):
str.gsub(re, '/redirect?url=\1')
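As a minimal sketch of the whole substitution: the markup in `html` below is an assumption, since the question's original HTML is not shown here, and the example.com URLs are placeholders. The regex is the one from this answer, written on a single line.

```ruby
# Hypothetical sample input; the real markup from the question is not available.
html = %q(<a href="http://example.com/a">My first LINK</a> and <a href='http://example.com/b'>my second LINK</a>)

# \K drops href= and the optional quote from the match,
# so only the URL itself (group 1) is replaced.
re = /href=['"]?\K([^'" >]+)/

result = html.gsub(re, '/redirect?url=\1')
puts result
```

Because of \K, the `href=` prefix and the opening quote stay in place and do not have to be repeated in the replacement string.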
gsub allows you to match regex patterns and use captured substrings in the substitution:
x = <<-EOS
My first LINK
and my second LINK
EOS
x.gsub(/"(.*)"/, '"/redirect?url=\1"') # the \1 refers to the stuff captured
# by the (.*)
From a string:
"(book(1:3000))"
I need to exclude opening and closing brackets and match:
"book(1:3000)"
using regular expression.
I tried this regular expression:
/([^',()]+)|'([^']*)'/
which detects all characters and integers excluding brackets. The string detected by this regex is:
"book 1:3000"
Is there any regex that disregards the opening and closing brackets, and gives the entire string?
Build a regexp that states exactly what you want to extract: word characters, followed by an opening parenthesis, followed by digits, followed by a colon, followed by digits, followed by a closing parenthesis:
'(book(1:3000))'[/\w+\(\d+:\d+\)/]
#⇒ "book(1:3000)"
"(book(1:3000))"[/^\(?(.+?\))\)?/, 1]
=> "book(1:3000)"
"book(1:3000)"[/^\(?(.+?\))\)?/, 1]
=> "book(1:3000)"
The regex split on multiple lines for easier reading:
/
^ # start of string
\(? # an optional opening bracket
( # start capturing
.+? # as few characters as possible, until..
\) # ..the first closing bracket
) # stop capturing
\)? # an optional closing bracket (not captured)
/x # end regex with /x modifier (allows splitting to lines)
1. Look for a possible ( in the beginning of string and ignore it.
2. Start capturing
3. Capture until and including the first )
But this is where it fails:
"book(1:(10*30))"[/^\(?(.+?\))\)?/, 1]
=> "book(1:(10*30)"
If you need something like that, you probably need a recursive regex, as described in another Stack Overflow answer.
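For illustration, here is a rough sketch of one such recursive pattern, using Ruby's \g<name> subexpression call. The group name `parens` and the variable name `balanced` are made up for this example, and the pattern assumes the brackets in the input are balanced.

```ruby
# A word followed by a balanced, possibly nested, pair of brackets.
# \g<parens> re-enters the named group, which makes the pattern recursive.
balanced = /\w+(?<parens>\((?:[^()]|\g<parens>)*\))/

simple = "(book(1:3000))"[balanced]
nested = "book(1:(10*30))"[balanced]

puts simple
puts nested
```

Unlike the lazy `.+?\)` version, this does not stop at the first closing bracket, so the nested case keeps its final bracket.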
I want to use this regex to match any block comment (C-style) in a string.
But why does the one below not match?
rblockcmt = Regexp.new "/\\*[.\s]*?\\*/" # match block comment
p rblockcmt=~"/* 22/Nov - add fee update */"
==> nil
And in addition to what Sir Swoveland posted, a . matches any character except a newline:
The following metacharacters also behave like character classes:
/./ - Any character except a newline.
https://ruby-doc.org/core-2.3.0/Regexp.html
If you need . to match a newline, you can specify the m flag, e.g. /.*?/m
Options
The end delimiter for a regexp can be followed by one or more
single-letter options which control how the pattern can match.
/pat/i - Ignore case
/pat/m - Treat a newline as a character matched by .
...
https://ruby-doc.org/core-2.3.0/Regexp.html
Because having exceptions/quirks like newline not matching a . can be painful, some people specify the m option for every regex they write.
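A small sketch of the difference the m option makes. The sample text is invented for the example.

```ruby
text = "first\nsecond"

without_m = text =~ /first.second/   # . does not match the newline
with_m    = text =~ /first.second/m  # with /m, . matches the newline too

p without_m
p with_m
```

The first call returns nil and the second returns the match position.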
It appears that you intend [.\s]*? to match any character or a whitespace, zero or more times, lazily. Firstly, whitespaces are characters, so you don't need \s. That simplifies your expression to [.]*?. Secondly, if your intent is to match any character there is no need for a character class, just write .. Thirdly, and most importantly, a period within a character class is simply the character ".".
You want .*? (or [^*]*).
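Putting that together, a possible version of the block-comment regex might look like the sketch below. The %r{} literal and the /m option are choices made for this example (they avoid escaping the slashes and let comments span lines), and the multi-line sample string is invented.

```ruby
# /\* and \*/ are the literal comment delimiters; .*? lazily matches the body.
block_comment = %r{/\*.*?\*/}m

single = "/* 22/Nov - add fee update */"
multi  = "int x; /* first line\n   second line */ int y;"

position = block_comment =~ single
body     = multi[block_comment]

p position
p body
```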
I'm trying to make a regex that matches anything except an exact ending string, in this case, the extension '.exe'.
Examples for a file named:
'foo' (no extension) I want to get 'foo'
'foo.bar' I want to get 'foo.bar'
'foo.exe.bar' I want to get 'foo.exe.bar'
'foo.exe1' I want to get 'foo.exe1'
'foo.bar.exe' I want to get 'foo.bar'
'foo.exe' I want to get 'foo'
So far I created the regex /.*\.(?!exe$)[^.]*/
but it doesn't work for cases 1 and 6.
You can use a positive lookahead.
^.+?(?=\.exe$|$)
^ start of string
.+? non greedily match one or more characters...
(?=\.exe$|$) until literal .exe occurs at end. If not, match end.
See demo at Rubular.com
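To check the pattern against the six file names from the question, a quick sketch (the constant name STRIP_EXE is only illustrative):

```ruby
STRIP_EXE = /^.+?(?=\.exe$|$)/

names   = %w[foo foo.bar foo.exe.bar foo.exe1 foo.bar.exe foo.exe]
results = names.map { |name| name[STRIP_EXE] }

names.zip(results).each { |name, stripped| puts "#{name} -> #{stripped}" }
```

Note that ^ and $ are line anchors in Ruby; for input that may contain newlines, \A and \z would be the stricter choice.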
Wouldn't a simple replacement work?
string.sub(/\.exe\z/, "")
Do you mean regex matching or capturing?
There may be a regex only answer, but it currently eludes me. Based on your test data and what you want to match, doing something like the following would cover both what you want to match and capture:
name = 'foo.bar.exe'
# the dot before exe is escaped so that it only matches a literal "."
match = /(.*)\.exe$/.match(name)
if match.nil?
  # no trailing .exe, so the filename is used as is
  print name
else
  # otherwise match[1] is the capture - filename without .exe extension
  print match[1]
end
string pattern = @" (?x) (.* (?= \.exe$ )) | ((?=.*\.exe).*)";
First match is a positive look-ahead that checks if your string
ends with .exe. The condition is not included in the match.
Second match is a positive look-ahead with the condition included in the
match. It only checks if you have something followed by .exe.
(?x) means that whitespace inside the pattern string is ignored.
Or don't use (?x) and just delete all white spaces.
It works for all the 6 scenarios provided.
I want to capture any word between two colons. I tried with this (try on Rubular):
(\:.*\:)
Hello :name:
What are you doing today, :title:?
$:name:, have a lovely :event:.
It works, except that on the last line it captures this:
Match 3
1. :name:, have a lovely :event:
It's getting tripped up by the second (closing) colon and the third (opening) colon. It should capture :name: and :event: individually on that last line.
You need a non-greedy regular expression:
(\:.*?\:)
The .*? will match the shortest possible string, whereas .* matches the longest string found.
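A short sketch of the difference, using the last line from the question:

```ruby
line = "$:name:, have a lovely :event:."

greedy = line.scan(/:.*:/)   # runs from the first colon to the last one
lazy   = line.scan(/:.*?:/)  # stops at the nearest closing colon

p greedy
p lazy
```

The greedy version returns one long match, while the lazy version returns :name: and :event: separately.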
For any word between two colons:
(?<=:)\b.*?\b(?=:)
(\:[^:]*\:)
[^:] means "anything but a ':'".
Please be aware that this expression will match "::" also.
Here is your rubular link updated: http://rubular.com/r/VtwhIqtbli.
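The caveat about "::" can be seen in a short sketch. Swapping * for + is one possible tightening, shown here as an assumption about what is wanted rather than as part of the answer.

```ruby
empty_star = "::".scan(/:[^:]*:/)            # * allows nothing between the colons
empty_plus = "::".scan(/:[^:]+:/)            # + requires at least one character
named      = "Hello :name:".scan(/:[^:]+:/)

p empty_star
p empty_plus
p named
```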
Hey, I'm trying to use a regex to count the number of quotes in a string that are not preceded by a backslash.
For example, the following strings:
"\"Some text
"\"Some \"text
The code I have was previously using String#count('"'), which is obviously not good enough.
When I count the quotes in both of these examples, I need the result to be 1.
I have been searching here for similar questions and I've tried using lookbehinds, but cannot get them to work in Ruby.
I have tried the following regexes on Rubular from this previous question:
/[^\\]"/
^"((?<!\\)[^"]+)"
^"([^"]|(?<!\)\\")"
None of them give me the results I'm after.
Maybe a regex is not the way to do that. Maybe a programmatic approach is the solution.
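How about string.count('"') - string.scan('\"').size? (String#count takes a set of characters rather than a substring, so the escaped quotes are counted with scan instead.)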
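A rough sketch of this subtraction idea on the first sample string from the question. The escaped quotes are counted with scan, which matches the two-character sequence literally; the variable names are illustrative.

```ruby
sample = '"\"Some text'   # a quote, a backslash, a quote, then the text

all_quotes     = sample.count('"')        # every quote character
escaped_quotes = sample.scan('\"').size   # backslash-quote pairs
unescaped      = all_quotes - escaped_quotes

p all_quotes
p escaped_quotes
p unescaped
```

This simple subtraction miscounts when a quote is preceded by an escaped backslash (two backslashes and then a quote), which is the case the regex approach handles.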
result = subject.scan(
  /(?:        # match either
    ^         # start-of-string\/line
  |           # or
    \G        # the position where the previous match ended
  |           # or
    [^\\]     # one non-backslash character
  )           # then
  (?:\\\\)*   # match an even number of backslashes (0 is even, too)
  "           # match a quote
  /x)
gives you an array of all quote characters (possibly with a preceding non-backslash character and pairs of backslashes) except escaped ones. The group around the backslash pairs is non-capturing, because scan would otherwise return that group's captures instead of the matched text.
The \G anchor is needed to match successive quotes, and the (?:\\\\)* makes sure that backslashes are only counted as escaping characters if they occur in odd numbers before the quote (to take Amarghosh's correct caveat into account).
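As a usage sketch, the pattern can be wrapped in a small helper and run against the two sample strings from the question. The constant and method names are hypothetical, and the backslash group is written as non-capturing so that scan returns one entry per match.

```ruby
# Same pattern as above on one line: start of line, \G, or a non-backslash
# character, then an even number of backslashes, then the quote itself.
UNESCAPED_QUOTE = /(?:^|\G|[^\\])(?:\\\\)*"/

def count_unescaped_quotes(text)
  text.scan(UNESCAPED_QUOTE).size
end

first  = count_unescaped_quotes('"\"Some text')
second = count_unescaped_quotes('"\"Some \"text')
# Two backslashes before a quote escape each other, so this quote counts.
third  = count_unescaped_quotes('a\\\\"')

p first
p second
p third
```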