I am building a project in which users should be able to generate links easily by typing: #this is the link#. I am trying to capture the strings between two # symbols with a regex. I have tried:
#.+#
It works perfectly if there is only one link in the user's string, but if there is more than one link, like
#asdfasdf asdf# asdf asfasdfasdf asd fasd fasdf #asdfasdf asdfasdf asdf asdf#
it catches the whole string. But I need them separately, so I can substitute them with tags.
This is called a "greedy" match. By default, a quantifier such as + matches the longest string possible. You can make it non-greedy (lazy) by putting a ? after the quantifier:
/#.+?#/
Demo: http://rubular.com/r/7WWyaUApFt
Use a non-greedy match:
#.+?#
It will catch the individual ones.
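To see the difference between the two patterns and the substitution the question asks for, here is a minimal Ruby sketch. The helper name linkify and the bare <a> tag are assumptions made for illustration; the question does not say what the generated tags look like.

```ruby
# Hypothetical helper: wrap every #text# span in a tag.
def linkify(text)
  # .+? is lazy, so each match ends at the nearest closing '#'.
  text.gsub(/#(.+?)#/) { "<a>#{Regexp.last_match(1)}</a>" }
end

input = '#first link# some text in between #second link#'

p input.scan(/#.+#/)   # greedy: a single match spanning the whole string
p input.scan(/#.+?#/)  # lazy: one match per link
puts linkify(input)
```

The capture group keeps the # delimiters out of the replacement text.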
I'm trying to output all the lines of a file which contain a specific word/pattern even if it contains other characters between its letters.
Let's say we have a bunch of domain names and we want to pick out all those that contain "paypal" inside. I would like to have this kind of output:
pay-pal-secure.com
payppal.net
etc...
I was wondering whether this is possible with grep, or whether there is another tool that can do it.
Many thanks!
Replace paypal with the regexp p.*a.*y.*p.*a.*l to allow any characters between the letters.
Update:
Use the extended regular expression p.{0,2}a.{0,2}y.{0,2}p.{0,2}a.{0,2}l to limit the gap between consecutive letters to zero, one, or two characters.
Example: grep -E 'p.{0,2}a.{0,2}y.{0,2}p.{0,2}a.{0,2}l' file
See: The Stack Overflow Regular Expressions FAQ
Alternatively you could use agrep (approximate grep):
$ agrep -By paypal file
agrep: 2 words match within 1 error
pay-pal-secure.com
payppal.net
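The bounded-gap pattern above can also be generated from the word instead of typed by hand. This is a sketch in Ruby rather than grep; the function name fuzzy_pattern, the max_gap parameter, and the sample domain example.org are assumptions for illustration.

```ruby
# Build a regex that allows up to max_gap arbitrary characters
# between consecutive letters of word.
def fuzzy_pattern(word, max_gap = 2)
  gap = ".{0,#{max_gap}}"
  Regexp.new(word.chars.map { |c| Regexp.escape(c) }.join(gap))
end

domains = %w[pay-pal-secure.com payppal.net example.org]

# Array#grep keeps the elements that match the pattern.
puts domains.grep(fuzzy_pattern('paypal'))
```

For 'paypal' with the default gap this produces the same expression as the grep -E example.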
I'm trying to make a regex that matches anything except an exact ending string, in this case, the extension '.exe'.
Examples for a file named:
'foo' (no extension) I want to get 'foo'
'foo.bar' I want to get 'foo.bar'
'foo.exe.bar' I want to get 'foo.exe.bar'
'foo.exe1' I want to get 'foo.exe1'
'foo.bar.exe' I want to get 'foo.bar'
'foo.exe' I want to get 'foo'
So far I created the regex /.*\.(?!exe$)[^.]*/
but it doesn't work for cases 1 and 6.
You can use a positive lookahead.
^.+?(?=\.exe$|$)
^ start of string
.+? non greedily match one or more characters...
(?=\.exe$|$) until literal .exe occurs at end. If not, match end.
See demo at Rubular.com
Wouldn't a simple replacement work?
string.sub(/\.exe\z/, "")
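Both suggestions above can be checked against the six file names from the question. The sketch below is only a test harness; the constant and method names are made up for this example.

```ruby
LOOKAHEAD = /^.+?(?=\.exe$|$)/

# Expected results are taken from the question.
CASES = {
  'foo'         => 'foo',
  'foo.bar'     => 'foo.bar',
  'foo.exe.bar' => 'foo.exe.bar',
  'foo.exe1'    => 'foo.exe1',
  'foo.bar.exe' => 'foo.bar',
  'foo.exe'     => 'foo'
}.freeze

# String#[] with a regex returns the matched text (or nil).
def strip_with_lookahead(name)
  name[LOOKAHEAD]
end

# Plain replacement: remove a trailing ".exe" if there is one.
def strip_with_sub(name)
  name.sub(/\.exe\z/, '')
end

CASES.each do |name, expected|
  ok = strip_with_lookahead(name) == expected && strip_with_sub(name) == expected
  puts "#{name.ljust(12)} -> #{strip_with_lookahead(name)} (#{ok ? 'ok' : 'MISMATCH'})"
end
```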
Do you mean regex matching or capturing?
There may be a regex only answer, but it currently eludes me. Based on your test data and what you want to match, doing something like the following would cover both what you want to match and capture:
name = 'foo.bar.exe'
# Escape the dot so that only a literal ".exe" at the end counts.
match = /(.*)\.exe$/.match(name)
if match.nil?
  # no .exe extension: the filename already matches your conditions
  print name
else
  # otherwise match[1] is the capture: the filename without the .exe extension
  print match[1]
end
string pattern = @" (?x) (.* (?= \.exe$ )) | ((?=.*\.exe).*)";
The first alternative uses a positive look-ahead that checks whether your string
ends with .exe. The .exe itself is not included in the match.
The second alternative uses a positive look-ahead whose condition is included in the
match. It only checks that .exe occurs somewhere in the string.
(?x) means that whitespace inside the pattern string is ignored.
Alternatively, drop (?x) and delete all the whitespace.
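It handles the four scenarios that contain .exe. For 'foo' and 'foo.bar' neither alternative matches, so those names have to be used unchanged when no match is found.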
I need to clear all characters but numbers and dots in a file.
The numbers are formatted as follows:
$(24.50)
I'm using the following code to accomplish the task:
sed 's/[^0-9]*//'
It works, but the last parenthesis is not removed. After running the code I get:
24.50)
I should get:
24.50
Please help
I think you could use the following:
sed 's/[^0-9.]//g'
Your regular expression only replaces the first match of [^0-9]*, namely the $( at the beginning. To get sed to replace every match, you need to put a g at the end, as in:
sed 's/[^0-9.]*//g'
The g flag means "replace every match on the line". By default, sed replaces only the first match it encounters on each line and then stops.
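The same first-match versus every-match distinction exists in Ruby as sub versus gsub, which makes the behaviour easy to reproduce outside sed. This is an illustrative sketch and not part of the original answers; the method names are made up.

```ruby
# sub replaces only the first match, like sed's s/// without the g flag.
def replace_first(text)
  text.sub(/[^0-9]*/, '')
end

# gsub replaces every match, like sed's s///g.
def replace_all(text)
  text.gsub(/[^0-9.]/, '')
end

puts replace_first('$(24.50)')  # the trailing ")" survives
puts replace_all('$(24.50)')    # only digits and the dot remain
```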
I need a regex so that it captures these:
/articles/123
/articles/123/something
/articles/123/something_else (doesn't matter what)
and doesn't capture these:
/articles
/articles/
/articles/123a
where 123 is an integer
I tried this:
%r{^/articles/\d+} and %r{^/articles/\d+/*} but they also captured /articles/1a
Use this:
^\/articles\/\d+(\/.*|$)
The group (\/.*|$) requires the digits to be followed either by a slash and anything after it, or by the end of the string.
Here's a regex101 to play around with: https://regex101.com/r/dA1wD4/1
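A quick Ruby check of that pattern against the paths from the question. The sketch uses %r{} so the slashes need no escaping; the constant and method names are assumptions for this example.

```ruby
# Same pattern as above, written with %r{} to avoid escaping "/".
ARTICLE_PATH = %r{^/articles/\d+(/.*|$)}

def article_path?(path)
  ARTICLE_PATH.match?(path)
end

should_match = %w[/articles/123 /articles/123/something /articles/123/something_else]
should_fail  = %w[/articles /articles/ /articles/123a /articles/1a]

should_match.each { |path| puts "#{path}: #{article_path?(path)}" }
should_fail.each  { |path| puts "#{path}: #{article_path?(path)}" }
```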
I'm trying to figure out how I can match any .css filename that does not start with an underscore, optionally preceded by any path. I found a good starting point in this question on Stack Overflow (ActiveAdmin assets precompile error):
[/^[^_]\w+\.(css|css.scss)$/]
However, this regex only matches a bare filename such as filename.css. I'd like a regex that also matches when any path precedes a filename that has no leading underscore. The following strings should match:
mystyle.css
application.css.scss
/assets/stylesheets/application.css
but the following strings should not match:
_mystyle.css
_application.css.scss
/assets/stylesheets/_application.css
Any help would be appreciated.
Something like this should work:
/(.+\/|^)[a-z0-9\.]+\.s?css$/
Not all Ruby versions support it, but you could also try a negative lookahead:
/.+\/?(?!_)\w+\.s?css$/
http://ruby-doc.org/core-2.0/Regexp.html#label-Anchors
This should work:
/^(.*?\/)?[^_]\w*\.(css|css\.scss)$/
Explanation:
(.*?\/)? # Means it accepts any characters upfront, ending with a slash,
# then the filename. The ? makes it optional.
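The pattern with the optional path prefix can be run against the six strings from the question. This is a small Ruby sketch; the constant and method names are made up for the example.

```ruby
CSS_FILE = /^(.*?\/)?[^_]\w*\.(css|css\.scss)$/

def css_file?(path)
  CSS_FILE.match?(path)
end

wanted   = %w[mystyle.css application.css.scss /assets/stylesheets/application.css]
unwanted = %w[_mystyle.css _application.css.scss /assets/stylesheets/_application.css]

wanted.each   { |path| puts "#{path}: #{css_file?(path)}" }
unwanted.each { |path| puts "#{path}: #{css_file?(path)}" }
```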
Assuming that the input is always a file path, I would prefer to do it like this:
File.basename(file_path).match(/_.*(css|scss|sass)/)
Regexps are hard to read, so to keep your code readable it is a good idea to use as few of them as necessary.
Also, if you are doing other matches, you might want to extract the css file extensions into a separate regexp, like
css_exts_regexp = /(css|scss|sass)/
File.basename(file_path).match(/_.*#{css_exts_regexp}/)
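One way to take the basename idea a step further is to test the leading underscore with String#start_with? and keep the regex for the extension only. This is a sketch under assumptions, not the answer's exact code: the names CSS_EXTS and public_stylesheet? are hypothetical, and the extension check is anchored to the end of the name.

```ruby
# Extension check only; \z anchors it to the very end of the name.
CSS_EXTS = /\.(css|scss|sass)\z/

# True for stylesheet paths whose file name has no leading underscore.
def public_stylesheet?(file_path)
  name = File.basename(file_path)
  !name.start_with?('_') && name.match?(CSS_EXTS)
end

puts public_stylesheet?('/assets/stylesheets/application.css')
puts public_stylesheet?('/assets/stylesheets/_application.css')
```

Splitting the check this way keeps the path handling in File.basename and leaves the regex with a single job.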