Regexp, how to limit a match

I have a string:
string = %q{<span class="no">2503</span>read_attribute_before_type_cast(<span class="pc">self</span>.class.primary_key)}
In this example I want to match the occurrences of the word 'class' that are not inside a tag. My regexp for this:
/\bclass[^=]/
But the problem is that it also matches the character after the word:
/\bclass[^=]/.match(string) => 'class.'
I don't want the trailing dot in the result. I've tried this regexp:
/\bclass(?:[^=])/
but still got the same result. How can I limit the result to 'class'? Thanks

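You are almost correct, but (?:...) is a non-capturing group, not a lookahead, so it still consumes the character after 'class'. Try a negative lookahead instead:
/\bclass(?!=)/
The regex term (?!=) means the input immediately to the right must not be the character '='. A lookahead only asserts; it adds nothing to the match.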
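To make the difference concrete, here is a minimal sketch using the string from the question. The constant and method names are only illustrative and are not part of the original post.

```ruby
# Sample string taken from the question.
SAMPLE_HTML = %q{<span class="no">2503</span>read_attribute_before_type_cast(<span class="pc">self</span>.class.primary_key)}

# A character class consumes one character, so the dot ends up in the match.
def match_with_char_class(text)
  text[/\bclass[^=]/]
end

# A negative lookahead only asserts; nothing after "class" is consumed.
def match_with_lookahead(text)
  text[/\bclass(?!=)/]
end

puts match_with_char_class(SAMPLE_HTML).inspect # => "class."
puts match_with_lookahead(SAMPLE_HTML).inspect  # => "class"
```

Both patterns skip the two class= attributes; they differ only in whether the character after the word becomes part of the result.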

You can take your variable string and extract a subsection using groups:
substring = string[/\b(class)[^=]/, 1]
The parentheses around class make it the first capture group, which is what the 1 passed as the second argument inside the square brackets refers to.

Assuming your only issue is keeping it from matching span.class.blah, just ignore . as well, so [^=.].

Related

Split a string and remove the first element in string

Original string '4.0.0-4.0-M-672092'
How can I turn the original string into "4.0-M-672092" with one line of code?
Any help is highly appreciated.
Thanks and regards

The 'split' method works in this case
https://apidock.com/ruby/String/split
'4.0.0-4.0-M-672092'.split('-')[1..-1].join('-')
# => "4.0-M-672092"
Be careful: this is fine here, but on long strings it does unnecessary work, since it splits the whole string and then joins the array back together.
If you need something more efficient for longer texts, you can find the index of the first "-" (your split point) and take the substring that starts at the next position:
text = '4.0.0-4.0-M-672092'
text[(text.index('-') + 1)..-1]
# => "4.0-M-672092"
This takes more than one line, and if the separator is not found, index returns nil and nil + 1 raises NoMethodError, so guard against that case (or rescue the error) if it can happen.
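A minimal sketch of the index-based approach with the missing-separator case handled explicitly. The method name drop_first_segment and its default separator are hypothetical, chosen only for illustration.

```ruby
# Returns everything after the first occurrence of `separator`,
# or nil when the separator is absent (instead of raising NoMethodError).
def drop_first_segment(text, separator = '-')
  position = text.index(separator)
  return nil if position.nil?

  text[(position + separator.length)..-1]
end

puts drop_first_segment('4.0.0-4.0-M-672092').inspect # => "4.0-M-672092"
puts drop_first_segment('no separator').inspect       # => nil
```

Returning nil keeps the failure visible to the caller without relying on rescue; whether nil or the original string is the better fallback depends on the application.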

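Simplest way (Array#second comes from ActiveSupport, so this works in Rails; in plain Ruby use [1] or .last instead):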
'4.0.0-4.0-M-672092'.split('-', 2).second

"4.0.0-4.0-M-672092"[/(?<=-).*/]
#=> "4.0-M-672092"
The regular expression reads, "Match zero or more characters other than newlines, as many as possible (.*), provided the match is preceded by a hyphen." (?<=-) is a positive lookbehind. See String#[].

How to regex the strings in an url

http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6
I have tried to get the values bOhxBeD, SyhyTGi and so on. This is what I came up with (yes, fairly simple): /([a-zA-Z0-9]{7})/. It seems to work with PCRE:
([a-zA-Z0-9]{7})
But when it comes to Ruby, I use it like this :
str.match(/([a-zA-Z0-9]{7})/)
#<MatchData "bOhxBeD" 1:"bOhxBeD">
it doesn't seem to work. Can anyone point out what's wrong with this regex ? Thanks

You need to add the word boundary \b in order to match runs of exactly 7 alphanumeric characters.
\b[a-zA-Z0-9]{7}\b
irb(main):006:0> "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6".scan(/\b([a-zA-Z0-9]{7})\b/)
=> [["bOhxBeD"], ["SyhyTGi"], ["TMDDSIB"], ["U72gx2J"], ["kQTIRy9"], ["7VXgGDw"], ["eSxIcK6"], ["S5oNlnn"], ["WBHHsLk"], ["BdMGd2d"], ["U9kNlsF"], ["cHVyc7Y"], ["D83kaJ5"], ["cLWgdSO"], ["iWtCIF3"], ["ount8L6"]]

(?!.*?\/)[a-zA-Z0-9]{7}
It should be this. Otherwise it will also pick up 7-character runs from the rest of the link, so "somethi" would be in the answer. But I guess that is not wanted.

match only returns the first match.
You can use scan, the global version of match, instead.
You can also use scan to collect the runs that do not contain specific characters, using [^...]:
str.scan(/[^\/\.\,]+/)[3..-1]
#=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw", "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y", "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]
Update:
If you know that the strings between the comma are always 7 characters, you can use this instead:
str.scan(/[^\/\.\,]{7}/)[1..-1]
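The difference between match and scan can be sketched as follows, assuming a shortened version of the URL from the question. SAMPLE_URL and the two method names are made up for this illustration.

```ruby
# Shortened version of the URL from the question.
SAMPLE_URL = 'http://something.com/bOhxBeD,SyhyTGi,TMDDSIB'

# String#match stops at the first match and returns MatchData (or nil).
def first_token(url)
  found = url.match(/\b[a-zA-Z0-9]{7}\b/)
  found && found[0]
end

# String#scan returns every non-overlapping match; without capture groups
# the result is a flat array of strings.
def all_tokens(url)
  url.scan(/\b[a-zA-Z0-9]{7}\b/)
end

puts first_token(SAMPLE_URL).inspect # => "bOhxBeD"
puts all_tokens(SAMPLE_URL).inspect  # => ["bOhxBeD", "SyhyTGi", "TMDDSIB"]
```

Leaving out the parentheses avoids the nested arrays that scan produces when the pattern contains a capture group.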

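This happens because match returns only the first run of 7 characters and nothing more.
A simple solution could be:
str.match(/\/([^\/]*)\z/)[1].split(',')
The capture uses [^\/]* rather than .* so that the match starts at the last slash and not at the first one in "http://".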

You could use String#[] and String#split:
str[/.*\/(.*)/,1].split(',')
#=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw",
# "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y",
# "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]
.*\/ in the regex, "greedy" as it is, will consume characters up to and including the last forward slash in the string. Capture group #1 (.*) sucks up the remainder of the string and, due to the presence of ,1, returns it. split(',') then breaks up the string to give you the desired array.
Another way:
str[str[/.*\//].size..-1].split(',')
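To see why the greedy .*\/ matters in the first version, the sketch below compares it with a lazy .*?\/ on a shortened URL. The method names are hypothetical and only serve the comparison.

```ruby
# Greedy prefix: .* runs to the end and backtracks to the LAST slash.
def after_last_slash(url)
  url[/.*\/(.*)/, 1]
end

# Lazy prefix: .*? stops at the FIRST slash, so the capture keeps the host.
def after_first_slash(url)
  url[/.*?\/(.*)/, 1]
end

url = 'http://something.com/bOhxBeD,SyhyTGi'
puts after_last_slash(url).split(',').inspect # => ["bOhxBeD", "SyhyTGi"]
puts after_first_slash(url).inspect           # => "/something.com/bOhxBeD,SyhyTGi"
```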

String gsub - Replace characters between two elements, but leave surrounding elements

Suppose I have the following string:
mystring = "start/abc123/end"
How can I replace the abc123 with something else, while leaving the "start/" and "/end" elements intact?
I had the following to match for the pattern, but it replaces the entire string. I was hoping to just have it replace the abc123 with 123abc.
mystring.gsub(/start\/(.*)\/end/,"123abc") #=> "123abc"
Edit: The characters between the start & end elements can be any combination of alphanumeric characters, I changed my example to reflect this.

You can do it using this character class : [^\/] (all that is not a slash) and lookarounds
mystring.gsub(/(?<=start\/)[^\/]+(?=\/end)/,"7")

For your example, you could perhaps use:
mystring.gsub(/\/(.*?)\//,"/7/")
This matches the two slashes around the text you're replacing and puts them back in the substitution.

Alternatively, you could capture the pieces of the string you want to keep and interpolate them around your replacement. This turns out to be much more readable than lookaheads/lookbehinds:
irb(main):010:0> mystring.gsub(/(start)\/.*\/(end)/, "\\1/7/\\2")
=> "start/7/end"
\\1 and \\2 here refer to the numbered captures inside of your regular expression.
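The same idea can also be written with the block form of gsub, which sidesteps backslash escaping in the replacement string. This is a sketch under the assumption that the text between the markers contains no slash; replace_middle is a hypothetical helper name.

```ruby
# Rebuilds the match from the two captured delimiters and a new middle part.
# Inside the block, $1 and $2 hold the first and second capture groups.
def replace_middle(text, replacement)
  text.gsub(%r{(start/)[^/]+(/end)}) { "#{$1}#{replacement}#{$2}" }
end

puts replace_middle('start/abc123/end', '123abc') # => start/123abc/end
```

Because the block's return value is inserted literally, a replacement that happens to contain something like \1 is not reinterpreted as a backreference.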

The problem is that you're replacing the entire matched string, "start/8/end", with "7". You need to include the matched characters you want to persist:
mystring.gsub(/start\/(.*)\/end/, "start/7/end")
Alternatively, just match the digits:
mystring.gsub(/\d+/, "7")

You can do this by grouping the start and end elements in the regular expression and then referring to these groups in the substitution string (named groups are referenced as \k<name>):
mystring.gsub(/(?<start>start\/).*(?<end>\/end)/, "\\k<start>7\\k<end>")
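As a small runnable sketch of named groups in a replacement string: Ruby's gsub refers to a named capture with \k<name>. The group names head and tail and the method name are illustrative choices, not taken from the original answer.

```ruby
# Keeps the named captures and swaps only what lies between them.
# In a double-quoted string the backslash must be doubled: "\\k<head>".
def replace_between(text, replacement)
  text.gsub(/(?<head>start\/).*(?<tail>\/end)/, "\\k<head>#{replacement}\\k<tail>")
end

puts replace_between('start/abc123/end', '7') # => start/7/end
```

Note that a replacement containing backslashes would itself be interpreted by gsub here; the block form avoids that if it is a concern.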

Ruby regular expression

Apparently I still don't understand exactly how it works ...
Here is my problem: I'm trying to match numbers in strings such as:
910 -6.258000 6.290
That string should give me an array like this:
[910, -6.2580000, 6.290]
while the string
blabla9999 some more text 1.1
should not be matched.
The regex I'm trying to use is
/([-]?\d+[.]?\d+)/
but it doesn't do exactly that. Could someone help me?
It would be great if the answer could clarify the use of the parentheses in the matching.

Here's a pattern that works:
/^[^\d]+?\d+[^\d]+?\d+[\.]?\d+$/
Note that [^\d]+ means at least one non digit character.
On second thought, here's a more generic solution that avoids one complex regular expression: replace every run of characters that cannot be part of a number with a space, then split and convert:
str.gsub(/[^\d.-]+/, " ").split.collect{|d| d.to_f}
Example:
str = "blabla9999 some more text -1.1"
Parsed:
[9999.0, -1.1]

The two kinds of brackets have different meanings.
[] defines a character class: exactly one character is matched, and it must be a member of this class.
() defines a capturing group: the string matched by the part inside the parentheses is stored so it can be retrieved later.
You did not define any anchors, so your pattern will match your second string
blabla9999 some more text 1.1
      ^^^^ here           ^^^ and here
Maybe this is more what you wanted
^(\s*-?\d+(?:\.\d+)?\s*)+$
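^ anchors the pattern to the start of the string and $ to the end. In Ruby these two match at the start and end of each line; use \A and \z to anchor to the whole string.
The group allows whitespace (\s*) before and after each number, and an optional fractional part (?:\.\d+)?. The + after the group means this kind of pattern must match at least once.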
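Anchored validation followed by conversion could be sketched like this. It is a variation on the pattern above, not the same regex: it requires whitespace between numbers, which keeps the repetition unambiguous. NUMBER, NUMBER_LINE and parse_numbers are hypothetical names.

```ruby
# One optionally negative number with an optional fractional part.
NUMBER = /-?\d+(?:\.\d+)?/

# The whole string must consist of whitespace-separated numbers.
NUMBER_LINE = /\A\s*#{NUMBER}(?:\s+#{NUMBER})*\s*\z/

# Returns the parsed numbers, or nil when the line contains anything else.
def parse_numbers(line)
  return nil unless line.match?(NUMBER_LINE)

  line.split.map { |token| token.include?('.') ? token.to_f : token.to_i }
end

puts parse_numbers('910 -6.258000 6.290').inspect           # => [910, -6.258, 6.29]
puts parse_numbers('blabla9999 some more text 1.1').inspect # => nil
```

Validating the whole line first and only then splitting keeps each step simple, at the cost of reading the string twice.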

maybe /(-?\d+(\.\d+)?)+/
irb(main):010:0> "910 -6.258000 6.290".scan(/(\-?\d+(\.\d+)?)+/).map{|x| x[0]}
=> ["910", "-6.258000", "6.290"]

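str = " 910 -6.258000 6.290"
str.scan(/-?\d+(?:\.\d+)?/).map(&:to_f)
# => [910.0, -6.258, 6.29]
The fractional part is written as the optional group (?:\.\d+)? because \d+\.?\d+ needs at least two digits and would skip a single-digit number such as 5.
If you don't want integers to be converted to floats, try this:
str = " 910 -6.258000 6.290"
str.scan(/-?\d+(?:\.\d+)?/).map do |ns|
  ns[/\./] ? ns.to_f : ns.to_i
end
# => [910, -6.258, 6.29]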

Rails 3 + regex - Replace part of a string, 1 occurrence

I'm new to Rails, and furthermore to regex. Been looking around, but I'm blocked...
I have a string like this :
Current: http://zs.domain.com/user_images/123456789/imageName_size.ext
Wanted: http://zs.domain.com/user_images/123456789/imageName.ext
I've managed to get to this :
http://a0.twimg.com/profile/1240267050/logo1.png
=> losing all occurrences with
picture.gsub!(/_([a-z0-9-]+)/, '')
or this :
http://a0.twimg.com/profile_images/1240267050/logo1
=> changing only the last occurrence, but losing the extension with
picture.gsub!(/_([a-z0-9-]+).(png|gif|jpg|jpeg)/, '')

You're almost there. The second parameter is the string with which the match will be replaced, and you can re-use matched groups from the match. This will do the trick:
picture.gsub!(/_([a-z0-9-]+).(png|gif|jpg|jpeg)/, '.\2')
To accommodate the additional conditions posed in the comment:
picture.gsub!(/_([^\/]+).(png|gif|jpg|jpeg)/, '.\2')

markijbema's answer will change the string
.../xxx_yyygifzzz/...
into
.../xxx.gifzzz/...
In order to avoid that, you can do this:
picture.gsub!(/_[^\/]+(?=\.[^\.]+\z)/, '')
(?=...) is a lookahead: it describes the context that must follow the match, and it is not included in the match.
\z matches the end of the string, so this regexp is safe to use even when some intermediate directory includes a string like the one above.
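A minimal sketch of this lookahead approach, wrapped in a hypothetical helper strip_size_suffix and using the non-destructive sub so that an unchanged URL is returned as-is rather than nil (which gsub! would return when nothing matches). The .png example URLs are made up.

```ruby
# Removes a trailing "_something" that sits directly before the final
# extension. The lookahead (?=\.[^\.]+\z) checks for ".ext" at the very end
# of the string without consuming it, so the extension is kept.
def strip_size_suffix(url)
  url.sub(/_[^\/]+(?=\.[^\.]+\z)/, '')
end

puts strip_size_suffix('http://zs.domain.com/user_images/123456789/imageName_size.ext')
# => http://zs.domain.com/user_images/123456789/imageName.ext
puts strip_size_suffix('http://zs.domain.com/xxx_yyygifzzz/logo_big.png')
# => http://zs.domain.com/xxx_yyygifzzz/logo.png
```

The underscore in user_images and in the intermediate directory is left alone because no extension followed by the end of the string comes right after it.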
