Regexp in Ruby does not give the expected result - ruby

I have some strings like 2015 - THIS Test and 2015 - THAT Test.
I want to have the part THIS Test or THAT Test so I tried this:
"2015 - THIS Test"[/((THIS|THAT)\s\.*)/]
But that only gives me THIS or THAT.
Why does it cut the rest?
How to get the desired substring correctly?
I don't want to rely on just cutting the first 7 characters.

You escaped the dot, so it lost its meaning of "any character but a newline" and now denotes a literal . symbol. \.* matches zero or more literal dots.
Remove the \:
puts "2015 - THIS Test"[/((THIS|THAT)\s.*)/]
puts "2015 - THAT Test"[/((THIS|THAT)\s.*)/]
Result:
THIS Test
THAT Test
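To make the difference between \. and . concrete, here is a minimal sketch. The second sample string (with literal dots) and the constant names are made up for illustration; they are not from the question.

```ruby
# Hypothetical sample strings; DOTTED contains literal dots after the space.
PLAIN  = "2015 - THIS Test"
DOTTED = "2015 - THIS ...Test"

ESCAPED_DOT   = /(?:THIS|THAT)\s\.*/  # \. is a literal dot
UNESCAPED_DOT = /(?:THIS|THAT)\s.*/   # .  is any character except a newline

p PLAIN[ESCAPED_DOT]     # "THIS " (no dots follow the space, so \.* matches nothing)
p DOTTED[ESCAPED_DOT]    # "THIS ..."
p PLAIN[UNESCAPED_DOT]   # "THIS Test"
p DOTTED[UNESCAPED_DOT]  # "THIS ...Test"
```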

Related

Regex test if value is valid format

I have a task where I need to check if a value is properly quoted CSV column:
cases:
no quotation - OK
"with quotation" - OK
"opening quote - Not Good
improper"quote" - Not Good
closing quote" - Not Good
CSV flags an error like below:
Illegal quoting in line 5. (CSV::MalformedCSVError)
Question: How can I get this working using a single regex? I need to flag an error for cases 3-5.
And if you have any idea what else should be checked to decide whether a CSV value is valid, please say so.
EDIT: I have added 2 scenarios/cases below:
"quote "inside quotes" - Not Good
"quotes ""inside quotes" - Not Good
EDIT: added 1 more case:
"" - OK
Without considering escaped quotes:
/^("[^"]*"|[^"]+)$/m
It means:
beginning of line
1 quote + anything except quote + 1 quote, or
anything except quote (at least one character)
end of the line
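As a quick check, the pattern can be run against the cases listed in the question. This is only a sketch: the names FIELD_PATTERN, CASES and well_quoted? are invented here, and each value is assumed to be a single line (for values containing newlines, \A and \z would probably be needed instead of ^ and $).

```ruby
# Pattern copied from the answer; the /m flag is kept as written.
FIELD_PATTERN = /^("[^"]*"|[^"]+)$/m

# The cases from the question, in order.
CASES = [
  'no quotation',
  '"with quotation"',
  '"opening quote',
  'improper"quote"',
  'closing quote"',
  '"quote "inside quotes"',
  '"quotes ""inside quotes"',
  '""'
]

# Hypothetical helper: true when the whole value fits the pattern.
def well_quoted?(value)
  !!(value =~ FIELD_PATTERN)
end

CASES.each do |value|
  puts "#{value.ljust(26)} #{well_quoted?(value) ? 'OK' : 'Not Good'}"
end
```

Only the first two cases and the empty quoted value "" come out as OK.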
^"{1}.+"{1}$|^[^"]*$
This matches all lines either starting and ending with one quotation mark, or lines not including quotation marks at all.

grep wildcards issue ubuntu

I have an input file named test which looks like this
leonid sergeevich vinogradov
ilya alexandrovich svintsov
and when I use grep like this grep 'leonid*vinogradov' test it says nothing, but when I type grep 'leonid.*vinogradov' test it gives me the first string. What's the difference between * and .*? Because I see no difference between any number of any characters and any character followed by any number of any characters.
I use ubuntu 14.04.3.
* doesn't match any number of characters, as it does in a file glob. It is an operator that indicates 0 or more matches of the previous character. The regular expression leonid*vinogradov would require a v to appear immediately after 0 or more ds. The . is the regular expression metacharacter representing any single character, so .* matches 0 or more arbitrary characters.
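The same behaviour can be reproduced outside grep. The sketch below uses Ruby's regex engine rather than grep itself, on the assumption that for these two simple patterns both behave the same; the constant names are made up for illustration.

```ruby
# The two lines of the input file from the question.
LINES = [
  "leonid sergeevich vinogradov",
  "ilya alexandrovich svintsov"
]

STAR_ONLY = /leonid*vinogradov/   # "leoni", then 0 or more "d", then "vinogradov"
DOT_STAR  = /leonid.*vinogradov/  # "leonid", then 0 or more of any character, then "vinogradov"

p LINES.grep(STAR_ONLY)  # []
p LINES.grep(DOT_STAR)   # ["leonid sergeevich vinogradov"]

# Strings that the star-only pattern does match:
p("leonivinogradov" =~ STAR_ONLY)     # 0 (zero "d" characters)
p("leonidddvinogradov" =~ STAR_ONLY)  # 0 (three "d" characters)
```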
grep uses regex, and .* matches 0 or more of any character.
Whereas 'leonid*vinogradov' is also evaluated as a regex, and it means leoni followed by 0 or more of the letter d, hence your match fails.
grep uses regular expressions (regexp for short), not the wildcards you had in mind. In this case, "." means any character and "*" means any number (including zero) of the previous character, so ".*" means anything here.
Google it; it's a powerful tool that you'll find worth knowing.

repeating regex to match mathematical symbol then number fails

I am trying to match mathematical expressions like 1+2 and 1*2/3.... to infinity. Can someone explain why my regex matches the final case below, and how to fix it so that it matches only valid expressions (that might stretch forever)?
perms=["12+2*4","2+2","-2+","12+34-"]
perms.each do |line|
puts "#{line}=#{eval(line)}" if line =~ /^\d+([+-\/*]\d+){1,}/
end
I expected the output to be:
12+2*4=20
2+2=4
Inside a [character set], the - character defines a range of characters -- think of [a-z] or [0-9]. If you want to match a literal -, it must be the first or last character.
/^\d+(?:[+\/*-]\d+)+$/
Other things: {1,} is exactly +; and you need to anchor at the end too, so you don't match 1+2+
You should finalize your expression with $ to match the entire input string:
/^\d+([-+\/*]\d+){1,}$/
The wrong position of the hyphen - is one source of error in your expression. The missing $ is the other.
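Both problems can be seen side by side in a short sketch. The constant names are invented for illustration, and eval is left out on purpose so that only the matching is shown.

```ruby
PERMS = ["12+2*4", "2+2", "-2+", "12+34-"]

ORIGINAL = /^\d+([+-\/*]\d+){1,}/   # +-\/ is a range (+ , - . /), and there is no end anchor
FIXED    = /^\d+(?:[+\/*-]\d+)+$/   # literal hyphen placed last, anchored at the end

p PERMS.select { |line| line =~ ORIGINAL }  # ["12+2*4", "2+2", "12+34-"]
p PERMS.select { |line| line =~ FIXED }     # ["12+2*4", "2+2"]

# The accidental range also lets a comma through as an "operator":
p("1,2" =~ ORIGINAL)  # 0
p("1,2" =~ FIXED)     # nil
```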

Ruby capture words between two colons

I want to capture any word between two colons. I tried with this (try on Rubular):
(\:.*\:)
Hello :name:
What are you doing today, :title:?
$:name:, have a lovely :event:.
It works, except that on the last line it captures this:
Match 3
1. :name:, have a lovely :event:
It's getting tripped up by the second (closing) colon and the third (opening) colon. It should capture :name: and :event: individually on that last line.
You need a non-greedy regular expression:
(\:.*?\:)
The .*? will match the shortest possible string, whereas .* matches the longest string found.
For any word between two colons:
(?<=:)\b.*?\b(?=:)
(\:[^:]*\:)
[^:] means "anything but a ':'".
Please be aware that this expression will match "::" also.
Here is your rubular link updated: http://rubular.com/r/VtwhIqtbli.
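A small sketch comparing the greedy, non-greedy and negated-class patterns on the last line of the question. The capture group is dropped so that scan returns plain strings, and the constant names are made up for illustration.

```ruby
LINE = "$:name:, have a lovely :event:."

GREEDY  = /:.*:/     # longest possible stretch between colons
LAZY    = /:.*?:/    # shortest possible stretch between colons
NEGATED = /:[^:]*:/  # a colon, any non-colon characters, a colon

p LINE.scan(GREEDY)   # [":name:, have a lovely :event:"]
p LINE.scan(LAZY)     # [":name:", ":event:"]
p LINE.scan(NEGATED)  # [":name:", ":event:"]

# As noted above, the negated class also accepts an empty pair of colons:
p "a::b".scan(NEGATED)  # ["::"]
```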

count quotes in a string that do not have a backslash before them

Hey I'm trying to use a regex to count the number of quotes in a string that are not preceded by a backslash..
for example the following string:
"\"Some text
"\"Some \"text
The code I have was previously using String#count('"'),
which is obviously not good enough.
When I count the quotes in both of these examples, I need the result to be only 1.
I have been searching here for similar questions and I've tried using lookbehinds, but cannot get them to work in Ruby.
I have tried the following regexes on Rubular from this previous question:
/[^\\]"/
^"((?<!\\)[^"]+)"
^"([^"]|(?<!\)\\")"
None of them give me the results I'm after.
Maybe a regex is not the way to do that. Maybe a programmatic approach is the solution.
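How about string.count('"') - string.scan('\"').size? (String#count treats its argument as a set of characters, so it cannot count the two-character sequence \" by itself.)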
result = subject.scan(
  /(?:        # match either
    ^         # start-of-string\/line
    |         # or
    \G        # the position where the previous match ended
    |         # or
    [^\\]     # one non-backslash character
  )           # then
  (?:\\\\)*   # match an even number of backslashes (0 is even, too)
  "           # match a quote
  /x)
gives you one array entry per quote character (each possibly with a preceding non-backslash character), except escaped ones, so result.size is the count you are after.
The \G anchor is needed to match successive quotes, and the (\\\\)* makes sure that backslashes are only counted as escaping characters if they occur in odd numbers before the quote (to take Amarghosh's correct caveat into account).
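The approach can be tried on the two strings from the question. This is a sketch only: it uses a compact form of the pattern with non-capturing groups, the helper name count_unescaped_quotes is invented here, and the third sample string is an extra made-up case for the even-backslash rule.

```ruby
# Compact, non-capturing form of the pattern described above.
UNESCAPED_QUOTE = /(?:^|\G|[^\\])(?:\\\\)*"/

# Hypothetical helper: number of quotes not escaped by a backslash.
def count_unescaped_quotes(text)
  text.scan(UNESCAPED_QUOTE).size
end

# Single-quoted literals keep \" as a backslash followed by a quote.
p count_unescaped_quotes('"\"Some text')    # 1
p count_unescaped_quotes('"\"Some \"text')  # 1

# Two backslashes before a quote escape each other, so that quote counts.
p count_unescaped_quotes('say \\\\"hi"')    # 2
```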
