Limit number of '1' in a string using Regexp - ruby

I am trying to write a Regexp that matches an expression containing two or more '1's.
Here is what I have written so far:
puts "Match." if /(1){1,5}/ =~ test_string
This correctly matches strings having two or more '1's, but it still matches when the number of occurrences of '1' is greater than 5.
How can I correct this Regexp to only match strings having 1 to 5 occurrences of 1?

There are possibly better versions, but this seems to do the trick:
/^([^1]*1){1,5}[^1]*$/
Broken down:
^ - Start of a line (in Ruby, \A anchors at the start of the string)
[^1]* - Zero or more non-1 characters
1 - A '1'.
([^1]*1){1,5} - This pattern occurring between one and five times.
[^1]* - Zero or more non-1 characters
$ - End of a line (in Ruby, \z anchors at the end of the string)
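As a rough check of the breakdown above, here is a small sketch that runs the same pattern with string anchors (\A and \z) instead of line anchors. The constant name and the sample strings are made up for illustration, and match? assumes Ruby 2.4 or later.

```ruby
# Sketch: between one and five '1' characters anywhere in the whole string.
# \A and \z anchor to the string; ^ and $ anchor to individual lines in Ruby.
ONE_TO_FIVE_ONES = /\A(?:[^1]*1){1,5}[^1]*\z/

samples = ["abc", "1", "a1b1c", "11111", "111111", "x1\n111111"]
results = samples.map { |s| [s, s.match?(ONE_TO_FIVE_ONES)] }
results.each { |s, ok| puts "#{s.inspect}: #{ok}" }

# The line-anchored version accepts the last sample, because its first
# line ("x1") satisfies the pattern on its own.
line_anchored_hit = "x1\n111111".match?(/^([^1]*1){1,5}[^1]*$/)
puts "line-anchored on multi-line input: #{line_anchored_hit}"
```

Whether the multi-line case matters depends on where test_string comes from; for single-line input both versions behave the same.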

@Adrian Wragg has already explained the answer to what the OP asked, but I would like to propose another possible solution to this problem, shown below:
puts "Match." if "#{test_string}".count("1") >= 2
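If the goal is the 1 to 5 limit from the question rather than "two or more", the same counting idea can be combined with a range. This is only a sketch; the method name ones_in_range? and its default range are assumptions made for illustration.

```ruby
# Hypothetical helper: true when the number of '1' characters falls in range.
def ones_in_range?(str, range = 1..5)
  range.cover?(str.to_s.count("1"))
end

puts ones_in_range?("a1b1c")                       # two ones
puts ones_in_range?("111111")                      # six ones
puts ones_in_range?("11", 2..Float::INFINITY)      # "two or more" variant
```

Counting avoids regexp backtracking entirely, at the cost of not being usable inside a larger pattern.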

If you have strings which contain characters other than one, here is a Regex that will do the job. See an example here at Rubular.
/\A([^1]*1[^1]*){1,5}\Z/
This will match any string consisting only of 2 or more ones. See an example here at Rubular.
/\A1{2,}\Z/
This will match any string consisting only of 1-5 ones. See an example here at Rubular.
/\A1{1,5}\Z/

Related

Ruby Regular Expression String Matching t =~ /^\d{2}(:\d{2}){2}$/ [duplicate]

I found this from a code challenge:
def time_correct(t)
  return unless t =~ /^\d{2}(:\d{2}){2}$/
end
It is used to find out whether a string such as "0;:44:07" is a regular time string ("HH:MM:SS") or not.
I don't understand the regex though. Can someone explain the /^\d{2}(:\d{2}){2}$/ to me please? Thanks!
On /^\d{2}(:\d{2}){2}$/:
1. /.../ delimits the regular expression.
2. ^ matches the start of a line in multi-line mode, or the beginning of the string otherwise. (In Ruby, ^ always matches at the start of each line; \A matches only at the beginning of the string.)
3. \d matches one digit.
4. {2} states that the preceding element \d must match 2 times.
5. (...) delimits a capture group. It groups things together like parentheses in math, and also lets you refer to them later using \i, where i is the index of the group. For example, in (a)(b), a is group 1 and b is group 2.
6. \d{2} was just explained in steps 3 and 4.
7. {2} is the same as in step 4, but here the preceding element is the capture group (:\d{2}), which must also repeat 2 times.
8. $ matches the end of a line in multi-line mode, or the end of the string otherwise. (In Ruby, $ always matches at the end of each line; \z matches only at the end of the string.)
With multi-line mode enabled (which is how ^ and $ always behave in Ruby), your expression matches only lines like:
22:33:44
02:33:44
But not lines like:
22:33:44 d
d 22:33:44
f 02:33:44 f
If multi-line mode is not enabled (possible in PCRE-style engines, but not in Ruby), your expression only matches a string consisting of a single valid time, such as:
22:33:44
But it matches nothing in a string with two valid lines:
22:33:44
02:33:44
This is a link for live testing: https://regex101.com/r/cdSdt4/1
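To see how the anchors behave in Ruby specifically, here is a minimal sketch comparing the original pattern with a string-anchored variant. The constant names are made up for illustration, and match? assumes Ruby 2.4 or later.

```ruby
# The pattern from the question: ^ and $ match at line boundaries in Ruby.
LINE_ANCHORED   = /^\d{2}(:\d{2}){2}$/
# Variant using \A and \z, which match only at the string boundaries.
STRING_ANCHORED = /\A\d{2}(:\d{2}){2}\z/

two_lines = "22:33:44\n02:33:44"

line_hit   = LINE_ANCHORED.match?(two_lines)    # the first line matches on its own
string_hit = STRING_ANCHORED.match?(two_lines)  # the whole string is not one time

puts "line anchors:   #{line_hit}"
puts "string anchors: #{string_hit}"
puts "single time:    #{STRING_ANCHORED.match?('22:33:44')}"
puts "bad separator:  #{STRING_ANCHORED.match?('0;:44:07')}"
```

For validating a whole input value, the string-anchored form is usually the safer choice, since extra lines cannot slip through.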

Ruby regex count matched elements in the array of digits

I have a string:
'my_array1: ["1445","374","1449","378"], my_array2: ["1445","374", "1449","378"]'
I need to match all sets of digits from my_array2: [...] and count how many of them there.
I need to do something like this with regex and ruby MatchData
string = 'my_array1: ["1445","374", "1449","378"], my_array2: ["1445","374", "1449","378"]'
matches = string.match(/my_array2\:\s[\[,]\"(\d+)\"/)
count_matches = matches.size
Expected result should be 4.
What is the correct way of doing it?
If you are guaranteed that the content of my_array2 is always numeric, you could simply use split twice. First split by my_array2: [" and then split by ,. This should give you the number of items you are after.
If you are not guaranteed that, you could still split by my_array2 and, instead of splitting again, use a pattern such as "\d+" (or "\d+(\.\d+)?" if you have floating point values) and count the matches.
An example of the expression is available here.
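One possible way to put this together in Ruby is sketched below. It isolates the bracketed body of my_array2 first and then counts the quoted digit runs with scan, rather than using split; the method name count_my_array2 is hypothetical.

```ruby
# Hypothetical helper: count the quoted digit strings inside my_array2: [...].
def count_my_array2(string)
  # String#[] with a regexp and a group index returns that capture, or nil.
  body = string[/my_array2:\s*\[(.*?)\]/, 1]
  return 0 if body.nil?

  # scan returns one entry per match, so its size is the number of items.
  body.scan(/"(\d+)"/).size
end

string = 'my_array1: ["1445","374", "1449","378"], my_array2: ["1445","374", "1449","378"]'
count_matches = count_my_array2(string)
puts count_matches
```

The original attempt with String#match stops at the first match, which is why its MatchData cannot be used to count all items; scan collects every match.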

How does this gsub and regex work?

I'm trying to learn Ruby and having a hard time figuring out what each individual part of this code is doing. Specifically, how does the global substitution determine whether two sequential digits are both one of these values [13579], and how does it add a dash (-) in between them?
def DashInsert(num)
  num_str = num.to_s
  num_str.gsub(/([13579])(?=[13579])/, '\1-')
end
num_str.gsub(/([13579])(?=[13579])/, '\1-')
() is a capturing group, which captures the characters matched by the pattern inside it. Here that pattern is [13579], which matches a single digit from the given set. The matched digit is captured and stored as group 1.
(?=[13579]) is a positive lookahead, which asserts that the match must be followed by text matching the pattern inside the lookahead, without consuming it. Replacement occurs only if this condition is satisfied.
\1 refers to the characters captured by group 1.
Example:
> "13".gsub(/([13579])(?=[13579])/, '\1-')
=> "1-3"
You may start with some random tests:
def DashInsert(num)
  num_str = num.to_s
  num_str.gsub(/([13579])(?=[13579])/, '\1-')
end
10.times do
  x = rand(10000)
  puts "%6i: %6s" % [x, DashInsert(x)]
end
Example:
9633: 963-3
7774: 7-7-74
6826: 6826
7386: 7-386
2145: 2145
7806: 7806
9499: 949-9
4117: 41-1-7
4920: 4920
14: 14
And now to check the regex.
([13579]) Take any odd digit and remember it (it can be used later with \1).
(?=[13579]) Check that the next digit is also odd, but don't consume it (it remains in the string).
'\1-' Output the first odd digit and add a - to it.
In other words:
Put a - between every two adjacent odd digits.
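To illustrate why the lookahead matters, here is a small sketch comparing it with a variant that consumes the second digit. The variable names and the sample input "1357" are chosen only for illustration.

```ruby
# With the lookahead, the second odd digit is not consumed, so it can
# start the next match and every adjacent odd pair gets a dash.
with_lookahead = "1357".gsub(/([13579])(?=[13579])/, '\1-')

# Without it, each match swallows two digits, so the pair "3" and "5"
# is never examined together.
without_lookahead = "1357".gsub(/([13579])([13579])/, '\1-\2')

puts with_lookahead
puts without_lookahead
```

The first call gives "1-3-5-7", while the second gives "1-35-7".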

How do I match repeated characters?

How do I find repeated characters using a regular expression?
If I have aaabbab, I would like to match only characters which have three repetitions:
aaa
Try string.scan(/((.)\2{2,})/).map(&:first), where string is your string of characters.
The way this works is that it looks for any character and captures it (the dot), then matches repeats of that character (the \2 backreference) 2 or more times (the {2,} range means "anywhere between 2 and infinity times"). Scan will return an array of arrays, so we map the first matches out of it to get the desired results.
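A short sketch of the intermediate steps may make this easier to follow. The variable names and the second sample string are made up for illustration.

```ruby
string = "aaabbab"

# With capture groups, scan returns one array per match: [group 1, group 2].
raw = string.scan(/((.)\2{2,})/)
# Group 1 is the whole run; group 2 is the single repeated character.
runs = raw.map(&:first)

p raw
p runs

# Longer runs are captured whole, since {2,} has no upper bound.
more_runs = "aaaabbbcc".scan(/((.)\2{2,})/).map(&:first)
p more_runs
```

Because the dot does not match a newline by default, runs of newlines are not reported by this pattern.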

How to insert tag every 5 characters in a Ruby String?

I would like to insert a <wbr> tag every 5 characters.
Input: s = 'HelloWorld-Hello guys'
Expected outcome: Hello<wbr>World<wbr>-Hell<wbr>o guys
s = 'HelloWorld-Hello guys'
s.scan(/.{5}|.+/).join("<wbr>")
Explanation:
scan collects all matches of the regexp into an array. The .{5} matches any 5 characters (other than newlines). If there are characters left at the end of the string, they will be matched by the .+. Finally, join the array with your separator string.
There are several options to do this. If you just want to insert a delimiter string you can use scan followed by join as follows:
s = '12345678901234567'
puts s.scan(/.{1,5}/).join(":")
# 12345:67890:12345:67
.{1,5} matches between 1 and 5 of "any" character, but since it's greedy, it will take 5 if it can. The allowance for taking fewer is to accommodate the last match, where there may not be enough characters left over.
Another option is to use gsub, which allows for more flexible substitutions:
puts s.gsub(/.{1,5}/, '<\0>')
# <12345><67890><12345><67>
\0 is a backreference to what group 0 matched, i.e. the whole match. So substituting with <\0> effectively puts whatever the regex matched in literal brackets.
If whitespaces are not to be counted, then instead of ., you want to match \s*\S (i.e. a non whitespace, possibly preceded by whitespaces).
s = '123 4 567 890 1 2 3 456 7 '
puts s.gsub(/(\s*\S){1,5}/, '[\0]')
# [123 4 5][67 890][ 1 2 3 45][6 7]
Here is a solution that is adapted from the answer to a recent question:
class String
  def in_groups_of(n, sep = ' ')
    chars.each_slice(n).map(&:join).join(sep)
  end
end
p 'HelloWorld-Hello guys'.in_groups_of(5,'<wbr>')
# "Hello<wbr>World<wbr>-Hell<wbr>o guy<wbr>s"
The result differs from your example in that the space counts as a character, leaving the final s in a group of its own. Was your example flawed, or do you mean to exclude spaces (whitespace in general?) from the character count?
To only count non-whitespace (“sticking” trailing whitespace to the last non-whitespace, leaving whitespace-only strings alone):
# count "hard coded" into regexp
s.scan(/(?:\s*\S(?:\s+\z)?){1,5}|\s+\z/).join('<wbr>')
# parametric count
s.scan(/\s*\S(?:\s+\z)?|\s+\z/).each_slice(5).map(&:join).join('<wbr>')
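The parametric version can be wrapped in a small method, sketched here. The name insert_every and its parameters are hypothetical; the regexp is the one from the lines above.

```ruby
# Hypothetical helper: insert sep after every n non-whitespace characters.
# Each token is one non-whitespace character with any whitespace before it;
# trailing whitespace sticks to the last token, and a whitespace-only
# string is returned as a single token.
def insert_every(str, n, sep)
  tokens = str.scan(/\s*\S(?:\s+\z)?|\s+\z/)
  tokens.each_slice(n).map(&:join).join(sep)
end

wrapped = insert_every('HelloWorld-Hello guys', 5, '<wbr>')
puts wrapped
```

On the input from the question, the space is not counted, so the last group comes out as "o guys", in line with the expected outcome.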
