Ruby regex count matched elements in the array of digits - ruby

I have a string:
'my_array1: ["1445","374","1449","378"], my_array2: ["1445","374", "1449","378"]'
I need to match all sets of digits from my_array2: [...] and count how many of them there are.
I need to do something like this with regex and ruby MatchData
string = 'my_array1: ["1445","374", "1449","378"], my_array2: ["1445","374", "1449","378"]'
matches = string.match(/my_array2\:\s[\[,]\"(\d+)\"/)
count_matches = matches.size
Expected result should be 4.
What is the correct way of doing it?

If you are guaranteed that the content of my_array2 is always numeric, you could simply use split twice: first split by my_array2: [" and then split the remainder by a comma. This gives you the number of items you are after.
If you are not guaranteed that, you could still split by my_array2 and, instead of splitting again, scan the remainder with a pattern such as "\d+" (or "\d+(?:\.\d+)?" if you have floating-point values; the group is non-capturing so that String#scan returns whole matches) and count the matches.
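A minimal sketch of the counting approach described above, assuming the bracketed list after my_array2: contains no nested brackets. The variable names (inner, numbers, count) are illustrative and not from the original answer.

```ruby
string = 'my_array1: ["1445","374", "1449","378"], my_array2: ["1445","374", "1449","378"]'

# Capture everything between the brackets that follow "my_array2: ".
# String#[] with a regexp and a group number returns that group, or nil.
inner = string[/my_array2:\s*\[(.*?)\]/, 1]

# Scan the captured text for runs of digits.
# to_s keeps this safe when the key is missing and inner is nil.
numbers = inner.to_s.scan(/\d+/)
count = numbers.size

puts count
```

String#match returns only the first match, which is why the attempt in the question cannot be used for counting; String#scan returns every match.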

Related

Split a string and remove the first element in string

Original string '4.0.0-4.0-M-672092'
How can I modify the original string to "4.0-M-672092" using one line of code?
Any help is highly appreciated.
Thanks and Regards
The 'split' method works in this case
https://apidock.com/ruby/String/split
'4.0.0-4.0-M-672092'.split('-')[1..-1].join('-')
# => "4.0-M-672092"
Be careful, though: this is fine here, but on long strings it does unnecessary work, since it splits the whole string and then joins the array back together.
If you need something more efficient for longer text, you can find the index of the first "-" (your split point) and take the substring that starts at the next position:
text = '4.0.0-4.0-M-672092'
text[(text.index('-') + 1)..-1]
# => "4.0-M-672092"
This takes two lines rather than one. Also, if the separator is not found, index returns nil and nil + 1 raises a NoMethodError, so add a guard (or a rescue) if that can happen.
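A minimal sketch of guarding against a missing separator. The helper name after_first_dash is hypothetical, and returning the whole string when there is no "-" is an assumption about the desired behavior.

```ruby
# Hypothetical helper: returns the text after the first "-",
# or the whole string when no "-" is present.
def after_first_dash(text)
  idx = text.index('-')
  idx ? text[(idx + 1)..-1] : text
end

puts after_first_dash('4.0.0-4.0-M-672092')
puts after_first_dash('no separator here')

# String#partition splits around the first occurrence only and never raises;
# the tail is an empty string when the separator is absent.
_head, _sep, tail = '4.0.0-4.0-M-672092'.partition('-')
puts tail
```

partition also fits on one line ('4.0.0-4.0-M-672092'.partition('-').last), at the cost of returning "" rather than the original string when nothing is found.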
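Simplest way (Array#second comes from ActiveSupport, so this needs Rails; in plain Ruby use .last or [1] instead):
'4.0.0-4.0-M-672092'.split('-', 2).second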
"4.0.0-4.0-M-672092"[/(?<=-).*/]
#=> "4.0-M-672092"
The regular expression reads, "Match zero or more characters other than newlines, as many as possible (.*), provided the match is preceded by a hyphen." (?<=-) is a positive lookbehind. See String#[].

Ruby. Split string in separate decimal numbers

I have a long string which contains only decimal numbers with two digits after the comma:
str = "123,457568,22321,5484123,77"
The numbers in the string are all decimals with two digits after the comma. How can I separate them into individual numbers, like this?
arr = ["123,45" , "7568,22" , "321,54" , "84123,77"]
You could try a regex split here:
str = "123,457568,22321,5484123,77"
nums = str.split(/(?<=,\d{2})/)
puts nums
This prints:
123,45
7568,22
321,54
84123,77
The pattern splits at every position that is immediately preceded by a comma and two digits. The lookbehind is zero-width, so no characters are consumed by the split.
Scan String for Commas Followed by Two Digits
This is a case where you really need to know your data. If you always have floats with two decimal places, and commas are decimals in your locale, then you can use String#scan as follows:
str.scan /\d+,\d{2}/
#=> ["123,45", "7568,22", "321,54", "84123,77"]
Since your input data isn't consistent (which can be assumed by the lack of a reliable separator between items), you may not be able to guarantee that each item has a fractional component at all, or that the component has exactly two digits. If that's the case, you'll need to find a common pattern that is reliable for your given inputs or make changes to the way you assign data from your data source into str.
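One way to check whether the input really follows the assumed pattern is to verify that the scanned pieces rebuild the original string. This is only a sketch; the names parts, consistent and floats are illustrative, and treating the comma as a decimal mark is the assumption stated above.

```ruby
str = "123,457568,22321,5484123,77"

# Pull out every "digits, comma, two digits" item.
parts = str.scan(/\d+,\d{2}/)

# If anything in str was skipped by the scan, the joined parts will differ.
consistent = parts.join == str

# Convert to Floats by swapping the decimal comma for a dot.
floats = parts.map { |s| s.tr(',', '.').to_f }

p parts
p consistent
p floats
```

Note that the check only detects leftover characters. A string whose digits happen to fit the pattern in a different grouping would still pass, so it does not replace a real separator in the data source.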

Performing operations on each line of a string

I have a string named "string" that contains six lines.
I want to remove a "Z" from the end of each line (every line ends with one) and capitalize the first letter in each line (ignoring numbers and white space; e.g., "1. apple" -> "1. Apple").
I have some idea of how to do it, but have no idea how to do it in Ruby. How do I accomplish this? A loop? What would the syntax be?
Using regular expression (See String#gsub):
s = <<EOS
1. applez
2. bananaz
3. catz
4. dogz
5. elephantz
6. fruitz
EOS
puts s.gsub(/z$/i, '').gsub(/^([^a-z]*)([a-z])/i) { $1 + $2.upcase }
# /z$/i - to match a trailing `z` at the end of lines.
# /^([^a-z]*)([a-z])/i - to match the leading non-letter characters and the first letter.
# capture them as group 1 ($1), group 2 ($2)
output:
1. Apple
2. Banana
3. Cat
4. Dog
5. Elephant
6. Fruit
I would approach this by breaking your problem into smaller steps. After we've solved each of the smaller problems, you can put it all back together for a more elegant solution.
Given the initial string put forth by falsetru:
s = <<EOS
1. applez
2. bananaz
3. catz
4. dogz
5. elephantz
6. fruitz
EOS
1. Break your string into an array of substrings, separated by the newline.
substrings = s.split(/\n/)
This uses the String class' split method and a regular expression. It searches for all occurrences of newline (backslash-n) and treats this as a delimiter, splitting the string into substrings based on this delimiter. Then it throws all of these substrings into an array, which we've named substrings.
2. Iterate through your array of substrings to do some stuff (details on what stuff later)
substrings.each do |substring|
  # Do stuff to each substring
end
This is one form for how you iterate across an array in Ruby. You call the Array's each method, and you give it a block of code which it will run on each element in the array. In our example, we'll use the variable name substring within our block of code so that we can do stuff to each substring.
3. Remove the z character at the end of each substring
substrings.each do |substring|
  substring.gsub!(/z$/, '')
end
Now, as we iterate through the array, the first thing we want to do is remove the z character at the end of each string. You do this with the gsub! method of String, which is a search-and-replace method that modifies the string in place. The first argument is the regular expression describing what you're looking for. In this case, we are looking for a z followed by the end of a line (denoted by the dollar sign); since each substring is a single line, that is also the end of the string. The second argument is an empty string, because we want to replace what's been found with nothing (in other words, remove it).
4. Find the index of the first letter in each substring
substrings.each do |substring|
  substring.gsub!(/z$/, '')
  index = substring.index(/[a-zA-Z]/)
end
The String class also has a method called index which will return the index of the first occurrence of a string that matches the regular expression you provide. In our case, since we want to ignore numbers and symbols and spaces, we are really just looking for the first occurrence of a letter in your substring. To do this, we use the regular expression /[a-zA-Z]/ - this basically says, "Find me anything in the range of small A to small Z or in big A to big Z." Now, we have an index (using our example strings, the index is 3).
5. Capitalize the letter at the index we have found
substrings.each do |substring|
  substring.gsub!(/z$/, '')
  index = substring.index(/[a-zA-Z]/)
  substring[index] = substring[index].capitalize
end
Based on the index value that we found, we want to replace the letter at that index with that same letter, but capitalized.
6. Put our substrings array back together as a single-string separated by newlines.
Now that we've done everything we need to do to each substring, our each iterator block ends, and we have what we need in the substrings array. To put the array back together as a single string, we use the join method of Array class.
result = substrings.join("\n")
With that, we now have a String called result, which should be what you're looking for.
Putting It All Together
Here is what the entire solution looks like, once we put together all of the steps:
substrings = s.split(/\n/)
substrings.each do |substring|
  substring.gsub!(/z$/, '')
  index = substring.index(/[a-zA-Z]/)
  substring[index] = substring[index].capitalize
end
result = substrings.join("\n")
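For comparison, the same steps can be condensed with String#lines and String#sub. This is a sketch rather than a replacement for the walkthrough; the name compact is illustrative. Because sub simply does nothing when there is no match, a line without any letter is left alone, whereas the index-based version would raise on a nil index.

```ruby
s = <<~EOS
  1. applez
  2. bananaz
  3. catz
  4. dogz
  5. elephantz
  6. fruitz
EOS

compact = s.lines.map do |line|
  line.chomp                              # drop the trailing newline
      .sub(/z\z/i, '')                    # remove one trailing z (or Z)
      .sub(/[a-z]/i) { |c| c.upcase }     # upcase the first letter, if any
end.join("\n")

puts compact
```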

regex that returns matches of a range in length and containing one of several words in Ruby

I tried the following regex in Ruby:
"the foodogand the catlada are mouseing".scan(/\b(?=\w{6,12}\b)\w{0,9}(cat|dog|mouse)\w*/)
but instead of it returning
["foodogand", "catlada", "mouseing"]
I'm getting
[["dog"],["cat"]] # the results are also in arrays
What's wrong here?
The results are also nested in arrays. I could flatten this, but is there a way to avoid it?
Use ?: for the last group:
"the foodogand the catlada are mouseing".scan(/\b(?=\w{6,12}\b)\w{0,9}(?:cat|dog|mouse)\w*/)
#=> ["foodogand", "catlada", "mouseing"]
From the docs:
If the pattern contains groups, each individual result is itself an array containing one entry per group.
The ?: makes the group non-capturing, avoiding a nested array.
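A small side-by-side sketch of how String#scan reacts to a capturing versus a non-capturing group. The patterns are simplified (no length check) to isolate the grouping behavior, and the variable names are illustrative.

```ruby
text = "the foodogand the catlada are mouseing"

# With a capturing group, scan returns one array per match,
# holding only the captured groups.
with_group = text.scan(/\w*(cat|dog|mouse)\w*/)

# With a non-capturing group, scan returns the whole matches.
without_group = text.scan(/\w*(?:cat|dog|mouse)\w*/)

p with_group
p without_group
```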
I would just clean that up a bit by replacing \w{0,9} with \w* (the lookahead takes care of the length). Keep the \b inside the lookahead, though: without it, \w{6,12} only enforces the minimum length, so longer words would also match.
"the foodogand the catlada are mouseing".scan /\b(?=\w{6,12}\b)\w*(?:cat|dog|mouse)\w*/
#=> ["foodogand", "catlada", "mouseing"]
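A quick check of how the length limit behaves, using a made-up 19-character word that contains "cat". The pattern names bounded and unbounded are illustrative. The upper bound of 12 only holds while the \b sits inside the lookahead.

```ruby
long_word = "abcdefghijklmnopcat" # 19 characters, longer than the 12 allowed

# \b inside the lookahead: the word must END after 6 to 12 characters.
bounded = /\b(?=\w{6,12}\b)\w*(?:cat|dog|mouse)\w*/

# \b moved to the end: the lookahead only requires AT LEAST 6 characters.
unbounded = /\b(?=\w{6,12})\w*(?:cat|dog|mouse)\w*\b/

p long_word.scan(bounded)
p long_word.scan(unbounded)
```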

Match consecutive list of exactly one character in set with regular expressions

I don't think I'll even try to explain this, I don't know the words to, but I'd like to achieve the following:
Given a string like this:
+++>><<<--
I'd like a match to give me: +++, but also match if any of the other characters were in the string consecutively like they are. So if the +++ wasn't there, I'd like to match >>.
I tried using the following regular expression:
([><\-\+]+)
However, given the string above, it would match the entire string, and not the first list of consecutive characters.
If it makes a difference, this is in Ruby (1.9.3).
Not sure about the ruby bit, but you can do this with backreferences in the pattern:
(.)\1+
This uses a capturing group () to capture any character (.), followed by one or more (+) repetitions of that same character (\1). The \1 is a backreference to the first captured group; in a pattern with more capturing groups, \2 would refer to the second captured group, and so on. Note that the pattern requires a run of at least two identical characters.
Java Example
Pattern p = Pattern.compile("(.)\\1+");
Matcher m = p.matcher("aaabbccaa");
m.find();
System.out.println(m.group(0)); // prints "aaa"
Ruby Example
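# Return an array of matched patterns.
# The outer group (1) is the whole run; the inner group (2) is the repeated character.
string = '+++>><<<--'
string.scan( /((.)\2+)/ ).collect { |match| match.first }
#=> ["+++", ">>", "<<<", "--"]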
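If only the first run is needed, as in the question, String#[] with a regexp returns just the first match. The sketch below also restricts the repeated character to the set from the question; the variable names are illustrative, and whether a single character should count as a run is an assumption (use \1* instead of \1+ to allow it).

```ruby
string = '+++>><<<--'

# First run of any repeated character.
first_run = string[/(.)\1+/]

# First run of a repeated character from the set > < + - only.
first_in_set = string[/([><+\-])\1+/]

# Same pattern on a string without the leading +++.
without_plus = '>><<<--'[/([><+\-])\1+/]

p first_run
p first_in_set
p without_plus
```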

Resources