How to extract numbers from an array of strings? (I'm using regex) - ruby

I have an array of strings:
test= ["ChangeServer<br/>Test: 3-7<br/>PinCode:DFSFSDFB04008<br/>ShipCode:DFADFSDFSDM-000D3<br/>SomeCode:sdfsdf", "werwerwe", "adfsdfsd",
"sdfsdfsdfsd<br/>Test: 9<br/>PinCode:ADFSDF4NS0<br/>ShipCode:FADFSDFD-9ZM170<br/>"]
I want to grab the numbers after Test:, which in the above array of strings are 3, 4, 5, 6, 7 (the range 3-7) and 9.
Desired output:
["3","4","5","6","7","9"]
What I have tried so far:
test.join.scan(/(?<=Test: )[0-9]+/)
=> ["3", "9"]
How do I deal with the range?
Second test case:
test= ["ChangeServer<br/>Test: 3-7<br/>PinCode:DFSFSDFB04008<br/>ShipCode:DFADFSDFSDM-000D3<br/>SomeCode:sdfsdf", "werwerwe", "adfsdfsd",
"sdfsdfsdfsd<br/>Test: 9<br/>PinCode:ADFSDF4NS0<br/>ShipCode:FADFSDFD-9ZM170<br/>", "sdfsdfsdfsd<br/>Test: 15-18<br/>PinCode:ADFSDF4NS0<br/>ShipCode:FADFSDFD-9ZM170<br/>"]
Desired output:
["3","4","5","6","7","9","15","16","17","18"]

There are a lot of ways you could solve this. I'd probably do it this way:
test.flat_map do |s|
  _, m, n = *s.match(/Test:\s*(\d+)(?:-(\d+))?/)
  m ? (m..n||m).to_a : []
end
See it in action on repl.it: https://repl.it/JFwT/13
Or, more succinctly:
test.flat_map {|s| s.match(/Test:\s*(\d+)(?:-(\d+))?/) { $1..($2||$1) }.to_a }
https://repl.it/JFwT/11
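As a minimal sketch, the same idea can be wrapped in a method so it is easy to try on other inputs. The method name test_numbers and the shortened sample strings are illustrative assumptions, not part of the original question:

```ruby
# Hypothetical helper: collect the numbers that follow "Test:" in each string,
# expanding a range such as "3-7" into its individual values.
def test_numbers(strings)
  strings.flat_map do |s|
    # m is the first number; n is the optional range end (nil when absent).
    # When nothing matches, the splat of nil leaves m as nil.
    _, m, n = *s.match(/Test:\s*(\d+)(?:-(\d+))?/)
    m ? (m..(n || m)).to_a : []
  end
end

sample = [
  "ChangeServer<br/>Test: 3-7<br/>PinCode:DFSFSDFB04008<br/>",
  "werwerwe",
  "sdfsdfsdfsd<br/>Test: 9<br/>PinCode:ADFSDF4NS0<br/>"
]
p test_numbers(sample) #=> ["3", "4", "5", "6", "7", "9"]
```

Strings without a Test: entry contribute an empty array, which flat_map drops from the result.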

You could create a new Range for each range found (i.e. N-N), passing its two endpoints to Range.new with the splat operator (*), and combine the results, like this (see note 1):
test.join.scan(/(?<=Test: )[0-9-]+/)
.flat_map { |r| Range.new(*r.split('-').values_at(0, -1)).to_a }
#=> ["3", "4", "5", "6", "7", "9"]
This will work for both examples.
1 Notice the added - next to 0-9 in the regex.
Is there a way to include both Test: 1 (with a space between
Test: and 1) and Test:1 (without a space between Test: and 1)?
Yes, update your regex (change where space is placed) and add an additional map to get rid of those spaces:
test.join
.scan(/(?<=Test:)[ 0-9-]+/)
.map(&:strip)
.flat_map { |r| Range.new(*r.split('-').values_at(0, -1)).to_a }
And here's a shortened option using two captures in the regex, as suggested by Jordan:
test.join
.scan(/Test:\s*(\d+)(?:-(\d+))?/)
.flat_map { |m,n| (m..n||m).to_a }
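A variant sketch, assuming you would rather build the range from integers and convert back to strings at the end, so the expansion does not depend on how String ranges iterate. The method name expand_tests is hypothetical:

```ruby
# Hypothetical helper: same two-capture regex, but the range is built from
# Integers and each value is converted back to a String afterwards.
def expand_tests(strings)
  strings.join.scan(/Test:\s*(\d+)(?:-(\d+))?/).flat_map do |first, last|
    # last is nil when there is no "-N" part, so fall back to first.
    (first.to_i..(last || first).to_i).map(&:to_s)
  end
end

p expand_tests(["a<br/>Test: 8-11<br/>", "x", "b<br/>Test:15<br/>"])
#=> ["8", "9", "10", "11", "15"]
```

Dropping the final map(&:to_s) would return Integers instead, if that suits the rest of your code better.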

Just out of curiosity:
test.
join.
scan(/(?<=Test: )[\d-]+/).
map { |e| e.gsub(/\A\d+\Z/) { |m| "#{m}..#{m}" }.gsub('-', '..') }.
map(&method(:eval)).
flat_map(&:to_a)

Related

How to make a repeated string to the left be deleted without using While?

For example, I have this string of only numbers:
0009102
If I convert it to integer Ruby automatically gives me this value:
9102
That's correct. But my program gives me different types of numbers:
2229102 desired output => 9102
9999102 desired output => 102
If you look at them, I have treated 2 and 9 as zeros, since leading zeros are deleted automatically. It is easy to delete them with a while loop, but I must avoid that.
In other words, how do you make 'n' on the left be considered a zero for Ruby?
"2229102".sub(/\A(\d)\1*/, "") #=> "9102"
The regular expression reads, "match the first digit in the string (\A is the beginning-of-string anchor) in capture group 1 ((\d)), followed by zero or more characters (*) that equal the contents of capture group 1 (\1)". String#sub replaces that match with an empty string.
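A small sketch applying that regex to the three inputs from the question; the method name strip_leading_run is a hypothetical wrapper:

```ruby
# Hypothetical helper: remove the run of identical digits at the start of the
# string, whatever that digit happens to be.
def strip_leading_run(digits)
  digits.sub(/\A(\d)\1*/, "")
end

%w[0009102 2229102 9999102].each do |s|
  puts "#{s} -> #{strip_leading_run(s)}"
end
# 0009102 -> 9102
# 2229102 -> 9102
# 9999102 -> 102
```

The result is still a String; call to_i on it if an Integer is needed.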
Try with Enumerable#chunk_while:
s = '222910222'
s.each_char.chunk_while(&:==).drop(1).join
#=> "910222"
Where s.each_char.chunk_while(&:==).to_a #=> [["2", "2", "2"], ["9"], ["1"], ["0"], ["2", "2", "2"]]
Similar to the solution of iGian you could also use drop_while.
s = '222910222'
s.each_char.each_cons(2).drop_while { |a, b| a == b }.map(&:last).join
#=> "910222"
# or
s.each_char.drop_while.with_index(-1) { |c, i| i < 0 || c == s[i] }.join
#=> "910222"
You can also try it this way:
s = '9999102938'
s.chars.then{ |chars| chars[chars.index(chars.uniq[1])..-1] }.join
#=> "102938"

Ruby Hash: type casting

I’m trying to get a better grasp on writing in Ruby and working with Hash tables and their values.
1. Say you have a hash:
FOO = {'baz' => [1,2,3,4,5]}
Goal: convert each value into a string in the "Ruby" way.
I've come across multiple examples of using .each, e.g.
FOO.each { |k,v| FOO[k] = v.to_s }
However, this renders an array encapsulated in a string, e.g. "[1, 2, 3, 4, 5]", where it should be ["1", "2", "3", "4", "5"].
2. When type casting is performed on a Hash that holds an array of values, is the result a new array? Or is it simply a change in the type of each value (e.g. 1 becomes "1" when .to_s is applied, say when the values are passed through an each enumerator like the one above)?
An explanation is greatly appreciated. New to Ruby.
In the each block, k and v are the key value pair. In your case, 'baz' is key and [1,2,3,4,5] is value. Since you're doing v.to_s, it converts the whole array to string and not the individual values.
You can do something like this to achieve what you want.
foo = { 'baz' => [1,2,3,4,5] }
foo.each { |k, v| foo[k] = v.map(&:to_s) }
You can use Hash#transform_values:
foo = { 'baz' => [1, 2, 3, 4, 5] }
foo.transform_values { |v| v.map(&:to_s) } #=> {"baz"=>["1", "2", "3", "4", "5"]}
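On the second part of the question, a short sketch (assuming Ruby 2.4 or later, where Hash#transform_values is available) suggesting that map builds a new array of new strings and leaves the original hash and its array untouched. The variable names are illustrative:

```ruby
foo = { 'baz' => [1, 2, 3, 4, 5] }
original_array = foo['baz']

# transform_values returns a new Hash; map returns a new Array of Strings.
converted = foo.transform_values { |v| v.map(&:to_s) }

p converted                                 #=> {"baz"=>["1", "2", "3", "4", "5"]}
p foo                                       #=> {"baz"=>[1, 2, 3, 4, 5]}
p converted['baz'].equal?(original_array)   #=> false (a different Array object)
```

So nothing is cast in place: to_s returns a new String for each Integer, and the original values only disappear if you assign the new array back into the hash, as the each version above does.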

Ruby Split string at character difference using regex

I'm currently working on a problem that involves splitting a string into groups of identical consecutive characters.
For example,
"111223334456777" #=> ['111','22','333','44','5','6','777']
The way I am currently doing it is to use an enumerator, compare each character with the previous one, and split the string that way.
res = []
str = "111223334456777"
group = str[0]
(1...str.length).each do |i|
  if str[i] != str[i-1]
    res << group
    group = str[i]
  else
    group << str[i]
  end
end
res << group
res #=> ['111','22','333','44','5','6','777']
I want to see if I can use regex to do this, which will make this process a lot easier. I understand I could just put this block of code in a method, but I'm curious if regex can be used here.
So what I want to do is
str.split(/some regex/)
to produce the same result. I thought about positive lookahead, but I can't figure out how to have regex recognize that the character is different.
Does anyone have an idea if this is possible?
The chunk_while method is what you're looking for here:
str.chars.chunk_while { |b,a| b == a }.map(&:join)
That will break anything where the current character a doesn't match the previous character b. If you want to restrict to just numbers you can do some pre-processing.
There's a lot of very handy methods in Enumerable that are worth exploring, and each new version of Ruby seems to add more of them.
str = "111333224456777"
str.scan /0+|1+|2+|3+|4+|5+|6+|7+|8+|9+/
#=> ["111", "333", "22", "44", "5", "6", "777"]
or
str.gsub(/(\d)\1*/).to_a
#=> ["111", "333", "22", "44", "5", "6", "777"]
The latter uses the (underused) form of String#gsub that takes one argument and no block, returning an enumerator. It merely generates matches and has nothing to do with character replacement.
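A minimal sketch of that one-argument gsub form, generalized from digits to any character by using (.) in place of (\d). That substitution is an assumption about what you might want, and the method name runs is hypothetical:

```ruby
# Hypothetical helper: split a string into runs of identical consecutive
# characters. gsub with a pattern and no block returns an Enumerator over the
# matches; to_a collects them. Note that . does not match a newline.
def runs(str)
  str.gsub(/(.)\1*/).to_a
end

p runs("111223334456777") #=> ["111", "22", "333", "44", "5", "6", "777"]
p runs("aaabbc")          #=> ["aaa", "bb", "c"]
```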
For fun, here are several other ways to do that.
str.scan(/((\d)\2*)/).map(&:first)
str.split(/(?<=(.))(?!\1)/).each_slice(2).map(&:first)
str.each_char.slice_when(&:!=).map(&:join)
str.each_char.chunk(&:itself).map { |_,a| a.join }
str.each_char.chunk_while(&:==).map(&:join)
str.gsub(/(?<=(.))(?!\1)/, ' ').split
str.gsub(/(.)\1*/).reduce([], &:<<)
str[1..-1].each_char.with_object([str[0]]) {|c,a| a.last[-1]==c ? (a.last<<c) : a << c}
Another option utilises the group_by method, which returns a hash with each individual number as a key and an array of grouped numbers as the value.
"111223334456777".split('').group_by { |i| i }.values.map(&:join) #=> ["111", "22", "333", "44", "5", "6", "777"]
Although it doesn't use a regex, someone else may find it useful. Note that group_by collects every occurrence of a character regardless of position, so this only matches the desired output when each character appears in a single contiguous run.
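A short sketch of where group_by and chunk_while part ways, using an illustrative input in which the same digit appears in two separate runs:

```ruby
str = "1121"

# group_by gathers all equal characters under one key, wherever they occur.
grouped = str.chars.group_by(&:itself).values.map(&:join)

# chunk_while only keeps neighbours together, so separate runs stay separate.
chunked = str.chars.chunk_while { |a, b| a == b }.map(&:join)

p grouped #=> ["111", "2"]
p chunked #=> ["11", "2", "1"]
```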

How do I skip the first pair in `zip`?

The following code compares two hashes: one with strings as values and the other with arrays as values.
hash1 = {"1"=>"val", "2"=>"val", "3"=>"vall", "4"=>""}
hash2 = {"1"=>[""], "2"=>["value"], "3"=>["val1", "val2"], "4"=>[""]}
unless hash1.zip(hash2).all? { |(_, fv), (_, lv)| fv.empty? ^ !lv.all?(&:empty?) }
...
end
If hash1 has an empty string where hash2 has a value, or vice versa, the block returns false for that pair.
I need the comparison to skip the first element in both hashes. I would add with_index to do so, but I don't know how to add it or if it's the best way in this case.
The result of zip is an array, so just cut off the head:
hash1.zip(hash2).drop(1).all? { ... }
You can't use with_index since all?, unlike map and others, does not return an Enumerator when called without a block. You could do a workaround:
hash1.zip(hash2).map.with_index { |((_, fv), (_, lv)), i|
  i.zero? || fv.empty? ^ !lv.all?(&:empty?)
}.all?
But that's like the very opposite of legible.
EDIT: Thanks to sawa for improving (and debugging) the answer.
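A runnable sketch of the drop(1) approach with the hashes from the question, showing the check with and without the first pair. The method name pair_ok? is a hypothetical wrapper around the question's own condition:

```ruby
hash1 = {"1"=>"val", "2"=>"val", "3"=>"vall", "4"=>""}
hash2 = {"1"=>[""], "2"=>["value"], "3"=>["val1", "val2"], "4"=>[""]}

# Hypothetical helper wrapping the condition from the question.
def pair_ok?(string_value, array_value)
  string_value.empty? ^ !array_value.all?(&:empty?)
end

# zip pairs the entries up as [[key, string], [key, array]].
with_first    = hash1.zip(hash2).all? { |(_, fv), (_, lv)| pair_ok?(fv, lv) }
without_first = hash1.zip(hash2).drop(1).all? { |(_, fv), (_, lv)| pair_ok?(fv, lv) }

p with_first    #=> false (the "1" pair has "val" against [""])
p without_first #=> true
```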
keys = hash1.keys.drop(1)
#=> ["2", "3", "4"]
pairs = hash1.values_at(*keys).zip(hash2.values_at(*keys))
#=> [["val", ["value"]], ["vall", ["val1", "val2"]], ["", [""]]]
# ^ binds more tightly than !=, so the comparison needs its own parentheses
pairs.all? { |v1,v2| v1.empty? ^ (v2.uniq != [""]) }
#=> true
You could of course chain the last two expressions.

Strip in collect on splitted array in Ruby

The following code:
str = "1, hello,2"
puts str
arr = str.split(",")
puts arr.inspect
arr.collect { |x| x.strip! }
puts arr.inspect
produces the following result:
1, hello,2
["1", " hello", "2"]
["1", "hello", "2"]
This is as expected. The following code:
str = "1, hello,2"
puts str
arr = (str.split(",")).collect { |x| x.strip! }
puts arr.inspect
however, produces the following output:
1, hello,2
[nil, "hello", nil]
Why do I get these nil values? Why can't I call .collect directly on the split array?
Thanks for the help!
The #collect method will return an array of the values returned by each block's call. In your first example, you're modifying the actual array contents with #strip! and use those, while you neglect the return value of #collect.
In the second case, you use the #collect result. Your problem is that #strip! will either return a string or nil, depending on its result – especially, it'll return nil if the string wasn't modified.
Therefore, use #strip (without the exclamation mark):
1.9.3-p194 :005 > (str.split(",")).collect { |x| x.strip }
=> ["1", "hello", "2"]
Because #strip! returns nil if the string was not altered.
In your first example you were not using the result of #collect, just modifying the strings with #strip!. Using #each in that case would have made the non-functional, imperative loop a bit clearer. One normally uses #map / #collect only when using the resulting new array.
Your last approach looks good: you wrote a functional map but left the #strip! in. Just take out the !.
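A minimal sketch of the difference, using the string from the question. The variable names bang and plain are illustrative:

```ruby
str = "1, hello,2"

# strip! modifies the receiver and returns nil when there was nothing to strip,
# so collecting its return values yields nil for "1" and "2".
bang = str.split(",").map { |x| x.strip! }

# strip always returns a (new) stripped string.
plain = str.split(",").map(&:strip)

p bang  #=> [nil, "hello", nil]
p plain #=> ["1", "hello", "2"]
```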

Resources