Splitting at Space Between Letter and Digit - ruby

I'm having the worst time with this simple regex.
Example input:
Cleveland Indians 5, Boston Redsox 4
I'm trying to split at the comma and at the space between a letter and a digit.
Example output:
Cleveland Indians
5
Boston Redsox
4
Here is what I have so far, but it's including the number still.
/,|\s[0-9]/

string = "Cleveland Indians 5, Boston Redsox 4"
string.split /,\s*|\s(?=\d)/
# => ["Cleveland Indians", "5", "Boston Redsox", "4"]
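\s(?=\d) matches a whitespace character only when it is followed by a digit. The lookahead checks for the digit without consuming it, so the digit stays in the result.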
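To see why the pattern from the question loses the digit, here is a small sketch comparing it with the lookahead version. The variable names are only illustrative.

```ruby
line = "Cleveland Indians 5, Boston Redsox 4"

# The original attempt: \s[0-9] is part of the delimiter, so the digit is consumed.
consuming = line.split(/,|\s[0-9]/)

# The lookahead only checks that a digit follows; the digit stays in the next field.
lookahead = line.split(/,\s*|\s(?=\d)/)

p consuming
p lookahead
```

With the consuming pattern the scores disappear and an empty string shows up between " 5" and ","; with the lookahead the four fields come back intact.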

If you divide it into two splits -- one at the comma + space, then one to separate the team name from the score -- it might be a bit clearer, especially if you have to add more options like a space before the comma too (real-world data gets messy!):
scores = "Cleveland Indians 5, Boston Redsox 4"
scores.split(/,\s*/).map { |score| score.split(/\s+(?=\d)/) }
# => [["Cleveland Indians", "5"], ["Boston Redsox", "4"]]
The resulting list of lists is a more meaningful grouping, too.
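If the pairs are going to be looked up by team, one option is to turn the nested arrays into a hash. This is a minimal sketch, assuming each entry has exactly one name and one integer score; `pairs` and `scoreboard` are names made up for illustration.

```ruby
scores = "Cleveland Indians 5 , Boston Redsox 4"

# Allow optional whitespace on both sides of the comma, then split name from score.
pairs = scores.split(/\s*,\s*/).map { |entry| entry.split(/\s+(?=\d)/) }

# Convert [["name", "5"], ...] into {"name" => 5, ...}.
scoreboard = pairs.map { |team, score| [team, score.to_i] }.to_h

p scoreboard
```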

"Cleveland Indians 5, Boston Redsox 4".split(/\s*(\d+)(?:,\s+|\z)/)
# => ["Cleveland Indians", "5", "Boston Redsox", "4"]

1)
str = "Cleveland Indians 15, Boston Red Sox 4"
phrases = str.split(", ")
phrases.each do |phrase|
  *team_names, score = phrase.split(" ")
  puts team_names.join(" ")
  puts score
end
--output:--
Cleveland Indians
15
Boston Red Sox
4
2)
str = "Cleveland Indians 15, Boston Red Sox 4"
pieces = str.split(/
\s* #A space 0 or more times
(\d+) #A digit 1 or more times, include match with results
[,\s]* #A comma or space, 0 or more times
/x)
puts pieces
--output:--
Cleveland Indians
15
Boston Red Sox
4
The first split is on " 15, " and the second split is on " 4" -- with the score included in the results.
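The behavior relied on here is that String#split keeps the text of capture groups in its result. A short sketch of the difference, using the same input (variable names are illustrative):

```ruby
str = "Cleveland Indians 15, Boston Red Sox 4"

# Without a capture group, the whole delimiter (score included) is discarded.
without_group = str.split(/\s*\d+[,\s]*/)

# With a capture group, the captured score is kept as its own element.
with_group = str.split(/\s*(\d+)[,\s]*/)

p without_group
p with_group
```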
3)
str = "Cleveland Indians 15, Boston Red Sox 4"
str.scan(/
(
\w #Begin with a word character
\D+ #followed by not a digit, 1 or more times
)
[ ] #followed by a space
(\d+) #followed by a digit, one or more times
/x) {|capture_groups| puts capture_groups}
--output:--
Cleveland Indians
15
Boston Red Sox
4

Related

Ruby -- Why does += increase the number for my string?

In the following code the value for "seven" changes from 1 to 2:
word_counts = Hash.new(0)
sample = "If seven maids with seven mops"
sample.split.each do |word|
  word_counts[word.downcase] += 1
  puts word_counts
end
Output:
{"if"=>1}
{"if"=>1, "seven"=>1}
{"if"=>1, "seven"=>1, "maids"=>1}
{"if"=>1, "seven"=>1, "maids"=>1, "with"=>1}
{"if"=>1, "seven"=>2, "maids"=>1, "with"=>1}
{"if"=>1, "seven"=>2, "maids"=>1, "with"=>1, "mops"=>1}
Can someone explain why it went from 1 to 2?
OK, I'll try..
word_counts[word.downcase] += 1 means word_counts[word.downcase] = word_counts[word.downcase] + 1. On the fifth iteration word is 'seven' again, so it does word_counts['seven'] = word_counts['seven'] + 1. At that point word_counts['seven'] is already 1 from the second iteration, so it becomes 2.
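Written out by hand, the expansion described above looks like the following sketch. The `tally` line assumes Ruby 2.7 or later, which may not match the reader's version.

```ruby
sample = "If seven maids with seven mops"
word_counts = Hash.new(0)

sample.split.each do |word|
  key = word.downcase
  # A missing key reads as 0 because of Hash.new(0), so the first sighting stores 0 + 1.
  word_counts[key] = word_counts[key] + 1
end

p word_counts

# Same counts with Enumerable#tally (assumes Ruby 2.7+).
tallied = sample.split.map(&:downcase).tally
p tallied
```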
When you split the string, the resulting array contains "seven" twice, because the sentence has two occurrences of that word.
"If seven maids with seven mops".split #=> ["If", "seven", "maids", "with", "seven", "mops"]

summing up string representation of year and month durations in ruby

I was wondering if there is a way to sum up multiple durations given as strings, like "2 years 2 months, 10 months, 3 years", and output "6 years".
You could do that as follows.
str = "2 years 4 months, 10 months, 3 years, 1 month"
r = /
(\d+) # match one or more digits in capture group 1
\s+ # match one or more whitespace chars
(year|month) # match 'year' or 'month' in capture group 2
s? # optionally match 's'
\b # match a word boundary
/x # free-spacing regex definition mode
a = str.scan r
#=> [["2", "year"], ["4", "month"], ["10", "month"], ["3", "year"], ["1", "month"]]
h = a.each_with_object(Hash.new(0)) { |(n,period),h| h[period] += n.to_i }
#=> {"year"=>5, "month"=>15}
y, m = h["month"].divmod(12)
#=> [1, 3]
h["year"] += y
#=> 6
h["month"] = m
#=> 3
h #=> {"year"=>6, "month"=>3}
Notes:
As noted in the doc for String#scan, "If the pattern contains groups, each individual result is itself an array containing one entry per group."
Hash.new(0) creates an empty hash with a default value of zero, meaning that if the hash h does not have a key k, h[k] returns zero. This is sometimes called a counting hash. See the doc for Hash::new.
Numeric#divmod is a useful and greatly-underused method.
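Wrapped up as a method, the steps above might look like the following sketch. The method name `sum_durations` is hypothetical, and it assumes the input only ever contains whole years and months.

```ruby
# Sum "N year(s)" and "N month(s)" terms, carrying every 12 months into a year.
def sum_durations(str)
  totals = str.scan(/(\d+)\s+(year|month)s?\b/)
              .each_with_object(Hash.new(0)) { |(n, period), h| h[period] += n.to_i }
  extra_years, months = totals["month"].divmod(12)
  { "year" => totals["year"] + extra_years, "month" => months }
end

p sum_durations("2 years 2 months, 10 months, 3 years")
p sum_durations("2 years 4 months, 10 months, 3 years, 1 month")
```

The first call uses the string from the question and comes out to exactly 6 years and 0 months; the second uses the string from the answer.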

Inserting a space in between characters using gsub - Ruby

Let's say I had a string "I have 36 dogs in 54 of my houses in 24 countries".
Is it possible by using only gsub to add a " " between each digit so that the string becomes "I have 3 6 dogs in 5 4 of my houses in 2 4 countries"?
gsub(/(\d)(\d)/, "#{$1} #{$2}") does not work, as it replaces the digits with a space, and neither does gsub(/\d\d/, "\d \d"), which replaces each digit with d.
s = "I have 3651 dogs in 24 countries"
Four ways to use String#gsub:
Use a positive lookahead and capture group
r = /
(\d) # match a digit in capture group 1
(?=\d) # match a digit in a positive lookahead
/x # extended mode
s.gsub(r, '\1 ')
#=> "I have 3 6 5 1 dogs in 2 4 countries"
A positive lookbehind could be used as well:
s.gsub(/(?<=\d)(\d)/, ' \1')
Use a block
s.gsub(/\d+/) { |s| s.chars.join(' ') }
#=> "I have 3 6 5 1 dogs in 2 4 countries"
Use a positive lookahead and a block
s.gsub(/\d(?=\d)/) { |s| s + ' ' }
#=> "I have 3 6 5 1 dogs in 2 4 countries"
Use a hash
h = '0'.upto('9').each_with_object({}) { |s,h| h[s] = s + ' ' }
#=> {"0"=>"0 ", "1"=>"1 ", "2"=>"2 ", "3"=>"3 ", "4"=>"4 ",
# "5"=>"5 ", "6"=>"6 ", "7"=>"7 ", "8"=>"8 ", "9"=>"9 "}
s.gsub(/\d(?=\d)/, h)
#=> "I have 3 6 5 1 dogs in 2 4 countries"
An alternative way is to look for the place between the numbers using lookahead and lookbehind and then just replace that with a space.
[1] pry(main)> s = "I have 36 dogs in 54 of my houses in 24 countries"
=> "I have 36 dogs in 54 of my houses in 24 countries"
[2] pry(main)> s.gsub(/(?<=\d)(?=\d)/, ' ')
=> "I have 3 6 dogs in 5 4 of my houses in 2 4 countries"
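To reference a capture group in the replacement string, use \n, where n is the group number, inside single quotes. "#{$1} #{$2}" is interpolated before gsub runs, so it never sees the current match.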
s = "I have 36 dogs in 54 of my houses in 24 countries"
s.gsub(/(\d)(\d)/, '\1 \2')
# => "I have 3 6 dogs in 5 4 of my houses in 2 4 countries"
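A sketch contrasting the replacement styles discussed in this thread; the method names are made up for illustration. Note that the two-group pattern consumes digits in pairs, so it only spaces out every digit when the numbers have exactly two digits.

```ruby
# Backreferences in a single-quoted replacement are resolved for each match.
def space_pairs(s)
  s.gsub(/(\d)(\d)/, '\1 \2')
end

# A block also runs once per match, so $1 and $2 are current inside it.
def space_pairs_block(s)
  s.gsub(/(\d)(\d)/) { "#{$1} #{$2}" }
end

# Zero-width lookarounds insert a space between every pair of adjacent digits.
def space_all_digits(s)
  s.gsub(/(?<=\d)(?=\d)/, ' ')
end

p space_pairs("36 dogs")
p space_pairs_block("36 dogs")
p space_pairs("3651 dogs")      # pairs do not overlap, so "65" stays together
p space_all_digits("3651 dogs")
```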

regex to pull in number with decimal or comma

This is my line of code:
col_value = line_item[column].scan(/\d+./).join().to_i
When I enter 30,000 into the textfield, col_value is 30.
I want it to bring in any number:
30,000
30.5
30.55
30000
Any of these are valid...
Is there a problem with the scan and/or join which would cause it to return 30? Using the suggested regexes below still returns 30, e.g.
col_value = line_item[column].scan(/\d+[,.]?\d+/).join().to_i
Could it be that "to_i" converts "30,000" to 30??
This regex will match your desired output:
\d+[,.]?\d*
Here ? makes the separator optional.
\d+(?:[,.]\d+)?
Try this. It should do it for you.
Yes, "30,000".to_i #=> 30. See String#to_i: "Extraneous characters past the end of a valid number are ignored."
I suggest you first remove the commas, then apply a regex. Alternatives are tried left to right, so the longer alternative has to come first; otherwise \d+ matches "30" and the decimal branch is never reached.
R = /
\d+\.\d+ # match > 0 digits, a decimal point, then > 0 digits
| # or
\d+ # match > 0 digits
/x # extended mode
str = "30,000 30.5 30.55 30000 1. .1"
str1 = str.tr(',','')
#=> "30000 30.5 30.55 30000 1. .1"
a = str1.scan(R)
#=> ["30000", "30.5", "30.55", "30000", "1", "1"]
a.map(&:to_i)
#=> [30000, 30, 30, 30000, 1, 1]
After chaining, we have:
str.tr(',','').scan(R).map(&:to_i)
Use to_f in place of to_i if the fractional part should be kept.
If "1." and ".1" should instead be matched whole, so that the result is:
#=> [30000, 30, 30, 30000, 1, 0]
the regex needs to be modified as follows:
R = /
\d+\.\d+ # match > 0 digits, a decimal point, then > 0 digits
| # or
\d+\. # match > 0 digits, then a decimal point
| # or
\.\d+ # match a decimal point, then > 0 digits
| # or
\d+ # match > 0 digits
/x # extended mode
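If the field holds a single number, as in the question, another option is to skip scan and join entirely. This is a minimal sketch, assuming commas are only ever thousands separators; `parse_number` is a hypothetical helper, not part of any library.

```ruby
# Strip thousands separators, then convert; returns nil when no number is present.
def parse_number(text)
  cleaned = text.delete(',')
  match = cleaned[/\d+(?:\.\d+)?/]
  return nil unless match
  # Keep integers as Integer and decimals as Float.
  match.include?('.') ? match.to_f : match.to_i
end

p parse_number("30,000")
p parse_number("30.5")
p parse_number("30.55")
p parse_number("30000")
p "30,000".to_i # to_i stops at the comma
```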

Ruby: increment all integers in a string by +1

I am looking for a succinct way to increment all the integers found in a string by +1 and return the full string.
For example:
"1 plus 2 and 10 and 100"
needs to become
"2 plus 3 and 11 and 101"
I can find all the integers very easily with
"1 plus 2 and 10 and 100".scan(/\d+/)
but I'm stuck at this point trying to increment and put the parts back together.
Thanks in advance.
You could use the block form of String#gsub:
str = "1 plus 2 and 10 and 100".gsub(/\d+/) do |match|
  match.to_i + 1
end
puts str
Output:
2 plus 3 and 11 and 101
The gsub method can take in a block, so you can do this
>> "1 plus 2 and 10 and 100".gsub(/\d+/){|x|x.to_i+1}
=> "2 plus 3 and 11 and 101"
The problem with scan is that it returns only the matches, so the rest of the string is not available to put back together. What I did was split on spaces, detect which tokens are integers using e.to_i != 0 (this does not count 0 as an integer, so you might want to improve it), add one to those, and join everything back:
s = "1 plus 2 and 10 and 100"
s.split(" ").map{ |e| if (e.to_i != 0) then e.to_i+1 else e end }.join(" ")
=> "2 plus 3 and 11 and 101"
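The two approaches give the same answer on the example but differ on edge cases. A sketch comparing them on input that contains a zero and a number followed by punctuation (the method names are illustrative):

```ruby
# Block form of gsub: every run of digits is replaced by its successor.
def increment_with_gsub(s)
  s.gsub(/\d+/) { |digits| (digits.to_i + 1).to_s }
end

# Split/map/join: tokens whose to_i is 0 are treated as words and left alone.
def increment_with_split(s)
  s.split(" ").map { |e| e.to_i != 0 ? e.to_i + 1 : e }.join(" ")
end

input = "0 plus 2, and 10"
p increment_with_gsub(input)
p increment_with_split(input)
```

The gsub version increments the 0 and keeps the comma; the split version leaves the 0 untouched and drops the comma, because "2,".to_i is 2.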

Resources