Ruby: Matching a delimiter with Regex - ruby

I'm trying to solve this with a regex pattern. Even though my test passes with this solution, I would like split to return only ["1", "2"] in the array. Is there a better way of doing this?
irb testing:
s = "//;\n1;2" # when given a delimiter of ';'
s2 = "1,2,3" # should read between commas
s3 = "//+\n2+2" # should read between delimiter of '+'
s.split(/[,\n]|[^0-9]/)
=> ["", "", "", "", "1", "2"]
Production:
module StringCalculator
def self.add(input)
solution = input.scan(/\d+/).map(&:to_i).reduce(0, :+)
input.end_with?("\n") ? nil : solution
end
end
Test:
context 'when given a newline delimiter' do
it 'should read between numbers' do
expect(StringCalculator.add("1\n2,3")).to eq(6)
end
it 'should not end in a newline' do
expect(StringCalculator.add("1,\n")).to be_nil
end
end
context 'when given different delimiter' do
it 'should support that delimiter' do
expect(StringCalculator.add("//;\n1;2")).to eq(3)
end
end

This is very simple using String#scan:
s = "//;\n1;2"
s.scan(/\d/) # => ["1", "2"]
/\d/ - A digit character ([0-9])
Note: if a number can have more than one digit, as in the string below, use /\d+/ instead.
s = "//;\n11;2"
s.scan(/\d+/) # => ["11", "2"]

You're getting data that looks like this string: //1\n212
Whether the data arrives as a file or as a string, treat it as two separate lines. In either case it looks like
//1
212
when output.
If it's a string:
input = "//1\n212".split("\n")
delimiter = input.first[2] # => "1"
values = input.last.split(delimiter) # => ["2", "2"]
If it's a file:
line = File.foreach('foo.txt')
delimiter = line.next[2] # => "1"
values = line.next.chomp.split(delimiter) # => ["2", "2"]
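Putting the pieces from the answers above together, here is a minimal sketch of a calculator that reads the optional "//<delimiter>\n" header and then splits on that delimiter as well as on commas and newlines. It assumes the custom delimiter is a single character. The numbers helper and the HEADER constant are names made up for this illustration; the trailing-newline check is carried over from the question's production code.

```ruby
module StringCalculator
  # Optional header of the form "//<single char>\n" (assumed format).
  HEADER = %r{\A//(.)\n}

  # Hypothetical helper: returns the number strings found in the input.
  def self.numbers(input)
    delimiters = [",", "\n"]
    if (header = input.match(HEADER))
      delimiters << header[1]
      input = header.post_match
    end
    # Regexp.union escapes special characters such as "+".
    input.split(Regexp.union(delimiters))
  end

  def self.add(input)
    return nil if input.end_with?("\n")
    numbers(input).map(&:to_i).sum
  end
end

p StringCalculator.numbers("//;\n1;2") # => ["1", "2"]
p StringCalculator.add("//+\n2+2")     # => 4
```

With this approach split never sees the header, so no empty strings end up in the array.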

Related

Replace words in a string with the words defined as values in a hash

The aim is to replace specific words in a string by the values defined in the dictionary.
dictionary =
{"Hello" => "hi",
"to, two, too" => "2",
"for, four" => "4",
"be" => "b",
"you" => "u",
"at" => "#",
"and" => "&"
}
def word_substituter(tweet)
tweet_array = tweet.split(',') ##converting the string to array
tweet_array.each do |word|
if word === dictionary.keys ##if the words of array are equal to the keys of the dictionary
word == dictionary.values ##then the words become the values of the dictionary
puts word
end
end
word.join(", ")
end
word_substituter("Hey guys, can anyone teach me how to be cool? I really want to be the best at everything, you know what I mean? Tweeting is super fun you guys!!!!")
I would appreciate the help. Could you explain it?
Naive words enumeration
DICTIONARY = {
"Hello" => "hi",
"to, two, too" => "2",
"for, four" => "4",
"be" => "b",
"you" => "u",
"at" => "#",
"and" => "&"
}.freeze
def word_substituter(tweet)
dict = {}
DICTIONARY.keys.map { |k| k.split(', ') }.flatten.each do |w|
DICTIONARY.each { |k, v| dict.merge!(w => v) if k.include?(w) }
end
tweet.split(' ').map do |s|
dict.each { |k, v| s.sub!(/#{k}/, v) if s =~ /\A#{k}[[:punct:]]*\z/ }
s
end.join(' ')
end
word_substituter("Hey guys, I'm Robe too. Can anyone teach me how to be cool? I really want to be the best at everything, you know what I mean? Tweeting is super fun you guys!!!!")
# => "Hey guys, I'm Robe 2. Can anyone teach me how 2 b cool? I really want 2 b the best # everything, u know what I mean? Tweeting is super fun u guys!!!!"
I feel like this provides a fairly simple solution to this:
DICTIONARY = {
"Hello" => "hi",
"to, two, too" => "2",
"for, four" => "4",
"be" => "b",
"you" => "u",
"at" => "#",
"and" => "&"
}.freeze
def word_substituter(tweet)
tweet.dup.tap do |t|
DICTIONARY.each { |key, replacement| t.gsub!(/\b(#{key.split(', ').join('|')})\b/, replacement) }
end
end
word_substituter("Hey guys, can anyone teach me how to be cool? I really want to be the best at everything, you know what I mean? Tweeting is super fun you guys!!!!")
# => "Hey guys, can anyone teach me how 2 b cool? I really want 2 b the best # everything, u know what I mean? Tweeting is super fun u guys!!!!"
Breaking it down into steps:
the method takes a tweet and creates a copy
it passes this to tap so it's returned from the method call
it iterates through the DICTIONARY
transforms the key into a regex matcher (e.g. /\b(to|two|too)\b/)
passes this to gsub! and replaces any matches with the replacement
If you want this to replace occurrences within words (e.g. what => wh#), you can remove the checks for word boundaries (\b) in the regex.
One gotcha is that if any of your dictionary's keys contain regex matchers, this would need a little rework: if you had "goodbye." => 'later xxx', the dot would match any character and include it in what's replaced. Regexp.escape is your friend here.
Hope this helps - let me know how you get on.
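As a rough sketch of the Regexp.escape point above: Regexp.union escapes each string it is given, so the pattern can be built once from all the individual words and the replacement looked up per match. LOOKUP and PATTERN are names invented for this example, and the dictionary is the one from the question.

```ruby
DICTIONARY = {
  "Hello" => "hi",
  "to, two, too" => "2",
  "for, four" => "4",
  "be" => "b",
  "you" => "u",
  "at" => "#",
  "and" => "&"
}.freeze

# One entry per individual word, e.g. "two" => "2" (hypothetical helper constant).
LOOKUP = DICTIONARY.each_with_object({}) do |(keys, replacement), table|
  keys.split(', ').each { |word| table[word] = replacement }
end.freeze

# Regexp.union escapes each word, so regex metacharacters in a key are matched literally.
PATTERN = /\b(?:#{Regexp.union(LOOKUP.keys).source})\b/

def word_substituter(tweet)
  # The block form of gsub inserts the looked-up value as-is.
  tweet.gsub(PATTERN) { |word| LOOKUP[word] }
end

puts word_substituter("I want to be the best at everything, you know?")
# => I want 2 b the best # everything, u know?
```

This makes a single pass over the tweet instead of one gsub! per dictionary entry.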

Regex to find a key in a string w Rails Ruby?

Given the following string examples:
human_id_2, human_id_44, human_id_3123121, human_id_11111
I'm trying to loop through and extract just the ID/integer.
params.each do |key, value|
if (key.to_s[/human_id_.*/])
theId= key.to_s[/human_id_.*/]
....
end
end
I'm expecting theId to loop through and be 2, 44, etc...
Any idea why theId is not being set properly?
Why not just replace "human_id_" with ""? No need to do regexp for that:
theId = key.gsub("human_id_", "")
This will work and filter out if there's a mixture:
string = "human_id_2, blahblah blah_other_stuff234234, human_id_44, what_up_4545, human_id_3123121, human_id_11111, sdjfhksfh$##$4343894"
string.scan(/(?<=human_id_).*?(?!\d)/)
=> ["2", "44", "3123121", "11111"]
Notice how it ignores the unneeded data.
"human_id_2, human_id_44, human_id_3123121, human_id_11111".scan(/\d+/)
# => ["2", "44", "3123121", "11111"]
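On the "why" part of the question: key.to_s[/human_id_.*/] returns the whole matched text (for example "human_id_2"), not just the number. One way to get only the digits, sketched here with a made-up params hash standing in for the Rails one, is to pass a capture group index as the second argument to String#[]. filter_map needs Ruby 2.7 or later.

```ruby
# Made-up stand-in for Rails' params; only the key names matter here.
params = { "human_id_2" => "a", "name" => "b", "human_id_44" => "c" }

ids = params.keys.filter_map do |key|
  # The second argument asks for capture group 1 (the digits) instead of the whole match.
  key.to_s[/\Ahuman_id_(\d+)\z/, 1]
end

p ids # => ["2", "44"]
```

Keys that do not match yield nil and are dropped by filter_map.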

How do I split on a "." but only if there are non-numbers following it?

I want to split a line by a space, or a "." separating a number in front of it and a non-number behind it. I want to split like:
"10.ABC DEF GHI" # => ["10", "ABC", "DEF", "GHI"]
"10.00 DEF GHI" #=> ["10.00", "DEF", "GHI"]
I have
words = line.strip.split(/(?<=\d)\.|[[:space:]]+/)
But I discovered this doesn't quite do what I want. Although it will split the line:
line = "10.ABC DEF GHI"
words = line.strip.split(/(?<=\d)\.|[[:space:]]+/) # => ["10", "ABC", "DEF", "GHI"]
It will also incorrectly split
line = "10.00 DEF GHI"
line.strip.split(/(?<=\d)\.|[[:space:]]+/) # => ["10", "00", "DEF", "GHI"]
How do I correct my regular expression to only split on the dot if there are non-numbers following the "."?
Add a negative lookahead (?!\d) after \.:
/(?<=\d)\.(?!\d)|[[:space:]]+/
          ^^^^^^
It will fail the match if the . is followed by a digit.
See the Rubular demo.
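A quick check of the pattern against the two lines from the question (SPLITTER is just a name chosen for this snippet):

```ruby
# Split on a dot that follows a digit and is not followed by one, or on whitespace.
SPLITTER = /(?<=\d)\.(?!\d)|[[:space:]]+/

p "10.ABC DEF GHI".split(SPLITTER) # => ["10", "ABC", "DEF", "GHI"]
p "10.00 DEF GHI".split(SPLITTER)  # => ["10.00", "DEF", "GHI"]
```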

How to replace CSV headers

If using the 'csv' library in ruby, how would you replace the headers without re-reading in a file?
foo.csv
'date','foo','bar'
1,2,3
4,5,6
Using a CSV::Table because of this answer
Here is a working solution, however it requires writing and reading from a file twice.
require 'csv'
@csv = CSV.table('foo.csv')
# Perform additional operations, like remove specific pieces of information.
# Save fixed csv to a file (with incorrect headers)
File.open('bar.csv','w') do |f|
f.write(@csv.to_csv)
end
# New headers
new_keywords = ['dur','hur', 'whur']
# Reopen the file, replace the headers, and print it out for debugging
# Not sure how to replace the headers of a CSV::Table object, however I *can* replace the headers of an array of arrays (hence the file.open)
lines = File.readlines('bar.csv')
lines.shift
lines.unshift(new_keywords.join(',') + "\n")
puts lines.join('')
# TODO: re-save file to disk
How could I modify the headers without reading from disk twice?
'dur','hur','whur'
1,x,3
4,5,x
Update
For those curious, here is the unabridged code. In order to use things like delete_if() the CSV must be imported with the CSV.table() function.
Perhaps the headers could be changed by converting the csv table into an array of arrays, however I'm not sure how to do that.
Given a test.csv file whose contents look like this:
id,name,age
1,jack,26
2,jill,27
You can replace the header row using this:
require 'csv'
array_of_arrays = CSV.read('test.csv')
p array_of_arrays # => [["id", "name", "age"],
# => ["1", "jack", "26"],
# => ["2", "jill", "27"]]
new_keywords = ['dur','hur','whur']
array_of_arrays[0] = new_keywords
p array_of_arrays # => [["dur", "hur", "whur"],
# => ["1", "jack", "26"],
# => ["2", "jill", "27"]]
Or if you'd rather preserve your original two-dimensional array:
new_array = Array.new(array_of_arrays)
new_array[0] = new_keywords
p new_array # => [["dur", "hur", "whur"],
# => ["1", "jack", "26"],
# => ["2", "jill", "27"]]
p array_of_arrays # => [["id", "name", "age"],
# => ["1", "jack", "26"],
# => ["2", "jill", "27"]]
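The question starts from a CSV::Table rather than an array of arrays. A possible bridge, sketched under the assumption that the table fits in memory: CSV::Table#to_a returns the headers as the first row followed by the data rows, so the same first-row replacement applies without a round trip through disk. The inline data string below is a stand-in for the file.

```ruby
require 'csv'

data = "id,name,age\n1,jack,26\n2,jill,27\n" # stand-in for the file contents

table = CSV.parse(data, headers: true) # returns a CSV::Table
rows = table.to_a                      # headers first, then the field rows
rows[0] = ['dur', 'hur', 'whur']       # swap in the new headers

output = rows.map(&:to_csv).join       # Array#to_csv is added by the csv library
puts output
# dur,hur,whur
# 1,jack,26
# 2,jill,27
```

Any delete_if or similar table operations would run on table before the to_a call.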

Extract the last word in sentence/string?

I have an array of strings, of different lengths and contents.
Now I'm looking for an easy way to extract the last word from each string, without knowing how long that word is or how long the string is.
Something like:
array.each{|string| puts string.fetch(" ", last)
This should work just fine
"my random sentence".split.last # => "sentence"
To exclude punctuation, delete it:
"my random sentence..,.!?".split.last.delete('.!?,') #=> "sentence"
To get the "last words" as an array from an array, use collect:
["random sentence...", "lorem ipsum!!!"].collect { |s| s.split.last.delete('.!?,') } # => ["sentence", "ipsum"]
array_of_strings = ["test 1", "test 2", "test 3"]
array_of_strings.map{|str| str.split.last} #=> ["1","2","3"]
["one two", "three four five"].collect { |s| s.split.last }
=> ["two", "five"]
"a string of words!".match(/(.*\s)*(.+)\Z/)[2] #=> 'words!'
This catches everything from the last whitespace on, so it includes the punctuation.
To extract that from an array of strings, use it with collect:
["a string of words", "Something to say?", "Try me!"].collect {|s| s.match(/(.*\s)*(.+)\Z/)[2] } #=> ["words", "say?", "me!"]
The problem with all of these solutions is that they only consider spaces as word separators. Using a regex, you can treat any non-word character as a word separator. Here is what I use:
str = 'Non-space characters, like foo=bar.'
str.split(/\W/).last
# => "bar"
This is the simplest way I can think of.
hostname> irb
irb(main):001:0> str = 'This is a string.'
=> "This is a string."
irb(main):002:0> words = str.split(/\s+/).last
=> "string."
irb(main):003:0>
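To combine the ideas above (take the last item, ignore punctuation), one option is to scan for runs of word characters and keep the last one. This is only a sketch: last_word is a name invented here, and \w counts digits and underscores as word characters but breaks on apostrophes and hyphens.

```ruby
# Hypothetical helper: last run of word characters, or nil if there is none.
def last_word(string)
  string.scan(/\w+/).last
end

p last_word("my random sentence..,.!?") # => "sentence"
p ["a string of words", "Something to say?", "Try me!"].map { |s| last_word(s) }
# => ["words", "say", "me"]
```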

Resources