ruby/regex getting the first letter of each word - ruby

I want to take the first letter of each word and join them together, so that something like "I need help" becomes "Inh". I was thinking of either trimming everything else off and going from there, or grabbing each first letter right away.

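You could simply use split, map and join together here. (String#first is provided by ActiveSupport; in plain Ruby, take the first character with w[0] or chr.)
string = 'I need help'
result = string.split.map { |w| w[0] }.join
puts result #=> Inh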

How about regular expressions? Using split forces you to deal with parts of the string that you don't need for this problem, and then takes a further step to extract the first letter of each word (chr). That's why I think a regular expression is the better fit here. Note that this also works if the string contains a - or another special character. And of course you can add .upcase at the end to get a proper acronym.
string = 'something - something and something else'
string.scan(/\b\w/).join
#=> "ssase"
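A small sketch comparing the two approaches on a couple of inputs. The helper names acronym_scan and acronym_split are hypothetical and only there for illustration. One thing worth checking against your own data: \b also sees a word boundary after an apostrophe, so the scan version picks up an extra letter in words like "don't", while the split version keeps a standalone "-" as if it were a word.

```ruby
# Hypothetical helpers for comparison; the names are illustrative only.
def acronym_scan(str)
  # \b\w: a word character that directly follows a word boundary
  str.scan(/\b\w/).join.upcase
end

def acronym_split(str)
  # split on whitespace, then take the first character of each piece
  str.split.map(&:chr).join.upcase
end

puts acronym_scan("I need help")            # INH
puts acronym_split("I need help")           # INH
puts acronym_scan("don't stop")             # DTS
puts acronym_split("don't stop")            # DS
puts acronym_split("something - something") # S-S
```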

Alternative solution using regex:
string = 'I need help'
result = string.scan(/(\A\w|(?<=\s)\w)/).flatten.join
puts result
This basically says "look for either the first character of the string or any word character directly preceded by whitespace". Because the pattern contains a capture group, scan returns an array of arrays of matches, which is flattened (made into one array) and joined (made into a string).
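To illustrate the point about nested arrays: scan only wraps each match in its own array when the pattern contains a capture group, so dropping the parentheses makes flatten unnecessary. A minimal sketch using the same sample string (the variable names grouped and flat are illustrative):

```ruby
string = 'I need help'

# With a capture group, each match is wrapped in its own array.
grouped = string.scan(/(\A\w|(?<=\s)\w)/)
p grouped          # [["I"], ["n"], ["h"]]

# Without the group, scan returns the matched strings directly.
flat = string.scan(/\A\w|(?<=\s)\w/)
p flat             # ["I", "n", "h"]

puts flat.join     # Inh
```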

string = 'I need help'
result = string.split.map(&:chr).join
puts result
http://ruby-doc.org/core-2.0/String.html#method-i-chr

Related

opposite of sub in ruby

I want to replace the content (or delete it) that does not match with my filter.
I think the perfect description would be an opposite sub. I cannot find anything similar in the docs, and I'm not sure how to invert the regex, but I think a method would probably be the more convenient.
An example of how it would work (I've just changed the words to make it clearer):
"bird.cats.dogs".opposite_sub(/(dogs|cats)\.(dogs|cats)/, '')
#=> "cats.dogs"
I hope it's easy enough to understand.
Thanks in advance.
String#[] can take a regular expression as its parameter:
▶ "bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#⇒ "cats.dogs"
For multiple matches one can use String#scan:
▶ "bird.cats.dogs.bird.cats.dogs".scan /(?:dogs|cats)\.(?:dogs|cats)/
#⇒ ["cats.dogs", "cats.dogs"]
So you want to extract the part that matches your regex?
You can use String#slice, for example:
"bird.cats.dogs".slice(/(dogs|cats)\.(dogs|cats)/)
#=> "cats.dogs"
And String#[] does the same.
"bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#=> "cats.dogs"
You cannot have a single replacement string because the part of the string that matches the regex might not be at the beginning or end of the string, in which case it's not clear whether the replacement string should precede or follow the matching string. I've therefore written the following with two replacement strings, one for the pre-match and the other for the post-match. I've made this a method of the String class as that's what you've asked for (though I've given the method a less-perfect name :-) )
class String
  def replace_non_matching(regex, replace_before, replace_after)
    # partition returns [pre-match, match, post-match]; only the match is kept
    _before, match, _after = partition(regex)
    replace_before + match + replace_after
  end
end
r = /(dogs|cats)\.(dogs|cats)/
"birds.cats.dogs.pigs".replace_non_matching(r, "", "")
#=> "cats.dogs"
"birds.cats.dogs".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
"birds.cats.dogs.mice.cats.dogs.bats".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
Regarding the last example, the method could be modified to replace "birds.", ".mice." and ".bats", but in that case three replacement strings would be needed. In general, determining in advance the number of replacement strings needed could be problematic.
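If a single filler string for every non-matching stretch is acceptable, the multiple-match case can be sketched differently: collect every full match and join the matches with that filler. The method name keep_matches and its filler parameter are hypothetical, not part of the answer above. The sketch relies on String#gsub returning an enumerator over the matched strings when it is called with only a pattern, which avoids scan's behaviour of returning capture groups instead of whole matches.

```ruby
# Hypothetical helper: keep only the parts that match, joined by `filler`.
def keep_matches(str, regex, filler = "")
  # gsub with just a pattern (no replacement, no block) returns an
  # enumerator that yields each full match, even if the regex has groups.
  str.gsub(regex).to_a.join(filler)
end

r = /(dogs|cats)\.(dogs|cats)/
puts keep_matches("bird.cats.dogs", r)                            # cats.dogs
puts keep_matches("birds.cats.dogs.mice.cats.dogs.bats", r, "|")  # cats.dogs|cats.dogs
```

A string with no match at all yields an empty string, which may or may not be what you want.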

Issue with multiple wildcard symbols when iterating in array with `.gsub!`

I am trying to figure out how to replace multiple characters in an array of strings by using multiple wildcards (or some other method, if someone knows a better one). Each element in the array is a telephone number and a date (e.g. 8675309,2015-01-20). I am trying to remove only the comma and the date, so that each element in the array is just the telephone number.
When iterating over each element in the array, I obtained the expected results by calling .gsub! to replace a single character in each element.
file_data = ["8675309,2015-01-20"]
puts file_data[0] #=> 8675309,2015-01-20
file_data.each do |s|
s.gsub!(/0/, "X")
end
puts file_data[0] #=> 86753X9,2X15-X1-2X
To eliminate the comma and date, I tried simply using wildcards, calling s.gsub!(",****/**/**", ""). Then, this shows unexpected results:
file_data = ["8675309,2015-01-20"]
file_data.each do |s|
s.gsub!(/,****-**-**/, "")
end
puts file_data[0] #=> 8675309,2015-01-20
I also tried several other wildcard characters that have been suggested in other threads ('.' and '^'), but the results have not changed.
I am lost on how to eliminate the comma and date in each element while leaving the primary number intact. I thought .gsub! would be the proper method, but am open to any alternatives as well. Any help is appreciated.
At first glance, I might use String#split to get the phone number:
file_data = ["8675309,2015-01-20"]
phone_numbers = file_data.map {|s| s.split(',').first }
phone_numbers[0] #=> "8675309"
Or, if the phone number is always 7 characters, I might get a string subset with []:
file_data.map {|s| s[0,7] }
Or, if you really want to stick with a regular expression:
file_data.each do |s|
s.gsub!(/,.*\z/, '')
end
This reads as: replace everything from the first comma to the end of the string with nothing.
The way you are handling wildcards is excessive. Why are you using wildcards when you know what you want to sub? Removing commas and the date (as long as the date is always the same format) should be simple:
name = "8675309,2015-01-20"
name.gsub!(/,\d{4}-\d{2}-\d{2}/,"")
Use String#partition:
name.partition(',')[0]
#=> "8675309"
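The underlying confusion is that * is not a wildcard here: in a regular expression it is a quantifier ("zero or more of the preceding item"), and in a plain string pattern it is a literal asterisk. A non-mutating sketch, assuming every entry has the form number,YYYY-MM-DD; the second sample entry and the variable names are made up for illustration.

```ruby
file_data = ["8675309,2015-01-20", "5551234,2016-11-05"]

# A string pattern is matched literally, so nothing is found here;
# gsub! returns nil when it makes no substitution.
literal_result = file_data[0].dup.gsub!(",****/**/**", "")
p literal_result   # nil

# \d{n} matches exactly n digits; \z anchors at the end of the string.
date_suffix = /,\d{4}-\d{2}-\d{2}\z/

# map + sub builds a new array and leaves file_data untouched.
phone_numbers = file_data.map { |s| s.sub(date_suffix, "") }
p phone_numbers    # ["8675309", "5551234"]
```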

Take an array and a letter as arguments and return a new array with words that contain that letter

I can run a search and find the element I want and can return those words with that letter. But when I start to put arguments in, it doesn't work. I tried select with include? and it throws an error saying, private method. This is my code, which returns what I am expecting:
my_array = ["wants", "need", 3, "the", "wait", "only", "share", 2]
def finding_method(source)
  words_found = source.grep(/t/) # I just picked a random letter
  print words_found
end
puts finding_method(my_array)
# => ["wants", "the", "wait"]
I need to add the second argument, but it breaks:
def finding_method(source, x)
  words_found = source.grep(/x/)
  print words_found
end
puts finding_method(my_array, "t")
This doesn't work, (it returns an empty array because there isn't an 'x' in the array) so I don't know how to pass an argument. Maybe I'm using the wrong method to do what I'm after. I have to define 'x', but I'm not sure how to do that. Any help would be great.
Regular expressions support string interpolation just like strings.
/x/
looks for the character x.
/#{x}/
will first interpolate the value of the variable and produce /t/, which does what you want. Mostly.
Note that if you are trying to search for any text that might have any meaning in regular expression syntax (like . or *), you should escape it:
/#{Regexp.quote(x)}/
That's the correct approach whenever you interpolate a literal string into a regular expression, that is, a string you haven't written specifically to be regular expression syntax. That covers about 99% of cases where you're interpolating variables into regexps.
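Putting the interpolation and the escaping together, the method from the question might look like the sketch below. The parameter name letter and the second helper finding_with_include are illustrative choices, and the methods return the array rather than printing it so the caller decides what to do with the result.

```ruby
# A sketch of the method with the letter passed in as an argument.
def finding_method(source, letter)
  # Regexp.escape makes characters such as "." or "*" match literally.
  source.grep(/#{Regexp.escape(letter)}/)
end

# The same idea without a regex: skip non-strings, then use include?.
def finding_with_include(source, letter)
  source.select { |item| item.is_a?(String) && item.include?(letter) }
end

my_array = ["wants", "need", 3, "the", "wait", "only", "share", 2]
p finding_method(my_array, "t")        # ["wants", "the", "wait"]
p finding_method(my_array, ".")        # []
p finding_with_include(my_array, "t")  # ["wants", "the", "wait"]
```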

Capture arbitrary string before either '/' or end of string

Suppose I have:
foo/fhqwhgads
foo/fhqwhgadshgnsdhjsdbkhsdabkfabkveybvf/bar
I want to replace everything that follows 'foo/' up to the next '/' or, if there is no further '/', up to the end of the line. For the first part I can use a lookbehind like this:
(?<=foo\/).+
And that's where I get stuck. I could match to the second '/' like this:
(?<=foo\/).+(?=\/)
That doesn't help for the first case though. Desired output is:
foo/blah
foo/blah/bar
I'm using Ruby.
Try this regex:
/(?<=foo\/)[^\/]+/
Implementing @Endophage's answer:
def fix_post_foo_portion(string)
  portions = string.split("/")
  index_to_replace = portions.index("foo") + 1
  portions[index_to_replace] = "blah"
  portions.join("/")
end
strings = %w{foo/fhqwhgads foo/fhqwhgadshgnsdhjsdbkhsdabkfabkveybvf/bar}
strings.each {|string| puts fix_post_foo_portion(string)}
I'm not a ruby dev but is there some equivalent of php's explode() so you could explode the string, insert a new item at the second array index then implode the parts with / again... Of course you can match on the first array element if you only want to do the switch in certain cases.
['foo/fhqwhgads', 'foo/fhqwhgadshgnsdhjsdbkhsdabkfabkveybvf/bar'].each do |s|
puts s.sub(%r|^(foo/)[^/]+(/.*)?|, '\1blah\2')
end
Output:
foo/blah
foo/blah/bar
I'm too tired to think of a nicer way to do it but I'm sure there is one.
Checking for the end-of-line anchor -- $ -- as well as the / character should do the trick (in Ruby, $ matches at the end of each line; use \z if you need the end of the whole string). You'll also need to make the .+ non-greedy by changing it to .+?, since the greedy version will always match right up to the end of the line, given the chance.
(?<=foo\/).+?(?=\/|$)
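For comparison, here is a sketch that applies both patterns from this thread, the negated character class and the lazy quantifier with a lookahead, to the two sample strings using String#sub. The variable names by_class and by_lazy are illustrative.

```ruby
strings = %w[foo/fhqwhgads foo/fhqwhgadshgnsdhjsdbkhsdabkfabkveybvf/bar]

# Negated character class: everything after "foo/" up to the next "/".
by_class = strings.map { |s| s.sub(/(?<=foo\/)[^\/]+/, "blah") }

# Lazy quantifier plus a lookahead for "/" or the end of the line.
by_lazy = strings.map { |s| s.sub(/(?<=foo\/).+?(?=\/|$)/, "blah") }

p by_class   # ["foo/blah", "foo/blah/bar"]
p by_lazy    # ["foo/blah", "foo/blah/bar"]
```

The character-class version is usually the easier of the two to read, since it needs neither a lazy quantifier nor a lookahead.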

Capitalization of strings

Let us imagine, that we have a simple abstract input form, whose aim is accepting some string, which could consist of any characters.
string = "mystical characters"
We need to process this string by making first character uppercased. Yes, that is our main goal. Thereafter we need to display this converted string in some abstract view template. So, the question is: do we really need to check whether the first character is already written correctly (uppercased) or we are able to write just this?
theresult = string.capitalize
=> "Mystical characters"
Which approach is better: check and then capitalize (if need) or force capitalization?
Check first whether you need to process anything, because String#capitalize doesn't only convert the first character to uppercase; it also converts all other characters to lowercase. So:
"First Lastname".capitalize == "First lastname"
That might not be the wanted result.
If I understood correctly you are going to capitalize the string anyway, so why bother checking if it's already capitalized?
Based on Tonttu's answer, I would suggest not worrying too much and just capitalizing like this:
new_string = string[0...1].capitalize + string[1..-1]
I ran into Tonttu's problem when importing a bunch of names. I went with:
strs = "first lastname".split(" ")
return_string = ""
strs.each do |str|
  return_string += "#{str[0].upcase}#{str[1..-1].downcase} "
end
return_string.chop
EDIT: The inevitable refactor (over a year) later.
"first lastname".split(" ").map do |str|
  "#{str[0].upcase}#{str[1..-1].downcase}"
end.join(' ')
While definitely not easier to read, it gets the same result while declaring fewer temporary variables.
I guess you could write something like:
string.capitalize unless string =~ /^[A-Z].*/
Personally I would just do string.capitalize
Unless you have a flag for already-capitalized strings that you are going to check, just capitalize without checking.
Also the capitalization itself is probably performing some checking.
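To make the difference concrete, here is a minimal sketch of a helper that uppercases only the first character and leaves the rest of the string alone. The name upcase_first is hypothetical, and the empty-string guard is an assumption about how you would want that edge case handled.

```ruby
# Hypothetical helper: uppercase only the first character.
def upcase_first(str)
  return "" if str.empty?        # nothing to change in an empty string
  str[0].upcase + str[1..-1]     # rest of the string is left untouched
end

puts upcase_first("mystical characters")   # Mystical characters
puts upcase_first("first Lastname")        # First Lastname
puts "first Lastname".capitalize           # First lastname
```

Either way no prior check is needed: calling the helper (or capitalize) on a string that already starts with an uppercase letter leaves that first letter unchanged.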
