Trying to concat 2 arrays that are output from the map method - ruby

I'm doing a 'morse code' exercise and running into some difficulty. I'll skip posting the hash I created that stores the code and the letter.
The morse code in the method call has 3 spaces between the 'words'. Example:
decodeMorse('.... . -.--   .--- ..- -.. .')
My strategy was to split the words first using split(/\s\s\s/) which gives me separate arrays for each word, but then those arrays need a split(' ') to get to the letters.
This is my code -
sc = str.split(/\s\s\s/)
sc.each do |string|
string.split(' ').map {|key| morsecode[key]; }
end
It works okay, but I'm left with two arrays at the end:
=> ["h", "e", "y"]
=> ["j", "u", "d", "e"]
Normally, if I had two or more arrays that had assigned variable names I would know how to concat them but what I've tried and searched on hasn't changed the situation. Obviously all I get from join('') is the two words together with no space between them.

There is no need to convert the string to an array, then convert the elements of the array to arrays, join the latter arrays then join the former array. Instead one can simply use the form of String#gsub that employs a hash to make substitutions.
morsecode = { ".-"=>"a", "-..."=>"b", "-.-."=>"c", "-.."=>"d", "."=>"e", "..-."=>"f",
"--."=>"g", "...."=>"h", ".."=>"i", ".---"=>"j", "-.-"=>"k", ".-.."=>"l",
"--"=>"m", "-."=>"n", "---"=>"o", ".--."=>"p", "--.-"=>"q", ".-."=>"r",
"..."=>"s", "-"=>"t", "..-"=>"u", "...-"=>"v", ".--"=>"w", "-..-"=>"x",
"-.--"=>"y", "--.."=>"z", "   "=>" ", " "=>""}
Notice the last two key-value pairs in morsecode: three spaces map to a single space, and one space maps to the empty string.
'.... . -.--   .--- ..- -.. .'.gsub(/[.-]+|   | /, morsecode)
#=> "hey jude"
The regular expression reads, "match one or more dits and dahs or three spaces or one space". Note that three spaces must precede the single space in the regex.
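To see why the order of the alternatives matters, here is a minimal sketch. The variable `table` is a hypothetical, trimmed-down version of the hash above (only the letters needed for this message), and `right_order` / `wrong_order` are names chosen for illustration. Alternation tries branches left to right, so if the single space comes first, a word gap is consumed as three separate letter gaps.

```ruby
# Hypothetical trimmed-down table: just enough entries for "hey jude".
table = {
  "...." => "h", "." => "e", "-.--" => "y",
  ".---" => "j", "..-" => "u", "-.." => "d",
  "   " => " ", # three spaces: gap between words
  " "   => ""   # one space: gap between letters
}.freeze

message = ".... . -.--   .--- ..- -.. ."

# Three spaces listed before the single space: word gaps survive.
right_order = message.gsub(/[.-]+| {3}| /, table)

# Single space listed first: each word gap matches as three letter gaps.
wrong_order = message.gsub(/[.-]+| | {3}/, table)

puts right_order # hey jude
puts wrong_order # heyjude
```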

sc = str.split(/\s\s\s/)
deciphered = sc.map do |string|
string.split(' ').map {|key| morsecode[key]; }.join
end
deciphered.join(' ')

Related

Splitting the string with last underscore

I have a string like "a_b_c" or "a_b_c_d" or "a_b_c_d_e". I want to split the string at the last underscore.
**input**
'a_b_c'
**output**
a_b
c
**input**
'a_b_c_d'
**output**
a_b_c
d
I have done the following:
a='a_b_c'
a=a.split('_')
last=a.pop
a.delete(last)
p a.join("_")
p last
and achieved the result, but I don't think this should be done this way. I hope there is some regular expression to achieve this. Is there anyone who can help me with this?
You can use String#rpartition, which searches for a given pattern from the right end of the string and splits where it finds it.
'a_b_c_d_e'.rpartition(/_/)
=> ["a_b_c_d", "_", "e"]
s = 'a_b_c_d_e'
parts = s.rpartition(/_/)
[parts.first, parts.last]
=> ["a_b_c_d", "e"]
EDIT: applying advice from the comments:
'a_b_c_d_e'.rpartition('_').values_at(0,2)
=> ["a_b_c_d", "e"]
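One thing the rpartition approach leaves open is what happens when the string contains no underscore at all: rpartition then returns two empty strings followed by the whole string. A small sketch that handles this case, where `split_at_last` is a hypothetical helper name and returning the string unsplit is just one possible choice:

```ruby
# Hypothetical helper: split at the last occurrence of sep.
# Returns [head, tail], or [str] when sep does not occur.
def split_at_last(str, sep = "_")
  head, found, tail = str.rpartition(sep)
  found.empty? ? [str] : [head, tail]
end

p split_at_last("a_b_c")   # ["a_b", "c"]
p split_at_last("a_b_c_d") # ["a_b_c", "d"]
p split_at_last("abc")     # ["abc"]
```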
Do you really need to split? How about just replacing the _ with a space? e.g. using rindex and []=
a[a.rindex('_')] = ' '
I didn't do a benchmark, but split creates a new array, which typically requires more resources, at least in other languages.
EDIT: as the question was edited, it's now clear the OP is asking for a list instead of a string output
You can also get values as below,
> a = a.split('_')
> a[0..-2].join('_')
# => "a_b_c_d"
> a[-1]
# => "e"
'a_b_c_d_e'.split /_(?!.*_)/
#=> ["a_b_c_d", "e"]
The negative lookahead (?!.*_) requires that following the match of the underscore there is no other underscore in the string.
Split it with regex:
a.split(/_(?=[^_]+$)/)
Explanation:
_ matches a literal underscore, and the positive lookahead (?=[^_]+$) requires that it be followed by:
[^_]+, one or more characters that are not underscores, and
$, which asserts the position at the end of a line (in Ruby, \z asserts the end of the whole string)
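A quick sketch comparing the two lookahead patterns from the answers above. On the inputs from the question they agree. They differ on a string that ends with an underscore, an edge case the question does not mention and which is included here only for illustration. The variable names `negative` and `positive` are made up for this example.

```ruby
negative = /_(?!.*_)/    # underscore with no later underscore
positive = /_(?=[^_]+$)/ # underscore followed only by non-underscores

p "a_b_c".split(negative)   # ["a_b", "c"]
p "a_b_c_d".split(positive) # ["a_b_c", "d"]

# Edge case: trailing underscore.
p "a_b_".split(negative) # ["a_b"]  (split drops the trailing empty string)
p "a_b_".split(positive) # ["a_b_"] (the lookahead needs at least one character)
```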
Assuming you know this string follows this format:
str = 'a_b_c_d_e'
# Remainder
str[0...-2] # -> 'a_b_c_d'
# Last symbol
str[-1] # -> 'e'

How does this regexp split on the first vowel?

This code splits a word into two strings at the first vowel. Why?
word = "banana"
parts = word.split(/([aeiou].*)/)
The key here is the regular expression (or regex) that is being used between the two /'s
[aeiou] says to look for the first instance of one of those characters.
. matches any single character
* modifies the previous thing to mean match 0 or more of it
(...) means capture everything enclosed between the parentheses
Translated to English, this regular expression might read something like "Given a string, find the first vowel that is followed by zero or more characters. Collect that vowel and its following characters and set them aside."
The slightly more confusing part is the regex's interaction with the split method. The text the regex matches is 'anana'. And we can see that calling split with 'anana' doesn't have the same result:
'banana'.split('anana') #=> ["b"]
But when split is called with a regular expression that uses a capture group - or parentheses (...), then anything in that capture group will also be returned in the result of the split. Which is why:
'banana'.split /([aeiou].*)/ #=> ["b", "anana"]
If you want to learn more about how regular expressions work (particularly in ruby), Rubular is a great resource to fiddle with - http://www.rubular.com/r/XEUgPhOdlH
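To make the capture-group behaviour concrete, here is a small sketch that looks at the match on its own and then tries words the question does not cover (one starting with a vowel, one with no vowel). The extra words are illustrative choices, not from the original post.

```ruby
pattern = /([aeiou].*)/

# What the regex matches by itself: the first vowel through the end.
p "banana"[pattern] # "anana"

# split returns the text before the match, then the captured group.
p "banana".split(pattern) # ["b", "anana"]

# Word starting with a vowel: the text before the match is empty.
p "apple".split(pattern)  # ["", "apple"]

# No vowel: nothing to split on, so the word comes back whole.
p "rhythm".split(pattern) # ["rhythm"]
```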
This is actually a bit tricky. This regexp
/[aeiou].*/
matches the string from the first vowel to the end of the string i.e. "anana". But if you were to split on that, you would only get the first letter since split doesn't include the splitting pattern:
"banana".split /[aeiou].*/
# ["b"]
But according to the String#split docs, if the splitting pattern is a regexp with a capture group, the capture groups are included in the result as well. Since the whole pattern is wrapped in a capture group, the result is that the string splits before the first vowel.
For example, if you change the regexp to have two capture groups, it splits further:
"banana".split /([aeiou])(.*)/
# ["b", "a", "nana"]
ANSWER FOR OLD TITLE
This is not really Ruby syntax; it is standard regular expression syntax, which Ruby also implements.
* means zero or more of the previous item
. means any character
[aeiou] means any one of the characters inside the brackets
() means capture it
So that regex means: capture anything that starts with a, e, i, o, or u.
word.split(/([aeiou].*)/) means: split the word variable on anything that starts with the letter a, e, i, o, or u.
See here for more information.
ANSWER FOR NEW TITLE
Why does it split on the first vowel? Not quite. What it does is split on anything that starts with a vowel, and also capture that text (the string that starts with a vowel). See more examples here:
word = 'banana'
word.split /[aeiou]/ # split on vowels
#=> ["b", "n", "n"]
word.split /([aeiou])/ # split on vowels and capture the vowels
#=> ["b", "a", "n", "a", "n", "a"]
word.split /[aeiou].*/ # split on anything that starts with a vowel
#=> ["b"]
word.split /([aeiou].*)/ # split on anything that starts with a vowel, and capture that text as well
#=> ["b", "anana"]
ANSWER FOR OLD TITLE
If the * symbol is not inside a regular expression literal // (Ruby syntax), there are a few possibilities:
multiplication 2 * 3 == 6, 'na' * 3 == 'nanana' # batman!
splat operation [*(1..4)] == [1,2,3,4], see more info here

Split string into a list, but keeping the split pattern

Currently I am splitting a string by a pattern, like this:
outcome_array=the_text.split(pattern_to_split_by)
The problem is that the pattern I split by always gets omitted.
How do I get it to include the split pattern itself?
Thanks to Mark Wilkins for inspiration, but here's a shorter bit of code for doing it:
irb(main):015:0> s = "split on the word on okay?"
=> "split on the word on okay?"
irb(main):016:0> b=[]; s.split(/(on)/).each_slice(2) { |s| b << s.join }; b
=> ["split on", " the word on", " okay?"]
or:
s.split(/(on)/).each_slice(2).map(&:join)
See below the fold for an explanation.
Here's how this works. First, we split on "on", but wrap it in parentheses to make it into a match group. When there's a match group in the regular expression passed to split, Ruby will include that group in the output:
s.split(/(on)/)
# => ["split ", "on", " the word ", "on", " okay?"]
Now we want to join each instance of "on" with the preceding string. each_slice(2) helps by passing two elements at a time to its block. Let's just invoke each_slice(2) to see what results. Since each_slice, when invoked without a block, returns an Enumerator, we'll apply to_a to it so we can see what it will enumerate over:
s.split(/(on)/).each_slice(2).to_a
# => [["split ", "on"], [" the word ", "on"], [" okay?"]]
We're getting close. Now all we have to do is join the words together. And that gets us to the full solution above. I'll unwrap it into individual lines to make it easier to follow:
b = []
s.split(/(on)/).each_slice(2) do |s|
b << s.join
end
b
# => ["split on", " the word on", " okay?"]
But there's a nifty way to eliminate the temporary b and shorten the code considerably:
s.split(/(on)/).each_slice(2).map do |a|
a.join
end
map passes each element of its input array to the block; the result of the block becomes the new element at that position in the output array. In MRI >= 1.8.7, you can shorten it even more, to the equivalent:
s.split(/(on)/).each_slice(2).map(&:join)
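The split / each_slice / join chain above can be wrapped in a small method. This is only a sketch: `split_keeping_delimiter` is a hypothetical name, and it assumes the delimiter is a plain string (it is escaped and wrapped in a single capture group, since extra groups in the pattern would add extra elements and break the pairing).

```ruby
# Hypothetical helper: split text on a literal delimiter and keep the
# delimiter attached to the end of the preceding piece.
def split_keeping_delimiter(text, delimiter)
  pattern = /(#{Regexp.escape(delimiter)})/ # one capture group only
  text.split(pattern).each_slice(2).map(&:join)
end

p split_keeping_delimiter("split on the word on okay?", "on")
# ["split on", " the word on", " okay?"]

p split_keeping_delimiter("a,b,c", ",")
# ["a,", "b,", "c"]
```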
You could use a regular expression assertion to locate the split point without consuming any of the input. Below uses a positive look-behind assertion to split just after 'on':
s = "split on the word on okay?"
s.split(/(?<=on)/)
=> ["split on", " the word on", " okay?"]
Or a positive look-ahead to split just before 'on':
s = "split on the word on okay?"
s.split(/(?=on)/)
=> ["split ", "on the word ", "on okay?"]
With something like this, you might want to make sure 'on' was not part of a larger word (like 'assertion'), and also remove whitespace at the split:
"don't split on assertion".split(/(?<=\bon\b)\s*/)
=> ["don't split on", "assertion"]
If you use a pattern with groups, it will return the pattern in the results as well:
irb(main):007:0> "split it here and here okay".split(/ (here) /)
=> ["split it", "here", "and", "here", "okay"]
Edit The additional information indicated that the goal is to include the item on which it was split with one of the halves of the split items. I would think there is a simple way to do that, but I don't know it and haven't had time today to play with it. So in the absence of the clever solution, the following is one way to brute force it. Use the split method as described above to include the split items in the array. Then iterate through the array and combine every second entry (which by definition is the split value) with the previous entry.
s = "split on the word on and include on with previous"
a = s.split(/(on)/)
# iterate through and combine adjacent items together and store
# results in a second array
b = []
a.each_index{ |i|
b << a[i] if i.even?
b[b.length - 1] += a[i] if i.odd?
}
print b
Results in this:
["split on", " the word on", " and include on", " with previous"]

Must a gsub hash key be a string, not a regexp?

I want to do a sequence of gsubs against one string, so I utilized the fact that gsub can take a hash as the second argument. One thing I wanted to do with gsub is to convert a sequence of one or more space/tab into a single space, so I have something essentially as follows:
gsub(/[ \t]+/, {/[ \t]+/ => ' '})
In my actual code, the first argument is a union of the regexp I gave here, and the second argument includes more key-value pairs.
Now, when I apply this to a string, all of the spaces/tabs are deleted. I suppose this is because the match for the first argument is not regarded as matching the key /[ \t]+/ in the second argument (the hash). Does the lookup in the second-argument hash only look for an exact string match, not a regexp match? If so, is there any way to get around it?
This is a related question. If you need to use the hash because many things have to be substituted, this might work:
list = Hash.new{|h,k|if /\s+/ =~ k then ' ' else k end}
list['foo'] = 'bar'
list['apple'] = 'banana'
p "appleabc\t \tabc apple foo".gsub(/\w+|\W+/,list)
#=> "appleabc abc banana bar"
p list
#=>{"foo"=>"bar", "apple"=>"banana"} no garbage
According to the docs, gsub with a hash as the second parameter only matches against literal strings:
'hello'.gsub(/[eo]/, 'e' => 3, 'o' => '*') #=> "h3ll*"
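A short sketch of why the whitespace disappears in the question: the matched text is used as the hash key, and when no such key exists the match is replaced with an empty string. The sample `text` is made up for illustration.

```ruby
text = "a \t b"

# Regexp key: the matched text " \t " is not a key in the hash,
# so the lookup yields nil and the match is replaced by "".
p text.gsub(/[ \t]+/, { /[ \t]+/ => " " }) # "ab"

# String key: looked up by the exact matched text.
p text.gsub(/[ \t]+/, { " \t " => " " })   # "a b"
```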
If you want to supply multiple hashes you could work around it by creating a hash, where the key/value pairs are the search => replacement pairs, iterate over the hash, and pass those into the gsub. Because Ruby 1.9+ maintains the insertion order of the hash, you're guaranteed that the search will occur in the order you want.
search_hash = {
'1' => 'one',
'too' => 'two',
/[\t ]+/ => ' '
}
str = "1, too,\t3 , four"
search_hash.each { |n,v| str.gsub!(n, v) }
str #=> "one, two, 3 , four"
If you just want the spaces/tabs replaced with one space, why not just specify that as the replacement, and omit the whole hash?
gsub(/[ \t]+/, ' ')
UPDATE: based on your comment, you can use the block syntax of gsub
gsub(/[ \t]+/) { |match| ' ' } # the block's return value replaces each match; compute it from match as needed
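As a sketch of how the block form could combine literal replacements with a regexp rule in a single pass: the names `replacements`, `pattern`, and `result` are hypothetical, and the sample data is borrowed from the search_hash answer above. Regexp.union builds one pattern from the whitespace regexp plus the literal keys, and the block decides per match.

```ruby
# Literal replacements (string keys only).
replacements = { "1" => "one", "too" => "two" }

# One pattern: runs of spaces/tabs, or any of the literal keys.
pattern = Regexp.union(/[ \t]+/, *replacements.keys)

text = "1, too,\t3 , four"

result = text.gsub(pattern) do |match|
  # Whitespace runs collapse to one space; everything else is a literal key.
  match.match?(/\A[ \t]+\z/) ? " " : replacements.fetch(match)
end

p result # "one, two, 3 , four"
```

Unlike the sequential gsub! loop, this makes a single pass, so text produced by one replacement is never rescanned by a later one.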

Matching and returning value using wildcard with regex in Ruby

Let's say I have a string with a number of words separated by spaces. Each word has a single digit number after it. If I have a word, is it possible to find the word in the string and return it along with the number after it? (using Ruby)
For example:
string = "test0 chance1 already0 again4"
word = "chance"
How can I get a return value of "chance1"?
Update:
/word\d+/.match(string) returns "chance1"
This seems to be working.
Your sample and update don't work:
Update:
/string\d+/.match(word) returns "chance1"
This seems to be working.
Dumping it into irb shows:
>> string = "test0 chance1 already0 again4" #=> "test0 chance1 already0 again4"
>> word = "chance" #=> "chance"
>> /string\d+/.match(word) #=> nil
so that isn't working.
I'd recommend:
>> Hash[*string.scan(/(\w+)(\d)/).flatten]['chance'] #=> "1"
or
>> hash = Hash[*string.scan(/(\w+)(\d)/).flatten]
>> hash['chance'] #=> "1"
>> hash['test'] #=> "0"
>> hash['again'] #=> "4"
It works by scanning for the words ending with a digit, and grabbing the word and the digit separately. String#scan will return an array of arrays, where each inner array contains the groups matched.
>> string.scan(/(\w+)(\d)/) #=> [["test", "0"], ["chance", "1"], ["already", "0"], ["again", "4"]]
Then I flatten it to get a list of words followed by their matching digit
>> string.scan(/(\w+)(\d)/).flatten #=> ["test", "0", "chance", "1", "already", "0", "again", "4"]
and turn it into a hash.
>> Hash[*string.scan(/(\w+)(\d)/).flatten] #=> {"test"=>"0", "chance"=>"1", "already"=>"0", "again"=>"4"}
Then it's a simple case of asking the hash for the value that matches a particular word.
String#scan is powerful but often overlooked. For Perl programmers it's similar to using the m//g pattern match.
Here's a little different way to populate the hash:
>> string.scan(/(\w+)(\d)/).inject({}){|h,a| h[a[0]]=a[1]; h} #=> {"test"=>"0", "chance"=>"1", "already"=>"0", "again"=>"4"}
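On newer Rubies (Array#to_h is available from Ruby 2.1), the array of pairs returned by scan can probably be turned into a hash directly, without flatten or inject. A minimal sketch, with `lookup` as a hypothetical variable name:

```ruby
string = "test0 chance1 already0 again4"

# scan returns [word, digit] pairs; to_h turns pairs into a hash.
lookup = string.scan(/(\w+)(\d)/).to_h

p lookup["chance"]       # "1"
p lookup["again"]        # "4"
p lookup.key?("missing") # false
```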
Split the string into words, then find which one includes the target word.
string.split(' ').find {|s| s.index word}
result = string.match Regexp.new(word + '\d')
This concatenates word with \d (the regexp for a single digit), which in your case compiles to /chance\d/ and matches the word 'chance' with any single digit after it. It then checks the string for a match, so you'd get the 'chance1' in the string.
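A slightly more defensive sketch of the same idea, under two assumptions that go beyond the question: the word might contain regexp metacharacters (hence Regexp.escape), and it should not match inside a longer word (hence the \b word boundaries). `find_tagged_word` is a hypothetical helper name.

```ruby
# Hypothetical helper: return the word plus its trailing digit, or nil.
def find_tagged_word(string, word)
  # String#[] with a regexp returns the matched text, or nil if no match.
  string[/\b#{Regexp.escape(word)}\d\b/]
end

string = "test0 chance1 already0 again4"

p find_tagged_word(string, "chance")  # "chance1"
p find_tagged_word(string, "missing") # nil

# \b keeps "chance" from matching inside "mischance2".
p find_tagged_word("mischance2 chance1", "chance") # "chance1"
```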
