I've just started learning to code in Ruby and have hit a snag in my first script. The idea is to translate the English alphabet into Morse code.
I have set up a hash for my letters and their corresponding values:
morse_code = {
'a' => '.-',
'b' => '-...',
etc etc
I use the following to iterate through the hash and pull the corresponding values based on input then output it:
print "What would you like to translate: "
code = gets.strip.downcase
morse_code.each do |morse, alpha|
code.gsub!( morse, alpha )
end
puts code
The problem is that my output does not contain spacing so looks like this:
......-...-..----
instead of what I want:
.... . .-.. .-.. --- -
Everything I've found so far relates to adding whitespace when calling variables inside a string. Below is an example:
Putting space between the output of defined variables in Ruby
Any help on how I can achieve this with my current code or rewrite it accordingly would be appreciated.
What you need is to take the input and map its characters to corresponding values from the morse_code hash, and then join it with spaces:
code = 'abb'
code.each_char.map { |letter| morse_code[letter] }.join(' ')
#=> ".- -... -..."
Reference:
String#each_char
Enumerable#map
Array#join
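Putting the pieces together, a minimal end-to-end sketch might look like the following. The table is deliberately partial (only the letters needed for the sample word), the method name to_morse is made up for illustration, and silently dropping characters that have no entry is an assumption, not something the question specifies.

```ruby
# Partial table, just enough for the demo word; extend it with the full alphabet.
MORSE_CODE = {
  'h' => '....', 'e' => '.', 'l' => '.-..', 'o' => '---', 't' => '-'
}.freeze

# Hypothetical helper: looks up each character and joins the codes with spaces.
def to_morse(text)
  codes = text.strip.downcase.each_char.map { |letter| MORSE_CODE[letter] }
  codes.compact.join(' ') # assumption: characters without an entry are skipped
end

puts to_morse('Hellot') # prints ".... . .-.. .-.. --- -"
```

Because the lookup goes through the input one character at a time, the hash is never iterated, and no trailing space has to be trimmed afterwards.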
EDIT:
To make your initial code work, the only thing missing is a space, which is easy to add using interpolation:
code = 'abab'
morse_code.each do |morse, alpha|
code.gsub!(morse, "#{alpha} ") # <=============
end
code
#=> ".- -... .- -... "
code.rstrip
#=> ".- -... .- -..."
If you did not know about interpolation - here is how it works:
foo = 'bar'
"#{foo}" #=> "bar"
"hello I am #{foo}" #=> "hello I am bar"
So going back to your case, all the following does
"#{alpha} "
is add a space after each code, which is what you needed. The catch is that the resulting string ends with an extra space, which we removed with
code.rstrip
I tried to write a function that randomly rearranges the letters in a word, except for the first and last ones.
def fun(string)
z=0
s=string.size
tab=string
a=(1...s-1).to_a.sample s-1
for i in 1...(s-1)
puts tab[i].replace(string[a[z]])
z=z+1
end
puts tab
end
fun("sample")
My output is:
p
l
a
m
sample
Does anybody know how to make my tab come out correctly?
It seems to change inside the for block, because the output was 'plam', so it's random as I wanted. But when I print the whole word (I expected something like 'splame'), it doesn't work. :(
What about:
def fun(string)
first, *middle, last = string.chars
[first, middle.shuffle, last].join
end
fun("sample") #=> "smalpe"
s = 'sample'
[s[0], s[1..-2].chars.shuffle, s[-1]].join
# => "slpmae"
Here is my solution:
def fun(string)
first = string[0]
last = string[-1]
middle = string[1..-2]
puts "#{first}#{middle.split('').shuffle.join}#{last}"
end
fun('sample')
There are some problems with your function. First, when you say tab=string, tab becomes a reference to the same object as string, so when you change characters in tab you change the characters of string too. For clarity, I think it is better to use the sampled indices (1...n) to reference positions in the original string.
I suggest the usage of tab as a new array.
def fun(string)
  # Nothing to shuffle when there are no inner letters
  return string if string.length <= 2
  z = 1
  s = string.size
  tab = []
  tab[0] = string[0]
  # Shuffled list of the inner indices 1..s-2
  a = (1...s-1).to_a.sample(s - 1)
  (1...s-1).each do |i|
    tab[z] = string[a[i - 1]]
    z += 1
  end
  tab.push string[s - 1]
  tab.join
end
fun("sample")
=> "spalme"
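A small experiment may help here. It illustrates two separate effects: plain assignment does not copy a string, and tab[i] hands back a new String, so calling replace on it (as in the question's loop) leaves tab itself untouched. The variable names piece and copy are invented for this demonstration.

```ruby
string = 'sample'
tab = string               # no copy: both names refer to one String object
puts tab.equal?(string)    # prints "true"

piece = tab[1]             # String#[] returns a brand-new String ("a")
piece.replace('X')         # mutates only that temporary object
puts tab                   # prints "sample" (tab was never modified)

copy = string.dup          # an independent copy
copy[1] = 'X'              # index assignment mutates the receiver itself
puts string                # prints "sample"
puts copy                  # prints "sXmple"
```

This is why the loop in the question printed shuffled letters one at a time while the final puts tab still showed the original word.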
Another way, using String#gsub with a block:
def inner_jumble(str)
str.gsub(/(?<=\w)\w{2,}(?=\w)/) { |s| s.chars.shuffle.join }
end
inner_jumble("pneumonoultramicroscopicsilicovolcanoconiosis") # *
#=> "poovcanaiimsllinoonroinuicclprsciscuoooomtces"
inner_jumble("what ho, fellow coders?")
#=> "waht ho, folelw coedrs?"
(?<=\w) is a ("zero-width") positive look-behind that requires the match to immediately follow a word character.
(?=\w) is a ("zero-width") positive look-ahead that requires the match to be followed immediately by a word character.
You could use \w\w+ in place of \w{2,} for matching two or more consecutive word characters.
gsub jumbles every word in the string; if you only want to change the first word that qualifies, use sub instead.
*A lung disease caused by inhaling very fine ash and sand dust, supposedly the longest word in some English dictionaries.
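To make the gsub/sub difference concrete, here is a hedged variation on the method above. The first_only and rng keyword parameters are hypothetical additions (they are not part of the original answer); rng only exists so that a run can be made repeatable by passing a seeded Random.

```ruby
# Jumbles the inner letters of words that have at least four word characters.
# first_only and rng are hypothetical parameters added for this sketch.
def inner_jumble(str, first_only: false, rng: Random.new)
  pattern = /(?<=\w)\w{2,}(?=\w)/
  jumble = ->(inner) { inner.chars.shuffle(random: rng).join }
  # sub replaces only the first match, gsub replaces every match
  first_only ? str.sub(pattern, &jumble) : str.gsub(pattern, &jumble)
end

puts inner_jumble('what ho, fellow coders?')                   # every long word jumbled
puts inner_jumble('what ho, fellow coders?', first_only: true) # only "what" can change
```

The output varies from run to run, but in both cases the first and last letter of each word stay where they were.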
What would be a good way to remove the hash-tags from a string and then join the hash-tag words together in another string separated by commas:
'Some interesting tweet #hash #tags'
The result would be:
'Some interesting tweet'
And:
'hash,tags'
str = 'Some interesting tweet #hash #tags'
a,b = str.split.partition{|e| e.start_with?("#")}
# => [["#hash", "#tags"], ["Some", "interesting", "tweet"]]
a
# => ["#hash", "#tags"]
b
# => ["Some", "interesting", "tweet"]
a.join(",").delete("#")
# => "hash,tags"
b.join(" ")
# => "Some interesting tweet"
An alternate path is to use scan then remove the hash tags:
tweet = 'Some interesting tweet #hash #tags'
tags = tweet.scan(/#\w+/).uniq
tweet = tweet.gsub(/(?:#{ Regexp.union(tags).source })\b/, '').strip.squeeze(' ') # => "Some interesting tweet"
tags.join(',').tr('#', '') # => "hash,tags"
Dissecting it shows:
tweet.scan(/#\w+/) returns an array ["#hash", "#tags"].
uniq would remove any duplicated tags.
Regexp.union(tags) returns (?-mix:\#hash|\#tags).
Regexp.union(tags).source returns \#hash|\#tags. We don't want the pattern-flags at the start, so using source fixes that.
/(?:#{ Regexp.union(tags).source })\b/ returns the regular expression /(?:\#hash|\#tags)\b/.
tr is an extremely fast way to translate one character or characters to another, or strip them.
The final regex isn't the most optimized that can be generated. I'd actually write code to generate:
/#(?:hash|tags)\b/
but how to do that is left as an exercise for you. And, for short strings it won't make much difference as far as speed goes.
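For readers who would rather not do the exercise, one possible way to build that pattern is sketched below. It assumes the tags are captured without their leading '#', so the '#' can be written once in front of the alternation; the variable names are illustrative only.

```ruby
tweet = 'Some interesting tweet #hash #tags'

# Capture only the word part, so '#' can be factored out of the alternation.
names = tweet.scan(/#(\w+)/).flatten.uniq # ["hash", "tags"]

# Regexp.union escapes each name; .source drops the (?-mix:...) wrapper.
pattern = /#(?:#{Regexp.union(names).source})\b/ # same as /#(?:hash|tags)\b/

cleaned = tweet.gsub(pattern, '').strip.squeeze(' ')
puts cleaned         # prints "Some interesting tweet"
puts names.join(',') # prints "hash,tags"
```

Since the names never contain the '#', the final tr call from the version above is no longer needed.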
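This starts with two empty arrays, one for the hashtags and one for the remaining words.
It then splits the string on whitespace.
For each piece, it checks whether the piece contains a hashtag.
If it does, the # is stripped and the word is stored in the hashtag array; otherwise the piece is stored in the word array.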
array_of_hashetags = []
array_of_words = []
str = "Some interesting tweet #hash #tags"
str.split.each do |x|
if /\#\w+/ =~ x
array_of_hashetags << x.gsub(/\#/, "")
else
array_of_words << x
end
end
Hope this helps.
I am a Ruby beginner and I ran into a problem. I would like to know if there is a more 'Ruby way'
to solve it.
My problem is:
I have a string like this:
str = "<div class=\"yui-u first\">\r\n\t\t\t\t\t<h1>Jonathan Doe</h1>\r\n
\t\t\t\t\t<h2>Web Designer, Director</h2>\r\n\t\t\t\t</div>"
# Now I want to replace the substrings inside <h1> </h1> and <h2> </h2> with
# these two strings: "fooo" and "barr".
Here is what I did:
# First, I get the exactly matching substrings of str:
r = str.scan(/(?<=<h\d>).*?(?=<\/h\d>)/)
# Then I create a hash table that maps them to the replacement strings
h = {r[0] => 'fooo', r[1] => 'barr'}
# Finally, I use str.gsub! to replace the matched strings
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/, h)
# or like this
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/) {|v| h[v]}
PS: The substrings in <h1> </h1> and <h2> </h2> are not fixed, so I have
to get these strings FIRST in order to build a hash table. But I
really don't like the code above (because I wrote two almost identical lines);
I think there must be a more elegant way to do this. I have tried something like this:
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/) { ['fooo', 'barr'].each {|v| v}}
But this didn't work, because the block returns ['fooo', 'barr'] EVERY TIME!
If there is a way to make this block (or something similar) return one element at a time (return 'fooo' the first time, then 'barr' the second), my problem will be solved!
Thank you!
Although you really have no business parsing HTML with a regexp (a library like Nokogiri makes this significantly easier, because you can modify the DOM directly), the mistake you're making is presuming that the iterator will run only once per substitution and that the block will return only one value. In fact, each returns the object being iterated, so the block returns the whole array every time.
Here's a way to avoid all the Regexp insanity:
require 'rubygems'
gem 'nokogiri'
require 'nokogiri'
str = "<div class=\"yui-u first\">\r\n\t\t\t\t\t<h1>Jonathan Doe</h1>\r\n
\t\t\t\t\t<h2>Web Designer, Director</h2>\r\n\t\t\t\t</div>"
html = Nokogiri::HTML(str)
h1 = html.at_css('h1')
h1.content = 'foo'
h2 = html.at_css('h2')
h2.content = 'bar'
puts html.to_s
If you want to do multiple substitutions where each gets a different value, the simple way is to shift values off the front of an array:
subs = %w[ foo bar baz ]
string = "x x x"
string.gsub!(/x/) do |s|
subs.shift
end
puts string.inspect
# => "foo bar baz"
Keep in mind that subs is consumed here. A more efficient approach would be to increment some kind of index variable and use that value instead, but this is a trivial modification.
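As a rough sketch of a non-destructive variant, an Enumerator can play the role of the index variable. The sample string is a shortened version of the one in the question, and the names replacements and values are made up for illustration.

```ruby
replacements = %w[fooo barr]
str = "<h1>Jonathan Doe</h1>\r\n<h2>Web Designer, Director</h2>"

# An Enumerator walks the array without modifying it.
values = replacements.each

# Each call to the block fetches the next replacement in order.
# Note: values.next raises StopIteration if there are more matches than values.
result = str.gsub(%r{(?<=<h\d>).*?(?=</h\d>)}) { values.next }

puts result.inspect       # prints "<h1>fooo</h1>\r\n<h2>barr</h2>"
puts replacements.inspect # prints ["fooo", "barr"]
```

Unlike the shift version, the list of replacements is still intact afterwards and can be reused with a fresh enumerator.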
I have a string:
s="123--abc,123--abc,123--abc"
I tried using Ruby 1.9's new feature "named groups" to fetch all named group info:
/(?<number>\d*)--(?<chars>\s*)/
Is there an API like Python's findall which returns a collection of MatchData objects? In this case I need to get three matches, because 123 and abc are repeated three times. Each MatchData should contain the details of the named captures, so that I can use m['number'] to get the matched value.
Named captures are suitable only for one matching result.
Ruby's analogue of findall is String#scan. You can either use scan result as an array, or pass a block to it:
irb> s = "123--abc,123--abc,123--abc"
=> "123--abc,123--abc,123--abc"
irb> s.scan(/(\d*)--([a-z]*)/)
=> [["123", "abc"], ["123", "abc"], ["123", "abc"]]
irb> s.scan(/(\d*)--([a-z]*)/) do |number, chars|
irb* p [number,chars]
irb> end
["123", "abc"]
["123", "abc"]
["123", "abc"]
=> "123--abc,123--abc,123--abc"
Chiming in super-late, but here's a simple way of replicating String#scan but getting the matchdata instead:
matches = []
foo.scan(regex){ matches << $~ }
matches now contains the MatchData objects that correspond to scanning the string.
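Applied to the string from the question, that idea could look like the sketch below. The pattern is a corrected guess at what the question intended (letters instead of \s for the second group), and Regexp.last_match is simply the long name for $~.

```ruby
s = '123--abc,123--abc,123--abc'

matches = []
# scan sets the last-match data before each block call, so it can be collected.
s.scan(/(?<number>\d+)--(?<chars>[a-z]+)/) { matches << Regexp.last_match }

# Every element is a MatchData, so named groups are available.
pairs = matches.map { |m| [m[:number], m[:chars]] }
puts pairs.inspect # prints [["123", "abc"], ["123", "abc"], ["123", "abc"]]
```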
You can extract the group names used in the regexp with the names method. So I used the regular scan method to get the matches, then zipped the names with each match to create a Hash.
class String
def scan2(regexp)
names = regexp.names
scan(regexp).collect do |match|
Hash[names.zip(match)]
end
end
end
Usage:
>> "aaa http://www.google.com.tr aaa https://www.yahoo.com.tr ddd".scan2 /(?<url>(?<protocol>https?):\/\/[\S]+)/
=> [{"url"=>"http://www.google.com.tr", "protocol"=>"http"}, {"url"=>"https://www.yahoo.com.tr", "protocol"=>"https"}]
@Nakilon is correct in showing scan with a regex; however, you don't even need to venture into regex land if you don't want to:
s = "123--abc,123--abc,123--abc"
s.split(',')
#=> ["123--abc", "123--abc", "123--abc"]
s.split(',').inject([]) { |a,s| a << s.split('--'); a }
#=> [["123", "abc"], ["123", "abc"], ["123", "abc"]]
This returns an array of arrays, which is convenient if you have multiple occurrences and need to see/process them all.
s.split(',').inject({}) { |h,s| n,v = s.split('--'); h[n] = v; h }
#=> {"123"=>"abc"}
This returns a hash, which, because the elements have the same key, has only the unique key value. This is good when you have a bunch of duplicate keys but want the unique ones. Its downside occurs if you need the unique values associated with the keys, but that appears to be a different question.
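If every value for a repeated key should be kept, one option is to collect the values into arrays. This is only a sketch: the input string below is hypothetical (it differs from the question's so that the grouping is visible), and grouped is an invented name.

```ruby
# Hypothetical input with a repeated key and differing values
s = '123--abc,123--def,456--ghi'

# The default block gives every new key its own empty array.
grouped = s.split(',').each_with_object(Hash.new { |h, k| h[k] = [] }) do |pair, h|
  number, chars = pair.split('--')
  h[number] << chars
end

puts grouped.inspect
```

With this input, grouped maps "123" to ["abc", "def"] and "456" to ["ghi"].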
If using ruby >=1.9 and the named captures, you could:
class String
def scan2(regexp2_str, placeholders = {})
return regexp2_str.to_re(placeholders).match(self)
end
def to_re(placeholders = {})
re2 = self.dup
separator = placeholders.delete(:SEPARATOR) || '' #Returns and removes separator if :SEPARATOR is set.
#Search for the pattern placeholders and replace them with the regex
placeholders.each do |placeholder, regex|
re2.sub!(separator + placeholder.to_s + separator, "(?<#{placeholder}>#{regex})")
end
return Regexp.new(re2, Regexp::MULTILINE) #Returns regex using named captures.
end
end
Usage (ruby >=1.9):
> "1234:Kalle".scan2("num4:name", num4:'\d{4}', name:'\w+')
=> #<MatchData "1234:Kalle" num4:"1234" name:"Kalle">
or
> re="num4:name".to_re(num4:'\d{4}', name:'\w+')
=> /(?<num4>\d{4}):(?<name>\w+)/m
> m=re.match("1234:Kalle")
=> #<MatchData "1234:Kalle" num4:"1234" name:"Kalle">
> m[:num4]
=> "1234"
> m[:name]
=> "Kalle"
Using the separator option:
> "1234:Kalle".scan2("#num4#:#name#", SEPARATOR:'#', num4:'\d{4}', name:'\w+')
=> #<MatchData "1234:Kalle" num4:"1234" name:"Kalle">
I needed something similar recently. This should work like String#scan, but return an array of MatchData objects instead.
class String
# This method will return an array of MatchData's rather than the
# array of strings returned by the vanilla `scan`.
def match_all(regex)
match_str = self
match_datas = []
while match_str.length > 0 do
md = match_str.match(regex)
break unless md
match_datas << md
match_str = md.post_match
end
return match_datas
end
end
Running your sample data in the REPL results in the following:
> "123--abc,123--abc,123--abc".match_all(/(?<number>\d*)--(?<chars>[a-z]*)/)
=> [#<MatchData "123--abc" number:"123" chars:"abc">,
#<MatchData "123--abc" number:"123" chars:"abc">,
#<MatchData "123--abc" number:"123" chars:"abc">]
You may also find my test code useful:
describe String do
describe :match_all do
it "it works like scan, but uses MatchData objects instead of arrays and strings" do
mds = "ABC-123, DEF-456, GHI-098".match_all(/(?<word>[A-Z]+)-(?<number>[0-9]+)/)
mds[0][:word].should == "ABC"
mds[0][:number].should == "123"
mds[1][:word].should == "DEF"
mds[1][:number].should == "456"
mds[2][:word].should == "GHI"
mds[2][:number].should == "098"
end
end
end
I really liked @Umut-Utkan's solution, but it didn't quite do what I wanted, so I rewrote it a bit (note: the code below might not be beautiful, but it seems to work):
class String
  def scan2(regexp)
    names = regexp.names
    # Each group name maps to an array of everything it captured
    captures = Hash.new { |hash, key| hash[key] = [] }
    scan(regexp).each do |match|
      names.zip(match).each do |name, value|
        captures[name.to_sym] << value
      end
    end
    captures
  end
end
Now, if you do
p '12f3g4g5h5h6j7j7j'.scan2(/(?<alpha>[a-zA-Z])(?<digit>[0-9])/)
You get
{:alpha=>["f", "g", "g", "h", "h", "j", "j"], :digit=>["3", "4", "5", "5", "6", "7", "7"]}
(ie. all the alpha characters found in one array, and all the digits found in another array). Depending on your purpose for scanning, this might be useful. Anyway, I love seeing examples of how easy it is to rewrite or extend core Ruby functionality with just a few lines!
A year ago I wanted regular expressions that were easier to read and that named their captures, so I made the following addition to String (it should maybe not live there, but it was convenient at the time):
scan2.rb:
class String
#Works as scan but stores the result in a hash indexed by variable/constant names (regexp PLACEHOLDERS) within parentheses.
#Example: Given the (constant) strings BTF, RCVR and SNDR and the regexp /#BTF# (#RCVR#) (#SNDR#)/
#the matches will be returned in a hash like: match[:RCVR] = <the match> and match[:SNDR] = <the match>
#Note: The #STRING_VARIABLE_OR_CONST# syntax has to be used. All occurrences of #STRING# will work as #{STRING}
#but the syntax is needed for the method to see the names to be used as indices.
def scan2(regexp2_str, mark='#')
regexp = regexp2_str.to_re(mark) #Evaluates the strings. Note: Must be reachable from here!
hash_indices_array = regexp2_str.scan(/\(#{mark}(.*?)#{mark}\)/).flatten #Look for string variable names within (#VAR#) or # replaced by <mark>
match_array = self.scan(regexp)
#Save matches in hash indexed by string variable names:
match_hash = Hash.new
match_array.flatten.each_with_index do |m, i|
match_hash[hash_indices_array[i].to_sym] = m
end
return match_hash
end
def to_re(mark='#')
re = /#{mark}(.*?)#{mark}/
return Regexp.new(self.gsub(re){eval $1}, Regexp::MULTILINE) #Evaluates the strings, creates RE. Note: Variables must be reachable from here!
end
end
Example usage (irb1.9):
> load 'scan2.rb'
> AREA = '\d+'
> PHONE = '\d+'
> NAME = '\w+'
> "1234-567890 Glenn".scan2('(#AREA#)-(#PHONE#) (#NAME#)')
=> {:AREA=>"1234", :PHONE=>"567890", :NAME=>"Glenn"}
Notes:
Of course it would have been more elegant to put the patterns (e.g. AREA, PHONE...) in a hash and add this hash with patterns to the arguments of scan2.
Piggybacking off of Mark Hubbart's answer, I added the following monkey-patch:
class ::Regexp
def match_all(str)
matches = []
str.scan(self) { matches << $~ }
matches
end
end
which can be used as /(?<letter>\w)/.match_all('word'), and returns:
[#<MatchData "w" letter:"w">, #<MatchData "o" letter:"o">, #<MatchData "r" letter:"r">, #<MatchData "d" letter:"d">]
This relies on, as others have said, the use of $~ in the scan block for the match data.
I like the match_all given by John, but I think it has an error.
The line:
match_datas << md
works if there are no captures () in the regex.
This code gives the whole line up to and including the pattern matched/captured by the regex. (The [0] part of MatchData) If the regex has capture (), then this result is probably not what the user (me) wants in the eventual output.
I think in the case where there are captures () in regex, the correct code should be:
match_datas << md[1]
The eventual output of match_datas will then be an array of captured strings starting from match_datas[0]. This is not quite what may be expected if normal MatchData behaviour is wanted, where index [0] is the whole matched substring and [1], [2], ... are the captures (if any) in the regex pattern.
Things are complex - which may be why match_all was not included in native MatchData.
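For what it's worth, a MatchData object already carries both pieces of information, so storing md itself loses nothing. The sketch below is an alternative take, not John's code: it is a plain method (no monkey-patch) that searches the original string from a moving offset using the optional position argument of String#match, which is an assumption about how one might structure this.

```ruby
# Collects one MatchData per match by searching from a moving offset.
def match_all(str, regex)
  matches = []
  pos = 0
  while pos <= str.length && (md = str.match(regex, pos))
    matches << md
    # Continue after this match; step one extra character after an empty
    # match so the loop cannot get stuck in the same place.
    pos = md.end(0) == md.begin(0) ? md.end(0) + 1 : md.end(0)
  end
  matches
end

mds = match_all('ABC-123, DEF-456', /(?<word>[A-Z]+)-(?<number>[0-9]+)/)
puts mds.map { |md| md[0] }.inspect # prints ["ABC-123", "DEF-456"]
puts mds[1].captures.inspect        # prints ["DEF", "456"]
puts mds[1][:word]                  # prints "DEF"
```

Because each search runs against the full string, pre_match and offsets on every MatchData refer to the original input and not to a shortened remainder.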