Convert a string based on hash values [closed] - ruby

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 7 years ago.
I am trying to write a method that takes in a string and a hash and "encodes" the string based on hash keys and values.
def encode(str,encoding)
end
str = "12#3"
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
I am expecting the output to be "one two three". Any character in the string that is not a key in the hash should be replaced with an empty string.
Right now my code looks like the following:
def encode(str, encoding)
output = ""
str.each_char do |ch|
if encoding.has_key?(ch)
output += encoding[ch]
else
output += ""
end
end
return output
end
Any help is appreciated

You can use the form of String#gsub that takes a hash for substitutions, together with a simple regex:
str = "12#3"
encoding = {"1"=>"one", "2"=>"two", "3"=>"three"}
First create a new hash that adds a space to each value in encoding:
adj_encoding = encoding.each_with_object({}) { |(k,v),h| h[k] = "#{v} " }
#=> {"1"=>"one ", "2"=>"two ", "3"=>"three "}
Now perform the substitutions and strip off the extra space if one of the keys of encoding is the last character of str:
str.gsub(/./, adj_encoding).rstrip
#=> "one two three"
Another example:
"1ab 2xx4cat".gsub(/./, adj_encoding).rstrip
#=> "one two"
Ruby determines whether each character of str (the /./ part) equals a key of adj_encoding. If it does, she substitutes the key's value for the character; else she substitutes an empty string ('') for the character.
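The hash form of gsub can also be wrapped in a method. This is only a minimal sketch, assuming Ruby 2.4 or later for Hash#transform_values; the method name encode_with_gsub is made up for illustration. It relies on gsub substituting an empty string whenever the matched text is not a key of the hash.

```ruby
# Hypothetical helper; the name is not from the answer above.
def encode_with_gsub(str, encoding)
  # Append a space to every value so the words come out separated.
  spaced = encoding.transform_values { |v| "#{v} " }
  # Characters that are not keys of `spaced` are replaced by an empty string.
  # The /m flag lets the dot match newlines as well.
  str.gsub(/./m, spaced).rstrip
end

encoding = { "1" => "one", "2" => "two", "3" => "three" }
puts encode_with_gsub("12#3", encoding)        # prints: one two three
puts encode_with_gsub("1ab 2xx4cat", encoding) # prints: one two
```

The original encoding hash is left untouched because transform_values returns a new hash.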

You can build a regular expression that matches your keys via Regexp.union:
re = Regexp.union(encoding.keys)
#=> /1|2|3/
scan the string for occurrences of keys using that regular expression:
keys = str.scan(re)
#=> ["1", "2", "3"]
fetch the corresponding values using values_at:
values = encoding.values_at(*keys)
#=> ["one", "two", "three"]
and join the array with a single space:
values.join(' ')
#=> "one two three"
As a "one-liner":
encoding.values_at(*str.scan(Regexp.union(encoding.keys))).join(' ')
#=> "one two three"
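One thing to watch with Regexp.union is keys of different lengths: alternation tries the alternatives from left to right, so a short key can shadow a longer one that starts with it. The sketch below assumes a hypothetical encoding with overlapping keys (not part of the question) and sorts the keys longest first; the method name encode_longest_first is made up for illustration.

```ruby
# Hypothetical helper for encodings whose keys may overlap, e.g. "1" and "12".
def encode_longest_first(str, encoding)
  # Put longer keys first so "12" is tried before "1".
  re = Regexp.union(encoding.keys.sort_by { |k| -k.size })
  encoding.values_at(*str.scan(re)).join(' ')
end

overlapping = { "1" => "one", "12" => "twelve", "3" => "three" }
puts encode_longest_first("12#31", overlapping) # prints: twelve three one
```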

Try:
def encode(str, encoding)
output = ""
str.each_char do |ch|
if encoding.has_key?(ch)
output += encoding[ch] + " "
else
output += ""
end
end
return output.split.join(' ')
end
str = "12#3"
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
p encode(str, encoding) #=> "one two three"

If you are expecting "one two three", you just need to add a space in your concatenation line and, before returning, call .lstrip to remove the leading space.
Hint: You don't need the "else" branch that concatenates an empty string. If a character such as "#" doesn't match a key in the encoding hash, it will simply be ignored.
Like this:
#str = "12#3"
#encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
def encode(str, encoding)
output = ""
str.each_char do |ch|
if encoding.has_key?(ch)
output += " " + encoding[ch]
end
end
return output.lstrip
end
# Output: "one two three"

I would do:
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
str = "12#3"
str.chars.map{|x|encoding.fetch(x,nil)}.compact.join(' ')
Or two lines like this:
in_encoding_hash = -> x { encoding.has_key? x }
str.chars.grep(in_encoding_hash){|x|encoding[x]}.join(' ')
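On newer Rubies the lookup and the filtering can be done in one pass. This is a sketch only, assuming Ruby 2.7 or later for Enumerable#filter_map, and assuming none of the hash values is nil or false (such values would be dropped). The method name encode_with_filter_map is made up for illustration.

```ruby
# Hypothetical helper: look up each character and keep only the hits.
def encode_with_filter_map(str, encoding)
  # encoding[ch] is nil for characters that are not keys; filter_map drops those.
  str.each_char.filter_map { |ch| encoding[ch] }.join(' ')
end

encoding = { "1" => "one", "2" => "two", "3" => "three" }
puts encode_with_filter_map("12#3", encoding) # prints: one two three
```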

Related

Unscrambling a string given the number of splits and words that the sentence can be comprised of

I'm working on a problem in which I'm given a string that has been scrambled. The scrambling works like this.
An original string is chopped into substrings at random positions and a random number of times.
Each substring is then moved around randomly to form a new string.
I'm also given a dictionary of words that are possible words in the string.
Finally, I'm given the number of splits in the string that were made.
The example I was given is this:
dictionary = ["world", "hello"]
scrambled_string = "rldhello wo"
splits = 1
The expected output of my program would be the original string, in this case:
"hello world"
Suppose the initial string
"hello my name is Sean"
with
splits = 2
yields
["hel", "lo my name ", "is Sean"]
and those three pieces are shuffled to form the following array:
["lo my name ", "hel", "is Sean"]
and then the elements of this array are joined to form:
scrambled = "lo my name helis Sean"
Also suppose:
dictionary = ["hello", "Sean", "the", "name", "of", "my", "cat", "is", "Sugar"]
First convert dictionary to a set to speed lookups.
require 'set'
dict_set = dictionary.to_set
#=> #<Set: {"hello", "Sean", "the", "name", "of", "my", "cat", "is", "Sugar"}>
Next I will create a helper method.
def indices_to_ranges(indices, last_index)
[-1, *indices, last_index].each_cons(2).map { |i,j| i+1..j }
end
Suppose we split scrambled twice (because splits #=> 2), specifically after the 'y' and the 'h':
indices = [scrambled.index('y'), scrambled.index('h')]
#=> [4, 11]
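Note that indices_to_ranges prepends -1 and appends last_index (here scrambled.size-1) to indices before forming consecutive pairs, so the first range always starts at 0 and the last range always ends at the last character.
We may then use indices_to_ranges to convert these indices to ranges of indices of characters in scrambled: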
ranges = indices_to_ranges(indices, scrambled.size-1)
#=> [0..4, 5..11, 12..20]
a = ranges.map { |r| scrambled[r] }
#=> ["lo my", " name h", "elis Sean"]
We could of course combine these two steps:
a = indices_to_ranges(indices, scrambled.size-1).map { |r| scrambled[r] }
#=> ["lo my", " name h", "elis Sean"]
Next I will permute the values of a. For each permutation I will join the elements to form a string, then split the string on single spaces to form an array of words. If all of those words are in the dictionary we may claim success and are finished. Otherwise, a different array indices will be constructed and we try again, continuing until success is realized or all possible arrays indices have been considered. We can put all this in the following method.
def unscramble(scrambled, dict_set, splits)
last_index = scrambled.size-1
(0..scrambled.size-2).to_a.combination(splits).each do |indices|
indices_to_ranges(indices, last_index).
map { |r| scrambled[r] }.
permutation.each do |arr|
next if arr[0][0] == ' ' || arr[-1][-1] == ' '
words = arr.join.split(' ')
return words if words.all? { |word| dict_set.include?(word) }
end
end
end
Let's try it.
original string: "hello my name is Sean"
scrambled = "lo my name helis Sean"
splits = 4
unscramble(scrambled, dict_set, splits)
#=> ["my", "name", "hello", "is", "Sean"]
See Array#combination and Array#permutation.
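The method above can be tried on the example from the question. The sketch below repeats the helper so that it runs on its own and, as my own variation rather than part of the answer, ends with an explicit nil so that a failed search is easy to detect. The name unscramble_or_nil is made up for illustration.

```ruby
require 'set'

def indices_to_ranges(indices, last_index)
  [-1, *indices, last_index].each_cons(2).map { |i, j| i + 1..j }
end

# Hypothetical variant of unscramble that returns nil when nothing matches.
def unscramble_or_nil(scrambled, dict_set, splits)
  last_index = scrambled.size - 1
  (0..scrambled.size - 2).to_a.combination(splits).each do |indices|
    pieces = indices_to_ranges(indices, last_index).map { |r| scrambled[r] }
    pieces.permutation.each do |arr|
      # A candidate sentence cannot start or end with a space.
      next if arr[0][0] == ' ' || arr[-1][-1] == ' '
      words = arr.join.split(' ')
      return words if words.all? { |word| dict_set.include?(word) }
    end
  end
  nil # no split positions and ordering produced only dictionary words
end

dict_set = ["world", "hello"].to_set
p unscramble_or_nil("rldhello wo", dict_set, 1)       #=> ["hello", "world"]
p unscramble_or_nil("rldhello wo", ["cat"].to_set, 1) #=> nil
```

The search is brute force over all split positions and all orderings, so it is only practical for short strings and a small number of splits.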
bonkers answer (not quite perfect yet ... trouble with single chars):
#
# spaces appear to be important!
@check = {}
@ordered = []
def previous_words (word)
@check.select{|y,z| z[:previous] == word}.map do |nw,z|
@ordered << nw
previous_words(nw)
end
end
def in_word(dictionary, string)
# check each word in the dictionary to see if the string is contained in one of them
dictionary.each do |word|
if word.include?(string)
return word
end
end
return nil
end
letters=scrambled.split("")
previous=nil
substr=""
letters.each do |l|
if in_word(dictionary, substr+l)
substr+= l
elsif (l==" ")
word=in_word(dictionary, substr)
@check[word]={found: 1}
@check[word][:previous] = previous if previous
substr=""
previous=word
else
word=in_word(dictionary, substr)
@check[word]={found: 1}
@check[word][:previous] = previous if previous
substr=l
previous=nil
end
end
word=in_word(dictionary, substr)
@check[word]={found: 1}
@check[word][:previous] = previous if previous
@check.select{|y,z| z[:previous].nil?}.map do |w,z|
@ordered << w
previous_words(w)
end
pp @ordered
output:
dictionary = ["world", "hello"]
scrambled = "rldhello wo"
... my code here ...
2.5.8 :817 > @ordered
=> ["hello", "world"]
dictionary = ["hello", "my", "name", "is", "Sean"]
scrambled = "me is Shelleano my na"
... my code here ...
2.5.8 :879 > @ordered
=> ["Sean", "hello", "my", "name", "is"]

Remove a string pattern and symbols from string

I need to strip the phrase "not" and hash signs (#) from a string. (I also have to get rid of spaces and capital letters and return the results in arrays, but I have those three taken care of.)
Expectation:
"not12345" #=> ["12345"]
" notabc " #=> ["abc"]
"notone, nottwo" #=> ["one", "two"]
"notCAPSLOCK" #=> ["capslock"]
"##doublehash" #=> ["doublehash"]
"h#a#s#h" #=> ["hash"]
"#notswaggerest" #=> ["swaggerest"]
This is the code I have
def some_method(string)
string.split(", ").map{|n| n.sub(/(not)/,"").downcase.strip}
end
All of the above tests do what I need except for the hash ones. I don't know how to get rid of the hashes; I have tried modifying the regex part to n.sub(/(#not)/), n.sub(/#(not)/) and n.sub(/[#]*(not)/), to no avail. How can I make the regex remove the # characters?
arr = ["not12345", " notabc", "notone, nottwo", "notCAPSLOCK",
"##doublehash:", "h#a#s#h", "#notswaggerest"]
arr.flat_map { |str| str.downcase.split(',').map { |s| s.gsub(/#|not|\s+/,"") } }
#=> ["12345", "abc", "one", "two", "capslock", "doublehash:", "hash", "swaggerest"]
When the block variable str is set to "notone, nottwo",
s = str.downcase
#=> "notone, nottwo"
a = s.split(',')
#=> ["notone", " nottwo"]
b = a.map { |s| s.gsub(/#|not|\s+/,"") }
#=> ["one", "two"]
Because I used Enumerable#flat_map, "one" and "two" are added to the array being returned. When str #=> "notCAPSLOCK",
s = str.downcase
#=> "notcapslock"
a = s.split(',')
#=> ["notcapslock"]
b = a.map { |s| s.gsub(/#|not|\s+/,"") }
#=> ["capslock"]
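To see what flat_map contributes, it may help to compare it with map on two of the inputs. A small sketch, using a lambda named clean that is introduced here purely for illustration:

```ruby
inputs = ["notone, nottwo", "notCAPSLOCK"]

# Hypothetical lambda wrapping the per-string cleanup used above.
clean = ->(str) { str.downcase.split(',').map { |s| s.gsub(/#|not|\s+/, "") } }

nested = inputs.map(&clean)      # one inner array per input string
flat   = inputs.flat_map(&clean) # inner arrays concatenated into one array

p nested #=> [["one", "two"], ["capslock"]]
p flat   #=> ["one", "two", "capslock"]
```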
Here is one more solution that uses a different technique of capturing what you want rather than dropping what you don't want: (for the most part)
a = ["not12345", " notabc", "notone, nottwo",
"notCAPSLOCK", "##doublehash:","h#a#s#h", "#notswaggerest"]
a.map do |s|
s.downcase.delete("#").scan(/(?<=not)\w+|^[^not]\w+/)
end
#=> [["12345"], ["abc"], ["one", "two"], ["capslock"], ["doublehash"], ["hash"], ["swaggerest"]]
I had to delete the # first because of "h#a#s#h"; otherwise the delete could have been avoided with a regex like /(?<=not|^#[^not])\w+/
You can use this regex to solve your problem. I tested and it works for all of your test cases.
/^\s*#*(not)*/
^ means match start of string
\s* matches any space at the start
#* matches 0 or more #
(not)* matches the phrase "not" zero or more times.
Note: this regex won't work for cases where "not" comes before "#"; for example, "not#hash" would return "#hash".
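A quick way to see what this pattern does and does not remove is to run it over a few inputs. This is only a sketch; the constant PREFIX and the method strip_prefix are names made up for illustration. By my reading, because the pattern is anchored at the start of the string, hashes further inside the string (as in "h#a#s#h") are left in place.

```ruby
# Hypothetical names; the pattern itself is the one from the answer above.
PREFIX = /^\s*#*(not)*/

def strip_prefix(s)
  # Remove the leading spaces/hashes/"not", then trim and lowercase the rest.
  s.sub(PREFIX, "").strip.downcase
end

p strip_prefix(" notabc ")       #=> "abc"
p strip_prefix("#notswaggerest") #=> "swaggerest"
p strip_prefix("##doublehash")   #=> "doublehash"
p strip_prefix("not#hash")       #=> "#hash"
p strip_prefix("h#a#s#h")        #=> "h#a#s#h"
```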
Fun problem because it can use the most common string functions in Ruby:
result = values.map do |string|
string.strip # Remove spaces in front and back.
.tr('#','') # Transform single characters. In this case remove #
.gsub('not','') # Substitute patterns
.split(', ') # Split into arrays.
end
p result #=>[["12345"], ["abc"], ["one", "two"], ["CAPSLOCK"], ["doublehash"], ["hash"], ["swaggerest"]]
I prefer this way rather than a regexp as it is easy to understand the logic of each line.
In Ruby regular expressions, # starts a comment in extended mode (/x) and #{...} triggers interpolation, so to match a literal octothorpe (#) unambiguously you can escape it:
"#foo".sub(/\#/, "") #=> "foo"

Compare string against array and extract array elements present in ruby

I have the following string:
str = "This is a string"
What I want to do is compare it with this array:
a = ["this", "is", "something"]
The result should be an array with "this" and "is" because both are present in the array and in the given string. "something" is not present in the string so it shouldn't appear. How can I do this?
One way to do this:
str = "This is a string"
a = ["this","is","something"]
str.downcase.split & a
# => ["this", "is"]
I am assuming the elements of Array a will always be in lowercase.
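Two properties of Array#& are worth knowing here: the result keeps the order of the left-hand array, and duplicates are removed. A minimal sketch, with a method name (common_words) made up for illustration and the same lowercase assumption as above:

```ruby
# Hypothetical helper: words of `str` that also appear in `words`.
def common_words(str, words)
  # Order follows the string, and each word appears at most once.
  str.downcase.split & words
end

p common_words("This is is a string", ["is", "this", "something"])
#=> ["this", "is"]
```

Note that split only breaks on whitespace, so punctuation attached to a word (for example "this,") would prevent a match.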
There are always many ways to do this sort of thing:
str = "this is the example string"
words_to_compare = ["dogs", "ducks", "seagulls", "the"]
words_to_compare.select{|word| word =~ Regexp.union(str.split) }
#=> ["the"]
Your question has an XY problem smell to it. Usually when we want to find what words exist the next thing we want to know is how many times they exist. Frequency counts are all over the internet and Stack Overflow. This is a minor modification to such a thing:
str = "This is a string"
a = ["this", "is", "something"]
a_hash = a.each_with_object({}) { |i, h| h[i] = 0 } # => {"this"=>0, "is"=>0, "something"=>0}
That defined a_hash with the keys being the words to be counted.
str.downcase.split.each{ |k| a_hash[k] += 1 if a_hash.key?(k) }
a_hash # => {"this"=>1, "is"=>1, "something"=>0}
a_hash now contains the counts of the word occurrences. if a_hash.key?(k) is the main difference we'd see compared to a regular word-count as it's only allowing word-counts to occur for the words in a.
a_hash.keys.select{ |k| a_hash[k] > 0 } # => ["this", "is"]
It's easy to find the words that were in common because the counter is > 0.
This is a very common problem in text processing so it's good knowing how it works and how to bend it to your will.
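The same counts can be built more compactly if performance is not a concern. This sketch assumes Ruby 2.6 or later for Array#to_h with a block; the method name word_counts is made up for illustration. It rescans the token list once per word, so it is quadratic and only suitable for small inputs.

```ruby
# Hypothetical helper: count how often each of `words` occurs in `str`.
def word_counts(str, words)
  tokens = str.downcase.split
  # Array#count(obj) returns the number of elements equal to obj.
  words.to_h { |w| [w, tokens.count(w)] }
end

counts = word_counts("This is a string", ["this", "is", "something"])
p counts                                #=> {"this"=>1, "is"=>1, "something"=>0}
p counts.select { |_, n| n > 0 }.keys   #=> ["this", "is"]
```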

Use single quote in string inspection

I have the following program:
args = ["a", "b"]
cmd_args = args.map{|x| x.inspect}
str = cmd_args.join(' ')
puts str
The output is:
"a" "b"
I expect the output to be like the following (sub-string quoted with ' instead of "):
'a' 'b'
I don't want to do a gsub after string inspect because, in my real system, substring might contain ". For example:
args = ['a"c', "b"]
cmd_args = args.map{|x| x.inspect.gsub('"', '\'')}
str = cmd_args.join(' ')
puts str
will output:
'a\'c' 'b'
The " between a and c is wrongly replaced. My expected output is:
'a"c' 'b'
How can I make string inspect to quote strings with ' instead of "?
s = 'a"c'.inspect
s[0] = s[-1] = "'"
puts s.gsub("\\\"", "\"") #=> 'a"c'
You can't force String#inspect to use a single quote without rewriting or overwriting it.
Instead of x.inspect, you could substitute "'#{x}'", but then you would have to make sure you escape any ' characters that appear in x.
Here it is, working:
args = ["a", "b"]
cmd_args = args.map{|x| "'#{x}'" }
str = cmd_args.join(' ')
puts str
The output is:
'a' 'b'
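To handle the escaping mentioned above, one option is to backslash-escape any single quotes and backslashes before wrapping the string. This is a sketch that follows the rules of Ruby's own single-quoted literals; it is not shell quoting, and the method name single_quote is made up for illustration.

```ruby
# Hypothetical helper: render a string the way a Ruby single-quoted literal would.
def single_quote(str)
  # Prefix every backslash and single quote with a backslash.
  escaped = str.gsub(/[\\']/) { |m| "\\#{m}" }
  "'#{escaped}'"
end

puts ["a", "b"].map { |x| single_quote(x) }.join(' ') # prints: 'a' 'b'
puts single_quote('a"c')                              # prints: 'a"c'
puts single_quote("it's")                             # prints: 'it\'s'
```

The block form of gsub is used so that the backslashes in the replacement are taken literally instead of being interpreted as back-references.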

Regular expression and String

With the expression below:
words = string.scan(/\b\S+\b/i)
I am trying to scan through the string with word boundaries and case insensitivity, so if I have:
string = "A ball a Ball"
then when I have this each block:
words.each { |word| result[word] += 1 }
I am anticipating something like:
{"a"=>2, "ball"=>2}
But instead what I get is:
{"A"=>1, "ball"=>1, "a"=>1, "Ball"=>1}
When this didn't work, I tried to create a new Regexp like:
Regexp.new(Regexp.escape(string), "i")
but then I do not know how to use this or move forward from here.
The regex matches words in case-insensitive mode, but it doesn't alter matched text in any way. So you will receive text in its original form in the block. Try casting strings to lower case when counting.
string = "A ball a Ball"
words = string.scan(/\b\S+\b/i) # => ["A", "ball", "a", "Ball"]
result = Hash.new(0)
words.each { |word| result[word.downcase] += 1 }
result # => {"a"=>2, "ball"=>2}
The regexp is fine; your problem is when you increment your counter using the hash. Hash keys are case sensitive, so you must change the case when incrementing:
words.each { |word| result[word.upcase] += 1 }
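If the original casing does not need to be preserved, the string can be normalized once up front and the counting left to Enumerable#tally. A minimal sketch, assuming Ruby 2.7 or later; the method name word_frequencies is made up for illustration:

```ruby
# Hypothetical helper: case-insensitive word counts.
def word_frequencies(string)
  # Lowercase first so "A" and "a" end up under the same key.
  string.downcase.scan(/\b\S+\b/).tally
end

result = word_frequencies("A ball a Ball")
p result #=> {"a"=>2, "ball"=>2}
```

Since the string is already lowercase when it is scanned, the /i flag is no longer needed on the regex.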
