Finding the difference between strings in Ruby - ruby

I need to take two strings, compare them, and print the difference between them.
So say I have:
teamOne = "Billy, Frankie, Stevie, John"
teamTwo = "Billy, Frankie, Stevie"
$ teamOne.eql? teamTwo
=> false
I want to say, "If the two strings are not equal, print whatever is different between them." In this case, I'm just looking to print "John".

All of the solutions so far ignore the fact that the second array can also have elements that the first array doesn't have. Chuck has pointed out a fix (see comments on other posts), but there is a more elegant solution if you work with sets:
require 'set'
teamOne = "Billy, Frankie, Stevie, John"
teamTwo = "Billy, Frankie, Stevie, Zach"
teamOneSet = teamOne.split(', ').to_set
teamTwoSet = teamTwo.split(', ').to_set
teamOneSet ^ teamTwoSet # => #<Set: {"John", "Zach"}>
This set can then be converted back to an array if need be.
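As a minimal sketch of that last step (the snake_case variable names here are illustrative, not from the answer above), the symmetric difference can be turned back into a sorted array and printed:

```ruby
require 'set'

team_one = "Billy, Frankie, Stevie, John"
team_two = "Billy, Frankie, Stevie, Zach"

# Symmetric difference: names that appear in exactly one of the two lists.
only_in_one = team_one.split(', ').to_set ^ team_two.split(', ').to_set

# Convert back to an array; sorting gives a stable order for printing.
difference = only_in_one.to_a.sort
puts difference.join(', ')
```

Sorting is optional; it is only there so the printed order does not depend on the set's internal ordering.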

If the real strings you are comparing are similar to the strings you provided, then this should work:
teamOneArr = teamOne.split(", ")
=> ["Billy", "Frankie", "Stevie", "John"]
teamTwoArr = teamTwo.split(", ")
=> ["Billy", "Frankie", "Stevie"]
teamOneArr - teamTwoArr
=> ["John"]

Easy solution:
def compare(a, b)
  diff = a.split(', ') - b.split(', ')
  if diff.empty? # nothing in a is missing from b
    true
  else
    diff
  end
end
Of course, this only works if your strings contain comma-separated values, but it can be adjusted to your situation.

You need to sort first to ensure you are not subtracting a bigger string from a smaller one:
def compare(*params)
params.sort_by! { |s| -s.length } # longest string first; <=> alone would sort alphabetically
diff = params[0].split(', ') - params[1].split(', ')
if diff === []
true
else
diff
end
end
puts compare(a, b)
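Sorting the inputs is one way to handle the case where either string may hold the extra names; another is to subtract in both directions and merge the results. The sketch below assumes comma-and-space separated names, and symmetric_diff is a hypothetical helper name, not something from the answers above:

```ruby
# Returns the names that appear in only one of the two strings.
def symmetric_diff(a, b)
  left  = a.split(', ')
  right = b.split(', ')
  # (left - right) is what only a has; (right - left) is what only b has.
  # Array#| joins the two lists and drops duplicates.
  (left - right) | (right - left)
end

p symmetric_diff("Billy, Frankie, Stevie", "Billy, Frankie, Stevie, John")
# => ["John"]
```

Because both directions are covered, the argument order no longer matters.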

I understood the question in two ways. In case you wanted to do a string difference (word by word) which covers this case:
teamOne = "Billy, Frankie, Tom, Stevie, John"
teamTwo = "Billy, Frankie, Stevie, Tom, Zach"
s1 = teamOne.split(' ')
s2 = teamTwo.split(' ')
diff = []
s1.zip(s2).each do |s1, s2|
if s1 != s2
diff << s1
end
end
puts diff.join(' ')
Result is:
Tom, Stevie, John
Accepted answer gives:
#<Set: {"Zach", "John"}>

Related

Check whether a string contains all the characters of another string in Ruby

Let's say I have a string, like string= "aasmflathesorcerersnstonedksaottersapldrrysaahf". If you haven't noticed, you can find the phrase "harry potter and the sorcerers stone" in there (minus the space).
I need to check whether string contains all the characters of another string.
string.include? ("sorcerer") #=> true
string.include? ("harrypotterandtheasorcerersstone") #=> false, even though it contains all the letters to spell harrypotterandthesorcerersstone
include? does not work on a shuffled string.
How can I check if a string contains all the elements of another string?
Sets and array intersection don't account for repeated chars, but a histogram / frequency counter does:
require 'facets'
s1 = "aasmflathesorcerersnstonedksaottersapldrrysaahf"
s2 = "harrypotterandtheasorcerersstone"
freq1 = s1.chars.frequency
freq2 = s2.chars.frequency
freq2.all? { |char2, count2| freq1[char2] >= count2 }
#=> true
Write your own Array#frequency if you don't want the facets dependency.
class Array
def frequency
Hash.new(0).tap { |counts| each { |v| counts[v] += 1 } }
end
end
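On Ruby 2.7 or newer, the built-in Enumerable#tally produces the same kind of character count without a gem or a monkey patch. A minimal sketch of the frequency comparison under that assumption (contains_all_chars? is a hypothetical name chosen for illustration):

```ruby
# Assumes Ruby 2.7+, where Enumerable#tally is available.
# True when haystack has at least as many of each character as needle needs.
def contains_all_chars?(haystack, needle)
  available = haystack.chars.tally
  needle.chars.tally.all? do |char, needed|
    # tally returns a plain Hash, so use fetch with a default of 0
    # for characters that never occur in haystack.
    available.fetch(char, 0) >= needed
  end
end

p contains_all_chars?("hello", "ole") # => true
p contains_all_chars?("hello", "lll") # => false
```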
I presume that if the string to be checked is "sorcerer", string must include, for example, three "r"'s. If so you could use the method Array#difference, which I've proposed be added to the Ruby core.
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
str = "aasmflathesorcerersnstonedksaottersapldrrysaahf"
target = "sorcerer"
target.chars.difference(str.chars).empty?
#=> true
target = "harrypotterandtheasorcerersstone"
target.chars.difference(str.chars).empty?
#=> true
If the characters of target must not only be in str, but must be in the same order, we could write:
target = "sorcerer"
r = Regexp.new "#{ target.chars.join "\.*" }"
#=> /s.*o.*r.*c.*e.*r.*e.*r/
str =~ r
#=> 2 (truthy)
(or !!(str =~ r) #=> true)
target = "harrypotterandtheasorcerersstone"
r = Regexp.new "#{ target.chars.join "\.*" }"
#=> /h.*a.*r.*r.*y.* ... o.*n.*e/
str =~ r
#=> nil
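One caveat with building the pattern from raw characters: if target ever contains a regex metacharacter such as "." or "*", the pattern changes meaning. A hedged sketch that escapes each character first (subsequence? is a hypothetical helper name, and String#match? assumes Ruby 2.4 or newer):

```ruby
# True when the characters of target appear in str in the same order,
# possibly with other characters in between.
def subsequence?(str, target)
  # Escape every character so "." or "*" in target is matched literally.
  pattern = target.chars.map { |ch| Regexp.escape(ch) }.join('.*')
  str.match?(Regexp.new(pattern))
end

p subsequence?("aasmflathesorcerersnstone", "sorcerer") # => true
p subsequence?("abc", "ca")                             # => false
p subsequence?("abc", "a.c")                            # => false
```

The last call returns false because the escaped pattern requires a literal dot, which "abc" does not contain.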
A different albeit not necessarily better solution using sorted character arrays and sub-strings:
Given your two strings...
subject = "aasmflathesorcerersnstonedksaottersapldrrysaahf"
search = "harrypotterandthesorcerersstone"
You can sort your subject string using .chars.sort.join...
subject = subject.chars.sort.join # => "aaaaaaacddeeeeeffhhkllmnnoooprrrrrrssssssstttty"
And then produce a list of substrings to search for:
search = search.chars.group_by(&:itself).values.map(&:join)
# => ["hh", "aa", "rrrrrr", "y", "p", "ooo", "tttt", "eeeee", "nn", "d", "sss", "c"]
You could alternatively produce the same set of substrings using this method
search = search.chars.sort.join.scan(/((.)\2*)/).map(&:first)
And then simply check whether every search sub-string appears within the sorted subject string:
search.all? { |c| subject[c] }
Create a 2 dimensional array out of your string letter bank, to associate the count of letters to each letter.
Create a 2 dimensional array out of the harry potter string in the same way.
Loop through both and do comparisons.
I have no experience in Ruby but this is how I would start to tackle it in the language I know most, which is Java.

Build list of substrings created by separating a string by a match

I have a string:
"a_b_c_d_e"
I would like to build a list of substrings that result from removing everything after a single "_" from the string. The resulting list would look like:
['a_b_c_d', 'a_b_c', 'a_b', 'a']
What is the most rubyish way to achieve this?
s = "a_b_c_d_e"
a = []
s.scan("_"){a << $`} #`
a # => ["a", "a_b", "a_b_c", "a_b_c_d"]
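The scan version collects the prefixes shortest first, while the question asked for longest first. A small sketch that wraps the same idea in a method and reverses the result (prefixes_before is a hypothetical name; Regexp.last_match.pre_match is the spelled-out form of the pre-match global used above):

```ruby
# Collects everything before each underscore, longest prefix first.
def prefixes_before(str)
  found = []
  # Inside the block, Regexp.last_match describes the current "_" match;
  # pre_match is the part of str that precedes it.
  str.scan(/_/) { found << Regexp.last_match.pre_match }
  found.reverse
end

p prefixes_before("a_b_c_d_e")
# => ["a_b_c_d", "a_b_c", "a_b", "a"]
```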
You can split the string on the underscore character into an Array. Then discard the last element of the array and collect the remaining elements in another array joined by underscores. Like this:
str = "a_b_c_d_e"
str_ary = str.split("_") # will yield ["a","b","c","d","e"]
str_ary.pop # throw out the last element in str_ary
result_ary = [] # an empty array where you will collect your results
until str_ary.empty?
result_ary << str_ary.join("_") #collect the remaining elements of str_ary joined by underscores
str_ary.pop
end
# result_ary = ["a_b_c_d","a_b_c","a_b","a"]
Hope this helps.
I am not sure about “most rubyish”, my solutions would be:
str = 'a_b_c_d_e'
(items = str.split('_')).map.with_index do |_, i|
items.take(i + 1).join('_')
end.reverse
########################################################
(items = str.split('_')).size.downto(1).map do |e|
items.take(e).join('_')
end
########################################################
str.split('_').inject([]) do |memo, l|
memo << [memo.last, l].compact.join('_')
end.reverse
########################################################
([items]*items.size).map.with_index(&:take).map do |e|
e.join('_')
end.reject(&:empty?).reverse
My fave:
([str]*str.count('_')).map.with_index do |s, i|
s[/\A([^_]+_){#{i + 1}}/][0...-1]
end.reverse
Ruby ships with a module for abbreviation.
require "abbrev"
puts ["a_b_c_d_e".tr("_","")].abbrev.keys[1..-1].map{|a| a.chars*"_"}
# => ["a_b_c_d", "a_b_c", "a_b", "a"]
It works on an Array with words - just one in this case. Most work is removing and re-placing the underscores.

Find if all letters in a string are unique

I need to know if all letters in a string are unique. For a string to be unique, a letter can only appear once. If all letters in a string are distinct, the string is unique. If one letter appears multiple times, the string is not unique.
"Cwm fjord veg balks nth pyx quiz."
# => All 26 letters are used only once. This is unique
"This is a string"
# => Not unique, i and s are used more than once
"two"
# => unique, each letter is shown only once
I tried writing a function that determines whether or not a string is unique.
def unique_characters(string)
for i in ('a'..'z')
if string.count(i) > 1
puts "This string is unique"
else
puts "This string is not unique"
end
end
unique_characters("String")
I receive the output
"This string is unique" 26 times.
Edit:
I would like to humbly apologize for including an incorrect example in my OP. I did some research, trying to find pangrams, and assumed that they would only contain 26 letters. I would also like to thank you guys for pointing out my error. After that, I went on Wikipedia to find a perfect pangram (I wrongly thought the others were perfect).
Here is the link for reference purposes
http://en.wikipedia.org/wiki/List_of_pangrams#Perfect_pangrams_in_English_.2826_letters.29
Once again, my apologies.
s = "The quick brown fox jumps over the lazy dog.".downcase
("a".."z").all?{|c| s.count(c) <= 1}
# => false
Another way to do it is:
s = "The quick brown fox jumps over the lazy dog."
(s.downcase !~ /([a-z]).*\1/)
# => false
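The function in the question prints a message once per letter because the puts calls sit inside the loop. A minimal sketch that applies the count check from this answer and returns a single true or false instead (the trailing question mark in unique_characters? is a naming choice made here, not part of the original function):

```ruby
# True when no letter a-z occurs more than once, ignoring case.
def unique_characters?(string)
  downcased = string.downcase
  # String#count with a one-letter argument counts that letter's occurrences.
  ('a'..'z').all? { |letter| downcased.count(letter) <= 1 }
end

p unique_characters?("Cwm fjord veg balks nth pyx quiz.") # => true
p unique_characters?("This is a string")                  # => false
p unique_characters?("two")                               # => true
```

The caller can then print one message based on the returned value.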
I would solve this in two steps: 1) extract the letters 2) check if there are duplicates:
letters = string.scan(/[a-z]/i) # to ignore case, scan string.downcase instead
letters.length == letters.uniq.length
Here is a method that does not convert the string to an array:
def dupless?(str)
str.downcase.each_char.with_object('') { |c,s|
c =~ /[a-z]/ && s.include?(c) ? (return false) : s << c }
true
end
dupless?("Cwm fjord veg balks nth pyx quiz.") #=> true
dupless?("This is a string.") #=> false
dupless?("two") #=> true
dupless?("Two tubs") #=> false
If you want to actually keep track of the duplicate characters:
def is_unique?(string)
# Remove whitespace
string = string.gsub(/\s+/, "")
# Build a hash counting all occurrences of each character
h = Hash.new { |hash, key| hash[key] = 0 }
string.chars.each { |c| h[c] += 1 }
# An array containing all the repetitions
res = h.keep_if {|k, c| c > 1}.keys
if res.size == 0
puts "All #{string.size} characters are used only once. This is unique"
else
puts "Not unique #{res.join(', ')} are used more than once"
end
end
is_unique?("This is a string") # Not unique i, s are used more than once
is_unique?("two") # All 3 characters are used only once. This is unique
To check if a string is unique or not, you can try out this:
string_input.downcase.gsub(/[^a-z]/, '').split("").sort.join('') == ('a' .. 'z').to_a.join('')
This will return true, if all the characters in your string are unique and if they include all the 26 characters.
def has_uniq_letters?(str)
letters = str.gsub(/[^A-Za-z]/, '').chars
letters == letters.uniq
end
If this doesn't have to be case sensitive,
def has_uniq_letters?(str)
letters = str.downcase.gsub(/[^a-z]/, '').chars
letters == letters.uniq
end
In your example, you mentioned you wanted additional information about your string (list of unique characters, etc), so this example may also be useful to you.
# s = "Cwm fjord veg balks nth pyx quiz."
s = "This is a test string."
totals = Hash.new(0)
s.downcase.each_char { |c| totals[c] += 1 if ('a'..'z').cover?(c) }
duplicates, uniques = totals.partition { |k, v| v > 1 }
duplicates, uniques = Hash[duplicates], Hash[uniques]
# duplicates = {"t"=>4, "i"=>3, "s"=>4}
# uniques = {"h"=>1, "a"=>1, "e"=>1, "r"=>1, "n"=>1, "g"=>1}

Ruby array to reject or delete if similar value is found during array merge

How do I delete the earlier array value if similar values exist? Here's the code I use:
def address_geo
arr = []
arr << do if do
arr << re if re
arr << me if me
arr << fa if fa
arr << so if so
arr << la if la
arr.reject{|y|y==''}.join(' ')
end
Given the following values
do = 'I'
re = 'am'
me = 'a'
fa = 'good'
so = 'good'
la = 'boy'
The above method would yield:
I am a good good boy
How should I write the array merge to reject fa and just take so to yield:
I am a good boy
Many thanks!
You can use Array#uniq
> arr = ['good', 'good']
> arr.uniq
=> ['good']
As per @tokland's suggestion, if you wanted to remove only consecutive duplicates, this would work (and support Ruby 1.8). By building a new array using inject we can filter out each string that is either empty?, or the same as the previous string.
> %w(a good good boy).inject([]) do |mem, str|
> mem << str if !str.empty? && mem[-1] != str
> mem
> end
=> ['a', 'good', 'boy']
You can remove consecutive duplicate elements in the array with Enumerable#chunk:
strings = ["hi", "there", "there", "hi", "bye"].select { |x| x && !x.empty? }
strings.chunk { |x| x }.map(&:first).join(" ")
#=> "hi there hi bye"
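Applied to the values from the question, the same chunk idea might look like the sketch below. The words are put in a plain array here, since names such as do cannot be used as Ruby variables, and the trailing nil and empty string are added only to show the filtering:

```ruby
words = ['I', 'am', 'a', 'good', 'good', 'boy', '', nil]

# Drop nil and empty entries first, then collapse runs of equal neighbours.
present = words.reject { |w| w.nil? || w.empty? }
sentence = present.chunk { |w| w }.map(&:first).join(' ')

puts sentence # prints "I am a good boy"
```

Unlike Array#uniq, this keeps a word that reappears later in the list as long as the repeats are not adjacent.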

How to create an ordered list of matches from multiple Regexps in a string?

How can one get a list of matches in a string from multiple different Regexps, and have these matches ordered relatively by their position in the string?
The string can contain multiple matches from the same Regexp.
Based on sepp2k's answer, here's the solution I implemented (simplified example):
test_data = "
a_word
another_word
23445
12432423
third_word
"
regexps = /(?<word>[a-zA-Z_]+)/, /(?<number>[\d]+)/
words = regexps.flat_map(&:names)
matches = []
test_data.scan(Regexp.union(regexps)) do
words.each do |word|
m = Regexp.last_match
matches << {word => m.to_s} if m[word]
end
end
p matches
This outputs:
[{"word"=>"a_word"}, {"word"=>"another_word"}, {"number"=>"23445"}, {"number"=>"12432423"}, {"word"=>"third_word"}]
You can use Regexp.union to turn all the regexps into one regexp and then use String#scan to find all matches. The array returned by scan will be ordered by the position of the match.
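A minimal sketch of that suggestion, using two simple patterns without capture groups (the sample string is made up for illustration):

```ruby
patterns = [/[a-zA-Z_]+/, /\d+/]

# Regexp.union builds one pattern that matches any of the alternatives.
combined = Regexp.union(patterns)

# scan walks the string left to right, so matches come back in
# the order they appear, regardless of which pattern matched.
matches = "a_word 23445 third_word 7".scan(combined)

p matches # => ["a_word", "23445", "third_word", "7"]
```

This version does not record which pattern produced each match; named groups, as in the solution above, are one way to recover that.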
That seems awfully complex when inject and a case statement will do IMHO:
> %w{a_word another_word 23445 12432423 third_word}.inject([]) {|s,v| s << case v when /^[a-zA-Z_]+$/ then {'word' => v} when /^\d+$/ then {'number' => v} end }
=> [{"word"=>"a_word"}, {"word"=>"another_word"}, {"number"=>"23445"}, {"number"=>"12432423"}, {"word"=>"third_word"}]
For readability you could have the following:
data = <<EOD
a_word
another_word
23445
12432423
third_word
EOD
data.split.inject([]) do |s,v|
s << case v
when /^[a-zA-Z_]+$/
{'word' => v}
when /^\d+$/
{'number' => v}
end
end
