Sort arrays that are merged together - ruby

I have 2 strings that I convert into arrays, sort, and then convert back into a string. But in my test, the string in response.body is sorted differently. I have a method that takes 2 strings, removes the headers from both, merges the arrays and sorts the result, but I am getting a different order. How can I get the ordering of the response.body string below?
string1 = "Category Name,Code,Enabled?,Category Hidden?\nPRESENT AVAIALBLE,PRESENT AVAILABLE,No,No,\nBUG AVAILABLE,BUG,No,No,\nBUG,BUG,No,No,\nPRESENT,PRESENT,No,No\n"
string2 = "Category Name,Code,Enabled?,Category Hidden?\nBUG,BUG,No,No,\nBUG AVAILABLE,BUG,No,No,\nEXAMPLE 1,EXAMPLE 1,Yes,No,\nEXAMPLE 2,EXAMPLE 2,Yes,No,\nPRESENT AVAIALBLE,PRESENT AVAILABLE,No,No,\nPRESENT,PRESENT,No,No\n"
How would I get that array sorted in the same order as the response.body string before inserting the header "Category Name,Code,Enabled?,Category Hidden?"?
response.body string
"Category Name,Code,Enabled?,Category Hidden?
BUG,BUG,No,No,
BUG AVAILABLE,BUG,No,No,
EXAMPLE 1,EXAMPLE 1,No,No,
EXAMPLE 2,EXAMPLE 2,Yes,No,
PRESENT,PRESENT,No,No
PRESENT AVAIALBLE,PRESENT AVAILABLE,No,No"
My output from method
"Category Name,Code,Enabled?,Category Hidden?
BUG AVAILABLE,BUG,No,No,
BUG,BUG,No,No,
EXAMPLE 1,EXAMPLE 1,No,No,
EXAMPLE 2,EXAMPLE 2,Yes,No,
PRESENT AVAIALBLE,PRESENT AVAILABLE,No,No,
PRESENT,PRESENT,No,No"
method I wrote
def merge(string1, string2)
  string1 = string1.split("\n") # Split into array.
  headers = string1.first # Get headers.
  string1.shift # Remove headers.
  string2 = string2.split("\n")[1..-1] # Remove headers.
  final = (string1 + string2).sort.unshift(headers).join("\n") + "\n" # Create merged sorted string.
end
desired result wanted
"Category Name,Code,Enabled?,Category Hidden?
BUG,BUG,No,No,
BUG AVAILABLE,BUG,No,No,
EXAMPLE 1,EXAMPLE 1,No,No,
EXAMPLE 2,EXAMPLE 2,Yes,No,
PRESENT,PRESENT,No,No
PRESENT AVAIALBLE,PRESENT AVAILABLE,No,No"

Here are three ways to do that.
I assume you are given two strings:
str1 = "Category Name,Code,Enabled?,Category Hidden?\nBUG,BUG,No,No\nEXAMPLE 1,EXAMPLE 1,No,No\nPRESENT,PRESENT,No,No"
str2 = "Category Name,Code,Enabled?,Category Hidden?\nBUG AVAILABLE,BUG,No,No\nEXAMPLE 2,EXAMPLE 2,Yes,No\nPRESENT AVAILABLE,PRESENT AVAILABLE,No,No"
Then
header, *body1 = str1.split("\n")
#=> ["Category Name,Code,Enabled?,Category Hidden?",
# "BUG,BUG,No,No",
# "EXAMPLE 1,EXAMPLE 1,No,No",
# "PRESENT,PRESENT,No,No"]
so
header
#=> "Category Name,Code,Enabled?,Category Hidden?"
body1
#=> ["BUG,BUG,No,No",
# "EXAMPLE 1,EXAMPLE 1,No,No",
# "PRESENT,PRESENT,No,No"]
and
_, *body2 = str2.split("\n")
#=> ["Category Name,Code,Enabled?,Category Hidden?",
# "BUG AVAILABLE,BUG,No,No",
# "EXAMPLE 2,EXAMPLE 2,Yes,No",
# "PRESENT AVAILABLE,PRESENT AVAILABLE,No,No"]
so
_ #=> "Category Name,Code,Enabled?,Category Hidden?"
body2
#=> ["BUG AVAILABLE,BUG,No,No",
# "EXAMPLE 2,EXAMPLE 2,Yes,No",
# "PRESENT AVAILABLE,PRESENT AVAILABLE,No,No"]
We may then compute the desired string.
str = [header].concat(body1.zip(body2).flatten).join("\n")
#=> "Category Name,Code,Enabled?,Category Hidden?\nBUG,BUG,No,No\nBUG AVAILABLE,BUG,No,No\nEXAMPLE 1,EXAMPLE 1,No,No\nEXAMPLE 2,EXAMPLE 2,Yes,No\nPRESENT,PRESENT,No,No\nPRESENT AVAILABLE,PRESENT AVAILABLE,No,No"
which when displayed appears as follows.
puts str
Category Name,Code,Enabled?,Category Hidden?
BUG,BUG,No,No
BUG AVAILABLE,BUG,No,No
EXAMPLE 1,EXAMPLE 1,No,No
EXAMPLE 2,EXAMPLE 2,Yes,No
PRESENT,PRESENT,No,No
PRESENT AVAILABLE,PRESENT AVAILABLE,No,No
See Array#concat, Array#zip, Array#flatten and Array#join.
The variable _ in _, *body2 = str2.split("\n") is so named to tell the reader that it is not used in subsequent calculations. One might instead write _header, *body2 = str2.split("\n") to convey the same message.
Here is a second way of doing that by treating the strings as comma-delimited CSV strings.
require 'csv'
arr1 = CSV.parse(str1)
#=> [["Category Name", "Code", "Enabled?", "Category Hidden?"],
# ["BUG", "BUG", "No", "No"],
# ["EXAMPLE 1", "EXAMPLE 1", "No", "No"],
# ["PRESENT", "PRESENT", "No", "No"]]
arr2 = CSV.parse(str2)
#=> [["Category Name", "Code", "Enabled?", "Category Hidden?"],
# ["BUG AVAILABLE", "BUG", "No", "No"],
# ["EXAMPLE 2", "EXAMPLE 2", "Yes", "No"],
# ["PRESENT AVAILABLE", "PRESENT AVAILABLE", "No", "No"]]
Then
str = CSV.generate do |csv|
  csv << arr1.shift
  arr2.shift
  until arr2.empty? do
    csv << arr1.shift
    csv << arr2.shift
  end
end
#=> "Category Name,Code,Enabled?,Category Hidden?\nBUG,BUG,No,No\nBUG AVAILABLE,BUG,No,No\nEXAMPLE 1,EXAMPLE 1,No,No\nEXAMPLE 2,EXAMPLE 2,Yes,No\nPRESENT,PRESENT,No,No\nPRESENT AVAILABLE,PRESENT AVAILABLE,No,No\n"
puts str
Category Name,Code,Enabled?,Category Hidden?
BUG,BUG,No,No
BUG AVAILABLE,BUG,No,No
EXAMPLE 1,EXAMPLE 1,No,No
EXAMPLE 2,EXAMPLE 2,Yes,No
PRESENT,PRESENT,No,No
PRESENT AVAILABLE,PRESENT AVAILABLE,No,No
See CSV::parse and CSV::generate.
This can also be done without converting the strings to arrays, manipulating those arrays to form a single array and then converting the single array back to a string.
arr = [str1, str2]
str_indices = 0..str1.count("\n")
arr_indices = 0..arr.size-1
idx_begin = Array.new(arr.size, 0)
# Assign the result before printing: written as "puts ... do ... end", the block would bind to puts.
merged = str_indices.each_with_object("") do |i, str|
  arr_indices.each do |j|
    idx_end = arr[j].index(/(?:\n|\z)/, idx_begin[j])
    s = arr[j][idx_begin[j]..idx_end]
    s << "\n" unless s[-1] == "\n" || (i == str_indices.last && j == arr_indices.last)
    str << s unless i.zero? && j > 0
    idx_begin[j] = idx_end + 1
  end
end
puts merged
Category Name,Code,Enabled?,Category Hidden?
BUG,BUG,No,No
BUG AVAILABLE,BUG,No,No
EXAMPLE 1,EXAMPLE 1,No,No
EXAMPLE 2,EXAMPLE 2,Yes,No
PRESENT,PRESENT,No,No
PRESENT AVAILABLE,PRESENT AVAILABLE,No,No
The regular expression /(?:\n|\z)/ matches a newline character (\n) or (|) the end of the string (\z).
See the form of String#index that takes an optional second argument that specifies the string index where the search is to begin.
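As for why the method in the question orders the rows differently: a plain sort compares whole lines character by character, and a space (0x20) sorts before a comma (0x2C), so "BUG AVAILABLE,..." lands ahead of "BUG,...". A minimal sketch of one way around that, assuming the rows should be ordered field by field; merge_sorted and the short sample strings are hypothetical and not taken from the answers above:

```ruby
# Hypothetical helper, shown only to illustrate field-by-field sorting.
def merge_sorted(string1, string2)
  header, *rows1 = string1.split("\n")
  _header, *rows2 = string2.split("\n")
  # Sort on the array of fields so "BUG" compares as less than "BUG AVAILABLE".
  rows = (rows1 + rows2).sort_by { |row| row.split(",") }
  [header, *rows].join("\n") + "\n"
end

# Whole-line comparison: the space after "BUG" is less than the comma.
"BUG AVAILABLE,BUG" < "BUG,BUG"  #=> true

s1 = "Name,Code\nPRESENT AVAILABLE,x\nBUG,y\n"
s2 = "Name,Code\nBUG AVAILABLE,z\nPRESENT,w\n"
merged = merge_sorted(s1, s2)
#=> "Name,Code\nBUG,y\nBUG AVAILABLE,z\nPRESENT,w\nPRESENT AVAILABLE,x\n"
```

Rows that appear in both inputs are kept twice by this sketch; adding .uniq before sort_by would drop the repeats if that is what is wanted.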

Related

Unscrambling a string given the number of splits and words that the sentence can be comprised of

I'm working on a problem in which I'm given a string that has been scrambled. The scrambling works like this.
An original string is chopped into substrings at random positions and a random number of times.
Each substring is then moved around randomly to form a new string.
I'm also given a dictionary of words that are possible words in the string.
Finally, I'm given the number of splits in the string that were made.
The example I was given is this:
dictionary = ["world", "hello"]
scrambled_string = "rldhello wo"
splits = 1
The expected output of my program would be the original string, in this case:
"hello world"
Suppose the initial string
"hello my name is Sean"
with
splits = 2
yields
["hel", "lo my name ", "is Sean"]
and those three pieces are shuffled to form the following array:
["lo my name ", "hel", "is Sean"]
and then the elements of this array are joined to form:
scrambled = "lo my name helis Sean"
Also suppose:
dictionary = ["hello", "Sean", "the", "name", "of", "my", "cat", "is", "Sugar"]
First convert dictionary to a set to speed lookups.
require 'set'
dict_set = dictionary.to_set
#=> #<Set: {"hello", "Sean", "the", "name", "of", "my", "cat", "is", "Sugar"}>
Next I will create a helper method.
def indices_to_ranges(indices, last_index)
  [-1, *indices, last_index].each_cons(2).map { |i,j| i+1..j }
end
Suppose we split scrambled twice (because splits #=> 2), specifically after the 'y' and the 'h':
indices = [scrambled.index('y'), scrambled.index('h')]
#=> [4, 11]
Inside indices_to_ranges, the array [-1, *indices, last_index] always starts with -1 and ends with last_index, which here is scrambled.size-1.
We may then use indices_to_ranges to convert these indices to ranges of indices of characters in scrambled:
ranges = indices_to_ranges(indices, scrambled.size-1)
#=> [0..4, 5..11, 12..20]
a = ranges.map { |r| scrambled[r] }
#=> ["lo my", " name h", "elis Sean"]
We could of course combine these two steps:
a = indices_to_ranges(indices, scrambled.size-1).map { |r| scrambled[r] }
#=> ["lo my", " name h", "elis Sean"]
Next I will permute the values of a. For each permutation I will join the elements to form a string, then split the string on single spaces to form an array of words. If all of those words are in the dictionary we may claim success and are finished. Otherwise, a different array indices will be constructed and we try again, continuing until success is realized or all possible arrays indices have been considered. We can put all this in the following method.
def unscramble(scrambled, dict_set, splits)
  last_index = scrambled.size-1
  (0..scrambled.size-2).to_a.combination(splits).each do |indices|
    indices_to_ranges(indices, last_index).
      map { |r| scrambled[r] }.
      permutation.each do |arr|
        next if arr[0][0] == ' ' || arr[-1][-1] == ' '
        words = arr.join.split(' ')
        return words if words.all? { |word| dict_set.include?(word) }
      end
  end
  nil # no arrangement of the pieces produced only dictionary words
end
Let's try it.
original string: "hello my name is Sean"
scrambled = "lo my name helis Sean"
splits = 4
unscramble(scrambled, dict_set, splits)
#=> ["my", "name", "hello", "is", "Sean"]
See Array#combination and Array#permutation.
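As a further check, the same approach can be run on the example given in the question (one split, dictionary of "world" and "hello"). The sketch below repeats the two methods so that it runs on its own; the trailing nil for the not-found case is an addition made here for illustration.

```ruby
require 'set'

def indices_to_ranges(indices, last_index)
  [-1, *indices, last_index].each_cons(2).map { |i, j| i+1..j }
end

def unscramble(scrambled, dict_set, splits)
  last_index = scrambled.size - 1
  # Try every way of choosing the positions after which to cut.
  (0..scrambled.size-2).to_a.combination(splits).each do |indices|
    indices_to_ranges(indices, last_index).
      map { |r| scrambled[r] }.
      permutation.each do |arr|
        # A sentence neither starts nor ends with a space.
        next if arr[0][0] == ' ' || arr[-1][-1] == ' '
        words = arr.join.split(' ')
        return words if words.all? { |word| dict_set.include?(word) }
      end
  end
  nil # nothing matched
end

result = unscramble("rldhello wo", ["world", "hello"].to_set, 1)
#=> ["hello", "world"]
```

Here the cut after index 2 gives ["rld", "hello wo"], and swapping the two pieces yields "hello world".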
bonkers answer (not quite perfect yet ... trouble with single chars):
#
# spaces appear to be important!
@check = {}
@ordered = []
def previous_words(word)
  @check.select{|y,z| z[:previous] == word}.map do |nw,z|
    @ordered << nw
    previous_words(nw)
  end
end
def in_word(dictionary, string)
  # check each word in the dictionary to see if the string is contained in one of them
  dictionary.each do |word|
    if word.include?(string)
      return word
    end
  end
  return nil
end
letters=scrambled.split("")
previous=nil
substr=""
letters.each do |l|
  if in_word(dictionary, substr+l)
    substr+= l
  elsif (l==" ")
    word=in_word(dictionary, substr)
    @check[word]={found: 1}
    @check[word][:previous] = previous if previous
    substr=""
    previous=word
  else
    word=in_word(dictionary, substr)
    @check[word]={found: 1}
    @check[word][:previous] = previous if previous
    substr=l
    previous=nil
  end
end
word=in_word(dictionary, substr)
@check[word]={found: 1}
@check[word][:previous] = previous if previous
@check.select{|y,z| z[:previous].nil?}.map do |w,z|
  @ordered << w
  previous_words(w)
end
pp @ordered
output:
dictionary = ["world", "hello"]
scrambled = "rldhello wo"
... my code here ...
2.5.8 :817 > @ordered
=> ["hello", "world"]
dictionary = ["hello", "my", "name", "is", "Sean"]
scrambled = "me is Shelleano my na"
... my code here ...
2.5.8 :879 > @ordered
=> ["Sean", "hello", "my", "name", "is"]

Convert a string based on hash values [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 7 years ago.
I am trying to write a method that takes in a string and a hash and "encodes" the string based on hash keys and values.
def encode(str,encoding)
end
str = "12#3"
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
I am expecting the output to be "one two three"; any char in the string that is not a key in the hash is replaced with an empty string.
Right now my code looks like the following:
def encode(str, encoding)
  output = ""
  str.each_char do |ch|
    if encoding.has_key?(ch)
      output += encoding[ch]
    else
      output += ""
    end
  end
  return output
end
Any help is appreciated
You can use the form of String#gsub that uses a hash for substitutions, and a simple regex:
str = "12#3"
encoding = {"1"=>"one", "2"=>"two", "3"=>"three"}
First create a new hash that adds a space to each value in encoding:
adj_encoding = encoding.each_with_object({}) { |(k,v),h| h[k] = "#{v} " }
#=> {"1"=>"one ", "2"=>"two ", "3"=>"three "}
Now perform the substitutions and strip off the extra space if one of the keys of encoding is the last character of str:
str.gsub(/./, adj_encoding).rstrip
#=> "one two three"
Another example:
"1ab 2xx4cat".gsub(/./, adj_encoding).rstrip
#=> "one two"
Ruby determines whether each character of str (the /./ part) equals a key of adj_encoding. If it does, she substitutes the key's value for the character; else she substitutes an empty string ('') for the character.
You can build a regular expression that matches your keys via Regexp.union:
re = Regexp.union(encoding.keys)
#=> /1|2|3/
scan the string for occurrences of keys using that regular expression:
keys = str.scan(re)
#=> ["1", "2", "3"]
fetch the corresponding values using values_at:
values = encoding.values_at(*keys)
#=> ["one", "two", "three"]
and join the array with a single space:
values.join(' ')
#=> "one two three"
As a "one-liner":
encoding.values_at(*str.scan(Regexp.union(encoding.keys))).join(' ')
#=> "one two three"
Try:
def encode(str, encoding)
  output = ""
  str.each_char do |ch|
    if encoding.has_key?(ch)
      output += encoding[ch] + " "
    else
      output += ""
    end
  end
  return output.split.join(' ')
end
str = "12#3"
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
p encode(str, encoding) #=> "one two three"
If you are expecting "one two three", you just need to add a space in your concatenation line and, before returning, call .lstrip to remove the leading space.
Hint: You don't need the "else" branch that concatenates an empty string. If a character such as "#" doesn't match a key in the encoding hash, it is simply ignored.
Like this:
#str = "12#3"
#encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
def encode(str, encoding)
  output = ""
  str.each_char do |ch|
    if encoding.has_key?(ch)
      output += " " + encoding[ch]
    end
  end
  return output.lstrip
end
# Output: "one two three"
I would do:
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
str = "12#3"
str.chars.map{|x|encoding.fetch(x,nil)}.compact.join(' ')
Or two lines like this:
in_encoding_hash = -> x { encoding.has_key? x }
str.chars.grep(in_encoding_hash){|x|encoding[x]}.join(' ')
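The answers above share one idea: look up each character, skip the ones that are not keys, and join what remains with spaces. A compact sketch of that idea as a method, assuming Ruby 2.7 or later for Enumerable#filter_map (on older versions, map followed by compact does the same):

```ruby
def encode(str, encoding)
  # filter_map keeps only the truthy lookups, so characters
  # that are not keys of the hash are dropped.
  str.each_char.filter_map { |ch| encoding[ch] }.join(' ')
end

encoding = {"1" => "one", "2" => "two", "3" => "three"}
first  = encode("12#3", encoding)        #=> "one two three"
second = encode("1ab 2xx4cat", encoding) #=> "one two"
```

Because the separator is added by join, no trailing or leading space has to be stripped afterwards.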

How do you strip substrings in ruby?

I'd like to replace/duplicate a substring between two delimiters, e.g.:
"This is (the string) I want to replace"
I'd like to strip out everything between the characters ( and ), and set that substr to a variable -- is there a built in function to do this?
I would just do:
my_string = "This is (the string) I want to replace"
p my_string.split(/[()]/) #=> ["This is ", "the string", " I want to replace"]
p my_string.split(/[()]/)[1] #=> "the string"
Here are two more ways to do it:
/\((?<inside_parenthesis>.*?)\)/ =~ my_string
p inside_parenthesis #=> "the string"
my_new_var = my_string[/\((.*?)\)/,1]
p my_new_var #=> "the string"
Edit - Examples to explain the last method:
my_string = 'hello there'
capture = /h(e)(ll)o/
p my_string[capture] #=> "hello"
p my_string[capture, 1] #=> "e"
p my_string[capture, 2] #=> "ll"
var = "This is (the string) I want to replace"[/(?<=\()[^)]*(?=\))/]
var # => "the string"
str = "This is (the string) I want to replace"
str.match(/\((.*)\)/)
some_var = $1 # => "the string"
As I understand, you want to remove or replace a substring as well as set a variable equal to that substring (sans the parentheses). There are many ways to do this, some of which are slight variants of the other answers. Here's another way that also allows for the possibility of multiple substrings within parentheses, picking up from @sawa's comments:
def doit(str, repl)
  vars = []
  # Return both values as an array; a bare "a, b" is not valid as a method's last expression.
  [str.gsub(/\(.*?\)/) {|m| vars << m[1..-2]; repl}, vars]
end
new_str, vars = doit("This is (the string) I want to replace", '')
new_str # => "This is  I want to replace"
vars # => ["the string"]
new_str, vars = doit("This is (the string) I (really) want (to replace)", '')
new_str # => "This is  I  want "
vars # => ["the string", "really", "to replace"]
new_str, vars = doit("This (short) string is a () keeper", "hot dang")
new_str # => "This hot dang string is a hot dang keeper"
vars # => ["short", ""]
In the regex, the ? in .*? makes .* "lazy". gsub passes each match m to the block; the block strips the parens and adds it to vars, then returns the replacement string. This regex also works:
/\([^\(]*\)/
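The difference between the greedy and lazy forms only shows up when the string has more than one pair of parentheses. A short sketch on the multi-parenthesis example string from above; the variable names are chosen here for illustration:

```ruby
str = "This is (the string) I (really) want (to replace)"

# Greedy: .* runs to the last ")" in the string.
greedy = str[/\((.*)\)/, 1]
#=> "the string) I (really) want (to replace"

# Lazy: .*? stops at the first ")".
lazy = str[/\((.*?)\)/, 1]
#=> "the string"

# scan with one capture group returns every parenthesized substring.
inside = str.scan(/\(([^)]*)\)/).flatten
#=> ["the string", "really", "to replace"]
```

The negated class [^)]* used in the scan is another way to stop at the first closing parenthesis without relying on laziness.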

Parse HTML string into array

I'm developing a wiki-like difference functionality for bodies of HTML produced by TinyMCE. diff-lcs is a difference gem that accepts arrays or objects. Most difference tasks are on code and just compare lines. A difference on bodies of HTML ridden text is more complex. If I just plug in the bodies of text, I get a character by character comparison. Although the output would be correct, it would look like garbage.
seq1 = "<p>Here is a paragraph. A sentence with <strong>bold text</strong>.</p><p>The second paragraph.</p>"
seq2 = seq1.gsub(/[.!?]/, '\0|').split('|')
=> ["<p>Here is a paragraph.", " A sentence with <strong>bold text</strong>.", "</p><p>The second paragraph.", "</p>"]
If someone changes the second paragraph, the difference output involves the previous paragraphs end tag. I can't just use strip_tags because I'd like to keep formatting on the compare view. The ideal comparison is one based on complete sentences, with HTML separated out.
seq2.NokogiriMagic
=> ["<p>", "Here is a paragraph.", " A sentence with ", "<strong>", "bold text", "</strong>", ".", "</p>", "<p>", "The second paragraph.", "</p>"]
I found plenty of neat Nokogiri methods but nothing I've found does the above.
Here's how you could do it with a SAX parser:
require 'nokogiri'
html = "<p>Here is a paragraph. A sentence with <strong>bold text</strong>.</p><p>The second paragraph.</p>"
class ArraySplitParser < Nokogiri::XML::SAX::Document
  attr_reader :array
  def initialize; @array = []; end
  def start_element(name, attrs=[])
    tag = "<" + name
    attrs.each { |k,v| tag += " #{k}=\"#{v}\"" }
    @array << tag + ">"
  end
  def end_element(name); @array << "</#{name}>"; end
  def characters(str); @array += str.gsub(/\s/, '\0|').split('|'); end
end
parser = ArraySplitParser.new
Nokogiri::XML::SAX::Parser.new(parser).parse(html)
puts parser.array.inspect
# ["<p>", "Here ", "is ", "a ", "paragraph. ", "A ", "sentence ", "with ", "<strong>", "bold ", "text", "</strong>", ".", "</p>"]
Note that you'll have to wrap your HTML in a root element so that the XML parser doesn't miss the second paragraph in your example. Something like this should work:
# ...
Nokogiri::XML::SAX::Parser.new(parser).parse('<x>' + html + '</x>')
# ...
puts parser.array[1..-2]
# ["<p>", "Here ", "is ", "a ", "paragraph. ", "A ", "sentence ", "with ", "<strong>", "bold ", "text", "</strong>", ".", "</p>", "<p>", "The ", "second ", "paragraph.", "</p>"]
[Edit] Updated to demonstrate how to retain element attributes in the "start_element" method.
You're not writing your code in idiomatic Ruby. We don't use mixed upper/lower case in variable names. Also, in programming in general, it's a good idea to use mnemonic variable names for clarity. Refactoring your code to be closer to how I'd write it:
tags = %w[p ol ul li h6 h5 h4 h3 h2 h1 em strong i b table thead tbody th tr td]
# Deconstruct HTML body 1
doc = Nokogiri::HTML.fragment(@versionOne.body)
nodes = doc.css(tags.join(', '))
# Reconstruct HTML body 1 into comparable array
output = []
nodes.each do |node|
  output << [
    "<#{ node.name }",
    # attribute_nodes returns the Attr objects (attributes returns a Hash); the leading space separates each attribute
    node.attribute_nodes.map { |attr| ' %s="%s"' % [attr.name, attr.value] }.join,
    '>'
  ].join
  output << node.children.to_s.gsub(/[\s.!?]/, '|\0|').split('|').flatten
  output << "</#{ node.name }>"
end
# Same deal for nokoOutput2
sdiff = Diff::LCS.sdiff(nokoOutput2.flatten, output.flatten)
The line:
tag | " #{ param.name }=\"#{ param.value }\" "
in your code isn't Ruby at all because String doesn't have a | operator. Did you add the | operator to your code and not show that definition?
A problem I see is:
output << node.children.to_s.gsub(/[\s.!?]/, '|\0|').split('|').flatten
Many of the tags you are looking for can contain other tags in your list:
<html>
<body>
<table><tr><td>
<table><tr><td>
foo
</td></tr></table>
</td></tr></table>
</body>
</html>
Creating a recursive method that handles:
node.attribute_nodes.map { |attr| ' %s="%s"' % [attr.name, attr.value] }.join,
would probably improve your output. This is untested but is the general idea:
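def dump_node(node)
  # Text nodes have no tag of their own; emit their content as a single element.
  return [node.to_s] if node.text?
  # Collect into an array so that the children's arrays can be appended.
  output = [
    [
      "<#{ node.name }",
      node.attribute_nodes.map { |attr| ' %s="%s"' % [attr.name, attr.value] }.join,
      '>'
    ].join
  ]
  output += node.children.flat_map { |n| dump_node(n) }
  output << "</#{ node.name }>"
end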
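For comparison, here is a rough sketch that produces the exact array asked for in the question using only regular expressions, with no parser at all. This is a different technique from the Nokogiri answers above, and it assumes simple, well-formed markup in which "<" and ">" appear only as tag delimiters; the name tokenize is hypothetical.

```ruby
html = "<p>Here is a paragraph. A sentence with <strong>bold text</strong>.</p><p>The second paragraph.</p>"

# Hypothetical helper: split into tags and text, then split text into sentences.
def tokenize(html)
  html.scan(/<[^>]+>|[^<]+/).flat_map do |piece|
    if piece.start_with?("<")
      [piece] # a tag is kept whole
    else
      # A run of text up to and including its closing punctuation,
      # or punctuation standing alone (such as the "." after </strong>).
      piece.scan(/[^.!?]+[.!?]*|[.!?]+/)
    end
  end
end

tokens = tokenize(html)
#=> ["<p>", "Here is a paragraph.", " A sentence with ", "<strong>", "bold text",
#    "</strong>", ".", "</p>", "<p>", "The second paragraph.", "</p>"]
```

For HTML that may contain comments, entities, or attributes with ">" in their values, a real parser such as the ones shown above remains the safer choice.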

Ruby String concatenation

I have an array
books = ["Title 1", "Title 2", "Title 3"]
I need to iterate through this array and get a variable like this:
@books_read = "Title 1 \n Title 2 \n Title 3"
I tried this bit of code:
books.each do |book|
  @books_read += "#{book} \n"
end
puts @books_read
But, the + operator does not concatenate the strings. Any leads on this please.
Cheers!
You can use Array#join: books.join(" \n ").
join(sep=$,) → str
Returns a string created by converting each element of the array to a
string, separated by sep.
You can use join: books.join(" \n ")
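The question does not show the error, but a likely cause (an assumption) is that the variable was never initialized: an unset instance variable is nil, and nil has no + method. A small sketch with local variables showing the loop once it starts from an empty string, next to join:

```ruby
books = ["Title 1", "Title 2", "Title 3"]

# Start from an empty string rather than nil, then append in the loop.
books_read = ""
books.each { |book| books_read += "#{book} \n " }
books_read #=> "Title 1 \n Title 2 \n Title 3 \n "

# join puts the separator only between elements.
joined = books.join(" \n ")
#=> "Title 1 \n Title 2 \n Title 3"
```

The loop leaves a trailing separator that would still need to be removed, which is why join is the simpler choice here.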
