Sorting Multiple Directory Entries in Ruby

Suppose I have files with the following paths (changed for simplicity):
array = ["root_path/dir1/a/file.jpg",
"root_path/dir1/a/file2.jpg",
"root_path/dir1/b/file3.jpg",
"root_path/dir2/c/file4.jpg"]
How can I sort them into a nested hash like this?
sort_directory(array)
#=>
{
"dir1" => {
"a" => [
"root_path/dir1/a/file.jpg",
"root_path/dir1/a/file2.jpg"
],
"b" => [
"root_path/dir1/b/file3.jpg"
]
},
"dir2" => {
"c" => [
"root_path/dir2/c/file4.jpg"
]
}
}

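One way of doing it uses group_by, split and a regex. The map returns [key, value] pairs, and to_h turns them into the desired hash (without it the result would be an array of single-key hashes):
array.group_by { |dir| dir.split('/')[1] }.map { |k, v| [k, v.group_by { |file| file[/\/([^\/]+)(?=\/[^\/]+\/?\Z)/, 1] }] }.to_h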
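For comparison, here is a minimal sketch of the same two-level grouping without a regex. It assumes every path has the shape root/dir/subdir/file and reuses the sort_directory name from the question; treat it as an illustration rather than a general solution.

```ruby
# Sketch: group paths by their second and third components.
# Assumes each path looks like "root/dir/subdir/file".
def sort_directory(paths)
  paths.each_with_object({}) do |path, tree|
    _root, dir, subdir = path.split("/")
    # Create the nested containers on first use, then append the full path.
    ((tree[dir] ||= {})[subdir] ||= []) << path
  end
end

array = ["root_path/dir1/a/file.jpg",
         "root_path/dir1/a/file2.jpg",
         "root_path/dir1/b/file3.jpg",
         "root_path/dir2/c/file4.jpg"]

result = sort_directory(array)
# result["dir1"]["a"] holds the two paths under dir1/a
```

Splitting once per path keeps the intent visible, at the cost of hard-coding the grouping depth to two levels.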

Here is how you can use recursion to obtain the desired result.
If s = "root_path/dir1/a/b/c/file.jpg", we can regard "root_path" as being at "position" 0, "dir1" at position 1 and so on. The example given by the OP has desired grouping on values at positions 1 and 2, which I will write positions = [1,2].
There is no limit to the number of positions on which to group or their order. For the string above we could write, for example, positions = [2,4,1], so the first grouping would be on position 2, the next on position 4 and the last on position 1 (though I have no idea if that could be of interest).
Code
def hashify(arr, positions)
recurse(positions, arr.map { |s| s.split("/") })
end
def recurse(positions, parts)
return parts.map { |a| a.join('/') } if positions.empty?
pos, *positions = positions
h = parts.group_by { |a| a[pos] }.
each_with_object({}) { |(k,a),g| g[k]=recurse(positions, a) }
end
Example
arr = ["root_path/dir1/a/file.jpg",
"root_path/dir1/a/file2.jpg",
"root_path/dir1/b/file3.jpg",
"root_path/dir2/c/file4.jpg"]
hashify(arr, [1, 2])
#=>{"dir1"=>{"a"=>["root_path/dir1/a/file.jpg", "root_path/dir1/a/file2.jpg"],
# "b"=>["root_path/dir1/b/file3.jpg"]},
# "dir2"=>{"c"=>["root_path/dir2/c/file4.jpg"]}}
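Because positions is just a list, the same method can group on other components as well. The sketch below repeats the two methods so that it runs on its own, then groups on position 2 alone and on position 2 followed by position 1. The variable names by_subdir and by_subdir_then_dir are only illustrative.

```ruby
def hashify(arr, positions)
  recurse(positions, arr.map { |s| s.split("/") })
end

def recurse(positions, parts)
  # No positions left: rebuild the original path strings.
  return parts.map { |a| a.join("/") } if positions.empty?
  pos, *rest = positions
  parts.group_by { |a| a[pos] }
       .each_with_object({}) { |(k, a), g| g[k] = recurse(rest, a) }
end

arr = ["root_path/dir1/a/file.jpg",
       "root_path/dir1/a/file2.jpg",
       "root_path/dir1/b/file3.jpg",
       "root_path/dir2/c/file4.jpg"]

# Group on the innermost directory only.
by_subdir = hashify(arr, [2])

# Group on the innermost directory first, then on its parent.
by_subdir_then_dir = hashify(arr, [2, 1])
```

With this data, by_subdir has the keys "a", "b" and "c", and by_subdir_then_dir nests a "dir1" or "dir2" hash under each of them.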
Explanation
Recursive methods are difficult to explain. The best way, in my opinion, is to insert puts statements to show the sequence of calculation. I've also indented a few spaces whenever the method calls itself. Here is how the code might be modified for that purpose.
INDENT = 4
def hashify(arr, positions)
recurse(positions, arr.map { |s| s.split("/") }, 0)
end
def recurse(positions, parts, lvl)
puts
"lvl=#{lvl}".pr(lvl)
"positions=#{ positions }".pr(lvl)
if positions.empty?
"parts=#{parts}".pr(lvl)
return parts.map { |a| a.join('/') }
end
pos, *positions = positions
"pos=#{pos}, positions=#{positions}".pr(lvl)
h = parts.group_by { |a| a[pos] }
"h=#{h}".pr(lvl)
g = h.each_with_object({}) { |(k,a),g| g[k]=recurse(positions, a, lvl+1) }
"rv=#{g}".pr(lvl)
g
end
class String
def pr(lvl)
print "#{ ' ' * INDENT * lvl}"
puts self
end
end
We now execute this method for the data given in the example.
hashify(arr, [1, 2])
lvl=0
positions=[1, 2]
pos=1, positions=[2]
h={"dir1"=>[["root_path", "dir1", "a", "file.jpg"],
["root_path", "dir1", "a", "file2.jpg"],
["root_path", "dir1", "b", "file3.jpg"]],
"dir2"=>[["root_path", "dir2", "c", "file4.jpg"]]}
lvl=1
positions=[2]
pos=2, positions=[]
h={"a"=>[["root_path", "dir1", "a", "file.jpg"],
["root_path", "dir1", "a", "file2.jpg"]],
"b"=>[["root_path", "dir1", "b", "file3.jpg"]]}
lvl=2
positions=[]
parts=[["root_path", "dir1", "a", "file.jpg"],
["root_path", "dir1", "a", "file2.jpg"]]
lvl=2
positions=[]
parts=[["root_path", "dir1", "b", "file3.jpg"]]
rv={"a"=>["root_path/dir1/a/file.jpg", "root_path/dir1/a/file2.jpg"],
"b"=>["root_path/dir1/b/file3.jpg"]}
lvl=1
positions=[2]
pos=2, positions=[]
h={"c"=>[["root_path", "dir2", "c", "file4.jpg"]]}
lvl=2
positions=[]
parts=[["root_path", "dir2", "c", "file4.jpg"]]
rv={"c"=>["root_path/dir2/c/file4.jpg"]}
rv={"dir1"=>{"a"=>["root_path/dir1/a/file.jpg",
"root_path/dir1/a/file2.jpg"],
"b"=>["root_path/dir1/b/file3.jpg"]},
"dir2"=>{"c"=>["root_path/dir2/c/file4.jpg"]}}

Related

How to find the largest value of a hash in an array of hashes

In my array, I'm trying to retrieve the key with the largest value of "value_2", so in this case, "B":
myArray = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
myArray.each do |array_hash|
array_hash.each do |key, value|
if value["value_2"] == array_hash.values.max
puts key
end
end
end
I get the error:
"comparison of Hash with Hash failed (ArgumentError)".
What am I missing?
The array given in the question holds a single hash with three keys. The code below assumes instead that each key is held in its own single-key hash:
arr = [{ "A" => { "value_1" => 30, "value_2" => 240 } },
{ "B" => { "value_1" => 40, "value_2" => 250 } },
{ "C" => { "value_1" => 18, "value_2" => 60 } }]
We can find the desired key as follows:
arr.max_by { |h| h.values.first["value_2"] }.keys.first
#=> "B"
See Enumerable#max_by. The steps are:
g = arr.max_by { |h| h.values.first["value_2"] }
#=> {"B"=>{"value_1"=>40, "value_2"=>250}}
a = g.keys
#=> ["B"]
a.first
#=> "B"
In calculating g, for
h = arr[0]
#=> {"A"=>{"value_1"=>30, "value_2"=>240}}
the block calculation is
a = h.values
#=> [{"value_1"=>30, "value_2"=>240}]
b = a.first
#=> {"value_1"=>30, "value_2"=>240}
b["value_2"]
#=> 240
Suppose now arr is as follows:
arr << { "D" => { "value_1" => 23, "value_2" => 250 } }
#=> [{"A"=>{"value_1"=>30, "value_2"=>240}},
# {"B"=>{"value_1"=>40, "value_2"=>250}},
# {"C"=>{"value_1"=>18, "value_2"=>60}},
# {"D"=>{"value_1"=>23, "value_2"=>250}}]
and we wish to return an array of all keys for which the value of "value_2" is maximum (["B", "D"]). We can obtain that as follows.
max_val = arr.map { |h| h.values.first["value_2"] }.max
#=> 250
arr.select { |h| h.values.first["value_2"] == max_val }.flat_map(&:keys)
#=> ["B", "D"]
flat_map(&:keys) is shorthand for:
flat_map { |h| h.keys }
which returns the same array as:
map { |h| h.keys.first }
See Enumerable#flat_map.
Code
p myArray.first.max_by { |k, v| v["value_2"] }.first
Output
"B"
I'd use:
my_array = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
h = Hash[*my_array]
# => {"A"=>{"value_1"=>30, "value_2"=>240},
# "B"=>{"value_1"=>40, "value_2"=>250},
# "C"=>{"value_1"=>18, "value_2"=>60}}
k = h.max_by { |k, v| v['value_2'] }.first # => "B"
Hash[*my_array] unwraps the single hash held in the array (my_array.first would do the same here). Then max_by iterates over each key/value pair and returns a two-element array containing the key "B" and its sub-hash, making it easy to grab the key using first:
k = h.max_by { |k, v| v['value_2'] } # => ["B", {"value_1"=>40, "value_2"=>250}]
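I guess the idea of your solution is to loop through each hash element and compare the maximum value found with value["value_2"].
But you are getting an error at
if value["value_2"] == array_hash.values.max
because array_hash.values is an array of hashes, and max cannot compare one hash with another. With a single element there is nothing to compare, so max simply returns that hash: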
{"A"=>{"value_1"=>30, "value_2"=>240}}.values.max
#=> {"value_1"=>30, "value_2"=>240}
It should be like this:
max = nil
max_key = ""
myArray.each do |array_hash|
array_hash.each do |key, value|
if max.nil? || value["value_2"] > max
max = value["value_2"]
max_key = key
end
end
end
# max_key #=> "B"
Another solution (myArray holds a single hash, so take it with first):
myArray.first.transform_values { |v| v["value_2"] }.max_by { |_, v| v }.first
You asked "What am I missing?".
I think you are missing a proper understanding of the data structures that you are using. I suggest that you try printing the data structures and take a careful look at the results.
The simplest way is p myArray which gives:
[{"A"=>{"value_1"=>30, "value_2"=>240}, "B"=>{"value_1"=>40, "value_2"=>250}, "C"=>{"value_1"=>18, "value_2"=>60}}]
You can get prettier results using pp:
require 'pp'
pp myArray
yields:
[{"A"=>{"value_1"=>30, "value_2"=>240},
"B"=>{"value_1"=>40, "value_2"=>250},
"C"=>{"value_1"=>18, "value_2"=>60}}]
This helps you to see that myArray has only one element, a Hash.
You could also look at the expression array_hash.values.max inside the loop:
myArray.each do |array_hash|
p array_hash.values
end
gives:
[{"value_1"=>30, "value_2"=>240}, {"value_1"=>40, "value_2"=>250}, {"value_1"=>18, "value_2"=>60}]
Not what you expected? :-)
Given this, what would you expect to be returned by array_hash.values.max in the above loop?
Use p and/or pp liberally in your ruby code to help understand what's going on.
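To make the cause of the error concrete, here is a small sketch that reproduces it on purpose and then compares numbers instead of hashes. The variable names hashes, error_message and largest are only illustrative.

```ruby
hashes = [{ "value_1" => 30, "value_2" => 240 },
          { "value_1" => 40, "value_2" => 250 }]

# max compares elements with <=>. Hash only inherits Object#<=>,
# which returns nil for two different hashes, so max raises.
error_message =
  begin
    hashes.max
    nil
  rescue ArgumentError => e
    e.message
  end

# Comparing the numbers stored under "value_2" works as expected.
largest = hashes.map { |h| h["value_2"] }.max
```

Here error_message holds the same "comparison of Hash with Hash failed" text as in the question, and largest is 250.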

How to set array to tree fabric using Ruby

I have an array with values:
list = [["a"], ["a", "b"], ["a", "b", "c"], ["a", "b", "c", "d"]]
I would like to convert this array into a tree structure, just like a computer's directory structure.
I'm trying to use a recursive function to solve this, and the expected result is a Hash, like this:
{ "a" => { "b" => { "c" => { "d" => {} } } } }
This will help me show Redis keys as a collapsible tree.
Using brilliant Hashie::Mash and Kernel.eval:
input = [%w|a|, %w|a b|, %w|a b c|, %w|a b c d|]
require 'hashie/mash'
input.each_with_object(Hashie::Mash.new) do |e, acc|
eval ["acc", e.map{ |k| "#{k}!" }].join(".")
end
#⇒ { "a" => { "b" => { "c" => { "d" => {} } } } }
You didn't show any code, so I won't either.
You're looking for a Trie, not just a Tree.
Pick any gem.
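If pulling in a gem or using eval is not desirable, the nested hash can also be built with plain Ruby. This is a minimal sketch; build_tree is a hypothetical name, and it assumes every element of the list is an array of keys describing a path from the root.

```ruby
# Sketch: walk each key path from the root, creating empty hashes as needed.
def build_tree(paths)
  paths.each_with_object({}) do |path, tree|
    # reduce starts at the root and steps one level deeper per key.
    path.reduce(tree) { |node, key| node[key] ||= {} }
  end
end

list = [["a"], ["a", "b"], ["a", "b", "c"], ["a", "b", "c", "d"]]
tree = build_tree(list)
# tree == { "a" => { "b" => { "c" => { "d" => {} } } } }
```

Since existing nodes are reused through ||=, paths that share a prefix end up under the same branch.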

Ruby search for word in string

Given input = "helloworld"
The output should be output = ["hello", "world"]
Assume I have a method called is_in_dict? which returns true if the given word is in the dictionary.
So far I have tried:
ar = []
input.split("").each do |f|
ar << f if is_in_dict? f
# need to check the given chars here
end
How to achieve it in Ruby?
Instead of splitting the input into characters, you have to inspect all combinations, i.e. "h", "he", "hel", ... "helloworld", "e", "el" , "ell", ... "elloworld" and so on.
Something like this should work:
(0..input.size).to_a.combination(2).each do |a, b|
word = input[a...b]
ar << word if is_in_dict?(word)
end
#=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ar
#=> ["hello", "world"]
Or, using each_with_object, which returns the array:
(0..input.size).to_a.combination(2).each_with_object([]) do |(a, b), array|
word = input[a...b]
array << word if is_in_dict?(word)
end
#=> ["hello", "world"]
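The question does not show is_in_dict?, so the sketch below supplies a hypothetical one backed by a three-word list in order to make the substring scan runnable. The extra word "low" is included on purpose: it shows that this approach reports every dictionary word occurring anywhere in the input, not only the words of a clean split.

```ruby
# Hypothetical dictionary lookup; the real is_in_dict? is not shown in the question.
def is_in_dict?(word)
  %w[hello low world].include?(word)
end

input = "helloworld"

# Every pair of cut points (a, b) with a < b marks one substring input[a...b].
found = (0..input.size).to_a.combination(2).each_with_object([]) do |(a, b), words|
  word = input[a...b]
  words << word if is_in_dict?(word)
end
# found contains "hello", "low" and "world"
```

Whether overlapping hits such as "low" are wanted depends on the use case; if the words must cover the input exactly, a recursive split is the better fit.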
Another approach is to build a custom Enumerator:
class String
def each_combination
return to_enum(:each_combination) unless block_given?
(0..size).to_a.combination(2).each do |a, b|
yield self[a...b]
end
end
end
String#each_combination yields all combinations (instead of just the indices):
input.each_combination.to_a
#=> ["h", "he", "hel", "hell", "hello", "hellow", "hellowo", "hellowor", "helloworl", "helloworld", "e", "el", "ell", "ello", "ellow", "ellowo", "ellowor", "elloworl", "elloworld", "l", "ll", "llo", "llow", "llowo", "llowor", "lloworl", "lloworld", "l", "lo", "low", "lowo", "lowor", "loworl", "loworld", "o", "ow", "owo", "owor", "oworl", "oworld", "w", "wo", "wor", "worl", "world", "o", "or", "orl", "orld", "r", "rl", "rld", "l", "ld", "d"]
It can be used with select to easily filter specific words:
input.each_combination.select { |word| is_in_dict?(word) }
#=> ["hello", "world"]
This seems to be a task for recursion. In short, you want to take letters one by one until you get a word which is in the dictionary. This alone, however, does not guarantee that the result is correct, as the remaining letters may not form a word ('hell' + 'oworld'?). This is what I would do:
def split_words(string)
return [[]] if string == ''
chars = string.chars
word = ''
(1..string.length).map do
word += chars.shift
next unless is_in_dict?(word)
other_splits = split_words(chars.join)
next if other_splits.empty?
other_splits.map {|split| [word] + split }
end.compact.inject([], :+)
end
split_words('helloworld') #=> [['hello', 'world']] No hell!
It will also give you all possible splits, so pages with URLs like penisland can be avoided:
split_words('penisland') #=> [['pen', 'island'], [<the_other_solution>]]

Array multiplication - what is the best algorithm?

I have to make a multiplication of n arrays.
Example :
input = ["a", "b", "c"] * ["1", "2"] * ["&", "(", "$"]
output = ["a1&", "a1(", "a1$", "a2&", "a2(", "a2$", "b1&", "b1(", "b1$", "b2&", "b2(", "b2$", "c1&", "c1(", "c1$", "c2&", "c2(", "c2$"]
I have created an algorithm to do that, and it works well.
# input
entries = ["$var1$", "$var2$", "$var3$"]
data = [["a", "b", "c"], ["1", "2"], ["&", "(", "$"]]
num_combinaison = 1
data.each { |item| num_combinaison = num_combinaison * item.length }
result = []
for i in 1..num_combinaison do
result.push entries.join()
end
num_repetition = num_combinaison
data.each_index do |index|
item = Array.new(data[index])
num_repetition = num_repetition / item.length
for i in 1..num_combinaison do
result[i-1].gsub!(entries[index], item[0])
if i % num_repetition == 0
item.shift
item = Array.new(data[index]) if item.length == 0
end
end
end
I'm sure there is a better way to do that, but I can't find it. I have tried to use the product and flatten functions without success.
Does anybody see a better solution?
Thanks for your help.
Eric
class Array
def * other; product(other).map(&:join) end
end
["a", "b", "c"] * ["1", "2"] * ["&", "(", "$"]
# =>
# ["a1&", "a1(", "a1$", "a2&", "a2(", "a2$", "b1&", "b1(", "b1$", "b2&",
# "b2(", "b2$", "c1&", "c1(", "c1$", "c2&", "c2(", "c2$"]
The best algorithm you can use is implemented by the Array#product method:
data = [["a", "b", "c"], ["1", "2"], ["&", "(", "$"]]
data.first.product(*data.drop(1)).map(&:join)
# => ["a1&", "a1(", "a1$", "a2&", "a2(", "a2$", ...
Update
A safer alternative, since my first solution raises a NoMethodError if data is empty:
data.reduce { |result, ary| result.product(ary).map(&:join) }
# => ["a1&", "a1(", "a1$", "a2&", "a2(", "a2$", ...
[].reduce { |r, a| r.product(a).map(&:join) }
# => nil
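If an empty array is a more convenient result than nil for empty input, the product call can be wrapped in a small helper. This is only a sketch; combine is a hypothetical name, and returning [] for empty data is an assumption about what the caller wants.

```ruby
# Sketch: joined Cartesian product of any number of arrays.
def combine(data)
  # Assumption: no input arrays means no combinations.
  return [] if data.empty?
  head, *rest = data
  # product with several arguments builds all tuples in one call.
  head.product(*rest).map(&:join)
end

data = [["a", "b", "c"], ["1", "2"], ["&", "(", "$"]]
combos = combine(data)
# combos.length == 18, starting with "a1&", "a1(", "a1$"
```

Passing all remaining arrays to a single product call avoids joining intermediate strings at every step, as the reduce version does.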

Convert cartesian product to nested hash in ruby

I have a structure with a cartesian product that looks like this (and could go out to arbitrary depth)...
variables = ["var1","var2",...]
myhash = {
{"var1"=>"a", "var2"=>"a", ...}=>1,
{"var1"=>"a", "var2"=>"b", ...}=>2,
{"var1"=>"b", "var2"=>"a", ...}=>3,
{"var1"=>"b", "var2"=>"b", ...}=>4,
}
... it has a fixed structure but I'd like simple indexing so I'm trying to write a method to convert it to this :
nested = {
"a"=> {
"a"=> 1,
"b"=> 2
},
"b"=> {
"a"=> 3,
"b"=> 4
}
}
Any clever ideas (that allow for arbitrary depth)?
Maybe like this (not the cleanest way):
def cartesian_to_map(myhash)
{}.tap do |hash|
myhash.each do |h|
(hash[h[0]["var1"]] ||= {}).merge!({h[0]["var2"] => h[1]})
end
end
end
Result:
puts cartesian_to_map(myhash).inspect
{"a"=>{"a"=>1, "b"=>2}, "b"=>{"a"=>3, "b"=>4}}
Here is my example.
It uses a method index(hash, fields) that takes the hash, and the fields you want to index by.
It's dirty, and uses a local variable to pass up the current level in the index.
I bet you can make it much nicer.
def index(hash, fields)
# store the last index of the fields
last_field = fields.length - 1
# our indexed version
indexed = {}
hash.each do |key, value|
# our current point in the indexed hash
point = indexed
fields.each_with_index do |field, i|
key_field = key[field]
if i == last_field
point[key_field] = value
else
# ensure the next point is a hash
point[key_field] ||= {}
# move our point up
point = point[key_field]
end
end
end
# return our indexed hash
indexed
end
You can then just call
index(myhash, ["var1", "var2"])
And it should look like what you want
index({
{"var1"=>"a", "var2"=>"a"} => 1,
{"var1"=>"a", "var2"=>"b"} => 2,
{"var1"=>"b", "var2"=>"a"} => 3,
{"var1"=>"b", "var2"=>"b"} => 4,
}, ["var1", "var2"])
==
{
"a"=> {
"a"=> 1,
"b"=> 2
},
"b"=> {
"a"=> 3,
"b"=> 4
}
}
It seems to work.
(see it as a gist: https://gist.github.com/1126580)
Here's an ugly-but-effective solution:
nested = Hash[ myhash.group_by{ |h,n| h["var1"] } ].tap{ |nested|
nested.each do |v1,a|
nested[v1] = a.group_by{ |h,n| h["var2"] }
nested[v1].each{ |v2,a| nested[v1][v2] = a.flatten.last }
end
}
p nested
#=> {"a"=>{"a"=>1, "b"=>2}, "b"=>{"a"=>3, "b"=>4}}
You might consider an alternative representation that is easier to map to and (IMO) just as easy to index:
paired = Hash[ myhash.map{ |h,n| [ [h["var1"],h["var2"]], n ] } ]
p paired
#=> {["a", "a"]=>1, ["a", "b"]=>2, ["b", "a"]=>3, ["b", "b"]=>4}
p paired[["a","b"]]
#=> 2
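For arbitrary depth, the list of variable names can drive the nesting directly. The following is a minimal sketch; nest is a hypothetical name, and it assumes every key hash contains all of the listed variables.

```ruby
# Sketch: nest the values under one hash level per variable.
def nest(myhash, variables)
  *parents, leaf = variables
  myhash.each_with_object({}) do |(key, value), nested|
    # Walk (and create) one level per parent variable.
    node = parents.reduce(nested) { |h, var| h[key[var]] ||= {} }
    # The last variable indexes the stored value itself.
    node[key[leaf]] = value
  end
end

myhash = {
  { "var1" => "a", "var2" => "a" } => 1,
  { "var1" => "a", "var2" => "b" } => 2,
  { "var1" => "b", "var2" => "a" } => 3,
  { "var1" => "b", "var2" => "b" } => 4
}

nested = nest(myhash, ["var1", "var2"])
# nested == {"a"=>{"a"=>1, "b"=>2}, "b"=>{"a"=>3, "b"=>4}}
value = nested.dig("a", "b")
```

Hash#dig then gives a short way to index the result at any depth, for example nested.dig("a", "b") for the entry stored under var1 "a" and var2 "b".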

Resources