I have a hash in Ruby that looks like this:
{"NameValues"=>[
{"Name"=>"Field 1", "Values"=>["Data 1"]},
{"Name"=>"Field 2", "Values"=>["Data 2"]},
{"Name"=>"Field 3", "Values"=>["Data 3"]},
{"Name"=>"Field 4", "Values"=>["Data 4"]},
{"Name"=>"Field 5", "Values"=>["Data 5"]}
]}
I want to select the contents of the "Values" element by using the name from the "Name" element, e.g., locate the "Data 3" string by searching for "Field 3", and so on.
You could use the Enumerable#find method to find the hash by name:
hash = {"NameValues"=>[
{"Name"=>"Field 1", "Values"=>["Data 1"]},
{"Name"=>"Field 2", "Values"=>["Data 2"]},
{"Name"=>"Field 3", "Values"=>["Data 3"]},
{"Name"=>"Field 4", "Values"=>["Data 4"]},
{"Name"=>"Field 5", "Values"=>["Data 5"]}
]}
p hash['NameValues'].find{ |h| h['Name'] == 'Field 3'}['Values']
#=> ["Data 3"]
find basically iterates through the NameValues array until a matching element is found. You can then get the Values from the returned element.
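One thing to watch for: find returns nil when no element matches, so chaining ['Values'] directly onto it raises NoMethodError for an unknown name. A minimal sketch of a guarded lookup follows; the helper name values_for is hypothetical and the sample data is shortened from the question.

```ruby
hash = {"NameValues"=>[
  {"Name"=>"Field 1", "Values"=>["Data 1"]},
  {"Name"=>"Field 3", "Values"=>["Data 3"]}
]}

# values_for is a hypothetical helper name, not part of Ruby.
# It returns the "Values" array for the given name, or nil when no entry matches.
def values_for(hash, name)
  match = hash['NameValues'].find { |h| h['Name'] == name }
  match && match['Values']
end

p values_for(hash, 'Field 3') #=> ["Data 3"]
p values_for(hash, 'Field 9') #=> nil
```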
I have Elasticsearch documents with a structure like this:
{
"name": "item1",
"storages": [
{"items": ["a", "b", "c", "d", "e", "f"]},
{"items": ["a 1", "b 2", "c 3", "d 4", "e 5", "f 6"]}]
}
{
"name": "item2",
"storages": [
{"items": ["d", "e", "f", "g", "h", "i", "j"]},
{"items": ["d 4", "e 5", "f 6", "g 7", "h 8", "i 9", "j 10"]}
]
}
and I want to search for a sequence of strings, for example ["d 4","e 5"].
For this I use a more_like_this query:
{
"query": {
"more_like_this" : {
"fields" : ["storages.items"],
"like" : ["d 4","e 5"],
"min_term_freq": 1,
"min_doc_freq": 1
}
}
}
and it works almost fine, but it returns "_score": 0.1620518 for the first document and "_score": 0.13890153 for the second.
I want to boost the score for terms near the beginning of the 'items' array, so because "d 4", "e 5" appear at the beginning of the array, that document should be ranked higher.
Is there a way to create such a query in Elasticsearch? Maybe it should not be a more_like_this query at all?
The tricky part is that the query could be something like ["d 4","e 5", "xxx"] (xxx is not present in any document, but that's OK).
As you can see in this answer to a related question,
arrays are indexed—made searchable—as multivalue fields, which are
unordered
so you can't count on the order when you search.
Even worse, the array of objects is not stored as you think.
Arrays of objects do not work as you would expect: you cannot query each object independently of the other objects in the array. If you need to be able to do this then you should use the nested datatype instead of the object datatype.
Assuming the following data tuple containing a person's name, age and the books he has read:
list = [
["Peter", 21, ["Book 1", "Book 2", "Book 3", "Book 4"]],
["Amy", 19, ["Book 3", "Book 4"]],
["Sanders", 32, ["Book 1", "Book 2"]],
["Charlie", 21, ["Book 4", "Book 5", "Book 6"]],
["Amanda", 21, ["Book 2", "Book 5"]]
]
What is the optimal way to extract names grouped by the books read, into the following format (basically an array of arrays, each containing the book name and an array of the names of the people who read it)?
results = [
["Book 1", ["Sanders", "Peter"]],
["Book 2", ["Sanders", "Amanda", "Peter"]],
["Book 3", ["Peter", "Amy"]],
["Book 4", ["Charlie", "Peter", "Amy"]],
["Book 5", ["Amanda","Charlie"]],
["Book 6", ["Charlie"]]
]
I've tried the following iterating method which extracts the lists of names and puts them into a hash, with the book title as the keys.
book_hash = Hash.new([])
list.each { |name,age,books|
books.each { |x| book_hash[x] = book_hash[x] + [name] }
}
results = book_hash.to_a.sort
However, the above method seems rather inefficient when handling large datasets containing millions of names. I've attempted to use Enumerable#group_by, but so far I've been unable to make it work with nested arrays.
Does anyone have any idea about the above?
A hash is a more suitable output here:
list.each_with_object({}) do |(name, age, books), hash|
books.each do |book|
(hash[book] ||= []) << name
end
end
If you must make it an array, then append a .to_a to the output of the above.
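As a rough end-to-end sketch of the above, using the question's data with the missing brackets restored: each name is appended in place with <<, so no new array is built per insertion the way book_hash[x] + [name] does. The variable name readers_by_book is only illustrative.

```ruby
list = [
  ["Peter", 21, ["Book 1", "Book 2", "Book 3", "Book 4"]],
  ["Amy", 19, ["Book 3", "Book 4"]],
  ["Sanders", 32, ["Book 1", "Book 2"]],
  ["Charlie", 21, ["Book 4", "Book 5", "Book 6"]],
  ["Amanda", 21, ["Book 2", "Book 5"]]
]

# Build book => [names], appending to the existing array for each book.
readers_by_book = list.each_with_object({}) do |(name, _age, books), hash|
  books.each { |book| (hash[book] ||= []) << name }
end

# Convert to an array of [book, names] pairs, ordered by book title.
results = readers_by_book.sort_by(&:first)

p results.first #=> ["Book 1", ["Peter", "Sanders"]]
```

Names appear in the order they are encountered in list, which differs from the order shown in the question's desired output; sort each names array if that order matters.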
I have a hash like this:
{"examples"=>
[{"year"=>1999,
"provider"=>{"name"=>"abc", "id"=>711},
"url"=> "http://example.com/1",
"reference"=>"abc",
"text"=> "Sample text 1",
"title"=> "Sample Title 1",
"documentId"=>30091286,
"exampleId"=>786652043,
"rating"=>357.08115},
{"year"=>1999,
"provider"=>{"name"=>"abc", "id"=>3243},
"url"=> "http://example.com/2",
"reference"=>"dec",
"text"=> "Sample text 2",
"title"=> "Sample Title 2",
"documentId"=>30091286,
"exampleId"=>786652043,
"rating"=>357.08115},
{"year"=>1999,
"provider"=>{"name"=>"abc", "id"=>191920},
"url"=> "http://example.com/3",
"reference"=>"wer",
"text"=> "Sample text 3",
"title"=> "Sample Title 3",
"documentId"=>30091286,
"exampleId"=>786652043,
"rating"=>357.08115}]
}
and I would like to create a new array by pulling out the keys, and values for just the "text", "url" and "title" keys like below.
[
{"text"=> "Sample text 1", "title"=> "Sample Title 1", "url"=> "http://example.com/1"},
{"text"=> "Sample text 2", "title"=> "Sample Title 2", "url"=> "http://example.com/2"},
{"text"=> "Sample text 3", "title"=> "Sample Title 3", "url"=> "http://example.com/3"}
]
Any help is sincerely appreciated.
You can do it like this:
hash['examples'].map do |hash|
keys = ["text", "title", "url"]
keys.zip(hash.values_at(*keys)).to_h
end
If you are on a Ruby version below 2.1, where Array#to_h is not available, use this inside the block instead:
Hash[keys.zip(hash.values_at(*keys))]
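On Ruby 2.5 or later, Hash#slice may be a shorter route to the same result. The sketch below assumes that Ruby version and uses a trimmed-down copy of the question's data; the names data and trimmed are only illustrative.

```ruby
data = {"examples"=>[
  {"year"=>1999, "url"=>"http://example.com/1",
   "text"=>"Sample text 1", "title"=>"Sample Title 1"},
  {"year"=>1999, "url"=>"http://example.com/2",
   "text"=>"Sample text 2", "title"=>"Sample Title 2"}
]}

keys = ["text", "title", "url"]

# Hash#slice (Ruby 2.5+) returns a new hash holding only the listed keys.
trimmed = data['examples'].map { |example| example.slice(*keys) }

p trimmed.first
```

Keys that are missing from an element are simply skipped by slice, whereas the zip/values_at version would include them with a nil value.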
Here's another way this could be done (where h is the hash given in the question).
KEEPERS = ['text','url','title']
h.each_key.with_object({}) { |k,g|
g[k] = h[k].map { |h| h.select { |sk,_| KEEPERS.include? sk } } }
#=> {"examples"=>
# [{"url"=>"http://example.com/1", "text"=>"Sample text 1",
# "title"=>"Sample Title 1"},
# {"url"=>"http://example.com/2", "text"=>"Sample text 2",
# "title"=>"Sample Title 2"},
# {"url"=>"http://example.com/3", "text"=>"Sample text 3",
# "title"=>"Sample Title 3"}]}
Here we simply create a new hash (denoted by the outer block variable g) which has all the keys of the original hash h (just one, "examples", but there could be more), and for each associated value, which is an array of hashes, we use Enumerable#map and Hash#select to retain only the desired key/value pairs from each of those hashes.
If I have a string array that looks like this:
array = ["STRING1", "STRING05", "STRING20", "STRING4", "STRING3"]
or
array = ["STRING: 1", "STRING: 05", "STRING: 20", "STRING: 4", "STRING: 3"]
How can I sort the array by the number in each string (descending)?
I know that if the array consisted of integers and not strings, I could use:
sort_by { |k, v| -k }
I've searched all around but can't come up with a solution.
The following sorts by the number in each string rather than by the string itself:
array.sort_by { |x| x[/\d+/].to_i }
=> ["STRING: 1", "STRING: 3", "STRING: 4", "STRING: 05", "STRING: 20"]
descending order:
array.sort_by { |x| -(x[/\d+/].to_i) }
=> ["STRING: 20", "STRING: 05", "STRING: 4", "STRING: 3", "STRING: 1"]
sort the array by the number in each string (descending)
array.sort_by { |x| -x[/\d+/].to_i }
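For reference, here is a small runnable sketch of the descending sort applied to the first array from the question. The variable name descending is only illustrative.

```ruby
array = ["STRING1", "STRING05", "STRING20", "STRING4", "STRING3"]

# s[/\d+/] extracts the first run of digits ("05" becomes 5 after to_i);
# negating the number makes sort_by order from largest to smallest.
descending = array.sort_by { |s| -s[/\d+/].to_i }

p descending #=> ["STRING20", "STRING05", "STRING4", "STRING3", "STRING1"]
```

A string with no digits yields nil from s[/\d+/], and nil.to_i is 0, so such entries sort as if their number were zero rather than raising an error.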
Convert this Array:
a = ["item 1", "item 2", "item 3", "item 4"]
...to a Hash:
{ "item 1" => "item 2", "item 3" => "item 4" }
i.e. elements at even indexes are keys and odd ones are values.
a = ["item 1", "item 2", "item 3", "item 4"]
h = Hash[*a] # => { "item 1" => "item 2", "item 3" => "item 4" }
That's it. The * is called the splat operator.
One caveat per @Mike Lewis (in the comments): "Be very careful with this. Ruby expands splats on the stack. If you do this with a large dataset, expect to blow out your stack."
So, for most general use cases this method is great, but use a different method if you want to do the conversion on lots of data. For example, @Łukasz Niemier (also in the comments) offers this method for large data sets:
h = Hash[a.each_slice(2).to_a]
Ruby 2.1.0 introduced a to_h method on Array that does what you require if your original array consists of arrays of key-value pairs: http://www.ruby-doc.org/core-2.1.0/Array.html#method-i-to_h.
[[:foo, :bar], [1, 2]].to_h
# => {:foo => :bar, 1 => 2}
Just use Hash.[] with the values in the array. For example:
arr = [1,2,3,4]
Hash[*arr] #=> gives {1 => 2, 3 => 4}
Or if you have an array of [key, value] arrays, you can do:
[[1, 2], [3, 4]].inject({}) do |r, s|
r.merge!({s[0] => s[1]})
end # => { 1 => 2, 3 => 4 }
This is what I was looking for when googling this:
[{a: 1}, {b: 2}].reduce({}) { |h, v| h.merge v }
=> {:a=>1, :b=>2}
Enumerator includes Enumerable, and since Ruby 2.1 Enumerable also has a #to_h method. That is why we can write:
a = ["item 1", "item 2", "item 3", "item 4"]
a.each_slice(2).to_h
# => {"item 1"=>"item 2", "item 3"=>"item 4"}
Because #each_slice without block gives us Enumerator, and as per the above explanation, we can call the #to_h method on the Enumerator object.
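Note that an array with an odd number of elements leaves a trailing one-element slice, which to_h cannot turn into a key/value pair, so the conversion fails with an error. A minimal sketch of guarding against that up front; the helper name pairs_to_hash and its error message are hypothetical.

```ruby
# pairs_to_hash is a hypothetical helper name, not part of Ruby.
# It converts a flat [k1, v1, k2, v2, ...] array into a hash and
# rejects odd-length input with an explicit message.
def pairs_to_hash(flat)
  raise ArgumentError, "odd number of elements (#{flat.size})" if flat.size.odd?
  flat.each_slice(2).to_h
end

a = ["item 1", "item 2", "item 3", "item 4"]
p pairs_to_hash(a) #=> {"item 1"=>"item 2", "item 3"=>"item 4"}
```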
You could try it like this for a flat array:
irb(main):019:0> a = ["item 1", "item 2", "item 3", "item 4"]
=> ["item 1", "item 2", "item 3", "item 4"]
irb(main):020:0> Hash[*a]
=> {"item 1"=>"item 2", "item 3"=>"item 4"}
and for an array of arrays:
irb(main):022:0> a = [[1, 2], [3, 4]]
=> [[1, 2], [3, 4]]
irb(main):023:0> Hash[*a.flatten]
=> {1=>2, 3=>4}
a = ["item 1", "item 2", "item 3", "item 4"]
Hash[ a.each_slice( 2 ).map { |e| e } ]
or, if you hate Hash[ ... ]:
a.each_slice( 2 ).each_with_object Hash.new do |(k, v), h| h[k] = v end
or, if you are a lazy fan of broken functional programming:
h = a.lazy.each_slice( 2 ).tap { |a|
break Hash.new { |h, k| h[k] = a.find { |e, _| e == k }[1] }
}
#=> {}
h["item 1"] #=> "item 2"
h["item 3"] #=> "item 4"
All answers assume the starting array is unique. OP did not specify how to handle arrays with duplicate entries, which result in duplicate keys.
Let's look at:
a = ["item 1", "item 2", "item 3", "item 4", "item 1", "item 5"]
You will lose the item 1 => item 2 pair, as it is overwritten by item 1 => item 5:
Hash[*a]
=> {"item 1"=>"item 5", "item 3"=>"item 4"}
All of the methods, including reduce(&:merge!), drop the earlier pair in the same way.
It could be that this is exactly what you expect, though. But in other cases, you probably want to get a result with an Array for value instead:
{"item 1"=>["item 2", "item 5"], "item 3"=>["item 4"]}
The naïve way would be to create a helper variable, a hash that has a default value, and then fill that in a loop:
result = Hash.new {|hash, k| hash[k] = [] } # Hash.new with block defines unique defaults.
a.each_slice(2) {|k,v| result[k] << v }
result
=> {"item 1"=>["item 2", "item 5"], "item 3"=>["item 4"]}
It might be possible to use assoc and reduce to do above in one line, but that becomes much harder to reason about and read.
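As a different route from assoc and reduce, a chain of group_by and transform_values is one readable way to get the same grouping in a single expression. This is only a sketch and assumes Ruby 2.4 or later for Hash#transform_values; the variable name grouped is illustrative.

```ruby
a = ["item 1", "item 2", "item 3", "item 4", "item 1", "item 5"]

# Pair up the elements, group the pairs by their key,
# then keep only the value half of each pair.
grouped = a.each_slice(2)
           .group_by(&:first)
           .transform_values { |pairs| pairs.map(&:last) }

p grouped #=> {"item 1"=>["item 2", "item 5"], "item 3"=>["item 4"]}
```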