Algorithm to transform tree data in Ruby

How can I transform my tree, made of arrays of hashes, into another structure?
My data looks like this:
{
"A": [
{ "A1": [] },
{ "A2": [] },
{
"A3": [
{
"A31": [
{ "A311": [] },
{ "A312": [] }
]
}
]
}
]
}
into something like this:
{
"name": "A",
"children": [
{ "name": "A1" },
{ "name": "A2" },
{
"name": "A3",
"children": [
{
"name": "A31",
"children": [
{ "name": "A311" },
{ "name": "A312" }
]
}
]
}
]
}
I tried a few things, but nothing worked as I hoped.
This is how I walk through my tree:
def recursive(data)
return if data.is_a?(String)
data.each do |d|
keys = d.keys
keys.each do |k|
recursive(d[k])
end
end
return data
end
I tried my best to follow the How to Ask guidelines, so to clarify:
The tree can have unlimited depth.
Names are more complex than A1, A2, ...

λ = ->(h) { [h[:name], h[:children] ? h[:children].map(&λ).to_h : []] }
[λ.(inp)].to_h
#⇒ {
# "A" => {
# "A1" => [],
# "A2" => [],
# "A3" => {
# "A31" => {
# "A311" => [],
# "A312" => []
# }
# }
# }
# }
This solution returns hashes that are not wrapped in arrays inside. If you really want to wrap nested hashes with arrays, map them in λ.

When you don't know how to implement something, always think about the simplest case first.
Step 1: Convert {"A1" => []} to {"name" => "A1", "children" => []}
This is simple:
def convert(hash)
pair = hash.each_pair.first
["name", "children"].zip(pair).to_h
end
Step 2: Recursively convert all hashes in children
def convert(hash)
pair = hash.each_pair.first
pair[1] = pair[1].map{|child| convert(child)}
["name", "children"].zip(pair).to_h
end
Step 3: Handle corner cases
If children is empty then omit it.
def convert(hash)
pair = hash.each_pair.first
pair[1] = pair[1].map{|child| convert(child)}
result = {"name" => pair[0]}
result.merge!("children" => pair[1]) if pair[1].any?
result
end
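Putting the three steps together, here is a minimal, self-contained sketch run against the sample tree from the question. It assumes every node is a hash with exactly one key/value pair and string keys; the name to_named_tree is only illustrative.

```ruby
# Convert { "A" => [ {...}, ... ] } into { "name" => "A", "children" => [...] }.
# Assumption: each node is a Hash holding exactly one key/value pair.
def to_named_tree(node)
  name, children = node.first
  result = { "name" => name.to_s }
  converted = children.map { |child| to_named_tree(child) }
  # Leaves get no "children" key at all, as in the desired output.
  result["children"] = converted unless converted.empty?
  result
end

tree = {
  "A" => [
    { "A1" => [] },
    { "A2" => [] },
    { "A3" => [{ "A31" => [{ "A311" => [] }, { "A312" => [] }] }] }
  ]
}

named = to_named_tree(tree)
p named
```

Because the recursion only descends through the children array, depth is limited only by Ruby's call stack, and the names can be any strings.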

Related

Selecting items from a hash based on sub-hash values

I have the following JSON output from an API:
{
"Objects": [
{
"FieldValues": [
{
"Field": {
"Name": "Nuix Field"
},
"Value": "Primary Date"
},
{
"Field": {
"Name": "Field Type"
},
"Value": {
"Name": "Nuix"
}
},
{
"Field": {
"Name": "Field Category"
},
"Value": {
"Name": "Generic"
}
}
]
}
]
}
I want to be able to select all Objects where "Field" has a "Name" of "Field Type" and its "Value" has a "Name" of "Nuix".
This is my attempt, but I feel like there must be a better way to do it:
json = JSON.parse(response)
results = []
json["Objects"].each do |obj|
obj["FieldValues"].each do |fv|
if fv["Field"]["Name"] == "Field Type" && fv["Value"]["Name"] == "Nuix"
results << obj
end
end
end
One option is to stop scanning FieldValues as soon as a matching entry is found, which is what the any? method does.
You can then simplify the code with select, which returns a new array containing only the objects that satisfy the condition.
objects_with_required_fields = json.fetch("Objects", []).select do |obj|
obj.fetch("FieldValues", []).any? do |fv|
name = fv.dig("Field", "Name")
value = fv["Value"]
name == "Field Type" && value.is_a?(Hash) && value["Name"] == "Nuix"
end
end
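As a quick check of the select/any? approach, here is a self-contained sketch. The payload is a trimmed version of the one in the question; the second object (with the value "Other") and the helper name nuix_field_type? are made up for illustration.

```ruby
require "json"

# Trimmed sample payload; the second object is invented to show a non-match.
response = <<~JSON
  {
    "Objects": [
      { "FieldValues": [
          { "Field": { "Name": "Nuix Field" }, "Value": "Primary Date" },
          { "Field": { "Name": "Field Type" }, "Value": { "Name": "Nuix" } }
      ] },
      { "FieldValues": [
          { "Field": { "Name": "Field Type" }, "Value": { "Name": "Other" } }
      ] }
    ]
  }
JSON

json = JSON.parse(response)

# Hypothetical helper: true when one FieldValues entry matches both names.
def nuix_field_type?(fv)
  value = fv["Value"]
  fv.dig("Field", "Name") == "Field Type" && value.is_a?(Hash) && value["Name"] == "Nuix"
end

# select keeps whole objects; any? stops at the first matching entry.
matches = json.fetch("Objects", []).select do |obj|
  obj.fetch("FieldValues", []).any? { |fv| nuix_field_type?(fv) }
end

puts matches.length
```

Unlike the loop in the question, this adds each object at most once, even if several of its FieldValues entries match.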
Here's a more minimal Ruby solution:
json = JSON.parse(response, symbolize_names: true)
# The field name and the value it must carry.
target = [ 'Field Type', { Name: 'Nuix' } ]
# For each of the entries in Objects...
results = json[:Objects].flat_map do |obj|
  # ...filter out those that...
  obj[:FieldValues].select do |fv|
    # ...match the target criteria.
    [ fv.dig(:Field, :Name), fv[:Value] ] == target
  end
end
This uses symbolized keys and compares each [field name, value] pair against the target. Note that it returns the matching FieldValues entries in one flat array rather than the enclosing Objects.

Transform ElasticSearch index from field explosions into nested documents via Logstash

So we have an old Elasticsearch index that succumbed to field explosion. We have redesigned the structure of the index to fix this using nested documents. However, we are trying to figure out how to migrate the old index data into the new structure. We are currently looking at using Logstash plugins, notably the aggregate plugin, to accomplish this. However, all the examples we can find show how to create the nested documents from database calls, as opposed to from a field-exploded index. For context, here is an example of what an old index might look like:
"assetID": 22074,
"metadata": {
"50": {
"analyzed": "Phase One",
"full": "Phase One",
"date": "0001-01-01T00:00:00"
},
"51": {
"analyzed": "H 25",
"full": "H 25",
"date": "0001-01-01T00:00:00"
},
"58": {
"analyzed": "50",
"full": "50",
"date": "0001-01-01T00:00:00"
}
}
And here is what we would like the transformed data to look like in the end:
"assetID": 22074,
"metadata": [{
"metadataId": 50,
"ngrams": "Phase One", //This was "analyzed"
"alphanumeric": "Phase One", //This was "full"
"date": "0001-01-01T00:00:00"
}, {
"metadataId": 51,
"ngrams": "H 25", //This was "analyzed"
"alphanumeric": "H 25", //This was "full"
"date": "0001-01-01T00:00:00"
}, {
"metadataId": 58,
"ngrams": "50", //This was "analyzed"
"alphanumeric": "50", //This was "full"
"date": "0001-01-01T00:00:00"
}]
As a dumbed-down example, here is what we can figure from the aggregate plugin:
input {
elasticsearch {
hosts => "my.old.host.name:9266"
index => "my-old-index"
query => '{"query": {"bool": {"must": [{"term": {"_id": "22074"}}]}}}'
size => 500
scroll => "5m"
docinfo => true
}
}
filter {
aggregate {
task_id => "%{id}"
code => "
map['assetID'] = event.get('assetID')
map['metadata'] ||= []
map['metadata'] << {
metadataId => ? //somehow parse the Id out of the exploded field name "metadata.#.full",
ngrams => event.get('metadata.#.analyzed'),
alphanumeric => event.get('metadata.#.full'),
date => event.get('metadata.#.date'),
}
"
push_previous_map_as_event => true
timeout => 150000
timeout_tags => ['aggregated']
}
if "aggregated" not in [tags] {
drop {}
}
}
output {
elasticsearch {
hosts => "my.new.host:9266"
index => "my-new-index"
document_type => "%{[@metadata][_type]}"
document_id => "%{[@metadata][_id]}"
action => "update"
}
file {
path => "C:\apps\logstash\logstash-5.6.6\testLog.log"
}
}
Obviously the above example is basically just pseudocode, but that is all we can gather from looking at the documentation for both Logstash and ElasticSearch, as well as the aggregate filter plugin and generally Googling things within an inch of their life.
You can play around with the event object, massage it and then add it into the new index. Something like below (The logstash code is untested, you may find some errors. Check the working ruby code after this section):
filter {
  aggregate {
    task_id => "%{id}"
    code => "
      arr = []
      map['assetID'] = event.get('assetID')
      metadataObj = event.get('metadata')
      metadataObj.to_hash.each do |key, value|
        transformedMetadata = {}
        transformedMetadata['metadataId'] = key
        value.to_hash.each do |k, v|
          if k == 'analyzed'
            transformedMetadata['ngrams'] = v
          elsif k == 'full'
            transformedMetadata['alphanumeric'] = v
          else
            transformedMetadata['date'] = v
          end
        end
        arr.push(transformedMetadata)
      end
      map['metadata'] ||= []
      map['metadata'].concat(arr)
    "
  }
}
Try to play around with the above based on your event input and you will get there. Here's a working example, using the input from your question, for you to experiment with: https://repl.it/repls/HarshIntelligentEagle
json_data = {"assetID": 22074,
"metadata": {
"50": {
"analyzed": "Phase One",
"full": "Phase One",
"date": "0001-01-01T00:00:00"
},
"51": {
"analyzed": "H 25",
"full": "H 25",
"date": "0001-01-01T00:00:00"
},
"58": {
"analyzed": "50",
"full": "50",
"date": "0001-01-01T00:00:00"
}
}
}
arr = Array.new()
transformedObj = {}
transformedObj["assetID"] = json_data[:assetID]
json_data[:metadata].to_hash.each do |key,value|
transformedMetadata = {}
transformedMetadata["metadataId"] = key
value.to_hash.each do |k , v|
if k == :analyzed then
transformedMetadata["ngrams"] = v
elsif k == :full then
transformedMetadata["alphanumeric"] = v
else
transformedMetadata["date"] = v
end
end
arr.push(transformedMetadata)
end
transformedObj["metadata"] = arr
puts transformedObj
In the end, we used ruby code to solve it in a script:
# Must use the input plugin for elasticsearch at version 4.0.2, or it cannot contact a 1.X index
input {
elasticsearch {
hosts => "my.old.host.name:9266"
index => "my-old-index"
query => '{
"query": {
"bool": {
"must": [
{ "match_all": { } }
]
}
}
}'
size => 500
scroll => "5m"
docinfo => true
}
}
filter {
mutate {
remove_field => ['@version', '@timestamp']
}
}
#metadata
filter {
mutate {
rename => { "[metadata]" => "[metadata_OLD]" }
}
ruby {
code => "
metadataDocs = []
metadataFields = event.get('metadata_OLD')
metadataFields.each { |key, value|
metadataDoc = {
'metadataID' => key.to_i,
'date' => value['date']
}
if !value['full'].nil?
metadataDoc[:alphanumeric] = value['full']
end
if !value['analyzed'].nil?
metadataDoc[:ngrams] = value['analyzed']
end
metadataDocs << metadataDoc
}
event.set('metadata', metadataDocs)
"
}
mutate {
remove_field => ['metadata_OLD']
}
}
output {
elasticsearch {
hosts => "my.new.host:9266"
index => "my-new-index"
document_type => "searchasset"
document_id => "%{assetID}"
action => "update"
doc_as_upsert => true
}
file {
path => "F:\logstash-6.1.2\logs\esMigration.log"
}
}
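The per-document transformation in the ruby filter can be tried outside Logstash. The following is a plain-Ruby sketch under a few assumptions: the helper name explode_to_nested is hypothetical, the keys are strings as they would be after JSON parsing, and the output field names follow the target layout from the question (metadataId, ngrams, alphanumeric).

```ruby
# Turn { "50" => { "analyzed" => ..., "full" => ..., "date" => ... }, ... }
# into an array of nested documents, one per exploded field id.
def explode_to_nested(metadata)
  metadata.map do |id, fields|
    doc = { "metadataId" => id.to_i, "date" => fields["date"] }
    # Only copy the renamed fields when the old document actually had them.
    doc["alphanumeric"] = fields["full"] unless fields["full"].nil?
    doc["ngrams"] = fields["analyzed"] unless fields["analyzed"].nil?
    doc
  end
end

old_doc = {
  "assetID" => 22074,
  "metadata" => {
    "50" => { "analyzed" => "Phase One", "full" => "Phase One", "date" => "0001-01-01T00:00:00" },
    "51" => { "analyzed" => "H 25", "full" => "H 25", "date" => "0001-01-01T00:00:00" }
  }
}

# merge returns a new hash, so old_doc is left untouched.
new_doc = old_doc.merge("metadata" => explode_to_nested(old_doc["metadata"]))
p new_doc
```

Testing the mapping this way first makes it easier to separate mistakes in the Ruby logic from mistakes in the pipeline configuration.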

Generate a tree from string split

I have an array of strings
["ana_ola_una",
"ana_ola_ina",
"ana_asta",
"ana_ena_ola",
"ana_ena_cala",
"ana_ena_cina",
"ana_ena_cina_ula"]
I need to reformat it as a hash of hashes of hashes of ... to make it a tree. The expected result would be:
{ana:
{
ola: {
una: {},
ina: {}
},
asta: {},
ena: {
ola: {},
cala:{},
cina:
{
ula: {}
}
}
}
}
EDIT:
I edited this question because I have a related one: in the end I want the result as JSON in the following format. How could I do that?
var tree = [
{
text: "Parent 1",
nodes: [
{
text: "Child 1",
nodes: [
{
text: "Grandchild 1"
},
{
text: "Grandchild 2"
}
]
},
{
text: "Child 2"
}
]
},
{
text: "Parent 2"
},
{
text: "Parent 3"
},
{
text: "Parent 4"
},
{
text: "Parent 5"
}
];
arr = %w|ana_ola_una
ana_ola_ina
ana_asta
ana_ena_ola
ana_ena_cala
ana_ena_cina
ana_ena_cina_ula|
result = arr.each_with_object({}) do |s, memo|
s.split('_').inject(memo) do |deep, k|
deep[k.to_sym] ||= {}
end
end
mudasobwa's answer is good, but if you're using Ruby 2.3+ here's a slightly more concise alternative:
arr = [
"ana_ola_una",
"ana_ola_ina",
"ana_asta",
"ana_ena_ola",
"ana_ena_cala",
"ana_ena_cina",
"ana_ena_cina_ula"
]
tree = Hash.new {|h,k| h[k] = Hash.new(&h.default_proc) }
arr.each {|str| tree.dig(*str.split(?_).map(&:to_sym)) }
p tree
# => { ana:
# { ola:
# { una: {},
# ina: {}
# },
# asta: {},
# ena:
# { ola: {},
# cala: {},
# cina: { ula: {} }
# }
# }
# }
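Neither snippet above addresses the EDIT in the question (the text/nodes layout). A possible sketch follows; it builds the nested hash the same way as the first answer and then converts it. The helper name to_nodes is hypothetical, and leaving out the nodes key for leaves is an assumption based on the sample in the question.

```ruby
require "json"

# Turn { ana: { ola: {...} } } into [{ text: "ana", nodes: [...] }].
# Leaves carry only :text, mirroring "Grandchild 1" in the question.
def to_nodes(tree)
  tree.map do |key, children|
    node = { text: key.to_s }
    node[:nodes] = to_nodes(children) unless children.empty?
    node
  end
end

arr = %w[ana_ola_una ana_ola_ina ana_asta ana_ena_ola
         ana_ena_cala ana_ena_cina ana_ena_cina_ula]

# Build the nested hash first, as in the each_with_object answer.
tree = arr.each_with_object({}) do |s, memo|
  s.split("_").inject(memo) { |deep, k| deep[k.to_sym] ||= {} }
end

nodes = to_nodes(tree)
puts JSON.pretty_generate(nodes)
```

Ruby hashes preserve insertion order, so the nodes appear in the order their names were first seen in the input array.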

How to find the nth node value using Ruby

I have a JSON response with a tree-like structure:
{
"id":"",
"node": [
{
"id":"",
"node": [
{
"id":"",
"node":[]
}
]
}
]
}
How can I get the last id value? This is just an example; the response may be nested to any depth.
h = {
"id" => "1",
"node" => [
{
"id" => "2",
"node" => [
{
"id" => "3",
"node" => []
}
]
}
]
}
▶ λ = ->(h) { h['node'].empty? ? h['id'] : λ.(h['node'].last) }
#⇒ #<Proc:0x00000002f4b490@(pry):130 (lambda)>
▶ λ.(h)
#⇒ "3"
Maybe this method will help you. It calls itself recursively with each sub-hash.
h = {
"id" => "1",
"node" => [
{
"id" => "2",
"node" => [
{
"id" => "3",
"node" => []
}
]
}
]
}
def get_last_node(h)
if Array === h['node'] && !h['node'].empty?
h['node'].each do |node_h|
id = send(__callee__, node_h)
return id if id
end
nil
else
h['id']
end
end
get_last_node(h)
# => "3"
Similar to @mudasobwa's answer:
def get_last_node(h)
h["node"].empty? ? h["id"] : get_last_node(h["node"].first)
end
get_last_node(h)
#=> "3"
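The answers above agree on the sample because every node has a single child, but they pick different branches (last versus first) once a node has several children. Here is an iterative sketch that always follows the last child. The method name last_node_id is hypothetical, the branching sample data is invented, and treating a missing "node" key as a leaf is an assumption beyond the question.

```ruby
# Follow the last child until reaching a node without children.
# A missing or empty "node" entry is treated as a leaf (assumption).
def last_node_id(h)
  h = h["node"].last while h["node"].is_a?(Array) && !h["node"].empty?
  h["id"]
end

# Invented sample with two branches to show which one is followed.
tree = {
  "id" => "1",
  "node" => [
    { "id" => "2", "node" => [{ "id" => "3", "node" => [] }] },
    { "id" => "4", "node" => [{ "id" => "5" }] }
  ]
}

last_id = last_node_id(tree) # walks 1 -> 4 -> 5
p last_id
```

A loop avoids deep recursion for very deeply nested responses, and the result is a String because the ids in the sample are strings.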

How do I access JSON array data?

I have the following array:
[ { "attributes": {
"id": "usdeur",
"code": 4
},
"name": "USD/EUR"
},
{ "attributes": {
"id": "eurgbp",
"code": 5
},
"name": "EUR/GBP"
}
]
How can I get both ids as output for further processing?
I have tried a lot without success. My problem is that I always get only one id as output:
Market.all.select.each do |market|
present market.id
end
Or:
Market.all.each{|attributes| present attributes[:id]}
which gives me only "eurgbp" as a result while I need both ids.
JSON.parse should help you with this:
require 'json'
json = '[ { "attributes": {
"id": "usdeur",
"code": 4
},
"name": "USD/EUR"
},
{ "attributes": {
"id": "eurgbp",
"code": 5
},
"name": "EUR/GBP"
}]'
ids = JSON.parse(json).map{|hash| hash['attributes']['id'] }
#=> ["usdeur", "eurgbp"]
JSON.parse turns a JSON string into plain Ruby objects (here an Array of Hashes), so you can then use the standard Array and Hash methods for access.
I'm going to assume that the data is JSON that you're parsing (with JSON.parse) into a Ruby Array of Hashes, which would look like this:
hashes = [ { "attributes" => { "id" => "usdeur", "code" => 4 },
"name" => "USD/EUR"
},
{ "attributes" => { "id" => "eurgbp", "code" => 5 },
"name" => "EUR/GBP"
} ]
If you wanted to get just the first "id" value, you'd do this:
first_hash = hashes[0]
first_hash_attributes = first_hash["attributes"]
p first_hash_attributes["id"]
# => "usdeur"
Or just:
p hashes[0]["attributes"]["id"]
# => "usdeur"
To get them all, you'll do this:
all_attributes = hashes.map {|hash| hash["attributes"] }
# => [ { "id" => "usdeur", "code" => 4 },
# { "id" => "eurgbp", "code" => 5 } ]
all_ids = all_attributes.map {|attrs| attrs["id"] }
# => [ "usdeur", "eurgbp" ]
Or just:
p hashes.map {|hash| hash["attributes"]["id"] }
# => [ "usdeur", "eurgbp" ]
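If some entries might lack an "attributes" hash, a slightly more defensive variant is possible. The sketch below parses with symbolize_names: true and uses Hash#dig, which returns nil instead of raising when a key is missing; the third entry in the sample is made up to show that case.

```ruby
require "json"

# The third entry is invented: it has no "attributes" key at all.
json = '[{"attributes": {"id": "usdeur", "code": 4}, "name": "USD/EUR"},
         {"attributes": {"id": "eurgbp", "code": 5}, "name": "EUR/GBP"},
         {"name": "no attributes here"}]'

markets = JSON.parse(json, symbolize_names: true)

# dig yields nil for the incomplete entry; compact drops those nils.
ids = markets.map { |m| m.dig(:attributes, :id) }.compact
p ids
```

With the chained hash["attributes"]["id"] form, the incomplete entry would raise NoMethodError on nil instead.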
The JSON library that Rails uses can be very slow.
I prefer to use:
gem 'oj'
from https://github.com/ohler55/oj
It is fast and simple.
