How to extract a value from a hash - ruby

I have a parsed JSON file that contains a Hash:
{
"user1" : {
"about_you" : "jjhj",
"age" : 18,
"email" : 18
},
"user2" : {
"about_you" : "jjhj",
"age" : 18,
"email" : 18
},
"user3" : {
"about_you" : "jjhj",
"age" : 18,
"email" : 18
}
}
I'm trying to loop and get all the email values and write them to a CSV file.
At the moment I'm just trying to read the email. I tried a few variations, and this is the closest I got, but it doesn't read the value; it just prints blank lines.
data_hash = JSON.parse(File.read('user.json'))
data_hash.keys.each do |user|
puts user['email']
end

The keys method returns an array of the key names; it doesn't return the values.
Given these inputs:
json = '{"user1":{"about_you":"jjhj","age":18,"email":18},"user2":{"about_you":"jjhj","age":18,"email":18},"user3":{"about_you":"jjhj","age":18,"email":18}}'
data_hash = JSON.parse(json)
Try just iterating over the hash's keys and values:
data_hash.each { |k,v| puts v['email'] }
Or if you prefer:
data_hash.each do |k,v|
puts v['email']
end
Either version prints:
18
18
18
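To see why the original loop printed blank lines, here is a small sketch using a trimmed-down copy of the sample data (the variable names from_keys and from_values are only for illustration). Each key is a plain String, so user['email'] is a substring lookup on "user1", which finds nothing and returns nil:

```ruby
require 'json'

json = '{"user1":{"email":18},"user2":{"email":18}}'
data_hash = JSON.parse(json)

# Hash#keys yields plain strings, so user['email'] is String#[] (a substring lookup).
# "user1" does not contain the substring "email", so each result is nil.
from_keys = data_hash.keys.map { |user| user['email'] }

# Hash#each_value yields the nested hashes, where 'email' is a real key.
from_values = data_hash.each_value.map { |user| user['email'] }

p from_keys    # => [nil, nil]
p from_values  # => [18, 18]
```

puts nil prints an empty line, which matches the blank output described in the question.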

If you only need the email data then you can just use map:
data_hash = JSON.parse(File.read('user.json'))
data_hash.values.map{|x| x['email']}
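The question also asks about writing the emails to a CSV file. A minimal sketch with Ruby's csv library might look like the following; the helper name emails_to_csv, the header row, the example addresses, and the output file name are all assumptions, not part of the original post:

```ruby
require 'csv'
require 'json'

# Build CSV text with a header row and one row per user.
# emails_to_csv is a hypothetical helper name.
def emails_to_csv(data_hash)
  CSV.generate do |csv|
    csv << ['user', 'email']
    data_hash.each { |name, attrs| csv << [name, attrs['email']] }
  end
end

json = '{"user1":{"email":"a@example.com"},"user2":{"email":"b@example.com"}}'
csv_text = emails_to_csv(JSON.parse(json))
puts csv_text

# To save it, something like File.write('emails.csv', csv_text) should work;
# the file name here is made up.
```

Generating the text first and writing it afterwards keeps the conversion easy to check without touching the file system.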

Related

Logstash filter out values with null values for a key in a nested json array

I have quite an extensive Logstash pipeline ending in a Json as such:
{
"keyA": 1,
"keyB": "sample",
"arrayKey": [
{
"key": "data"
},
{
"key": null
}
]
}
What I want to achieve is to filter "arrayKey" and remove the objects in it whose value for "key" is null.
Tried this to no luck:
filter {
ruby {
code => "
event.get('arrayKey').each do |key|
[key].delete_if do |keyCandidate|
if [keyCandidate][key] != nil
true
end
end
end
"
}
}
This gives a "no implicit converter found from |hash|:|Int|" error. How do I achieve this? Is there an easier way to do this?
As Aleksei pointed out, you can use reject to create a copy of the array that does not contain the entries where [key] is null. You then have to use event.set to overwrite the initial value of [arrayKey]:
ruby {
code => '
a = event.get("arrayKey")
if a
event.set("arrayKey", a.reject { |x| x["key"] == nil })
end
'
}
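The reject step can be tried in plain Ruby, outside Logstash. In this sketch a parsed Hash named event_data stands in for the Logstash event (an assumption made so the snippet runs on its own; event.get and event.set are not used here):

```ruby
require 'json'

# event_data is a hypothetical stand-in for the Logstash event.
event_data = JSON.parse(
  '{"keyA":1,"keyB":"sample","arrayKey":[{"key":"data"},{"key":null}]}'
)

a = event_data['arrayKey']
# reject returns a new array without the entries whose "key" is nil;
# the guard skips events that have no arrayKey at all.
event_data['arrayKey'] = a.reject { |x| x['key'].nil? } if a

puts event_data['arrayKey'].length  # => 1
```

JSON null becomes nil in Ruby, which is why the comparison against nil matches the null entry.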

Ruby - Elegantly replace hash values with nested value (description)

The hash I'm working with has a hash for each of its values, which always contains an ID, name, and description. I am not interested in keeping the ID or name and just want to replace every hash value with its corresponding description.
Code
hsh['nested']['entries']['addr'] = hsh['nested']['entries']['addr']['description']
hsh['nested']['entries']['port'] = hsh['nested']['entries']['port']['description']
hsh['nested']['entries']['protocol'] = hsh['nested']['entries']['protocol']['description']
hsh['nested']['entries']['type'] = hsh['nested']['entries']['type']['description']
... (many more)
This works fine, but it is not very elegant--in reality, I have 20 entries/lines of code to get the job done.
Structure of the hash value (for hsh['nested']['entries']['addr'])
{ "id" => "27", "name" => "Instance", "description" => "**This is what I need.**" }
Taking the first line of code above as a sample, the end result would be that the value of hsh['nested']['entries']['addr'] becomes **This is what I need.**
What is an elegant way to achieve this?
hsh = { 'nested'=>
{ 'entries'=>
{
'addr'=>{ "id" => "1", "description"=>"addr" },
'port'=>{ "id" => "2", "description"=>"port" },
'cats'=>{ "id" => "3", "description"=>"dogs" },
'type'=>{ "id" => "4", "description"=>"type" }
}
}
}
keys_to_replace = ["addr", "port", "type"]
hsh['nested']['entries'].tap { |h| keys_to_replace.each { |k| h[k]=h[k]["description"] } }
#=> { "addr"=>"addr",
# "port"=>"port",
# "cats"=>{"id"=>"3", "description"=>"dogs"},
# "type"=>"type"
# }
hsh
#=> {"nested"=>
# { "entries"=>
# { "addr"=>"addr",
# "port"=>"port",
# "cats"=>{"id"=>"3", "description"=>"dogs"},
# "type"=>"type"
# }
# }
# }
sub_hash = hsh['nested']['entries']
categories = %w{addr port protocol type}
categories.each do |category|
sub_hash[category] = sub_hash[category]['description']
end
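If every entry under 'entries' should be replaced, rather than a chosen list of keys, Hash#transform_values! (available since Ruby 2.4) avoids listing the keys at all. This is a sketch with made-up sample data, and it assumes every value is a hash that has a 'description' key:

```ruby
hsh = {
  'nested' => {
    'entries' => {
      'addr' => { 'id' => '1', 'name' => 'Address', 'description' => 'addr' },
      'port' => { 'id' => '2', 'name' => 'Port', 'description' => 'port' }
    }
  }
}

# Replace each nested hash, in place, with its own description string.
hsh['nested']['entries'].transform_values! { |v| v['description'] }

puts hsh['nested']['entries']['addr']  # => addr
puts hsh['nested']['entries']['port']  # => port
```

Entries that must be left alone, like 'cats' in the first answer, would need the explicit key list instead.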

LINQ to JSON - Querying an array

I need to select users that have a "3" in their json array.
{
"People":[
{
"id" : "123",
"firstName" : "Bill",
"lastName" : "Gates",
"roleIds" : {
"int" : ["3", "9", "1"]
}
},
{
"id" : "456",
"firstName" : "Steve",
"lastName" : "Jobs",
"roleIds" : {
"int" : ["3", "1"]
}
},
{
"id" : "789",
"firstName" : "Elon",
"lastName" : "Musk",
"roleIds" : {
"int" : ["3", "7"]
}
},
{
"id" : "012",
"firstName" : "Agatha",
"lastName" : "Christie",
"roleIds" : {
"int" : "2"
}
}
]}
In the end, my results should be Elon Musk & Steve Jobs. This is the code that I used (& other variations):
var roleIds = pplFeed["People"]["roleIds"].Children()["int"].Values<string>();
var resAnAssocInfo = pplFeed["People"]
.Where(p => p["roleIds"].Children()["int"].Values<string>().Contains("3"))
.Select(p => new
{
id = p["id"],
FName = p["firstName"],
LName = p["lastName"]
}).ToList();
I'm getting the following error:
"Accessed JArray values with invalid key value: "roleIds". Int32 array index expected"
I changed .Values<string>() to .Values<int>() and still no luck.
What am I doing wrong?
You are pretty close. Change your Where clause from this:
.Where(p => p["roleIds"].Children()["int"].Values<string>().Contains("3"))
to this:
.Where(p => p["roleIds"]["int"].Children().Contains("3"))
and you will get the result you want (although there are actually three users in your sample data with a role id of "3", not two).
However, there's another issue that you might hit for which this code still won't work. You'll notice that for Agatha Christie, the value of int is not an array like the others, it is a simple string. If the value will sometimes be an array and sometimes not, then you need a where clause that can handle both. Something like this should work:
.Where(p => p["roleIds"]["int"].Children().Contains(roleId) ||
p["roleIds"]["int"].ToString() == roleId)
...where roleId is a string containing the id you are looking for.
Fiddle: https://dotnetfiddle.net/Zr1b6R
The problem is that not all objects follow the same interface. The last item in that list has a single string value in the roleIds.int property, while all the others have an array. You need to normalize that property and then do the check. It would be easiest if they were all arrays.
You should be able to do this:
var roleId = "3";
var query =
from p in pplFeed["People"]
let roleIds = p.SelectToken("roleIds.int")
let normalized = roleIds.Type == JTokenType.Array ? roleIds : new JArray(roleIds)
where normalized.Values().Contains(roleId)
select new
{
id = p["id"],
FName = p["firstName"],
LName = p["lastName"],
};

Compare three arrays of hashes and get the result without duplicates in ruby?

I'm using the fql gem to retrieve data from Facebook, which gives me three arrays of hashes. When I compare these three arrays, I want the final result to look like this:
{
"photo" => [
[0] {
"owner" : "1105762436",
"src_big" : "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xap1/t31.0-8/q71/s720x720/10273283_10203050474118531_5420466436365792507_o.jpg",
"caption" : "Rings...!!\n\nView Full Screen.",
"created" : 1398953040,
"modified" : 1398953354,
"like_info" : {
"can_like" : true,
"like_count" : 22,
"user_likes" : true
},
"comment_info" : {
"can_comment" : true,
"comment_count" : 2,
"comment_order" : "chronological"
},
"object_id" : "10203050474118531",
"pid" : "4749213500839034982"
}
],
"comment" => [
[0] {
"text" : "Wow",
"text_tags" : [],
"time" : 1398972853,
"likes" : 1,
"fromid" : "100001012753267",
"object_id" : "10203050474118531"
},
[1] {
"text" : "Woww..",
"text_tags" : [],
"time" : 1399059923,
"likes" : 0,
"fromid" : "100003167704574",
"object_id" : "10203050474118531"
}
],
"users" =>[
[0] {
"id": "1105762436",
"name": "Nilanjan Joshi",
"username": "NilaNJan219"
},
[1] {
"id": "1105762436",
"name": "Ashish Joshi",
"username": "NilaNJan219"
}
]
}
Here is my attempt:
datas = File.read('source2.json')
all_data = JSON.parse(datas)
photos = all_data[0]['fql_result_set'].group_by{|x| x['object_id']}.to_a
comments = all_data[1]['fql_result_set'].group_by{|x| x['object_id']}.to_a
@photos_comments = []
@comments_users = []
@photo_users = []
photos.each do |a|
comments.each do |b|
if a.first == b.first
@photos_comments << {'photo' => a.last, 'comment' => b.last}
else
@comments_users << {'photo' => a.last, 'comment' => ''} unless @photos_comments.include? (a.last)
end
end
end
@photo_users = @photos_comments | @comments_users
@photo_comment_users = {photos_comments: @photo_users }
The final array I get still contains duplicates. I grouped each array by object_id, which is common to the photo and comment arrays. The problem is that this only picks up the photos that have comments; I can't work out how to find the photos that have none.
Also, to find the details of the person who commented, I have the users array. The common attribute between comments and users is fromid and id. I don't understand how to pull in the user details either.
I think this is what you want:
photos = all_data[0]['fql_result_set']
comments = all_data[1]['fql_result_set'].group_by{|x| x['object_id']}
@photo_comment_users = photos.map do |p|
{ 'photo' => p, 'comment' => comments[p['object_id']] || '' }
end
For each photo it takes all the comments with the same object_id, or if none exist - returns ''.
If you want to connect the users too, you can map them by id, and select the relevant ones by the comment:
users = Hash[all_data[2]['fql_result_set'].map {|x| [x['id'], x]}]
@photo_comment_users = photos.map do |p|
{ 'photo' => p, 'comment' => comments[p['object_id']] || '',
'user' => (comments[p['object_id']] || []).map {|c| users[c['fromid']]} }
end
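The same join can be tried on tiny made-up data, without the Facebook result sets. Everything in this sketch (the sample records and the names comments_by_photo, users_by_id, and result) is hypothetical, and it uses an empty array rather than '' for photos without comments so that every 'comment' value has the same type:

```ruby
photos   = [{ 'object_id' => '1', 'owner' => '10' },
            { 'object_id' => '2', 'owner' => '10' }]
comments = [{ 'object_id' => '1', 'fromid' => '20', 'text' => 'Wow' }]
users    = [{ 'id' => '20', 'name' => 'Commenter' }]

# Index comments by the photo they belong to, and users by their id.
comments_by_photo = comments.group_by { |c| c['object_id'] }
users_by_id = users.to_h { |u| [u['id'], u] } # block form needs Ruby 2.6+

result = photos.map do |p|
  photo_comments = comments_by_photo.fetch(p['object_id'], [])
  { 'photo'   => p,
    'comment' => photo_comments,
    # Look up each commenter, dropping unknown ids and repeated users.
    'user'    => photo_comments.map { |c| users_by_id[c['fromid']] }.compact.uniq }
end

puts result.length              # => 2
puts result[1]['comment'].empty? # => true
```

Because the outer loop runs over photos only, each photo appears exactly once, with or without comments.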

Ruby Mongo::ObjectID comparison

SO,
I have two Mongo::ObjectID objects that are equal according to both == and eql? (they both return true). However, if one is a key in a Hash and the other is in a document stored in an array, this fails:
myhash[array_of_docs[0]['_id']] # => nil
myhash.fetch(array_of_docs[0]['_id']) # => KeyError: key not found
My db has 2 collections, "bookmarks", with mainly a title and url, and "tags", with a 'bkm_id' key pointing to a bookmark doc's _id and a 'name' key. With the following query, I map each bookmark's _id to the corresponding comma-separated list of tags:
bkms_tags_array = tags_collection.group(['bkm_id'], nil, { tags: Array.new }, "function(tag, agg){ agg.tags.push(tag.name) }", true)
bkms_tags = {}
bkms_tags_array.each do |bt|
bkms_tags.merge! Hash[bt.values[0], bt.values[1].join(", ")]
end
bkms_tags # => {4d60b29603e5665f82000001=>"socialnw, blablabla", 4d60b44703e5665fff000001=>"mail, app, google", 4d61812f03e5661ad8000001=>"socialnw, comms, web"}
Given that bks is the result of 'bookmarks_collection.find.to_a', this is my problem:
bkms_tags[bks[0]['_id']] # => nil
bkms_tags.include? bks[0]['_id'] # => false ; however:
bkms_tags.keys.include? bks[0]['_id'] # => true
How can 'hash.include?' be false while 'hash.keys.include?' is true? Is there a difference between ObjectIDs returned by different queries?
Like I said, both == and eql? return true:
bkms_tags.each { |k,v| puts k == bks[0]['_id'] } # => true false false
bkms_tags.keys.each { |k| puts k == bks[0]['_id'] } # => true false false
bkms_tags.each { |k,v| puts k.eql? bks[0]['_id'] } # => true false false
bkms_tags.keys.each { |k| puts k.eql? bks[0]['_id'] } # => true false false
So, by any comparison possible, 'bks[0]['_id']' is a key of bkms_tags, but when I try to retrieve its value, Ruby gets confused somehow.
Any ideas? Thanks!
Extra info:
Some sample documents:
Bookmarks
{ "_id" : ObjectId("4d60b29603e5665f82000001"), "url" : "http://www.facebook.com/", "title" : "Facebook", "host" : "facebook.com", "saved_at" : "Sun Feb 20 2011 01:20:06 GMT-0500 (PET)" }
{ "_id" : ObjectId("4d60b44703e5665fff000001"), "url" : "http://mail.google.com/", "title" : "gmail", "host" : "mail.google.com", "saved_at" : "Sun Feb 20 2011 01:27:19 GMT-0500 (PET)" }
{ "_id" : ObjectId("4d61812f03e5661ad8000001"), "url" : "http://twitter.com/", "title" : "twitter", "host" : "twitter.com", "saved_at" : "Sun Feb 20 2011 16:01:35 GMT-0500 (PET)" }
Tags
{ "_id" : ObjectId("4d60b44703e5665fff000002"), "bkm_id" : ObjectId("4d60b44703e5665fff000001"), "name" : "mail" }
{ "_id" : ObjectId("4d60b44703e5665fff000003"), "bkm_id" : ObjectId("4d60b44703e5665fff000001"), "name" : "app" }
{ "_id" : ObjectId("4d60b29603e5665f82000003"), "bkm_id" : ObjectId("4d60b29603e5665f82000001"), "name" : "socialnw" }
{ "_id" : ObjectId("4d60b29603e5665f82000004"), "bkm_id" : ObjectId("4d60b29603e5665f82000001"), "name" : "blablabla" }
{ "_id" : ObjectId("4d61812f03e5661ad8000003"), "bkm_id" : ObjectId("4d61812f03e5661ad8000001"), "name" : "comms" }
{ "_id" : ObjectId("4d61812f03e5661ad8000004"), "bkm_id" : ObjectId("4d61812f03e5661ad8000001"), "name" : "web" }
EDIT
Testing some more, I came up with more irregularities:
bkms_ids # => [4d60b29603e5665f82000001, 4d61812f03e5661ad8000001, 4d61ba9103e5667dbe000001, 4d61ba9103e5667dbe000001]
bkms_ids[2] == bkms_ids[3] # => true
bkms_ids[2].eql? bkms_ids[3] # => true
bkms_ids.uniq # => nothing changes: [4d60b29603e5665f82000001, 4d61812f03e5661ad8000001, 4d61ba9103e5667dbe000001, 4d61ba9103e5667dbe000001]
EDIT 2
As requested, my bson version:
irb> BSON::VERSION # NameError: uninitialized constant BSON::VERSION
$ gem list bson
*** LOCAL GEMS ***
bson (1.2.2)
bson_ext (1.2.2)
and inspect and class of my array of docs:
array_of_docs = bookmarks_collection.find.to_a
array_of_docs[0].inspect # => "{\"_id\"=>4d60b29603e5665f82000001, \"url\"=>\"http://www.facebook.com/\", \"title\"=>\"Facebook\", \"host\"=>\"facebook.com\", \"saved_at\"=>2011-02-20 06:20:06 UTC}"
array_of_docs[0].class # => OrderedHash
array_of_docs[0]['_id'].class # => Mongo::ObjectID
Change uniq to uniq! :
bkms_ids
bkms_ids[2] == bkms_ids[3]
bkms_ids[2].eql? bkms_ids[3]
bkms_ids.uniq!
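One plausible explanation for the symptoms in the question, not confirmed anywhere in this thread, is that the two ObjectIDs return different values from #hash. Ruby's Hash lookup and Array#uniq rely on #hash together with eql?, while Array#include? only uses ==. The sketch below reproduces the same pattern with FakeId, a hypothetical stand-in class that is not the real Mongo::ObjectID:

```ruby
# FakeId is a hypothetical stand-in, not the real Mongo::ObjectID class.
class FakeId
  attr_reader :data

  def initialize(data)
    @data = data
  end

  def ==(other)
    other.is_a?(FakeId) && data == other.data
  end
  alias eql? ==
  # Deliberately no #hash override: Object#hash differs per instance.
end

# FixedId adds a #hash that agrees with eql?.
class FixedId < FakeId
  def hash
    data.hash
  end
end

a = FakeId.new('4d60b296')
b = FakeId.new('4d60b296')
tags = { a => 'socialnw' }

p a.eql?(b)               # => true
p tags.keys.include?(b)   # => true  (Array#include? uses ==)
p tags.include?(b)        # => false (Hash lookup checks #hash first)
p [a, b].uniq.length      # => 2

fixed_tags = { FixedId.new('4d60b296') => 'socialnw' }
p fixed_tags.include?(FixedId.new('4d60b296'))  # => true
```

If this is what is happening, checking whether the two ids return the same value from hash would confirm it, and converting the ids with to_s before using them as keys would be one way around it.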
