Compare Multidimensional Hash - ruby

I have a huge hash (JSON) that I want to compare to a "master key" by deleting the values that are dissimilar then totaling a value set.
I thought it would be a good way to handle test scoring with complex scoring criteria.
Any advice on how to do this? Are there any gems that would make my life easier?
{
"A" => 10,
"B" => 7,
etc
....
The hash is constructed like test[answer] => test[point_value], so each question's key/value pair is its answer and point value.
So if I want to compare it to a master_key, remove the dissimilar items (rather than removing the similar ones, as arr1 - arr2 does), and then total the values, what would be the best approach?

After converting the JSON to Ruby hashes, I'd do something like this:
tester = { :"first" => { :"0" => { :"0" => { :"B" => 10 }, :"1" => { :"B" => 7 }, :"2" => { :"B" => 5 } } }}
master = { :"first" => { :"0" => { :"0" => { :"A" => 10 }, :"1" => { :"B" => 7 }, :"2" => { :"B" => 5 } } }}
tester.reduce(0) do |score, (test, section)|
  section.each do |group, questions|
    questions.each do |question, answer|
      if answer.keys.first == master[test][group][question].keys.first
        score += answer.values.first
      end
    end
  end
  score
end
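If the nesting depth is not fixed, a recursive walk may be easier to maintain. This is only a sketch: the method name `score` is made up for illustration, and unlike the snippet above it awards points only when both the answer and the point value match the master (an assumption about what "dissimilar" should mean).

```ruby
# Hypothetical helper: walks tester and master in parallel and sums the
# point values of leaves whose answer => points pair also exists in master.
def score(tester, master)
  return 0 unless tester.is_a?(Hash) && master.is_a?(Hash)

  tester.sum do |key, value|
    if value.is_a?(Hash)
      score(value, master[key])   # descend one level
    elsif master[key] == value
      value                       # matching answer, count its points
    else
      0                           # dissimilar answer, ignore it
    end
  end
end

tester = { first: { "0": { "0": { B: 10 }, "1": { B: 7 }, "2": { B: 5 } } } }
master = { first: { "0": { "0": { A: 10 }, "1": { B: 7 }, "2": { B: 5 } } } }

total = score(tester, master)
puts total
```

With the sample data only questions "1" and "2" match, so the first question's 10 points are not counted.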

Related

Perl Sorting Hash by Value of Value

I have a hash that looks like
'Jarrod' => {
'Age' => '25 ',
'Occupation' => Student
},
'Elizabeth' => {
'Age' => '18',
'Occupation' => Student
},
'Nick' => {
'Age' => '32 ',
'Occupation' => Lawyer
},
I am trying to sort them by age so it will look like
'Nick' => {
'Age' => '32 ',
'Occupation' => Lawyer
},
'Jarrod' => {
'Age' => '25 ',
'Occupation' => Student
},
'Elizabeth' => {
'Age' => '18',
'Occupation' => Student
},
But I can't seem to figure out how to access anything past Age. How can I access the value of a value when ordering hash keys?
A hash variable %h with the shown data can be processed in the desired order as
use Data::Dump qw(pp); # to print out a complex data structure
say "$_->[0] => ", pp($_->[1]) for
map { [ $_, $h{$_} ] }
sort { $h{$b}->{Age} <=> $h{$a}->{Age} }
keys %h;
which prints (from the complete program below)
Nick => { Age => "32 ", Occupation => "Lawyer" }
Jarrod => { Age => "25 ", Occupation => "Student" }
Elizabeth => { Age => 18, Occupation => "Student" }
Note though that we cannot "sort a hash" and then have it stay that way, as hashes are inherently unordered.† But we can of course go through and process the elements in a particular order, as above for example.
Explanation: sort takes pairs of elements of the submitted list in turn, makes them available in the variables $a and $b, and runs the block of code we provide to compare them. Here we have it compare, and thus sort, the elements by the value at key Age.
The output, though, is just the keys sorted that way! So we then pass that through a map, which combines each key with its hashref value and returns those pairs, each in an arrayref. That is used to print them, as a placeholder for the actual processing.
A complete program
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd pp);
my %h = (
'Jarrod' => {
'Age' => '25 ',
'Occupation' => 'Student'
},
'Elizabeth' => {
'Age' => '18',
'Occupation' => 'Student'
},
'Nick' => {
'Age' => '32 ',
'Occupation' => 'Lawyer'
},
);
say "$_->[0] => ", pp($_->[1]) for
map { [ $_, $h{$_} ] } sort { $h{$b}->{Age} <=> $h{$a}->{Age} } keys %h;
Or, for a workable template, change to something like
my @sorted_pairs =
    map { [ $_, $h{$_} ] } sort { $h{$b}->{Age} <=> $h{$a}->{Age} } keys %h;
for my $key_data (@sorted_pairs) {
    say $key_data->[0], ' => ', pp $key_data->[1]; # or just: dd $key_data;
    # my ($name, $data) = @$key_data;
    # Process $data (a hashref) for each $name
}
Once we start building more suitable data structures for ordered data then there are various options, including one-person hashrefs for each name, stored in an array in the right order. Ultimately, all that can be very nicely organized in a class.
Note how Sort::Key makes the sorting part considerably less cumbersome
use Sort::Key qw(rnkeysort); # rn... -> Reverse Numerical
my @pairs = map { [ $_, $h{$_} ] } rnkeysort { $h{$_}->{Age} } keys %h;
The more complex -- or specific -- the sorting the more benefit from this module, with its many specific functions for generic criteria.
If these ages are given as integers then there is ikeysort (and rikeysort) and then those are hopefully unsigned integers, for which there is ukeysort (and rukeysort).
† See keys and perlsec, for example.
You can't sort a hash. A hash's elements are inherently unordered.
If you just want to visit the elements of hash in order, you can do that by getting and sorting the keys.
for my $name (
sort { $people{ $b }{ age } <=> $people{ $a }{ age } }
keys( %people )
) {
my $person = $people{ $name };
...
}
or
use Sort::Key qw( rikeysort );
for my $name (
rikeysort { $people{ $_ }{ age } }
keys( %people )
) {
my $person = $people{ $name };
...
}
If you need an ordered structure, you could start by converting the data to an array of people.
my @unordered_people =
    map { +{ name => $_, %{ $people{ $_ } } } }
    keys( %people );
Then sort that.
my @ordered_people =
    sort { $b->{ age } <=> $a->{ age } }
    @unordered_people;
or
use Sort::Key qw( rikeysort );
my @ordered_people =
    rikeysort { $_->{ age } }
    @unordered_people;

Using delete_if to delete single value of hash not the entire hash RUBY

I have an array
array_hash = [
{
"array_value" => 1,
"other_values" => "whatever",
"inner_value" => [
{"iwantthis" => "forFirst"},
{"iwantthis2" => "forFirst2"},
{"iwantthis3" => "forFirst3"}
]
},
{
"array_value" => 2,
"other_values" => "whatever2",
"inner_value" => [
{"iwantthis" => "forSecond"},
{"iwantthis2" => "forSecond2"},
{"iwantthis3" => "forSecond3"}
]
},
]
I want to delete inner_value from each hash, or pop it out (I prefer pop).
So my output should be this:
array_hash = [
{
"array_value" => 1,
"other_values" => "whatever"
},
{
"array_value" => 2,
"other_values" => "whatever2"
},
]
I tried delete_if
array_hash.delete_if{|a| a['inner_value'] }
But it deletes all data in the array. Is there any solution?
Try this:
array_hash.map{ |a| {'array_value' => a['array_value'], 'other_values' => a['other_values'] }}
You are telling Ruby to delete every hash that has a key called inner_value. Both hashes have one, which is why the array ends up empty.
What you should probably do instead is:
array_hash.each { |x| x.delete 'inner_value' }
which means: for each hash in this array, erase the inner_value key.
Well I found it,
array_hash_popped = array_hash.map{ |a| a.delete('inner_value') }
This pops inner_value out of each hash (I asked for pop in the question), so inner_value is removed from array_hash and the removed values end up in array_hash_popped.
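To make the difference between the approaches concrete, here is a small sketch with shortened sample data. `Hash#reject` gives a copy without the key and leaves the original alone, while `Hash#delete` mutates the hash and returns the removed value. The variable names `without_inner` and `popped` are just for illustration.

```ruby
array_hash = [
  { "array_value" => 1, "other_values" => "whatever",
    "inner_value" => [{ "iwantthis" => "forFirst" }] },
  { "array_value" => 2, "other_values" => "whatever2",
    "inner_value" => [{ "iwantthis" => "forSecond" }] }
]

# Non-destructive: build new hashes without the key, array_hash is unchanged.
without_inner = array_hash.map { |h| h.reject { |k, _| k == "inner_value" } }

# Destructive "pop": Hash#delete removes the key in place and returns its
# value, so map collects the removed inner values.
popped = array_hash.map { |h| h.delete("inner_value") }

p without_inner
p popped
p array_hash
```

Use the first form if other code still needs the original hashes, and the second if you want the removed values back.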

Mongoid: Query based on size of embedded document array

This is similar to this question here but I can't figure out how to convert it to Mongoid syntax:
MongoDB query based on count of embedded document
Let's say I have Customer: {_id: ..., orders: [...]}
I want to be able to find all Customers that have existing orders, i.e. orders.size > 0. I've tried queries like Customer.where(:orders.size.gt => 0) to no avail. Can it be done with an exists? operator?
A nicer way would be to use the native syntax of MongoDB rather than resort to Rails-like methods or JavaScript evaluation, as pointed to in the accepted answer of the question you link to. Evaluating a JavaScript condition will also be much slower.
The logical extension of $exists for an array with some length greater than zero is to use "dot notation" and test for the presence of the "zero index", i.e. the first element of the array:
Customer.collection.find({ "orders.0" => { "$exists" => true } })
The same seemingly works for any minimum length: to require at least n elements, test for the existence of index n-1.
Worth noting that for a "zero length" array exclusion the $size operator is also a valid alternative, when used with $not to negate the match:
Customer.collection.find({ "orders" => { "$not" => { "$size" => 0 } } })
But this does not apply well to larger "size" tests, as you would need to specify all sizes to be excluded:
Customer.collection.find({
"$and" => [
{ "orders" => { "$not" => { "$size" => 4 } } },
{ "orders" => { "$not" => { "$size" => 3 } } },
{ "orders" => { "$not" => { "$size" => 2 } } },
{ "orders" => { "$not" => { "$size" => 1 } } },
{ "orders" => { "$not" => { "$size" => 0 } } }
]
})
So the other syntax is clearer:
Customer.collection.find({ "orders.4" => { "$exists" => true } })
Which means 5 or more members in a concise way.
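The "index n-1 exists" rule can be wrapped in a small helper that only builds the selector hash. This is a sketch: `min_array_size_selector` is a made-up name, not part of Mongoid or the MongoDB driver.

```ruby
# Hypothetical helper: builds a MongoDB selector matching documents whose
# array field holds at least min_size elements, by testing that the
# element at index min_size - 1 exists.
def min_array_size_selector(field, min_size)
  raise ArgumentError, "min_size must be >= 1" if min_size < 1

  { "#{field}.#{min_size - 1}" => { "$exists" => true } }
end

selector = min_array_size_selector("orders", 5)
p selector
# The result could then be passed along as in the answer above, e.g.
# Customer.collection.find(selector)
```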
Please also note that none of these conditions alone can use an index, so if you have another filter condition that can, it is best to include that condition first.
Just adding my solution which might be helpful for someone:
scope :with_orders, -> { where(orders: {"$exists" => true}, :orders.not => {"$size" => 0}) }

Elasticsearch sum total values for specific hours within a month

I have an Elasticsearch server with fields timestamp, user and bytes_down (among others).
I would like to total the bytes_down value for a user for a month, BUT only where the hours are between 8am and 8pm.
I'm able to get the daily totals with a date histogram using the following query (I'm using the Perl API here), but I can't figure out a way of reducing this down to the hour range for each day.
my $query = {
index => 'cm',
body => {
query => {
filtered => {
query => {
term => {user => $user}
},
filter => {
and => [
{
range => {
timestamp => {
gte => '2014-01-01',
lte => '2014-01-31'
}
}
},
{
bool => {
must => {
term => { zone => $zone }
}
}
}
]
}
}
},
facets => {
bytes_down => {
date_histogram => {
field => 'timestamp',
interval => 'day',
value_field => 'downstream'
}
}
},
size => 0
}
};
Thanks
Dale
I think you need to use a script filter instead of a range filter, and you need to put it in the facet_filter section of your facet:
"facet_filter" => {
"script" => {
"script" => "doc['timestamp'].date.getHourOfDay() >= 8 &&
doc['timestamp'].date.getHourOfDay() < 20"
}
}
Add a bool must range filter for every hour. I'm not sure whether you want to do this on an ongoing basis or just for a specific day, but this slide deck from Zachary Tong is a good way to understand what you could be doing, especially with filters in general.
https://speakerdeck.com/polyfractal/elasticsearch-query-optimization?slide=28

How can I get a key => value out of this foursquare hash?

Here is what it looks like:
{
"groups" => [
{ "venues" => [
{ "city" => "Madrid",
"address" => "Camino de Perales, s/n",
"name" => "Caja Mágica",
"stats" => {"herenow"=>"0"},
"geolong" => -3.6894333,
"primarycategory" => {
"iconurl" => "http://foursquare.com/img/categories/arts_entertainment/stadium.png",
"fullpathname" => "Arts & Entertainment:Stadium",
"nodename" => "Stadium",
"id" => 78989 },
"geolat" => 40.375045,
"id" => 2492239,
"distance" => 0,
"state" => "Spain" }],
"type" => "Matching Places"}]
}
Big and ugly... I just want to grab the id out. How would I go about doing this?
h = { "groups" => ......... }
The two ids are:
h["groups"][0]["venues"][0]["primarycategory"]["id"]
h["groups"][0]["venues"][0]["id"]
If the hash stores one id (assuming the value is stored in a variable called hash):
hash["groups"][0]["venues"][0]["primarycategory"]["id"] rescue nil
If the hash stores multiple ids then:
ids = Array(hash["groups"]).map do |g|
Array(g["venues"]).map do |v|
v["primarycategory"]["id"] rescue nil
end.compact
end.flatten
ids now holds the array of ids.
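On Ruby 2.3 or later, Hash#dig and Array#dig are an alternative to chaining [] with rescue nil: they return nil as soon as a key or index is missing. A minimal sketch with a trimmed-down copy of the hash from the question (the variable names are just for illustration):

```ruby
h = {
  "groups" => [
    { "venues" => [
        { "city" => "Madrid",
          "primarycategory" => { "nodename" => "Stadium", "id" => 78989 },
          "id" => 2492239 }
      ],
      "type" => "Matching Places" }
  ]
}

# dig walks hashes and arrays alike and stops with nil on a missing step.
venue_id    = h.dig("groups", 0, "venues", 0, "id")
category_id = h.dig("groups", 0, "venues", 0, "primarycategory", "id")
missing     = h.dig("groups", 1, "venues", 0, "id")   # no second group

# Collect every venue id across all groups.
venue_ids = Array(h["groups"]).flat_map do |g|
  Array(g["venues"]).map { |v| v["id"] }
end.compact

p venue_id, category_id, missing, venue_ids
```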
