Ruby: Group a Hash with an occurrence restriction (?) [closed]

Given an array of objects like the example below.
[
  { group: '1' },
  { group: '1' },
  { group: '1' },
  { group: '2' },
  { group: '1' }
]
Expected output would be:
[
  [{ group: '1' }, { group: '1' }, { group: '1' }],
  [{ group: '2' }],
  [{ group: '1' }]
]
Note that even though group '1' occurs four times, it yields two separate groupings, because the position of each object in the array matters as well. The group names themselves are arbitrary.

Ruby does this with a built-in: Enumerable#chunk.
Enumerates over the items, chunking them together based on the return value of the block.
Consecutive elements which return the same block value are chunked together.
That is exactly what you want:
data
  .chunk { |item| item[:group] }
  .map { |_chunk_value, items| items }
# chunk yields pairs of [block value, items in the chunk]; we only need the items.
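Run against the question's input, it produces exactly the expected grouping (a quick check, assuming data holds that array):

data = [
  { group: '1' }, { group: '1' }, { group: '1' },
  { group: '2' }, { group: '1' }
]

data.chunk { |item| item[:group] }.map { |_value, items| items }
#=> [[{:group=>"1"}, {:group=>"1"}, {:group=>"1"}],
#    [{:group=>"2"}],
#    [{:group=>"1"}]]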

arr = [
  { group: '1' },
  { group: '1' },
  { group: '1' },
  { group: '2' },
  { group: '1' }
]
arr.slice_when { |g, h| g[:group] != h[:group] }.to_a
#=> [[{:group=>"1"}, {:group=>"1"}, {:group=>"1"}],
# [{:group=>"2"}],
# [{:group=>"1"}]]
See Enumerable#slice_when.
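For completeness, Enumerable#chunk_while (Ruby 2.3+) expresses the same split with the inverted condition:

arr.chunk_while { |g, h| g[:group] == h[:group] }.to_a
#=> [[{:group=>"1"}, {:group=>"1"}, {:group=>"1"}],
#    [{:group=>"2"}],
#    [{:group=>"1"}]]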

You can initialize results = [] and build it up by iterating over the input. As you process each input hash, check whether the last sub-array in results has a matching group. If so, append the hash to it; otherwise, start a new sub-array in results:
input = [
  { group: '1' },
  { group: '1' },
  { group: '1' },
  { group: '2' },
  { group: '1' }
]

results = []
input.each do |hsh|
  last = results.last
  if last && last.last[:group] == hsh[:group]
    last.push hsh
  else
    results.push [hsh]
  end
end

p results
# => [
# [{:group=>"1"}, {:group=>"1"}, {:group=>"1"}],
# [{:group=>"2"}],
# [{:group=>"1"}]
# ]
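The same fold can also be written with each_with_object, which keeps the accumulator inside the block (a minor stylistic variant, not from the original answer):

results = input.each_with_object([]) do |hsh, acc|
  if acc.last && acc.last.last[:group] == hsh[:group]
    acc.last << hsh
  else
    acc << [hsh]
  end
end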

Related

Ruby print or return specific field from object

How do I print the group_id from the returned object?
The following is returned from a function. I want to print, or perhaps return, the group_id:
{
  :security_groups=>[
    {
      :description=>"Created By ManageIQ",
      :group_name=>"MIQ_019",
      :ip_permissions=>[
        {
          :from_port=>22,
          :ip_protocol=>"tcp",
          :ip_ranges=>[
            {
              :cidr_ip=>"0.0.0.0/0",
              :description=>nil
            }
          ],
          :ipv_6_ranges=>[],
          :prefix_list_ids=>[],
          :to_port=>22,
          :user_id_group_pairs=>[]
        }
      ],
      :owner_id=>"943755119718",
      :group_id=>"sg-0c2c5f219f1bafc1a",
      :ip_permissions_egress=>[
        {
          :from_port=>nil,
          :ip_protocol=>"-1",
          :ip_ranges=>[
            {
              :cidr_ip=>"0.0.0.0/0",
              :description=>nil
            }
          ],
          :ipv_6_ranges=>[],
          :prefix_list_ids=>[],
          :to_port=>nil,
          :user_id_group_pairs=>[]
        }
      ],
      :tags=>[],
      :vpc_id=>"vpc-d817c1b3"
    }
  ],
  :next_token=>nil
}
This is the function; I want to return security_group.group_id:
def describe_security_group(group_name)
  ec2 = get_aws_client
  security_group = ec2.describe_security_groups(
    filters: [
      { name: 'group-name', values: [group_name] }
    ]
  )
  puts "Describing security group '#{group_name}' with ID " \
       "'#{security_group}'"
  return security_group
rescue StandardError => e
  puts "Error describing security group: #{e.message}"
  return
end
So the returned value looks like a hash, or you can convert it into one exactly.
For the case of a one-element array you can simply use Ruby's dig method.
Given your data and the comment below, we can access the needed element like this:
# from your ec2 api call
security_group = ec2.describe_security_groups(...)
# Result value is stored in the `security_group` variable,
# and looks exactly like the hash shown in the question above.
# And this is the target value, which you can store in a variable,
# return from the method, or simply print. dig can take the whole
# path at once and returns nil safely if any step is missing
# (plain Ruby; no ActiveSupport needed for try):
security_group.dig(:security_groups, 0, :group_id)
#=> "sg-0c2c5f219f1bafc1a"
But if you need to search an array with multiple elements, methods from Ruby's Enumerable module can help (such as select or reject).
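For example, a minimal sketch (assuming the same hash shape as above, with a hypothetical target group name):

groups = security_group[:security_groups]
# keep only the groups matching some predicate, then read the id
match = groups.select { |g| g[:group_name] == 'MIQ_019' }
match.first&.dig(:group_id)
#=> "sg-0c2c5f219f1bafc1a"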
UPDATE with OpenStruct, if you prefer method calls with dot notation:

require 'json'
require 'ostruct'

json = security_group.to_json
os = JSON.parse(json, object_class: OpenStruct)
os.security_groups.first.group_id
#=> "sg-0c2c5f219f1bafc1a"

How to find all documents that don't have an array, or whose array is smaller than a given size

I'm trying to find all documents which either don't have a tags array or whose tags array has fewer than 2 elements. How do I do this? I'm trying this, but it doesn't work:
db.collection.find({
  'text' => { '$exists' => true }, # I need this one too
  'tags' => {
    '$or' => [
      { '$exists' => false },
      { '$lt' => ['$size', 2] }
    ]
  }
})
It's Ruby, btw. MongoDB version is 4.
I'm getting:
unknown operator: $or
You can use the query below:
db.collection.find({
  text: { $exists: true },
  $or: [{
    tags: { $exists: false }
  }, {
    $expr: { $lt: [{ $size: '$tags' }, 2] }
  }]
})
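Since the question is in Ruby, the same query with the Ruby mongo driver would look roughly like this (a sketch, assuming collection is a Mongo::Collection):

collection.find(
  'text' => { '$exists' => true },
  '$or' => [
    { 'tags' => { '$exists' => false } },
    { '$expr' => { '$lt' => [{ '$size' => '$tags' }, 2] } }
  ]
)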
To slightly modify MauriRamone's answer into a smaller version:
db.getCollection('test').find({
  $and: [
    { "text": { $exists: true } },
    { $where: "!this.tags || this.tags.length < 2" }
  ]
})
However, $where is slow, and other options (such as Anthony's) should be preferred.
Your original query wasn't working because $or is a top-level operator and cannot be nested inside a field, and you need the $expr operator to compare against $size.
Try using $where in your query, like this:
db.getCollection('test').find({
  $and: [
    { "text": { $exists: true } },
    {
      $or: [
        { "tags": { $exists: false } },
        { $where: "this.tags.length < 2" }
      ]
    }
  ]
})
I am using Robomongo to test; you will need to translate the query to Ruby.

How can I compare two rethinkdb objects to create a new object that only contains their differences?

Say I have two objects stored in RethinkDB that I wish to compare; let's call them old_val and new_val. As an example, let's say these values represent a TODO task whose owner and status have changed:
{
  old_val: {
    status: 'active',
    content: 'buy apples',
    owner: 'jordan'
  },
  new_val: {
    status: 'done',
    content: 'buy apples',
    owner: 'matt'
  }
}
When I compare old_val and new_val, I'd like to yield a new object where new_val contains only the fields that differ from old_val. I want to do this to save bytes on the wire and to make rendering changes on my client easier. The result of the query should look something like this:
{
  old_val: {
    content: 'buy apples',
    owner: 'jordan',
    status: 'active'
  },
  new_val: {
    owner: 'matt',
    status: 'done'
  }
}
How would I do this?
There are three separate parts to solving this problem:
1. Generate a list of fields to compare.
2. Compare the common fields, including only those whose values differ.
3. Build a new object from the result.
(1) A list of fields can be generated by using the keys() method. We can filter these fields to (2) only include those which exist in both old_val and new_val and whose values differ. We can then pass this sequence to concatMap() to build an array of key/value pairs like [key0, value0, key1, value1]. Finally, a new object can be constructed (3) from this sequence by applying it as arguments (using r.args()) to the r.object() function.
It comes together like this:
r.expr({
  old_val: {
    status: 'active',
    content: 'buy apples',
    owner: 'jordan'
  },
  new_val: {
    status: 'done',
    content: 'buy apples',
    owner: 'matt'
  }
}).do((diff_raw) =>
  r.expr({
    old_val: diff_raw('old_val'),
    // build an object only containing changes between old_val and new_val:
    new_val: r.object(r.args(
      diff_raw('new_val')
        .keys()
        // only include keys existing in old and new that have changed:
        .filter((k) =>
          r.and(
            diff_raw('old_val').hasFields(k),
            diff_raw('new_val')(k).ne(diff_raw('old_val')(k))
          )
        )
        // build a sequence of [ k0, v0, k1, v1, ... ]:
        .concatMap((k) => [k, diff_raw('new_val')(k)])
    ))
  })
)
This will return:
{
  "new_val": {
    "owner": "matt",
    "status": "done"
  },
  "old_val": {
    "content": "buy apples",
    "owner": "jordan",
    "status": "active"
  }
}
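If you ever need the same diff client-side, the logic is a one-liner in plain Ruby (shown only for comparison with the ReQL above):

old_val = { status: 'active', content: 'buy apples', owner: 'jordan' }
new_val = { status: 'done',   content: 'buy apples', owner: 'matt' }

# keep only keys present in old_val whose values changed
diff = new_val.select { |k, v| old_val.key?(k) && old_val[k] != v }
#=> {:status=>"done", :owner=>"matt"}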

Delete nested hash according to key => value

I have this JSON string:
response = '{"librairies":[{"id":1,"books":[{"id":1,"qty":1},{"id":2,"qty":3}]},{"id":2,"books":[{"id":1,"qty":0},{"id":2,"qty":3}]}]}'
in which I'd like to delete every library where at least one of the book quantities is zero.
For instance, given this response, I'd expect this return value:
'{"librairies":[{"id":1,"books":[{"id":1,"qty":1},{"id":2,"qty":3}]}]}'
I've tried this:
parsed = JSON.parse(response)
parsed["librairies"].each do |library|
  library["books"].each do |book|
    parsed.delete(library) if book["qty"] == 0
  end
end
but this returns the exact same response hash, without having deleted the second library (the one with id => 2).
You can use Array#delete_if and Enumerable#any? for this:
# Move through each array element with delete_if
parsed["librairies"].delete_if do |library|
  # evaluates to true if any book hash in the library
  # has a "qty" value of 0
  library["books"].any? { |book| book["qty"] == 0 }
end
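Putting it together with the question's response string, and serializing back to JSON afterwards:

require 'json'

parsed = JSON.parse(response)
parsed['librairies'].delete_if do |library|
  library['books'].any? { |book| book['qty'] == 0 }
end
parsed.to_json
#=> '{"librairies":[{"id":1,"books":[{"id":1,"qty":1},{"id":2,"qty":3}]}]}'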
Hope this helps
To avoid changing the hash parsed, you could do the following.
Firstly, let's format parsed so we can see what we're dealing with:
parsed = { "librairies"=>[ { "id"=>1,
                             "books"=>[ { "id"=>1, "qty"=>1 },
                                        { "id"=>2, "qty"=>3 } ]
                           },
                           { "id"=>2,
                             "books"=>[ { "id"=>1, "qty"=>0 },
                                        { "id"=>2, "qty"=>3 } ]
                           }
                         ]
         }
Later I want to show that parsed has not been changed when we create the new hash. An easy way of doing that is to compute a hash code on parsed before and after, and see if it changes. (While it's not 100% certain that different hashes won't have the same hash code, here it's not something to lose sleep over.)
parsed.hash
#=> 852445412783960729
We first need to make a "deep copy" of parsed so that changes to the copy will not affect parsed. One way of doing that is to use the Marshal module:
new_parsed = Marshal.load(Marshal.dump(parsed))
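For contrast, a shallow dup would not be enough here, because the nested array would still be shared (a quick illustration, not part of the original answer):

shallow = parsed.dup
shallow["librairies"].equal?(parsed["librairies"]) #=> true (same inner array)

deep = Marshal.load(Marshal.dump(parsed))
deep["librairies"].equal?(parsed["librairies"])    #=> false (independent copy)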
We can now modify the copy as required:
new_parsed["librairies"].reject! { |h| h["books"].any? { |g| g["qty"].zero? } }
#=> [ { "id"=>1,
# "books"=>[ { "id"=>1, "qty"=>1 },
# { "id"=>2, "qty"=>3 }
# ]
# }
# ]
new_parsed
#=> { "librairies"=>[ { "id"=>1,
#                       "books"=>[ { "id"=>1, "qty"=>1 },
#                                  { "id"=>2, "qty"=>3 } ]
#                     }
#                   ]
#   }
And we confirm the original hash was not changed:
parsed.hash
#=> 852445412783960729
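A lighter-weight alternative that also leaves parsed untouched is to rebuild only the top level with the non-destructive reject, sharing (but not mutating) the inner hashes; a sketch:

new_parsed = parsed.merge(
  "librairies" => parsed["librairies"].reject do |h|
    h["books"].any? { |g| g["qty"].zero? }
  end
)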

rethinkdb - how to do a nested `group` query

My document:
{
  itemName: 'name1',
  itemType: 'book',
  createTime: '2014-09-24 10:10:10'
}
Then I want to query items created in the last n days, grouped by createTime and itemType.
In other words, I expect my result to look something like this:
[{
  group: '2014-09-24',
  reduction: [
    { group: 'book',     reduction: { count: 100 } },
    { group: 'computer', reduction: { count: 100 } }
  ]
},
{
  group: '2014-09-22',
  reduction: [
    { group: 'book',     reduction: { count: 100 } },
    { group: 'computer', reduction: { count: 100 } }
  ]
}]
The ReQL might look like this:
r.db(xx).table(yy)
  .orderBy({ index: r.desc('createTime') })
  .group(r.row('createTime').date())
  .map(function() {
    ????
  }).ungroup()
You can also group on both fields directly (but you don't get nested groups):
r.db(xx).table(yy)
  .group([r.row("createTime"), r.row("itemType")])
  .count()
You currently cannot use group inside group.
See https://github.com/rethinkdb/rethinkdb/issues/3097 to track progress on this limitation.
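Until then, one workaround is to re-nest the compound-key result client-side. A hypothetical sketch in plain Ruby, assuming the grouped counts arrive as a flat { [createTime, itemType] => count } hash (that flat shape is an assumption, not from the original answer):

# flat result shape assumed from the compound group + count above:
flat = {
  ['2014-09-24', 'book']     => 100,
  ['2014-09-24', 'computer'] => 100,
  ['2014-09-22', 'book']     => 100,
  ['2014-09-22', 'computer'] => 100
}

nested = flat
  .group_by { |(create_time, _type), _count| create_time }
  .map do |create_time, pairs|
    {
      group: create_time,
      reduction: pairs.map do |(_ct, item_type), count|
        { group: item_type, reduction: { count: count } }
      end
    }
  end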
