Mongodb replacing dot (.) in key name while inserting document - ruby

MongoDB doesn't support keys that contain a dot. I have a nested Ruby hash with many keys containing the dot (.) character. Is there a configuration option that specifies a replacement character for ., such as an underscore _, when inserting such data into MongoDB?
I'm using MongoDB with Ruby and the mongo gem.
An example hash is shown below:
{
  "key.1" => {
    "second.key" => {
      "third.key" => "val"
    }
  }
}

If it isn't possible to use keys with . in MongoDB, you'll have to modify the input data:
hash = {
  'key.1' => {
    'second.key' => {
      'third.key' => 'val.1',
      'fourth.key' => ['val.1', 'val.2']
    }
  }
}
Transforming string keys
This recursive method transforms the keys of a nested Hash :
def nested_gsub(object, pattern = '.', replace = '_')
  if object.is_a? Hash
    object.map do |k, v|
      [k.to_s.gsub(pattern, replace), nested_gsub(v, pattern, replace)]
    end.to_h
  else
    object
  end
end
nested_gsub(hash) returns:
{
  "key_1" => {
    "second_key" => {
      "third_key" => "val.1",
      "fourth_key" => [
        "val.1",
        "val.2"
      ]
    }
  }
}
Transforming keys and values
It's possible to add more cases to the previous method :
def nested_gsub(object, pattern = '.', replace = '_')
  case object
  when Hash
    object.map do |k, v|
      [k.to_s.gsub(pattern, replace), nested_gsub(v, pattern, replace)]
    end.to_h
  when Array
    object.map { |v| nested_gsub(v, pattern, replace) }
  when String
    object.gsub(pattern, replace)
  else
    object
  end
end
nested_gsub will now also process string values and arrays:
{
  "key_1" => {
    "second_key" => {
      "third_key" => "val_1",
      "fourth_key" => [
        "val_1",
        "val_2"
      ]
    }
  }
}
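One caveat with replacing . by _ is that two different keys can end up with the same name, in which case the later one silently wins. The sketch below illustrates this with the key-only version of nested_gsub (repeated so the snippet runs on its own); colliding_keys is a hypothetical helper, not something from the answer or the mongo gem, and it only checks the top level of the hash.

```ruby
# Key-only nested_gsub from above, repeated so this snippet is self-contained.
def nested_gsub(object, pattern = '.', replace = '_')
  if object.is_a? Hash
    object.map do |k, v|
      [k.to_s.gsub(pattern, replace), nested_gsub(v, pattern, replace)]
    end.to_h
  else
    object
  end
end

# Hypothetical helper: group the top-level keys by their transformed name
# and keep only the names that more than one original key maps to.
def colliding_keys(hash, pattern = '.', replace = '_')
  hash.keys
      .group_by { |k| k.to_s.gsub(pattern, replace) }
      .select { |_, originals| originals.size > 1 }
end

input = { 'a.b' => 1, 'a_b' => 2, 'c' => 3 }

merged     = nested_gsub(input)     # => {"a_b"=>2, "c"=>3}
collisions = colliding_keys(input)  # => {"a_b"=>["a.b", "a_b"]}
```

If collisions is non-empty, a different replacement character may be the safer choice.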

MongoDB has no configuration option that allows dots in keys, so you need to preprocess the data before inserting it into a MongoDB collection.
One approach is to replace each dot with its fullwidth Unicode equivalent, U+FF0E, before insertion.
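A minimal sketch of the U+FF0E approach in plain Ruby follows. The names escape_dots, unescape_dots and FULLWIDTH_DOT are assumptions made for illustration; they are not part of the mongo gem. Only keys are rewritten, and the reverse helper restores the original keys after reading the document back.

```ruby
# Hypothetical helpers (not part of the mongo gem): swap ASCII dots in hash
# keys for U+FF0E (fullwidth full stop) and back again.
FULLWIDTH_DOT = "\uFF0E"

def escape_dots(object)
  case object
  when Hash
    object.each_with_object({}) do |(key, value), result|
      result[key.to_s.gsub('.', FULLWIDTH_DOT)] = escape_dots(value)
    end
  when Array
    object.map { |item| escape_dots(item) }
  else
    object # values are left untouched; only keys are rewritten
  end
end

def unescape_dots(object)
  case object
  when Hash
    object.each_with_object({}) do |(key, value), result|
      result[key.to_s.gsub(FULLWIDTH_DOT, '.')] = unescape_dots(value)
    end
  when Array
    object.map { |item| unescape_dots(item) }
  else
    object
  end
end

original = { 'key.1' => { 'second.key' => [{ 'third.key' => 'val.1' }] } }
escaped  = escape_dots(original)   # safe to hand to the driver
restored = unescape_dots(escaped)  # same keys as the original hash
```

Unlike the underscore replacement, this mapping is reversible as long as the original keys never contain U+FF0E themselves.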
Hope this helps.

Related

Ruby print or return specific field from object

How do I print the group_id from the returned object?
The following is returned from a function. I want to print the group_id or maybe return the group_id
{
:security_groups=>[
{
:description=>"Created By ManageIQ",
:group_name=>"MIQ_019",
:ip_permissions=>[
{
:from_port=>22,
:ip_protocol=>"tcp",
:ip_ranges=>[
{
:cidr_ip=>"0.0.0.0/0",
:description=>nil
}
],
:ipv_6_ranges=>[],
:prefix_list_ids=>[],
:to_port=>22,
:user_id_group_pairs=>[]
}
],
:owner_id=>"943755119718",
:group_id=>"sg-0c2c5f219f1bafc1a",
:ip_permissions_egress=>[
{
:from_port=>nil,
:ip_protocol=>"-1",
:ip_ranges=>[
{
:cidr_ip=>"0.0.0.0/0",
:description=>nil
}
],
:ipv_6_ranges=>[],
:prefix_list_ids=>[],
:to_port=>nil,
:user_id_group_pairs=>[]
}
],
:tags=>[],
:vpc_id=>"vpc-d817c1b3"
}
],
:next_token=>nil
}
This is the function: I want to return security_group.group_id
def describe_security_group(group_name)
  ec2 = get_aws_client
  security_group = ec2.describe_security_groups(
    filters: [
      { name: 'group-name', values: [group_name] }
    ]
  )
  puts "Describing security group '#{group_name}' with ID " \
       "'#{security_group}'"
  return security_group
rescue StandardError => e
  puts "Error describing security group: #{e.message}"
  return
end
The returned value looks like a hash, or you can convert it to one.
Since the array holds a single element, you can simply use Ruby's dig method.
Given your data and the comment below, the needed element can be accessed like this:
# from your ec2 api call
security_group = ec2.describe_security_groups(...)
# Result value is stored in `security_group` variable,
# and looks exactly like hash below
{
:security_groups=>[
{
:description=>"Created By ManageIQ",
:group_name=>"MIQ_019",
:ip_permissions=>[
{
:from_port=>22,
:ip_protocol=>"tcp",
:ip_ranges=>[
{
:cidr_ip=>"0.0.0.0/0",
:description=>nil
}
],
:ipv_6_ranges=>[],
:prefix_list_ids=>[],
:to_port=>22,
:user_id_group_pairs=>[]
}
],
:owner_id=>"943755119718",
:group_id=>"sg-0c2c5f219f1bafc1a",
:ip_permissions_egress=>[
{
:from_port=>nil,
:ip_protocol=>"-1",
:ip_ranges=>[
{
:cidr_ip=>"0.0.0.0/0",
:description=>nil
}
],
:ipv_6_ranges=>[],
:prefix_list_ids=>[],
:to_port=>nil,
:user_id_group_pairs=>[]
}
],
:tags=>[],
:vpc_id=>"vpc-d817c1b3"
}
],
:next_token=>nil
}
# And this is a target value, that you can store in another one,
# return from method or simply print to output
# dig also accepts array indices, so no ActiveSupport `try` is needed
security_group.dig(:security_groups, 0, :group_id)
# => "sg-0c2c5f219f1bafc1a"
But if you need to search an array with multiple elements, methods from Ruby's Enumerable module (such as select or reject) can help.
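As a rough sketch of the Enumerable approach, the snippet below uses a trimmed-down hash in place of the real response. The second group and its ID are invented for illustration; find returns the first matching element, while select and map work on all of them.

```ruby
# Trimmed-down stand-in for the API response.
# The second entry is hypothetical and exists only to make the search meaningful.
response = {
  security_groups: [
    { group_name: 'MIQ_019',     group_id: 'sg-0c2c5f219f1bafc1a', tags: [] },
    { group_name: 'other-group', group_id: 'sg-00000000000000000', tags: [] }
  ],
  next_token: nil
}

groups = response[:security_groups]

# First group with a matching name (nil when nothing matches)
match    = groups.find { |group| group[:group_name] == 'MIQ_019' }
group_id = match && match[:group_id]

# All groups whose name starts with a given prefix
miq_groups = groups.select { |group| group[:group_name].start_with?('MIQ_') }

# Every group ID in the response
all_ids = groups.map { |group| group[:group_id] }
```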
UPDATE: with OpenStruct, if you prefer method calls with dot notation:
require 'json'
require 'ostruct'
json = security_group.to_json
os = JSON.parse(json, object_class: OpenStruct)
os.security_groups.first.group_id
# => "sg-0c2c5f219f1bafc1a"

Ruby: transform Hash-Keys

I have a Hash:
urls = [{'logs' => 'foo'},{'notifications' => 'bar'}]
The goal is to add a prefix to the keys:
urls = [{'example.com/logs' => 'foo'},{'example.com/notifications' => 'bar'}]
My attempt:
urls.map {|e| e.keys.map { |k| "example.com#{k}" }}
Then I get an array with the desired form of the keys but how can I manipulate the original hash?
If you want to "manually" transform the keys, then you can first iterate over your array of hashes, and then over each object (each hash) map their value to a hash where the key is interpolated with "example.com/", and the value remains the same:
urls.flat_map { |hash| hash.map { |key, value| { "example.com/#{key}" => value } } }
# [{"example.com/logs"=>"foo"}, {"example.com/notifications"=>"bar"}]
Notice that urls is being "flat-mapped"; otherwise you'd get an array of arrays containing hashes.
If you prefer to simplify that, you can use Ruby's built-in method for transforming the keys in a hash, Hash#transform_keys:
urls.map { |url| url.transform_keys { |key| "example.com/#{key}" } }
# [{"example.com/logs"=>"foo"}, {"example.com/notifications"=>"bar"}]
Use transform_keys.
urls = [{'logs' => 'foo'}, {'notifications' => 'bar'}]
urls.map { |hash| hash.transform_keys { |key| "example.com/#{key}" } }
# => [{"example.com/logs"=>"foo"}, {"example.com/notifications"=>"bar"}]
One question: are you best served with an array of hashes here, or would a single hash suit better? For example:
urls = { 'logs' => 'foo', 'notifications' => 'bar' }
Seems a little more sensible a way to store the data. Then, saying you did still need to transform these:
urls.transform_keys { |key| "example.com/#{key}" }
# => {"example.com/logs"=>"foo", "example.com/notifications"=>"bar"}
Or to get from your original array to the hash output:
urls = [{'logs' => 'foo'}, {'notifications' => 'bar'}]
urls.reduce({}, &:merge).transform_keys { |key| "example.com/#{key}" }
# => {"example.com/logs"=>"foo", "example.com/notifications"=>"bar"}
Much easier to work with IMHO :)
If you don't have access to Hash#transform_keys, i.e. Ruby < 2.5, this should work:
urls.map { |h| h.map { |k, v| ["example.com/#{k}", v] }.to_h }

no implicit conversion of nil into String when using hash without =>

my program works fine with this hash
hash =
{
'keyone'=> 'valueone',
'keytwo'=> 'valuetwo',
'keythree'=> 'valuethree'
}
but someone pointed out that this notation is old and that now I should use:
hash =
{
'keyone': 'valueone',
'keytwo': 'valuetwo',
'keythree': 'valuethree'
}
I get this error:
no implicit conversion of nil into String (TypeError)
I only changed the hash notation.
Can someone explain what is happening?
In the latter your keys are saved as symbols. So you should refer to them as:
hash[:keyone]
And if symbols are just fine, this is even better
hash = {
keyone: 'valueone',
keytwo: 'valuetwo',
keythree: 'valuethree'
}
But, if you need string keys, you have to stick with the "old" syntax
hash = {
'keyone' => 'valueone',
'keytwo' => 'valuetwo',
'keythree' => 'valuethree'
}
I only changed the hash notation.
No, you didn't. You also changed the type of the key objects from Strings to Symbols.
{ 'key': 'value' }
is not equivalent to
{ 'key' => 'value' }
it is equivalent to
{ :key => 'value' }
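A small sketch of how the error can arise. The string concatenation is an assumption, since the question doesn't show the code that reads the hash; the point is that a String lookup on a Symbol-keyed hash returns nil.

```ruby
string_keys = { 'keyone' => 'valueone' }
symbol_keys = { 'keyone': 'valueone' }   # same as { :keyone => 'valueone' }

string_key_class = string_keys.keys.first.class  # String
symbol_key_class = symbol_keys.keys.first.class  # Symbol

by_string = symbol_keys['keyone']  # nil, no String key exists
by_symbol = symbol_keys[:keyone]   # "valueone"

# Assumed usage: concatenating the looked-up value onto a String.
begin
  'value: ' + symbol_keys['keyone']
rescue TypeError => e
  message = e.message  # "no implicit conversion of nil into String"
end
```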
The new notation uses symbols for keys:
hash = {
keyone: 'valueone',
keytwo: 'valuetwo',
keythree: 'valuethree'
}
puts hash
# {:keyone=>"valueone", :keytwo=>"valuetwo", :keythree=>"valuethree"}
Your code also misses the commas between the items.

Logstash escape JSON Keys

I have multiple systems that send data as JSON Request Body. This is my simple config file.
input {
http {
port => 5001
}
}
output {
elasticsearch {
hosts => "elasticsearch:9200"
}
}
In most cases this works just fine. I can look at the json data with kibana.
In some cases the JSON will not be processed. It has something to do with the JSON escaping. For example, if a key contains a '.', the JSON will not be processed.
I cannot control the JSON. Is there a way to escape these characters in a JSON key?
Update: As mentioned in the comments, I'll give an example of a JSON string (the content is altered, but I've tested the JSON string and it has the same behavior as the original):
{
"http://example.com": {
"a": "",
"b": ""
}
}
My research brings me back to my post, finally.
Before Elasticsearch 2.0 dots in the key were allowed. Since version 2.0 this is not the case anymore.
One user in the logstash forum developed a ruby script that takes care of the dots in json keys:
filter {
  ruby {
    init => "
      def remove_dots hash
        new = Hash.new
        hash.each { |k,v|
          if v.is_a? Hash
            v = remove_dots(v)
          elsif v.is_a? Array
            # Convert hashes nested inside arrays, keeping the array itself
            v = v.map { |elem| elem.is_a?(Hash) ? remove_dots(elem) : elem }
          end
          new[ k.gsub('.','_') ] = v
        } unless hash.nil?
        return new
      end
    "
    code => "
      event.instance_variable_set(:@data,remove_dots(event.to_hash))
    "
  }
}
All credits go to @hanzmeier1234 (Field name cannot contain ‘.’)

Ruby mongoid aggregation return object

I am doing a MongoDB aggregation using Mongoid, via ModelName.collection.aggregate(pipeline). The value returned is an array and not a Mongoid::Criteria, so if I call first on the array, I get the first element, which is of type BSON::Document instead of ModelName. As a result, I am unable to use it as a model.
Is there a method to return a criteria instead of an array from the aggregation, or convert a bson document to a model instance?
Using mongoid (4.0.0)
I've been struggling with this on my own too. I'm afraid you have to build your "models" on your own. Let's take an example from my code:
class Searcher
  # ...
  def results(page: 1, per_page: 50)
    pipeline = []
    pipeline << {
      "$match" => {
        title: /#{@params['query']}/i
      }
    }
    geoNear = {
      "near" => coordinates,
      "distanceField" => "distance",
      "distanceMultiplier" => 3959,
      "num" => 500,
      "spherical" => true,
    }
    pipeline << {
      "$geoNear" => geoNear
    }
    count = aggregate(pipeline).count
    pipeline << { "$skip" => ((page.to_i - 1) * per_page) }
    pipeline << { "$limit" => per_page }
    places_hash = aggregate(pipeline)
    # Build model instances by hand and mark them as already persisted
    places = places_hash.map { |attrs| Offer.new(attrs) { |o| o.new_record = false } }
    # ...
    places
  end

  def aggregate(pipeline)
    Offer.collection.aggregate(pipeline)
  end
end
I've omitted a lot of code from the original project, just to show the approach I've been using.
The most important thing here is the line:
places_hash.map { |attrs| Offer.new(attrs) { |o| o.new_record = false } }
Here I'm creating an array of Offers and also manually setting their new_record attribute to false, so they behave like any other documents returned by a simple Offer.where(...).
It's not beautiful, but it worked for me, and I could take the best of whole Aggregation Framework!
Hope that helps!
