adding allowDiskUse parameter to db.collection.aggregate() query using Mongoid - ruby

I recently upgraded MongoDB from 2.4 to 2.6, and the new memory limit in aggregate() is causing my aggregation to fail with the following error:
Moped::Errors::OperationFailure: The operation: #<Moped::Protocol::Command
#length=251
#request_id=6
#response_to=0
#op_code=2004
#flags=[:slave_ok]
#full_collection_name="items.$cmd"
#skip=0
#limit=-1
#selector={:aggregate=>"items", :pipeline=>[{"$group"=>{"_id"=>"$serial_number", "total"=>{"$sum"=>1}}}, {"$match"=>{"total"=>{"$gte"=>2}}}, {"$sort"=>{"total"=>-1}}, {"$limit"=>750000}]}
#fields=nil>
failed with error 16945: "exception: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in."
So, I'm trying to pass allowDiskUse: true in the query:
dupes = Item.collection.aggregate([{
'$group' => {'_id' => "$serial_number", 'total' => { "$sum" => 1 } } },
{ '$match' => { 'total' => { '$gte' => 2 } } },
{ '$sort' => {'total' => -1}},
{ '$limit' => 750000 }],
{ 'allowDiskUse' => true })
But this isn't working. No matter what I try, I get this error:
Moped::Errors::OperationFailure: The operation: #<Moped::Protocol::Command
#length=274
#request_id=2
#response_to=0
#op_code=2004
#flags=[:slave_ok]
#full_collection_name="items.$cmd"
#skip=0
#limit=-1
#selector={:aggregate=>"items", :pipeline=>[{"$group"=>{"_id"=>"$serial_number", "total"=>{"$sum"=>1}}}, {"$match"=>{"total"=>{"$gte"=>2}}}, {"$sort"=>{"total"=>-1}}, {"$limit"=>750000}, {"allowDiskUse"=>true}]}
#fields=nil>
failed with error 16436: "exception: Unrecognized pipeline stage name: 'allowDiskUse'"
Does anyone know how I can structure this query appropriately to pass allowDiskUse outside of the pipeline arg?

Use the following syntax with Mongoid 5.0.0:
Modelname.collection.aggregate(
[your stages, ... ],
:allow_disk_use => true
)
For instance
group = { "$group" => {"_id" => {"column_xyz"=>"$column_xyz" }, "collection_name" => { "$push" => "$$ROOT" }, "count" => { "$sum" => 1 } }};
Hive.collection.aggregate([group], {:allow_disk_use => true})
Ref: the comments on MongoDB JIRA ticket RUBY-1041

The problem is that Moped does not currently accept options for Moped::Collection#aggregate, only a pipeline,
as can be seen here: https://github.com/mongoid/moped/blob/master/lib/moped/collection.rb#L146
The Mongo Ruby driver supports options for Mongo::Collection#aggregate, but Mongoid 3 uses Moped as its driver.
However, thanks to the dynamic nature of Ruby, you can work around this.
The following test includes a monkey-patch for Moped::Collection#aggregate. Provided that you supply the pipeline
as an array in the first argument, it lets you tack on options like allowDiskUse in the second.
Hope that this helps.
test/unit/item_test.rb
require 'test_helper'
module Moped
class Collection
def aggregate(pipeline, opts = {})
database.session.command({aggregate: name, pipeline: pipeline}.merge(opts))["result"]
end
end
end
class ItemTest < ActiveSupport::TestCase
def setup
Item.delete_all
end
test "moped aggregate with allowDiskUse" do
puts "\nMongoid::VERSION:#{Mongoid::VERSION}\nMoped::VERSION:#{Moped::VERSION}"
docs = [
{serial_number: 1},
{serial_number: 2},
{serial_number: 2},
{serial_number: 3},
{serial_number: 3},
{serial_number: 3}
]
Item.create(docs)
assert_equal(docs.count, Item.count)
dups = Item.collection.aggregate(
[{'$group' => {'_id' => "$serial_number", 'total' => {"$sum" => 1}}},
{'$match' => {'total' => {'$gte' => 2}}},
{'$sort' => {'total' => -1}},
{'$limit' => 750000}],
{'allowDiskUse' => true})
p dups
end
end
$ rake test
Run options:
# Running tests:
[1/1] ItemTest#test_moped_aggregate_with_allowDiskUse
Mongoid::VERSION:3.1.6
Moped::VERSION:1.5.2
[{"_id"=>3, "total"=>3}, {"_id"=>2, "total"=>2}]
Finished tests in 0.027865s, 35.8873 tests/s, 35.8873 assertions/s.
1 tests, 1 assertions, 0 failures, 0 errors, 0 skips
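The second error in the question is consistent with the options hash being folded into the pipeline array, so the server sees allowDiskUse as a stage name. As a minimal sketch of the difference, with no MongoDB connection involved, the two helpers below build the command documents side by side. The names flattened_command and merged_command are hypothetical and are not part of Moped.

```ruby
# Hypothetical helpers for illustration only; neither is part of Moped.

# Treats every positional argument as pipeline content, which is what
# the second error message in the question suggests is happening.
def flattened_command(collection_name, *args)
  { aggregate: collection_name, pipeline: args.flatten }
end

# Mirrors the monkey-patch above: options are merged into the top level
# of the command document, next to :pipeline.
def merged_command(collection_name, pipeline, opts = {})
  { aggregate: collection_name, pipeline: pipeline }.merge(opts)
end

pipeline = [
  { '$group' => { '_id' => '$serial_number', 'total' => { '$sum' => 1 } } },
  { '$match' => { 'total' => { '$gte' => 2 } } }
]
opts = { 'allowDiskUse' => true }

wrong = flattened_command('items', pipeline, opts)
right = merged_command('items', pipeline, opts)

# In the flattened form the options hash ends up as the last "stage".
puts wrong[:pipeline].last.inspect
# In the merged form the pipeline keeps its two stages and the option
# sits beside it at the top level of the command.
puts right[:pipeline].length
puts right['allowDiskUse'].inspect
```

The merged form is the shape the server error message asks for: allowDiskUse is passed alongside the pipeline, not inside it.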

How to iterate through a nested dict in puppet?

Here is a Python example of a data set similar to the one I have in my Puppet code:
dict = {'account1': {'uid': ['123456'], 'user': ['appuser1'], 'appname': ['myapp1']},
'account2': {'uid': ['567878'], 'user': ['appuser2'], 'appname': ['myapp2']}}
for i in dict.keys():
print dict[i]['user'], dict[i]['uid']
How do I achieve the same thing in Puppet/Ruby? TIA.
In Puppet manifests, you can iterate over a hash using the each function:
$ cat foo.pp
$dict = {
'account1' => {
'uid' => ['123456'],
'user' => ['appuser1'],
'appname' => ['myapp1']
},
'account2' => {
'uid' => ['567878'],
'user' => ['appuser2'],
'appname' => ['myapp2']
}
}
$dict.each | $account_key, $account | {
notice("${account['user'][0]}, ${account['uid'][0]}")
}
$ puppet apply foo.pp
Notice: Scope(Class[main]): appuser1, 123456
Notice: Scope(Class[main]): appuser2, 567878
Notice: Compiled catalog for it070137 in environment production in 0.04 seconds
Notice: Applied catalog in 0.03 seconds
If you like, you can use types to check that the key and value in the hash are what you expect:
$dict.each | String $account_key, Hash $account | {
notice("${account['user'][0]}, ${account['uid'][0]}")
}
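Since the question also mentions Ruby, here is a minimal sketch of the same iteration in plain Ruby, assuming the data is held in a nested Hash with the same keys as the Puppet example. The variable names dict and lines are illustrative.

```ruby
# Same data as the Puppet example, expressed as a nested Ruby Hash.
dict = {
  'account1' => { 'uid' => ['123456'], 'user' => ['appuser1'], 'appname' => ['myapp1'] },
  'account2' => { 'uid' => ['567878'], 'user' => ['appuser2'], 'appname' => ['myapp2'] }
}

# Build one "user, uid" line per account. The key is unused here,
# so it is prefixed with an underscore.
lines = dict.map do |_account_key, account|
  "#{account['user'][0]}, #{account['uid'][0]}"
end

puts lines
```

Hash#each works the same way if you only need the side effect and not the collected lines.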

translation from XML to rest-client for POST request

This is the XML request I have to make via POST in order to receive a response:
<BackgroundCheck userId="username" password="password">
<BackgroundSearchPackage action="submit" type="demo product">
<ReferenceId>some_id_value</ReferenceId>
<PersonalData>
<PersonName>
<GivenName>John</GivenName>
<MiddleName>Q</MiddleName>
<FamilyName>Test</FamilyName>
</PersonName>
<Aliases>
<PersonName>
<GivenName>Jack</GivenName>
<MiddleName>Quigley</MiddleName>
<FamilyName>Example</FamilyName>
</PersonName>
</Aliases>
<DemographicDetail>
<GovernmentId issuingAuthority="SSN">123456789</GovernmentId>
<DateOfBirth>1973-12-25</DateOfBirth>
</DemographicDetail>
<PostalAddress>
<PostalCode>83201</PostalCode>
<Region>UT</Region>
<Municipality>Salt Lake City</Municipality>
<DeliveryAddress>
<AddressLine>1234</AddressLine>
<StreetName>Main Street</StreetName>
</DeliveryAddress>
</PostalAddress>
<EmailAddress>john@test.com</EmailAddress>
<Telephone>801-789-4229</Telephone>
</PersonalData>
</BackgroundCheck>
</BackgroundSearchPackage>
Using the examples on the rest-client github page I came up with the following translation using rest-client:
response = RestClient.post( 'url',
{
:BackgroundCheck => {
:userID => 'username',
:password => 'password',
},
:BackgroundSearchPackage => {
:action => 'submit',
:type => 'demo'
},
:ReferenceID => 'some_id_value',
:PersonalData => {
:PersonalName => {
:GivenName => 'John',
:MiddleName => 'Q',
:FamilyName => 'Test'
},
:Aliases => {
:GivenName => 'Jack',
:MiddleName => 'Quigly',
:FamilyName => 'Example'
}
},
:DemographicDetail => {
:GovernmentId => {
:issuingAuthority => "SSN"
}, ## where do I enter the SSN?
:DateOfBirth => '1972-12-25'
},
:PostalAddress => {
:PostalCode => '83201',
:Region => 'UT',
:Municipality => 'Salt Lake City',
:DeliveryAddress => {
:AddressLine => '1234',
:StreetName => 'Main Street'
}
},
:EmailAddress => 'john@test.com',
:Telephone => '801-789-4229'
})
It's my first time with XML and the rest-client gem.
My question is: did I translate the XML correctly in the POST request?
More specifically, how do I handle the GovernmentId element and its SSN value?
First of all, the XML you've provided isn't valid! Your root element starts with BackgroundCheck and ends with BackgroundSearchPackage:
<BackgroundCheck userId="username" password="password">
<BackgroundSearchPackage action="submit" type="demo product">
</BackgroundCheck>
</BackgroundSearchPackage>
In addition, your translation / transformation from XML to Ruby hash is incorrect. If BackgroundCheck is your root element and BackgroundSearchPackage is a child of it, your Ruby hash should look like this (rest-client accepts the string and the symbol notation):
my_xml_hash = {
"BackgroundCheck" => {
"userId"=>"username",
"password"=>"password",
"BackgroundSearchPackage" => {
"action"=>"submit",
"type"=>"demo product",
...
"PersonalData" => { ... },
...
}
}
}
You can access values in a Ruby hash like this:
# string syntax
my_xml_hash['BackgroundCheck']['BackgroundSearchPackage']['PersonalData']['DemographicDetail']['GovernmentId']
# => "123456789"
# symbol syntax
other_xml_hash[:BackgroundCheck][:BackgroundSearchPackage][:PersonalData][:DemographicDetail][:GovernmentId]
# => "123456789"
If I understood you correctly, you want to send XML via a POST request. If you use the hash syntax, you will probably not get the result you want, because rest-client will post your data as form parameters and not as XML data!
If you only need to adjust GovernmentId and issuingAuthority, I would do it as follows.
require 'rest_client'
# the customized 'GovernmentID'
government_id = '123'
# the customized 'issuingAuthority'
issuing_authority = 'FOO'
xml_template =<<END_OF_XML
<BackgroundCheck userId="username" password="password">
<BackgroundSearchPackage action="submit" type="demo product">
<ReferenceId>some_id_value</ReferenceId>
<PersonalData>
<PersonName>
<GivenName>John</GivenName>
<MiddleName>Q</MiddleName>
<FamilyName>Test</FamilyName>
</PersonName>
<Aliases>
<PersonName>
<GivenName>Jack</GivenName>
<MiddleName>Quigley</MiddleName>
<FamilyName>Example</FamilyName>
</PersonName>
</Aliases>
<DemographicDetail>
<GovernmentId issuingAuthority="#{issuing_authority}">#{government_id}</GovernmentId>
<DateOfBirth>1973-12-25</DateOfBirth>
</DemographicDetail>
<PostalAddress>
<PostalCode>83201</PostalCode>
<Region>UT</Region>
<Municipality>Salt Lake City</Municipality>
<DeliveryAddress>
<AddressLine>1234</AddressLine>
<StreetName>Main Street</StreetName>
</DeliveryAddress>
</PostalAddress>
<EmailAddress>john@test.com</EmailAddress>
<Telephone>801-789-4229</Telephone>
</PersonalData>
</BackgroundSearchPackage>
</BackgroundCheck>
END_OF_XML
# Go to http://requestb.in/ , click on "Create a RequestBin", copy the "Bin URL" and use it for your tests ;-)
response = RestClient.post('http://your.target.tld/your/webservice', xml_template, { content_type: :xml })
puts "Response: #{response.inspect}"
REXML example:
require 'rest_client'
require 'rexml/document'
xml_string =<<END_OF_XML
<BackgroundCheck userId="username" password="password">
<BackgroundSearchPackage action="submit" type="demo product">
<ReferenceId>some_id_value</ReferenceId>
<PersonalData>
<PersonName>
<GivenName>John</GivenName>
<MiddleName>Q</MiddleName>
<FamilyName>Test</FamilyName>
</PersonName>
<Aliases>
<PersonName>
<GivenName>Jack</GivenName>
<MiddleName>Quigley</MiddleName>
<FamilyName>Example</FamilyName>
</PersonName>
</Aliases>
<DemographicDetail>
<GovernmentId issuingAuthority="SSN">123456789</GovernmentId>
<DateOfBirth>1973-12-25</DateOfBirth>
</DemographicDetail>
<PostalAddress>
<PostalCode>83201</PostalCode>
<Region>UT</Region>
<Municipality>Salt Lake City</Municipality>
<DeliveryAddress>
<AddressLine>1234</AddressLine>
<StreetName>Main Street</StreetName>
</DeliveryAddress>
</PostalAddress>
<EmailAddress>john@test.com</EmailAddress>
<Telephone>801-789-4229</Telephone>
</PersonalData>
</BackgroundSearchPackage>
</BackgroundCheck>
END_OF_XML
# Build XML document from string
doc = REXML::Document.new(xml_string)
government_element = REXML::XPath.first(doc, "//GovernmentId")
# Read values:
puts government_element.text
puts government_element.attributes['issuingAuthority']
# OR directly via XPath
puts REXML::XPath.first(doc, "//GovernmentId").text
puts REXML::XPath.first(doc, "//GovernmentId/@issuingAuthority").value
# Write values:
government_element.text = 'my new text value'
government_element.attributes['issuingAuthority'] = 'my new attribute value'
# Go to http://requestb.in/ , click on "Create a RequestBin", copy the "Bin URL" and use it for your tests ;-)
response = RestClient.post('http://your.target.tld/your/webservice', doc.to_s, { content_type: :xml })
puts "Response: #{response.inspect}"
If you need to write complex XML trees, I recommend taking a look at the following gems:
Nokogiri
LibXml Ruby
XmlSimple
REXML (Ruby built in)
Or use a templating engine like ERB, to simplify it.
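As a minimal sketch of the ERB option, the example below templates only the DemographicDetail fragment from the request. The helper name render_fragment and the choice to escape every value with ERB::Util.html_escape are assumptions made for illustration; result_with_hash requires Ruby 2.5 or later.

```ruby
require 'erb'

# Only a fragment of the request is templated here; the rest of the
# document could be filled in the same way.
template = <<~XML
  <DemographicDetail>
    <GovernmentId issuingAuthority="<%= ERB::Util.html_escape(issuing_authority) %>"><%= ERB::Util.html_escape(government_id) %></GovernmentId>
    <DateOfBirth><%= ERB::Util.html_escape(date_of_birth) %></DateOfBirth>
  </DemographicDetail>
XML

# render_fragment is a hypothetical helper name. It renders the template
# with the given values exposed as local variables.
def render_fragment(template, values)
  ERB.new(template).result_with_hash(values)
end

xml = render_fragment(
  template,
  issuing_authority: 'SSN',
  government_id: '123456789',
  date_of_birth: '1973-12-25'
)
puts xml
```

Escaping the interpolated values keeps characters such as & and < from producing malformed XML. The rendered string can then be posted with content_type: :xml, as in the examples above.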

How to update all fields in MailChimp API batch subscribe using Ruby and Gibbon

I am using Ruby 1.9.3 without Rails and version 1.0.4 of the Gibbon gem.
I have referrals populated with my list and can send the following to MailChimp with Gibbon. However, only the email address and email type fields are populated in the list in MailChimp. What am I doing wrong that is prohibiting all the merge fields from being imported via API?
Here is the batch and map of the list.
referrals.each_slice(3) do |batch|
begin
prepared_batch = batch.map do |referral|
{
:EMAIL => {:email => referral['client_email']},
:EMAIL_TYPE => 'html',
:MMERGE6 => referral['field_1'],
:MMERGE7 => referral['field_2'],
:MMERGE8 => referral['field_3'],
:MMERGE9 => referral['field_4'],
:MMERGE11 => referral['field_5'],
:MMERGE12 => referral['field_6'],
:MMERGE13 => referral['field_7'],
:MMERGE14 => referral['field_8'],
:MMERGE15 => referral['field_9'],
:FNAME => referral['client_first_name']
}
end
@log.info("prepared_batch : #{prepared_batch}")
result = @gibbon.lists.batch_subscribe(
:id => @mc_list_id,
:batch => prepared_batch,
:double_optin => false,
:update_existing => true
)
@log.info("#{result}")
rescue Exception => e
@log.warn("Unable to load batch into mailchimp because #{e.message}")
end
end
The above executes successfully. However, only the email address and email type are populated but most of the fields should be populated.
Here is my log output for one of the prepared_batches. I replaced the real values with Value. I used my own email for testing.
I, [2013-11-11T09:01:14.778907 #70827] INFO -- : prepared_batch : [{:EMAIL=>
{:email=>"jason+6@marketingscience.co"}, :EMAIL_TYPE=>"html", :MMERGE6=>"Value",
:MMERGE7=>"Value", :MMERGE8=>nil, :MMERGE9=>nil, :MMERGE11=>"8/6/13 0:00",
:MMERGE12=>"Value", :MMERGE13=>nil, :MMERGE14=>"10/18/13 19:09", :MMERGE15=>"Value",
:FNAME=>"Value"}, {:EMAIL=>{:email=>"jason+7@marketingscience.co"}, :EMAIL_TYPE=>"html",
:MMERGE6=>"Value", :MMERGE7=>"Value", :MMERGE8=>nil, :MMERGE9=>nil, :MMERGE11=>"8/6/13
0:00", :MMERGE12=>"Value", :MMERGE13=>nil, :MMERGE14=>nil, :MMERGE15=>"Value",
:FNAME=>"Value"}, {:EMAIL=>{:email=>"jason+8@marketingscience.co"}, :EMAIL_TYPE=>"html",
:MMERGE6=>"Value", :MMERGE7=>"Value", :MMERGE8=>nil, :MMERGE9=>nil, :MMERGE11=>"8/7/13
0:00", :MMERGE12=>"Value", :MMERGE13=>nil, :MMERGE14=>nil, :MMERGE15=>"Value",
:FNAME=>"Value"}]
Here is the log output of result from the MailChimp call.
I, [2013-11-11T09:01:14.778691 #70827] INFO -- : {"add_count"=>3, "adds"=>
[{"email"=>"jason+3@marketingscience.co", "euid"=>"ab512177b4", "leid"=>"54637465"},
{"email"=>"jason+4@marketingscience.co", "euid"=>"eeb8388524", "leid"=>"54637469"},
{"email"=>"jason+5@marketingscience.co", "euid"=>"7dbc84cb75", "leid"=>"54637473"}],
"update_count"=>0, "updates"=>[], "error_count"=>0, "errors"=>[]}
Any advice on how to get all the fields to update in MailChimp is appreciated. Thanks.
Turns out the documentation for using the Gibbon gem to batch subscribe is not correct. You need to add the :merge_vars struct to contain the fields other than email and email type. My final code looks like the following. I'm also going to update this code in its entirety at: https://gist.github.com/analyticsPierce/7434085.
referrals.each_slice(3) do |batch|
begin
prepared_batch = batch.map do |referral|
{
:EMAIL => {:email => referral['email']},
:EMAIL_TYPE => 'html',
:merge_vars => {
:MMERGE6 => referral['field_1'],
:MMERGE7 => referral['field_2'],
:MMERGE8 => referral['field_3'],
:MMERGE9 => referral['field_4'],
:MMERGE11 => referral['field_5'],
:MMERGE12 => referral['field_6'],
:MMERGE13 => referral['field_7'],
:MMERGE14 => referral['field_8'],
:MMERGE15 => referral['field_9'],
:FNAME => referral['first_name']
}
}
end
@log.info("prepared_batch : #{prepared_batch}")
result = @gibbon.lists.batch_subscribe(
:id => @mc_list_id,
:batch => prepared_batch,
:double_optin => false,
:update_existing => true
)
@log.info("#{result}")
rescue Exception => e
@log.warn("Unable to load batch into mailchimp because #{e.message}")
end
end
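The shape of each batch entry can be checked without calling MailChimp at all. The sketch below builds one entry with the merge fields nested under :merge_vars, as in the code above. The helper name build_subscriber is hypothetical and not part of Gibbon, and dropping nil values is an assumption made here for tidiness, not something the API is known to require.

```ruby
# build_subscriber is a hypothetical helper, not part of Gibbon.
# It nests every merge field under :merge_vars, which is the structure
# the answer above found to be necessary.
def build_subscriber(email, merge_fields)
  {
    :EMAIL => { :email => email },
    :EMAIL_TYPE => 'html',
    # Assumption: nil values are dropped so that empty fields are not sent.
    :merge_vars => merge_fields.reject { |_tag, value| value.nil? }
  }
end

subscriber = build_subscriber(
  'person@example.com',
  { :FNAME => 'Value', :MMERGE6 => 'Value', :MMERGE8 => nil }
)

puts subscriber[:merge_vars].keys.inspect
```

Hash#reject is used instead of Hash#compact so the sketch also runs on the Ruby 1.9.3 mentioned in the question.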

WickedPdf stopped working on my local system

I'm getting this error while generating a PDF using wkhtmltopdf:
undefined method `pdf_from_string' for #<WickedPdf:0x7f4b82a369c8>
My wicked_pdf.rb:
WickedPdf.config = {
:wkhtmltopdf => '/usr/local/bin/wkhtmltopdf',
:layout => "pdf.html",
:margin => { :top=> 40,
:bottom => 20,
:left=> 30,
:right => 30},
:header => {:html => { :template=> 'layouts/pdf_header.html'}},
:footer => {:html => { :template=> 'layouts/pdf_footer.html'}}
# :exe_path => '/usr/bin/wkhtmltopdf'
}
On the command line,
wkhtmltopdf google.com google.pdf
works fine.
"pdf_from_string" means that it makes a PDF from a STRING.
So for this method to work, it should receive a string.
<WickedPdf:0x7f4b82a369c8> - it is an object.
It should look like this:
pdf_from_string("<p>some html code</p>")
You will get this message when calling pdf_from_string on the class itself, instead of an instance.
WickedPdf.pdf_from_string('<p>some html code</p>')
Will not work, however:
WickedPdf.new.pdf_from_string('<p>some html code</p>')
will, because new returns an instance, which you could then call pdf_from_string on.
This is the same as this:
pdf_generator = WickedPdf.new
pdf = pdf_generator.pdf_from_string('<p>some html code</p>')
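The class-versus-instance distinction can be seen in isolation with a stand-in class. FakePdf below is purely illustrative and is not WickedPdf; it only shows what Ruby does when an instance method is called on the class itself.

```ruby
# FakePdf is a stand-in class for illustration; it is not WickedPdf.
class FakePdf
  # An instance method: callable on FakePdf.new, not on FakePdf.
  def pdf_from_string(html)
    "PDF(#{html})"
  end
end

# Calling the instance method on the class raises NoMethodError.
begin
  FakePdf.pdf_from_string('<p>some html code</p>')
rescue NoMethodError => e
  puts "class call failed: #{e.class}"
end

# Calling it on an instance works.
puts FakePdf.new.pdf_from_string('<p>some html code</p>')
```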

RCov doesn't work

I am currently developing a Ruby gem and want to create metrics.
I am using 'metric_fu', but RCov seems to leave out my specs.
Here is my metric_fu configuration:
MetricFu::Configuration.run do |config|
config.metrics = [:churn, :saikuro, :flog, :flay, :reek, :roodi, :rcov]
config.graphs = [:flog, :flay, :reek, :roodi, :rcov]
config.flay = { :dirs_to_flay => ['lib'] }
config.flog = { :dirs_to_flog => ['lib'] }
config.reek = { :dirs_to_reek => ['lib'] }
config.roodi = { :dirs_to_roodi => ['lib'] }
config.saikuro = { :output_directory => 'scratch_directory/saikuro',
:input_directory => ['lib'],
:cyclo => "",
:filter_cyclo => "0",
:warn_cyclo => "5",
:error_cyclo => "7",
:formater => "text"} #this needs to be set to "text"
config.churn = { :start_date => "1 year ago", :minimum_churn_count => 10}
config.rcov = { :test_files => ["spec/**/*_spec.rb"],
:rcov_opts => ["--sort coverage",
"--no-html",
"--text-coverage",
"--no-color",
"--profile",
"--spec-only",
"--exclude /gems/,/Library/,spec"]}
end
Do you have some tips?
Best regards
Well, this is going to be hard to diagnose without a stack trace, but I would suggest changing your config to this:
MetricFu::Configuration.run do |config|
config.metrics = [:rcov]
config.graphs = [:rcov]
config.rcov = { :test_files => ["spec/**/*_spec.rb"],
:rcov_opts => ["--sort coverage",
"--no-html",
"--text-coverage",
"--no-color",
"--profile",
"--spec-only",
"--exclude /gems/,/Library/,spec"]}
end
That way you can isolate the problem. Then run 'rake metrics:all --trace' and, if you can't figure it out from there, post the results either here or in the metric_fu Google group: http://groups.google.com/group/metric_fu
You can also try running rcov straight from the command line (which is essentially what metric_fu does).
Hope that helps.
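To try rcov directly, you need a command line equivalent to the config above. The sketch below assembles one from the same :test_files glob and :rcov_opts list. The helper name rcov_command is hypothetical, and this is not how metric_fu builds its command internally; it only prints a command you could paste into a shell.

```ruby
require 'shellwords'

# rcov_command is a hypothetical helper; it is not metric_fu's own code.
def rcov_command(test_files, rcov_opts)
  # Each option string may hold a flag plus its value ("--sort coverage"),
  # so split it into separate shell words first.
  args = rcov_opts.flat_map(&:shellsplit)
  Shellwords.join(['rcov', *args, *test_files])
end

rcov_opts = ['--sort coverage',
             '--no-html',
             '--text-coverage',
             '--no-color',
             '--profile',
             '--spec-only',
             '--exclude /gems/,/Library/,spec']

# Expand the same glob used in the metric_fu config.
test_files = Dir.glob('spec/**/*_spec.rb')

puts rcov_command(test_files, rcov_opts)
```

If the glob expands to no files, that alone would explain specs missing from the coverage report.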
