Using MongoDB, Ruby and Sinatra, how can I use math and aggregate functions? - ruby

I have a mongodb collection. Using the ruby driver, the following works:
search = 'LONDON'
result = posts.find(:district => search).to_a.to_json
and produces the following in the firebug console:
[{"_id":"{AB46B6E4-46F7-44F6-8D88-0002C05947BB}","price":"450000","date_sold":"2013-10-23 00:00","post_code":"NW6 2DT","house_type":"F","condition":"N","freehold":"L","house_number":"72","flat_number":"FLAT 3","street":"LOVERIDGE ROAD","town":null,"district":"LONDON","region":"CAMDEN","county":"GREATER LONDON"}]
There are 30 records in the collection. When I change to an aggregate function as follows:
result = posts.find( { :price => { $gt => 100000 } } ).to_a.to_json
I get an empty [ ] in the console. Is this because the data type in the collection is not set to integer? If so, how can I change it programmatically (i.e. not in the shell)?
Or is the query wrong? I am using the mongodb ruby driver.
All help gratefully received, thank you.

You are so close. Your example needs quotes around $gt,
and the value of the price field needs to be an integer, not a string, so that $gt works as you intend.
Here's a test that verifies this. I hope it helps.
test.rb
require 'mongo'
require 'json'
require 'test/unit'
class MyTest < Test::Unit::TestCase
  def setup
    @posts = Mongo::MongoClient.new['test']['posts']
    @docs = JSON.parse <<-EOT
[{"_id":"{AB46B6E4-46F7-44F6-8D88-0002C05947BB}","price":"450000","date_sold":"2013-10-23 00:00","post_code":"NW6 2DT","house_type":"F","condition":"N","freehold":"L","house_number":"72","flat_number":"FLAT 3","street":"LOVERIDGE ROAD","town":null,"district":"LONDON","region":"CAMDEN","county":"GREATER LONDON"}]
    EOT
    @posts.remove
    @posts.insert(@docs)
  end
  test "find examples" do
    result = @posts.find( { :price => { '$gt' => 100000 } } ).to_a
    assert(result.count == 0)
    puts "result from post: #{result.to_json}"
    @docs.each{|doc| doc.delete("_id"); doc["price"] = doc["price"].to_i}
    @posts.insert(@docs)
    result = @posts.find( { :price => { '$gt' => 100000 } } ).to_a
    assert(result.count > 0)
    puts "result after fixes: #{result.to_json}"
  end
end
ruby test.rb
Loaded suite test
Started
result from post: []
result after fixes: [{"_id":{"$oid": "5355e4c3a3f57661f3000001"},"price":450000,"date_sold":"2013-10-23 00:00","post_code":"NW6 2DT","house_type":"F","condition":"N","freehold":"L","house_number":"72","flat_number":"FLAT 3","street":"LOVERIDGE ROAD","town":null,"district":"LONDON","region":"CAMDEN","county":"GREATER LONDON"}]
.
Finished in 0.00482 seconds.
1 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
207.47 tests/s, 414.94 assertions/s
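The conversion step in the test above can also be isolated from the database. The following is only a sketch in plain Ruby: normalize_price is a hypothetical helper name, and the sample document is trimmed to two fields. It shows one way to turn string prices into integers before the documents are inserted.

```ruby
require 'json'

# Hypothetical helper (not part of the mongo driver): returns a copy of the
# document whose "price" is an Integer, whether it arrived as String or Integer.
def normalize_price(doc)
  doc.merge('price' => Integer(doc['price'].to_s, 10))
end

# Trimmed-down sample document in the same shape as the question's data.
docs = JSON.parse('[{"price":"450000","district":"LONDON"}]')
fixed = docs.map { |doc| normalize_price(doc) }

p fixed.first['price']        # the price is now numeric
p fixed.first['price'].class
```

Integer() raises on input such as "abc" instead of silently returning 0 the way to_i does, which may be preferable when cleaning real data.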

Related

Trouble building Jbuilder nested array

I am trying to write a Ruby script that queries an Elasticsearch database and builds a report from it. I am using jbuilder to build the query string like this:
require 'elasticsearch'
require 'date'
require 'jbuilder'
client = Elasticsearch::Client.new log: true, host: 'x.x.x.x', request_timeout: 10
filter_conditions = {}
filter_conditions['must'] = []
filter_conditions['should'] = []
filter_conditions['must'] << Jbuilder.encode do |json|
  json.term do
    json._type 'httpry-log'
  end
end
filter_conditions['must'] << Jbuilder.encode do |json|
  json.range do
    json.set! '@timestamp' do
      _now = DateTime.now
      json.gte (_now - 1.00/24).strftime('%Q').to_i
      json.lte _now.strftime('%Q').to_i
      json.format 'epoch_millis'
    end
  end
end
query = Jbuilder.encode do |json|
  json.size 10
  json.query do
    json.bool do
      json.must do
        json.array!(filter_conditions['must'])
      end
    end
  end
end
puts query
But here is the result I get for the query:
{"size":10,"query":{"bool":{"must":["{\"term\":{\"_type\":\"httpry-log\"}}","{\"range\":{\"@timestamp\":{\"gte\":1477919154057,\"lte\":1477922754057,\"format\":\"epoch_millis\"}}}"]}}}
How can I get the unescaped version of the inner array inside the main JSON output?
Thanks in advance,
Assuming that the query returns the same object and key each time:
a = your_query_hash
# The following map! will parse your string array into an array in ruby
a[:query][:bool][:must].map! { |arr| JSON.parse(arr) }
If the query does not return the same object and key each time, I'd suggest writing a recursive method that parses each value.
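A recursive version could look roughly like the sketch below. It uses only Ruby's standard json library. deep_parse is a hypothetical helper name, and the sample query string is a shortened copy of the output shown in the question.

```ruby
require 'json'

# Hypothetical helper: walks a parsed JSON structure and replaces every String
# that itself contains a JSON object or array with the parsed value.
def deep_parse(value)
  case value
  when Hash
    value.each_with_object({}) { |(key, inner), out| out[key] = deep_parse(inner) }
  when Array
    value.map { |inner| deep_parse(inner) }
  when String
    begin
      parsed = JSON.parse(value)
      # Only unwrap containers; plain strings such as "httpry-log" stay as they are.
      parsed.is_a?(Hash) || parsed.is_a?(Array) ? deep_parse(parsed) : value
    rescue JSON::ParserError
      value
    end
  else
    value
  end
end

# Shortened copy of the doubly encoded query from the question.
query = '{"size":10,"query":{"bool":{"must":["{\"term\":{\"_type\":\"httpry-log\"}}"]}}}'
clean = deep_parse(JSON.parse(query))
puts JSON.generate(clean)
```

An alternative that avoids the round trip altogether is to collect plain Ruby hashes in filter_conditions and encode only once at the end.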

Code not actually asserting in RSpec?

I'm new to Ruby and in various open source software I've noticed a number of "statements" in some RSpec descriptions that appear not to accomplish what they intended, like they wanted to make an assertion, but didn't. Are these coding errors or is there some RSpec or Ruby magic I'm missing? (Likelihood of weirdly overloaded operators?)
The examples, with #??? added to the suspect lines:
(rubinius/spec/ruby/core/array/permutation_spec.rb)
it "returns no permutations when the given length has no permutations" do
  @numbers.permutation(9).entries.size == 0 #???
  @numbers.permutation(9) { |n| @yielded << n }
  @yielded.should == []
end
(discourse/spec/models/topic_link_spec.rb)
it 'works' do
# ensure other_topic has a post
post
url = "http://#{test_uri.host}/t/#{other_topic.slug}/#{other_topic.id}"
topic.posts.create(user: user, raw: 'initial post')
linked_post = topic.posts.create(user: user, raw: "Link to another topic: #{url}")
TopicLink.extract_from(linked_post)
link = topic.topic_links.first
expect(link).to be_present
expect(link).to be_internal
expect(link.url).to eq(url)
expect(link.domain).to eq(test_uri.host)
link.link_topic_id == other_topic.id #???
expect(link).not_to be_reflection
...
(chef/spec/unit/chef_fs/parallelizer.rb)
context "With :ordered => false (unordered output)" do
it "An empty input produces an empty output" do
parallelize([], :ordered => false) do
sleep 10
end.to_a == [] #???
expect(elapsed_time).to be < 0.1
end
(bosh/spec/external/aws_bootstrap_spec.rb)
it "configures ELBs" do
load_balancer = elb.load_balancers.detect { |lb| lb.name == "cfrouter" }
expect(load_balancer).not_to be_nil
expect(load_balancer.subnets.sort {|s1, s2| s1.id <=> s2.id }).to eq([cf_elb1_subnet, cf_elb2_subnet].sort {|s1, s2| s1.id <=> s2.id })
expect(load_balancer.security_groups.map(&:name)).to eq(["web"])
config = Bosh::AwsCliPlugin::AwsConfig.new(aws_configuration_template)
hosted_zone = route53.hosted_zones.detect { |zone| zone.name == "#{config.vpc_generated_domain}." }
record_set = hosted_zone.resource_record_sets["\\052.#{config.vpc_generated_domain}.", 'CNAME'] # E.g. "*.midway.cf-app.com."
expect(record_set).not_to be_nil
record_set.resource_records.first[:value] == load_balancer.dns_name #???
expect(record_set.ttl).to eq(60)
end
I don't think there is any special behavior. I think you've found errors in the test code.
This doesn't work because there's no assertion, only a comparison:
@numbers.permutation(9).entries.size == 0
It would need to be written as:
@numbers.permutation(9).entries.size.should == 0
Or using the newer RSpec syntax:
expect(@numbers.permutation(9).entries.size).to eq(0)
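The difference can be reproduced without RSpec. In this plain Ruby sketch, check_equal is a hypothetical stand-in for a real matcher: the bare comparison computes a boolean and discards it, while the assertion raises when the values differ.

```ruby
# A bare comparison: the boolean result is computed and then thrown away.
def bare_comparison(list)
  list.size == 0
  :no_error_raised
end

# Hypothetical minimal assertion, standing in for `should ==` or `expect(...).to eq`.
def check_equal(actual, expected)
  raise "expected #{expected.inspect}, got #{actual.inspect}" unless actual == expected
  true
end

# The comparison inside is false, yet the method still finishes normally.
puts bare_comparison([1, 2, 3])

begin
  check_equal([1, 2, 3].size, 0)
rescue RuntimeError => e
  puts "assertion failed: #{e.message}"
end
```

Running Ruby with warnings enabled (ruby -w) typically flags such lines as a possibly useless use of == in void context, which is one way to find them in a spec suite.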

How to get more records out of the Twitter gem?

I'm banging my head trying to understand how the Twitter gem's pagination works.
I've tried max_id and cursor and they both strangely don't work.
Basically the maximum I can get out of search results is 100, and I would like to get 500.
Current code:
max_page = 5
max_id = -1
@data = []
for i in (1..max_page)
  t = twt_client.search("hello world", :count => 100, :result_type => :recent, :max_id => max_id)
  t.each do | tweet |
    @data << tweet
  end
  max_id = t.next_results[:max_id]
end
This actually tells me that next_results is a private method. Does anyone have a working solution?
Without knowing which gem you're referencing (please specify a URL), I'd say intuitively that cursor and max_id wouldn't get you what you want. However, count would. Since you say you're only retrieving 100 results and count is set to 100, that would make sense to me.
t = twt_client.search("hello world", :count => 500, :result_type => :recent, :max_id => max_id)
I'm assuming you're talking about the Twitter client referenced here. My first question is: What's twt_client and for that matter, what does its search method return? It's also possible that you've unwittingly updated the gem and there's been a code base change that makes your current script out of date.
Take a look at your installed gem version and another look at the README here.
Twitter::SearchResults#next_results is private because the gem tries to provide a uniform interface for enumeration.
Note that search_results.rb includes Twitter::Enumerable:
module Twitter
  class SearchResults
    include Twitter::Enumerable
    ...
    private
    def last?
      !next_page?
    end
    ...
    def fetch_next_page
      response = @client.send(@request_method, @path, next_page).body
      self.attrs = response
    end
    ...
  end
end
And if you look at enumerable.rb, you'll see that the methods Twitter::SearchResults#last? and Twitter::SearchResults#fetch_next_page are used by the Twitter::SearchResults#each method:
module Twitter
  module Enumerable
    include ::Enumerable
    # @return [Enumerator]
    def each(start = 0)
      return to_enum(:each, start) unless block_given?
      Array(@collection[start..-1]).each do |element|
        yield(element)
      end
      unless last?
        start = [@collection.size, start].max
        fetch_next_page
        each(start, &Proc.new)
      end
      self
    end
    ...
  end
end
Twitter::SearchResults#each keeps iterating over pages for as long as Twitter's responses contain @attrs[:search_metadata][:next_results]. So you need to break out of the iteration once you reach the 500th element.
I think you just need to use each
@data = []
tweet_number = 0
search_results = twt_client.search("hello world", count: 100, result_type: :recent)
search_results.each do |tweet|
  @data << tweet
  tweet_number += 1 # count each tweet so the break below can actually trigger
  break if tweet_number == 500
end
This post is the result of reading the gem's source and Twitter's API documentation. I may have made a mistake somewhere, since I haven't checked this in a console.
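The idea of stopping a page-by-page enumeration early can be tried without the gem. In this sketch PagedResults is a hypothetical stand-in, not the Twitter gem's class: its each fetches one simulated page at a time, and Enumerable#first(500) stops the iteration once 500 items have been collected, so later pages are never requested.

```ruby
# Hypothetical stand-in for a paginated result set (not the Twitter gem itself).
class PagedResults
  include Enumerable

  attr_reader :pages_fetched

  def initialize(page_size:, total:)
    @page_size = page_size
    @total = total
    @pages_fetched = 0
  end

  # Yields items page by page, fetching the next page only when it is needed.
  def each
    return to_enum(:each) unless block_given?
    offset = 0
    while offset < @total
      page = fetch_page(offset)
      page.each { |item| yield item }
      offset += page.size
    end
    self
  end

  private

  # Simulated network call: returns up to @page_size sequential integers.
  def fetch_page(offset)
    @pages_fetched += 1
    (offset...[offset + @page_size, @total].min).to_a
  end
end

results = PagedResults.new(page_size: 100, total: 10_000)
data = results.first(500) # stops iterating once 500 items are collected
puts data.size
puts results.pages_fetched
```

If the gem's SearchResults behaves like the each implementation quoted above, search_results.first(500) or search_results.take(500) should have the same effect as the manual counter, though I have not verified this against the gem.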
Try this (I basically only updated the calculation of the max_id in the loop):
max_page = 5
max_id = -1
@data = []
for i in (1..max_page)
  t = twt_client.search("hello world", :count => 100, :result_type => :recent, :max_id => max_id)
  t.each do | tweet |
    @data << tweet
  end
  max_id = t.to_a.map(&:id).min - 1 # max_id is inclusive, so continue just below the oldest tweet seen
end

Mongo / Ruby driver output specific number of documents at a time?

Ruby Mongo Driver question:
How do I output 5_000 document batches from the collection at a time until I read the last document in the collection without dumping the entire database into memory first?
This is a really bad method for me:
mongo = MongoClient.new('localhost', 27017)['sampledb']['samplecoll']
@whois.find.to_a....
Mongo::Collection#find returns a Mongo::Cursor that is Enumerable. For batch processing Enumerable#each_slice is your friend and well worth adding to your toolkit.
Hope that you like this.
find_each_slice_test.rb
require 'mongo'
require 'test/unit'
class FindEachSliceTest < Test::Unit::TestCase
  def setup
    @samplecoll = Mongo::MongoClient.new('localhost', 27017)['sampledb']['samplecoll']
    @samplecoll.remove
  end
  def test_find_each_slice
    12345.times{|i| @samplecoll.insert( { i: i } ) }
    slice__max_size = 5000
    @samplecoll.find.each_slice(slice__max_size) do |slice|
      puts "slice.size: #{slice.size}"
      assert(slice__max_size >= slice.size)
    end
  end
end
ruby find_each_slice_test.rb
Run options:
# Running tests:
slice.size: 5000
slice.size: 5000
slice.size: 2345
.
Finished tests in 6.979301s, 0.1433 tests/s, 0.4298 assertions/s.
1 tests, 3 assertions, 0 failures, 0 errors, 0 skips
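The same batching behaviour can be observed without a database. In this sketch an Enumerator stands in for the Mongo cursor (an assumption made purely for illustration): it produces documents one at a time, and each_slice groups them into batches of at most 5000.

```ruby
# Stand-in for a cursor: lazily yields 12,345 small documents one by one.
cursor = Enumerator.new do |yielder|
  12_345.times { |i| yielder << { 'i' => i } }
end

sizes = []
cursor.each_slice(5_000) do |slice|
  # Only the current batch is held here, never the whole collection.
  sizes << slice.size
end

p sizes
```

Because each_slice pulls from the source through each, it only ever accumulates one batch before handing it to the block.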

Ruby Test::Unit, how to know fail/pass status for each test case in a test suite?

This question sounds stupid, but I never found an answer online to do this.
Assume you have a test suite like this page:
http://en.wikibooks.org/wiki/Ruby_Programming/Unit_testing
or code:
require "simpleNumber"
require "test/unit"
class TestSimpleNumber < Test::Unit::TestCase
def test_simple
assert_equal(4, SimpleNumber.new(2).add(2) )
assert_equal(4, SimpleNumber.new(2).multiply(2) )
end
def test_typecheck
assert_raise( RuntimeError ) { SimpleNumber.new('a') }
end
def test_failure
assert_equal(3, SimpleNumber.new(2).add(2), "Adding doesn't work" )
end
end
Running the code:
>> ruby tc_simpleNumber2.rb
Loaded suite tc_simpleNumber2
Started
F..
Finished in 0.038617 seconds.
1) Failure:
test_failure(TestSimpleNumber) [tc_simpleNumber2.rb:16]:
Adding doesn't work.
<3> expected but was
<4>.
3 tests, 4 assertions, 1 failures, 0 errors
Now, how to use a variable (what kind?) to save the testing results?
e.g., an array like this:
[{:name => 'test_simple', :status => :pass},
{:name => 'test_typecheck', :status => :pass},
{:name => 'test_failure', :status => :fail},]
I am new to testing, but desperate to know the answer...
You just need to execute your test script file; the output will show which tests pass or fail.
Suppose you save the file as test_unit_to_rspec.rb; then run the command below:
ruby test_unit_to_rspec.rb
Solved the problem with setting a high verbose level, in a test runner call.
http://ruby-doc.org/stdlib-1.8.7/libdoc/test/unit/rdoc/Test/Unit/UI/Console/TestRunner.html
require 'test/unit'
require 'test/unit/ui/console/testrunner'
class MySuperSuite < Test::Unit::TestSuite
def self.suite
suites = self.new("My Super Test Suite")
suites << TestSimpleNumber1
suites << TestSimpleNumber2
return suites
end
end
#run the suite
# Pass an io object
#new(suite, output_level=NORMAL, io=STDOUT)
runner = Test::Unit::UI::Console::TestRunner.new(MySuperSuite, 3, io)
The results will be saved in the io stream in a nice format for each test case.
What about using '-v' (verbose):
ruby test_unit_to_rspec.rb -v
This should show you a lot more information
You can check out another of Nat's posts for a way to capture the results. The short answer to your question is that there is no variable for capturing the results. All you get is:
Loaded suite My Special Tests
Started
..
Finished in 1.000935 seconds.
2 tests, 2 assertions, 0 failures, 0 errors
Which is not very helpful if you want to report to someone else what happened. Nat's other post shows how to wrap the Test::Unit in rspec to get a better result and more flexibility.
class Test::Unit::TestCase
  def setup
    @id = self.class.to_s()
  end
  def teardown
    @test_result = "pass"
    if(@_result.failure_count > 0 || @_result.error_count > 0)
      @test_result = "fail"
      # making sure no errors/failures exist before the next test case runs.
      @_result.failures.clear
      @_result.errors.clear
    end # if block ended
    puts "#{@id}: #{@test_result}"
  end # teardown definition ended
end # class Test::Unit::TestCase ended
Example Output :
test1: Pass
test2: fail
so on....
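For the array of name/status hashes the question asks for, here is a rough sketch that swaps in Minitest (bundled with standard Ruby installs) rather than Test::Unit, since Minitest lets a single test be run programmatically. TestArithmetic and its two tests are made-up examples, and the approach assumes Minitest 5, where running a test instance returns an object that responds to passed?.

```ruby
require 'minitest' # note: not 'minitest/autorun', so nothing runs at exit

# Made-up test case with one passing and one failing test.
class TestArithmetic < Minitest::Test
  def test_simple
    assert_equal 4, 2 + 2
  end

  def test_failure
    assert_equal 3, 2 + 2, "Adding doesn't work"
  end
end

# Run every test_* method individually and record its outcome.
names = TestArithmetic.public_instance_methods(false).grep(/\Atest_/).sort
results = names.map do |name|
  outcome = TestArithmetic.new(name.to_s).run
  { name: name, status: outcome.passed? ? :pass : :fail }
end

p results
```

The results array can then be serialized or reported however you like; with Test::Unit itself, the teardown hook shown above remains the closest equivalent.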

Resources