Mongoid query caching in Ruby

Rails' ActiveRecord has a feature called query caching (ActiveRecord::QueryCache) which saves the results of SQL queries for the lifespan of a request. While I'm not very familiar with the internals of the implementation, I think it saves the query results somewhere in the Rack env, which is discarded at the end of the request.
Mongoid, unfortunately, doesn't currently provide such a feature, and this is exacerbated by the fact that some queries occur implicitly (references).
I'm considering implementing this feature, and I'm curious where and how Mongoid (or perhaps the Mongo driver?) should be hooked in order to implement it.
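For context, the behaviour being asked about can be sketched in plain Ruby: a per-request cache keyed by query, discarded when the request ends. Everything here (class and method names) is a hypothetical illustration, not Mongoid or ActiveRecord API:

```ruby
# A minimal request-scoped query cache: results are memoized per key
# and thrown away when the request finishes. Hypothetical sketch only.
class RequestQueryCache
  def initialize
    @store = {}
  end

  # Return the cached result for `key`, or run the block and cache it.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end

  # Called at the end of the request (e.g. from Rack middleware).
  def clear
    @store.clear
  end
end

cache = RequestQueryCache.new
calls = 0
2.times { cache.fetch("people.find(1)") { calls += 1; :doc } }
calls # => 1 -- the second lookup hit the cache
```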

Mongoid has caching, described at http://mongoid.org/en/mongoid/docs/extras.html
MongoDB itself also has caching abilities: http://www.mongodb.org/display/DOCS/Caching
The Mongoid caching extra distinguishes two cases: caching all queries of a model, or caching a single query.
Mongoid caching seems to work slightly differently: it looks like Mongoid delegates caching to MongoDB. (In the Mongoid sources I can only find option settings for caching, but no cache module.)
Finally, I would say there is no real difference in caching in general -- in memory is in fact in memory, no matter whether it's in the app or in the database.
I wouldn't implement an extra caching algorithm, because this seems redundant and a RAM killer.
BTW: if you really want to cache results in-app, you could try Rails.cache or another cache gem as a workaround.

The other answer is obviously wrong. Neither Mongoid nor the Mongo driver caches queries, and even if MongoDB did, the cache might still be on another machine across the network.
My solution was to wrap receive_message in Mongo::Connection.
Pros: one definite place
Cons: deserialization still takes place
require 'mongo'

module Mongo
  class Connection
    module QueryCache
      extend ActiveSupport::Concern

      module InstanceMethods
        # Enable the selector cache within the block.
        def cache
          @query_cache ||= {}
          old, @query_cache_enabled = @query_cache_enabled, true
          yield
        ensure
          clear_query_cache
          @query_cache_enabled = old
        end

        # Disable the selector cache within the block.
        def uncached
          old, @query_cache_enabled = @query_cache_enabled, false
          yield
        ensure
          @query_cache_enabled = old
        end

        def clear_query_cache
          @query_cache.clear
        end

        def cache_receive_message(operation, message)
          @query_cache[operation] ||= {}
          key = message.to_s.hash
          log = "[MONGO] CACHE %s"
          if entry = @query_cache[operation][key]
            Mongoid.logger.debug log % 'HIT'
            entry
          else
            Mongoid.logger.debug log % 'MISS'
            @query_cache[operation][key] = yield
          end
        end

        def receive_message_with_cache(operation, message, log_message = nil, socket = nil, command = false)
          if query_cache_enabled
            cache_receive_message(operation, message) do
              receive_message_without_cache(operation, message, log_message, socket, command)
            end
          else
            receive_message_without_cache(operation, message, log_message, socket, command)
          end
        end
      end # module InstanceMethods

      included do
        alias_method_chain :receive_message, :cache
        attr_reader :query_cache, :query_cache_enabled
      end
    end # module QueryCache
  end # class Connection
end

Mongo::Connection.send(:include, Mongo::Connection::QueryCache)
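The cache/uncached block semantics of the patch can be exercised standalone with a stripped-down stand-in for the connection; this is not the actual driver, just a mirror of the same logic with the network round trip stubbed out:

```ruby
# Standalone mirror of the patch's cache semantics, for illustration.
class FakeConnection
  attr_reader :query_cache, :query_cache_enabled, :hits, :misses

  def initialize
    @query_cache = {}
    @query_cache_enabled = false
    @hits = 0
    @misses = 0
  end

  # Mirror of the patch's `cache` block: enable, yield, then clear.
  def cache
    old, @query_cache_enabled = @query_cache_enabled, true
    yield
  ensure
    @query_cache.clear
    @query_cache_enabled = old
  end

  # Mirror of receive_message_with_cache, with the server call stubbed.
  def receive_message(operation, message)
    return expensive_query(message) unless query_cache_enabled
    @query_cache[operation] ||= {}
    key = message.hash
    if (entry = @query_cache[operation][key])
      @hits += 1
      entry
    else
      @misses += 1
      @query_cache[operation][key] = expensive_query(message)
    end
  end

  private

  def expensive_query(message)
    { 'result' => message } # stands in for the real MongoDB round trip
  end
end

conn = FakeConnection.new
conn.cache { 3.times { conn.receive_message(:query, 'people.find(1)') } }
[conn.misses, conn.hits] # => [1, 2]
```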

OK, Mongoid 4 supports a QueryCache middleware.
Just add it in application.rb:
config.middleware.use "Mongoid::QueryCache::Middleware"
And then profit:
MOPED: 127.0.0.1:27017 QUERY database=XXX collection=page_variants selector={"$query"=>{"_id"=>BSON::ObjectId('5564dabb6d61631e21d70000')}, "$orderby"=>{:_id=>1}} flags=[] limit=-1 skip=0 batch_size=nil fields=nil runtime: 0.4397ms
MOPED: 127.0.0.1:27017 QUERY database=XXX collection=page_variants selector={"$query"=>{"_id"=>BSON::ObjectId('5564dacf6d61631e21dc0000')}, "$orderby"=>{:_id=>1}} flags=[] limit=-1 skip=0 batch_size=nil fields=nil runtime: 0.4590ms
QUERY CACHE database=XXX collection=page_variants selector={"$query"=>{"_id"=>BSON::ObjectId('5564c9596d61631e21d30000')}, "$orderby"=>{:_id=>1}}
QUERY CACHE database=XXX collection=page_variants selector={"$query"=>{"_id"=>BSON::ObjectId('5564dabb6d61631e21d70000')}, "$orderby"=>{:_id=>1}}
Source:
Mongoid changelog
https://github.com/mongoid/mongoid/blob/master/CHANGELOG.md#new-features-2
3410 Mongoid now has a query cache that can be used as a middleware in Rack applications. (Arthur Neves)
For Rails:
config.middleware.use(Mongoid::QueryCache::Middleware)

Mongoid 4.0+ now has a QueryCaching module: http://www.rubydoc.info/github/mongoid/mongoid/Mongoid/QueryCache
You can use it on finds by wrapping your lookups like so:
Mongoid::QueryCache.cache { MyCollection.find("xyz") }

Related

Rails with mutex on class variable, rake task and cron

Sorry for such a big question. I don't have much experience with Rails threads and mutexes.
I have a class, shown below, which is used by different controllers to get the license for each customer.
Customers and their licenses get added and removed every hour. An API is available to get all customers and their licenses.
I plan to create a rake task that calls update_set_customers_licenses, run hourly via a cron job.
I have the following questions:
1) Even with a mutex, there is currently a potential problem: my rake task can run while an update is in progress. Any idea how to solve this?
2) My design below writes the JSON out to a file; this is done for safety, as the API is not that reliable. As can be seen, the file is never read back, so in essence the file write is useless. I tried to implement a file read, but together with the mutex and the rake task it gets really confusing. Any pointers would help here.
class Customer
  @@customers_to_licenses_hash = nil
  @@last_updated_at = nil
  @@mutex = Mutex.new

  CUSTOMERS_LICENSES_FILE = "#{Rails.root}/tmp/customers_licenses"

  def self.cached_license_with_customer(customer)
    Rails.cache.fetch('customer') { self.license_with_customer(customer) }
  end

  def self.license_with_customer(customer)
    @@mutex.synchronize do
      # guard against the first call, when the hash is still nil
      license = @@customers_to_licenses_hash && @@customers_to_licenses_hash[customer]
      if license
        return license
      elsif @@customers_to_licenses_hash.nil? || Time.now.utc - @@last_updated_at > 1.hour
        updated = self.update_set_customers_licenses
        return @@customers_to_licenses_hash[customer] if updated
      else
        return nil
      end
    end
  end

  def self.update_set_customers_licenses
    updated = nil
    file_write = File.open(CUSTOMERS_LICENSES_FILE, 'w')
    results = self.get_active_customers_licenses
    if results
      @@customers_to_licenses_hash = results
      file_write.print(results.to_json)
      @@last_updated_at = Time.now.utc
      updated = true
    end
    file_write.close
    updated
  end

  def self.get_active_customers_licenses
    # HTTP GET through the API
    # returns a hash of records
  end
end
I'm pretty sure it's the case that every time Rails loads, the environment is "fresh" and has no concept of state shared between instances. That is to say, a mutex in one Ruby instance (one request to Rails) has no effect on a second Ruby instance (another request to Rails, or in this case, a rake task).
If you follow the data upstream, you'll find that the common root of every instance, which can be used to synchronize them, is the database. You could use transactional blocks, or maybe a manual flag that you set and unset in the database.
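The cross-process coordination itself can also be illustrated with an OS-level advisory file lock (a different mechanism from a database flag, and only valid when the app and the rake task share one host; all names here are hypothetical):

```ruby
require 'tmpdir'

# Cross-process mutual exclusion via an advisory file lock.
# Both the Rails app and the rake task would call with_file_lock
# around the license update so they cannot interleave.
LOCK_PATH = File.join(Dir.tmpdir, 'customer_licenses.lock')

def with_file_lock(path = LOCK_PATH)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX) # blocks until no other process holds the lock
    begin
      yield
    ensure
      f.flock(File::LOCK_UN)
    end
  end
end

# Usage (in either the controller path or the rake task):
# with_file_lock { Customer.update_set_customers_licenses }
```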

Ruby Mongo: How to disable logging on certain collections?

I am using Mongoid for interacting with MongoDB. In development I usually like to see the logs of what Mongo is doing. However, there is one instance where there is an excessive amount of redundant logging that I simply don't want to see. How can I disable logging in this specific case?
There doesn't appear to be any clean way to control logging in Mongoid on a per-collection basis.
However, for your purposes, if you can identify the individual calls, you can turn off logging temporarily by raising the log level.
Hope this is good enough for your purposes.
require 'test_helper'

def dont_log(temp_level = Logger::Severity::UNKNOWN)
  logger = Rails.logger
  old_level, logger.level = logger.level, temp_level
  yield
ensure
  # restore the level even if the block raises
  logger.level = old_level
end

class LogDoTest < ActiveSupport::TestCase
  def setup
    LogDo.delete_all
    LogDont.delete_all
  end

  test "log do dont" do
    LogDo.create(text: 'Log me')
    dont_log { LogDont.create(text: 'Dont log me') }
  end
end
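The same trick can be exercised outside Rails with a plain Logger writing to an in-memory buffer (the helper name `silence` is mine, not from the answer):

```ruby
require 'logger'
require 'stringio'

buffer = StringIO.new
logger = Logger.new(buffer)
logger.level = Logger::DEBUG

# Same idea as dont_log above, but taking the logger explicitly.
def silence(logger, temp_level = Logger::Severity::UNKNOWN)
  old_level, logger.level = logger.level, temp_level
  yield
ensure
  logger.level = old_level
end

logger.info('visible')
silence(logger) { logger.info('suppressed') }
logger.info('visible again')

buffer.string.include?('suppressed') # => false
```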

Ruby and Timeout.timeout performance issue

I'm not sure how to solve a big performance issue in my application. I'm using open-uri to request the most popular videos from YouTube, and when I ran perftools (https://github.com/tmm1/perftools.rb),
it showed that the biggest performance issue is Timeout.timeout. Can anyone suggest how to solve this?
I'm using Ruby 1.8.7.
Edit:
This is the output from my profiler
https://docs.google.com/viewer?a=v&pid=explorer&chrome=true&srcid=0B4bANr--YcONZDRlMmFhZjQtYzIyOS00YjZjLWFlMGUtMTQyNzU5ZmYzZTU4&hl=en_US
Timeout is wrapping the function that is actually doing the work to ensure that if the server fails to respond within a certain time, the code will raise an error and stop execution.
I suspect that what you are seeing is that the server is taking some time to respond. You should look at caching the response in some way.
For instance, using memcached (pseudocode)
require 'dalli'
require 'open-uri'

DALLI = Dalli::Client.new

class PopularVideos
  def self.get
    result = []
    unless result = DALLI.get("videos_#{Date.today.to_s}")
      doc = open("http://youtube/url")
      result = parse_videos(doc) # parse the doc somehow
      DALLI.set("videos_#{Date.today.to_s}", result)
    end
    result
  end
end
PopularVideos.get # calls your expensive parsing script once
PopularVideos.get # gets the result from memcached for the rest of the day

Best way to cache a response in Sinatra?

I'm building a simple app on the side using an API I made with Sinatra that returns some JSON. It's quite a bit of JSON; my app's API relies on a few hundred requests to other APIs.
I can probably cache the results for 5 days or so, no problem with the data at all. I'm just not 100% sure how to implement the caching. How would I go about doing that with Sinatra?
Personally, I prefer Redis for this type of thing over memcached. I have an app that uses Redis pretty extensively, in a similar way to what you described. If I make a call that is not cached, page load time is upwards of 5 seconds; with Redis, the load time drops to around 0.3 seconds. You can set an expiry time as well, which can be changed quite easily. I would do something like this to retrieve the data from the cache:
require 'redis'

get '/my_data/:id' do
  redis = Redis.new
  if cached = redis[params[:id]]
    # the cached value is the JSON body itself, so serve it directly
    content_type 'application/json'
    cached
  end
end
Then when you want to save the data to the cache, perhaps something like this:
require 'redis'

redis = Redis.new
# <make API calls here and build your JSON>
redis[id] = json
redis.expire(id, 3600 * 24 * 5) # five days
A file-based cache is another option:
get '/my_data/:id' do
  # security check for file-based caching
  raise "invalid id" if params[:id] =~ /[^a-z0-9]/i
  cache_file = File.join("cache", params[:id])
  if !File.exist?(cache_file) || (File.mtime(cache_file) < (Time.now - 3600 * 24 * 5))
    data = do_my_few_hundred_internal_requests(params[:id])
    File.open(cache_file, "w") { |f| f << data }
  end
  send_file cache_file, :type => 'application/json'
end
Don't forget to mkdir cache.
Alternatively you could use memcache-client, but it would require you to install memcached system-wide.

Case insensitive like (ilike) in Datamapper with Postgresql

We are using Datamapper in a Sinatra application and would like to use case insensitive like that works on both Sqlite (locally in development) and Postgresql (on Heroku in production).
We have statements like
TreeItem.all(:name.like =>"%#{term}%",:unique => true,:limit => 20)
If term is "BERL" we get the suggestion "BERLIN" from both the SQLite and PostgreSQL backends. However, if term is "Berl" we only get that result from SQLite, not PostgreSQL.
I guess this has to do with the fact that both dm-postgres-adapter and dm-sqlite-adapter output a LIKE in the resulting SQL query. Since PostgreSQL has a case-sensitive LIKE, we get this (for us unwanted) behavior.
Is there a way to get a case-insensitive like in Datamapper, without resorting to a raw SQL query to the adapter or patching the adapter to use ILIKE instead of LIKE?
I could of course use something in between, such as:
TreeItem.all(:conditions => ["name LIKE ?","%#{term}%"],:unique => true,:limit => 20)
but then we would be tied to the use of Postgresql within our own code and not just as a configuration for the adapter.
By writing my own DataObjects adapter that overrides the like_operator method, I managed to get Postgres' case-insensitive ILIKE.
require 'do_postgres'
require 'dm-do-adapter'

module DataMapper
  module Adapters
    class PostgresAdapter < DataObjectsAdapter
      module SQL #:nodoc:
        private

        # @api private
        def supports_returning?
          true
        end

        def like_operator(operand)
          'ILIKE'
        end
      end

      include SQL
    end

    const_added(:PostgresAdapter)
  end
end
Eventually I however decided to port the application in question to use a document database.
For other people using DataMapper who want support for ilike as well as 'similar to' in PostgreSQL: https://gist.github.com/Speljohan/5124955
Just drop that into your project, and then use it like in these examples:
Model.all(:column.ilike => '%foo%')
Model.all(:column.similar => '(%foo%)|(%bar%)')
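One portable middle ground, an assumption on my part rather than something from these answers, is to lower-case both sides of the comparison, so the query behaves the same on SQLite and PostgreSQL:

```ruby
# Build a case-insensitive LIKE condition by down-casing both sides.
# TreeItem and term are the names used in the question; the helper
# name is hypothetical.
def insensitive_like_conditions(column, term)
  ["LOWER(#{column}) LIKE ?", "%#{term.downcase}%"]
end

# Usage (hypothetical call site):
# TreeItem.all(:conditions => insensitive_like_conditions('name', 'Berl'),
#              :unique => true, :limit => 20)
```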
