I want to store objects in JRuby for a short time. The objects contain procs, so I'm having trouble serializing them to a database. If anyone has good ideas for how to persist JRuby objects for 1-5 minutes, that would be great.
These objects are quite large: specifically, Celerity browser objects.
For now, I have created a model in JRuby like so:
class Persist
  @@persistHash ||= Hash.new

  def self.storeItem(id, item)
    @@persistHash[id.to_s] = item
  end

  def self.getItem(id)
    return @@persistHash[id.to_s]
  end
end
I have warbled the app (packaged it with Warbler) and deployed it to GlassFish v2.
I run the program and it works fine for a while, but after about a day, a 'get' made 10-20 seconds after a 'store' returns nil.
I can't find any errors in logs.
EDIT: I have also found that the item is indeed inserted into the hash (the hash did not run out of memory during the insert):
Before 24 hrs:
Persist.storeItem() followed by Persist.getItem() works fine.
An HTTP call to store, then another HTTP call to get, returns the object.
After 24 hrs:
Persist.storeItem() followed by Persist.getItem() works fine.
An HTTP call to store, then another HTTP call to get, returns nil.
I don't see the object being deleted at any point.
I would examine the JVM using other tools. It could very well be that you've exhausted memory, but the log message/exception reporting the exhaustion never gets created.
May I suggest hooking up JMX monitoring of the various heap regions, and creating a means by which the hash can be purged of old objects.
Wish I had more for you. Good luck!
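To make the purging suggestion concrete, here is a minimal sketch of a thread-safe store with per-entry timestamps; the TimedStore name and the 300-second TTL are illustrative, not from the original code:

```ruby
# A small thread-safe store that remembers when each item was added,
# so old entries can be purged instead of accumulating forever.
class TimedStore
  TTL = 300 # seconds; illustrative value for the 1-5 minute window

  def initialize
    @items = {}       # id => [item, stored_at]
    @mutex = Mutex.new
  end

  def store(id, item)
    @mutex.synchronize { @items[id.to_s] = [item, Time.now] }
  end

  def get(id)
    @mutex.synchronize do
      entry = @items[id.to_s]
      entry && entry[0]
    end
  end

  # Call this periodically (from a background thread, a timer, or a
  # JMX-triggered hook) to drop entries older than TTL.
  def purge!
    cutoff = Time.now - TTL
    @mutex.synchronize { @items.delete_if { |_, (_, at)| at < cutoff } }
  end
end
```

This keeps the hash from growing without bound, which also makes any remaining memory pressure easier to spot in the JMX heap graphs.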
I'm building a very simple app using Sinatra. I am not required to use persistent storage so I'm not using a database; however, I want to keep an object that contains a record of all my transactions. The object should not reinitialize when there is a new HTTP request.
I have tried putting an @transactions variable into an initialize method, and I've tried set :transactions, Transactions.new, both in my controller (neither of which worked). I just tried
configure do
  @@transactions = Transactions.new
end
and it's still saying the object is nil (the Transactions initialize method takes no params and initializes all instance variables, so nothing should be nil).
Are there other ways to accomplish this?
What you want is persistent storage. I know you're trying to avoid it, but you should perhaps just bite the bullet.
You could invent something from scratch, for example a JSON file on the file system, but that's not really going to save you any time. Better is to use a memory-based store like Memcached, or an actual database like SQLite.
Using a class variable is a dead end: class variables are not thread safe, and they will not persist after the program ends.
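If you do bite the bullet, Ruby's standard library already ships a tiny file-backed store, PStore, which is enough for a transaction log. A minimal sketch (the TransactionLog class and file name are illustrative):

```ruby
require 'pstore'

# A minimal file-backed transaction log using Ruby's stdlib PStore.
# The state lives on disk, so it survives new requests and restarts.
class TransactionLog
  def initialize(path = 'transactions.pstore')
    @store = PStore.new(path)
  end

  def record(tx)
    @store.transaction do
      (@store[:transactions] ||= []) << tx
    end
  end

  def all
    # true = read-only transaction
    @store.transaction(true) { @store[:transactions] || [] }
  end
end
```

In a Sinatra app you would create one TransactionLog (e.g. in a configure block) and call record from your routes; unlike a class variable, nothing is lost when the process restarts.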
Context
I'm making a game in Ruby. It has a model named Character with attributes like energy and money. A Character can have a behavior, e.g. sleeping. Finally, Character has a tick method that calls the method for its current behavior, e.g. sleep!. Simplified, it looks like this:
class Character
  attr_accessor :energy

  def initialize(behavior)
    @current_behavior = behavior
  end

  def tick
    self.send "#{@current_behavior}!"
  end

  private

  def sleep!
    self.energy -= 1 if energy > 0
  end
end
As in a lot of games, the tick method of every Character needs to be invoked every n minutes. I want to use EventMachine for this: in a periodic timer it calls Character.tick_all, which should invoke the tick method of every Character instance.
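Character.tick_all is not shown above, so here is one possible shape for it, sketched as a class-level registry on a trimmed-down Character (the registry and the energy default are assumptions, not from the original code):

```ruby
# A trimmed-down Character with a class-level registry, so that
# Character.tick_all (called from a periodic timer, e.g.
# EM.add_periodic_timer(60) { Character.tick_all }) can reach every instance.
class Character
  attr_accessor :energy

  @instances = []

  class << self
    def register(character)
      @instances << character
    end

    def tick_all
      @instances.each(&:tick)
    end
  end

  def initialize(behavior, energy = 10)
    @current_behavior = behavior
    @energy = energy
    self.class.register(self)
  end

  def tick
    send "#{@current_behavior}!"
  end

  private

  def sleep!
    self.energy -= 1 if energy > 0
  end
end
```

A persistence hook could then go at the end of tick (write the changed instance) while all reads happen against the in-memory registry.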
Question
What kind of persistence engine can I use for this? On server startup I want to load all Character instances into memory so they can be ticked. For now it's OK if every instance gets persisted after its state changes because of a tick.
I've tried it with Rails and ActiveRecord, but that requires at least one write and one read for every tick, which seems like overkill.
Edit
I've looked into SuperModel. It seems to do exactly what I want, but its last commit was about a year ago...
For simple lookups and storage of data in memory, there are two choices as far as I know:
Memcached: one of the simplest key-value stores around; it lets you easily SET and GET values associated with keys. However, as soon as you kill the process, everything in memory is flushed/destroyed, which is a problem if you're storing user data.
Redis: Redis provides all the simple key-value functionality of Memcached and much more. Redis is great at working with hashes, and it provides sets (no duplicates stored) and sorted sets (e.g. for scoring and ranking items). Also, Redis can persist its data to the file system, which is great when you need to restart a process once in a while.
Common practice in Ruby is to send an id instead of the object itself to workers. Isn't that a CPU-consuming approach, since we have to retrieve the object again from the database?
Several reasons:
It saves space on the queue, and transfer time (app => queue, queue => workers).
It is often better to fetch a fresh object from the database than to work with a stale copy retrieved from the queue.
Arguments to Resque.enqueue must be JSON-serializable, and complex objects cannot always be serialized.
If you think about it the reasons are pretty obvious:
your object may change between the time the action is queued and the time it is handled, and in general you don't want an outdated object.
an id is a lot lighter to transport than a whole object, which you would need to serialize to JSON/YAML or some other format.
if you need the associations too, the problem just got even worse :)
But in the end it depends on your application: if you only need some of the information, you can send it to your worker directly without using the full model at all.
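Both arguments are easy to see in a quick sketch: a job payload carrying only the id stays tiny and never goes stale, because the worker refetches the current record. (The DB hash and perform method below are stand-ins; a real Resque worker would call something like Article.find(id).)

```ruby
require 'json'

# Stand-in "database": id => record. In a real app this would be the
# actual database, queried via ActiveRecord or similar.
DB = {
  42 => { 'id' => 42, 'title' => 'Old title', 'body' => 'x' * 1_000 }
}

# What a worker does when handed only an id: refetch the fresh row.
def perform(id)
  DB.fetch(id)
end

# The two possible queue payloads: just an id vs. the whole object.
id_payload     = JSON.generate('id' => 42)
object_payload = JSON.generate(DB[42])

# id_payload is a handful of bytes; object_payload carries the whole body.
# And if the record changes after the job was enqueued, the id-based
# worker still sees the latest state rather than the stale copy:
DB[42]['title'] = 'New title'
latest = perform(42)['title']
```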
My C# 3.5 application uses SQL Server 2008 R2, NHibernate and Castle Project ActiveRecord. The application imports emails to a database along with their attachments. Emails and attachments are saved in batches of 50, each batch in a new session and transaction scope, to make sure they are not all held in memory (some mailboxes contain 100K emails).
Initially, emails are saved very quickly. However, around 20K emails, performance degrades dramatically. Using dotTrace I got the following picture:
Apparently, when I save an attachment, NHibernate checks whether it really needs to be saved, probably comparing it with other attachments in the session. To do so, it compares them byte by byte, which takes almost 500 seconds (for the snapshot in the picture) and 600M enumerator operations.
All this looks crazy, especially since I know for sure that SaveAndFlush should save the attachment without any checks: I know it is new and should be saved.
However, I cannot figure out how to instruct NHibernate to avoid this check (IsUpdateNecessary). Please advise.
P.S. I am not sure, but it may be that the performance degradation around 20K is not caused by having older emails in memory: I noticed that in the mailbox I am working with, larger emails are stored later than smaller ones, so the problem may lie only in the attachment comparison.
Update:
Looks like I need something like StatelessSessionScope, but there is no documentation on it, even on the Castle Project site! If I do something like
using (TransactionScope txScope = new TransactionScope())
using (StatelessSessionScope scope = new StatelessSessionScope())
{
    mail.Save();
}
it fails with an exception saying that Save is not supported by a stateless session. I am supposed to insert objects into a session, but I do not have any Session (only SessionScope, which adds to SessionScope only a single OpenSession method that accepts strange parameters).
Maybe I missed it in that long text, but are you using a stateless session for importing the data? Using one prevents a lot of checks and also bypasses the first-level cache, thus using minimal resources.
Looks like I've found an easy solution: for my Attachment class, which caused the biggest performance penalty, I overrode the following method:
protected override int[] FindDirty(
    object id,
    System.Collections.IDictionary previousState,
    System.Collections.IDictionary currentState,
    NHibernate.Type.IType[] types)
{
    return new int[0];
}
Thus, the dirty check never finds any dirty fields, and that crazy per-byte comparison is skipped.
I have a Sinatra app that basically takes some input values and then finds data matching those values from external services like Flickr, Twitter, etc.
For example:
input:"Chattanooga Choo Choo"
would go out and find images of the Chattanooga Choo Choo on Flickr, tweets from Twitter, etc.
Right now I have something like:
@images = Flickr::...find...images..
@tweets = Twitter::...find...tweets...
@results << @images
@results << @tweets
So my question is: is there an efficient way in Ruby to run those requests concurrently, instead of waiting for the images to finish before fetching the tweets?
Threads would work, but they're a crude tool. You could try something like this:
flickr_thread = Thread.start do
  @flickr_result = ... # make the Flickr request
end

twitter_thread = Thread.start do
  @twitter_result = ... # make the Twitter request
end

# this makes the main thread wait for the other two threads
# before continuing with its execution
flickr_thread.join
twitter_thread.join

# now both @flickr_result and @twitter_result have
# their values (unless an error occurred)
You'd have to tinker a bit with the code, though, and add proper error detection. I can't remember offhand whether instance variables work when first assigned inside the thread block; local variables wouldn't, unless they were explicitly declared outside.
I wouldn't call this an elegant solution, but I think it works, and it's not too complex. In this case there is luckily no need for locking or synchronizations apart from the joins, so the code reads quite well.
Perhaps a tool like EventMachine (in particular the em-http-request subproject) might help you, if you do a lot of things like this. It could probably make it easier to code at a higher level. Threads are hard to get right.
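For instance, the thread version above can be tightened up by letting each thread return its result and surfacing errors through Thread#value, which joins the thread and re-raises any exception from its block (fetch_images/fetch_tweets are placeholders for the real API calls):

```ruby
# Placeholders for the real Flickr/Twitter calls.
def fetch_images(query)
  ["image of #{query}"]
end

def fetch_tweets(query)
  ["tweet about #{query}"]
end

def fetch_all(query)
  # Start both requests concurrently; each thread's block value
  # is its result, so no shared instance variables are needed.
  threads = {
    images: Thread.new { fetch_images(query) },
    tweets: Thread.new { fetch_tweets(query) }
  }

  results = {}
  threads.each do |name, t|
    begin
      # Thread#value waits for the thread and re-raises any
      # exception that was raised inside its block.
      results[name] = t.value
    rescue => e
      results[name] = { error: e.message } # one failure doesn't sink the rest
    end
  end
  results
end
```

Calling fetch_all("Chattanooga Choo Choo") then yields a hash with both result sets, and a failure in one service is reported per-section rather than crashing the whole request.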
You might consider making a client-side change: use asynchronous Ajax requests to fetch each type (images, tweets) independently. The problem with server threads (one of them, anyway) is that if one service hangs, the entire request hangs waiting for that thread to finish. With Ajax, you can load an images section, a twitter section, etc., and if one hangs the others will still show their results; eventually you can time out the requests and show a fail whale or something in that section only.
Yes, why not threads?
As I understand it, as soon as the user submits the form you want to process all the requests in parallel, right? You could have one multithreaded controller (Ruby's thread support works really well) where you receive the request, execute the external service queries in parallel, and answer back in one response; or, on the client side, you send one Ajax post per service and process each one separately (maybe each external service gets its own controller/action?).
http://github.com/pauldix/typhoeus (parallel/concurrent HTTP requests)
Consider using YQL for this. It supports subqueries, so you can pull everything you need with a single call (even client-side) that just spits out JSON of what you need to render. There are tons of tutorials out there already.