PostgreSQL queries are gobbling up VM - ruby

[Prescript: I know that nothing here is specific to Delayed::Job. But it helps establish the context.]
update
I believe the SQL queries are not being garbage collected. My application generates many large SQL insert/update operations (160K bytes each, about 1 per second) and sends them to PostgreSQL via:
ActiveRecord::Base.connection.execute(my_large_query)
When I perform these db operations, my application slowly grows without bound. When I stub out the db operations (but perform all the other functions in my app) the bloating stops.
So: any ideas on why this is happening, how I can pinpoint it, or how I can make it stop?
original question
I have delayed tasks that slurp data from the web and create records in a PostgreSQL database. They seem to be working okay, but they start at vmemsize=100M and within ten minutes bulk up to vmemsize=500M and just keeps growing. My MacBook Pro with 8G of RAM starts thrashing when the VM runs out.
How can I find where the memory is going?
Before you refer me to other SO posts on the topic:
I've added the following to my #after(job) method:
def after(job)
clss = [Object, String, Array, Hash, ActiveRecord::Base, ActiveRecord::Relation]
clss.each {|cls| object_report(cls, " pre-gc")}
ObjectSpace.each_object(ActiveRecord::Relation).each(&:reset)
GC.start
clss.each {|cls| object_report(cls, "post-gc")}
end
def object_report(cls, msg)
log(sprintf("%s: %9d %s", msg, ObjectSpace.each_object(cls).count, cls))
end
It reports usage on the fundamental classes, explicitly resets ActiveRecord::Relation objects (suggested by this SO post), explicitly does a GC (as suggested by this SO post), and reports on how many Objects / Strings / Arrays / Hashes, etc there are (as suggested by this SO post). For what it's worth, none of those classes are growing significantly. (Are there other classes I should be looking at? But wouldn't that be reflected in the number of Objects anyway?)
I can't use memprof because I'm running Ruby 1.9.
And there are other tools that I'd consider if I were running on Linux, but I'm on OS X.

update
I'm afraid this was all a red herring: left running long enough, each ruby job grows to a vmsize of about 1.2GB (yeah, that big, but not huge by today's standards), then shrinks back down to 850MB and bobbles between those two values thereafter without continuing to grow bigger.
My real problem was that I was trying to run more than four such processes on my machine with 8GB RAM, which filled up all available RAM and then went into swapping hypoxia. Running only four processes almost fills up available memory, so the system doesn't start swapping.
update 2
Nope, still a problem -- I didn't let the jobs run long enough: the jobs grow continually (albeit slowly). Even running just two external jobs eventually consumes all VM and my machine starts thrashing.
I tried running the in production mode (thinking that dev mode may cache things that don't get freed), but it didn't make any appreciable difference.

Related

Performance decays exponentially when inserting bulk data into grails app

We need to seed an application with 3 million entities before running performance tests.
The 3 million entities should be loaded through the application to simulate 3 years of real data.
We are inserting 1-5000 entities at a time. In the beginning response times are very good. But after a while they decay exponentially.
We use at groovy script to hit a URL to start each round of insertions.
Restarting the application resets the response time - i.e. fixes the problem temporally.
Reruns of the script, without restarting the app, have no effect.
We use the following to enhance performance
1) Cleanup GORM after each 100 insertions:
def session = sessionFactory.currentSession
session.flush()
session.clear()
DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
(old Ted Naleid trick: http://naleid.com/blog/2009/10/01/batch-import-performance-with-grails-and-mysql)
2) We use GPars for parallel insertions:
GParsPool.withPool {
(0..<1000).eachParallel {
def entity = new Entity(...)
insertionService.insert(entity)
}
}
Notes
When looking at the log output, I've noticed that the processing time for each entity are the same, but the system seems to pause longer and longer between each iteration.
The exact number of entities inserted are not important, just around 3 mill, so if some fail we can ignore it.
Tuning the number of entities at a time have little or no effect.
Help
I'm really hoping somebody have a good idea on how to fix the problem.
Environment
Grails: 2.4.2 (GRAILS_OPTS=-Xmx2G -Xms512m -XX:MaxPermSize=512m)
Java: 1.7.0_55
MBP: OS X 10.9.5 (2,6 GHz Intel Core i7, 16 GB 1600 MHz DDR3)
The pausing would make me think it's the JVM doing garbage collection. Have you used a profiler such as VisualVM to see what time is being spent doing garbage collection? Typically this will be the best approach to understanding what is happening with your application within the JVM.
Also, it's far better to load the data directly into the database rather than using your application if you are trying to "seed" the application. Performance wise of course.
(Added as answer per comment)

how to index tons of data at once with Rails, (re)tire, json without eating (all) memory?

In a Rails 3.2.x app, using (Re)tire to access an ES cluster a rake task is going through approx 1M rows to create a new index. (Ruby 1.9.3).
The task is using .to_json with specific attributes and methods listed to limit the resulting hash for each element.
Yet as the task run the memory is eaten away, ending with the process being killed usually by the system.
The task is already using find_by_batch. Smaller batches sizes (using find_each) don't help.
checking without index
Removing the index.import call does improve things (obviously). The task goes through the whole collection very fast without a problem. Pointing to either ES, tire or the JSON conversion (and the relations it might call upon).
reducing the scope of the task
Adding back index.import and passing a very limited hash (with string keys) for each item does make things slower but not too much and does not eat memory away. So json might no be the culprit here.
adding attributes and methods back
The culprit seems to be one of the method used to grab one of the additional attributes. It's based on a relation of the model and another ... Ending up with a lot of models being involved and sifted through.
As pointed out by Index the results of a method in ElasticSearch (Tire + ActiveRecord) adding includes does help a bit but the task does end up heavy too.
going around
I also tried to go around part of the problem and replace the calls to Tire with the use of ES bulk API.
Generating json files and sending them with a Ruby http lib can work. Yet, the same problem arise : memory since the same requests to the DB are made.
What's left ?
What I don't get is why even with the find_by_batch Ruby keeps eating away memory. I would expect that after each batch of data, memory related that batch would be freed.
Next to try : GC.start calls, Active Record caching de activation around the tasks.
Yet, except if a solution limiting the memory use drastically (300 or 500Mo instead of 800+) the background issue is : indexing a lot of instances of a Model including data related to some other models.
am I missing something for the import and includes that would solve the issue ?
would splitting that task into smaller background jobs (resque, sidekiq) help ? I would suppose so as each batch would be isolated from the others and once treated, really free up the memory (?) (orchestrating those tasks would be another trouble)
is there good practices related to indexing big quantities of data into ES ?
I've been using Rails + Elasticsearch for a while and did this kind of dance a few times.
A few things comes to mind, in no particular order.
Did you try to use the recent elasticsearch gem (instead of tire) ? I've updated my apps to use and like having more control on what is done.
I would also try to force a GC sweep after each ActiveRecord loop. You could also be extra careful with memory allocation by explicitly resetting all local variables each time.
You could use the fork & exec trick to fork a brand new process at each loop, it would be the most effective GC you can get. It's a little overhead when you write it the first time, but the pay-off is great. Take good care of limiting the amount of memory used in the outer part of the task. Using a process-based background task would partly achieve the same goal, but you might still get memory bloat.
Can you limit the use of ActiveRecord? If you need some basic associations you could use a lower-level/simpler tool like Sequel (or else) to use Ruby hashes/arrays instead of full fledged AR models.

How to deactivate safe mode in the mongo shell?

Short question is on the title: I work with my mongo Shell wich is in safe mode by default, and I want to gain better performance by deactivating this behaviour.
Long Question for those willing to know the context:
I am working on a huge set of data like
{
_id:ObjectId("azertyuiopqsdfghjkl"),
stringdate:"2008-03-08 06:36:00"
}
and some other fields and there are about 250M documents like that (whole database with the indexes weights 36Go). I want to convert the date in a real ISODATE field. I searched a bit how I could make an update query like
db.data.update({},{$set:{date:new Date("$stringdate")}},{multi:true})
but did not find how to make this work and resolved myself to make a script that take the documents one after the other and make an update to set a new field which takes the new Date(stringdate) as its value. The query use the _id so the default index is used.
Problem is that it takes a very long time. I already figured out that if only I had inserted empty dates object when I created the database I would now get better performances since there is the problem of data relocation when a new field is added. I also set an index on a relevant field to process the database chunk by chunk. Finally I ran several concurrent mongo clients on both the server and my workstation to ensure that the limitant factor is the database lock availability and not any other factor like cpu or network costs.
I monitored the whole thing with mongotop, mongostats and the web monitoring interfaces which confirmed that write lock is taken 70% of the time. I am a bit disappointed mongodb does not have a more precise granularity on its write lock, why not allowing concurrent write operations on the same collection as long as there is no risk of interference? Now that I think about it I should have sharded the collection on a dozen shards even while staying on the same server, because there would have been individual locks on each shard.
But since I can't do a thing right now to the current database structure, I searched how to improve performance to at least spend 90% of my time writing in mongo (from 70% currently), and I figured out that since I ran my script in the default mongo shell, every time I make an update, there is also a getLastError() which is called afterwards and I don't want it because there is a 99.99% chance of success and even in case of failure I can still make an aggregation request after the end of the big process to retrieve the single exceptions.
I don't think I would gain so much performance by deactivating the getLastError calls, but I think itis worth trying.
I took a look at the documentation and found confirmation of the default behavior, but not the procedure for changing it. Any suggestion?
I work with my mongo Shell wich is in safe mode by default, and I want to gain better performance by deactivating this behaviour.
You can use db.getLastError({w:0}) ( http://docs.mongodb.org/manual/reference/method/db.getLastError/ ) to do what you want but it won't help.
This is because for one:
make a script that take the documents one after the other and make an update to set a new field which takes the new Date(stringdate) as its value.
When using the shell in a non-interactive mode like within a loop it doesn't actually call getLastError(). As such downing your write concern to 0 will do nothing.
I already figured out that if only I had inserted empty dates object when I created the database I would now get better performances since there is the problem of data relocation when a new field is added.
I did tell people when they asked about this stuff to add those fields incase of movement but instead they listened to the guy who said "leave them out! They use space!".
I shouldn't feel smug but I do. That's an unfortunately side effect of being right when you were told you were wrong.
mongostats and the web monitoring interfaces which confirmed that write lock is taken 70% of the time
That's because of all the movement in your documents, kinda hard to fix that.
I am a bit disappointed mongodb does not have a more precise granularity on its write lock
The write lock doesn't actually denote the concurrency of MongoDB, this is another common misconception that stems from the transactional SQL technologies.
Write locks in MongoDB are mutexs for one.
Not only that but there are numerous rules which dictate that operations will subside to queued operations under certain circumstances, one being how many operations waiting, another being whether the data is in RAM or not, and more.
Unfortunately I believe you have got yourself stuck in between a rock and hard place and there is no easy way out. This does happen.

Neo4j 1.9.4 (REST Server,CYPHER) performance issue

I have Neo4j 1.9.4 installed on 24 core 24Gb ram (centos) machine and for most queries CPU usage spikes goes to 200% with only few concurrent requests.
Domain:
some sort of social application where few types of nodes(profiles) with 3-30 text/array properties and 36 relationship types with at least 3 properties. Most of nodes currently has ~300-500 relationships.
Current data set footprint(from console):
LogicalLogSize=4294907 (32MB)
ArrayStoreSize=1675520 (12MB)
NodeStoreSize=1342170 (10MB)
PropertyStoreSize=1739548 (13MB)
RelationshipStoreSize=6395202 (48MB)
StringStoreSize=1478400 (11MB)
which is IMHO really small.
most queries looks like this one(with more or less WITH .. MATCH .. statements and few queries with variable length relations but the often fast):
START
targetUser=node({id}),
currentUser=node({current})
MATCH
targetUser-[contact:InContactsRelation]->n,
n-[:InLocationRelation]->l,
n-[:InCategoryRelation]->c
WITH
currentUser, targetUser,n, l,c, contact.fav is not null as inFavorites
MATCH
n<-[followers?:InContactsRelation]-()
WITH
currentUser, targetUser,n, l,c,inFavorites, COUNT(followers) as numFollowers
RETURN
id(n) as id,
n.name? as name,
n.title? as title,
n._class as _class,
n.avatar? as avatar,
n.avatar_type? as avatar_type,
l.name as location__name,
c.name as category__name,
true as isInContacts,
inFavorites as isInFavorites,
numFollowers
it runs in ~1s-3s(for first run) and ~1s-70ms (for consecutive and it depends on query) and there is about 5-10 queries runs for each impression. Another interesting behavior is when i try run query from console(neo4j) on my local machine many consecutive times(just press ctrl+enter for few seconds) it has almost constant execution time but when i do it on server it goes slower exponentially and i guess it somehow related with my problem.
Problem:
So my problem is that neo4j is very CPU greedy(for 24 core machine its may be not an issue but its obviously overkill for small project). First time i used AWS EC2 m1.large instance but over all performance was bad, during testing, CPU always was over 100%.
Some relevant parts of configuration:
neostore.nodestore.db.mapped_memory=1280M
wrapper.java.maxmemory=8192
note: I already tried configuration where all memory related parameters where HIGH and it didn't worked(no change at all).
Question:
Where to digg? configuration? scheme? queries? what i'm doing wrong?
if need more info(logs, configs) just ask ;)
The reason for subsequent invocations of the same query being much faster can be easily explained by the usage of caches. A common strategy is to run a cache warmup query upon startup, e.g.
start n=node(*) match n--m return count(n)
200% CPU usage on a 24 core means the machine is pretty lazy as only 2 cores are busy. When a query is in progress it's normal that CPU goes to 100% while running.
The Cypher statement above uses an optional match (in the 2nd match clause). These optional matches are known as being potentially slow. Check out if runtime changes if you make this a non-optional match.
When returning a larger result set consider that transferring the response is driven by network speed. Consider using streaming in the case, see http://docs.neo4j.org/chunked/milestone/rest-api-streaming.html.
You also should set wrapper.java.minmemory to the same value as wrapper.java.maxmemory.
Another approach for your rather small graph is to switch off MMIO caching and use cache_type=strong to keep the full dataset in the object cache. In this case you might need to increas wrapper.java.minmemory and wrapper.java.maxmemory.

PL/SQL - check for memory leaks?

I have some PL/SQL code that I think might have a memory leak. Everytime I run it it seems to run slower and slower than the time before, even though now I am decreasing the input size. The code that I'm suspicious of is populating an array from a cursor using bulk-collect, something like this
open c_myCursor(in_key);
fetch c_myCursor bulk collect into io_Array; /*io_array is a parameter, declared as in out nocopy */
close c_myCursor;
I'm not sure how to check to see what's causing this slowdown. I know there are some tables in Oracle that track this kind of memory usage, but I'm not sure if it's possible to look at those tables and find my way back to something useful about what my code is doing.
Also, I tried logging out the session and logging back in after about 10-15 minutes, still very slow.
Oracle version is 10.2
So it turns out there was other database activity. The DBA decided to run some large insert and update jobs at about the same time I started changing and testing code. I suspected my code was the root cause because I hadn't been told about the other jobs running (and I only heard about this other job after it completely froze everything and all the other devs got annoyed). That was probably why my code kept getting slower and slower.
Is there a way to find this out programmatically, such as querying for a session inserting/updating lots of data, just in case the DBA forgets to tell me the next time he does this?
v$sessmetric is a quick way to see what resources each session is using - cpu, physical_reads, logical_reads, pga_memory, etc.
"I tried logging out the session and logging back in after about 10-15 minutes, still very slow."
Assuming you are using a conventional dedicated connection on a *nix platform, this would pretty much rule out any memory leak. When you make a new connection to a database, oracle will fork off a new process for it and all the PGA memory will belong to that process and it will get released (by the OS) when the session is disconnected and the process terminated.
If you are using shared server connections then the session uses memory belonging to both the process but also the shared memory. This would probably be more vulnerable to any memory leak problem.
Windows doesn't work quite the same way, as it doesn't fork a separate process for each session, but rather has a separate thread under a single Oracle process. Again, I'd suspect this would be more vulnerable to a memory leak.
I'd generally look for other issues first, and probably start at the query underlying c_myCursor. Maybe it has to read through more old data to get to the fresh data ?
http://www.dba-oracle.com/t_plsql_dbms_profiler.htm describes DBMS_PROFILER. I suppose that the slowest parts of your code can be connected to memory leak. Anyway if you go back to the original problem, that it goes slower and slower, then the first thing to do is to see what is slow, and then to suppose memory leak.
It sounds like you do no commit between executions, and the redo log is larger and larger. Probably this is the cause that DB needs to provide read consistency.
You can also check the enterprise management console. Which version do you use? Never use XE for development, since as far as I know professional version can be used for development purposes. The enterprise management console even give you suggestions. Maybe it can tell you something clever about your PLSQL problem.
If your query returns very much data your collection can grow enormously large, say 10 000 000 records - that can be the point of the suspicious memory usage.
You can check this on by logging the size of the collection you bulk collect into. If it's larger that 10 000 (just a rough estimate, this depends on data of course) you may consider to split and work with parts of data, smth like this:
declare
cursor cCur is select smth from your_table;
--
type TCur is table of cCur%rowtype index by pls_integer;
--
fTbl TCur;
begin
open cCur;
loop
fTbl.delete;
fetch cCur bulk collect into fTbl limit 10000;
exit when cCur%notfound;
for i in 1 .. fTbl.count loop
--do your wok here
end loop;
end loop;
close cCur;
end;
Since you said that table is declared as in out nocopy I understand that you can't directly rewrite logic like this but just consider the methodology, maybe this can help you.

Resources