We need to seed an application with 3 million entities before running performance tests.
The 3 million entities should be loaded through the application to simulate 3 years of real data.
We insert 1-5000 entities at a time. Response times are very good at first, but after a while they degrade exponentially.
We use a Groovy script to hit a URL to start each round of insertions.
Restarting the application resets the response time, i.e. it fixes the problem temporarily.
Reruns of the script, without restarting the app, have no effect.
We use the following to enhance performance:
1) Clean up GORM after every 100 insertions:
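// Flush pending inserts to the database, then evict them from the Hibernate session
def session = sessionFactory.currentSession
session.flush()
session.clear()
// Clear the per-thread map Grails uses to track domain instance properties
DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()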
(old Ted Naleid trick: http://naleid.com/blog/2009/10/01/batch-import-performance-with-grails-and-mysql)
2) We use GPars for parallel insertions:
GParsPool.withPool {
    (0..<1000).eachParallel {
        def entity = new Entity(...)
        insertionService.insert(entity)
    }
}
Notes
When looking at the log output, I've noticed that the processing time for each entity stays the same, but the system seems to pause longer and longer between iterations.
The exact number of entities inserted is not important, as long as it is around 3 million, so if some insertions fail we can ignore them.
Tuning the number of entities inserted at a time has little or no effect.
Help
I'm really hoping somebody has a good idea on how to fix the problem.
Environment
Grails: 2.4.2 (GRAILS_OPTS=-Xmx2G -Xms512m -XX:MaxPermSize=512m)
Java: 1.7.0_55
MBP: OS X 10.9.5 (2,6 GHz Intel Core i7, 16 GB 1600 MHz DDR3)
The pausing would make me think it's the JVM doing garbage collection. Have you used a profiler such as VisualVM to see what time is being spent doing garbage collection? Typically this will be the best approach to understanding what is happening with your application within the JVM.
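Short of attaching a profiler, a rough first check is to read the JVM's own GC counters between batches. The sketch below is only an illustration: the class name GcProbe and the allocation loop standing in for one batch are hypothetical, and the counters come from the standard java.lang.management API, which reports approximate accumulated collection time. It is written without lambdas so it also compiles on Java 7.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of Grails): sums the JVM's GC counters
// so they can be logged between insertion batches.
class GcProbe {

    // Approximate total milliseconds spent in GC since JVM startup.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long wallBefore = System.nanoTime();

        // Stand-in for one batch of insertions: allocate short-lived garbage.
        for (int i = 0; i < 200000; i++) {
            byte[] junk = new byte[1024];
            junk[0] = 1;
        }

        long wallMillis = (System.nanoTime() - wallBefore) / 1000000L;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.println("batch wall time: " + wallMillis + " ms, of which GC: " + gcMillis + " ms");
    }
}
```

If the GC share of each batch grows while per-entity processing time stays flat, that would support the garbage collection theory; if it stays small, the pauses are coming from somewhere else.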
Also, if you are only trying to "seed" the application, it is far better for performance to load the data directly into the database rather than through your application.
(Added as answer per comment)
Related
I just started learning MicroStream. After going through the examples published in the MicroStream GitHub repository, I wanted to test its performance with an application that deals with more data.
Application source code is available here.
Instructions to run the application and the problems I faced are available here
To summarize, these are my observations:
While loading a file with 2.8+ million records, processing takes 5 minutes
While calculating statistics based on loaded data, application fails with an OutOfMemoryError
Why is MicroStream trying to load all data (4 GB) into memory? Am I doing something wrong?
MicroStream is not like a traditional database: it starts from the concept that all data is in memory. An object graph is persisted to disk (or other media) when you store it through the StorageManager.
In your case, all data is in one list, so accessing this list reads all records from disk. The Lazy reference doesn't help the way you have used it, since it only guards access to the one list that holds all the data.
Some optimizations that you can introduce:
Split the data based on vendorId or day, using a Map<String, Lazy<List>>.
Once a Map value has been processed, remove it from memory again by clearing the lazy reference. See https://docs.microstream.one/manual/5.0/storage/loading-data/lazy-loading/clearing-lazy-references.html
Increase the number of channels to speed up reading and writing the data. See https://docs.microstream.one/manual/5.0/storage/configuration/using-channels.html
Don't store the object graph every 10000 lines; store it once, at the end of loading.
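The partitioning idea can be sketched in plain Java. Everything here is an assumption made for illustration: the Row type, its fields, and the class name VendorPartitions are hypothetical, and plain lists are used so the example runs without MicroStream on the classpath. In the real application each list would be wrapped in a Lazy reference, as suggested above, and cleared after use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VendorPartitions {

    // Hypothetical record type; the real application's fields are not shown in the post.
    static final class Row {
        final String vendorId;
        final double amount;

        Row(String vendorId, double amount) {
            this.vendorId = vendorId;
            this.amount = amount;
        }
    }

    // Groups rows by vendorId so each group can be loaded and released independently.
    // With MicroStream, each List value would be wrapped in a Lazy reference.
    static Map<String, List<Row>> partitionByVendor(List<Row> rows) {
        Map<String, List<Row>> byVendor = new HashMap<>();
        for (Row row : rows) {
            byVendor.computeIfAbsent(row.vendorId, k -> new ArrayList<>()).add(row);
        }
        return byVendor;
    }

    // Computes one statistic per partition, touching a single partition at a time.
    static Map<String, Double> totalPerVendor(Map<String, List<Row>> byVendor) {
        Map<String, Double> totals = new HashMap<>();
        for (Map.Entry<String, List<Row>> entry : byVendor.entrySet()) {
            double sum = 0.0;
            for (Row row : entry.getValue()) {
                sum += row.amount;
            }
            totals.put(entry.getKey(), sum);
            // With a Lazy-wrapped partition, this is where the reference would be cleared.
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("A", 10.0));
        rows.add(new Row("B", 5.0));
        rows.add(new Row("A", 2.5));
        Map<String, Double> totals = totalPerVendor(partitionByVendor(rows));
        System.out.println(totals.get("A") + " " + totals.get("B"));
    }
}
```

The point of the design is that a statistic only ever needs one partition in memory at a time, instead of the whole 4 GB list.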
Hope this helps you solve the issues you have at the moment
There are many performance discussions around AutoMapper out there.
My issue is that initialization simply takes too much time, even though my model is not very big compared to other applications. The following code block is what matters:
var mapperConfiguration = new MapperConfiguration(cfg =>
{
    cfg.CreateMap<..., ...>();
    cfg.CreateMap<..., ...>();
    // ... I have (just) 100 calls to cfg.CreateMap here...
});
Until yesterday I used version 5.0.2, and then I updated to the current stable version, 6.0.2. This alone made initialization some 25% faster, but that is not enough. On one customer's server this section takes about 8 seconds. We work with several worker processes, so every time such a process starts it takes another 8 seconds. That is not acceptable.
I also tested using AutoMapper Profile classes. It didn't make any difference.
Is there any way to postpone some part of the initialization nearer to the moment where mappings are actually needed for the first time?
Thanks everybody for your ideas!
:-)
I have Neo4j 1.9.4 installed on a 24-core machine with 24 GB of RAM (CentOS), and for most queries CPU usage spikes to 200% with only a few concurrent requests.
Domain:
A social application of sorts, with a few types of nodes (profiles) that have 3-30 text/array properties each, and 36 relationship types with at least 3 properties each. Most nodes currently have ~300-500 relationships.
Current data set footprint(from console):
LogicalLogSize=4294907 (32MB)
ArrayStoreSize=1675520 (12MB)
NodeStoreSize=1342170 (10MB)
PropertyStoreSize=1739548 (13MB)
RelationshipStoreSize=6395202 (48MB)
StringStoreSize=1478400 (11MB)
which is IMHO really small.
Most queries look like this one (with more or fewer WITH .. MATCH .. clauses; a few queries use variable-length relationships, but they are often fast):
START
  targetUser=node({id}),
  currentUser=node({current})
MATCH
  targetUser-[contact:InContactsRelation]->n,
  n-[:InLocationRelation]->l,
  n-[:InCategoryRelation]->c
WITH
  currentUser, targetUser,n, l,c, contact.fav is not null as inFavorites
MATCH
  n<-[followers?:InContactsRelation]-()
WITH
  currentUser, targetUser,n, l,c,inFavorites, COUNT(followers) as numFollowers
RETURN
  id(n) as id,
  n.name? as name,
  n.title? as title,
  n._class as _class,
  n.avatar? as avatar,
  n.avatar_type? as avatar_type,
  l.name as location__name,
  c.name as category__name,
  true as isInContacts,
  inFavorites as isInFavorites,
  numFollowers
It runs in ~1-3 s on the first run and in ~70 ms to 1 s on subsequent runs (depending on the query), and about 5-10 queries run for each impression. Another interesting behavior: when I run a query from the Neo4j console on my local machine many times in a row (just holding Ctrl+Enter for a few seconds), the execution time stays almost constant, but when I do the same on the server it gets exponentially slower. I guess this is somehow related to my problem.
Problem:
So my problem is that Neo4j is very CPU-greedy (on a 24-core machine that may not be an issue, but it is obviously overkill for a small project). At first I used an AWS EC2 m1.large instance, but overall performance was bad: during testing, CPU was always over 100%.
Some relevant parts of configuration:
neostore.nodestore.db.mapped_memory=1280M
wrapper.java.maxmemory=8192
Note: I already tried a configuration where all memory-related parameters were set high, and it didn't work (no change at all).
Question:
Where should I dig: configuration, schema, queries? What am I doing wrong?
If you need more info (logs, configs), just ask ;)
The reason for subsequent invocations of the same query being much faster can be easily explained by the usage of caches. A common strategy is to run a cache warmup query upon startup, e.g.
start n=node(*) match n--m return count(n)
200% CPU usage on a 24-core machine means the machine is pretty lazy, as only 2 of the 24 cores (about 8% of total capacity) are busy. While a query is in progress it's normal for the CPU to go to 100%.
The Cypher statement above uses an optional match (in the second MATCH clause). Optional matches are known to be potentially slow. Check whether the runtime changes if you make this a non-optional match.
When returning a larger result set, consider that transferring the response is limited by network speed. Consider using streaming in that case, see http://docs.neo4j.org/chunked/milestone/rest-api-streaming.html.
You also should set wrapper.java.minmemory to the same value as wrapper.java.maxmemory.
Another approach for your rather small graph is to switch off MMIO caching and use cache_type=strong to keep the full dataset in the object cache. In this case you might need to increase wrapper.java.minmemory and wrapper.java.maxmemory.
I'm using EF Code First with a model that has more than 200 entities (WinForms). When I ran my program for the first time, the first query took a long time. I then used pre-generated views to improve performance, which reduced startup time to about 12-13 seconds (before pre-generated views it was about 30 seconds). What options do I have to reduce the time of my first query?
You don't have many options. First of all, try the latest EF version (that means EF6 alpha 2), because it contains some improvements, though they may not be enough. IMHO, add a splash screen to your app and run the "first query" during application startup. WinForms applications can simply have a longer startup time if they perform complex logic. Commonly the whole application is initialized during startup so that it runs smoothly once it has started.
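The shape of that suggestion, sketched in Java to stay in one language for the added examples (in a WinForms app the same pattern would use Task.Run): start the expensive first query on a background thread as soon as the process launches, show the splash screen meanwhile, and only open the main window once the warm-up has finished. The names StartupWarmup and firstQuery are hypothetical, and the sleep merely stands in for EF's model initialization.

```java
import java.util.concurrent.CompletableFuture;

class StartupWarmup {

    // Hypothetical stand-in for the expensive first query that builds the model.
    static String firstQuery() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "model ready";
    }

    // Kicks off the warm-up on a background thread and returns immediately.
    static CompletableFuture<String> startWarmup() {
        return CompletableFuture.supplyAsync(StartupWarmup::firstQuery);
    }

    public static void main(String[] args) {
        CompletableFuture<String> warmup = startWarmup();
        System.out.println("showing splash screen...");
        // Block only at the point where the main window actually needs the model.
        String result = warmup.join();
        System.out.println(result + ", opening main window");
    }
}
```

This does not make the first query any cheaper; it only moves the cost to a moment where the user expects to wait.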
[Prescript: I know that nothing here is specific to Delayed::Job. But it helps establish the context.]
update
I believe the SQL queries are not being garbage collected. My application generates many large SQL insert/update operations (160K bytes each, about 1 per second) and sends them to PostgreSQL via:
ActiveRecord::Base.connection.execute(my_large_query)
When I perform these db operations, my application slowly grows without bound. When I stub out the db operations (but perform all the other functions in my app) the bloating stops.
So: any ideas on why this is happening, how I can pinpoint it, or how I can make it stop?
original question
I have delayed tasks that slurp data from the web and create records in a PostgreSQL database. They seem to be working okay, but they start at vmemsize=100M, bulk up to vmemsize=500M within ten minutes, and just keep growing. My MacBook Pro with 8G of RAM starts thrashing when the VM runs out.
How can I find where the memory is going?
Before you refer me to other SO posts on the topic:
I've added the following to my #after(job) method:
def after(job)
  clss = [Object, String, Array, Hash, ActiveRecord::Base, ActiveRecord::Relation]
  clss.each {|cls| object_report(cls, " pre-gc")}
  ObjectSpace.each_object(ActiveRecord::Relation).each(&:reset)
  GC.start
  clss.each {|cls| object_report(cls, "post-gc")}
end
def object_report(cls, msg)
  log(sprintf("%s: %9d %s", msg, ObjectSpace.each_object(cls).count, cls))
end
It reports usage on the fundamental classes, explicitly resets ActiveRecord::Relation objects (suggested by this SO post), explicitly does a GC (as suggested by this SO post), and reports on how many Objects / Strings / Arrays / Hashes, etc there are (as suggested by this SO post). For what it's worth, none of those classes are growing significantly. (Are there other classes I should be looking at? But wouldn't that be reflected in the number of Objects anyway?)
I can't use memprof because I'm running Ruby 1.9.
And there are other tools that I'd consider if I were running on Linux, but I'm on OS X.
update
I'm afraid this was all a red herring: left running long enough, each ruby job grows to a vmsize of about 1.2GB (yeah, that big, but not huge by today's standards), then shrinks back down to 850MB and bobbles between those two values thereafter without continuing to grow bigger.
My real problem was that I was trying to run more than four such processes on my machine with 8GB RAM, which filled up all available RAM and then went into swapping hypoxia. Running only four processes almost fills up available memory, so the system doesn't start swapping.
update 2
Nope, still a problem -- I didn't let the jobs run long enough: the jobs grow continually (albeit slowly). Even running just two external jobs eventually consumes all VM and my machine starts thrashing.
I tried running in production mode (thinking that dev mode may cache things that don't get freed), but it didn't make any appreciable difference.