I was evaluating the speed of H2 earlier today, and I noticed a significant slowdown when making many subsequent queries. I did a quick CPU profile with JMX and I noticed that the vast majority of CPU time was spent in the FileLock.sleep() method. I debugged the code while several hundred INSERT statements were made, and these calls almost entirely stem from this line in the FileLock.lockFile() method:
save();
sleep(SLEEP_GAP);
FileLock.SLEEP_GAP is a static final int set to 25, so it can't be tuned at all (please don't suggest that I use reflection; if you think that will work, I'd encourage you to read this answer). This method is invoked and causes the main thread to sleep for 25 ms on every single INSERT statement made. If you have tens of thousands of them to execute, it really adds up to a lot of wasted time. Why is this value set this way? Is there any way around having to use this class?
Source code, if you don't feel like getting it out of SVN.
FileLock.sleep() is used while opening the database file, to ensure no other process can open the same database file at the same time (similar to file locking). Such a mechanism is used by most database engines. If you see this in the profiling, then that means the database is opened and closed many times in a row. Opening and closing a database is quite slow and should be avoided.
If possible, databases should be kept open, by keeping the connection open or by using a connection pool.
If that's not an option, then append ;DB_CLOSE_DELAY=1 to the database URL. This will keep the database file open for one second after closing the last connection.
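As a rough illustration of both suggestions, here is a minimal Java sketch. It assumes a plain JDBC setup; the class name, the ITEMS table and its NAME column are hypothetical and not taken from the question. The first helper appends the DB_CLOSE_DELAY setting to a URL, the second runs all inserts on one connection that is already open, so the database file is not reopened for every statement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class H2KeepOpenSketch {

    // Appends the DB_CLOSE_DELAY setting mentioned above to an H2 database URL.
    static String withCloseDelay(String url, int seconds) {
        return url + ";DB_CLOSE_DELAY=" + seconds;
    }

    // Runs every insert on the same open connection, so the database is opened
    // (and FileLock is taken) once instead of once per statement.
    // The table ITEMS and the column NAME are hypothetical.
    static int insertAll(Connection conn, List<String> names) throws SQLException {
        int inserted = 0;
        try (PreparedStatement ps = conn.prepareStatement("INSERT INTO ITEMS(NAME) VALUES (?)")) {
            for (String name : names) {
                ps.setString(1, name);
                inserted += ps.executeUpdate();
            }
        }
        return inserted;
    }

    public static void main(String[] args) {
        // Only prints the URL; opening a connection would need the H2 driver on the classpath.
        System.out.println(withCloseDelay("jdbc:h2:~/test", 1));
    }
}
```

The caller would obtain the connection once (for example from a pool) and pass it to insertAll, rather than opening a new one per INSERT.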
At startup, our application checks for the presence of certain tables, sequences and a few other things. We had programmed that in a straightforward way, like so:
...
MetaData meta = connection.getMetaData();
...
ResultSet tables = meta.getTables(...);
... <checking for the presence of specific tables>
ResultSet sequences = meta.getSequences(...);
... <checking for the presence of specific sequences>
etc.
...
While the initial connection.getMetaData() call has so far always had a sub-second duration, after moving to a bigger, more powerful and shared Oracle DB server this call now reproducibly takes more than 5 minutes(!). This time is added directly to the startup time of our application, which has more than quadrupled as a result, and that is of course a big no-go!
Any idea why this JDBC call takes so long on one system but not on another? And are there any options or settings that could speed this up? Both databases report as "Oracle Database 11g Release 11.2.0.4.0 - 64bit Production". Both servers are in our intranet, so network-wise they should be similarly reachable. The new one is much more powerful CPU- and RAM-wise and is configured in a fail-over config (i.e. the connection URL contains 2 servers in case one is down or not reachable). The old one was a simple one-machine setup.
Anything else that could be relevant to this or explain why that call now takes that much longer?
Addendum:
We tried to debug into the method (but didn't get very far). But the culprit seems to be in DatabaseMetadata.initSequences(), i.e. it seems that the fetching of the sequences is the part that takes so long on this server while it took split-seconds on the other. Any wisdom what could be causing this?
We found the culprit of our slow Metadata query!
The reason is that during initialization we set the comparison mode to LINGUISTIC, i.e. we execute a:
alter session set nls_comp=LINGUISTIC;
With that setting active the retrieval of sequences (as part of the getMetaData()) takes ~5 minutes! If we leave it at the default (which is alter session set nls_comp=BINARY) then the fetching of the metadata takes ~1 second only!
Apparently that comparison mode leads to a full table scan, which causes this crazy query duration. However, we need this comparison mode, since otherwise many of our queries don't yield matches (in our case names, company names, addresses, etc.) that contain accented characters.
We "fixed" this issue by switching the comparison mode at a later point in the application after we have completed the startup checks that verify the presence of certain tables and sequences, etc.
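The ordering described above could look roughly like the following Java sketch. It is an illustration under assumptions, not our actual code: the class and method names are hypothetical, only a table check via the standard DatabaseMetaData.getTables() call is shown (the sequence check is left out), and the ALTER SESSION statement is the one quoted earlier.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class StartupOrderSketch {

    // Startup check that still runs under the default (BINARY) comparison mode.
    static boolean tableExists(Connection conn, String schema, String table) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        try (ResultSet rs = meta.getTables(null, schema, table, new String[] {"TABLE"})) {
            return rs.next();
        }
    }

    // Switches the session to linguistic comparison, as required by the later queries.
    static void enableLinguisticComparison(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute("ALTER SESSION SET NLS_COMP=LINGUISTIC");
        }
    }

    // Metadata checks first, comparison mode second: the order is the whole point.
    static boolean startUp(Connection conn, String schema, String requiredTable) throws SQLException {
        boolean present = tableExists(conn, schema, requiredTable);
        enableLinguisticComparison(conn);
        return present;
    }
}
```

Any session that already has NLS_COMP set to LINGUISTIC before the checks run would still hit the slow path, so the switch has to happen on the same connection and strictly after the checks.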
If someone knows an approach to speed up the metadata creation even in the presence of a non-default comparison mode (e.g. can one create a special index or similar), please let me know!
Obviously there must be some additional setting involved here, because - as mentioned - on our previous DB-server the MetaData-fetch had taken ~1 second only even with that mode set to LINGUISTIC.
The short question is in the title: I work with the mongo shell, which is in safe mode by default, and I want to gain better performance by deactivating this behaviour.
Long Question for those willing to know the context:
I am working on a huge set of data like
{
_id:ObjectId("azertyuiopqsdfghjkl"),
stringdate:"2008-03-08 06:36:00"
}
and some other fields, and there are about 250M documents like that (the whole database with its indexes weighs 36 GB). I want to convert the date into a real ISODate field. I searched a bit for how I could make an update query like
db.data.update({},{$set:{date:new Date("$stringdate")}},{multi:true})
but did not find how to make this work, so I resigned myself to writing a script that takes the documents one after the other and makes an update to set a new field which takes new Date(stringdate) as its value. The query uses the _id, so the default index is used.
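For reference, the per-document conversion itself is small. The following is a minimal Java sketch of it, not the script I actually ran: the class and method names are hypothetical, and it assumes the stored strings are in UTC, which may not match how the shell's new Date() interprets them.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class StringDateSketch {

    // Matches the stored format, e.g. "2008-03-08 06:36:00".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Parses the string and interprets it as UTC (an assumption, see above).
    static Instant toInstant(String stringdate) {
        return LocalDateTime.parse(stringdate, FORMAT).toInstant(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(toInstant("2008-03-08 06:36:00"));
    }
}
```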
The problem is that it takes a very long time. I already figured out that if only I had inserted empty date objects when I created the database, I would now get better performance, since there is the problem of data relocation when a new field is added. I also set an index on a relevant field to process the database chunk by chunk. Finally, I ran several concurrent mongo clients on both the server and my workstation to ensure that the limiting factor is the database lock availability and not any other factor like CPU or network costs.
I monitored the whole thing with mongotop, mongostat and the web monitoring interfaces, which confirmed that the write lock is taken 70% of the time. I am a bit disappointed mongodb does not have a more precise granularity on its write lock: why not allow concurrent write operations on the same collection as long as there is no risk of interference? Now that I think about it, I should have sharded the collection across a dozen shards even while staying on the same server, because there would have been individual locks on each shard.
But since I can't do a thing right now to the current database structure, I searched for how to improve performance so that I spend at least 90% of my time writing in mongo (up from 70% currently). I figured out that since I run my script in the default mongo shell, every time I make an update there is also a getLastError() call afterwards. I don't want it, because there is a 99.99% chance of success, and even in case of failure I can still make an aggregation request after the end of the big process to retrieve the individual exceptions.
I don't think I would gain that much performance by deactivating the getLastError calls, but I think it is worth trying.
I took a look at the documentation and found confirmation of the default behavior, but not the procedure for changing it. Any suggestion?
I work with the mongo shell, which is in safe mode by default, and I want to gain better performance by deactivating this behaviour.
You can use db.getLastError({w:0}) ( http://docs.mongodb.org/manual/reference/method/db.getLastError/ ) to do what you want but it won't help.
This is because for one:
make a script that takes the documents one after the other and makes an update to set a new field which takes new Date(stringdate) as its value.
When the shell is used in a non-interactive mode, such as within a loop, it doesn't actually call getLastError(). As such, lowering your write concern to 0 will do nothing.
I already figured out that if only I had inserted empty date objects when I created the database, I would now get better performance, since there is the problem of data relocation when a new field is added.
I did tell people, when they asked about this stuff, to add those fields in case of movement, but instead they listened to the guy who said "leave them out! They use space!".
I shouldn't feel smug but I do. That's an unfortunate side effect of being right when you were told you were wrong.
mongostat and the web monitoring interfaces, which confirmed that the write lock is taken 70% of the time
That's because of all the movement in your documents, kinda hard to fix that.
I am a bit disappointed mongodb does not have a more precise granularity on its write lock
The write lock doesn't actually denote the concurrency of MongoDB; this is another common misconception that stems from the transactional SQL technologies.
For one, write locks in MongoDB are mutexes.
Not only that, but there are numerous rules which dictate that operations will yield to queued operations under certain circumstances, one being how many operations are waiting, another being whether the data is in RAM or not, and more.
Unfortunately I believe you have got yourself stuck in between a rock and hard place and there is no easy way out. This does happen.
I have some PL/SQL code that I think might have a memory leak. Every time I run it, it seems to run slower than the time before, even though I am now decreasing the input size. The code that I'm suspicious of populates an array from a cursor using bulk collect, something like this:
open c_myCursor(in_key);
fetch c_myCursor bulk collect into io_Array; /*io_array is a parameter, declared as in out nocopy */
close c_myCursor;
I'm not sure how to check to see what's causing this slowdown. I know there are some tables in Oracle that track this kind of memory usage, but I'm not sure if it's possible to look at those tables and find my way back to something useful about what my code is doing.
Also, I tried logging out the session and logging back in after about 10-15 minutes, still very slow.
Oracle version is 10.2
So it turns out there was other database activity. The DBA decided to run some large insert and update jobs at about the same time I started changing and testing code. I suspected my code was the root cause because I hadn't been told about the other jobs running (and I only heard about this other job after it completely froze everything and all the other devs got annoyed). That was probably why my code kept getting slower and slower.
Is there a way to find this out programmatically, such as querying for a session inserting/updating lots of data, just in case the DBA forgets to tell me the next time he does this?
v$sessmetric is a quick way to see what resources each session is using - cpu, physical_reads, logical_reads, pga_memory, etc.
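To check this programmatically, one option is to poll that view from code. The sketch below is a hedged Java example: the class and method names are hypothetical, the metric columns are the ones listed above, and the session_id column is an assumption about the view that should be checked against your Oracle version. The account running it also needs SELECT access to the view.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

final class SessMetricSketch {

    // Only the metric columns named in the answer above are accepted.
    private static final List<String> METRICS =
            Arrays.asList("cpu", "physical_reads", "logical_reads", "pga_memory");

    // Builds a query that lists sessions, heaviest first by the chosen metric.
    // session_id is assumed to be the identifying column of v$sessmetric.
    static String topSessionsSql(String metric) {
        if (!METRICS.contains(metric)) {
            throw new IllegalArgumentException("unknown metric: " + metric);
        }
        return "select session_id, " + String.join(", ", METRICS)
                + " from v$sessmetric order by " + metric + " desc";
    }

    // Prints one line per session for the chosen metric.
    static void printTopSessions(Connection conn, String metric) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(topSessionsSql(metric))) {
            while (rs.next()) {
                System.out.println(rs.getLong("session_id") + " " + metric + "=" + rs.getString(metric));
            }
        }
    }
}
```

A session that suddenly tops the list for physical_reads or logical_reads is a reasonable hint that a large insert or update job is running.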
"I tried logging out the session and logging back in after about 10-15 minutes, still very slow."
Assuming you are using a conventional dedicated connection on a *nix platform, this would pretty much rule out any memory leak. When you make a new connection to a database, Oracle will fork off a new process for it, and all the PGA memory will belong to that process and will be released (by the OS) when the session is disconnected and the process terminated.
If you are using shared server connections, then the session uses memory belonging to both the process and the shared memory. This would probably be more vulnerable to any memory leak problem.
Windows doesn't work quite the same way, as it doesn't fork a separate process for each session, but rather has a separate thread under a single Oracle process. Again, I'd suspect this would be more vulnerable to a memory leak.
I'd generally look for other issues first, and probably start at the query underlying c_myCursor. Maybe it has to read through more old data to get to the fresh data?
http://www.dba-oracle.com/t_plsql_dbms_profiler.htm describes DBMS_PROFILER. I suppose that the slowest parts of your code could be connected to a memory leak. In any case, if you go back to the original problem, that it gets slower and slower, then the first thing to do is to find out what is slow, and only then to suspect a memory leak.
It sounds like you do not commit between executions, so the redo log grows larger and larger. This is probably the cause, since the DB needs to provide read consistency.
You can also check the enterprise management console. Which version do you use? Never use XE for development since, as far as I know, the professional version can be used for development purposes. The enterprise management console even gives you suggestions. Maybe it can tell you something clever about your PL/SQL problem.
If your query returns a lot of data, your collection can grow enormously large, say 10 000 000 records; that could be the source of the suspicious memory usage.
You can check this by logging the size of the collection you bulk collect into. If it's larger than 10 000 (just a rough estimate; this depends on the data, of course), you may consider splitting the data and working on it in parts, something like this:
declare
  cursor cCur is select smth from your_table;
  --
  type TCur is table of cCur%rowtype index by pls_integer;
  --
  fTbl TCur;
begin
  open cCur;
  loop
    fTbl.delete;
    -- fetch the next batch of at most 10000 rows
    fetch cCur bulk collect into fTbl limit 10000;
    -- exit on an empty batch; testing cCur%notfound here would skip the last, partial batch
    exit when fTbl.count = 0;
    for i in 1 .. fTbl.count loop
      null; -- do your work here
    end loop;
  end loop;
  close cCur;
end;
Since you said that table is declared as in out nocopy I understand that you can't directly rewrite logic like this but just consider the methodology, maybe this can help you.
If I am reading one of my application settings from the web.config every time each of my ASP.NET pages loads, would it be a performance issue? I'm concerned about memory too.
It's not great, but in the context of serving up a page, it's just a drop in the bucket. It's not nearly as bad as reading it over and over in a loop, hundreds of times per page view. Lots of pages do things like look up previous visit info (user preferences, cookie tracking, etc.) which usually requires opening a database connection and running a query. So hitting the config file is small potatoes.
You also have to consider how often this really happens. A thousand times per hour? Don't waste your time. A thousand per minute? Still probably not a problem (a database query would probably be a different story, though). A thousand times per second? Then you've got reason to try to optimize this.
I don't think I'd worry about it. It is a very small file, and reading from it is very fast.
If it concerns you that much, read it into an Application variable, and reference that throughout the app instead.
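The read-once idea is not specific to ASP.NET. As an analogy only, here is a small Java sketch of it; the class and method names are hypothetical, and the loader stands in for whatever actually reads the setting from the config file.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class CachedSettingSketch {

    // Wraps a loader so that the underlying read happens at most once;
    // later calls return the cached value.
    static <T> Supplier<T> readOnce(Supplier<T> loader) {
        return new Supplier<T>() {
            private T value;
            private boolean loaded;

            @Override
            public synchronized T get() {
                if (!loaded) {
                    value = loader.get();
                    loaded = true;
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger reads = new AtomicInteger();
        // The lambda is a stand-in for reading the setting from the config file.
        Supplier<String> setting = readOnce(() -> {
            reads.incrementAndGet();
            return "value-from-config";
        });
        setting.get();
        setting.get();
        System.out.println("config reads: " + reads.get());
    }
}
```

The trade-off is the usual one for caches: a change to the file is not picked up until the cached value is discarded, for example on application restart.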
I have a certain web application that makes upwards of ~100 updates to an Oracle database in succession. This can take anywhere from 3-5 minutes, which sometimes causes the webpage to time out. A re-design of the application is scheduled soon but someone told me that there is a way to configure a "loader file" which loads the schema into memory and runs the transactions there instead of on the hard drive, supposedly improving speed by several orders of magnitude. I have tried to research this "loader file" but all I can find is information about the SQL* bulk data loader. Does anyone know what he's talking about? Is this really possible and is it a feasible quick fix or should I just wait until the application is re-designed?
Oracle already does its work in memory; disk I/O is managed behind the scenes. Frequently accessed data stays in memory in the buffer cache. Perhaps your informant was referring to "pinning" an object in memory, but that's really not effective in the modern releases of Oracle (since V8), particularly for table data. Let Oracle do its job: it's actually very good at it (probably better than we are). Face it, 100K updates is going to take a while.