MongoDB file allocator takes a long time

MongoDB file allocator takes a long time - Windows

When MongoDB creates a new file under the data directory, the allocation takes a long time:
Line 376: Thu Jan 15 18:01:49.407 [FileAllocator] allocating new datafile >\data\db\test.3, filling with zeroes...
Line 476: Thu Jan 15 18:03:55.650 [FileAllocator] done allocating datafile >\data\db\test.3, size: 512MB, took 126.242 secs
Because of that, Node gives the error below, and afterwards Node is not able to connect to MongoDB.
{ "error":"{ err: 'connection to [localhost:27017] timed out' }","level":"error","message":"uncaught exception: ","timestamp":"2015-01-15T20:45:03.702Z"}
My understanding is that this error is coming from the MongoMQ library. I am not sure how I can handle it. Can anyone help with this issue?

Windows Answer
The most obvious possibility is that you are using Windows 7 or Windows Server 2008. An issue with those operating systems (SERVER-8480), which can be fixed by applying this hotfix, means that the data files allocated by MongoDB must be filled with zeroes.
That is a lengthy process when compared to the normal method. Unfortunately, even with the hotfix installed, with versions 2.6 and 2.4 MongoDB assumes the problem is still there on Windows 7 or Server 2008 and zero fills anyway. Version 2.8+ fixes the problem by detecting the hotfix specifically and reverting to not zero filling.
To give you an idea of the difference, here is a sample log line from 2.8.0-rc5 (which detects the fix) on Windows 7:
2015-01-22T16:56:51.749+0000 I STORAGE [FileAllocator] done allocating datafile E:\data\db\280\test.2, size: 2047MB, took 0.016 secs
And here is a sample log line from the same machine doing the same allocation with version 2.6.5:
2015-01-22T16:47:33.762+0000 [FileAllocator] done allocating datafile E:\data\db\265\test.2, size: 2047MB, took 112.071 secs
That's 112.071 seconds versus 0.016 seconds. Windows 8/2012 or 2.8+ (once released, and with the hotfix installed of course) seem to be the way to go here if allocation is causing problems for you.
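To compare allocations across your own logs, a small parser can turn each "done allocating datafile" line into a throughput figure. This is a minimal sketch that assumes the log format matches the sample lines quoted above; the function name `allocation_rate` is made up for illustration and is not part of MongoDB.

```python
import re

# Pattern for the "done allocating datafile" lines quoted above.
# The format is taken from those samples; other versions may differ.
DONE_RE = re.compile(
    r"done allocating datafile (?P<path>\S+), "
    r"size: (?P<size_mb>\d+)MB,\s+took (?P<secs>[\d.]+) secs"
)


def allocation_rate(line):
    """Return (path, size_mb, seconds, mb_per_sec), or None if no match."""
    match = DONE_RE.search(line)
    if match is None:
        return None
    size_mb = int(match.group("size_mb"))
    secs = float(match.group("secs"))
    # Guard against a reported duration of zero.
    rate = size_mb / secs if secs > 0 else float("inf")
    return match.group("path"), size_mb, secs, rate
```

For the two sample lines above this works out to roughly 18 MB/s with zero filling versus well over 100,000 MB/s without it, and the 512MB allocation in the question comes to about 4 MB/s.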
There are also several more general known issues with versions of MongoDB prior to 2.6.4 on Windows, most notably SERVER-13729 and SERVER-13681 which were addressed with 2.6.4.
The remaining issues are being tracked in SERVER-12401 and are dependent on this hotfix from Microsoft to improve flushing of memory mapped files by the OS. Unfortunately, that hotfix is only available for Windows 8 and 2012, it has not been made available for Windows 7 and 2008.
Hence, make sure you are using 2.6.4+ at a minimum and, if possible, Windows 8 or 2012. It may also be helpful to compare any performance seen on remote storage with a local disk to determine if that is a contributing factor.
Linux Answer
(preserved in case anyone else ends up here when using Linux)
This should only take a few milliseconds, if you are using a supported filesystem. I suspect you are using something else (ext3 perhaps?), or perhaps using a very old version of Linux.
The supported filesystems use fallocate() to allocate the data files, which is very quick. If this is not supported, then the files must be allocated by filling with zeroes, and that will take a very long time.
Even if that is the case, 126 seconds to zero fill a 512MB file is very slow, which indicates the disk is either slow/broken, or oversubscribed/saturated and struggling to keep up.
If you want to evaluate the allocation outside of MongoDB, I've written a small bash script to pre-allocate data files for MongoDB (for testing purposes), which uses the aforementioned fallocate() and completes in milliseconds for multiple gigabytes on my test system.
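The bash script itself is not reproduced here. As a rough stand-in, the sketch below times a pre-allocation from Python using `os.posix_fallocate`, which is available on Unix only. Note that this is the POSIX wrapper rather than the Linux-specific fallocate() call: with glibc it may fall back to writing the blocks out when the filesystem has no native support, in which case it will be slow in the same way zero filling is. The helper name `preallocate` is hypothetical.

```python
import os
import time


def preallocate(path, size_bytes):
    """Reserve size_bytes for path and return (final_size, elapsed_seconds).

    Uses os.posix_fallocate (Unix only). On filesystems without native
    support the C library may emulate the call by writing to each block.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        os.posix_fallocate(fd, 0, size_bytes)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return os.path.getsize(path), elapsed
```

Calling `preallocate("/data/db/prealloc.test", 512 * 1024 * 1024)` (the path is only an example) and looking at the elapsed time gives a quick indication of whether the filesystem allocates quickly or falls back to writing data.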

Related

Tortoise Is very Slow And uses Huge amount of memory

For the past few days TortoiseSVN has been using lots of memory when I want to commit, and it takes 10 - 20 minutes before the changed files appear.
In normal use it doesn't use much memory, only when committing or comparing changed files.
As you can see the memory usage is not normal.
I have already reinstalled the newest version (1.8.10) but no difference.
Does anyone have any clue?
(The directory I am working in is 2 GB. This includes the temp data, which is excluded from svn, and I am working on Windows 7 x64.)
Here is a screenshot of the Icon Overlay settings I use

I have had the same issue since I updated to TortoiseSVN 1.8.10: excessive amounts of memory were used, and each refresh of the view would increase this amount even further.
The new version 1.8.11 appears to have resolved the issue.

5.6 GB not enough for Cloudera?

I am running Cloudera Hadoop on my laptop in an Oracle VirtualBox VM.
I have given it 5.6 GB of my 8 GB of RAM and six of my eight cores as well.
And still I am not able to keep it up and running.
Even without load, services do not stay up, and when I try a query, at least Hive goes down within 20 minutes. Sometimes they go down like dominoes, one after another.
More memory seemed to help somewhat: with 3 GB and all services, Hue was flashing red whenever Hue itself managed to come up. After rebooting, it would take 30 - 60 minutes before I managed to get the system up enough to even try running anything on it.
There have been two meaningful messages (that I have managed to find):
- Warning of swapping.
- Crashing note when the system used 26 GB of virtual memory which was not enough.
My dataset is less than one megabyte, so it is hard to understand why the system would go up to dozens of gigabytes, but whatever the reason for that was, it has passed: the system now runs more steadily at around the 5.6 GB that I have given it, after I closed down a few services (see my answer to myself).
Still, it is only more stable, not stable. Right after that I got a swapping warning and Hive went down again. What could be the reason for more or less all Hadoop services going down when the VM starts to swap?
I don't have enough reputation to post the picture here, but when Hive went down again the VM was swapping 13 pages / second and utilizing 5.9 GB / 5.6 GB. So basically my system starts crashing more or less right after it starts to swap. "428 pages were swapped to disk in the previous 15 minute(s)"
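For anyone who wants to watch the swap rate from inside the VM directly, the Linux kernel exposes cumulative swap-in and swap-out page counters in /proc/vmstat as `pswpin` and `pswpout`. The sketch below parses them and computes a pages-per-second rate between two samples. It is only an illustration: the helper names are made up, and whether the monitoring warning quoted above is derived from these same counters is an assumption.

```python
def parse_swap_counters(vmstat_text):
    """Extract the cumulative pswpin/pswpout counters from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rate(before, after, seconds):
    """Pages per second swapped in and out between two counter samples."""
    return {key: (after[key] - before[key]) / seconds for key in before}
```

A typical use is to read /proc/vmstat, wait a fixed interval, read it again, and pass both parsed samples to `swap_rate` together with the interval length.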
I have used the default installation options as far as the hard drive is concerned.
The only addition is a shared folder between Windows and the VM. It behaves somewhat strangely, locking files all the time, so I use it just like FTP, only for passing files from one system to the other. I can go days without using it and the system still crashes, so that is not the cause either.
Now that the system is mostly up, services still crash about twice a day: Service Monitor and Hive crash about equally often. After those come Activity Monitor and Event Server, which always appear to crash together. I believe Yarn crashes as well, but it comes back up on its own. Last time Hive crashed first, followed by Service Monitor, Hive (a second time), Activity Monitor and Event Server.
Since swap lives on disk, perhaps the problem is with the disk:
# cat /etc/fstab
# swapoff -a
# badblocks -v /dev/VolGroup/lv_swap
Checking blocks 0 to 8388607
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found.
# badblocks -vw /dev/VolGroup/lv_swap
Checking for bad blocks in read-write mode
From block 0 to 8388607
Testing with pattern 0xaa: done
Reading and comparing: done
Testing with pattern 0x55: done
Reading and comparing: done
Testing with pattern 0xff: done
Reading and comparing: done
Testing with pattern 0x00: done
Reading and comparing: done
Pass completed, 0 bad blocks found.
So there is nothing wrong with the swap disk, and I have not noticed any disk errors anywhere else either.
Note that you could also check the file system from the Windows side. But I expect that if you let Windows fix your Linux file system, you have a good chance of destroying your Linux installation, so I did my checks somewhat conservatively, because AFAIK these commands are safe to execute.

About half of the services kept going down, so giving more specifics would be a long story.
I managed to make the system more stable by shutting down flume, hbase, impala, ks_indexer, oozie, spark and sqoop, and by giving more memory to some of the remaining services that complained they had not been given enough.
I also fixed a couple of things on the Windows side; I am not sure which of these helped:
- MsMpEng.exe kept my hard drive busy. I didn't have permission to kill it, but I lowered its priority to the lowest possible.
- CcmExec.exe got stuck in a loop on my DVD and kept reading it forever. I solved this by taking the DVD out of the drive. Later on I killed the process tree to keep it from bothering me for a while.
I found these using the Windows Resource Monitor.

The VM requires 4GB: http://www.cloudera.com/content/cloudera-content/cloudera-docs/DemoVMs/Cloudera-QuickStart-VM/cloudera_quickstart_vm.html You should use that.
I am not clear whether you are using the QuickStart VM though. It's set up to run just the essential services and tuned to conserve memory rather than exploit lots of memory.
It sounds like you are running your own installation, on one virtual machine, on your Windows machine. You may be running an entire cluster's worth of services on one desktop machine. Each of these services has master, worker processes, monitoring processes, etc. You don't need most of them.
You also probably have left memory settings at default suitable for a server-class machine of 16+ GB RAM. Remember these services usually run across many machines, not all on one.
Finally, you're clearly swapping, and that makes things incredibly slow. Remember this is all through a VM too!
Bottom line, use the QuickStart VM if you really want a 1-machine cluster tuned correctly. If you want a real cluster or more services, you need more hardware.

Also consider: cloudera.com/live contains a full CDH 5.1 cluster + sample data, running on demand on AWS. Of course, the advantage of the VM is that you can BYOD, but if you're simply looking for a hands-on Hadoop experience, Live is a great option.

Is there a way to cap the file size of slony log shipping files?

I am working with a SuSE machine (cat /etc/issue: SUSE Linux Enterprise Server 11 SP1 (i586)) running Postgresql 8.1.3 and the Slony-I replication system (slon version 1.1.5). We have a working replication setup going between two databases on this server, which is generating log shipping files to be sent to the remote machines we are tasked to maintain. As of this morning, we ran into a problem with this.
For a while now, we've had strange memory problems on this machine - the oom-killer seems to be striking even when there is plenty of free memory left. That has set the stage for our current issue to occur - we ran a massive update on our system last night, while replication was turned off. Now, as things currently stand, we cannot replicate the changes out - slony is attempting to compile all the changes into a single massive log file, and after about half an hour or so of running, it trips over the oom-killer issue, which appears to restart the replication package. Since it is constantly trying to rebuild that same package, it never gets anywhere.
My first question is this: Is there a way to cap the size of Slony log shipping files, so that it writes out no more than 'X' bytes (or K, or Meg, etc.) and after going over that size, closes the current log shipping file and starts a new one? We've been able to hit about four megs in size before the oom-killer hits with fair regularity, so if I could cap it there, I could at least start generating the smaller files and hopefully eventually get through this.
My second question, I guess, is this: Does anyone have a better solution for this issue than the one I'm asking about? It's quite possible I'm getting tunnel vision looking at the problem, and all I really need is -a- solution, not necessarily -my- solution.

How do I improve Windows Subversion client update performance?

How do I improve Subversion client update performance? It appears to be disk bound on the client.
Details:
CollabNet Windows client version 1.6.2 (r37639)
Windows XP SP2
3 GB RAM with PF Usage around 1 GB and System Cache of 1.1 GB.
Disk has write caching enabled
Update takes 7-15 minutes (when very little to update).
Checkout has 36,083 directories/files (from svn list)
Repository has 58,750 revisions.
Checkout takes about 2.7 GB
Perf monitor shows % Disk Write time stays near 90% during update.
Max Disk Read Bytes/sec got up to 12.8M and write got up to 5.2M
CPU, paging file usage, and network usage are all low.
Watching the server performance seems to show that it isn't a bottleneck.
I'm especially interested in answers besides getting a faster disk (especially configuration changes).
Updates from some of the suggestions:
I need the whole thing so sparse directories won't work.
Another client (TortoiseSVN) takes 7 minutes also
TortoiseSVN icon overlays have been configured so they don't cause the problem.
Anti-virus is configured to skip that directory, so it isn't causing the problem.

I experience exactly the same thing. We recently replaced Perforce with svn, but if we cannot overcome the performance problems on Windows we must consider another tool.
Using svn 1.6.6, Win XP and Vista clients. RedHat server.
My observations match yours:
Huge disk-write activity.
Antivirus not a bottleneck.
No matter which svn clients are used.
No server or network bottleneck.
Complementary info
More than 3 times faster operations on:
Linux (Ubuntu).
Linux (Ubuntu) running on VirtualBox at Win Vista host.
Win XP running on VMWare at RedHat host.

Do you need every bit of the repository on your working copy? If you truly only care about particular portions of the tree, look into Subversion's Sparse Directories (a.k.a. "Sparse Checkouts") feature. It allows you to manipulate your working copy so it only contains those directories of interest.
Just as an example, you might use this to prune documentation, installer-related files, etc. Depending on what you truly need on your local machine, embracing this approach could make a serious dent in your wait times.

Try svn client version 1.5.*. It helped me on my Vista laptop. Versions 1.6.* are extremely slow.

This is more likely to be your network and the amount of data moved as well as your client. Are you using Tortoise? I find it to be a bit slow myself when moving that much data!

Are you using TortoiseSVN? If so, the Icon Overlays do slow down operations. If you go to TortoiseSVN Settings/Icon Overlays there are several settings you can tweak to control the level to which you want to use the Overlays, including turning them off completely. See if that affects your performance.

Do you run a virus checker that uses on-access scanning? That can really make it crawl. If so, turn it off and see if that helps. Most scanners will have a way to exclude specific directories if that helps.

Nobody seems to be pointing out the one reason that I often consider a design flaw. Subversion creates a second "pristine" copy of the checkout for offline operations. If you're checking out 4G of files, it's actually writing 8G to disk.
Compare a checkout to an export. That will show you the massive difference when writing those second copies.
There's nothing you can do about that.
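To see how much of a working copy's footprint is administrative data rather than the files you actually edit, you can total the bytes inside and outside the `.svn` directories. This is a rough sketch: the function name is hypothetical, and it assumes the administrative directory is called `.svn`, which is the default.

```python
import os


def working_copy_sizes(root, admin_dir=".svn"):
    """Return (working_bytes, admin_bytes) for the tree rooted at root.

    admin_bytes counts every file located under a directory named
    admin_dir, at any depth; working_bytes counts everything else.
    """
    working = 0
    admin = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        relative = os.path.relpath(dirpath, root)
        in_admin = admin_dir in relative.split(os.sep)
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that vanish or cannot be read
            if in_admin:
                admin += size
            else:
                working += size
    return working, admin
```

If the second number is close to the first, the pristine copies are roughly doubling the amount of data written during a checkout, which is the effect described above.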

Upgrade to svn 1.7
From Discussion of Slow Performance of SVN Update:
The update process in svn 1.6 goes something like this:
1. search the entire working copy, to see what's there at the moment, and locking it so no one changes the answer during the next steps
2. tell that to the server
3. receive from the server whatever new stuff you need, applying the changes to the files as you go
4. recurse over the entire working copy again, unlocking it
If there are many directories and files, steps 1 and 4 can take up a lot of time. This would be consistent with your observation of long delays with no network traffic.
The working copy format was changed in svn 1.7. All meta information is now stored in an SQLite database in the root folder of the working copy, so there is no longer any need to perform steps 1 and 4, which consumed most of the time during svn update.

How to obtain good concurrent read performance from disk

I'd like to ask a question then follow it up with my own answer, but also see what answers other people have.
We have two large files which we'd like to read from two separate threads concurrently. One thread will sequentially read fileA while the other thread will sequentially read fileB. There is no locking or communication between the threads, both are sequentially reading as fast as they can, and both are immediately discarding the data they read.
Our experience with this setup on Windows is very poor. The combined throughput of the two threads is in the order of 2-3 MiB/sec. The drive seems to be spending most of its time seeking backwards and forwards between the two files, presumably reading very little after each seek.
If we disable one of the threads and temporarily look at the performance of a single thread then we get much better bandwidth (~45 MiB/sec for this machine). So clearly the bad two-thread performance is an artefact of the OS disk scheduler.
Is there anything we can do to improve the concurrent thread read performance? Perhaps by using different APIs or by tweaking the OS disk scheduler parameters in some way.
Some details:
The files are in the order of 2 GiB each on a machine with 2GiB of RAM. For the purpose of this question we consider them not to be cached and perfectly defragmented. We have used defrag tools and rebooted to ensure this is the case.
We are using no special APIs to read these files. The behaviour is repeatable across various bog-standard APIs such as Win32's CreateFile, C's fopen, C++'s std::ifstream, Java's FileInputStream, etc.
Each thread spins in a loop making calls to the read function. We have varied the number of bytes requested from the API each iteration from values between 1KiB up to 128MiB. Varying this has had no effect, so clearly the amount the OS is physically reading after each disk seek is not dictated by this number. This is exactly what should be expected.
The dramatic difference between one-thread and two-thread performance is repeatable across Windows 2000, Windows XP (32-bit and 64-bit), Windows Server 2003, and also with and without hardware RAID5.

The problem seems to be in the Windows I/O scheduling policy. According to what I found here there are many ways for an OS to schedule disk requests. While Linux and others can choose between different policies, before Vista, Windows was locked into a single policy: a FIFO queue, where all requests were split into 64 KB blocks. I believe that this policy is the cause of the problem you are experiencing: the scheduler will mix requests from the two threads, causing continuous seeks between different areas of the disk.
Now, the good news is that according to here and here, Vista introduced a smarter disk scheduler, where you can set the priority of your requests and also allocate a minimum bandwidth for your process.
The bad news is that I found no way to change the disk policy or buffer sizes in previous versions of Windows. Also, even if raising the disk I/O priority of your process boosts its performance relative to other processes, you still have the problem of your threads competing against each other.
What I can suggest is to modify your software by introducing a self-made disk access policy.
For example, you could use a policy like this in your thread B (similar for Thread A):
if THREAD A is reading from disk then wait for THREAD A to stop reading or wait for X ms
Read for X ms (or Y MB)
Stop reading and check status of thread A again
You could use semaphores for status checking or you could use perfmon counters to get the status of the actual disk queue.
The values of X and/or Y could also be auto-tuned by measuring the actual transfer rates and slowly adjusting them, thus maximizing the throughput when the application runs on different machines and/or operating systems. You may find that cache, memory or RAID levels affect them one way or another, but auto-tuning lets the values adapt to each scenario.
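The turn-taking policy described above could look roughly like the following. This is a minimal sketch, not a tuned implementation: it uses a shared lock so that only one thread reads at a time, and each thread reads one burst (the "Y MB" above) before giving the other a chance. The names `read_in_turns` and `read_both` and the default sizes are illustrative assumptions, the "wait for X ms" timeout is left out, and `threading.Lock` does not guarantee fairness, so strict alternation between the threads is not guaranteed.

```python
import threading


def read_in_turns(path, lock, burst_bytes=16 * 1024 * 1024, chunk=1024 * 1024):
    """Read path sequentially, holding lock for one burst at a time.

    Returns the total number of bytes read; the data itself is discarded.
    """
    total = 0
    finished = False
    with open(path, "rb") as handle:
        while not finished:
            with lock:  # only one thread touches the disk during a burst
                read_this_turn = 0
                while read_this_turn < burst_bytes:
                    data = handle.read(chunk)
                    if not data:
                        finished = True
                        break
                    read_this_turn += len(data)
                total += read_this_turn
            # The lock is released here so the other thread can take a turn.
    return total


def read_both(path_a, path_b, **kwargs):
    """Read two different files concurrently, alternating in bursts."""
    lock = threading.Lock()
    results = {}

    def worker(path):
        results[path] = read_in_turns(path, lock, **kwargs)

    threads = [threading.Thread(target=worker, args=(p,)) for p in (path_a, path_b)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Larger bursts mean fewer switches between the two files and therefore fewer seeks, at the cost of each thread waiting longer for its turn. The burst size is the natural parameter to auto-tune.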

I'd like to add some further notes in my response. None of the non-Microsoft operating systems we have tested suffer from this problem. Linux, FreeBSD, and Mac OS X (this final one on different hardware) all degrade much more gracefully in terms of aggregate bandwidth when moving from one thread to two. Linux, for example, degraded from ~45 MiB/sec to ~42 MiB/sec. These other operating systems must be reading larger chunks of the file between each seek, and therefore not spending nearly all their time waiting on the disk to seek.
Our solution for Windows is to pass the FILE_FLAG_NO_BUFFERING flag to CreateFile and use large (~16MiB) reads in each call to ReadFile. This is suboptimal for several reasons:
Files don't get cached when read like this, so there are none of the advantages that caching normally gives.
The constraints when working with this flag are much more complicated than normal reading (alignment of read buffers to page boundaries, etc).
(As a final remark: does this explain why swapping under Windows is so hellish? I.e., Windows is incapable of doing IO to multiple files concurrently with any efficiency, so while swapping all other IO operations are forced to be disproportionately slow.)
Edit to add some further details for Will Dean:
Of course across these different hardware configurations the raw figures did change (sometimes substantially). The problem however is the consistent degradation in performance that only Windows suffers when moving from one thread to two. Here is a summary of the machines tested:
Several Dell workstations (Intel Xeon) of various ages running Windows 2000, Windows XP (32-bit), and Windows XP (64-bit) with single drive.
A Dell 1U server (Intel Xeon) running Windows Server 2003 (64-bit) with RAID 1+0.
An HP workstation (AMD Opteron) with Windows XP (64-bit), and Windows Server 2003, and hardware RAID 5.
My home unbranded PC (AMD Athlon64) running Windows XP (32-bit), FreeBSD (64-bit), and Linux (64-bit) with single drive.
My home MacBook (Intel Core1) running Mac OS X, single SATA drive.
My home Koolu PC running Linux. Vastly underpowered compared to the other systems but I demonstrated that even this machine can outperform a Windows server with RAID5 when doing multi-threaded disk reads.
CPU usage on all of these systems was very low during the tests and anti-virus was disabled.
I forgot to mention before but we also tried the normal Win32 CreateFile API with the FILE_FLAG_SEQUENTIAL_SCAN flag set. This flag didn't fix the problem.

It does seem a little strange that you see no difference across quite a wide range of Windows versions, and no difference between a single drive and hardware RAID 5.
It's only 'gut feel', but that does make me doubtful that this is really a simple seeking problem. Other than the OS X and the Raid5, was all this tried on the same machine - have you tried another machine? Is your CPU usage basically zero during this test?
What's the shortest app you can write which demonstrates this problem? - I would be interested to try it here.

I would create some kind of in memory thread safe lock. Each thread could wait on the lock until it was free. When the lock becomes free, take the lock and read the file for a defined length of time or a defined amount of data, then release the lock for any other waiting threads.

Do you use IOCompletionPorts under Windows? Windows via C++ has an in-depth chapter on this subject and as luck would have it, it is also available on MSDN.

Paul - saw the update. Very interesting.
It would be interesting to try it on Vista or Win2008, as people seem to be reporting some considerable I/O improvements on these in some circumstances.
My only suggestion about a different API would be to try memory mapping the files - have you tried that? Unfortunately at 2GB per file, you're not going to be able to map multiple whole files on a 32-bit machine, which means this isn't quite as trivial as it might be.
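One way around the address-space limit is to map the file one window at a time instead of all at once. The sketch below does that with Python's `mmap` module. It is only an illustration of the windowing idea, with a made-up function name and window size, and it makes no claim about whether memory mapping actually avoids the seek problem on Windows. Mapping offsets must be multiples of `mmap.ALLOCATIONGRANULARITY`, so the window size is rounded down accordingly.

```python
import mmap
import os


def read_via_mmap(path, window_bytes=64 * 1024 * 1024):
    """Read every byte of path through successive read-only mappings.

    Returns the number of bytes touched; the data itself is discarded.
    """
    granularity = mmap.ALLOCATIONGRANULARITY
    # Offsets passed to mmap must be multiples of the granularity.
    window = max(granularity, (window_bytes // granularity) * granularity)
    size = os.path.getsize(path)
    total = 0
    with open(path, "rb") as handle:
        offset = 0
        while offset < size:
            length = min(window, size - offset)
            with mmap.mmap(handle.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as view:
                # Copying the window out forces its pages to be read.
                total += len(view[:])
            offset += length
    return total
```

With a 64 MiB window, two threads reading two 2 GB files need only about 128 MiB of address space for the mappings at any moment, plus the temporary copies.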
