Do ramdisks really improve VS2010 performance? - visual-studio-2010

Do ramdisks really improve VS2010 performance (general and build times)?
If so, what steps do I have to take to get the maximum benefit from it?
Can it also help ReSharper?
Thanks,
André Carlucci

In my experience a RAM disk is slower than an SSD for builds; it can be even slower than an HDD...
RAMdisk slower than disk?
So don't bother with a RAM disk and buy an Intel or Crucial SSD, but not OCZ.
EDIT:
After many tries I figured it out. When the ramdisk is formatted as FAT32, then even though benchmarks show high values, real-world use is actually slower than an NTFS-formatted SSD. An NTFS-formatted ramdisk, however, is faster in real life than an SSD. Still, I wouldn't bother with a ramdisk anyway; an SSD is everything you need.
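If you want to check the "real-world use" difference yourself, timing lots of small file writes is more telling than sequential-throughput benchmarks, since builds mostly produce many small intermediate files. Here is a minimal C# sketch, assuming R: is the ramdisk and C: is the SSD; the drive letters, file count and file size are placeholders, not recommendations:

    using System;
    using System.Diagnostics;
    using System.IO;

    class SmallFileBench
    {
        static void Main()
        {
            // R: = ramdisk, C: = SSD; both drive letters are placeholders.
            foreach (var root in new[] { @"R:\bench", @"C:\bench" })
            {
                Directory.CreateDirectory(root);
                var payload = new byte[8 * 1024];   // ~8 KB per file, roughly the size of a small intermediate file
                var sw = Stopwatch.StartNew();
                for (int i = 0; i < 2000; i++)
                    File.WriteAllBytes(Path.Combine(root, "tmp" + i + ".obj"), payload);
                sw.Stop();
                Console.WriteLine(root + ": " + sw.ElapsedMilliseconds + " ms for 2000 small files");
                Directory.Delete(root, true);
            }
        }
    }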

So here is the issue that you will run into. Yes, it can improve performance, but with some serious caveats:
You stand the chance of losing all your work between syncs.
There is a noticeable lag when the ramdisk syncs to disk.
This will require you to set up proper sync times for how you work.
I'd recommend getting a SATA III solid-state drive and backing it up weekly.

I tried a RAM disk a year or two ago. I remember that the build time was about 30% faster.
That was too little to outweigh the disadvantages mentioned by Andrew Finnell.

Related

Hardware environment for compilation performance

This is a rather general question...
What hardware setup is best for large C/C++ compile jobs, like the Linux kernel or large applications?
I remember reading a post by Joel Spolsky on experiments with solid-state disks and the like.
Should I go for more CPU power, more RAM, or a fast hard-disk IO solution like a solid-state drive? Would it, for example, make sense to have a 'normal' hard disk for the standard system and use a solid-state drive for compilation? Or can I just buy lots of RAM? And how important is the CPU - is it just sitting around for most of the compile time?
It's probably a stupid question, but I don't have a lot of experience in that field. Thanks for any answers.
Here's some info on the SSD issue
Linus Torvalds on the topic
How to get the most out of it (Standard settings are sort of slowing it down)
Interesting article on Coding Horror about it, with a tip for an affordable chipset
I think you need enough of everything. CPU is very important, and compiles can easily be parallelised (with make -j), so you want as many CPU cores as possible. RAM is probably just as important, since it provides more 'working space' for the compiler and allows your IO to be buffered. Finally, drive speed is probably the least important of the three - kernel code is big, but not that big.
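As a small illustration of sizing parallelism to core count, here is a hedged C# sketch that launches a build with one job per logical processor. The msbuild invocation and the solution name are placeholders for whatever build tool you use; with make you would pass -jN instead:

    using System;
    using System.Diagnostics;

    class ParallelBuild
    {
        static void Main()
        {
            int jobs = Environment.ProcessorCount;   // one parallel job per logical core
            var psi = new ProcessStartInfo
            {
                FileName = "msbuild",                          // assumes msbuild is on the PATH
                Arguments = "MySolution.sln /m:" + jobs,       // hypothetical solution name
                UseShellExecute = false
            };
            using (var p = Process.Start(psi))
                p.WaitForExit();
            Console.WriteLine("Build finished with up to " + jobs + " parallel project builds.");
        }
    }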
Definitely not a stupid question, getting the build-test environment tuned correctly will make a lot of headaches go away.
Hard disk performance would probably top the list. I'd stay well away from solid-state drives, as they're only rated for a large-but-limited number of write operations, and make-clean-build cycles will hammer them.
More importantly, can you take advantage of a parallel or shared build environment? From memory, ClearCase and Perforce had mechanisms to handle shared builds. Unless you have a parallelising build system, having multiple CPUs will be fairly pointless.
Last but not least, I doubt that build time will be the limiting factor - more likely you should focus on the needs of your test system. Before you look at the actual metal, though, try to design a build-test system that's appropriate to how you'll actually be working - how often are your builds, how many people are involved, how big is your test system...

Are solid-state drives good enough to stop worrying about disk IO bottlenecks?

I've got a proof-of-concept program which is doing some interprocess communication simply by writing and reading files on the hard disk. Yes, I know this is really slow, but it was the easiest way to get things up and running. I had always planned on coming back and swapping out that part of the code with a mechanism that does all the IPC (interprocess communication) in RAM.
With the arrival of solid-state disks, do you think that bottleneck is likely to become negligible?
Notes: It's server software written in C# calling some bare metal number-crunching libraries written in FORTRAN.
The short answer is probably no. A famous researcher named Jim Gray gave a talk about storage and performance which included a great analogy. Treating your brain as the processor: accessing a register takes 1 clock tick, which is roughly equivalent to that information already being in your brain. Accessing memory takes 100 clock ticks, roughly equivalent to getting data from somewhere in the city you live in. Accessing a standard disk takes roughly 10^6 ticks, which is the equivalent of the data being on Pluto. Where does solid state fit in? Current SSD technology is somewhere between 10^4 and 10^5, depending on who you ask. While they can be an order of magnitude faster than disk, there is still a tremendous gap between reading from memory and reading from an SSD. This is why the answer to your question is likely no: as fast as SSDs become, they will still be significantly slower than memory (at least in the foreseeable future).
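To put that analogy into rough wall-clock terms, here is a small C# sketch that converts those tick counts to nanoseconds. The 3 GHz clock speed is an assumption, and the tick counts are the order-of-magnitude figures quoted above, not measurements:

    using System;

    class LatencyScale
    {
        static void Main()
        {
            double nsPerTick = 1.0 / 3.0;   // assumes a 3 GHz core: 1 tick ≈ 0.33 ns
            string[] names = { "Register", "RAM", "SSD (optimistic)", "Disk" };
            double[] ticks = { 1, 100, 1e4, 1e6 };   // order-of-magnitude figures from the answer above

            for (int i = 0; i < names.Length; i++)
                Console.WriteLine("{0,-16} ~{1:N1} ns", names[i], ticks[i] * nsPerTick);
        }
    }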
I think that you will find the bottlenecks are just moved. As we expect higher throughput then we write programs with higher demands.
This pushes bottlenecks to buses, caches and parts other than the read/write mechanism (which is last in the chain anyway).
With a process not bound by disk I/O, I think you might find it becomes bound by the scheduler, which limits the number of read/write instructions (as with all process instructions).
To take full advantage of limitless I/O speed you would require real-time response and very aggressive management of caches and so on.
When disks get faster then so does RAM and processors and the demand on devices. The bottleneck is the same, the workload just gets bigger.
I don't believe it will change the way I/O-bound applications are written in the slightest. Having faster processors did not make people pick bubble sort as a sorting algorithm either.
The external memory hierarchy is an inherent problem of computing.
Joel on Software has an article about his experience upgrading to solid state. Not exactly the same issue you have, but my takeaway was:
Solid-state drives can significantly speed up I/O-bound operations, but many things (like compiling) are still CPU-bound.
I have a solid-state drive, and no, this won't eliminate I/O as a bottleneck. The SSD is nice, but it's not that nice.
It's actually not hard to master your system's IPC primitives or to build something on top of TCP. But if you want to stick with your disk stuff and make it faster, ramdisk or tmpfs might do the trick.
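If you do move off the disk entirely, a named pipe is one of the simpler Windows IPC primitives to reach for from C#. Here is a minimal sketch with a made-up pipe name and message; run one instance with the "server" argument first, then a second instance with no arguments to send a message:

    using System;
    using System.IO;
    using System.IO.Pipes;

    class PipeDemo
    {
        static void Main(string[] args)
        {
            if (args.Length > 0 && args[0] == "server")
            {
                // Run this instance first; it blocks until a client connects.
                using (var server = new NamedPipeServerStream("crunch-results"))
                {
                    server.WaitForConnection();
                    using (var reader = new StreamReader(server))
                        Console.WriteLine("Received: " + reader.ReadLine());
                }
            }
            else
            {
                // Then run a second instance without arguments to send one message.
                using (var client = new NamedPipeClientStream(".", "crunch-results"))
                {
                    client.Connect();
                    using (var writer = new StreamWriter(client) { AutoFlush = true })
                        writer.WriteLine("result-block-42");
                }
            }
        }
    }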
No. Current SSDs are designed as disk replacements. Every layer, from the SATA controller to the filesystem driver, treats them as storage.
This is not a problem of the underlying technology, NAND flash. When NAND flash is directly mapped into memory and uses a rotating log storage system instead of a file system based on named files, it can be quite fast. The fundamental problem is that NAND flash only performs well with block updates. File metadata updates cause expensive read-modify-write operations. Also, NAND blocks are much bigger than typical disk blocks, which doesn't help performance either.
For these reasons, the future of SSDs will be better-cached SSDs. DRAM will hide the overhead of poor mapping, and a small supercap backup will allow the SSD to commit writes faster.
Solid-state drives do make one important improvement to IO throughput: on solid-state disks, block locality is less of an issue than on rotating media. This means that high-performance IO-bound applications can shift their focus from structures that arrange data in access order to structures that optimize IO in other ways, such as keeping data in a single block by means of compression. That said, even solid-state drives benefit from linear access patterns, because they can prefetch subsequent blocks into a read cache before the application requests them.
A noticeable regression on solid-state disks is that writes take longer than reads, although both are still generally faster than on rotating drives, and the difference is narrowing with newer, high-end solid-state disks.
No, sadly not. They do make it more interesting, though: SSDs have very fast reads and no seek time, but their writes are almost as slow as those of normal hard drives. This means that you will want to read most of the time. However, when you do write to the drive you should write as much as possible in the same spot, since SSDs can only write entire blocks at a time.
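One way to act on the "write as much as possible in the same spot" advice is to let a buffered stream coalesce many small writes into large sequential ones. A minimal C# sketch; the 1 MB buffer, record size and file name are arbitrary choices, not recommendations:

    using System;
    using System.IO;

    class BatchedWrites
    {
        static void Main()
        {
            var record = new byte[256];   // one small logical record
            // A 1 MB user-space buffer lets the runtime flush in big sequential chunks
            // instead of issuing 100,000 tiny writes.
            using (var fs = new FileStream("ipc-data.bin", FileMode.Create,
                                           FileAccess.Write, FileShare.None, 1 << 20))
            {
                for (int i = 0; i < 100000; i++)
                    fs.Write(record, 0, record.Length);
            }
        }
    }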
How about using a RAM drive instead of the disk? You would not have to rewrite anything - just point it at a different file system. Windows and Linux both have them. Make sure you have lots of memory on the machine and create a virtual disk with enough space for your processing. I did this for a system that listened to multiple protocols on a network tap. I never knew what packet I was going to get, and there was too much data to keep it all in memory. I would write it to the RAM drive and, when something was completed, move it and let another process take it off the RAM drive and onto a physical disk. I was able to keep up with really busy server-class network cards this way. Good luck!
Something to keep in mind here:
If the communication involves frequent messages and is on the same system you'll get very good performance because Windows won't actually write the data out in the first place.
I had to resort to it once and discovered this - the drive light did NOT come on at all as long as the data kept getting written.
but it was the easiest way to get things up and running.
I usually find that it's much cheaper to think well once with your own head than to make the CPU think millions of times in vain.

Is it reasonable for modern applications to consume large amounts of memory?

Applications like Microsoft Outlook and the Eclipse IDE consume a lot of RAM - as much as 200MB. Is it OK for a modern application to consume that much memory, given that a few years back we had only 256MB of RAM? Also, why is this happening? Are we taking the resources for granted?
Is it acceptable when most people have 1 or 2 gigabytes of RAM in their PCs?
Think of this: although your 200MB is small and nothing to worry about given a 2GB limit, everyone else also has apps that take masses of RAM. Add them together and you find that the 2GB I have gets used up very quickly. End result: your app appears slow, resource hungry and takes a long time to start up.
I think people will start to rebel against resource-hungry applications unless they get 'value for RAM'. You can see this starting to happen on servers as virtualised systems gain popularity - people are complaining about resource requirements and the corresponding server costs.
As a real-world example, I used to code with VC6 on my old 512MB 1.7GHz machine, and things were fine - I could open 4 or 5 copies along with Outlook, Word and a web browser and my machine was responsive.
Today I have a dual-processor 2.8GHz server box with 3GB RAM, but I cannot realistically run more than 2 copies of Visual Studio 2008. They both take ages to start up (as all that RAM still has to be copied in and set up, along with all the other startup costs we now have), and even Word takes ages to load a document.
So if you can reduce memory usage, you should. Don't think that you can just use whatever bloated framework/library/practice you want with impunity.
http://en.wikipedia.org/wiki/Moore%27s_law
also:
http://en.wikipedia.org/wiki/Wirth%27s_law
There's a couple of things you need to think about.
1/ Do you have 256M now? I wouldn't think so - my smallest-memory machine has 2G, so a 200M application is not much of a problem.
2a/ That 200M you talk about might not be "real" memory. It may just be address space, in which case it might not all be in physical memory at once. Some bits may only be pulled into physical memory when you choose to do esoteric things.
2b/ It may also be shared with other processes (such as a DLL). That means it could be held in physical memory as one copy but be present in the address space of many processes, so the usage is amortized over those processes. Both 2a and 2b depend on where your figure of 200M actually came from (which I don't know and, running Linux, I'm unlikely to find out without you telling me :-). See the sketch after this list for the different counters involved.
3/ Even if it is physical memory, modern operating systems aren't like the old DOS or Windows 3.1 - they have virtual memory, where bits of applications can be paged out (data) or thrown away completely (code, since it can always be reloaded from the executable). Virtual memory gives you the ability to use far more memory than your actual physical memory.
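To see which counter a "200M" figure actually refers to, the process object exposes the different numbers directly. A minimal C# sketch using System.Diagnostics; it inspects its own process, but you can point it at another process in real use:

    using System;
    using System.Diagnostics;

    class MemoryCounters
    {
        static void Main()
        {
            // Inspects its own process; use Process.GetProcessById(...) for another one.
            var p = Process.GetCurrentProcess();
            Console.WriteLine("Working set (physical RAM):   {0} MB", p.WorkingSet64 / (1024 * 1024));
            Console.WriteLine("Private bytes (not shared):   {0} MB", p.PrivateMemorySize64 / (1024 * 1024));
            Console.WriteLine("Virtual size (address space): {0} MB", p.VirtualMemorySize64 / (1024 * 1024));
        }
    }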
Many modern apps will take advantage of the existence of more memory to cache more. Some, like Firefox and SQL Server, have explicit settings for how much memory they will use. In my opinion, it's foolish not to use available memory - what's the point of having 2GB of RAM if your apps all sit around at 10MB, leaving 90% of your physical memory unused? Of course, if your app does use caching like this, it had better be good at releasing that memory if page-file thrashing starts, or allow the user to limit the cache size manually.
You can see the advantage of this by running a decent-sized query against SQL server. The first time you run the query, it may take 10 seconds. But when you run that exact query again, it takes less than a second - why? The query plan was only compiled the first time and cached for use later. The database pages that needed to be read were only loaded from disk the first time - the second time, they were still cached in RAM. If done right, the more memory you use for caching (until you run into paging) the faster you can re-access data. You'll see the same thing in large documents (e.g. in Word and Acrobat) - when you scroll to new areas of a document, things are slow, but once it's been rendered and cached, things speed up. If you don't have enough memory, that cache starts to get overwritten and going to the old parts of the document gets slow again.
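If you do cache aggressively, capping the cache is the simple way to honour the "limit the cache size manually" point above. A minimal C# sketch with a made-up Compute method and an arbitrary 10,000-entry cap; real code would want proper LRU eviction instead of clearing everything:

    using System;
    using System.Collections.Generic;

    class BoundedCache
    {
        private const int MaxEntries = 10000;   // arbitrary cap, tune to your memory budget
        private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

        public string GetOrCompute(string key)
        {
            string cached;
            if (_cache.TryGetValue(key, out cached))
                return cached;                  // fast path: result is already in RAM

            string result = Compute(key);       // slow path: hit the disk or recompute
            if (_cache.Count >= MaxEntries)
                _cache.Clear();                 // crude eviction; a real cache would use LRU
            _cache[key] = result;
            return result;
        }

        private static string Compute(string key)
        {
            // Placeholder for the expensive work (query plan, rendered page, ...).
            return key.ToUpperInvariant();
        }
    }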
If you can make good use of the RAM, it is your responsibility to use it.
Yes, it is perfectly normal. A lot has changed since 256MB was the norm... and don't forget that before that, 640KB was supposed to be enough for everybody!
Now most software is built with a garbage collector: C#, Java, Ruby, Python... everybody loves them because development can certainly be faster, but there is one catch.
The same program can be free of memory leaks with either manual or automatic memory deallocation. However, in the second case memory consumption is likely to grow. Why? In the first case, memory is deallocated and kept clean immediately after something becomes useless (garbage). But it takes time and computing power to detect that automatically, so most collectors (except for reference counting) wait for garbage to accumulate to make the cost of the sweep worthwhile. The longer you wait, the more garbage you can sweep in one pass, but the more memory is needed to hold that accumulated garbage. If you try to force the collector to run constantly, your program will spend more time exploring memory than working on your problem.
You can be completely sure that as long as programmers get more resources, they will spend them on heavier tools in exchange for more freedom, abstraction and faster development.
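A quick way to see the "garbage accumulates until a collection" behaviour described above is to compare the managed heap size before and after forcing a collection. A small C# sketch; the numbers it prints will vary with runtime and GC settings, and forcing collections like this is for illustration only:

    using System;

    class GcDemo
    {
        static void Main()
        {
            long allocated = 0;
            for (int i = 0; i < 100000; i++)
            {
                var temp = new byte[1024];      // short-lived garbage
                allocated += temp.Length;
            }
            Console.WriteLine("Allocated ~{0} KB of short-lived objects", allocated / 1024);
            Console.WriteLine("Heap before collect: {0} KB", GC.GetTotalMemory(false) / 1024);
            GC.Collect();
            GC.WaitForPendingFinalizers();
            Console.WriteLine("Heap after collect:  {0} KB", GC.GetTotalMemory(true) / 1024);
        }
    }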
A few years ago 256 MB was the norm for a PC, and Outlook consumed about 30-35 MB of memory - around 10% of the available memory. Now PCs have 2 GB or more as the norm, and Outlook consumes 200 MB of memory - also about 10%.
The 1st conclusion: as more memory becomes available, applications use more of it.
The 2nd conclusion: no matter what time frame you pick, there are applications that are true memory hogs (like Outlook) and applications that are very memory-efficient.
The 3rd conclusion: the memory consumption of an app doesn't go down with time, or else 640K would have been enough even today.
It completely depends on the application.

Would NTFS allocation blocks of 16KB or 32KB make compile time faster in comparison to the default 4KB?

I can't imagine that would make much of a difference - disk block size is pretty far removed from compile speed. With the amount of caching a modern OS does, it seems unlikely to be significant.
The real answer, of course, can be found by measuring it. Getting similar conditions between different machines with different disk block sizes might be tricky, though.
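One cheap measurement you can make before reformatting anything is how much slack space your build outputs would waste at each cluster size, since lots of tiny temporary files are where larger clusters hurt. A hedged C# sketch; the obj directory path is a placeholder, and actual compile times still have to be timed on volumes formatted with each cluster size:

    using System;
    using System.IO;
    using System.Linq;

    class ClusterSlack
    {
        static void Main()
        {
            string objDir = @"C:\src\MyProject\obj";        // hypothetical build output directory
            long[] clusterSizes = { 4096, 16384, 32768 };

            var sizes = new DirectoryInfo(objDir)
                .EnumerateFiles("*", SearchOption.AllDirectories)
                .Select(f => f.Length)
                .ToList();

            foreach (long cluster in clusterSizes)
            {
                // Each file occupies whole clusters; the unused tail of the last cluster is slack.
                long slack = sizes.Sum(len => (cluster - (len % cluster)) % cluster);
                Console.WriteLine("{0,2} KB clusters: {1} MB of slack across {2} files",
                                  cluster / 1024, slack / (1024 * 1024), sizes.Count);
            }
        }
    }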
My guess would be that disk fragmentation would be the biggest factor in determining compile speeds (that is, for a code base of decent size).
Dashogun is correct, at least in my experience. Larger projects/solutions create a lot of small, temporary files on the way to producing the final binary (or binaries). I find that if I defragment my disk once a week or so (even if the defragmenter doesn't recommend it), I don't see the performance degradation that I experience if I fail to do that.
As a corroborating factor, a couple of guys I work with have had the same experience.

How does Google Desktop Search manage to stay light and fast?

I always wondered what methods Google Desktop Search uses so that it consumes the least CPU and memory while indexing a computer containing more than 100,000 files on average.
In just a few hours it had indexed the whole system, and I did not see it eating up my CPU, memory, etc.
If any of you have done some research, please do share.
The trick is simple: it starts to work, then very soon stops and just sits there in memory, doing nothing. Of course it's then totally useless, but at least it stays light and fast. Sorry, couldn't resist :-) I switched to Windows Search 4.0 and I'm much happier with it.
It doesn't...
I installed it on one computer and quickly removed it because it was intrusive (although this can probably be configured) and hungry (particularly on a low-end PC).
It is installed on a laptop near me right now, and if I compare it to a couple of small utilities I run permanently (SlickRun, CLCL, my AutoHotkey script...), it uses more than 10 times their CPU and 5 to 20 times their memory. Times two, since for some reason I have one instance running another, plus the ToolbarNotifier (which is less hungry).
Even Trend Micro anti-virus uses less memory and CPU.
Perhaps I will try it again when I get a more modern PC with lots of memory, but right now I am happy enough with some grep utilities, even if they are slower.
Take a look at disk usage. If you build many keys/indexes, you will use lots of disk space and the searches will be fast.
For example:
30GB drive, 75% used; 3.6GB used for 2 instances of Google Desktop (roaming profiles suck).
Once it has done the initial indexing and written the index to disk, it doesn't need to do much at all.
Searching using the index requires very few resources; the only thing that does is indexing new or modified files.
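The reason index-backed search is so cheap is that the expensive scan happens once, and each query afterwards is just a lookup. A toy C# sketch of an inverted index with made-up documents:

    using System;
    using System.Collections.Generic;

    class TinyIndex
    {
        static void Main()
        {
            // Made-up documents standing in for files on disk.
            var docs = new Dictionary<string, string>();
            docs["notes.txt"] = "ramdisk build performance";
            docs["todo.txt"] = "buy ssd and more ram";
            docs["readme.txt"] = "build instructions for the project";

            // One-time indexing pass: this is the expensive part.
            var index = new Dictionary<string, List<string>>();
            foreach (var doc in docs)
                foreach (var word in doc.Value.Split(' '))
                {
                    List<string> hits;
                    if (!index.TryGetValue(word, out hits))
                        index[word] = hits = new List<string>();
                    hits.Add(doc.Key);
                }

            // Each search afterwards is a cheap dictionary lookup, not a file scan.
            Console.WriteLine("'build' found in: " + string.Join(", ", index["build"]));
        }
    }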
