Journaling Filesystem Performance on SD Card

Currently, I have a 4 GB SD card on which I have a journaling FS partition (EXT3 and EXT4). I am testing the journaling recovery aspect of these filesystems to see how well they repair corruption on the card.
The SD card sits in a piece of hardware that simply boots Linux and then runs a copy.sh script I wrote.
A separate script powers the machine on for 150 seconds, then hard-shuts it down for 30 seconds; this cycle repeats for an extended period. During each boot, copy.sh recursively copies a directory back and forth on the journaling FS, deleting the source directory after each copy finishes. I keep track of how many times the directory was copied per boot.
I noticed something interesting in my results. At first, the directory may be copied successfully 20 times back and forth, but after hours of running, it only copies once or twice.
I was wondering why that was?
This trend is consistent across both EXT3 and EXT4. I've searched online but haven't found an explanation for why the number of successful copies per boot would decrease over time.

Does this explanation of how SD cards work help? http://www.anandtech.com/show/2738/8 Read that page and the couple of pages that follow. It explains how deletes and overwrites are handled within the flash memory chips themselves, and the implications for systems that don't issue the TRIM command.
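For what it's worth, on filesystems that support it (ext4 does), the discard work described there can also be triggered explicitly on a mounted partition with the FITRIM ioctl, which is what the fstrim utility does; whether the SD card actually honors the discard depends on the card and driver. A minimal sketch, assuming the filesystem is mounted at the hypothetical path /mnt/sdcard:

```c
/* Minimal sketch: ask a mounted filesystem to discard (TRIM) its unused
 * blocks via the FITRIM ioctl. Mount point path is an assumption; usually
 * needs root, and the device/driver must support discard. */
#include <fcntl.h>
#include <limits.h>
#include <linux/fs.h>      /* FITRIM, struct fstrim_range */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/mnt/sdcard", O_RDONLY);   /* hypothetical mount point */
    if (fd < 0) { perror("open"); return 1; }

    struct fstrim_range range = {
        .start  = 0,
        .len    = ULLONG_MAX,   /* cover the whole filesystem */
        .minlen = 0,
    };
    if (ioctl(fd, FITRIM, &range) < 0)
        perror("FITRIM");
    else
        printf("trimmed %llu bytes\n", (unsigned long long)range.len);

    close(fd);
    return 0;
}
```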

Related

Trying to understand why freading a file over a network is so much slower over Samba than NFS

We have a program that builds a 3D model from three files hosted on a Linux file server: basically x.bin, y.bin, and z.bin. It builds the model one z level at a time, reading from each file for every "slice".
On Linux machines running this program, the first slice takes around 45 seconds, and then ~2 seconds for every "slice" after that.
On Windows, the exact same program performing the exact same operation running the exact same script and code takes 5 minutes for the first slice, and around a minute and a half each slice after that.
Reading file over network slow due to extra reads
This thread seemed to have a guy with a similar problem, but the truth is that I'm still unclear on how NFS can be faster, as well as how I can suggest a change to the actual developers as to how to improve performance. The code is OS independent, I believe it's just using C's fread, fseek, etc to read the file information over the network.
How does NFS transfer/read data that it can be 60x faster than samba?
How can I get that performance on samba?
I'm not 100% sure, as I don't know much about Samba, but my guess is that NFS supports fseek-style access, so it can just position at the next slice and return only that data, while Samba probably doesn't and has to return the full file from the server, discarding the "unused" content.
By the way, it's not the exact same program you're running; you presumably recompile it for each platform, right? So the same code gets translated into a lot of different system calls, and each platform has its own pros and cons...
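For reference, the per-slice access pattern being discussed boils down to something like the sketch below (the slice size, slice index, and buffer handling are assumptions); the open question is how the network filesystem services each fseek/fread pair on the client side.

```c
/* Minimal sketch of reading one fixed-size "slice" out of a large .bin file
 * with fseek/fread. Slice size and index are assumptions for illustration. */
#include <stdio.h>
#include <stdlib.h>

#define SLICE_BYTES (1L << 20)   /* assumed 1 MiB per z-slice */

static size_t read_slice(FILE *f, long slice_index, unsigned char *buf)
{
    if (fseek(f, slice_index * SLICE_BYTES, SEEK_SET) != 0)
        return 0;
    return fread(buf, 1, SLICE_BYTES, f);   /* short read near EOF is fine */
}

int main(void)
{
    FILE *f = fopen("x.bin", "rb");          /* x.bin as named in the question */
    if (!f) { perror("fopen"); return 1; }

    unsigned char *buf = malloc(SLICE_BYTES);
    size_t got = read_slice(f, 3, buf);      /* read only the 4th slice */
    printf("read %zu bytes\n", got);

    free(buf);
    fclose(f);
    return 0;
}
```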

Transferring 1-2 megabytes of data through regular files in Windows - is it slower than through RAM?

I'm passing 1-2 MB of data from one process to another, using a plain old file. Is it significantly slower than going through RAM entirely?
Before answering yes, please keep in mind that in modern Linux at least, when you write to a file the data is actually written to RAM first, and a daemon syncs it to disk from time to time. So if process A writes 1-2 MB into a file and process B reads it within 1-2 seconds, process B is simply reading cached memory. It gets even better than that: in Linux there is a grace period of a few seconds before a new file is flushed to the hard disk, so if the file is deleted in time, it is never written to the hard disk at all. This would make passing data through files as fast as passing it through RAM.
Now that is Linux, is it so in Windows?
Edit: Just to lay out some assumptions:
The OS is reasonably new - Windows XP or newer for desktops, Windows Server 2003 or newer for servers.
The file is significantly smaller than available RAM - let's say less than 1% of available RAM.
The file is read and deleted a few seconds after it has been written.
When you read or write a file, Windows will often keep some or all of it resident in memory (in the Standby List), so that if it is needed again, only a soft page fault is required to map it into the process's memory space.
The algorithm for which pages of a file are kept around (and for how long) isn't publicly documented. So the short answer is that if you are lucky, some or all of it may still be in memory. You can use the SysInternals tool VMMap to see how much of your file is still in memory during testing.
If you want to increase your chances of the data remaining resident, then you should use Memory Mapped Files to pass the data between the two processes.
Good reading on Windows memory management:
Mysteries of Windows Memory Management Revealed
You can use FILE_ATTRIBUTE_TEMPORARY to hint that this data is never needed on disk:
A file that is being used for temporary storage. File systems avoid writing data back to mass storage if sufficient cache memory is available, because typically, an application deletes a temporary file after the handle is closed. In that scenario, the system can entirely avoid writing the data. Otherwise, the data is written after the handle is closed.
(i.e. you need to use that flag with CreateFile, and call DeleteFile immediately after closing the handle).
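A minimal sketch of that pattern, with the path and payload size being assumptions:

```c
/* Minimal sketch: write a short-lived handoff file with
 * FILE_ATTRIBUTE_TEMPORARY and delete it once the other process has read it,
 * so the data may never be written to disk at all. Path is hypothetical. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    const wchar_t *path = L"C:\\Temp\\handoff.bin";  /* hypothetical path */
    static char payload[1 << 20];                    /* ~1 MB to hand over */
    DWORD written = 0;

    HANDLE h = CreateFileW(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                           FILE_ATTRIBUTE_TEMPORARY, NULL);
    if (h == INVALID_HANDLE_VALUE) { printf("CreateFileW failed\n"); return 1; }

    WriteFile(h, payload, sizeof payload, &written, NULL);
    CloseHandle(h);

    /* ... the consumer process reads the file here ... */

    DeleteFileW(path);   /* deleted promptly, as described in the quote above */
    return 0;
}
```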
But even if the file remains cached, you still have to copy it twice: from process A into the cache (the WriteFile call), and from the cache into process B (the ReadFile call).
Using memory mapped files (MMF, as josh poley already suggested) has the primary advantage of avoiding one copy: the same physical memory pages are mapped into both processes.
An MMF can be backed by the system paging file rather than a real file, which basically means the data stays in memory unless swapping becomes necessary.
The major downside is that you can't easily grow the memory mapping to meet changing demands; you are stuck with the initial size.
Whether that matters for a 1-2 MB data transfer depends mostly on how you acquire the data and what you do with it; in many scenarios the additional copy doesn't really matter.
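For illustration, a pagefile-backed named mapping along those lines might look like the sketch below; the mapping name and size are assumptions, and the second process would open the same name with OpenFileMappingW and map its own view of the same physical pages.

```c
/* Minimal sketch: create a named, pagefile-backed file mapping and write the
 * payload directly into the shared pages. Name and size are assumptions. */
#include <windows.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const DWORD size = 2 * 1024 * 1024;   /* assumed 2 MB payload */

    HANDLE h = CreateFileMappingW(INVALID_HANDLE_VALUE,   /* backed by the pagefile */
                                  NULL, PAGE_READWRITE,
                                  0, size,
                                  L"Local\\DemoHandoff"); /* hypothetical name */
    if (!h) { printf("CreateFileMappingW failed\n"); return 1; }

    void *view = MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, size);
    if (!view) { CloseHandle(h); return 1; }

    memcpy(view, "hello", 6);   /* producer writes straight into shared memory */

    /* Consumer process: OpenFileMappingW(FILE_MAP_READ, FALSE,
       L"Local\\DemoHandoff") followed by MapViewOfFile on that handle. */

    UnmapViewOfFile(view);
    CloseHandle(h);
    return 0;
}
```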

Why slow simulations when results saved into one directory?

Would love some help figuring out why a script is running much slower than it used to.
The script starts sequential Matlab simulations and saves each simulation's output to a file in a directory on computer #1. The script is running on computers #2, 3, and 4 which have the C: drive of computer #1 mounted as drive K:, and the computers read and write K: drive files during the simulations. Prior to starting each simulation, the script saves a 'placeholder' version of the simulation's output file which later gets overwritten with that simulation's results once the simulation is complete. The output filename is unique to that simulation. The script checks for the output file before starting a simulation; if the file is found, it goes to the next simulation. The intent is to divide up many simulations among the different computers. The directory on computer #1 has many files in it (~4000, 6GB) and computer #1 is an old windows XP machine. Computers #2-4 are also windows machines and are 2+ years old.
This scheme used to work fine, saving ~3 files per minute. Now it is taking ~15 minutes per file. What might be the leading cause for the slowdown? Could it be the number of files in the directory or the number of computers accessing computer #1? If that is unlikely, I would like to know so I can redirect my troubleshooting.
The number of items in a single directory can absolutely lead to decreased performance. I've read that the exact threshold depends on the OS, the filesystem, whether the drive is local or remote... and maybe the phase of the moon.
My personal rule of thumb is that at about 5,000 items per directory performance starts to degrade, and at about 10,000 performance has degraded enough that whatever you are doing will not work correctly anymore.
It turns out the problem was an old network switch that the various computers were plugged into. When we tried a newer switch, the script ran like lightning.
However, everyone's suggestions (subdirectories to reduce the number of files per directory; defragging computer #1, which turned out to be badly fragmented) were very helpful, and it was great to have some other eyes on the problem, so thanks.
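For anyone who does hit the directory-size problem, the subdirectory suggestion boils down to bucketing the output files so that no single directory accumulates thousands of entries; a minimal sketch, with the hash, bucket count, and naming scheme all being assumptions:

```c
/* Minimal sketch: derive a small bucket number from the output file's name so
 * results are spread over e.g. 64 subdirectories instead of one flat folder.
 * Hash choice, bucket count, and path layout are assumptions. */
#include <stdio.h>

#define BUCKETS 64

static unsigned bucket_for(const char *name)
{
    unsigned h = 5381;                        /* djb2 string hash */
    for (; *name; name++)
        h = h * 33 + (unsigned char)*name;
    return h % BUCKETS;
}

int main(void)
{
    const char *sim = "run_01234_caseA";      /* hypothetical simulation name */
    char path[256];
    snprintf(path, sizeof path, "K:/results/%02u/%s.mat", bucket_for(sim), sim);
    printf("%s\n", path);                     /* both placeholder and final output go here */
    return 0;
}
```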

Read files by device/inode order?

I'm interested in an efficient way to read a large number of files from disk. I want to know whether sorting the files by device and then by inode will give me some speed improvement over reading them in their natural order.
There are vast speed improvements to be had from reading files in physical order from rotating storage. Operating system I/O scheduling mechanisms only do any real work if there are several processes or threads contending for I/O, because they have no information about what files you plan to read in the future. Hence, other than simple read-ahead, they usually don't help you at all.
Furthermore, Linux worsens your access patterns during directory scans by returning directory entries to user space in hash table order rather than physical order. Luckily, Linux also provides system calls to determine the physical location of a file, and whether or not a file is stored on a rotational device, so you can recover some of the losses. See for example this patch I submitted to dpkg a few years ago:
http://lists.debian.org/debian-dpkg/2009/11/msg00002.html
This patch does not incorporate a test for rotational devices, because this feature was not added to Linux until 2012:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ef00f59c95fe6e002e7c6e3663cdea65e253f4cc
I also used to run a patched version of mutt that would scan Maildirs in physical order, usually giving a 5x-10x speed improvement.
Note that inodes are small, heavily prefetched and cached, so opening files to get their physical location before reading is well worth the cost. It's true that common tools like tar, rsync, cp and PostgreSQL do not use these techniques, and the simple truth is that this makes them unnecessarily slow.
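A minimal sketch of that approach, using the inode number returned by readdir() as a cheap proxy for on-disk order (the directory path is an assumption; the physical-location interfaces mentioned above, such as the FIEMAP ioctl, give a more accurate sort key at the cost of extra calls):

```c
/* Minimal sketch: list a directory, sort the entries by inode number, then
 * process them in that order. Directory path is hypothetical. */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

struct entry { ino_t ino; char name[256]; };

static int by_inode(const void *a, const void *b)
{
    ino_t ia = ((const struct entry *)a)->ino;
    ino_t ib = ((const struct entry *)b)->ino;
    return (ia > ib) - (ia < ib);
}

int main(void)
{
    DIR *d = opendir("Maildir/cur");          /* hypothetical directory */
    if (!d) { perror("opendir"); return 1; }

    struct entry *v = NULL;
    size_t n = 0, cap = 0;
    struct dirent *de;
    while ((de = readdir(d)) != NULL) {
        if (de->d_name[0] == '.') continue;   /* skip hidden entries and . / .. */
        if (n == cap)
            v = realloc(v, (cap = cap ? cap * 2 : 64) * sizeof *v);
        v[n].ino = de->d_ino;
        snprintf(v[n].name, sizeof v[n].name, "%s", de->d_name);
        n++;
    }
    closedir(d);

    qsort(v, n, sizeof *v, by_inode);         /* read the files in this order */
    for (size_t i = 0; i < n; i++)
        printf("%lu\t%s\n", (unsigned long)v[i].ino, v[i].name);

    free(v);
    return 0;
}
```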
Back in the 1970s I proposed to our computer center that reading from and writing to disk would be faster overall if they organized the queue of disk reads and writes so as to minimize seek time. I was told that, according to their experiments and information from IBM, many studies had been made of several techniques, and that the overall throughput of jobs (not just a single job) was best if disk reads and writes were done in first-come, first-served order. This was an IBM batch system.
In general, optimisation techniques for file access are too tied to the architecture of your storage subsystem for them to be something as simple as a sorting algorithm.
1) You can effectively multiply the read data rate if your files are spread across multiple physical drives (not just partitions) and you read two or more files in parallel from different drives. This is probably the only method that is easy to implement (a minimal sketch follows after this list).
2) Sorting the files by name or inode number does not really change anything in the general case. What you'd want is to sort the files by the physical location of their blocks on the disk, so that they can be read with minimal seeking. There are quite a few obstacles however:
Most filesystems do not provide such information to userspace applications, unless it's for debugging reasons.
The blocks of each file can themselves be spread all over the disk, especially on a mostly full filesystem, so there is no way to read multiple files sequentially without seeking back and forth.
You are assuming that your process is the only one accessing the storage subsystem. Once there is at least someone else doing the same, every optimisation you come up with goes out of the window.
You are trying to be smarter than the operating system and its own caching and I/O scheduling mechanisms. It's very likely that by trying to second-guess the kernel, i.e. the only one that really knows your system and your usage patterns, you will make things worse.
Don't you think that e.g. PostgreSQL or Oracle would have used a similar technique if they could? When the DB is installed on a proper filesystem, they let the kernel do its thing and don't try to second-guess its decisions. Only when the DB is on a raw device do the specialised optimisation algorithms that take physical blocks into account come into play.
You should also take the specific properties of your storage devices into account. Modern SSDs, for example, make traditional seek-time optimisations obsolete.
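As referenced in point 1 above, reading from two drives in parallel can be as simple as one thread per file; a minimal sketch, assuming the two hypothetical paths really live on different physical drives:

```c
/* Minimal sketch of point 1: read two files that sit on different physical
 * drives concurrently, one thread per file. Paths are assumptions. */
#include <pthread.h>
#include <stdio.h>

static void *read_whole(void *arg)
{
    const char *path = arg;
    FILE *f = fopen(path, "rb");
    if (!f) { perror(path); return NULL; }

    char buf[1 << 16];
    size_t total = 0, n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += n;                       /* consume/process the data here */
    fclose(f);
    printf("%s: %zu bytes\n", path, total);
    return NULL;
}

int main(void)
{
    /* hypothetical files on two different physical drives */
    const char *paths[2] = { "/mnt/disk1/a.dat", "/mnt/disk2/b.dat" };
    pthread_t t[2];

    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, read_whole, (void *)paths[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```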

How to read individual sectors/clusters using DeviceIoControl() in Windows?

I dropped my laptop while Windows was preparing to hibernate, and as a result I got a head crash on the hard drive. (Teaches me to get a hard drive and/or laptop with a freefall sensor next time around.) Anyway, running SpinRite to try to recover the data has used up all of the disk's spare sectors on the sectors recovered so far. SpinRite is still going right now, but since there won't be any more spare sectors to use, I think it will be a fruitless exercise except to tell me where all the bad sectors are.
Anyway, I'm planning on writing an application to try to salvage data from the hard drive. From my past forays into defragging, I know that I can use FSCTL_GET_RETRIEVAL_POINTERS to figure out the logical cluster numbers for any given file.
How do I go about trying to read the sectors for that actual cluster? My digging through MSDN's listing for Disk, File, and Volume device control codes hasn't had anything jump out at me as the way I get to the actual cluster data.
Should I not even bother trying to read at that low level? Should I instead be doing SetFilePointer() and ReadFile() calls to get to the appropriate cluster sized offsets into the file and read cluster sized chunks?
If the file I'm trying to read has a bad sector, will NTFS mark the entire file as bad and prevent me from accessing the file in the future? If so how do I tell NTFS not to mark the file as bad or dead? (Remember that the HD is now out of spare sectors to be remapped.)
Should I dust off my *nix knowledge and figure out how to read from /dev/ ?
Update: I found the answer to my own question. :-) The solution is doing SetFilePointer() and ReadFile() on the volume handle rather than on the file handle.
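A minimal sketch of that approach follows; the drive letter, cluster size, and cluster number are assumptions, and reading a volume handle generally requires administrator rights plus offsets and lengths that are multiples of the sector size.

```c
/* Minimal sketch: open the volume handle and read one cluster at a
 * sector-aligned offset computed from its logical cluster number (LCN).
 * Drive letter, cluster size, and LCN are assumptions. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    const DWORD cluster_bytes = 4096;     /* assumed bytes per cluster */
    const LONGLONG lcn = 123456;          /* hypothetical cluster from FSCTL_GET_RETRIEVAL_POINTERS */

    HANDLE h = CreateFileW(L"\\\\.\\C:", GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE,
                           NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) { printf("CreateFileW failed\n"); return 1; }

    LARGE_INTEGER off;
    off.QuadPart = lcn * cluster_bytes;   /* byte offset from the start of the volume */
    if (!SetFilePointerEx(h, off, NULL, FILE_BEGIN)) { CloseHandle(h); return 1; }

    static BYTE buf[4096];                /* length must be a multiple of the sector size */
    DWORD got = 0;
    if (ReadFile(h, buf, cluster_bytes, &got, NULL))
        printf("read %lu bytes of cluster %lld\n", (unsigned long)got, (long long)lcn);
    else
        printf("ReadFile failed: %lu\n", (unsigned long)GetLastError());

    CloseHandle(h);
    return 0;
}
```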
