Read, write, and... 'O'? - performance

I have a disk activity monitoring tool (specifically Process Hacker) that is showing my program doing 0 kB/s of R(eading), 0 kB/s of W(riting), and 20 kB of 'O'. What's 'O'? Overwriting? Oopdating? Odorizing?

Related

Double the RAM consumed than the file size on disk

I am opening a huge txt file (~350 MB) in Notepad++.
When I monitor the Private Bytes consumed by Notepad++ before and after opening the file, I see approx. 730 MB consumed, whereas the file being opened is only around 350 MB.
I am not trying to point out a problem with Notepad++, as I see the same memory consumption when I write this file's data to my MFC CEditCtrl too. I need to know why this behavior occurs.
PS: I monitor Private Bytes using Process Explorer software.

file system recognition on Windows and BIOS parameter block position

I am writing a boot sector for the FAT12 file system for a floppy disk.
Code #1:
start: jmp main
TIMES 3 - ($-$$) DB 0
OEMname: DB '12345678'
;rest of the BPB information below
Code #2:
TIMES 3 - ($-$$) DB 0
OEMname: DB '12345678'
;rest of the BPB information below
In both cases, OEMname starts at byte 3 (as shown in the assembler listing).
When the boot sector is written to the disk, Windows recognizes the partition when code #1 is used, but fails to recognize it when code #2 is used, i.e. it complains that the drive is not formatted.
Why does Windows succeed in identifying the file system in one case but not the other?

What is the Faults column in 'top'?

I'm trying to download Xcode (on OS X El Capitan) and it seems to be stuck. When I run 'top', I see a process called 'storedownloadd', and its "STATE" column is alternating between sleeping, stuck, and running. The 'FAULTS' column shows a quickly increasing number with a plus sign after it; it is now over 400,000 and still growing. Other than 'top', I see no sign of activity from the download. Does this indicate that something is amiss? Here's a screen shot:
Processes: 203 total, 2 running, 10 stuck, 191 sleeping, 795 threads 11:48:14
Load Avg: 4.72, 3.24, 1.69 CPU usage: 56.54% user, 6.41% sys, 37.3% idle SharedLibs: 139M resident, 19M data, 20M linkedit. MemRegions: 18620 total, 880M resident, 92M private, 255M shared. PhysMem: 7812M used (922M wired), 376M unused.
VM: 564G vsize, 528M framework vsize, 0(0) swapins, 512(0) swapouts. Networks: packets: 122536/172M in, 27316/2246K out. Disks: 78844/6532M read, 240500/6746M written.
PID COMMAND %CPU TIME #TH #WQ #PORT MEM PURG CMPRS PGRP PPID STATE BOOSTS %CPU_ME %CPU_OTHRS UID FAULTS COW MSGSENT MSGRECV SYSBSD SYSMACH
354 storedownloadd 0.3 00:47.58 16 5 200 255M 0B 0B 354 1 sleeping *3[1] 155.53838 0.00000 501 412506+ 54329 359852+ 6620+ 2400843+ 1186426+
57 UserEventAgent 0.0 00:00.35 22 17 378 4524K+ 0B 0B 57 1 sleeping *0[1] 0.23093 0.00000 0 7359+ 235 15403+ 7655+ 24224+ 17770
384 Terminal 3.3 00:12.02 10 4 213 34M+ 12K 0B 384 1 sleeping *0[42] 0.11292 0.04335 501 73189+ 482 31076+ 9091+ 1138809+ 72076+
When top reports back FAULTS it's referring to "page faults", which are more specifically:
The number of major page faults that have occurred for a task. A page fault occurs when a process attempts to read from or write to a virtual page that is not currently present in its address space. A major page fault is when disk access is involved in making that page available.
If an application tries to access an address on a memory page that is not currently in physical RAM, a page fault occurs. When that happens, the virtual memory system invokes a special page-fault handler to respond to the fault immediately. The page-fault handler stops the code from executing, locates a free page of physical memory, loads the page containing the data needed from disk, updates the page table, and finally returns control to the program — which can then access the memory address normally. This process is known as paging.
Minor page faults can be common, depending on the code being executed and the current memory availability on the system. There are also different levels to be aware of (minor, major, invalid), which are described in more detail at the links below.
↳ Apple : About The Virtual Memory System
↳ Wikipedia : Page Fault
↳ Stackoverflow.com : page-fault

SCSI Write Buffer command "Download microcode with offset and save" vs "Download microcode with save" mode

I want to use Write Buffer SCSI command to upload a firmware of a tape drive (LTO-6).
As described in the IBM LTO SCSI Reference, section "5.2.41.6: MODE[07h] – Download microcode with offsets, save, and activate", microcode is transferred to the device using one or more WRITE BUFFER commands and saved to nonvolatile storage (Page 180).
According to the CDB (Page 132), the Buffer Offset is expressed in 3 bytes, as is the Parameter List Length.
As I understand it, you may want to use more than one Write Buffer command when the firmware size can't be expressed in 3 bytes (more than about 16 MB), and if so you can use the offset for that. But if the offset itself is limited to 3 bytes, one can't write at offset 17 MB, for example (and therefore can't use this command more than twice in a row).
Does anybody know if this is the real use of the "offset and save" mode?
You can use mode 07h (Section 5.2.17.4), where the Write Buffer command uses a shifted offset, and thus you can express offsets larger than 16 MB.
It looks like one can't upload more than 32 MB into the firmware buffer, and what was meant by two or more Write Buffer commands is to issue them with a smaller value than the maximum (16 MB) if you have a DMA (Direct Memory Access) limitation.
One can use the interpretation mentioned by Baruch Even with the Read Buffer command in mode 07h (it's not supported by all Buffer IDs; one can check by issuing Read Buffer with mode 07h, which returns an illegal-request error if unsupported).
On the other hand, the Write Buffer command sections show no such interpretation for any of the modes.

Win32: Write to file without buffering?

I need to create a new file handle so that any write operations to that handle get written to disk immediately.
Extra info: The handle will be the inherited STDOUT of a child process, so I need any output from that process to immediately be written to disk.
Studying the CreateFile documentation, the FILE_FLAG_WRITE_THROUGH flag looked like exactly what I need:
Write operations will not go through any intermediate cache, they will go directly to disk.
I wrote a very basic test program and, well, it's not working.
I used the flag on CreateFile then used WriteFile(myHandle,...) in a long loop, writing about 100MB of data in about 15 seconds. (I added some Sleep()'s).
I then set up a professional monitoring environment consisting of continuously hitting 'F5' in explorer. The results: the file stays at 0kB then jumps to 100MB about the time the test program ends.
Next thing I tried was to manually flush the file after each write, with FlushFileBuffers(myHandle). This makes the observed file size grow nice and steady, as expected.
My question is, then, shouldn't the FILE_FLAG_WRITE_THROUGH have done this without manually flushing the file? Am I missing something?
In the 'real world' program, I can't flush the file, 'cause I don't have any control over the child process that's using it.
There's also the FILE_FLAG_NO_BUFFERING flag, which I can't use for the same reason - no control over the process that's using the handle, so I can't manually align the writes as required by this flag.
EDIT:
I have made a separate project specifically for watching how the size of the file changes. It uses the .NET FileSystemWatcher class. I also write less data - around 100kB in total.
Here's the output. Check out the seconds in the timestamps.
The 'builtin no-buffers' version:
25.11.2008 7:03:22 PM: 10230 bytes added.
25.11.2008 7:03:31 PM: 10240 bytes added.
25.11.2008 7:03:31 PM: 10240 bytes added.
25.11.2008 7:03:31 PM: 10240 bytes added.
25.11.2008 7:03:31 PM: 10200 bytes added.
25.11.2008 7:03:42 PM: 10240 bytes added.
25.11.2008 7:03:42 PM: 10240 bytes added.
25.11.2008 7:03:42 PM: 10240 bytes added.
25.11.2008 7:03:42 PM: 10240 bytes added.
25.11.2008 7:03:42 PM: 10190 bytes added.
... and the 'forced (manual) flush' version (FlushFileBuffers() is called every ~2.5 seconds):
25.11.2008 7:06:10 PM: 10230 bytes added.
25.11.2008 7:06:12 PM: 10230 bytes added.
25.11.2008 7:06:15 PM: 10230 bytes added.
25.11.2008 7:06:17 PM: 10230 bytes added.
25.11.2008 7:06:19 PM: 10230 bytes added.
25.11.2008 7:06:21 PM: 10230 bytes added.
25.11.2008 7:06:23 PM: 10230 bytes added.
25.11.2008 7:06:25 PM: 10230 bytes added.
25.11.2008 7:06:27 PM: 10230 bytes added.
25.11.2008 7:06:29 PM: 10230 bytes added.
I've been bitten by this, too, in the context of crash logging.
FILE_FLAG_WRITE_THROUGH only guarantees that the data you're sending gets sent to the filesystem before WriteFile returns; it doesn't guarantee that it's actually sent to the physical device. So, for example, if you execute a ReadFile after a WriteFile on a handle with this flag, you're guaranteed that the read will return the bytes you wrote, whether it got the data from the filesystem cache or from the underlying device.
If you want to guarantee that the data has been written to the device, then you need FILE_FLAG_NO_BUFFERING, with all the attendant extra work. Those writes have to be aligned, for example, because the buffer is going all the way down to the device driver before returning.
The Knowledge Base has a terse but informative article on the difference.
In your case, if the parent process is going to outlive the child, then you can:
Use the CreatePipe API to create an inheritable, anonymous pipe.
Use CreateFile to create a file with FILE_FLAG_NO_BUFFERING set.
Provide the writable handle of the pipe to the child as its STDOUT.
In the parent process, read from the readable handle of the pipe into aligned buffers, and write them to the file.
This is an old question, but I thought I might add a bit to it. Actually, I believe everyone here is wrong. When you write to a stream with write-through and unbuffered I/O, it does write to the disk, but it does NOT update the metadata associated with the file system (e.g. what Explorer shows you).
You can find a good reference on this kind of stuff here http://winntfs.com/2012/11/29/windows-write-caching-part-2-an-overview-for-application-developers/
Cheers,
Greg
Perhaps you could be satisfied enough with FlushFileBuffers:
Flushes the buffers of a specified file and causes all buffered data to be written to a file.
Typically the WriteFile and WriteFileEx functions write data to an internal buffer that the operating system writes to a disk or communication pipe on a regular basis. The FlushFileBuffers function writes all the buffered information for a specified file to the device or pipe.
They do warn that calling FlushFileBuffers after every write is inefficient - and that it's better to just disable caching (i.e. Tim's answer):
Due to disk caching interactions within the system, the FlushFileBuffers function can be inefficient when used after every write to a disk drive device when many writes are being performed separately. If an application is performing multiple writes to disk and also needs to ensure critical data is written to persistent media, the application should use unbuffered I/O instead of frequently calling FlushFileBuffers. To open a file for unbuffered I/O, call the CreateFile function with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags. This prevents the file contents from being cached and flushes the metadata to disk with each write. For more information, see CreateFile.
If it's not a high-performance situation, and you won't be flushing too frequently, then FlushFileBuffers might be sufficient (and easier).
The size you're looking at in Explorer may not be entirely in-sync with what the file system knows about the file, so this isn't the best way to measure it. It just so happens that FlushFileBuffers will cause the file system to update the information that Explorer is looking at; closing it and reopening may end up doing the same thing as well.
Aside from the disk caching issues mentioned by others, write through is doing what you were hoping it is doing. It's just that doing a 'dir' in the directory may not show up-to-date information.
Answers suggesting that write-through only writes it "to the file system" are not quite right. It does write it into the file system cache, but it also sends the data down to the disk. Write-through might mean that a subsequent read is satisfied from the cache, but it doesn't mean that we skipped a step and aren't writing it to the disk. Read the article's summary very carefully. This is a confusing bit for just about everyone.
Perhaps you want to consider memory-mapping the file. As soon as you write to the memory-mapped region, the file gets updated.
Win API File Mapping
