How to identify bad sectors when reading a disk - winapi

I want to check for bad sectors while reading a disk. I am reading 64 sectors at a time, and some of the 64 sectors in a read may be bad. I tried to detect this by calling
ReadFile(HANDLE,LPVOID,NoOfBytestoRead,NoOfBytesRead,NULL);
and comparing the number of bytes actually read (NoOfBytesRead) with the number of bytes requested (NoOfBytestoRead). Both values always come out equal, but I expected that if there are unreadable sectors, NoOfBytesRead would be less than NoOfBytestoRead.
Can anybody help me out: how can I detect the bad sectors while reading 64 sectors at once?
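One possible approach (a hedged sketch, not from the original thread): on a raw disk handle a bad sector normally makes ReadFile fail outright, returning FALSE with GetLastError() reporting something like ERROR_CRC, ERROR_SECTOR_NOT_FOUND or ERROR_IO_DEVICE, rather than returning a short byte count. So when the 64-sector read fails, you can re-read the same range one sector at a time to localize the bad sectors. The sector size, chunk size and helper names below are assumptions for illustration.

#include <windows.h>
#include <stdio.h>

#define SECTOR_SIZE        512
#define SECTORS_PER_CHUNK  64

/* Seek to a sector and try to read `count` sectors into buf. */
static BOOL read_at(HANDLE disk, LONGLONG sector, DWORD count, void *buf)
{
    LARGE_INTEGER pos;
    DWORD want = count * SECTOR_SIZE, got = 0;
    pos.QuadPart = sector * SECTOR_SIZE;
    if (!SetFilePointerEx(disk, pos, NULL, FILE_BEGIN))
        return FALSE;
    return ReadFile(disk, buf, want, &got, NULL) && got == want;
}

/* Read one 64-sector chunk; if it fails, retry sector by sector to find
   which sectors are unreadable. buf must hold SECTORS_PER_CHUNK sectors
   (and must be sector-aligned if the handle uses FILE_FLAG_NO_BUFFERING). */
static void scan_chunk(HANDLE disk, LONGLONG firstSector, BYTE *buf)
{
    DWORD i;
    if (read_at(disk, firstSector, SECTORS_PER_CHUNK, buf))
        return;                                   /* whole chunk readable */
    for (i = 0; i < SECTORS_PER_CHUNK; ++i)
        if (!read_at(disk, firstSector + i, 1, buf))
            printf("bad sector %lld (Win32 error %lu)\n",
                   firstSector + i, GetLastError());
}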

Related

What may cause a limit on SG_IO ioctl maximum sector count of a transfer?

I need to pass a direct ATA request (0x25, READ DMA EXT) to a hard drive, to get around the usual maximum sector count per transfer (long story), and to bypass all possible OS caches, buffers, reorderings and so on.
The HDIO_DRIVE_TASKFILE IOCTL is no longer available because of libata.
I accomplished the goal with an SG_IO IOCTL using ATA pass-through (SG_ATA_16). It works perfectly except for one problem: I can read a maximum of 8192 sectors in one command, but I need to read a full 32767 sectors.
max_hw_sectors_kb is 32767, so the drive supports a transfer this large.
max_sectors_kb was low, so I raised it to 32767 as well, to no avail.
The scheduler is set to noop; no change.
I tried a gather buffer (iovec_count > 0, with the iovecs properly set to consecutive slices of the buffer); no change.
Environment: Ubuntu 16.04/16.10/17.04 with standard kernels, SATA drive connected to standard AHCI interface on Intel chipset.
No matter what I do, starting at 8193 sectors the IOCTL bails out with an "Invalid argument" (EINVAL) error.
Where to look? What else can cause a 4MB data transfer cap?
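For reference, here is roughly what such a request looks like. This is a minimal, hedged sketch of an SG_IO ATA pass-through read, assuming a 512-byte logical sector size and the SAT "ATA PASS-THROUGH (16)" CDB layout; it is not the asker's exact code.

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

/* Issue READ DMA EXT (0x25) through an ATA PASS-THROUGH (16) CDB (0x85).
   fd is an open /dev/sdX descriptor; buf must hold nsectors * 512 bytes. */
static int read_dma_ext(int fd, uint64_t lba, uint16_t nsectors, void *buf)
{
    unsigned char cdb[16] = {0};
    unsigned char sense[32] = {0};
    struct sg_io_hdr io;

    cdb[0]  = 0x85;                    /* ATA PASS-THROUGH (16) */
    cdb[1]  = (6 << 1) | 1;            /* protocol = DMA, EXTEND = 1 (48-bit command) */
    cdb[2]  = 0x0e;                    /* T_DIR = from device, BYT_BLOK = 1, T_LENGTH = sector count */
    cdb[5]  = (nsectors >> 8) & 0xff;  /* sector count 15:8 */
    cdb[6]  =  nsectors       & 0xff;  /* sector count 7:0  */
    cdb[7]  = (lba >> 24) & 0xff;      /* LBA 31:24 */
    cdb[8]  =  lba        & 0xff;      /* LBA 7:0   */
    cdb[9]  = (lba >> 32) & 0xff;      /* LBA 39:32 */
    cdb[10] = (lba >> 8)  & 0xff;      /* LBA 15:8  */
    cdb[11] = (lba >> 40) & 0xff;      /* LBA 47:40 */
    cdb[12] = (lba >> 16) & 0xff;      /* LBA 23:16 */
    cdb[13] = 0x40;                    /* device register: LBA mode */
    cdb[14] = 0x25;                    /* READ DMA EXT */

    memset(&io, 0, sizeof(io));
    io.interface_id    = 'S';
    io.cmd_len         = sizeof(cdb);
    io.cmdp            = cdb;
    io.dxfer_direction = SG_DXFER_FROM_DEV;
    io.dxfer_len       = (unsigned int)nsectors * 512;
    io.dxferp          = buf;
    io.mx_sb_len       = sizeof(sense);
    io.sbp             = sense;
    io.timeout         = 30000;        /* milliseconds */

    if (ioctl(fd, SG_IO, &io) < 0) {
        perror("SG_IO");
        return -1;
    }
    if (io.status || io.host_status || io.driver_status)
        return -1;                     /* device or transport reported an error */
    return 0;
}

In the setup described above, the ioctl() call itself is what fails with "Invalid argument" once the requested transfer exceeds the 8192-sector cap.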

Files take up more space on the disk

When viewing the details of a file in Finder, two different values are shown for how much space the file occupies. For example, a file is 28.8 KB in size but takes up 33 KB on disk. Does anyone know the explanation?
Disk space is allocated in blocks. Meaning, in multiples of a "block size".
For example, on my system a 1 byte file is 4096 bytes on disk.
That's 1 byte of content & 4095 bytes of unused space.
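That rounding explains the numbers in the question: with 4,096-byte allocation blocks, a file of roughly 28,800 bytes needs 8 blocks, i.e. 32,768 bytes reserved on disk, which Finder displays as about 33 KB. On macOS (and other Unix-like systems) you can see both numbers with stat(); the snippet below is an illustrative sketch and the file name is an assumption.

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;
    if (stat("example.txt", &st) != 0) {   /* any existing file path */
        perror("stat");
        return 1;
    }
    /* st_size is the logical length; st_blocks counts 512-byte units
       actually reserved for the file, i.e. the "size on disk". */
    printf("logical size : %lld bytes\n", (long long)st.st_size);
    printf("size on disk : %lld bytes\n", (long long)st.st_blocks * 512);
    return 0;
}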

Calculating Cache Memory Hit and Miss, and Calculating Rows in Cache

I am studying an old exam in preparation for an upcoming one, and the final questions are about what the title describes. I am familiar with assembly-language instructions and I roughly understand what the code means, but what the exam question actually wants me to do is confusing. I would really appreciate it if someone could explain this question.
The question:
I am given a cache-memory which has room for 512 bytes and every row is 8 bytes long. The memory is direct-mapped and an "address" is 32 bits long. Also, the cache-memory is empty from the start.
After that, I am given some instructions and am supposed to explain whether each one results in a cache hit or a cache miss. It should also be assumed that the instructions are sequential and that any data added or modified by one instruction is still there for the next instruction.
The instructions I get are
movia r8, 0xBEDA12C4
ldw r10, 0( r8 )
ldw r11, 8( r8 )
stw r10, 16( r8 )
ldw r10, 24(r8)
ldw r18, 32(r8)
Now I would really appreciate if someone could explain the details to me:
The cache memory has room for a total of 512 bytes. What is this? Is it the total amount of data the cache is able to store? Also, I heard somewhere that this is how you calculate the number of rows in the cache: for example, with 512 bytes of memory and 16-byte rows, 512/16 = 32 rows; for this example, 512/8 = 64 rows. Which one is it? What does this mean?
It also states that every row is 8 bytes long. I've seen the example with TAG, ROW, BYTE where they try to illustrate the cache, but how do I interpret the 8 bytes per row? It doesn't seem to be part of the TAG, ROW, BYTE breakdown. What is it for?
Direct-mapped cache. I understand this somewhat. It's just a big row of slots, in order, each of which is either empty or not, yeah? I found some information on this here.
http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Memory/direct.html
*Updated link: https://web.archive.org/web/20150213025748/http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Memory/direct.html
Now to the main part: how do I calculate, for each instruction, whether it will be a cache miss or a hit? My guess is that the first instruction ought to be a miss, since the question says the cache memory is empty at the start. The second one must also be a cache miss, but from that point on I am not sure how to work out whether an instruction generates a cache hit or a miss. To be honest, I am not even sure what a hit would be.
I would really appreciate if someone could show me how to calculate each step and how I know whether an instruction creates a cache hit or miss. The instructions we get for calculating this are really confusing. Thank you so much!
Generally you have to look at the cache as a separate memory space of only 512 bytes, addressable, readable and writable as arrays (rows) of 8 bytes each. If you need byte 2, the row address is 0: you read the whole row and select byte 2 from it. If you need byte 8, the row address is 1, and you select byte 0 of that row. Such a small memory has one huge advantage: it is fast. On its own it can only hold the contents of a small part of some larger memory space, say its first 512 bytes. If you store something to address 1 of the larger memory space, it goes to that smaller memory instead, internally at row 0, offset 1. If you access something beyond that range, for example address 1000, you have to wait longer. Used that way it is really just a set of memory-mapped "registers", which in some cases would actually be faster and better than a cache; unfortunately, processor makers generally won't let you use the cache like that (probably for marketing and support reasons, to sell other products as a separate market segment at a higher price).
If you add some extra space to each row to store another value, you can keep part of an address there; without hardware support you could store virtually anything there. That extra part is called the tag. Now if you have some address such as 0xFFFFF000, you can read the corresponding row of the small memory: for simplicity and speed you derive the row number from the main-memory address by masking off all bits except bits 3..8 (bits 0..2 give the offset within the 8-byte row), and then you check the tag stored at that row. One bit of the tag can indicate whether anything is stored there at all; the other bits hold the upper part of the main-memory address. To cache something, you set the bit indicating that the row is not empty, store the upper bits of the main-memory address in the tag, and copy the 8 bytes from main memory. The next time you want to read something in that range, you read the tag of that row first and then decide whether to read from slow main memory or from the smaller but faster copy (a cache hit).
If you then write to an address that maps to the same row but lies some multiple of 512 bytes away in main memory, you have to write the 8 bytes already held in that row back to main memory, put your new data into the very same row, and update the tag with the new address. In doing so you lose the previous copy of your data in the smaller (but faster) memory. If you need the previous value again (any of those 8 bytes), you have to copy it from main memory once more (a cache miss).
The same goes for every other row of that "cache" memory, so what you get is a sequence of tag checks, reads, writes and copies of data to and from main memory.
That is 1-way associativity (direct mapping). For 2 ways there would be a second, identical array of 512 bytes, which can hold rows from different addresses (again at a stride of 512 bytes in main memory); the tags of the two arrays can be checked simultaneously, and if either array holds a copy of the requested range it can be returned instead of reading from main memory. Without the tag check (which costs extra cycles), the "cache" would essentially just be a small, fast memory.
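As a concrete illustration of that bit masking (an added sketch, not part of the original answer): with 512 bytes of cache and 8-byte rows there are 64 rows, so a 32-bit address splits into 3 offset bits, 6 index bits and a 23-bit tag. The small program below decomposes the five data accesses from the question (r8 = 0xBEDA12C4, displacements 0 to 32).

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t base = 0xBEDA12C4;
    int disp[] = { 0, 8, 16, 24, 32 };
    int i;

    for (i = 0; i < 5; ++i) {
        uint32_t addr   = base + disp[i];
        uint32_t offset = addr & 0x7;          /* bits 2..0: byte within the 8-byte row */
        uint32_t index  = (addr >> 3) & 0x3F;  /* bits 8..3: which of the 64 rows */
        uint32_t tag    = addr >> 9;           /* remaining 23 bits */
        printf("addr 0x%08X -> tag 0x%06X index %2u offset %u\n",
               addr, tag, index, offset);
    }
    return 0;
}

Each access lands in a different row (indexes 24 through 28) and the cache starts out empty, so under these assumptions all five accesses would be cold misses.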

Page boundaries, implementing memory pool

I have decided to reinvent the wheel for a millionth time and write my own memory pool. My only question is about page size boundaries.
Let's say a GetSystemInfo() call tells me that the page size is 4096 bytes. Now, I want to preallocate a memory area of 1 MB (it could be smaller or larger) and divide this area into 128-byte blocks. HeapAlloc()/VirtualAlloc() will have an overhead of between 8 and 16 bytes, I guess; it might be more, and I've read posts talking about 60 bytes.
The question is: do I need to take care that none of my 128-byte blocks crosses a page boundary?
Do I simply allocate 1MB in one chunk and divide it into my block size?
Or should I allocate many chunks of, say, 4000 bytes each (to take the HeapAlloc() overhead into account), sub-divide each 4000-byte chunk into 128-byte blocks (4000 / 128 = 31 blocks of 128 bytes each), and not use the remaining bytes at all (4000 - 31 × 128 = 32 bytes in this example)?
Having a block cross a page boundary isn't a huge deal. It just means that if you try to access that block and it's completely swapped out, you'll get two page faults instead of one. The more important thing to worry about is the alignment of the block.
If you're using your small block to hold a structure that contains native types longer than 1 byte, you'll want to align it, otherwise you face potentially abysmal performance that will outweigh any performance gains you may have made by pooling.
The Windows kernel pool allocator ExAllocatePool documents its behaviour as follows:
If NumberOfBytes is PAGE_SIZE or greater, a page-aligned buffer is allocated. Memory allocations of PAGE_SIZE or less do not cross page boundaries. Memory allocations of less than PAGE_SIZE are not necessarily page-aligned but are aligned to 8-byte boundaries in 32-bit systems and to 16-byte boundaries in 64-bit systems.
That's probably a reasonable model to follow.
I'm generally of the opinion that larger is better when it comes to a pool, within reason of course, and depending on how you are going to use it. I don't see anything wrong with allocating 1 MB at a time (I've made pools that grow in 100 MB chunks). You want it to be worthwhile to have the pool in the first place; that is, have enough data in the same contiguous region of memory that you can take full advantage of cache locality.
I've found out that if I use _aligned_malloc(), I don't need to worry about whether spreading a sub-block across two pages makes any difference or not. An answer by Freddie to another thread (How to Allocate memory from a new virtual page in C?) also helped. Thanks Harry Johnston, I just wanted to use it as a memory pool object.
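For what it's worth, here is a minimal sketch of the scheme discussed above: one large VirtualAlloc() region (so there is no per-block heap header to work around) chopped into fixed-size blocks threaded onto an intrusive free list. The pool size, block size and function names are illustrative assumptions.

#include <windows.h>

#define POOL_BYTES  (1024 * 1024)   /* one 1 MB region, as in the question */
#define BLOCK_BYTES 128             /* must be at least sizeof(void *) */

typedef struct pool {
    void *base;                     /* start of the committed region */
    void *head;                     /* head of the intrusive free list */
} pool;

static BOOL pool_init(pool *p)
{
    char *cur;
    p->base = VirtualAlloc(NULL, POOL_BYTES, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (!p->base)
        return FALSE;
    /* Thread every block onto the free list; a free block stores the pointer
       to the next free block in its first bytes. VirtualAlloc returns
       page-aligned memory, so 128-byte blocks are naturally aligned and never
       straddle a 4096-byte page boundary. */
    p->head = NULL;
    for (cur = (char *)p->base; cur + BLOCK_BYTES <= (char *)p->base + POOL_BYTES; cur += BLOCK_BYTES) {
        *(void **)cur = p->head;
        p->head = cur;
    }
    return TRUE;
}

static void *pool_alloc(pool *p)
{
    void *blk = p->head;
    if (blk)
        p->head = *(void **)blk;    /* pop the free-list head */
    return blk;
}

static void pool_free(pool *p, void *blk)
{
    *(void **)blk = p->head;        /* push back onto the free list */
    p->head = blk;
}

Releasing the whole pool at once is then a single VirtualFree(p->base, 0, MEM_RELEASE) call.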

Is WriteFile atomic?

I'm designing a system that will write time-series data to a file. The data consists of 8-byte blocks divided into two 4-byte parts, time and payload.
According to MSDN the WriteFile function is atomic ( http://msdn.microsoft.com/en-us/library/aa365747(VS.85).aspx ), if the data written is less than a sector in size.
Since the file will only contain these blocks, appended one after another (the file has no other "structure", so a damaged file cannot be reconstructed), it is vital that either the whole block or nothing at all is written to the file at all times.
So the question is: have I understood correctly that a WriteFile of less than a sector in size is always either written completely to disk or not written at all, no matter what happens during the actual call to WriteFile?
WriteFile is atomic as long as the write does not cross a sector boundary in the file. So if the sector size is 512 bytes, writing 20 bytes starting at file offset 0 will be atomic, but the same data written at file offset 500 will not be. In your case the writes should be atomic: the sector size is a multiple of 8, so an 8-byte block written at an offset that is a multiple of 8 can never straddle a sector boundary.
This MSDN blog has more information on how to do an atomic multi-sector write without using transacted NTFS.
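To make that condition concrete, here is a hedged sketch of the append pattern being described: 8-byte records (4-byte time plus 4-byte payload) appended one after another. Because the record size divides the 512-byte sector size, a record that starts at a multiple of 8 can never straddle a sector boundary, which is exactly the condition the answer relies on. The struct layout, helper name and 512-byte sector size are assumptions.

#include <windows.h>
#include <stdint.h>
#include <assert.h>

#pragma pack(push, 1)
typedef struct { uint32_t time; uint32_t payload; } record;  /* exactly 8 bytes */
#pragma pack(pop)

/* Append one record at the current end of file and verify the assumptions
   behind the atomicity argument. */
static BOOL append_record(HANDLE file, const record *rec)
{
    LARGE_INTEGER zero = {0}, offset;
    DWORD written = 0;

    if (!SetFilePointerEx(file, zero, &offset, FILE_END))
        return FALSE;

    /* Records are only ever appended, so the offset stays a multiple of 8,
       and the 8-byte write stays inside a single 512-byte sector. */
    assert(offset.QuadPart % sizeof(record) == 0);
    assert(offset.QuadPart % 512 + sizeof(record) <= 512);

    return WriteFile(file, rec, sizeof(*rec), &written, NULL) && written == sizeof(*rec);
}

A real implementation would query the actual bytes-per-sector value (for example with GetDiskFreeSpace or IOCTL_DISK_GET_DRIVE_GEOMETRY) rather than assume 512.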
