How does one map physical sectors of a disk to the files that contain them on an HFS+ volume on Mac OS X - macos

I've been recovering a hard disk using dd_rescue, which provides me a list of all of the device sectors it could not copy due to hardware errors.
I'd like to take that list, and write a tool to give me all of the files that contain these bad sectors so I can delete them. I'm not sure what APIs I can use to do this--essentially i want to get a list of all files on disk and for each file, a list of the ranges of sectors it occupies on disk.
My first guess is that I will iterate over each directory entry on the disk and check to see if the file overlaps one of these bad sectors. Maybe there's a better way.

If you want to map a file's data location to a physical block (sector), you can use the fcntl(2) call with the F_LOG2PHYS command. Not all file systems support this command, but HFS+ does. Just use lseek to pick the file offset and you can get back the diskoffset from F_LOG2PHYS (it's returned in a struct log2phys in the l2p_devoffset field). See fcntl.h for more details.

It's an old question, but since it still is among the top hits when searching the topic, here's to all who searched:
Since Mac OS X 10.6 fsck_hfs(8) can map physical sectors to files (see option -B). A note on usage: matching will only be successful if checking of the catalog was actually performed. So you might have to force checking with options -l or -f.
BTW, hfsdebug as a PPC binary relies on Rosetta and thus will not run on Lion or later.

There's no API exposed for grubbing through HFS+ filesystems, but the source for the filesystem is available from Apple as part of the XNU kernel. Also check out the hfsdebug tool which might help to understand the fs.

Related

read/write to a disk without a file system

I would like to know if anybody has any experience writing data directly to disk without a file system - in a similar way that data would be written to a magnetic tape. In particular I would like to know if/how data is written in blocks, and whether a certain blocksize needs to be specified (like it does when writing to tape), and if there is a disk equivalent of a tape file mark, which separates the archives written to a tape.
We are creating a digital archive for over 1 PB of data, and we want redundancy built in to the system in as many levels as possible (by storing multiple copies using different storage media, and storage formats). Our current system works with tapes, and we have a mechanism for storing the block offset of each archive on each tape so we can restore it.
We'd like to extend the system to work with disk volumes without having to change much of the logic. Another advantage of not having a file system is that the solution would be portable across Operating Systems.
Note that the ability to browse the files on disk is not important in this application, since we are considering this for an archival copy of data which is not accessed independently. Also note that we would already have an index of the files stored in the application database, which we also write to the end of the tape/disk when it is almost full.
EDIT 27/05/2020: It seems that accessing the disk device as a raw/character device is what I'm looking for.

Writing to /dev/loop USB image?

I've got an image that I write onto a bootable USB that I need to tweak. I've managed to mount the stick as /dev/loopX including allowing for the partition start offset, and I can read files from it. However writing back 'seems to work' (no errors reported) but after writing the resulting tweaked image to a USB drive, I can no longer read the tweaked files correctly.
The file that fails is large and also a compressed tarfile.
Is writing back in this manner simply a 'no-no' or is there some way to make this work?
If possible, I don't want to reformat the partition and rewrite from scratch because that will (I assume) change the UUID and then I need to go worry about the boot partition etc.
I believe I have the answer. When using losetup to create a writeable virtual device from the partition on your USB drive, you must specify the --sizelimit parameter as well as the offset parameter. If you don't then the resulting writes can go past the last defined sector on the partition (presumably requires your USB drive to have extra space). Linux reports no errors until later when you try to read. The key hints/evidence for this are that when reads (or (re)written data) fail, dmesg shows read errors attempting to read past the end of the drive. fsck tools such as dosfsck also indicate that the drive claims to be larger than it is.

How can I find the physical address of a file?

I'm using the GoAsm assembler on a Windows 7 - 64 bit OS and I'll be asking you a few (not so dumb) questions.
First question :
How can I find the physical address of a file ?
Let's suppose file "Text.txt" is at the root of my C:\ partition.
Is there a way to get the exact memory address where this file is ?
Second question :
Is it possible to call a routine which will just do like if I invoked a C function ?
(i.e. : Consider a C function "WriteToScreen", is it possible to have the same function, but in assembler format, that means without having the need to use high-level invokes to do that work ?
Third question :
Are there somewhere on the net some include files for GoAsm containing useful routines like (move, copy, edit, erase) commands ? I've first thought of ms-dos interrupts but I can't manage to get them to work without crashing the program. I guess it just not compatible with Windows OS even though the command prompt acts like ms-dos... ?
Fourth question :
I've heard from different sources and myself that NASM works pretty bad on Win7 x64, is it just true, or am I doing it the wrong way ?
1
An hard drive, from a logical point of view, can be seen as a sequence of "blocks" (the more common name is sectors). How these blocks are organized physically on the disks can be disregarded, but the driver must know someway how to get data of course, though you send to modern hd driver "high level" commands that, as far as you know, are not strongly related to where data physically are (you can say "read the block 123", but there's no extern evidence of where that block lives).
However this way you can "name" a block with a number, and say e.g. that block 0 is the MBR. Each block contains several bytes (512, 1024...). Not all used blocks contain actual data of a file, in fact there are metainformations of any sort, depending on the filesystem but even related to the "structure" of the hd (I mean, partitions).
A file located on an hd is not automatically loaded into memory, so it has no memory address. Once you read it, piece of it if not all are of course copied into the memory you give, which is not an intrinsic property of the file. (Filesystems retrieve the blocks belonging to the file and "show" them as we are used to see them, as a single "unit", the file)
Summarizing: files have no memory address. The physical address could be the set of blocks holding data (and metadata, like inodes ) of the file, or just the first block (but if a block of data is N, N+1 could not belong to the same file - the blocks need no to be one next to the other). To know them, you have to analyse the structure of the filesystem you use. I don't know if there's an API to retrieve them easily, but in the worst case you can analyse the source code of the filesystem... good luck!
2
C functions are translated into assembly. If you respect the C calling convention, you can write a "C function" directly in assembly. Try reading this and this for x86.
3
You can call windows API from asm. Forget MS-DOS, MS-DOS is dead, MS-DOS is not Windows, the cmd is a sort of "emulation"... indeed no, not an emulation but just a command line interface that resemble the one MS-DOS users was used to. But it is not exaclty the same, i.e. there are no MS-DOS system interrupt you can use. Iczelion's assembly tutorials, though old, could be an interesting resource. (If links expire, try with the wayback machine)
4
I do not own Win7 and never installed nasm on windows, so I can't say anything about.
For the first question just drag the file into the address bar in the browser

How does file recovery software work?

I wanted to make some simple file recovery software, where I want to try to recover files which happen to have been deleted by pressing Shift + Delete. I'm working in Windows, can anyone show me any links or documents which can help me to do so programatically? I know C, C++, .NET. Any pointers?
http://www.google.hu/search?q=file+recovery+theory&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a :)
Mainly file recoveries are looking for file headers and/or filenames in the disk as I know, then try to get the whole file by the header information.
This could be a good start: http://geeksaresexy.blogspot.com/2006/02/theory-behind-deleted-files-recovery.html
The principle of all recovery tools is that deleting a file only removes a pointer in a folder and (quick) formatting of a partition only rewrites the first sectors of the partition which contains the headers of the filesystem. An in depth analysis of the partition data (at sector level) can rebuild a big part of the filesystem data, cluster allocation tables, folders, and file cluster chains.
All course if you use a surface test tool while formatting the partition that will rewrite all sectors to make sure that they are correct, nothing will be recoverable - unless you use specialized hardware to look at remanent magnetism on the edges of the actual tracks
In windows when a file is deleted(permanent delete) it's not actually deleted from disk but the file name added with char( _ I guess) in front of it and windows ignores these when showing in explorer... and recovery tools will search these kind of file names in the disk. And your file recover integrity based on some data over written on location of deleted file. Don't know this pattern still used by windows.. but long time back I have read this some where

Get sector location of a file

Based on a file name, or a file handle, is there a Win-API method of determining what physical sector the file starts on?
You can get file cluster allocation by sending FSCTL_GET_RETRIEVAL_POINTERS using DeviceIoControl.
You'd have to read the allocation table directly.
I suspect that there is no such function.
Even if you know where the file starts, what good would it do? The rest of the file could be anywhere as soon as the file is larger than a single sector due to fragmentation.
You would probably need to develop deeper understanding of the file system involved and read the necessary information from the file allocation table or such mechanism.
No. Why? Because a file system is an abstraction of physical hardware. You don't need to know if you're on a RAM disk, hard drive, CD, or network drive, or if your data is compressed or encrypted -- Windows takes care of these little details for you.
You can always open the physical disk, but you'd need knowledge of the file system used.
What are you trying to accomplish with this?

Resources