Failure to modify File Records in Master File Table(MFT) of NTFS - windows

I am writing a program to remove a file and all attributes related(including the 0x30 $FILE_NAME, 0x80 $DATA, 0x90 $INDEX_ROOT, and 0xA0 $INDEX_ALLOCATION, and etc.) in a NTFS volume in Windows.
I could now find the position of the File Record to any file. I would overwrite the File Record for several times to prevent recovery, and then I put back the basic information that a File Record would have(that is the Standard Attribute Header of the first attribute "0x10 $STANDARD_INFORMATION").
I used WriteFile() to write the File Record, and the returned value indicates the function succeeded.
After that, open disk to see raw data by WinHex I can see the File Record actually IS modified.
But the problem is, after I deleted another two or three files, the previous file's File Record reappeared as if I had never done anything to it.
I think this could be some recovery mechanism of Windows file management. I wonder if there is any method to modify the File Record successfully without Windows recovering it.
P.S. I used DeleteFile() to take care of the B+ tree and other stuff before I modify the File Record manually.

Are you sure the MFT record got deleted ? Because if it was, then the file won't reappear.
Check your MFT record position calculations (from VCN to actual CN and sector number).
Also, there's a $MFTMirror, you should check if a duplicate copy of the MFT record (for the file in question) exists in $MFTMirror ...if yes, then you should be erasing even that record.
If you could share your code for MFT record locator (most probably that's were the problem is) for the file ... I could help you more.

Related

How to reliably overwrite a file in Windows

I want to overwrite the content of a file atomically. I need this to maintain integrity when overwriting a config file, so an update should either pass or fail and never leave the file half-written or corrupted.
I went through multiple iteration to solve this problem, here is my current solution.
Steps to overwrite file "foo.config":
Enter a global mutex (unique per file name)
Write the new content in "foo.config.tmp"
Call FlushFileBuffers on the file handle before closing the file to flush the OS file buffers
Call ReplaceFile which will internally
rename "foo.config" to "foo.config.bak"
rename "foo.config.tmp" to "foo.config"
Delete "foo.config.bak"
Release the global mutex
I thought this solution to be robust, but the dreaded issue occurred again in production after a power failure. The config file was found corrupted, filled with 'NULL' character, .tmp or .bak file did not exist.
My theory is that the original file content was zeroed out when deleting "foo.config.bak" but the filesystem metadata update caused by the ReplaceFile call was not flushed to disk. So after reboot, "foo.config" is pointing to the original file content that has been zeroed out, is that even possible since ReplaceFile is called before DeleteFile?
The file was stored on an SSD (SanDisk X110).
Do you see a flaw in my file overwrite procedure? Could it be an hardware failure in the SSD? Do you have an idea to guarantee the atomicity of the file overwrite even in case of power failure? Ideally I'd like to delete the tmp and bak file after the overwrite.
Thanks,
Use MoveFileEx with the MOVEFILE_WRITE_THROUGH flag when renaming the file. This should tell windows to write the file right away, not caching it.

NTFS Journal USN_REASON_HARD_LINK_CHANGE event

I've written a program that reads the NTFS index and journal similar to what is described here:
http://ejrh.wordpress.com/2012/07/06/using-the-ntfs-journal-for-backups/
And It works fairly well.
In addition to the normal journal events USN_REASON_CLOSE, USN_REASON_FILE_CREATE, USN_REASON_FILE_DELETE etc' I'm receiving an event with reason USN_REASON_HARD_LINK_CHANGE. I'd like to be able to update the directory index according to this event but I can't find any information about it. The only documentation is:
An NTFS file system hard link is added to or removed from the file or
directory. An NTFS file system hard link, similar to a POSIX hard
link, is one of several directory entries that see the same file or
directory.
What does this mean? where was the hard-link created? or was it removed? how do I get more information about what happened?
I know this is ancient, but I stumbled upon this while researching a related problem. Here's what I found: The hard-links are a complicating factor when reading the USN. You can get journal entries describing change to a single file reference number by way of changes made through any hard-link that's been created. Generally, and to the original question, hard-links are alternative directory entries through which a single file might be accessed. Thus, all the file's characteristics are shared for each link (except for the names and parent file reference numbers). Technically, you can't tell which entry is the original and which is a link.
A subtle difference does exist, and it manifests if you query the master file table (using DeviceIOControl and Fsctl_Enum_Usn_Data). The query will return only a single representative file regardless of how many links exist. You can query for the links using NtQueryInformationFile, querying for FILE_HARD_LINK_INFORMATION. I think of the entry returned by the MFT query as the main entry and the NtQueryInformationFile-returned items as links...however, the main entry can get deleted and one of the links will get promoted...so it's only a housekeeping thought and little else.
Note that a problem arises where one of the hard-links is moved or renamed. In this case, the journal entries for the rename or move reflect the filename and parent file reference number of the affected link. The problem arises if you ask for only the summary "on-close" records. In such a case, you won't ever see the USN_REASON_RENAME_OLD_NAME record...because that USN entry never gets an associated REASON_CLOSE associated with it. Without this tidbit, you won't be able to easily determine which link's name or location was changed. You have to read the usn with ReadOnlyOnClose set to 0 in the Read_Usn_Journal_Data_V0. This is a far chattier query, but without it, you can't accurately associate the change with one link or the other.
As always with the USN, I expect you'll need to go through a bit of trial and error to get it to work right. These observations/guesses may, I hope, be helpful:
When the last hard link to a file is deleted, the file is deleted; so if the last hard link has been removed you should see USN_REASON_FILE_DELETE instead of USN_REASON_HARD_LINK_CHANGE. I believe that each reference number refers to a file (or directory, but NTFS doesn't support multiple hard links to directories AFAIK) rather than to a hard link. So immediately after the event is recorded, at least, the file reference number should still be valid, and point to another name for the file.
If the file still exists, you can look it up by reference number and use FindFirstFileNameW and friends to find the current links. Comparing this to the event record in question plus any relevant later events should give you enough information, although if multiple hard links for the same file are deleted and/or created you might not be able to reconstruct the order in which this happened, and if you don't have enough information about the prior state of the file system you might not be able to identify the deleted hard links. I don't know whether that would matter to you or not.
If the file no longer exists, you should still be able to identify it by the USN record in which it was deleted. Again, taking all relevant events into consideration, and with enough information about the prior state, you should be able to reconstruct most of what happened, if not the order.
There is some hope that we can do better than this: the file name and/or ParentFileReference number in the event record might refer to the hard link that was created or deleted, rather than to an arbitrary link to the file. In this case you'll have all the relevant information about the sequence of events except for whether any particular event was a create or a delete, which you should be able to work out by looking at the current state of the file and working backwards through the records.
I assume you've already looked for nearby change records that might contain additional information? There isn't, for example, a USN_REASON_RENAME_NEW_NAME record generated when a hard link is created or a USN_REASON_RENAME_OLD_NAME when a hard link is removed? Or paired USN_REASON_HARD_LINK_CHANGE records, one for the file, one for the directory containing the affected hard link to the file? (Wishful thinking, I expect, but it wouldn't hurt to look!)
For testing purposes, you can create hard links with the mklink command.

Get file offset on disk/cluster number

I need to get any information about where the file is physically located on the NTFS disk. Absolute offset, cluster ID..anything.
I need to scan the disk twice, once to get allocated files and one more time I'll need to open partition directly in RAW mode and try to find the rest of data (from deleted files). I need a way to understand that the data I found is the same as the data I've already handled previously as file. As I'm scanning disk in raw mode, the offset of the data I found can be somehow converted to the offset of the file (having information about disk geometry). Is there any way to do this? Other solutions are accepted as well.
Now I'm playing with FSCTL_GET_NTFS_FILE_RECORD, but can't make it work at the moment and I'm not really sure it will help.
UPDATE
I found the following function
http://msdn.microsoft.com/en-us/library/windows/desktop/aa364952(v=vs.85).aspx
It returns structure that contains nFileIndexHigh and nFileIndexLow variables.
Documentation says
The identifier that is stored in the nFileIndexHigh and nFileIndexLow members is called the file ID. Support for file IDs is file system-specific. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them. In some cases, the file ID for a file can change over time.
I don't really understand what is this. I can't connect it to the physical location of file. Is it possible later to extract this file ID from MFT?
UPDATE
Found this:
This identifier and the volume serial number uniquely identify a file. This number can change when the system is restarted or when the file is opened.
This doesn't satisfy my requirements, because I'm going to open the file and the fact that ID might change doesn't make me happy.
Any ideas?
Use the Defragmentation IOCTLs. For example, FSCTL_GET_RETRIEVAL_POINTERS will tell you the extents which contain file data.

Deleted file recovery program using C C++ [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 2 years ago.
Improve this question
I want to write a program that can recover deleted files from hard drive ( FAT32/NTFS partition Windows). I don't know where to start from. What should be the starting point of this? What should i read to pursue this? Help is required. Which system level structs should i study?
It's entirely a matter of the filesystem layout, how a "file" actually looks on disk, and what remains when a file is deleted. As such, pretty much all you need to understand is the filesystem spec (for each and every filesystem you want to support), and how to get direct block-level access to the HD data. It might be possible to reuse some code from existing filesystem drivers, but it will need to be modified to process structures that, from the point of view of the filesystem, are gone.
NTFS technical reference
NTFS.com
FAT32 spec
You should know first how file deletion is done in FAT32/NTFS, and how other undelete softwares work.
Undelete software understands the internals of the system used to store files on a disk (the file system) and uses this knowledge to locate the disk space that was occupied by a deleted file. Because another file may have used some or all of this disk space there is no guarantee that a deleted file can be recovered or if it is, that it won't have suffered some corruption. But because the space isn't re-used straight away there is a very good chance that you will recover the deleted file 100% intact. People who use deleted file recovery software are often amazed to find that it finds files that were deleted months or even years ago. The best undelete programs give you an indication of the chances of recovering a file intact and even provide file viewers so you can check the contents before recovery.
Here's a good read (but not so technical): http://www.tech-pro.net/how-to-recover-deleted-files.html
This is not as difficult as you think. You need to understand how files are stored in fat32 and NTFS. I recommend you use winhex an application used for digital forensics to check your address calculations are correct.
Ie NTFS uses master file records to store data of the file in clusters. Unlink deletes file in c but if you look at the source code all it does is removes entry from table and updates the records. Use an app like winhex to read information of the master file record. Here are some useful info.
Master boot record - sector 0
Hex 0x55AA is the end of MBR. Next will be mft
File name is mft header.
There is a flag to denote folder or file (not sure where).
The file located flag tells if file is marked deleted. You will need to change this flag if you to recover deleted file.
You need cluster size and number of clusters as well as the cluster number of where your data starts to calculate the start address if you want to access data from the master file table.
Not sure of FAT32 but just use same approach. There is a useful 21 YouTube video which explains how to use winhex to access deleted file data on NTFS. Not sure the video but just type in winhex digital forensics recover deleted file. Once you watch this video it will become much clearer.
good luck
Just watched the 21 min YouTube video on how to recover files deleted in NTFS using winhex. Don't forget resident flag which denotes if the file is resident or not. This gives you some idea of how the file is stored either in clusters or just in the mft data section if small. This may be required if you want to access the deleted data. This video is perfect to start with as it contains all the offset byte position to access most of the required information relative to beginning of the file record. It even shows you how to do the address calculation for the start of the cluster. You will need to access the table in binary format using a pointer and adding offsets to the pointer to access the required information. The only way to do it is go through the whole table and do a binary comparison of the filename byte for byte. Some fields are little eindian so make sure you got winhex to check your address calculations.

How does file recovery software work?

I wanted to make some simple file recovery software, where I want to try to recover files which happen to have been deleted by pressing Shift + Delete. I'm working in Windows, can anyone show me any links or documents which can help me to do so programatically? I know C, C++, .NET. Any pointers?
http://www.google.hu/search?q=file+recovery+theory&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a :)
Mainly file recoveries are looking for file headers and/or filenames in the disk as I know, then try to get the whole file by the header information.
This could be a good start: http://geeksaresexy.blogspot.com/2006/02/theory-behind-deleted-files-recovery.html
The principle of all recovery tools is that deleting a file only removes a pointer in a folder and (quick) formatting of a partition only rewrites the first sectors of the partition which contains the headers of the filesystem. An in depth analysis of the partition data (at sector level) can rebuild a big part of the filesystem data, cluster allocation tables, folders, and file cluster chains.
All course if you use a surface test tool while formatting the partition that will rewrite all sectors to make sure that they are correct, nothing will be recoverable - unless you use specialized hardware to look at remanent magnetism on the edges of the actual tracks
In windows when a file is deleted(permanent delete) it's not actually deleted from disk but the file name added with char( _ I guess) in front of it and windows ignores these when showing in explorer... and recovery tools will search these kind of file names in the disk. And your file recover integrity based on some data over written on location of deleted file. Don't know this pattern still used by windows.. but long time back I have read this some where

Resources