VHDX redundancy (ReFS file system) checking integrity every time - Windows

I have created 2 VHDX disks (as files). I created a ReFS file system on them and used the reliability option. When I mount the first VHDX file in the system and then the second, a data integrity check runs on both.
My question: is it possible to mount both disks at the same time, so that there is no data integrity check each time?
How do I use two VHDX files with the reliability option so that there is no data integrity check each time I connect the drives?

Related

How to safely extract an encrypted zip folder to a random folder, load all extracted files into memory, and safely delete the extracted files

I have an encrypted 7z file which has some binary data files.
I am on the Windows platform (Windows 10; customers are also on the same platform).
I am using C++ 17.
I am using a third-party library (written in C) that can load these binary files into memory only from disk (not from memory streams).
Loading the bin files from disk to memory takes a few milliseconds.
I don't want the users of my software to be able to read the content of the binary files.
I can't use an online service to host these bin files because the customers should be able to use the software on a standalone computer without any network connectivity.
The way I am planning it now is as follows (a rough sketch of some of these steps appears after the list):
Choose a random folder path at runtime (in the Windows temp folder)
Extract the encrypted 7z file to the above random path.
Immediately acquire an exclusive lock on the bin files using https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-lockfileex
Read the bin files
After the reading is done, overwrite the files with zeros
Delete the extracted files
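For illustration, here is a minimal sketch of the lock / overwrite / delete part of that plan (the extraction and the library's own read are omitted), using only documented Win32 calls (CreateFileW, LockFileEx, WriteFile, DeleteFileW); the function name and the single-shot write are placeholders, not a definitive implementation. One caveat: opening the file with no sharing, or holding an exclusive LockFileEx range, also blocks the third-party library's own open of the same file, so the lock would have to be relaxed or released for the window in which the library loads it.

#include <windows.h>

#include <string>
#include <vector>

// Zero out the extracted bin file and delete it, holding the handle
// exclusively while doing so. Assumes the file is small enough to scrub
// in one WriteFile call (the question says loading takes milliseconds).
bool ScrubAndDelete(const std::wstring& binPath)
{
    HANDLE h = CreateFileW(binPath.c_str(), GENERIC_READ | GENERIC_WRITE,
                           0 /* no sharing */, nullptr, OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE) return false;

    LARGE_INTEGER size{};
    GetFileSizeEx(h, &size);

    OVERLAPPED ov{};  // lock the whole file, starting at offset 0
    LockFileEx(h, LOCKFILE_EXCLUSIVE_LOCK | LOCKFILE_FAIL_IMMEDIATELY, 0,
               size.LowPart, static_cast<DWORD>(size.HighPart), &ov);

    // Overwrite the contents with zeros before deleting.
    std::vector<char> zeros(static_cast<size_t>(size.QuadPart), 0);
    DWORD written = 0;
    SetFilePointer(h, 0, nullptr, FILE_BEGIN);
    WriteFile(h, zeros.data(), static_cast<DWORD>(zeros.size()), &written, nullptr);
    FlushFileBuffers(h);

    UnlockFileEx(h, 0, size.LowPart, static_cast<DWORD>(size.HighPart), &ov);
    CloseHandle(h);
    return DeleteFileW(binPath.c_str()) != FALSE;
}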
Things that can go wrong in the above approach:
Customers with admin privileges could time the bin file loading precisely and kill the process before the files are overwritten.
Customers could potentially take a memory dump and read the data directly (not sure how easy this is to pull off).
Potentially more ways..
Is there any better way to deal with the situation?
Or should I just live with the potential of IP loss?

Move/copy millions of images from macOS to an external drive to an Ubuntu server

I have created a dataset of millions (>15M, so far) of images for a machine-learning project, taking up over 500 GB of storage. I created them on my MacBook Pro but want to get them to our DGX1 (GPU cluster) somehow. I thought it would be faster to copy to a fast external SSD (2x NVMe in RAID 0), then plug that drive directly into a local terminal and copy it to the network scratch disk. I'm not so sure anymore, as I've been cp-ing to the external drive for over 24 hrs now.
I tried using the Finder GUI to copy at first (bad idea!). For a smaller dataset (2M images), I used 7zip to create a few archives. I'm now using the terminal in macOS to copy the files using cp.
I tried "cp /path/to/dataset /path/to/external-ssd"
Finder was definitely not the best approach, as it took forever at the "preparing to copy" stage.
Using 7zip to archive the dataset increased the "file" transfer speed, but it took over 4 days(!) to extract the files, and that was for a dataset an order of magnitude smaller.
Using the command-line cp started off quickly but seems to have slowed down. Activity Monitor says I'm getting 6-8k IOs on the disk. It's been 24 hours and it isn't quite halfway done.
Is there a better way to do this?
rsync is the preferred tool for this kind of workload. It is used for both local and network copies.
Its main benefits are (excerpted from the man page):
delta-transfer algorithm, which reduces the amount of data sent
if it is interrupted for any reason, then you can restart it easily with very little cost. It can even restart part way through a large file
options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied.
Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.
Regarding command usage and syntax, for local transfers it is almost the same as cp:
rsync -az /path/to/dataset /path/to/external-ssd

Delete records from Windows registry after MoveFileEx added them

In my software I am using MoveFileEx to mark my files for deletion on system reboot.
That is for the case when, for some reason, I cannot delete the file from my software (for example, it is locked by another program).
But the system could crash, or some other unexpected situation could reboot the system, before I am able to delete the file or mark it for deletion with MoveFileEx.
Now I am planning to mark the file with MoveFileEx immediately after the software starts to use it.
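For reference, the delete-on-reboot marking itself is a single documented call; a minimal sketch (the helper name is just illustrative) might look like this:

#include <windows.h>

// Register the file for deletion at the next restart. A NULL new name
// together with MOVEFILE_DELAY_UNTIL_REBOOT tells MoveFileEx to add the
// file to the pending-rename list that is processed on boot. Note that
// this flag requires administrative rights, since the entry is stored
// under HKLM in the registry.
bool MarkForDeleteOnReboot(const wchar_t* path)
{
    return MoveFileExW(path, nullptr, MOVEFILE_DELAY_UNTIL_REBOOT) != FALSE;
}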
I have three questions:
Is there a function (which I could use) to remove the record from the Windows registry for a specific file?
The case is: I mark the file for deletion on system reboot, work with it, and delete it successfully - then I won't need this record in the registry, and I don't want to keep records for files that were deleted successfully (maybe millions deleted successfully and 5-10 unsuccessfully).
If there isn't such a function (mentioned in question 1), how big a problem would it be to have millions of unwanted records in the registry?
If the answer to question 2 is very negative, is there another solution for this?

Where is data on a non-persistent Live CD stored?

When I boot up Linux Mint from a Live CD, I am able to save files to the "File System". But where are these files being saved to? It can't be the disc, since it's a CD-R. I don't think they're stored in RAM, because it can only hold so much data and isn't really intended to be used as a "hard drive". The only other option is the hard drive... but it's certainly not saving to any partition on the hard drive I know about, since none of them are mounted. So where are my files being saved?
Believe it or not, it's a ramdisk :)
All live distros mount a temporary disk in RAM. The process is completely transparent to the user, thanks to the Linux kernel.
The OS first sets aside an area of your RAM as a virtual device, then mounts it like a regular hard drive in your file system.
Once you reboot, you lose all the data on that ramdrive.
A ramdrive is needed by almost all software running on live CDs. Almost all programs, desktop managers in particular, are designed to write files, even temporary ones, during their execution.
As an example, there are two ways to run KDE on a live CD: either modify its code deeply so that you cannot change the wallpaper and other settings (desktop settings are stored inside ~/.kde), or deploy it onto a writable file system such as a ramdrive to avoid write failures on a read-only file system.
Obviously, you can mount your real HDD or any USB drive into the virtual file system and make all writes to them permanent, but by default no live distro mounts your drives into the root file system; they are usually mounted under specific subdirectories like /mnt, /media, or /windows.
Hope this helps.
It does indeed emulate a disk using RAM; from Wikipedia:
It is able to run without permanent installation by placing the files that typically would be stored on a hard drive into RAM, typically in a RAM disk, though this does cut down on the RAM available to applications.
RAM. In Linux, and indeed most Unix systems, any kind of device is presented as a file.
For example, to get memory info on linux you use cat /proc/meminfo, where cat is used to read files. Then, there's all sorts of strange stuff like /dev/random (to read random crap) and /dev/null (to throw away crap). ;-)
To make it persistent, use a USB device that is properly formatted and given a special name. See here:
https://help.ubuntu.com/community/LiveCD/Persistence

Let's say I am writing my code and then my PC dies; how necessary is it to do a complete scan if I don't want my later source code to be contaminated?

Let's say I am writing a Ruby on Rails program and, while I'm editing a file, the machine blue-screens. In this case, how necessary is it to re-scan the whole hard drive if I don't want my future files to be damaged?
Say the OS is deleting a tmp file at the moment my computer crashes and still holds pointers to some sectors on the hard drive. If my newly created files happen to land in those sectors, then the next time the OS cleans up files it may think those "left-over" sectors weren't cleaned last time and clean them again, damaging our source code (especially with Ruby on Rails, where source code can be generated by Rails rather than written by us, so we may not know why our Rails server doesn't work if a file is affected). We can rely on SVN, but what if a file is affected before we check it in?
I think the official answer will be: "always scan the disk after a crash or power outage, check the data and even the free space, and attempt to fix any bad sectors". But the thing is, nowadays, with hard drives so big, it could take 2 hours to scan everything, and especially at work we cannot wait 2 hours in the middle of the day.
Does anyone know whether, with modern OSes like XP, Vista, Mac OS, and Linux (say the power cord was loose and the machine didn't shut down properly, just dying at 0% battery), our source code is safe? Do they structure their writes to sectors so that, at worst, a sector is wasted rather than sectors overlapping between files?
With a modern journaling file system (ext3/4, NTFS), the only problem would be that a file could be left in a "half-written" state. Obviously scanning is not going to help with this (that's what backups are for). The file system itself will not be corrupted. If you are using something like FAT, then yes, you should worry about this.
There's really only one issue here:
Is any file that was being written left in some kind of "half-written" state?
The primary cause of this would be the application/editor writing the file when the machine dies halfway through. In that case the file being written is, well, half done. If it was overwriting the original file, the original is "gone" and the new one is "half done". If you don't have a backup file, then, well, you have a problem.
As for a file having dangling pointers, references to sectors that were never written, or some such thing: that problem depends on your file system.
The major modern file systems are journaled and "won't allow" this to happen. You may still have a "half-written" file, but that's because the application only got to write half of it, not because the file system lost track of a sector pointer.
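To make that concrete, the usual way an application avoids leaving the original half-written is to write the new version to a temporary file and then swap it into place in one step. A rough Windows-flavoured sketch, with hypothetical names, might look like this:

#include <windows.h>
#include <io.h>

#include <cstdio>
#include <string>

// Write the new contents to a side file, flush it to disk, then atomically
// replace the original. If the machine dies before the final step the old
// file is untouched; if it dies after, the new file is complete. Either
// way no half-written target is left behind.
bool SaveAtomically(const std::wstring& target, const std::string& content)
{
    const std::wstring temp = target + L".tmp";

    FILE* f = _wfopen(temp.c_str(), L"wb");
    if (!f) return false;
    fwrite(content.data(), 1, content.size(), f);
    fflush(f);
    _commit(_fileno(f));  // push the data out of OS buffers
    fclose(f);

    // One-shot replacement of the original with the finished copy.
    return MoveFileExW(temp.c_str(), target.c_str(),
                       MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH) != FALSE;
}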
If you're playing file system games for performance or whatever (such as using UFS without logging), then you would want to run an fsck to clean up the file system's metadata.
But if you're using a modern operating system and file system (i.e. anything from the past 5 years), you won't have this problem.
Finally, if you do have version control running, just run "svn status"; it will show you any "corrupted" files, since they will have changed and it will detect that as well.
I see some information on
http://en.wikipedia.org/wiki/Journaling_file_system
Journaled file systems
File systems may provide journaling, which provides safe recovery in the event of a system crash. A journaled file system writes some information twice: first to the journal, which is a log of file system operations, then to its proper place in the ordinary file system. Journaling is handled by the file system driver, and keeps track of each operation taking place that changes the contents of the disk. In the event of a crash, the system can recover to a consistent state by replaying a portion of the journal. Many UNIX file systems provide journaling including ReiserFS, JFS, and Ext3.
In contrast, non-journaled file systems typically need to be examined in their entirety by a utility such as fsck or chkdsk for any inconsistencies after an unclean shutdown. Soft updates is an alternative to journaling that avoids the redundant writes by carefully ordering the update operations. Log-structured file systems and ZFS also differ from traditional journaled file systems in that they avoid inconsistencies by always writing new copies of the data, eschewing in-place updates.
