Why is the dd output file bigger than the original input when cloning?

If I clone a disk to disk like this:
dd if=/dev/sda of=/dev/sdb
The second disk will contain the data of the first disk without any space issue. But when I create an image of /dev/sda, which has a capacity of, say, 10 GB, the resulting file in my case is ~48 GB. My question is: why does cloning a disk to an image produce a larger file, while cloning disk to disk with the same capacity causes no space issue?

Related

Get file size on disk from file size

I have a file on a Windows machine with the following size, and I need to calculate its size on disk from the file size:
Size
3,06 MB (3.216.171 bytes)
Size on disk
3,07 MB (3.219.456 bytes)
The file system uses 512 bytes per sector.
How do I calculate how many sectors I need to store the file, given its size?
I understand that 3219456 / 512 = 6288, but how do I calculate the size on disk from the file size?
Is there a way to get the size on disk from the file size?
Am I missing something?
Your file length is 0x31132B.
The required storage (rounded up to the nearest cluster) is 0x312000. Your clusters are either 4kB (0x1000) or 8kB (0x2000).
This can be computed as:
clusterSize * ceil(fileSize * 1.0 / clusterSize)
(The 1.0 prevents integer division.) In integer math, it is:
clusterSize * (1 + (fileSize - 1) / clusterSize)
You get the cluster size from GetDiskFreeSpace, which you'll need to call anyway to figure out if your file will fit. See this existing answer:
Getting the cluster size of a hard drive (through code)
Of course, other things can affect the true storage used by storing a file: if the directory doesn't have enough space in its cluster for the new entry, if you are storing metadata with the file that doesn't fit in the directory, or if compression is enabled. But for an "ordinary" file system, the above calculations will be correct.
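For illustration, here is a minimal C sketch of that calculation; the "C:\\" root path and the 3,216,171-byte size are just the values from this question, and error handling is trimmed.

/* Look up the cluster size and round the file size up to a whole cluster. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    DWORD sectorsPerCluster, bytesPerSector, freeClusters, totalClusters;

    /* The cluster size comes from GetDiskFreeSpace on the volume that will hold the file. */
    if (!GetDiskFreeSpaceA("C:\\", &sectorsPerCluster, &bytesPerSector,
                           &freeClusters, &totalClusters))
        return 1;

    ULONGLONG clusterSize = (ULONGLONG)sectorsPerCluster * bytesPerSector;
    ULONGLONG fileSize    = 3216171;   /* the file size from the question */

    /* Integer form of the rounding formula above (assumes fileSize > 0). */
    ULONGLONG sizeOnDisk = clusterSize * (1 + (fileSize - 1) / clusterSize);

    printf("cluster size: %llu bytes, size on disk: %llu bytes\n",
           clusterSize, sizeOnDisk);
    return 0;
}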

Does curlftpfs have a maximum size for the mounted space, and how can I get around it?

I mounted an FTP server into my local OS:
curlftpfs user:pass@ftp.server.com /var/test/
Using pydf, I noticed that this volume has a maximum size of about 7.5 GB:
Filesystem Size Used Avail Use% Mounted on
curlftpfs#ftp://user:pass@ftp.server.com 7629G 0 7629G 0.0 [.........] /var/test
Then I tried to fill the disk space using dd with an 8 GB file, but this also failed at the given size:
dd if=/dev/zero of=upload_test bs=8000000000 count=1
dd: memory exhausted by input buffer of size 8000000000 bytes (7.5 GiB)
The FTP user has unlimited traffic and disk space on the remote server.
So my question is: why is there a limit at 7.5 GB, and how can I get around it?
Looking at the source code of curlftpfs 0.9.2, which is the last released version, this 7629G seems to be the hardcoded default.
In other words, curlftpfs doesn't check the actual size of the remote filesystem and uses a predefined static value instead. Moreover, the actual check can't be implemented, because the FTP protocol doesn't provide information about free space.
This means that the failure of your file transfer at 7.5 GB is not caused by the reported free space, as the two values differ by three orders of magnitude.
Details
The function ftpfs_statfs, which implements the statfs FUSE operation, defines the filesystem's block count as follows:
buf->f_blocks = 999999999 * 2;
And the filesystem block size as:
buf->f_bsize = ftpfs.blksize;
Which is defined elsewhere as:
ftpfs.blksize = 4096;
So putting this all together gives 999999999 * 2 * 4096 bytes / 2^30 ≈ 7629.39 GiB, which matches the number in your pydf output: 7629G.
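Plugging just the quoted values into a few lines of C reproduces that figure (the two constants are the ones from curlftpfs 0.9.2 above; the rest is only the unit conversion):

/* Reproduce the size that curlftpfs reports via statfs. */
#include <stdio.h>

int main(void)
{
    unsigned long long f_blocks = 999999999ULL * 2;  /* hardcoded block count */
    unsigned long long f_bsize  = 4096;              /* ftpfs.blksize default */

    double gib = (double)(f_blocks * f_bsize) / (1ULL << 30);
    printf("reported size: %.2f GiB\n", gib);        /* ~7629.39, the 7629G shown by pydf */
    return 0;
}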
It's an old question, but for completeness' sake:
dd's bs ("block size") option makes it buffer the specified amount of data in memory before writing the chunk to the output. With a massive block size like your 8 GB, it's entirely possible your system simply did not have the free memory (or even the memory capacity) to hold the entire buffer at once. Retrying with a smaller block size and a correspondingly higher count for the same output size should work as expected:
dd if=/dev/zero of=upload_test bs=8000000 count=1000

Files take up more space on the disk

When viewing the details of a file in Finder, different values are shown for how much space the file occupies. For example, a file's contents are 28.8 KB, but it takes up 33 KB on disk. Does anyone know the explanation?
Disk space is allocated in blocks, that is, in multiples of a "block size".
For example, on my system a 1-byte file occupies 4096 bytes on disk.
That's 1 byte of content and 4095 bytes of unused space.
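A quick way to see both numbers for yourself is stat(); here is a minimal sketch (POSIX, including macOS), where "somefile" is just a placeholder path:

/* Compare a file's logical size with the space actually allocated for it. */
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;
    if (stat("somefile", &st) != 0)
        return 1;

    /* st_size is the byte count of the content; st_blocks counts the
       512-byte units actually allocated, i.e. the "size on disk". */
    printf("size:         %lld bytes\n", (long long)st.st_size);
    printf("size on disk: %lld bytes\n", (long long)st.st_blocks * 512);
    return 0;
}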

How do I partition a drive to an exact size in OSX Terminal?

I've got a 3TB drive partitioned like so:
TimeMachine 800,000,000,000 Bytes
TELUS 2,199,975,890,944 Bytes
I bought an identical drive so that I could mirror the above in case of failure.
Using Disk Utility, partitioning makes the partitions differ from the sizes above by several hundred thousand bytes, so when I try to add them to the RAID set, it tells me the drive is too small.
I figured I could use terminal to specify the exact precise sizes I needed so that both partitions would be the right size and I could RAID hassle-free...
I used the following command:
sudo diskutil partitionDisk disk3 "jhfs+" TimeMachine 800000000000b "jhfs+" TELUS 2199975886848b
But the result is TimeMachine being 799,865,798,656 Bytes and TELUS being 2,200,110,092,288 Bytes. The names are identical to the originals and I'm also formatting them in Mac OS Extended (Journaled), like the originals. I can't understand why I'm not getting the same exact sizes when I'm being so specific with Terminal.
Edit for additional info: playing around with the numbers, no matter what I do I am always off by at least 16,384 bytes. I can't seem to get the first partition, TimeMachine, to land on 800000000000b on the nose.
So here's how I eventually got the exact sizes I needed:
Partitioned the drive using Disk Utility, stating I wanted to split it into 800 GB and 2.2 TB respectively. This yielded something like 800.2 GB and 2.2 TB (but the 2.2 TB was smaller than the 2,199,975,890,944 bytes required, of course).
Using Disk Utility, I edited the size of the first partition to 800 GB (from 800.2GB), which brought it down to 800,000,000,000 bytes on the nose, as required.
I booted into GParted Live so that I could edit the second partition with more accuracy than Terminal and Disk Utility and move it around as necessary.
In GParted, I looked at the original drive for reference, noting how much space it had between partitions for the Apple_Boot partitions that Disk Utility adds when you add a partition to a RAID array (I think it was 128 MB in GParted).
I deleted the second partition and recreated it leaving 128 MB before and after the partition and used the original drive's second partition for size reference.
I rebooted into OS X.
Now I couldn't add the second partition to the RAID, because I think it ended up slightly larger than the 2,199,975,890,944 bytes required (i.e., it didn't have enough space after it for that Apple_Boot partition); I got an error when attempting it in Disk Utility.
I reformatted the partition using Disk Utility so that it would be Mac OS Extended (Journaled) rather than plain HFS+, to be safe (matching the original).
I used Terminal's diskutil resizeVolume [drive's name] 2199975895040b command to get it to land on the required 2,199,975,890,944 Bytes (notice how I had to play around with the resize size, making it bigger than my target size to get it to land where I wanted).
Added both partitions to their respective RAID arrays using Disk Utility and rebuilt them successfully.
... Finally.

Can the USN Journal of the NTFS file system be bigger than its declared size?

I'm trying to dump the contents of the USN Journal of an NTFS partition using WinIoCtl functions. The USN_JOURNAL_DATA structure tells me that it has a maximum size of 512 MB, and I have compared that with what fsutil reports, which is the same value.
Now I have to read each entry into a USN_RECORD structure. I do this in a for loop that starts at 0 and goes to the journal's maximum size in increments of 4096 (the cluster size).
I read each 4096-byte chunk into a buffer of that size and parse all the USN_RECORD structures out of it.
Everything is going great, file names are correct, timestamps as well, reasons, everything, except I seem to be missing some recent records. I create a new file on the partition, I write something in it and then I delete the file. I run the app again and the record doesn't appear. I find that the record appears only if I keep reading beyond the journal's maximum size. How can that be?
At the moment I'm reading from the start of the journal's data up to the maximum size plus the allocation delta (both values are stored in the USN_JOURNAL_DATA structure), which I don't believe is correct, and I'm having trouble finding thorough information on this.
Can someone please explain this? Is there a buffer around the USN Journal similar to how the MFT works (meaning its size is halved when disk space is needed for other files)?
What am I doing wrong?
That's the expected behaviour, as documented:
MaximumSize
The target maximum size for the change journal, in bytes. The change journal can grow larger than this value, but it is then truncated at the next NTFS file system checkpoint to less than this value.
Instead of trying to predetermine the size, loop until you reach the end of the data.
If you are using the FSCTL_ENUM_USN_DATA control code, you have reached the end of the data when the error code from DeviceIoControl is ERROR_HANDLE_EOF.
If you are using the FSCTL_READ_USN_JOURNAL control code, you have reached the end of the data when the next USN returned by the driver (the DWORDLONG at the beginning of the output buffer) is the USN you requested (the value of StartUsn in the input buffer). You will need to set the input parameter BytesToWaitFor to zero, otherwise the driver will wait for the specified amount of new data to be added to the journal.
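For illustration, that loop can look roughly like the sketch below. It uses the documented V0 structures with FSCTL_QUERY_USN_JOURNAL and FSCTL_READ_USN_JOURNAL; the C: volume is a placeholder, and error handling and privilege setup are trimmed.

/* Walk the change journal until the driver reports no new records. */
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int main(void)
{
    HANDLE vol = CreateFileW(L"\\\\.\\C:", GENERIC_READ,
                             FILE_SHARE_READ | FILE_SHARE_WRITE,
                             NULL, OPEN_EXISTING, 0, NULL);
    if (vol == INVALID_HANDLE_VALUE) return 1;

    USN_JOURNAL_DATA_V0 jd;
    DWORD bytes;
    if (!DeviceIoControl(vol, FSCTL_QUERY_USN_JOURNAL, NULL, 0,
                         &jd, sizeof(jd), &bytes, NULL)) return 1;

    READ_USN_JOURNAL_DATA_V0 rd = {0};
    rd.StartUsn       = jd.FirstUsn;
    rd.ReasonMask     = 0xFFFFFFFF;   /* report every reason */
    rd.BytesToWaitFor = 0;            /* do not block waiting for new data */
    rd.UsnJournalID   = jd.UsnJournalID;

    DWORDLONG buf[8192];              /* 64 KiB output buffer, 8-byte aligned */
    for (;;) {
        if (!DeviceIoControl(vol, FSCTL_READ_USN_JOURNAL, &rd, sizeof(rd),
                             buf, sizeof(buf), &bytes, NULL)) break;

        USN next = *(USN *)buf;       /* leading DWORDLONG: next USN to request */
        if (next == rd.StartUsn)      /* nothing new returned: end of the data */
            break;

        /* Records follow the leading USN; step through them by RecordLength. */
        BYTE *base = (BYTE *)buf;
        BYTE *p = base + sizeof(USN);
        while (p < base + bytes) {
            USN_RECORD *rec = (USN_RECORD *)p;
            wprintf(L"USN %lld: %.*s\n", rec->Usn,
                    rec->FileNameLength / (int)sizeof(WCHAR),
                    (WCHAR *)(p + rec->FileNameOffset));
            p += rec->RecordLength;
        }
        rd.StartUsn = next;           /* continue where the driver left off */
    }

    CloseHandle(vol);
    return 0;
}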

Resources