How do you calculate slack space?

A file of size 12,500 bytes is to be stored on a hard disk drive where the sector size is 512 bytes, and a cluster consists of 8 sectors. How much slack space is there once the file has been saved?

A few things to know:
Slack space is the leftover sectors and bytes from a cluster allocation.
Hard drives understand sectors, which are nearly universally 512 bytes as far as logical block addressing goes.
Operating systems (OSes) understand clusters, which are groups of sectors. The number of sectors per cluster can vary between OSes and file systems.
The OS DOES NOT understand sectors. The hard drive DOES NOT understand clusters. But both terms are needed for calculating slack space.
Example 1
Given:
Sector size: 512 bytes (basically the ONLY sector size these days)
Cluster size: 8 sectors
File size: 2560 bytes
In this scenario, when disk space is allocated to store the file, the smallest amount that the OS can read/write MUST be 8 sectors or 4096 bytes.
To find slack space:
first find your cluster size in bytes. Cluster == 8 sectors.
∴ allotment is 4096.
then find if the file size is larger or smaller than that allotment size.
2560 bytes < 4096.
∴ only one cluster is needed to save this file
subtract the file size from the cluster size and you have slack space.
4096 - 2560 == 1536 bytes (or 3 sectors) of slack space.
Example 2
Given:
Sector size: 512 bytes
Cluster size: 16 sectors
File size: 61440 bytes
In this scenario, when disk space is allocated to store the file, the smallest amount that the OS can read/write MUST be 16 sectors or 8192 bytes.
Let's work through the same process:
first find your cluster size in bytes. Cluster == 16 sectors.
∴ allotment is 8192.
then find if the file size is larger or smaller than that allotment size.
61440 bytes > 8192.
∴ several clusters are needed to save this file.
since this file is larger, divide it by the cluster size in bytes.
61440 / 8192 == 7.5 clusters required to save this file.
This isn't a whole number, so we have to round up. Recall that the OS cannot write LESS than a whole cluster, and if we allot fewer whole clusters than necessary, the file won't be saved.
∴ we require 8 clusters.
find the size in bytes of your allotment size.
8 clusters * 8192 == 65536.
subtract the file size from the cluster allotment and you have slack space.
65536 - 61440 == 4096 bytes (or 8 sectors) of slack space.
Try it.
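As a way to try it (and to check the two examples above), here is a minimal C sketch of the procedure; the function name and the hard-coded example values are just illustrative:

    #include <stdio.h>

    /* Sketch of the slack-space procedure described above. */
    unsigned long slack_space(unsigned long file_size,
                              unsigned long sector_size,
                              unsigned long sectors_per_cluster)
    {
        unsigned long cluster_size = sector_size * sectors_per_cluster;
        /* Round up: the OS cannot allocate less than a whole cluster. */
        unsigned long clusters = (file_size + cluster_size - 1) / cluster_size;
        return clusters * cluster_size - file_size;
    }

    int main(void)
    {
        /* Example 1: 2560-byte file, 8 x 512-byte sectors -> 1536 bytes of slack */
        printf("%lu\n", slack_space(2560, 512, 8));
        /* Example 2: 61440-byte file, 16 x 512-byte sectors -> 4096 bytes of slack */
        printf("%lu\n", slack_space(61440, 512, 16));
        /* Plug in the numbers from the question at the top to answer it yourself. */
        return 0;
    }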

Related

Virtual Memory - Calculating the Virtual Address Space

I am studying for an exam tomorrow and I came across this question:
You are given a memory system with 2MB of virtual memory, 8KB page size,
512 MB of physical memory, TLB contains 16 entries, 2-way set associative.
How many bits are needed to represent the virtual address space?
I was thinking it would be 20 bits: since 2^10 is 1024, I simply multiplied 2^10 * 2^10 and got 2^20. However, the answer ends up being 21 and I have no idea why.
The virtual address space required is 2MB.
As you have calculated, 20 bits can accommodate 1MB of VM space. You need 21 bits to accommodate 2MB.
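As a quick sanity check (a throwaway C sketch, not part of the original answer): 2MB is 2^21 bytes, so the smallest number of bits n with 2^n >= 2MB is 21:

    #include <stdio.h>

    int main(void)
    {
        unsigned long vm_bytes = 2UL * 1024 * 1024;  /* 2MB of virtual memory */
        int bits = 0;
        while ((1UL << bits) < vm_bytes)             /* smallest n with 2^n >= 2MB */
            bits++;
        printf("%d\n", bits);                        /* prints 21 */
        return 0;
    }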

NtQueryInformationFile returns incorrect allocation size

I use NtQueryInformationFile with the FILE_STANDARD_INFORMATION struct to retrieve the allocation size of a file. But for small files it returns an incorrect¹ result. For example, a text file with a size of 1 byte returns an allocation size of 8 bytes instead of 4096 bytes. Where is the problem?
¹ I'm assuming that this value is incorrect because Explorer (on Windows XP Checked Build in my case) reports a higher figure for the size on disk (4096 bytes for a file of size 1).
The file size is in the EndOfFile member. AllocationSize is how much disk space is allocated for the file:
Usually, this value is a multiple of the sector or cluster size of the
underlying physical device.
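For reference, roughly how the call looks from user mode (a sketch only: the struct layout mirrors the documented FILE_STANDARD_INFORMATION, the information class value 5 is FileStandardInformation, the function is resolved from ntdll.dll at run time, and test.txt is a placeholder path; error handling is minimal):

    #include <windows.h>
    #include <winternl.h>   /* NTSTATUS, IO_STATUS_BLOCK, FILE_INFORMATION_CLASS */
    #include <stdio.h>

    /* Layout matching the documented FILE_STANDARD_INFORMATION. */
    typedef struct _FILE_STANDARD_INFO_SKETCH {
        LARGE_INTEGER AllocationSize;   /* disk space allocated for the file */
        LARGE_INTEGER EndOfFile;        /* the file size itself */
        ULONG         NumberOfLinks;
        BOOLEAN       DeletePending;
        BOOLEAN       Directory;
    } FILE_STANDARD_INFO_SKETCH;

    typedef NTSTATUS (NTAPI *NtQueryInformationFile_t)(
        HANDLE, PIO_STATUS_BLOCK, PVOID, ULONG, FILE_INFORMATION_CLASS);

    int main(void)
    {
        NtQueryInformationFile_t pNtQueryInformationFile =
            (NtQueryInformationFile_t)GetProcAddress(
                GetModuleHandleW(L"ntdll.dll"), "NtQueryInformationFile");

        HANDLE h = CreateFileW(L"test.txt", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, 0, NULL);
        if (pNtQueryInformationFile == NULL || h == INVALID_HANDLE_VALUE)
            return 1;

        IO_STATUS_BLOCK iosb;
        FILE_STANDARD_INFO_SKETCH info;
        NTSTATUS status = pNtQueryInformationFile(
            h, &iosb, &info, sizeof(info),
            (FILE_INFORMATION_CLASS)5 /* FileStandardInformation */);

        if (status == 0) {  /* STATUS_SUCCESS */
            printf("EndOfFile (file size):         %lld\n", info.EndOfFile.QuadPart);
            printf("AllocationSize (size on disk): %lld\n", info.AllocationSize.QuadPart);
        }
        CloseHandle(h);
        return 0;
    }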

How can 8086 processors access hard drives larger than 1 MB?

How can 8086 processors (or real mode on later processors) access hard drives larger than 1 MB, when they can only access 1 MB of RAM (without expanded memory)?
Access is not linear (by byte) but by sector. The sector size may be, for example, 512 bytes. The computer reads sectors into memory as needed.
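As a rough illustration (the geometry below is just the classic INT 13h CHS limit, assumed for this sketch, not something stated in the answer), the space addressable by sector number is far larger than the 1 MB of RAM real mode can address:

    #include <stdio.h>

    int main(void)
    {
        /* Assumed classic BIOS INT 13h CHS limits: 1024 cylinders,
         * 255 heads, 63 sectors per track, 512-byte sectors. */
        unsigned long long cylinders = 1024, heads = 255, sectors = 63;
        unsigned long long sector_size = 512;

        unsigned long long capacity = cylinders * heads * sectors * sector_size;
        printf("Addressable by sector: %llu bytes (~%llu MB)\n",
               capacity, capacity / (1024 * 1024));
        printf("Addressable RAM in real mode: %u bytes (1 MB)\n", 1024u * 1024u);
        return 0;
    }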

About cache memory size - Getting the number of blocks out of the size and number words per block

This is an example from the book Computer Organization and Architecture by Stallings
The cache can hold 64 Kbytes
Data are transferred between main memory and the cache in blocks of 4 bytes each. This means that the cache is organized as 16K = 2^14 lines of 4 bytes each.
The main memory has 16M. That is 2^24 words. So 4M blocks of 4 bytes.
My confusion is in the second point. It is said that each block is of 4 bytes that is 4 words of 8 bits so one block is 32 bits = 2^5. Now I want to get the number of blocks in the cache. For that I divide the size of the cache with the size of one block, that is 2^16(64K)/2^5(4bytes) = 2^11 lines of 4 bytes each but the answer is 2^14. What am I doing wrong? Thanks!
It's 64K bytes, so that is 2^16 bytes.
You will have to convert it to bits as well, so it is (2^16 * 2^3 bits) / (2^5 bits) = 2^14 lines.
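The same arithmetic done directly in bytes (a throwaway check, not from the book): 2^16 bytes of cache divided by 2^2 bytes per block gives 2^14 lines; converting both sides to bits just multiplies numerator and denominator by 2^3:

    #include <stdio.h>

    int main(void)
    {
        unsigned long cache_bytes = 64 * 1024;   /* 64 KB cache = 2^16 bytes */
        unsigned long block_bytes = 4;           /* 4-byte blocks = 2^2 bytes */
        printf("%lu lines\n", cache_bytes / block_bytes);  /* 16384 = 2^14 */
        return 0;
    }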

Namenode file quantity limit

Does anyone know how many bytes are occupied per file in the namenode of HDFS?
I want to estimate how many files can be stored in a single namenode with 32G of memory.
Each file, directory, or block occupies about 150 bytes in the namenode memory. [1] So a cluster whose namenode has 32G of RAM can support a maximum of about 38 million files (assuming the namenode is the bottleneck). (Each file will also take up a block, so each file effectively takes 300 bytes. I am also assuming 3x replication, so each file takes up 900 bytes.)
In practice, however, the number will be much smaller, because not all of the 32G will be available to the namenode for keeping the mapping. You can increase it by allocating more heap space to the namenode on that machine.
Replication also affects this, to a lesser degree. Each additional replica adds about 16 bytes to the memory requirement. [2]
[1] https://blog.cloudera.com/small-files-big-foils-addressing-the-associated-metadata-and-application-challenges/
[2] http://search-hadoop.com/c/HDFS:/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java%7C%7CBlockInfo
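Spelled out as a C sketch, using that answer's own assumptions (150 bytes per object, one block per file, and counting 3x replication, which the answer below disputes):

    #include <stdio.h>

    int main(void)
    {
        unsigned long long heap_bytes = 32ULL * 1024 * 1024 * 1024;  /* 32G of RAM */
        unsigned long long bytes_per_file = (150 + 150) * 3;         /* 900 bytes  */
        printf("~%llu million files\n", heap_bytes / bytes_per_file / 1000000);
        return 0;
    }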
Cloudera recommends 1 GB of NameNode heap space per million blocks. 1 GB for every million files is less conservative but should work too.
Also, you don't need to multiply by the replication factor; the accepted answer is wrong.
Using the default block size of 128 MB, a file of 192 MB is split into two block files, one 128 MB file and one 64 MB file. On the NameNode, namespace objects are measured by the number of files and blocks. The same 192 MB file is represented by three namespace objects (1 file inode + 2 blocks) and consumes approximately 450 bytes of memory.
One data file of 128 MB is represented by two namespace objects on the NameNode (1 file inode + 1 block) and consumes approximately 300 bytes of memory. By contrast, 128 files of 1 MB each are represented by 256 namespace objects (128 file inodes + 128 blocks) and consume approximately 38,400 bytes.
Replication affects disk space but not memory consumption. Replication changes the amount of storage required for each block but not the number of blocks. If one block file on a DataNode, represented by one block on the NameNode, is replicated three times, the number of block files is tripled but not the number of blocks that represent them.
Examples:
1 x 1024 MB file
1 file inode
8 blocks (1024 MB / 128 MB)
Total = 9 objects * 150 bytes = 1,350 bytes of heap memory
8 x 128 MB files
8 file inodes
8 blocks
Total = 16 objects * 150 bytes = 2,400 bytes of heap memory
1,024 x 1 MB files
1,024 file inodes
1,024 blocks
Total = 2,048 objects * 150 bytes = 307,200 bytes of heap memory
Even more examples are in the original article from Cloudera.
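A small helper that reproduces the three totals above (the function name is just illustrative; 150 bytes per namespace object is the approximation from the quoted article):

    #include <stdio.h>

    /* Heap estimate: one inode object per file plus one object per block,
     * at roughly 150 bytes per namespace object. */
    unsigned long long namenode_heap_bytes(unsigned long long files,
                                           unsigned long long blocks_per_file)
    {
        unsigned long long objects = files + files * blocks_per_file;
        return objects * 150;
    }

    int main(void)
    {
        printf("%llu\n", namenode_heap_bytes(1, 8));     /* 1 x 1024 MB file:  1350   */
        printf("%llu\n", namenode_heap_bytes(8, 1));     /* 8 x 128 MB files:  2400   */
        printf("%llu\n", namenode_heap_bytes(1024, 1));  /* 1024 x 1 MB files: 307200 */
        return 0;
    }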
(Each file's metadata = 150 bytes) + (block metadata for the file = 150 bytes) = 300 bytes,
so 1 million files, each with 1 block, will consume 300 * 1,000,000 = 300,000,000 bytes
= 300 MB for a replication factor of 1. With a replication factor of 3 it requires 900 MB.
So as a rule of thumb, for every 1 GB of heap you can store 1 million files.

Resources