Modern hard drives are shipping with newer features, some of them more confusing than others. One feature that’s causing a lot of head-scratching is the so-called “advanced format” found on hard disks with 4096-byte (4k) sectors. Generally, these are newer SSDs as well as traditional “spinning rust” hard drives larger than 4TiB in capacity. What’s 4k all about and why do we need it? To understand this, we’re going to have to take a trip back in time and find out exactly how disks work, how an operating system talks to a disk in order to read from and write to it, and why the old way was broken and needed to be replaced with something newer and better.
Let’s start with some definitions. There are two different ways of addressing a location on a drive: the older CHS scheme and the LBA scheme, which is used by all modern operating systems.
CHS stands for Cylinder, Head, Sector and is the most low-level method of determining where to read or write from the drive. You tell it to use cylinder x, head y, and sector z and read or write the contents of that location to or from an address in the memory (a buffer). It is derived from the actual, physical components of a (traditional, spinning rust) hard drive, where you have physical cylinders and read heads. The sector is the smallest addressable unit, and was traditionally fixed at 512 bytes.
LBA is logical block addressing, wherein the drive reads from and writes to a sector identified by a single number, its offset from the start of the disk (counting from zero): for example, read sector 37 on the disk or write this data to sector 1434 on the disk.
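To make the relationship between the two schemes concrete, here is a minimal sketch of the conventional CHS-to-LBA mapping. The geometry constants (255 heads per cylinder, 63 sectors per track) are an assumption for illustration, not values read from any particular drive, and the function names are made up for this example.

```python
# Assumed geometry for illustration only; real drives report their own.
HEADS_PER_CYLINDER = 255
SECTORS_PER_TRACK = 63


def chs_to_lba(cylinder, head, sector):
    """Convert a CHS triple to an LBA. CHS sectors count from 1, LBAs from 0."""
    if cylinder < 0 or not 0 <= head < HEADS_PER_CYLINDER:
        raise ValueError("cylinder or head out of range")
    if not 1 <= sector <= SECTORS_PER_TRACK:
        raise ValueError("sector must be between 1 and SECTORS_PER_TRACK")
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def lba_to_chs(lba):
    """Inverse mapping: split an LBA back into (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector_index = divmod(remainder, SECTORS_PER_TRACK)
    return cylinder, head, sector_index + 1


print(chs_to_lba(0, 0, 1))  # the very first sector on the disk
print(lba_to_chs(1434))     # where LBA 1434 lands in the assumed geometry
```

The only subtle part is the off-by-one: sectors in CHS are numbered from 1, while cylinders, heads, and LBAs are all numbered from 0.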
The problem? Each of these values is limited in range. In fact, because of how severely limited CHS was, LBA had to be introduced. For CHS, C (the cylinder) can be at most 1023 (cylinders are numbered from 0, so that’s 1024 of them), H (heads) can be 255 maximum, and S (sector) can only go up to 63 (sectors are numbered from 1), meaning you can have at most 1024 cylinders x 255 heads x 63 sectors x 512 bytes mapped in traditional CHS format, giving you a grand total of under 8 GiB (about 7.84 GiB)! Using CHS, it’s simply not possible to access a disk larger than that!
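The CHS ceiling is easy to verify with a few lines of arithmetic. This sketch assumes sectors are numbered 1 through 63 (so 63 usable values per track); the variable names are just for illustration.

```python
# Upper bounds of traditional CHS addressing.
CYLINDERS = 1024          # cylinder numbers 0..1023
HEADS = 255               # the maximum head count given in the text
SECTORS_PER_TRACK = 63    # sector numbers 1..63 (assumed: no sector 0)
SECTOR_SIZE = 512         # bytes

total_sectors = CYLINDERS * HEADS * SECTORS_PER_TRACK
total_bytes = total_sectors * SECTOR_SIZE

print(total_sectors, "addressable sectors")
print(f"{total_bytes / 2**30:.2f} GiB addressable")
```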
So LBA was introduced with a 32-bit limit, giving you 2^32 x 512 bytes or a 2 TiB limit on disk size. This is the reason an MBR disk cannot exceed 2TiB: its partition table records partition locations in CHS and 32-bit LBA, and neither can describe anything over 2TiB.
Newer, better options have been introduced like the GPT partitioning scheme, which extends LBA to 64 bits, giving you a heck of a lot more than you’ll ever need at 2^64 x 512 bytes. But there’s a catch: a lot of legacy hardware, legacy operating systems, legacy BIOS implementations, and legacy drivers don’t support UEFI or GPT, and a lot of people would like to have something that can be more easily upgraded to go past the 2TiB limit without having to rewrite the entire stack from scratch. And, at long last, we reach the 4096-byte sector size.
See, throughout all the limitations discussed above, one thing has been a fixed assumption: the sector size. From day one, it has been 512 bytes and it’s stayed that way ever since. But recently, hard disk manufacturers realized there’s an opportunity to work some magic: take the traditional CHS or 32-bit LBA and simply replace the sector size with 4096 (4k) instead of 512 bytes. When an OS says “give me the 2nd sector on the disk” by requesting LBA 1 (because LBA 0 is the first), we aren’t going to give it bytes 512 – 1023 but rather bytes 4096 – 8191.
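The effect of swapping the sector size can be sketched as a tiny function that maps an LBA to the byte range it covers. The function name `byte_range` is made up for this illustration.

```python
def byte_range(lba, sector_size):
    """Return the (first, last) byte offsets covered by one sector."""
    start = lba * sector_size
    return start, start + sector_size - 1


# The same request, "give me LBA 1", under the two sector sizes.
print(byte_range(1, 512))
print(byte_range(1, 4096))
```

The request from the OS is identical in both cases; only the drive’s interpretation of “one sector” changes.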
Suddenly, our 2TiB limit is upgraded to 2^32 x 4096 bytes, or 16 TiB, without having to ditch MBR, switch to UEFI or GPT, or anything! An additional advantage is that things are a lot faster because if you’re reading and writing 4096 bytes at a time, it’s 8x fewer operations to read or write, say, 4GiB of data, which helps with avoiding flooding the command buffer and improves random access times, cache locality, and more.
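Both numbers in this paragraph can be checked with a short sketch. Note the assumption baked into the second half: it counts sectors, treating each sector as one operation, whereas a real driver can batch many sectors into a single request.

```python
TIB = 2**40
GIB = 2**30


def max_addressable_bytes(lba_bits, sector_size):
    """Largest disk reachable with an LBA of the given width."""
    return (2**lba_bits) * sector_size


def sectors_needed(total_bytes, sector_size):
    """Number of sectors needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // sector_size)


limit_512 = max_addressable_bytes(32, 512)
limit_4k = max_addressable_bytes(32, 4096)
print(limit_512 // TIB, "TiB with 512-byte sectors")
print(limit_4k // TIB, "TiB with 4096-byte sectors")

# Sector count for 4 GiB of data under each sector size.
ops_512 = sectors_needed(4 * GIB, 512)
ops_4k = sectors_needed(4 * GIB, 4096)
print(ops_512 // ops_4k, "times fewer sectors with 4096-byte sectors")
```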
The only catch is that if the OS isn’t aware that this is a magic disk that uses 4096-byte sectors instead of 512-byte sectors, there’s going to be a mismatch. Each time the OS says “hey, you, disk, write me these 512 bytes to offset xxx” the disk will use up 4096 bytes to store these 512 bytes (the rest being zeros or junk data, assuming you don’t end up with a memory underflow) because they don’t communicate in bytes, they communicate in sectors.
So BIOSes now (sometimes) include an option to let you manually specify that a 512-byte sector size should be used instead of the native 4096-byte sector size that newer disks are using, with the caveat that you cannot use it to access more than 2TiB of the disk on an MBR system, just like it was in the “good old days.” But modern OSes that are 4k-aware can take advantage of this magic to read and write in 4096-byte chunks, and voilà!
ty! most informative
How about a list of at least some of the operating systems which do support “Advanced Format” and “4k” drives (“4k” drives are those which can *not* be set to logically use only 512 bytes/sector; either the OS supports 4096 bytes/sector, or else you can not use a “4k” drive!)?
Not entirely accurate; advanced sector drives were primarily developed to increase drive capacity, albeit slightly, if just enough for a marketing edge. The 512 bytes in a sector is only part of the disk’s storage; there are also preamble and ID bytes that identify the sector, and this is overhead. Manufacturers discovered that by having the overhead only on every 8th sector, they could cram a few more bytes on a track, because on most OSs, hard disk storage is allocated in clusters, which are power-of-2 multiples of the sector size. This is done as an efficiency tradeoff; fewer clusters means indexing is more efficient, but extra space at the end of each file’s final cluster is wasted. Ever since NT4.0 (possibly even earlier), the default HD cluster size has been 4K, or 8 sectors (diskettes were typically 2 sectors). The problem with non-aware OSs is that Partition 1 (usually the only partition on most users’ systems) starts on a cylinder boundary, which is NOT a 4K boundary. End result: reading a logical cluster takes TWO reads, one before the boundary and one after it.
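The misalignment described in this comment can be checked with a rough sketch. It assumes the classic layout where the first partition starts at LBA 63 (one track in, with 512-byte logical sectors), and compares it with a start at LBA 2048, a commonly used 1 MiB-aligned start; the function names are made up for illustration.

```python
LOGICAL_SECTOR = 512     # what the OS addresses
PHYSICAL_SECTOR = 4096   # what the drive actually reads and writes
CLUSTER_SIZE = 4096      # default cluster size mentioned in the comment


def is_aligned(partition_start_lba):
    """True if the partition's first byte sits on a 4K physical boundary."""
    return (partition_start_lba * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0


def physical_sectors_touched(partition_start_lba, cluster_index):
    """How many physical sectors one cluster of the partition spans."""
    start = partition_start_lba * LOGICAL_SECTOR + cluster_index * CLUSTER_SIZE
    first = start // PHYSICAL_SECTOR
    last = (start + CLUSTER_SIZE - 1) // PHYSICAL_SECTOR
    return last - first + 1


for start_lba in (63, 2048):
    print(start_lba, is_aligned(start_lba), physical_sectors_touched(start_lba, 0))
```

With the assumed LBA 63 start, every cluster straddles two physical sectors; with a start on a 4K boundary, each cluster maps onto exactly one.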
Traditional knowledge from the SATA HDD age, now applicable to NVMe SSDs as well.