eRacks Systems Tech Blog

Open Source Experts Since 1999

The history of computing is littered with examples of storage capacity overstepping software’s ability to make effective use of it (for a list of examples, and an entertaining read, check out this old article from 2000: http://www.dewassoc.com/kbase/hard_drives/hard_drive_size_barriers.htm).  Today is certainly no exception.  With RAID arrays in excess of 32 TB (one terabyte is roughly 1,000 gigabytes), you must be thorough in your research, lest you discover the hard way that the software configuration you wish to use supports only a small fraction of the available disk space.  This article was compiled to help others make wise decisions when putting high-capacity storage to use.

There are two basic considerations to take into account.  The first is partitioning.  The second is your choice of filesystem (which in turn is often determined, at least in part, by the operating system).

Partitioning

One of the most fundamental logical units of storage, treated by most operating systems as a “disk” in its own right, is the partition.  For those of us who are unaware of what a partition is or why it’s important (those who know might want to skip ahead to the next paragraph), imagine the North American continent.  Though in reality it’s a single physical chunk of land, it’s separated into logical boundaries: Canada, the United States and Mexico.  Without these boundaries to demarcate those areas of land available to each country, it would be much more difficult to decide which resources belong to whom.  In the same vein, partitions exist to logically separate a hard disk into regions of storage designated for various purposes.

The problem here is that in the original BIOS-style (MBR) scheme, one can only create partitions of up to 2 TB in size.  This is because the BIOS-style partition table stores block addresses as 32-bit numbers, so it can only keep track of up to 2^32 blocks, each typically 512 bytes in size.  Multiplying these two quantities gives 2^41 bytes, which is the 2 TB maximum.  There are two ways to approach this problem.
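The arithmetic behind the 2 TB figure can be checked with a few lines of Python.  This is only an illustrative sketch: the helper name max_addressable_bytes is made up for this example, and the 512-byte block size is the typical value mentioned above rather than a universal constant.

```python
# Sketch: largest size reachable with fixed-width block addresses.
# max_addressable_bytes is a hypothetical helper, not part of any real tool.

TIB = 2 ** 40  # bytes in one (binary) terabyte


def max_addressable_bytes(address_bits, block_bytes=512):
    """Return how many bytes 2^address_bits blocks of block_bytes can cover."""
    return (2 ** address_bits) * block_bytes


bios_limit = max_addressable_bytes(32)
print(bios_limit)        # 2199023255552 bytes
print(bios_limit / TIB)  # 2.0
```

Note that the limit scales with the block size, so a disk that reports larger blocks would push the ceiling higher, but 512 bytes is the usual case.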

The first is to accept the limitation and to create many small partitions.  The disadvantage, of course, is that you must spread your data out over a large area.  The Logical Volume Manager (LVM) on Linux can somewhat mitigate this problem by tying everything together into a single logical disk, but even so, we can certainly do better.

The second, and in my opinion the superior choice, is to stop using traditional BIOS-style partitions and to instead make use of a relatively new standard known as GPT (GUID Partition Table).  Unlike BIOS-style partition tables, a GPT uses 64-bit addresses, meaning that each partition has a maximum size of 2^64 blocks x 512 bytes per block = 2^73 bytes, or 8 ZB (that’s zettabytes; for comparison, 1 ZB is about 1 billion TB).  That’s A LOT better than 2 TB!
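To put the two limits side by side, here is a small Python sketch that checks whether a volume of a given size is too large for a single BIOS-style partition.  The function needs_gpt and the constant names are hypothetical, and the 512-byte block size is again an assumption.

```python
# Sketch: compare the BIOS-style and GPT partition size limits.
# needs_gpt and the constants below are hypothetical names for illustration.

TIB = 2 ** 40      # bytes in one (binary) terabyte
ZIB = 2 ** 70      # bytes in one (binary) zettabyte
BLOCK_BYTES = 512  # typical block size; an assumption

BIOS_LIMIT = (2 ** 32) * BLOCK_BYTES  # 32-bit block addresses
GPT_LIMIT = (2 ** 64) * BLOCK_BYTES   # 64-bit block addresses


def needs_gpt(volume_bytes):
    """True if the volume cannot fit in a single BIOS-style partition."""
    return volume_bytes > BIOS_LIMIT


print(GPT_LIMIT // ZIB)         # 8
print(GPT_LIMIT // BIOS_LIMIT)  # 4294967296
print(needs_gpt(32 * TIB))      # True
print(needs_gpt(1 * TIB))       # False
```

A 32 TB RAID array falls well past the BIOS-style limit, which is why the choice of partition table matters before anything else.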

The tradeoff, if you choose to go the GPT route, is that not all operating systems support it.  At the time of this writing, the following are known to NOT support GPT (or at least not without jumping through some hoops): FreeBSD (partial support for GPT partitions exists); OpenBSD; NetBSD (GPT filesystems are supported via dkwedges, but cannot be booted from directly); OpenSolaris (again, GPT is supported for separate data partitions, but cannot be booted from directly); Fedora Core, CentOS and Red Hat Enterprise Linux; and Windows XP or below (GPT only works in Windows XP x64, and only for separate data partitions).  This is not an exhaustive list.  By contrast, here is a list (also not exhaustive) of operating systems that do fully support GPT partitions out of the box: Debian, Ubuntu and Gentoo Linux, Windows Vista and Windows 7.

Your choice of operating system will therefore determine whether or not you can take advantage of what GPT has to offer.

Filesystem Considerations

Now that we’ve got the partitioning figured out, we’ll have to consider filesystem limitations.  The rest of this article assumes that you’ve either made use of a GPT partition or that you’re on Linux and have created one large LVM volume.

You might be tempted to think, “now that I have a large partition, I just have to format it and I’m done!”  Sometimes this is true, as is the case with *BSD and with any version of Windows that supports GPT partitions (assuming you’ve jumped through the hoops necessary to create the partition in the first place).  If you plan to use Linux, however, you’ll need to be a little more careful, as you have several filesystems to choose from, and not all of them support large volumes.

The default Linux filesystem for a long time was ext3.  It does support large filesystems, but with a 4K block size, it will only address up to 16 TB of space.  You can create an ext3 filesystem with an 8K block size, for a maximum size of 32 TB, but only if you’re working with an architecture that supports 8K page sizes (and unless you’re using an Itanium or an Alpha processor, you’re probably out of luck).
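Both ext3 figures follow from the same multiplication used for partition tables.  The sketch below assumes that ext3 numbers its blocks with 32 bits, which is consistent with the 16 TB and 32 TB limits quoted above; the function name ext3_max_bytes is invented for this example.

```python
# Sketch: ext3 size limit as a function of block size.
# Assumes 32-bit block numbers; ext3_max_bytes is a hypothetical helper.

TIB = 2 ** 40  # bytes in one (binary) terabyte
EXT3_BLOCK_NUMBER_BITS = 32  # assumption, consistent with the 16 TB figure


def ext3_max_bytes(block_bytes):
    """Return the largest filesystem size for the given block size."""
    return (2 ** EXT3_BLOCK_NUMBER_BITS) * block_bytes


for block_kib in (4, 8):
    limit = ext3_max_bytes(block_kib * 1024)
    print(f"{block_kib}K blocks: {limit // TIB} TB")
# 4K blocks: 16 TB
# 8K blocks: 32 TB
```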

More recently, ext3 has been superseded by ext4.  Theoretically, ext4, with 4K blocks, supports up to 1 EB (exabyte, equal to 1 million TB).  However, for now, due to limitations in the tools used to create ext4 filesystems, you’re still limited to 16 TB.  Hopefully, this will be fixed in the not too distant future.
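The gap between the theoretical and practical ext4 limits comes down to address width.  As a rough sketch, assuming ext4’s 48-bit block numbers and a tool limit equivalent to 32-bit block numbers (the latter is an assumption that happens to match the 16 TB figure), the numbers work out as follows.  The helper name ext4_limit_bytes is hypothetical.

```python
# Sketch: theoretical vs. tool-limited ext4 sizes with 4K blocks.
# ext4_limit_bytes is a hypothetical helper; the 32-bit tool limit is an
# assumption chosen because it reproduces the 16 TB figure.

TIB = 2 ** 40       # bytes in one (binary) terabyte
EIB = 2 ** 60       # bytes in one (binary) exabyte


def ext4_limit_bytes(block_number_bits, block_bytes=4096):
    """Return the size covered by 2^block_number_bits blocks."""
    return (2 ** block_number_bits) * block_bytes


theoretical = ext4_limit_bytes(48)   # on-disk format limit
tool_limited = ext4_limit_bytes(32)  # what the creation tools handle for now

print(theoretical // EIB)   # 1
print(theoretical // TIB)   # 1048576
print(tool_limited // TIB)  # 16
```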

Fortunately, Linux does support filesystems that can span large volumes.  These include (but are not necessarily limited to) XFS (up to 16 EB) and JFS (known as JFS2 on AIX; up to 32 PB with 4K blocks).

Conclusion

Eventually, these issues will be smoothed over, just like all the others that have surfaced throughout the history of computers.  For now, however, one must take some time to plan how best to utilize large capacity volumes, as the software industry still has quite a bit of catching up to do.

eRacks Open Source Systems well understands the issues faced when dealing with so much storage, and will be more than happy to help you with your needs.  Check us out at http://www.eracks.com, and call for a quote today!

August 4th, 2010
