Filesystem defragmentation: a thing of the past?

Judging by online forums, you could easily be convinced that filesystem defragmentation is a thing of the past. As with everything else, however, whether this is true depends on your hardware and how you use it. This post is an attempt to look at the relevance of defragmentation in modern systems.

What does it mean to defragment a filesystem?

In the context of this post, defragmentation is about ensuring that a file’s data blocks are stored optimally on the storage medium. Data is stored optimally on a device if it can be accessed fast, meaning that storage optimality depends on: 1) how the data is accessed, and 2) the characteristics of the storage medium.

Figure 1: Defragmentation is based on the idea of storing file blocks in the same order they are accessed

In the case of hard disks, data is accessed faster when stored in contiguous addresses. For every non-contiguous pair of addresses, the disk head needs to cover the distance between them, adding precious milliseconds to the total access time. To optimize data accesses for hard disks, defragmentation tools assume that data will be accessed sequentially within some context, e.g. a file or extent. As a result, the problem of defragmenting a file or extent is reduced to storing all its blocks in consecutive addresses. Figure 1 illustrates how defragmentation would work on a typical file, “Slow.avi”. Since the Slow.avi video file is expected to be watched from beginning to end, storing its data blocks sequentially will achieve the fastest data transfer rate when loading it.
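If you want to feel the cost of non-contiguous access for yourself, here is a minimal Python sketch, purely illustrative and not taken from any defragmentation tool, that reads the same file once in order and once in a shuffled block order and compares the wall-clock times. The file path and block size are placeholders; run it on a cold page cache (for instance after writing 3 to /proc/sys/vm/drop_caches as root), and expect the two times to be nearly identical on an SSD.

```python
import os
import random
import time

PATH = "Slow.avi"     # placeholder file from Figure 1
BLOCK = 1 << 20       # read in 1 MiB chunks

def read_blocks(path, offsets):
    # Read BLOCK bytes at each offset, in the order given.
    with open(path, "rb", buffering=0) as f:
        for offset in offsets:
            f.seek(offset)
            f.read(BLOCK)

size = os.path.getsize(PATH)
offsets = list(range(0, size, BLOCK))

start = time.monotonic()
read_blocks(PATH, offsets)            # sequential: one long streaming read
sequential = time.monotonic() - start

random.shuffle(offsets)
start = time.monotonic()
read_blocks(PATH, offsets)            # shuffled: every block may cost a seek
shuffled = time.monotonic() - start

print("sequential: %.2fs  shuffled: %.2fs" % (sequential, shuffled))
```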

Sequentially defragmenting a file does not work, of course, if the file resides on a hard disk and is typically accessed in a non-sequential manner. But that’s not necessarily a problem that the defragmentation task should have to solve. If you’re accessing a file in a specific pattern, you should consider storing its blocks in that order. If you’re accessing a file in multiple different patterns equally often, then maybe you should consider the tradeoff of storage capacity for performance, and store multiple copies that are arranged properly. Of course, there’s a paper on that by Huang et al. [1].

In the case of solid state drives (SSDs), performing defragmentation is easier. There is no difference in latency if the addresses accessed are non-consecutive, as long as you’re reading. To speed up writes to the SSD, however, free space needs to be defragmented. The good news is that SSDs come with the smarts to pack used data blocks, built right into their firmware. The bad news: they have no idea which blocks are used; after all, that gets decided by the man upstairs: the filesystem.

But hey, the SSD guys thought of that and put together the TRIM command [2], which marks a block as unused. Just like so many other things that storage companies have insisted on handling at the device level, however, it doesn’t work quite as you might want it to. Specifically, TRIM defers erasing the data until the block is garbage collected. This is no big deal, unless your drive operates in an adversarial environment where you don’t want people to see that you just asked your drive to delete some data [3]. But don’t worry, it’s not like you’ll ever store data in a cloud or anything (/sarcasm). Of course, we wouldn’t have to TRIM if we could garbage collect SSD blocks at a layer outside the device… but don’t get me started on that.
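On Linux, the filesystem's side of this conversation is exposed through the FITRIM ioctl, which is what the fstrim utility uses to tell the device which block ranges are free. The sketch below is a hedged, simplified version of that call: it needs root, a filesystem that supports the ioctl, and the mount point is a placeholder; the ioctl constant shown is the usual x86-64 value.

```python
import array
import fcntl
import os
import struct

FITRIM = 0xC0185879          # _IOWR('X', 121, struct fstrim_range); may differ on other archs

def fstrim(mountpoint, minlen=0):
    # struct fstrim_range { u64 start; u64 len; u64 minlen; }
    # start=0, len=max: ask the kernel to discard all free space it can find.
    rng = array.array("B", struct.pack("QQQ", 0, 2**64 - 1, minlen))
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FITRIM, rng, True)   # kernel writes back the bytes trimmed
    finally:
        os.close(fd)
    _, trimmed, _ = struct.unpack("QQQ", rng.tobytes())
    return trimmed

if __name__ == "__main__":
    # Roughly what `sudo fstrim /` does; needs CAP_SYS_ADMIN.
    print("trimmed %d bytes" % fstrim("/"))
```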

Does filesystem fragmentation really impact performance?

Again, it depends. If you process data at a slower rate than the device’s data transfer rate, and you have enough memory to spare, your operating system can mask the effect of fragmentation by prefetching data in memory before it is requested by the application. This is a big If, however, as applications commonly behave more haphazardly than media players that sequentially read files from beginning to end. Furthermore, data is usually consumed much faster than the typical random data transfer rates that storage devices can provide.
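For the curious, this kind of prefetching is something an application can also nudge along explicitly. The sketch below is just an illustration, not something any particular media player does: it uses posix_fadvise on Linux to ask the kernel to start pulling a file's (possibly scattered) blocks into the page cache before the application touches them. The path is a placeholder, and the advice is only a hint.

```python
import os

def prefetch(path):
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # Hint that we'll read the whole file front to back, and ask the
        # kernel to kick off readahead now, ahead of the first read().
        os.posix_fadvise(fd, 0, size, os.POSIX_FADV_SEQUENTIAL)
        os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)

prefetch("Slow.avi")   # placeholder path from Figure 1
```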

Table 1: Impact of fragmentation on everyday tasks

Joe Kinsella from Condusiv Technologies put this to the test by fragmenting the free space on a filesystem, and using that fragmented space to store files accessed by different everyday applications [4]. Table 1 shows some of the reported results. Some of the numbers are not easy to explain, e.g. why task completion time decreases with fragmentation for the Microsoft Outlook and Anti-Spyware tests. Overall, however, task completion times seem to increase by 4-124% with 10% fragmentation, which is a degree of fragmentation I tend to observe on my own filesystems.

Why do files end up fragmented?

For defragmentation to make sense, files must end up fragmented. As we already covered, one reason for that is free space fragmentation. In 1986, the average person kept 500MB of data around, while in 2007 this number grew to 44.5GB [5]. At the time, hard drives of twice that size were commonplace [6]. Since we use ever smaller fractions of the storage devices that come with our personal computers, file blocks do not have to be spaced apart, so why should we even worry about file fragmentation?

A determining factor in the degree of fragmentation that files will exhibit is the design of the underlying filesystem; specifically, its block allocation and write policies.

The block allocation algorithms of modern filesystems tend to allocate blocks of the same file close together on the storage medium [7] (Section 5.10.11). This works well when the filesystem follows a write policy that updates data in place, i.e. a block is updated by overwriting the original data block. Given a good original placement of the data, this is a good strategy for keeping fragmentation low. The downside is that overwriting data blocks may leave the filesystem in an inconsistent state in the event of a crash. That problem is solved by journaling updates, i.e. also appending the data to a log, a technique that comes with performance implications due to the additional write operation (a very rich literature exists on mitigating this problem in a myriad creative ways, which I will not even attempt to cite). Filesystems such as ext2/3/4 in Linux, and NTFS in Windows, update their data in place.
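To make the allocation policy concrete, here is a toy model, not the algorithm of any real filesystem, of the keep-blocks-together idea: when a file grows, try the block right after its last one, and only fall back to the first free block elsewhere, which is exactly where fragmentation creeps in.

```python
class ToyAllocator:
    def __init__(self, nblocks):
        self.free = [True] * nblocks       # simplistic free-space bitmap
        self.files = {}                    # file name -> list of block addresses

    def append_block(self, name):
        blocks = self.files.setdefault(name, [])
        want = blocks[-1] + 1 if blocks else 0
        if want < len(self.free) and self.free[want]:
            addr = want                    # contiguous: extend the last extent
        else:
            addr = self.free.index(True)   # fragmented: first free block anywhere
        self.free[addr] = False
        blocks.append(addr)
        return addr

alloc = ToyAllocator(16)
for _ in range(3):
    alloc.append_block("a.txt")            # a.txt gets blocks 0, 1, 2
alloc.append_block("b.txt")                # b.txt takes block 3, right after a.txt
alloc.append_block("a.txt")                # a.txt must jump to block 4: one more fragment
print(alloc.files)                         # {'a.txt': [0, 1, 2, 4], 'b.txt': [3]}
```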

To avoid the performance (and code complexity) issues that plague logging, some filesystems treat all of storage as a log, always appending data instead of overwriting it. This technique, originally introduced in log-structured filesystems and now termed write-anywhere or copy-on-write, has made its way into many modern filesystems, such as btrfs in Linux, and HFS+ in OS X. While write-anywhere filesystems are expected to achieve better write performance, they come with their own set of problems, one of which is file fragmentation. Files in a write-anywhere filesystem will normally become more fragmented every time a write occurs at one of their offsets, as the new block will be placed at the end of the log. To reduce free-space fragmentation, traditional log-structured filesystems have employed garbage collection for data compaction, i.e. moving allocated blocks closer together. To reduce file fragmentation, write-anywhere filesystems perform online defragmentation when it is most needed; this depends on different heuristics for different filesystems.
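The fragmentation effect is easy to see in a toy model of a write-anywhere layout (again, not any real filesystem): every write, including an overwrite of an existing offset, lands at the tail of the log, so a file updated in the middle ends up with scattered blocks.

```python
class ToyLogFS:
    def __init__(self):
        self.log = []                          # the append-only "device"
        self.block_map = {}                    # (file, block#) -> log address

    def write(self, name, blockno, data):
        addr = len(self.log)                   # new data always goes at the tail
        self.log.append(data)
        self.block_map[(name, blockno)] = addr # the old copy is simply abandoned

    def layout(self, name, nblocks):
        return [self.block_map[(name, i)] for i in range(nblocks)]

fs = ToyLogFS()
for i in range(4):
    fs.write("slow.avi", i, "v1-block%d" % i)  # initial write: addresses 0, 1, 2, 3
print(fs.layout("slow.avi", 4))                # [0, 1, 2, 3] -- perfectly sequential
fs.write("slow.avi", 1, "v2-block1")           # overwrite block 1: appended at address 4
print(fs.layout("slow.avi", 4))                # [0, 4, 2, 3] -- the file is now fragmented
```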

Do modern filesystems defragment?

You’d think they don’t, because we’re told “there is little benefit to defragmenting” nowadays [8]. That doesn’t mean it doesn’t happen, however. It just means that modern filesystems don’t depend on you to trigger a defragmentation operation.

More specifically, write-anywhere filesystems tend to perform online, heuristic-dependent defragmentation. Btrfs performs online defragmentation on files that have experienced small (≤64KB) random writes [9], and HFS+ tends to small (≤20MB) files with more than 8 fragments [10]. Although this happens while you’re using the filesystem, odds are you won’t notice it, because you’re probably not hammering your disk all the time. And when you don’t, it’s tidying up the mess you made.
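As a rough illustration of the HFS+-style trigger, you can approximate the check in user space with the fragment count reported by filefrag (from e2fsprogs). The thresholds below are the reported ones, but the code is mine and takes nothing from actual HFS+ or btrfs sources; the path is a placeholder.

```python
import os
import subprocess

def extent_count(path):
    # filefrag prints a line like "slow.avi: 13 extents found"
    out = subprocess.run(["filefrag", path], capture_output=True,
                         text=True, check=True).stdout
    return int(out.split()[-3])

def would_defragment(path, max_size=20 << 20, max_fragments=8):
    # Small file, many fragments: the kind of file HFS+ is said to tidy up on the fly.
    return (os.path.getsize(path) <= max_size
            and extent_count(path) > max_fragments)

print(would_defragment("slow.avi"))    # placeholder path
```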

Figure 2: NTFS defragmentation operation in progress in Windows 7

In the case of update-in-place filesystems, defragmentation is not as sophisticated. Defragmentation tools for these filesystems usually need to be triggered manually, and they scan the entire filesystem trying to do their best. In Linux, the ext4 filesystem is usually defragmented using e4defrag [11]. e4defrag defragments a file by attempting to allocate an empty file of the same size with as many contiguous blocks as possible. If the fragmentation of the empty file is better than that of the original, it proceeds to copy the data over and delete the original [12] [13] (a rough sketch of this strategy follows below). Maybe not as fancy as the copy-on-write approach, but oh well. In Windows, NTFS is not much different. The Disk Defragmenter application is scheduled to run weekly, and it goes over your filesystem multiple times to optimize it. I have no idea what it’s actually doing, but I’ll tell you this: it took me 12 hours to defragment a 300GB filesystem that was only 25% full (i.e. 75GB of data) and 8% fragmented. As shown in Figure 2, the program performed multiple passes (15 in total, before it terminated) over the filesystem, and each pass had a defragmentation and a consolidation phase. Presumably, the former lowers file fragmentation, while the latter lowers free space fragmentation. Whatever it does, I’m sure it could be optimized.
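Here is the promised sketch of the allocate-compare-copy strategy, simplified to plain user-space operations. The real e4defrag asks the kernel to swap extents with the donor file rather than copying and renaming in user space, so treat this purely as an illustration of the idea; the path is a placeholder, and filefrag (from e2fsprogs) must be installed.

```python
import os
import shutil
import subprocess

def extent_count(path):
    # filefrag prints a line like "slow.avi: 13 extents found"
    out = subprocess.run(["filefrag", path], capture_output=True,
                         text=True, check=True).stdout
    return int(out.split()[-3])

def naive_defrag(path):
    donor = path + ".defrag"
    size = os.path.getsize(path)
    fd = os.open(donor, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)    # ask for all the space up front, ideally contiguous
    finally:
        os.close(fd)
    if extent_count(donor) >= extent_count(path):
        os.unlink(donor)                   # no improvement: keep the original layout
        return False
    with open(path, "rb") as src, open(donor, "r+b") as dst:
        shutil.copyfileobj(src, dst)       # copy the contents into the better-placed blocks
    os.replace(donor, path)                # atomically swap in the less fragmented copy
    return True

print(naive_defrag("slow.avi"))            # placeholder path
```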

Conclusion (TL;DR)

You might have heard that defragmentation is a thing of the past; that’s a myth. You might have read that you don’t need to manually defragment your filesystem; that’s probably true. Nowadays we don’t need defragmentation as badly as we did in the past. That’s because we are now building our filesystems to be smarter (read: more complex), so that they prevent really bad fragmentation. However, file fragmentation can still happen despite our best efforts. To combat that, some filesystems have built-in heuristics that defragment files when you mess them up. Other filesystems are not as selective, and will quietly perform full scans behind your back to fix things up.

References:
[1] http://dl.acm.org/citation.cfm?id=1095836
[2] https://en.wikipedia.org/wiki/Trim_%28computing%29
[3] http://asalor.blogspot.com/2011/08/trim-dm-crypt-problems.html
[4] http://www.condusiv.com/disk-defrag/fragmentation-impact/
[5] http://www.businessinsider.com/average-hard-disk-use-2013-7
[6] http://www.tomshardware.com/reviews/2007-hdd-rundown,1522.html
[7] http://www.tldp.org/LDP/sag/html/filesystems.html
[8] https://support.apple.com/en-us/HT1375
[9] http://kernelnewbies.org/Linux_3.0#head-3e596e03408e1d32a7cc381d6f54e87feee22ee4
[10] http://osxbook.com/software/hfsdebug/fragmentation.html
[11] http://manpages.ubuntu.com/manpages/vivid/man8/e4defrag.8.html
[12] http://www.linux.org/threads/online-defragmentation.4121/
[13] http://git.kernel.org/cgit/fs/ext2/e2fsprogs.git/tree/misc/e4defrag.c
