DU vs. DF – Which One’s Right? Which To Trust?

du and df are both basic Linux commands that come pre-installed on every flavor of Linux (to the best of my knowledge).

Snippet from the man page for du

NAME
du — display disk usage statistics

DESCRIPTION
The du utility displays the file system block usage for each file argument and for each directory in the file hierarchy rooted in each directory argument. If no file is specified, the block usage of the hierarchy rooted in the current directory is displayed.

And then for df

NAME
df — display free disk space

DESCRIPTION
The df utility displays statistics about the amount of free disk space on the specified filesystem or on the filesystem of which file is a part. Values are displayed in
512-byte per block counts. If neither a file or a filesystem operand is specified, statistics for all mounted filesystems are displayed (subject to the -t option below).

These commands are used pretty much every day, either by a SysAdmin troubleshooting an issue or setting up an application, or by scripts that need to know the available or used disk space.

…But have you ever noticed that sometimes (… every time) the results they show are different?

Typically (not always, but usually), the size reported by df will be more than what’s reported by du, and it’s very rare for them to report the same disk usage. Personally, I have never seen them report the same size, and I’ve managed more Linux servers than I can even begin to count. The two commands use a different “ruler”, so to speak, when they determine the size of a folder or mount. So depending on what exactly you’re trying to measure, they can both be correct.

NOTE: In the output of any of my examples of du, I use the -s flag to display the summary of said directory, as opposed to outputting a tree-like display of the contents of the directory, and then the size of each of the files/folders. Then the -h flag will display the space in human-readable format (for both du and df).
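
To make the effect of -s concrete, here is a small sketch that builds a throwaway directory tree (the file and folder names are made up for the demo) and counts how many lines du prints with and without the flag. It uses -k instead of -h only because plain numbers are easier to handle in a script.

```shell
#!/usr/bin/env bash
# Build a small throwaway tree: one top directory with two subdirectories.
tree_dir=$(mktemp -d)
mkdir -p "$tree_dir/logs" "$tree_dir/cache"
echo "hello" > "$tree_dir/logs/app.log"
echo "world" > "$tree_dir/cache/data.bin"

# Without -s, du prints one line per directory it walks.
lines_full=$(du -k "$tree_dir" | wc -l | tr -d ' ')

# With -s, du prints a single summary line for the argument.
lines_summary=$(du -sk "$tree_dir" | wc -l | tr -d ' ')

echo "du -k  printed $lines_full lines"
echo "du -sk printed $lines_summary line"

# Clean up the demo tree.
rm -rf "$tree_dir"
```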

There are a few reasons why du and df disagree, so let’s go over them in some detail.

Reason # 1

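df looks at the filesystem as a whole, so its number includes space that isn’t reachable as a file in the directory tree, such as the filesystem’s own metadata or data the kernel is still holding on to. du doesn’t account for any of that, just the files it can actually find on disk.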

Reason # 2

The command df will include the size of deleted files that still have open file descriptors. So hypothetically, say you have a large file (let’s say a 4GB log file, because someone didn’t enable logrotate) that a process has open, and then someone steps on your toes and accidentally deletes that same 4GB log file (or right then, it finally gets rotated). df will still include the size of the file as if it were still there, because the space isn’t released until the last descriptor is closed. du can no longer see the file, so the two numbers for that folder/mount point drift apart.

NOTE: You can use the lsof command to help find file descriptors to deleted files. The exact command is lsof +L1
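
Here is a minimal sketch of that situation you can run safely in a temporary directory. The file name and the 8 MiB size are made up for the demo. It deletes a file while a descriptor is still open, then shows that du no longer counts it even though the data can still be read through the descriptor.

```shell
#!/usr/bin/env bash
# Throwaway directory and file standing in for the runaway log.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/big.log"

# Write 8 MiB of zeros (8192 blocks of 1024 bytes).
dd if=/dev/zero of="$demo_file" bs=1024 count=8192 2>/dev/null

# Open the file on descriptor 3, then delete it by name.
exec 3<"$demo_file"
rm "$demo_file"

# du walks the directory tree, so it no longer sees the 8 MiB.
du_kb=$(du -sk "$demo_dir" | cut -f1)

# The data is still there: count the bytes readable through descriptor 3.
bytes_via_fd=$(wc -c <&3 | tr -d ' ')

echo "du now reports ${du_kb} KB for $demo_dir"
echo "descriptor 3 still holds ${bytes_via_fd} bytes"

# Closing the descriptor is what finally lets the filesystem free the space.
exec 3<&-
rmdir "$demo_dir"
```

While descriptor 3 is open, the filesystem still has those blocks allocated, which is why df keeps counting them.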

Reason # 3

The command df, for the most part, gets its info from the filesystem’s primary superblock, so it’s almost as if the results were cached and you’re pulling them from the cache. du, on the other hand, gathers the information at the time you execute the command, by walking every file and directory. You can tell by how long the commands take to execute. I’ll run both du and df on the same machine, against the same mount, and wrap each in a time command…

root@RPI02:~# time df -h /mnt/media/
Filesystem             Size  Used Avail Use% Mounted on
//192.168.1.140/Media  5.4T  2.4T  3.1T  44% /mnt/media

real	0m0.020s
user	0m0.010s
sys	0m0.000s
root@RPI02:~# time du -sh /mnt/media/
1.6T	/mnt/media/

real	0m6.861s
user	0m0.270s
sys	0m0.800s

Not only is the disk usage almost a full TB off (2.4T vs. 1.6T), but there was a difference of more than 6 seconds! You can test this yourself by copying or moving data, then running both commands under watch in two different terminals and comparing the sizes as they change. The results from df will come back much faster, but they won’t necessarily change every time it executes.
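
If you want the two numbers side by side without eyeballing them, something along these lines should work. It is only a sketch: the function names are made up, and it assumes the filesystem name in the df output contains no spaces, since awk splits columns on whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: used space in KB as seen by df and by du.

# df: ask the filesystem that holds "$1".
# -P gives one stable line per filesystem; column 3 is "Used".
df_used_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# du: walk the tree under "$1" and add up what it finds.
# -x keeps du from crossing into other mounted filesystems.
du_used_kb() {
    du -skx "$1" 2>/dev/null | cut -f1
}

# Print both numbers and the gap between them for one path.
compare_usage() {
    local path=$1 df_kb du_kb
    df_kb=$(df_used_kb "$path")
    du_kb=$(du_used_kb "$path")
    echo "df used: ${df_kb} KB"
    echo "du used: ${du_kb} KB"
    echo "gap:     $((df_kb - du_kb)) KB"
}
```

Call it as compare_usage /mnt/media. Point it at a mount point rather than a subfolder, because df always reports on the whole filesystem while du only counts the tree you give it.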

Summary

Unless you’re doing something like writing a script that interacts with NFS directly, or something similar, I wouldn’t really trust the output of df. I think of it as somewhat of a guesstimate, a ballpark figure of the partition sizes. The only real upside is that it executes nearly immediately. The du command is much more reliable and accurate, so if you can spare the time it takes to run, I would suggest using du any day.

Why is this information useful? Sometimes (many times, in my personal experience), you will run something like an install script that uses one or the other to check for free disk space before it continues. I can remember more than a few times I had to use lsof to check for open file descriptors because a script was erroring out. Or if you use Nagios to monitor your Linux systems, and you get

About J

Welcome to my little corner of the InterWebs! Most of what I post on LinuxDigest is about either automation, something I find interesting, or something I just learned myself. If you want me to post an article about something, just let me know! I’m more than happy to help and teach others Linux.
