Linux – disk usage (du) human readable AND sorted by size

This is a quick tip to fix a problem that has always bugged me: when showing disk usage in a human readable form (KB, MB, GB) for each subdirectory using “du -sh *”, how can you properly sort it into size order?

If you just want the solution, here it is…

alias duf='du -sk * | sort -n | perl -ne '\''($s,$f)=split(m{\t});for (qw(K M G)) {if($s<1024) {printf("%.1f",$s);print "$_\t$f"; last};$s=$s/1024}'\'

Put it into ~/.bashrc to make it permanent.

But if you can spare a minute or two you might get some ideas about how to write those programmatic aliases, in this case using perl.

When using the Linux “du” command I like to make the file size human readable, so 8709100 becomes 8.4G. This is achieved by doing this:

du -sh *

Now, the main problem with the K, M and G filesize suffixes is that you can’t sort them.

If you try to pipe that through sort by using

du -sh * | sort

you’ll get something like this

12K keys
12M Pictures
2.6G Documents
536K scripts
8.4G Desktop

or if we sort numerically

du -sh * | sort -n

you’ll get something like this

2.6G Documents
8.4G Desktop
12K keys
12M Pictures
536K scripts
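
To see how sort treats these strings without scanning a disk, here is a small sketch that feeds the same sample sizes through both commands. The sample_du helper is hypothetical and only stands in for real du -sh output; LC_ALL=C is an assumption added to make the ordering independent of the local locale.

```shell
# Hypothetical helper: prints "size<TAB>name" lines in the shape du -sh produces.
sample_du() {
    printf '8.4G\tDesktop\n2.6G\tDocuments\n12K\tkeys\n12M\tPictures\n536K\tscripts\n'
}

# Plain sort compares the lines as text, character by character.
sample_du | LC_ALL=C sort

# sort -n compares only the leading number and ignores the K/M/G suffix,
# so 2.6G sorts before 12K and 536K ends up last.
sample_du | LC_ALL=C sort -n
```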

Obviously neither command works as intended, because the K(ilo), M(ega) and G(iga) suffixes mess up “sort”. The solution is a one-liner wrapped up in an alias, ‘duf’, for ‘disk usage formatted’:

alias duf='du -sk * | sort -n | perl -ne '\''($s,$f)=split(m{\t});for (qw(K M G)) {if($s<1024) {printf("%.1f",$s);print "$_\t$f"; last};$s=$s/1024}'\'

When expanded out, formatted and commented the code looks like this

du -sk * | sort -n |       # get usage in KBytes and sort numerically
perl -ne '                 # use perl to reformat the size as K, M or G
  ($s,$f)=split(m{\t});    # split the size/filename pair
  for (qw(K M G)) {        # loop over each suffix
    if($s<1024) {          # if s<1024 we have found the correct suffix
      printf("%.1f",$s);   # display the size
      print "$_\t$f";      # display the suffix and the filename
      last                 # line completed
    };
    $s=$s/1024             # otherwise divide by 1024 and try the next suffix
  }'

This produces the output we intended:

12.0K	keys
536.0K	scripts
11.7M	Pictures
2.5G	Documents
8.3G	Desktop

Here are some useful additions that are worth adding as an edit to my original post:

1) Purely as a shell script, without the perl overhead - source 'inataysia' reddit
du -sk * | sort -n | while read size fname; do for unit in k M G T P E Z Y; do if [ $size -lt 1024 ]; then echo -e "${size}${unit}\t${fname}"; break; fi; size=$((size/1024)); done; done

2) As a function, instead of an alias - which allows you to pass parameters to du - source 'fire'
function duf {
du -sk "$@" | sort -n | perl -ne '($s,$f)=split(/\t/,$_,2);for(qw(K M G T)){if($s<1024){$x=($s<10?"%.1f":"%3d");printf("$x$_\t%s",$s,$f);last};$s/=1024}'
}

Combining the two would probably make the best solution so far:
function duf {
du -sk "$@" | sort -n | while read size fname; do for unit in k M G T P E Z Y; do if [ $size -lt 1024 ]; then echo -e "${size}${unit}\t${fname}"; break; fi; size=$((size/1024)); done; done
}
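
The shell loop splits each line on any whitespace and does not use read -r, so names containing backslashes or odd spacing can come out mangled. Below is a hedged sketch of a more defensive variant; humanize_kb and duf_safe are hypothetical names, and the approach (splitting on the tab only, quoting every expansion, using printf instead of echo -e) is my assumption about what "safer" should mean here. Like the loop it is based on, it uses integer division, so 8709100 KB is shown as 8G rather than 8.3G.

```shell
# Hypothetical helper: convert "size-in-KB<TAB>name" lines to K/M/G/... units.
humanize_kb() {
    tab=$(printf '\t')
    # Split on the tab only, so spaces inside names are kept; -r keeps backslashes literal.
    while IFS="$tab" read -r size fname; do
        for unit in K M G T P E; do
            if [ "$size" -lt 1024 ]; then
                printf '%s%s\t%s\n' "$size" "$unit" "$fname"
                break
            fi
            size=$((size / 1024))   # integer division, the fraction is dropped
        done
    done
}

# Hypothetical wrapper: "--" stops du from treating names that start with "-" as options.
duf_safe() {
    du -sk -- "$@" | sort -n | humanize_kb
}
```

Usage would be the same as the function above, for example duf_safe * or duf_safe /var/log/*.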

41 comments to Linux – disk usage (du) human readable AND sorted by size

  • Casper

    Thanks for this cool script. One caveat: be careful about running this in a top-level folder, as it can take a very long time to finish. (Wish we had file systems that maintained directory size somehow.)

  • du -s * | sort -n | sed -Ee 's/^[0-9]+./"/' -e 's/$/"/' | xargs du -sh

    Perl-less implementation; a little extra effort for filenames with spaces. (Yours doesn’t have to worry about that, obviously.)

  • Michael Speer

    http://www.nabble.com/Human-readable-sort-td23223205.html

    Never discount simply fixing the underlying problem.

  • That’s always bugged me as well! I’ve made a few changes,
    though, so that it produces the same formatted output as
    du -sh. Also, as a function it can take arguments:


    function duf {
    du -sk "$@" | sort -n | perl -ne '($s,$f)=split(/\t/,$_,2);for(qw(K M G T)){if($s<1024){$x=($s<10?"%.1f":"%3d");printf("$x$_\t%s",$s,$f);last};$s/=1024}'
    }

  • @casper

    I don’t.

    I prefer not to pay an additional cost on every write, to speed up this far-less-frequent case.

  • Latest version of sort (part of coreutils) supports -h (correct sorting of M,k,G suffixes).

  • I use the following… you see it use ‘du’ two times, but this is not really slower, ’cause the operating system caches.

    # sorted du -hsc
    function duhs() {
    du -s "$@" | sort -n | cut -f 2- | while IFS= read -r a; do du -sh "$a"; done
    }

  • DVoita

    If you modify du -sk * to du -sk * .??* you can see hidden dot files as well.
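
A possible alternative to extending the glob, sketched here with find: the .??* pattern skips two-character names such as .a, whereas find lists every entry in the directory. The du_all name is hypothetical, and -mindepth/-maxdepth are GNU and BSD find extensions rather than POSIX, so treat this as an assumption about the local tools.

```shell
# Hypothetical helper: size in KB of every entry directly inside a directory, dotfiles included.
du_all() {
    # -mindepth 1 skips the directory itself; -maxdepth 1 stops find from descending,
    # so du -s is run once per top-level entry.
    find "${1:-.}" -mindepth 1 -maxdepth 1 -exec du -sk -- {} + | sort -n
}
```

Called without an argument it works on the current directory; the output can be piped through any of the formatting filters above.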

  • chris

    Thanks to inataysia on reddit for a bash only version

    du -sk * | sort -n | while read size fname; do for unit in k M G T P E Z Y; do if [ $size -lt 1024 ]; then echo -e "${size}${unit}\t${fname}"; break; fi; size=$((size/1024)); done; done

  • Why not promote the ‘human-readability’ step to a standalone utility?

    Let’s call it ‘hu’ for ‘human units’. Hypothetically, it would convert any whitespace-delimited numbers found on stdin to human-readable units when echoing to stdout. (Optional arguments could limit this conversion to just certain fields or to alternate unit systems.) Then the solution would be:

    du -sb * | sort -n | hu

  • Michael Speer

    Sat Jan 20 06:00:09 1996 Jim Meyering (——@na-net.ornl.gov)

    —snip—

    * du.c (main): New options --human-readable (-h) and --megabytes (-m).
    (human_readable): New function.
    From Larry McVoy (——@sgi.com).

    Ever since this patch was included in fileutils, system administrators have been frustrated by finding that while they could `du -h` they could not then `sort -h` the output. -h is not POSIX but is now solidly a part of the GNU coreutils du and ls commands. Including a switch for sort that respects the switch for du was not my invention. It has been argued a number of times on the developers mailing list. Mine was simply the straw which broke the camel’s back. The additional switch is consistent with the other tools, and merely augments the purpose of sort without creating a differing utility to it.

    Something of the functionality of `hu` may have been the appropriate fix in ’96, but since the ’96 -h switch is long set, adding a corresponding switch to sort seems only too appropriate. To ‘promote’ -h out of du, df and ls into a separate utility would break scripts of users that depend on it.

  • Josh

    You can also set the BLOCK_SIZE environment variable to the value human-readable and all the GNU coreutils that report sizes will respect it.

  • my solution

    du -s * 2>/dev/null | sort -n | cut -f2 | xargs du -sh 2>/dev/null

  • I like ‘-h’ too; it doesn’t have to go away for ‘hu’ to also exist and be useful in other contexts, or when people need a sort to precision hidden by ‘-h’ rounding.

  • @Jason Sares: your xargs line chokes on spaces etc.
    Using while/do fixes that. Using newline as IFS fixes bug with trailing space on files/dirs. Combining:

    du -s * 2>/dev/null | sort -n | cut -f2 | while IFS=$'\n' read F; do du "$F" -sh; done

  • Bruno

    i guess this is easier:

    du -s * | sort -nr | cut -f2 | xargs du -sh ( you can add a “| less” ) for long lists…

  • Unfortunately
    du -s * | sort -nr | cut -f2 | xargs du -sh
    doesn’t work, illustrated by this example

    2K
    2G
    3M

  • Dwayne Cole

    du -sk * | sort -nr | cut -f2 | xargs du -sh works and captures the ‘K’, ‘G’, ‘M’ cases. (still chokes on spaces in directory names though)

  • Dan Schaefer

    I modified Dwayne’s code to be space-friendly. Let me know if something doesn’t work.
    du -sk * | sort -nr | cut -f2 | xargs -d "\n" du -sh $1

  • Some distributions set up cron jobs that warn you when disk usage exceeds a certain percentage. But when your usage gets flagged as high, it’s deciding what goes and what stays that takes time. Finding the right file or folder to get rid of can be a chore if you have a huge disk. But don’t panic. Among all that clutter, you’ve got some simple tools to bring order to chaos.

    The CLI way

    The df utility displays the disk space usage on all mounted filesystems. The -T option prints the filesystem type as well. By default, df measures the size in 1K blocks, which could be a little difficult for a desktop user to decipher. Use the -h option to get more understandable output:

    Memory

    cat /proc/meminfo = memory usage information
    free = how much memory is currently unused

    Disk space

    df = disk usage for all partitions
    du -h = disk usage for the current directory and all sub-directories

  • Andres Van Treek

    @Dan Schaefer

    On some systems, we have to replace the double quotes with single quotes,
    like this

    du -sk * | sort -nr | cut -f2 | xargs -d '\n' du -sh $1

  • If the quotes around the \n turn into curly “fancy quotes” when you copy and paste this command into a terminal, replace them with plain single quotes or it won’t execute properly. Also, the command lists the results in descending order using the -r flag on sort, a plus in my opinion.

    du -sk * | sort -nr | cut -f2 | xargs -d '\n' du -sh $1

  • Bob/Paul

    du -h * | sort -h

    from man sort:
    -h, --human-numeric-sort
    compare human readable numbers (e.g., 2K 1G)

    $ sort --version
    sort (GNU coreutils) 8.5
    Copyright (C) 2010 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later .
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    Upgrade your sort utility to one that supports the --human-numeric-sort option
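
For machines where it is not certain that sort understands -h, one option is to probe for it and fall back to a plain kilobyte listing. This is a minimal sketch: sort_has_h and duh are hypothetical names, and the probe simply assumes that a sort without -h support exits with an error when given the flag.

```shell
# Hypothetical probe: succeeds if the local sort accepts -h.
sort_has_h() {
    printf '2K\n1M\n' | sort -h >/dev/null 2>&1
}

# Hypothetical wrapper: human-readable and sorted where possible, KB and sorted otherwise.
duh() {
    if sort_has_h; then
        du -sh -- "$@" | sort -h
    else
        du -sk -- "$@" | sort -n
    fi
}
```

Usage would be duh * or duh /var/log/*, in the same way as the duf function in the post.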

  • Astral

    Guys c’mon, no way need to go this complex. Simplicity is key.

    du -k /home | sort -n

    Sorts directories in numerical order, using KB (-k). Want the order in MB? Use -m instead.

    Now isn’t that much easier?

  • Very easy – but not what we want.
    Since you did not read the title of this page (located at the top, as usual), I will repeat it here at the bottom:

    “Linux – disk usage (du) human readable AND sorted by size”

  • Ashish Jaiswal

    I have this sort of issue at my office.

    So I normally go for this command. The usual scenario is that var is the one that is getting filled all the time.

    cd /var/log/
    # du -csh * | grep "M" | sort -rn

    This will list all the directories that take up the most space. If you are dealing with very large amounts of data, you can grep for "G" as well.

    If you want to know which specific file it is, you can go for this command:

    # du -csh */* | grep "M" | sort -rn
    Cheers

  • Bob/Paul

    No. Sometimes -h is better than a static unit for everything. I’d probably use your alternative on Solaris or FreeBSD or somewhere else without a recent GNU sort.

  • indeed the best solution is

    $ du -hsc * | sort -h

  • Manuel Parra

    I like

    ls -lSh

    And if you don’t like the extra info use awk to clean it up.

    When I feel fancy I’ll just use a for loop with ls and awk.

    ls -lhS | awk '{for(i=5; i<=NF; i++) printf("%s ",$i); print ""}'

  • du -hsc * | sort -n

  • Chris Needham

    … do it this way… much easier to read

    #!/bin/bash
    du -sh * | sort -n | grep K
    du -sh * | sort -n | grep M
    du -sh * | sort -n | grep G

  • Chris Needham

    There’s a bug in mine… if you do mkdir M and then run my code it’ll mess up, because the M dir will be in kilobytes and when it greps for M it will grab it… anyway… it works 95% of the time and it’s easier to read and remember.

  • Chris Needham

    yeah.. just do this

    du -sh * | sort -h

    as said before .. the du -shc * | sort -h will also throw the total at the bottom

  • Is there a way to perhaps sort the results based on whether or not large files exist in the specific directory for example??

    My issue is. . .my /usr/ partition is full, and it’s showing me all the directories, but none of them are large enough to suggest filling up the partition to 39GB in size.

    I also tried using a find/sort combination I found elsewhere, but that didn’t pinpoint any specific files either.

    I mean there are files and directories that are 80M here, 100M there, but nothing to suggest a single file or series of files that could fill up the partition as quickly as I delete other files.

    I deleted a 39M file in one directory, and within seconds, the space was full. Any ideas?

  • dummyano

    What’s taking up all that space in my home?

    du -h --max-depth=1 . | sort -h -r

  • Choperro

    du -s * .[^.]* ..[^.]* ...*

    is the list of files and dirs excluding “.” and “..”

    BUT there will be a stderr message

    ---- using b=bytes and apparent size ----
    function duf2 { du -sb "$@" | sort -nr | while read size fname; do for unit in b k M G T P E Z Y; do if [ $size -lt 1024 ]; then echo -e "${size}${unit}\t${fname}"; break; fi; size=$((size/1024)); done; done; }

    For me, human readable should be a unit:
    B,K,M,G,.. in capital letters and with a pair of decimals. But, of course, labyrinthine linux uses upper and lower case letter, and no letter for Bytes, etc.

  • Choperro

    I’m sure this can be improved to be more efficient, but here it is with decimals:

    function duh3 {
      du -sb "$@" | sort -nr |
      while read intsize fname; do
        floatsize=$intsize;
        for unit in B K M G T P E Z Y; do
          if [ $intsize -lt 1024 ]; then
            echo -e "${floatsize}${unit}\t${fname}";
            break;
          fi;
          floatsize=$(echo "scale=2;$floatsize/1024" | bc -l);
          intsize=$((intsize/1024));
        done;
      done
    }
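
On the efficiency point, one way to avoid starting bc once per division is to let a single awk process do the arithmetic for every line. This is only a sketch under stated assumptions: humanize_bytes and duh4 are hypothetical names, du -sb is the GNU option used above, and the result is rounded to two decimals once at the end instead of being truncated at each step as with scale=2 in bc, so the last digit can differ.

```shell
# Hypothetical helper: convert "bytes<TAB>name" lines to B/K/M/... with two decimals.
humanize_bytes() {
    awk -F '\t' '{
        size = $1
        name = substr($0, length($1) + 2)     # everything after the first tab
        n = split("B K M G T P E Z Y", units, " ")
        i = 1
        while (size >= 1024 && i < n) {       # step up one unit per division
            size /= 1024
            i++
        }
        printf "%.2f%s\t%s\n", size, units[i], name
    }'
}

# Hypothetical wrapper in the style of duh3: largest entries first.
duh4() {
    du -sb -- "$@" | sort -nr | humanize_bytes
}
```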
