Determining Disk Space Usage on Linux/Unix

Sometimes your servers fill up or your local machine does. Let’s clean that up. Here are a few commands and methods to help you determine what’s taking up the most space on your system.

Finding Files

sudo du -h . | sort -rh | head -5

This command lists the five largest directories under the current directory (.), sorted largest to smallest. Note that du reports directories rather than individual files; -h prints human-readable sizes, and sort -rh sorts those sizes in reverse order. It's a quick way to see where most of your disk space is going.
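If you want to see what that pipeline does before pointing it at a real server, here's a throwaway sandbox you can run without sudo (all paths and file names below are made up for the demo):

```shell
# Build a scratch directory with one big entry and one small one.
demo=$(mktemp -d)
mkdir -p "$demo/big" "$demo/small"
dd if=/dev/zero of="$demo/big/blob.bin" bs=1M count=5 2>/dev/null
dd if=/dev/zero of="$demo/small/note.bin" bs=1K count=1 2>/dev/null
# Largest entries first; no sudo needed since we own everything here.
du -h "$demo" | sort -rh | head -5
rm -rf "$demo"
```

The big directory floats to the top, which is exactly the behavior you want when hunting for space hogs.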

If you’re still having trouble, you can use the find command to search for large files:

sudo find / -type f -size +100M -exec ls -lh {} \;

This command searches the entire system for files larger than 100MB and prints their details. Adjust the size to your needs; you can also append 2>/dev/null to hide noise from virtual filesystems like /proc.
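Here's a harmless way to watch -size do its thing, using sparse files so the demo doesn't actually eat your disk (the file names are made up, and truncate assumes GNU coreutils):

```shell
# Sparse files report a large size without using real disk blocks.
demo=$(mktemp -d)
truncate -s 150M "$demo/huge.img"
truncate -s 10M "$demo/small.img"
# Only huge.img is over the 100MB threshold.
find "$demo" -type f -size +100M -exec ls -lh {} \;
rm -rf "$demo"
```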

Then just delete the files you don’t need. I’ll leave how up to you.

There’s also ncdu (NCurses Disk Usage). Solid tool for finding out what’s taking up space on your system. It’s a bit more user-friendly than the du command and has a nice interface.

sudo ncdu /

Run this to scan the entire system; ncdu will present an interactive interface where you can navigate directories and drill down into whatever is using the space.

Auto Remove

Next I usually clean cache from package managers, logs, and old kernels. Here are a few commands to help with that:

For Debian/Ubuntu:

sudo apt-get clean
sudo apt-get autoremove

For Red Hat/CentOS:

sudo yum clean all
sudo package-cleanup --oldkernels --count=1

On Debian/Ubuntu, apt-get clean empties the downloaded package cache, and apt-get autoremove removes packages that were pulled in as dependencies and are no longer needed. On Red Hat/CentOS, yum clean all does the cache cleanup, and package-cleanup (from yum-utils) removes old kernels, keeping only the most recent one with --count=1.

sudo journalctl --vacuum-time=7d
sudo journalctl --vacuum-size=100M
sudo journalctl --vacuum-size=1G

Each of these is an alternative way to vacuum the systemd journal: keep only the last 7 days of logs, or cap the journal at the most recent 100MB or 1GB.
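Before vacuuming, it's worth checking how much space the journal is actually using (assuming your system runs systemd):

```shell
# Read-only: reports the total disk space used by archived and active journals.
journalctl --disk-usage
```

If the number is small, vacuuming won't buy you much and you can move on.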

Kernels

Clear up old kernels:

For Debian/Ubuntu:

sudo apt autoremove --purge

For Red Hat/CentOS:

sudo package-cleanup --oldkernels --count=1

Using autoremove --purge on Debian/Ubuntu removes old kernels along with their configuration files, keeping the kernel you're running and the newest installed one. On Red Hat/CentOS, the package-cleanup command will do the same.
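Before purging anything, it doesn't hurt to list what's installed first; run whichever line matches your distro, and whatever you do, never remove the kernel you're currently running:

```shell
# Debian/Ubuntu: list installed kernel packages (read-only, safe to run).
dpkg --list 'linux-image-*'
# Red Hat/CentOS: same idea.
rpm -q kernel
# The kernel you are booted into right now -- leave this one alone.
uname -r
```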

Logs

Logs typically live in /var/log. You can use the du command from earlier to find the largest log files:

sudo du -h /var/log | sort -rh | head -5

If you need something more time-based, this finds all log files under /var/log older than 30 days. You can adjust the path and age to your needs:

sudo find /var/log -name "*.log" -type f -mtime +30

sudo find /var/log -name "*.log" -type f -mtime +30 -delete

If you're sure you want to delete them, add the -delete flag to remove them.
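If you want to convince yourself how -mtime behaves before running it against real logs, here's a scratch demo (touch -d assumes GNU touch; the file names are made up):

```shell
# Create one backdated log and one fresh log in a throwaway directory.
demo=$(mktemp -d)
touch -d "40 days ago" "$demo/old.log"
touch "$demo/fresh.log"
# -mtime +30 matches files modified more than 30 days ago: only old.log.
find "$demo" -name "*.log" -type f -mtime +30
# Now actually delete the matches; fresh.log survives.
find "$demo" -name "*.log" -type f -mtime +30 -delete
ls "$demo"
rm -rf "$demo"
```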

Logrotate

You can also use logrotate, a system utility that automates the rotation and compression of log files. It's a bit more advanced, but it's a good tool to know about. I'll show you the basics:

Configuration Files: logrotate configurations are typically located in /etc/logrotate.conf for global settings, and /etc/logrotate.d/ for application-specific settings.

Creating a Custom Logrotate Configuration:

Let’s say you want to rotate the logs for a custom application located at /var/log/myapp/. You would create a configuration file under /etc/logrotate.d/ named myapp with the following content:

/var/log/myapp/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 640 root adm
}
  • daily: Rotate logs daily.
  • rotate 7: Keep 7 rotated logs before the oldest is deleted.
  • compress: Compress the rotated files.
  • delaycompress: Compress the log file the next time the log is rotated, not immediately.
  • missingok: If the log file is missing, go on to the next one without issuing an error message.
  • notifempty: Do not rotate the log if it is empty.
  • create: Create new log files with set permissions, owner, and group.

Testing logrotate Configuration: You can test your logrotate configuration to ensure it works as expected without actually rotating the logs:

logrotate --debug /etc/logrotate.d/myapp

This will output the logrotate actions that would be taken without actually performing them.
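You can also exercise a config end to end without touching /etc at all, by pointing logrotate at a throwaway config and a private state file (every path below is scratch, just for illustration):

```shell
# Set up a fake app log and a minimal config for it.
work=$(mktemp -d)
mkdir -p "$work/logs"
echo "hello" > "$work/logs/myapp.log"
cat > "$work/myapp.conf" <<EOF
$work/logs/*.log {
    rotate 3
    missingok
}
EOF
# -s points at a private state file; --force rotates even if not yet due.
logrotate --force -s "$work/state" "$work/myapp.conf"
ls "$work/logs"    # myapp.log.1 should now exist
rm -rf "$work"
```

This is a handy pattern for verifying a new config rotates the way you expect before you drop it into /etc/logrotate.d/.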

Compression

This is a bit more advanced and kinda last resort, but you can compress files to save space. The gzip command is a good way to do this:

gzip file

Then either offload that data to another system or delete the original file.
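If you're curious how much gzip actually buys you, here's a quick scratch-file demo (the paths and file names are made up):

```shell
# Highly repetitive data compresses extremely well.
demo=$(mktemp -d)
yes "the same line over and over" | head -n 100000 > "$demo/data.txt"
before=$(wc -c < "$demo/data.txt")
gzip "$demo/data.txt"            # replaces data.txt with data.txt.gz
after=$(wc -c < "$demo/data.txt.gz")
echo "before: $before bytes, after: $after bytes"
rm -rf "$demo"
```

Real-world logs won't shrink quite that dramatically, but text generally compresses very well.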

Optimization & Conclusion

There are a few other things you can do: optimize your databases, clear out unused old directories, stale data and files, application caches, and more. But this is a good start. It's usually the logs, old kernels, and package manager cache that take up the most space and are the problem children.

If all goes to plan, some of this stuff will help you clean up your system and free up some disk space. Sidebar: macOS has Homebrew and brew cleanup to help with this as well, but that's another post for another day.

ender