
unpotato webwords: cleaning up btrfs docker volumes after build-all

So you ran make build-all on webwords and now your root filesystem is 100% full. BTRFS doesn't clean up Docker volumes the way you'd expect. Here's how to unpotato your system.

The companion webwords git repo lives here.

the problem

Docker's system prune doesn't fully clean up BTRFS subvolumes. Even after removing containers and images, the filesystem usage stays high because the underlying BTRFS subvolumes persist.

identify docker volumes

List all Docker volumes:

docker volume ls

Check which containers might be using volumes:

docker ps -a --filter volume=volume_name
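
If you have more than a handful of volumes, the two commands above can be combined into one pass. This is a minimal sketch: the function name volume_users is made up for illustration, and it assumes volume names contain no whitespace.

```shell
# volume_users: print each volume followed by the containers that reference it.
# Hypothetical helper built only on `docker volume ls -q` and
# `docker ps -a --filter volume=NAME`.
volume_users() {
    docker volume ls -q | while read -r vol; do
        # xargs with no command joins the container names onto one line
        users=$(docker ps -a --filter "volume=$vol" --format '{{.Names}}' | xargs)
        [ -n "$users" ] || users="(unused)"
        echo "$vol: $users"
    done
}
```

Call volume_users after defining it; anything marked (unused) is a candidate for removal.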

delete docker volumes

Stop and remove any containers using the volumes:

docker stop $(docker ps -aq)
docker rm $(docker ps -aq)

Remove all unused volumes:

docker volume prune -f
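
Two rough edges in the commands above: docker stop complains when there are no containers to stop, and the prune gives you no preview. Here is a guarded sketch; both function names are hypothetical.

```shell
# stop_all_containers: stop running containers, doing nothing when none exist.
stop_all_containers() {
    ids=$(docker ps -q)
    if [ -z "$ids" ]; then
        echo "no running containers"
        return 0
    fi
    # Word splitting on $ids is intentional: one argument per container ID.
    docker stop $ids
}

# preview_volume_prune: list volumes that no container references,
# which are the candidates for `docker volume prune`.
preview_volume_prune() {
    docker volume ls -q --filter dangling=true
}
```

Depending on your Docker version, docker volume prune may skip named volumes by default, so treat the preview as an upper bound on what gets removed.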

manual btrfs cleanup

If filesystem usage is still high, manually clean up BTRFS subvolumes.

Check Docker's storage driver:

docker info --format '{{.Driver}}'

If it shows btrfs, locate the subvolumes:

sudo ls /var/lib/docker/btrfs/subvolumes/

List all BTRFS subvolumes:

sudo btrfs subvolume list /var/lib/docker

Delete persistent subvolumes:

sudo btrfs subvolume delete /var/lib/docker/btrfs/subvolumes/HASH_HERE

Wait for async deletion to complete:

sudo btrfs subvolume sync /var/lib/docker
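
Before deleting anything, it helps to see how many subvolumes actually belong to Docker. The sketch below filters the output of btrfs subvolume list; the function name is hypothetical, and it assumes the default output layout (ID ... gen ... top level ... path ...).

```shell
# docker_subvolume_paths: read `btrfs subvolume list` output on stdin and print
# only the paths that live under Docker's btrfs/subvolumes directory.
docker_subvolume_paths() {
    sed -n 's/^ID [0-9]* gen [0-9]* top level [0-9]* path //p' |
        grep 'btrfs/subvolumes/'
}
```

Usage: sudo btrfs subvolume list /var/lib/docker | docker_subvolume_paths | wc -l. Note that the printed paths are relative to the top of the BTRFS filesystem, not absolute, so don't feed them straight into btrfs subvolume delete.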

the nuclear option (that actually worked)

After the manual unpotato steps above, Docker's metadata was left corrupted, with references to BTRFS subvolumes that no longer existed. Simple restarts and partial cleanups failed with errors like:

failed to register layer: stat /var/lib/docker/btrfs/subvolumes/53a19da3801c895b697625094ebc8f95668a212a4cbc923787ac0c86452d9f60: no such file or directory

The root cause: Docker's image metadata and containerd's content store still referenced layers that no longer existed in BTRFS.
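
To confirm you are hitting this exact failure, you can pull the path out of the error and check whether it still exists. A minimal sketch, assuming the error text has the shape shown above; both function names are made up.

```shell
# missing_layer_path: read a Docker error on stdin and print the subvolume
# path it complains about, if any.
missing_layer_path() {
    sed -n 's|.*stat \(/[^:]*/btrfs/subvolumes/[0-9a-f]*\): no such file or directory.*|\1|p'
}

# layer_status: report whether a given path exists as a directory.
layer_status() {
    if [ -d "$1" ]; then
        echo "present: $1"
    else
        echo "missing: $1"
    fi
}
```

Pipe the failing command's stderr into missing_layer_path, then pass the result to layer_status (as root, since /var/lib/docker is usually not readable otherwise).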

Partial cleanup attempts that failed:

# These didn't fix the corruption:
sudo systemctl restart docker
sudo rm -rf /var/lib/docker/image/btrfs/layerdb/*
sudo rm -rf /var/lib/containerd/io.containerd.content.v1.content/*

The only solution that worked:

Complete reset of both Docker and containerd data stores:

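sudo systemctl stop docker
sudo systemctl stop containerd
# Expand the globs in a root shell: these directories are normally not
# readable by regular users, so an unprivileged shell cannot expand the *
sudo sh -c 'rm -rf /var/lib/docker/*'
sudo sh -c 'rm -rf /var/lib/containerd/*'
sudo systemctl start containerd
sudo systemctl start docker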

This deletes all Docker data including images, containers, volumes, and networks. But it's the only way to recover from corrupted layer metadata after aggressive BTRFS cleanup.
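
Since this step is irreversible, one option is to print the plan first and only execute it once you have read it. This is a sketch, not part of the original recovery; reset_plan is a hypothetical name and the two directory arguments default to the paths used above.

```shell
# reset_plan: print the reset commands in order without running anything.
# Arguments (optional): Docker data dir, containerd data dir.
reset_plan() {
    docker_dir=${1:-/var/lib/docker}
    containerd_dir=${2:-/var/lib/containerd}
    cat <<EOF
systemctl stop docker
systemctl stop containerd
rm -rf $docker_dir/*
rm -rf $containerd_dir/*
systemctl start containerd
systemctl start docker
EOF
}
```

Run reset_plan to review the commands, then reset_plan | sudo sh when you are sure. Piping into a root shell also means the * is expanded by root.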

the unpotato script

After going through the manual unpotato process, I created a script to automate the recovery for future incidents.

what it does

The script focuses on three key operations that free space WITHOUT touching your valuable base images:

  1. Remove build cache - This is the big win, often 10-30GB of intermediate layers
  2. Remove dangling images - The <none>:<none> orphaned images from failed builds
  3. Remove stopped containers - Minimal space but good housekeeping

Critical: the script uses `docker image prune -f`, NOT `docker image prune -a -f`.

The -a flag would remove ALL unused images, including your expensive base images. Without it, only dangling images are removed; every tagged image is preserved, even if no container currently uses it.
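
If you want to see which side of that line each image falls on before pruning, here is a rough sketch. The function name classify_images is made up, and it treats only <none>:<none> entries as dangling.

```shell
# classify_images: read "REPOSITORY:TAG ID" lines on stdin and label each image
# as dangling (what `docker image prune -f` targets) or tagged (kept without -a).
classify_images() {
    while read -r ref id; do
        case "$ref" in
            "<none>:<none>") echo "dangling $id" ;;
            *)               echo "tagged $id $ref" ;;
        esac
    done
}
```

Usage: docker images --format '{{.Repository}}:{{.Tag}} {{.ID}}' | classify_images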

why it works for potatoed systems

When your system is critically full, most commands that try to write files will fail or hang. The unpotato script is designed to work in this state:

  • No file writes until the actual cleanup operations
  • Avoids df during critical steps (it times out when the system is potatoed)
  • Uses systemctl stop docker before final check to unmount overlays
  • Includes BTRFS sync to commit async deletions

#!/bin/bash
# unpotato-docker-btrfs.sh - Emergency Docker cleanup for potatoed systems
# ONLY removes: build cache, dangling images, stopped containers
# PRESERVES: All tagged images and their layers

set -e

echo "======================================"
echo "  Docker Emergency Unpotato"
echo "======================================"
echo "This script will ONLY remove:"
echo "  - Build cache (layers not in images)"
echo "  - Dangling images (<none>:<none>)"
echo "  - Stopped containers"
echo ""
echo "This script will PRESERVE:"
echo "  - All tagged images"
echo "  - Base images"
echo "  - Image layers"
echo "======================================"
echo ""

# Check if running as root
if [[ $EUID -ne 0 ]]; then
   echo "[ERROR] This script must be run as root (use sudo)"
   exit 1
fi

# Check if Docker is installed
if ! command -v docker &> /dev/null; then
    echo "[ERROR] Docker is not installed"
    exit 1
fi

# Check filesystem type
FILESYSTEM=$(stat -f / -c %T 2>/dev/null || echo "unknown")
echo "[INFO] Detected filesystem: $FILESYSTEM"
echo ""

# Step 1: Stop all running containers (no disk writes needed)
echo "[STEP 1/5] Stopping all running containers..."
RUNNING=$(docker ps -q 2>/dev/null | wc -l)
if [[ $RUNNING -gt 0 ]]; then
    docker stop $(docker ps -q) 2>/dev/null || echo "[WARN] Some containers failed to stop"
    echo "[OK] Stopped $RUNNING containers"
else
    echo "[OK] No running containers"
fi
echo ""

# Step 2: Remove stopped containers (minimal disk writes)
echo "[STEP 2/5] Removing stopped containers..."
if docker container prune -f 2>/dev/null; then
    echo "[OK] Stopped containers removed"
else
    echo "[WARN] Container prune failed"
fi
echo ""

# Step 3: Remove ONLY dangling images (preserves all tagged images)
echo "[STEP 3/5] Removing dangling images only..."
echo "[INFO] This removes <none>:<none> images ONLY"
echo "[INFO] All tagged images will be preserved"
DANGLING=$(docker images -f "dangling=true" -q 2>/dev/null | wc -l)
if docker image prune -f 2>/dev/null; then
    echo "[OK] Removed $DANGLING dangling images"
else
    echo "[WARN] Image prune failed"
fi
echo ""

# Step 4: Remove build cache (THE BIG WIN - usually 10-30GB)
echo "[STEP 4/5] Removing build cache..."
echo "[INFO] This is usually the biggest space saver"
if docker builder prune -a -f 2>/dev/null; then
    echo "[OK] Build cache removed"
else
    echo "[WARN] Builder prune failed"
fi
echo ""

# Step 5: BTRFS sync (if applicable)
if [[ "$FILESYSTEM" == "btrfs" ]]; then
    echo "[STEP 5/5] Running BTRFS sync to commit deletions..."
    if btrfs filesystem sync / 2>/dev/null; then
        echo "[OK] BTRFS sync completed"
    else
        echo "[WARN] BTRFS sync failed"
    fi
else
    echo "[STEP 5/5] Skipping BTRFS sync (not BTRFS filesystem)"
fi
echo ""

# Restart Docker to clear overlay mounts (helps df work)
echo "[FINAL] Restarting Docker daemon..."
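# The || fallbacks keep set -e from aborting the script if systemctl fails
systemctl stop docker 2>/dev/null || echo "[WARN] Failed to stop Docker"
sleep 2
systemctl start docker 2>/dev/null || echo "[WARN] Failed to start Docker"
sleep 3
echo "[OK] Docker restart attempted"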
echo ""

# Show results
echo "======================================"
echo "  Cleanup Complete!"
echo "======================================"
echo ""

# Try to show space (with timeout in case still potatoed)
if timeout 10 df -h / 2>/dev/null | grep -v Filesystem; then
    echo "[OK] Filesystem responding normally"
elif command -v btrfs &> /dev/null; then
    echo "[INFO] Using BTRFS method (df timed out):"
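    # Match the "Used:" and "Free (estimated):" summary lines; || true keeps set -e happy
    btrfs filesystem usage / 2>/dev/null | grep -E "^[[:space:]]*(Used|Free)" || true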
fi
echo ""

echo "[SUCCESS] Unpotato complete!"
echo "[INFO] All your tagged images have been preserved"
echo ""
echo "To verify your images are still there:"
echo "  docker images | head -20"
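
The df-with-timeout trick at the end of the script is handy on its own. Here is a small sketch of the same idea as a reusable function; the name probe is made up, and it assumes the coreutils timeout command is available.

```shell
# probe: run a command under a time limit and report whether it finished.
# Usage: probe SECONDS COMMAND [ARGS...]
probe() {
    limit=$1
    shift
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "failed-or-timed-out"
    fi
}
```

For example, probe 10 df -h / tells you whether df is usable right now without letting it block your terminal for more than ten seconds.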

usage

# Download and run
sudo bash unpotato-docker-btrfs.sh

In my case, the script freed 23.27GB from build cache alone, bringing the system from 100% full to 79% used with 99GB free. All 90+ Docker images including expensive base images (Haskell, OCaml, R, Fortran, Julia, Scala, etc.) were preserved.
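
To get a similar number for your own machine before running the script, docker system df reports how much build cache is reclaimable. The sketch below pulls out just that figure; the function name is hypothetical, and it assumes the default table layout where the last row is "Build Cache".

```shell
# build_cache_reclaimable: read `docker system df` output on stdin and print
# the RECLAIMABLE column of the "Build Cache" row.
build_cache_reclaimable() {
    awk '$1 == "Build" && $2 == "Cache" { print $6 }'
}
```

Usage: docker system df | build_cache_reclaimable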

The best solution is to not get potatoed in the first place. Consider:

  • Running docker builder prune -f weekly as a cron job
  • Monitoring disk usage with alerts at 80% full
  • Using a separate filesystem or volume for /var/lib/docker
  • Setting up Docker with disk usage limits
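
The first two suggestions can be sketched in a few lines of shell. Everything named here is an assumption for illustration: the function names, the 80 threshold taken from the list above, and the cron schedule in the comment.

```shell
# root_usage_pct: read `df -P /` output on stdin and print the use percentage
# as a bare integer.
root_usage_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold: succeed when usage (arg 1) is at or above the limit (arg 2).
over_threshold() {
    [ "$1" -ge "$2" ]
}

# Example wiring (not executed here):
#   pct=$(df -P / | root_usage_pct)
#   over_threshold "$pct" 80 && logger "root filesystem at ${pct}%"
#
# Example weekly crontab entry (Sunday 03:00, an arbitrary choice):
#   0 3 * * 0 docker builder prune -f
```

The alert uses logger so it lands in the system log; swap in whatever notification you already have.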

But when you do get potatoed, this script will safely get you back to operational status without losing your work.





© Russell Ballestrini.