I wanted to check the health of my external drive, but I killed Docker
Since I am using a very old 1TB HDD (15 years old?), I am sometimes afraid it could fail any day. Today I wanted to see whether I could spot any symptoms with a software-based tool. After a bit of DuckDuckGo searching, I found this article (archived), which talks about smartctl.
I gave it a try: after installing it with apt, I ran sudo smartctl -i /dev/sda, which returned:
smartctl 7.2 2020-12-30 r5155 [aarch64-linux-5.15.84-v8+] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SAMSUNG
Product: HD103UJ
User Capacity: 1.000.204.886.016 bytes [1,00 TB]
Logical block size: 512 bytes
Serial number: 152D20329000
Device type: disk
Local Time is: Sun Nov 5 14:36:25 2023 CET
SMART support is: Unavailable - device lacks SMART capability.
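To check that last line from a script rather than by eye, here is a minimal sketch. It only looks for the "SMART support is:" lines that smartctl -i prints; the function name smart_support and the shortened sample are my own illustration, not part of smartmontools.

```python
def smart_support(info_text):
    """Return the text after each 'SMART support is:' line of `smartctl -i` output."""
    prefix = "SMART support is:"
    return [
        line[len(prefix):].strip()
        for line in info_text.splitlines()
        if line.startswith(prefix)
    ]


# Two lines taken from the output above (shortened for the example).
SAMPLE = """\
Device type: disk
SMART support is: Unavailable - device lacks SMART capability.
"""

statuses = smart_support(SAMPLE)
has_smart = any(not s.startswith("Unavailable") for s in statuses)
print(statuses)
print("SMART usable:", has_smart)
```

On a real system the text could come from subprocess.run(["sudo", "smartctl", "-i", "/dev/sda"], capture_output=True, text=True).stdout, assuming smartctl is installed.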
Sadly, my old HDD does not support SMART :(. I don't know what actually happened, but after this command the HDD became read-only and all my Docker containers started to fail (my Docker setup keeps its data on the HDD, not on the SD card). I could still use the services, but any action that required a write failed (in one of the container logs I could read something like read-only file system).
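A quick way to confirm that the kernel really flipped the drive to read-only is to look at the mount options in /proc/mounts. The sketch below parses text in that format. The mount point /mnt/ext-drive is an assumption taken from the Docker paths in my logs, and the sample lines are made up for illustration.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # a later entry wins over an earlier one
    return found


def is_read_only(mounts_text, mount_point):
    """True if mount_point is listed and its options include 'ro'."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


# Hypothetical sample in /proc/mounts format (not copied from my Pi).
SAMPLE = (
    "/dev/mmcblk0p2 / ext4 rw,noatime 0 0\n"
    "/dev/sda1 /mnt/ext-drive ext4 ro,relatime 0 0\n"
)

print(is_read_only(SAMPLE, "/mnt/ext-drive"))  # True
print(is_read_only(SAMPLE, "/"))               # False
```

To run it against the live system, pass open("/proc/mounts").read() instead of SAMPLE.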
I first tried the old way: restarting the containers. But docker-compose down did not succeed, and the logs contained tons of lines like these:
ERROR: for self-hosted_mongodb_1 container da0f4865943a7c338660006e857dfada61b1fff08d1fb02585e10737f13a5270: driver "overlay2" failed to remove root filesystem: unlinkat /mnt/ext-drive/docker-data/overlay2/d893045b5f5181a43d51abe2989fbc627c89e8dedb7026606902086f6f399ff3: read-only file system
Removing network self-hosted_default
ERROR: error while removing network: network self-hosted_default id 114fecd466ee36774d0bef247ba8127dbab79c9ce25f091bae3a8d3480337546 has active endpoints
At this point, I was a little scared. I did not want to restart the Pi because I thought Docker might not be able to start again, which could have blocked the Pi's startup. I searched online, and one of the Stack Overflow answers scared me even more. My dmesg output was not great (no colors here, but some of these lines were red):
[23542.142493] EXT4-fs warning (device sda1): ext4_end_bio:348: I/O error 10 writing to inode 16129270 starting block 7139300)
[23542.142566] Buffer I/O error on device sda1, logical block 7139042
[23542.142588] Buffer I/O error on device sda1, logical block 7139043
[23544.614305] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s
[23544.614326] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x2a 2a 00 20 9d 4a b8 00 00 08 00
[23544.614332] blk_update_request: I/O error, dev sda, sector 547179192 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[23544.614343] EXT4-fs warning (device sda1): ext4_end_bio:348: I/O error 10 writing to inode 16129026 starting block 68397400)
[23544.614355] Buffer I/O error on device sda1, logical block 68397143
[23545.128173] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s
[23545.128194] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x2a 2a 00 3a 07 23 10 00 00 88 00
[23545.128200] blk_update_request: I/O error, dev sda, sector 973546256 op 0x1:(WRITE) flags 0x800 phys_seg 17 prio class 0
[23545.128263] Aborting journal on device sda1-8.
[23545.206305] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s
[23545.206331] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x2a 2a 00 3a 04 08 00 00 00 08 00
[23545.206339] blk_update_request: I/O error, dev sda, sector 973342720 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[23545.206351] Buffer I/O error on dev sda1, logical block 121667584, lost sync page write
[23545.206386] JBD2: Error -5 detected when updating journal superblock for sda1-8.
[23547.145991] EXT4-fs error (device sda1): ext4_journal_check_start:83: comm dockerd: Detected aborted journal
[23547.230310] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s
[23547.230338] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x2a 2a 00 00 00 08 00 00 00 08 00
[23547.230346] blk_update_request: I/O error, dev sda, sector 2048 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 0
[23547.230359] Buffer I/O error on dev sda1, logical block 0, lost sync page write
[23547.230400] EXT4-fs (sda1): I/O error while writing superblock
[23547.230408] EXT4-fs (sda1): Remounting filesystem read-only
[23584.896419] br-114fecd466ee: port 3(vethe8b1298) entered disabled state
[23584.896882] veth29119cf: renamed from eth0
[23585.006119] overlayfs: upper fs is r/o, try multi-lower layers mount
[23586.118680] br-114fecd466ee: port 11(veth362085a) entered disabled state
[23586.118897] vethff16379: renamed from eth0
[23586.224521] overlayfs: upper fs is r/o, try multi-lower layers mount
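The story in that log is: a series of failed writes on sda, the ext4 journal aborting, and finally the filesystem being remounted read-only. Here is a rough sketch that pulls those two facts out of dmesg text. The regular expressions are written only against the message formats shown above (other kernel versions may word them differently), and the name summarize is my own.

```python
import re

# Patterns written against the lines in the log above; treat them as assumptions.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")
REMOUNT_RO = re.compile(r"EXT4-fs \((\w+)\): Remounting filesystem read-only")


def summarize(dmesg_text):
    """Return ({device: [failed sectors]}, [filesystems remounted read-only])."""
    failed_sectors = {}
    remounted = []
    for line in dmesg_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            failed_sectors.setdefault(match.group(1), []).append(int(match.group(2)))
        match = REMOUNT_RO.search(line)
        if match:
            remounted.append(match.group(1))
    return failed_sectors, remounted


# Three lines copied from the dmesg output above.
SAMPLE = """\
[23544.614332] blk_update_request: I/O error, dev sda, sector 547179192 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[23545.128200] blk_update_request: I/O error, dev sda, sector 973546256 op 0x1:(WRITE) flags 0x800 phys_seg 17 prio class 0
[23547.230408] EXT4-fs (sda1): Remounting filesystem read-only
"""

sectors, remounted = summarize(SAMPLE)
print(sectors)    # {'sda': [547179192, 973546256]}
print(remounted)  # ['sda1']
```

Feeding it the full output of dmesg (for example via subprocess.run(["dmesg"], capture_output=True, text=True).stdout, which may need root) would give the same kind of summary for the whole boot.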
I tried a few times to restart the containers, without luck. I also tried restarting the Docker service (sudo service docker restart), with the same result.
At this point, I decided to shut the Pi down and hope for the best. It worked :) When the Pi started up again, all the Docker containers restarted and everything was back!