DJ - Zombie ZFS

Dear Allan, Benedict, and JT (and TJ and Kris, if you are out there),
As a stalwart BSD user for roughly 16 years and avid BSD Now listener since Episode 1, I wish to thank you profusely for 5+ years of what is by far one of the most entertaining, informative, and useful multimedia productions on the Internet and beyond. Please keep up the good work!

Going forward, I hope to be better at contributing comments and questions to the show. Believe it or not, there have been many times when I wanted to write in to ask or add something but was always too busy. Also, someone else in your astute audience will often ask or comment on the very thing I would have written in about, so then I get really lazy about writing in myself. C'est la vie en BSD...

Unfortunately, the reason I am writing now is an infuriatingly slow software update in a separate embedded project, and a strange problem with ZFS on FreeBSD. No mission-critical data is at risk here, but I am curious about some unusual behavior from a zpool command or two. This was a fairly old pool, but it was working just fine until 7 weeks ago. Questions follow near the end.

Here are the commands:
> # zpool scrub zarchive
> # zpool status -v zarchive
Here is the output:
> pool: zarchive
> state: DEGRADED
> status: One or more devices has experienced an error resulting in data corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the entire pool from backup.
>   see: http://illumos.org/msg/ZFS-8000-8A
>  scan: scrub repaired 0 in 0h0m with 1 errors on Thu Aug 23 00:51:31 2018
> config:
>        NAME                     STATE     READ WRITE CKSUM
>        zarchive                 DEGRADED     2     0     0
>          mirror-0               DEGRADED     8     0     0
>            ada2.eli             ONLINE       0     0    12
>            6073285353319942453  REMOVED      0     0     0  was /dev/ada3.eli
>
> errors: Permanent errors have been detected in the following files:

And that's all from zpool status: no listing of files. The odd part is that zpool status has not stopped running: it shows up in top, using 0% CPU. Control-t on the terminal shows that the process has been waiting on zio->io_cv for 49 days, with no new activity.

> load: 0.12  cmd: zpool 2631 [zio->io_cv] 99.05r 0.00u 0.00s 0% 3580k
> load: 0.16  cmd: zpool 2631 [zio->io_cv] 4232632.02r 0.00u 0.00s 0% 3580k
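
As a sanity check on the 49-day figure, the real-time field in the second control-t line can be converted from seconds to days. This is only a sketch: siginfo_days is a made-up helper name, and the field layout (a number followed by "r" for real time) is assumed from the two lines above rather than taken from any documentation.

```shell
# Hypothetical helper: pull the real-time field (the number ending in "r")
# out of a control-t status line and print it as days, to one decimal place.
siginfo_days() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9.]+r$/) {     # assumed: "<seconds>r" is real time
                sub(/r$/, "", $i)        # drop the trailing "r"
                printf "%.1f\n", $i / 86400
                exit
            }
    }'
}

# The second status line from above, without the leading "> " quote marker.
siginfo_days "load: 0.16  cmd: zpool 2631 [zio->io_cv] 4232632.02r 0.00u 0.00s 0% 3580k"
```

4232632.02 seconds divided by 86400 seconds per day comes to just under 49 days, which matches the 7 weeks since the scrub.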

Also, nothing I can do on the system will kill the process, probably short of shutting down or pulling the plug. No control-c, no kill -9, [insert tropes here for killing undead creatures], etc. I could try other things, but I would not want to do anything too disruptive before I know what's happening.
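
One thing worth checking is whether the process is a zombie in the strict sense (state Z in ps) or is stuck in an uninterruptible kernel wait (state D); neither responds to kill -9. The sketch below only interprets the first letter of a ps state string. describe_state is a made-up name, the "D+" sample is an assumption and not something I have observed on this box, and the ps command in the comment would still need to be run on the affected machine.

```shell
# Hypothetical helper: explain the leading letter of a ps(1) state string.
describe_state() {
    case "$1" in
        D*) echo "uninterruptible wait: signals stay pending until the kernel wait ends" ;;
        Z*) echo "zombie: already exited, waiting for its parent to reap it" ;;
        T*) echo "stopped" ;;
        *)  echo "other" ;;
    esac
}

# On the affected system (PID 2631 comes from the control-t output), one
# could feed in the real state, for example:
#   describe_state "$(ps -o state= -p 2631)"
# Here an assumed sample value is used instead.
describe_state "D+"
```

If the state does turn out to be D, the wait channel (zio->io_cv here) names what the kernel is waiting on, which is probably the more useful detail to report.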

As you can tell from the status, the providers are GELI-encrypted, in a mirrored pair. These are two identical spinning-rust drives (both pass their SMART tests, albeit with some occasional errors). When I last had the pool mounted, I had written some data, then unexpectedly lost power a few hours later (and then suddenly discovered that my UPS battery was dead as a doornail!).

After that, the SATA controller started getting noisy (lots of log entries, nothing special), so I shut down and connected the spinning rust to some SATA ports on the main board, booted, and tried to run a scrub.

At the time of starting the scrub, I had just attached/decrypted ada2 and ada3, but I had not yet mounted the filesystem (this was an old-school legacy mount). My guess would be that ZFS cannot tell me which files, if any, were corrupted if I have not mounted anything yet, correct?

So, my main curiosity here is, why the zombie process? Is there a way to terminate it gracefully? Can I ignore it and mount the filesystem (even if read-only)?

*Aside: Admittedly, this is a somewhat dated setup on a somewhat dated system. If you need more info from me, please feel free to reach out, although I cannot guarantee a response before your next live show time. Please feel free to hang on to this question to air on your upcoming Halloween episode if you want something zombie-themed! Happy Halloween!

Thanks again for all your hard work in producing this wonderful show. You all have my eternal (undead?) respect and admiration!

Best,

DJ