FYI: 2022-10 thorntail failures from coreutils FICLONE

From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: FYI: 2022-10 thorntail failures from coreutils FICLONE
Date: 2023-01-07 23:29:24
Message-ID: 20230107232924.GD1826938@rfd.leadboat.com
Lists: pgsql-hackers

thorntail failed some recovery tests in 2022-10:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-11-02%2004%3A25%3A43
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-31%2013%3A32%3A42
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-29%2017%3A48%3A15
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-24%2013%3A48%3A16
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-24%2010%3A08%3A30
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-21%2000%3A58%3A14
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-16%2000%3A08%3A17
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-15%2020%3A48%3A18
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-14%2020%3A13%3A35
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-10-14%2006%3A58%3A15

thorntail has long seen fsync failures, due to a driver bug[1]. On
2022-09-28, its OS updated coreutils from 8.32-4.1 to 9.1-1. That brought in
"cp" use of the FICLONE ioctl. FICLONE internally syncs its source file,
reporting EIO if that fails. A bug[2] in "cp" allowed it to silently make a
defective copy instead of reporting that EIO. Since the recovery suite
archive_command uses "cp", these test failures emerged. The kernel may
change[3] to make such userspace bugs harder to add.

For thorntail, my workaround was to replace "cp" with a wrapper doing 'exec
/usr/bin/cp --reflink=never "$@"'. I might eventually propose the ability to
disable FICLONE calls in PostgreSQL code. So far, those calls (in pg_upgrade)
have not caused thorntail failures.

[1] https://postgr.es/m/flat/20210508001418.GA3076445@rfd.leadboat.com
[2] https://github.com/coreutils/coreutils/commit/f6c93f334ef5dbc5c68c299785565ec7b9ba5180
[3] https://lore.kernel.org/linux-xfs/20221108172436.GA3613139@rfd.leadboat.com
