Fails to work on live images due to fsync() on pg_commit_ts before doing any write there

From: Raphael Hertzog <hertzog(at)debian(dot)org>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Fails to work on live images due to fsync() on pg_commit_ts before doing any write there
Date: 2017-11-07 13:54:54
Message-ID: 20171107135454.lbelbbvfgadljmuj@home.ouaza.com
Lists: pgsql-bugs

[ 2nd try after subscription to pgsql-bugs ]

Hello,

PostgreSQL 10 no longer works on a (Kali) live system where the
root filesystem is an overlayfs with an underlying squashfs
filesystem (where PostgreSQL and its initial file structure
are present) and a writable tmpfs overlay.

When you try to create a new database you get this failure:
createdb: database creation failed: ERROR: checkpoint request failed
HINT: Consult recent messages in the server log for details.

And in the server log you have this:
ERROR: could not fsync file "pg_commit_ts": Invalid argument

When you strace the postgresql checkpointer process you see
this:
# strace -f -p 31599
select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout)
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
open("pg_xact", O_RDONLY) = 3
fsync(3) = 0
close(3) = 0
open("pg_commit_ts", O_RDONLY) = 3
fsync(3) = -1 EINVAL (Invalid argument)
close(3) = 0
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0
write(2, "2017-11-07 09:47:38.580 UTC [315"..., 98) = 98

The second fsync() fails because the pg_commit_ts directory has not
changed since it was created in the initial image. It is thus still
stored in the read-only squashfs filesystem and has not yet been copied
up to the writable tmpfs (which does support fsync). In this case,
overlayfs delegates the fsync() call to the read-only squashfs
filesystem, which returns EINVAL because it does not support that
operation.
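
The open/fsync/close sequence seen in the strace output can be reproduced outside PostgreSQL with a few lines. This is only a minimal sketch: the function name fsync_directory is hypothetical, and the script is an illustration of the system calls, not PostgreSQL code. On an ordinary writable directory it should report success; on a directory that still lives only in the squashfs lower layer it would be expected to report EINVAL, as described above.

```python
import errno
import os


def fsync_directory(path):
    """Open a directory read-only, fsync it, and return 0 or the errno.

    Hypothetical helper: it mirrors the open()/fsync()/close() calls
    visible in the strace output, nothing more.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
        return 0
    except OSError as exc:
        # On the live image described above, this is where EINVAL
        # would be expected for a directory not yet copied up.
        return exc.errno
    finally:
        os.close(fd)


def describe(rc):
    """Turn the return value of fsync_directory() into a short message."""
    if rc == 0:
        return "fsync succeeded"
    return "fsync failed: " + errno.errorcode.get(rc, str(rc))
```

For example, describe(fsync_directory("pg_commit_ts")) run from inside the data directory would show whether the directory can currently be synced.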

This has been explained by the overlayfs upstream developer
(to whom I initially reported this bug, thinking it was an
overlayfs regression):
https://marc.info/?l=linux-unionfs&m=151005246512873&w=2
https://marc.info/?l=linux-unionfs&m=151005699414227&w=2

My request is thus that PostgreSQL should fsync that directory only after
it has made changes to the directory or its content. PostgreSQL 9.6 was
working fine in the same setup and I would like PostgreSQL 10 to do the
same. :)
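
The requested behaviour can be sketched as a small "dirty directory" tracker: a directory is fsynced at checkpoint time only if something was written to it since the last checkpoint. Everything here is an assumption made for illustration (the class DirSyncTracker and its methods mark_dirty and checkpoint are hypothetical); it is not how PostgreSQL is implemented, and it is not a proposed patch.

```python
import os


class DirSyncTracker:
    """Hypothetical helper: fsync only directories that were modified."""

    def __init__(self):
        # Directories written to since the last checkpoint.
        self._dirty = set()

    def mark_dirty(self, path):
        """Record that a file in (or the contents of) `path` changed."""
        self._dirty.add(path)

    def checkpoint(self):
        """fsync every dirty directory and return the list of synced paths.

        Untouched directories are skipped entirely, so a directory that
        still lives only in a read-only lower layer is never fsynced.
        """
        synced = []
        for path in sorted(self._dirty):
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            synced.append(path)
        # Only reached if every fsync succeeded.
        self._dirty.clear()
        return synced
```

With such a scheme, a first write into pg_commit_ts would trigger the overlayfs copy-up before any fsync() is attempted on the directory, which is the situation that worked in the setup described above.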

I'm ccing Teodor Sigaev <teodor(at)sigaev(dot)ru> because I believe that
the problematic fsync() has been added by him in this commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1b02be21f271db6bd3cd43abb23fa596fcb6bac3

Cheers,
--
Raphaël Hertzog ◈ Writer/Consultant ◈ Debian Developer

Discover the Debian Administrator's Handbook:
https://debian-handbook.info/get/
