From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum" |
Date: | 2021-06-22 14:11:06 |
Message-ID: | 1715251.1624371066@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> Your analysis seems right to me. We have to worry about both things:
> atomicity of writes on power failure (assumed to be sector-level,
> hence our 512 byte struct -- all good), and atomicity of concurrent
> reads and writes (we can't assume anything at all, so r/w locking is
> the simplest way to get a consistent read). Shouldn't relmap_redo()
> also acquire the lock exclusively?

Shouldn't we instead file a kernel bug report? I seem to recall that
POSIX guarantees atomicity of these things up to some operation size.
Or is that just for pipe I/O?

If we can't assume atomicity of relmapper file I/O, I wonder about
pg_control as well. But on the whole, what I'm smelling is a moderately
recently introduced kernel bug. We've been doing this this way for
years and heard no previous reports.

regards, tom lane
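
The reader/writer locking idea from the quoted message can be illustrated with a small sketch. This is not PostgreSQL's relmapper code: the file layout (a 512-byte record whose last 4 bytes hold a CRC-32), the use of `flock()` advisory locks, and the names `pack_record`, `unpack_record`, `write_map`, and `read_map` are all assumptions made for illustration. It only shows how a shared lock on the read side keeps a reader from seeing a half-updated record and reporting an incorrect checksum.

```python
import fcntl
import os
import struct
import zlib

RECORD_SIZE = 512               # one sector, matching the size mentioned in the thread
PAYLOAD_SIZE = RECORD_SIZE - 4  # assumed layout: payload followed by a 4-byte CRC-32


def pack_record(payload: bytes) -> bytes:
    """Pad the payload to PAYLOAD_SIZE and append its CRC-32."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload too large")
    body = payload.ljust(PAYLOAD_SIZE, b"\0")
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_record(record: bytes) -> bytes:
    """Verify the trailing CRC-32 and return the padded payload."""
    if len(record) != RECORD_SIZE:
        raise ValueError("short read")
    body = record[:PAYLOAD_SIZE]
    (stored_crc,) = struct.unpack("<I", record[PAYLOAD_SIZE:])
    if zlib.crc32(body) != stored_crc:
        raise ValueError("incorrect checksum")
    return body


def write_map(path: str, payload: bytes) -> None:
    """Rewrite the record while holding an exclusive advisory lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # excludes readers and other writers
        os.pwrite(fd, pack_record(payload), 0)
        os.fsync(fd)
    finally:
        os.close(fd)  # closing the descriptor also releases the lock


def read_map(path: str) -> bytes:
    """Read the record under a shared lock, then check its checksum."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH)  # readers share, but wait for a writer
        return unpack_record(os.pread(fd, RECORD_SIZE, 0))
    finally:
        os.close(fd)
```

Without the shared lock in `read_map`, the read relies entirely on the kernel making a concurrent 512-byte read and write appear atomic to each other, which is the assumption the thread is questioning.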