Re: patch to allow disable of WAL recycling

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Pacheco <dap(at)joyent(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch to allow disable of WAL recycling
Date: 2018-07-12 10:52:37
Message-ID: ac271719-e07e-4841-a7f6-e8c609a53af7@2ndquadrant.com
Lists: pgsql-hackers

On 07/12/2018 02:25 AM, David Pacheco wrote:
> On Tue, Jul 10, 2018 at 1:34 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>
> On 2018-Jul-10, Jerry Jelinek wrote:
>
> > 2) Disabling WAL recycling reduces reliability, even on COW filesystems.
>
> I think the problem here is that WAL recycling in normal filesystems
> helps protect the case where filesystem gets full.  If you remove it,
> that protection goes out the window.  You can claim that people needs to
> make sure to have available disk space, but this does become a problem
> in practice.  I think the thing to do is verify what happens with
> recycling off when the disk gets full; is it possible to recover
> afterwards?  Is there any corrupt data?  What happens if the disk gets
> full just as the new WAL file is being created -- is there a Postgres
> PANIC or something?  As I understand, with recycling on it is easy (?)
> to recover, there is no PANIC crash, and no data corruption results.
>
>
>
> If the result of hitting ENOSPC when creating or writing to a WAL file
> was that the database could become corrupted, then wouldn't that risk
> already be present (a) on any system, for the whole period from database
> init until the maximum number of WAL files was created, and (b) all the
> time on any copy-on-write filesystem?
>

I don't follow Alvaro's reasoning, TBH. There are a couple of things that
confuse me ...

I don't quite see how reusing WAL segments actually protects against a
full filesystem. On "traditional" filesystems I would not expect any
difference between "unlink+create" and reusing an existing file. On CoW
filesystems (like ZFS or btrfs) the space management works very
differently and reusing an existing file is unlikely to save anything.
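
To make the two strategies concrete, here is a minimal sketch of "reuse an
existing file" versus "unlink+create". This is not PostgreSQL's actual code:
the function names and the tiny segment size are made up for illustration,
and real WAL handling involves more steps (directory fsync, locking, etc.).

```python
import os

# Tiny stand-in size for the sketch; PostgreSQL's default WAL segment is 16 MB.
SEGMENT_SIZE = 16 * 1024


def create_segment(path, size=SEGMENT_SIZE):
    """Create a new zero-filled segment and flush it to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(b"\0" * size)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            view = view[os.write(fd, view):]
        os.fsync(fd)
    finally:
        os.close(fd)


def recycle_segment(old_path, new_path):
    """Reuse: rename an old segment so it can be overwritten in place later."""
    os.rename(old_path, new_path)


def replace_segment(old_path, new_path, size=SEGMENT_SIZE):
    """Unlink+create: drop the old segment and allocate a fresh one."""
    os.unlink(old_path)
    create_segment(new_path, size)
```

On a filesystem that overwrites blocks in place, both paths end up with the
same amount of space in use. On a CoW filesystem, later writes to the renamed
file allocate new blocks anyway, so the rename by itself reserves nothing.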

But even if it reduces the likelihood of ENOSPC, it does not eliminate
it entirely. max_wal_size is not a hard limit, and the disk may be
filled by something else (when WAL is not on a separate device, when
there is thin provisioning, etc.). So it's not a protection against
data corruption we could rely on. (And as was discussed in the recent
fsync thread, ENOSPC is a likely source of past data corruption issues
on NFS and possibly other filesystems.)

I might be missing something, of course.

AFAICS the original reason for reusing WAL segments was the belief that
overwriting an existing file is faster than writing a new file. That
might have been true in the past, but the question is if it's still true
on current filesystems. The results posted here suggest it's not true on
ZFS, at least.
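
A rough way to check this on a given filesystem is to time both cases
directly. The sketch below is only an illustration under simple assumptions
(one file, one run, zero-filled data, hypothetical helper names); it is not
the benchmark used for the results posted in this thread, and a single run
will be noisy.

```python
import os
import time


def timed_write(path, flags, data):
    """Open path with the given flags, write data, fsync; return seconds taken."""
    start = time.perf_counter()
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            view = view[os.write(fd, view):]
        os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare_new_vs_overwrite(directory, size=1024 * 1024):
    """Time writing a brand-new file, then overwriting that same file."""
    data = b"\0" * size
    path = os.path.join(directory, "segment")
    t_new = timed_write(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, data)
    t_overwrite = timed_write(path, os.O_WRONLY, data)
    os.unlink(path)
    return t_new, t_overwrite
```

Calling compare_new_vs_overwrite() on a directory that lives on the
filesystem of interest, repeated many times and with a realistic segment
size, gives a first impression of whether overwriting is actually cheaper
there.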

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
