Re: Online checksums patch - once again

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Online checksums patch - once again
Date: 2020-11-25 13:33:32
Message-ID: 8e1cda9b-1cb9-a634-d383-f757bf67b820@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 25/11/2020 15:20, Daniel Gustafsson wrote:
>> On 23 Nov 2020, at 18:36, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> What happens is if you crash between UpdateControlFile() and XlogChecksum()?
>
> Good point, that would not get the cluster to a consistent state. The
> XlogChecksum should be performed before controlfile is udpated.
>
> +void
> +SetDataChecksumsOff(void)
> +{
> + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +
> + if (ControlFile->data_checksum_version == 0)
> + {
> + LWLockRelease(ControlFileLock);
> + return;
> + }
> +
> + if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
> + {
> + ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
> + UpdateControlFile();
> + LWLockRelease(ControlFileLock);
> +
> + /*
> + * Update local state in all backends to ensure that any backend in
> + * "on" state is changed to "inprogress-off".
> + */
> + XlogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
> + WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF));
> +
> + /*
> + * At this point we know that no backends are verifying data checksums
> + * during reading. Next, we can safely move to state "off" to also
> + * stop writing checksums.
> + */
> + }
> +
> + XlogChecksums(0);
> +
> + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> + ControlFile->data_checksum_version = 0;
> + UpdateControlFile();
> + LWLockRelease(ControlFileLock);
> +
> + WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF));
> +}

The lwlocking doesn't look right here. If
ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION,
LWLockAcquire is called twice without a LWLockRelease in between.

What if a checkpoint, and a crash, happens just after the WAL record has
been written, but before the control file is updated? That's a
ridiculously tight window for a whole checkpoint cycle to happen, but in
principle I think that would spell trouble. I think you could set
delayChkpt to prevent the checkpoint from happening in that window,
similar to how we avoid this problem with the clog updates at commit.
Also, I think this should be in a critical section; we don't want the
process to error out in between for any reason, and if it does happen,
it's panic time.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2020-11-25 13:54:39 Re: Parallel plans and "union all" subquery
Previous Message Ashutosh Bapat 2020-11-25 13:23:47 Re: [PATCH] LWLock self-deadlock detection