Re: Online enabling of checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Online enabling of checksums
Date: 2018-04-05 21:41:17
Message-ID: 20180405214117.f4l2gasynqvcnd3s@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2018-04-05 23:32:19 +0200, Magnus Hagander wrote:
> On Thu, Apr 5, 2018 at 11:23 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Is there any sort of locking that guarantees that worker processes see
> > an up-to-date value of
> > DataChecksumsNeedWrite()/ControlFile->data_checksum_version? Afaict
> > there's not. So you can afaict end up with checksums being computed by
> > the worker, but concurrent writes missing them. The window is going to
> > be at most one missed checksum per process (as the unlocking of the page
> > is a barrier) and is probably not easy to hit, but that's dangerous
> > enough.
> >
>
> So just to be clear of the case you're worried about. It's basically:
> Session #1 - sets checksums to inprogress
> Session #1 - starts dynamic background worker ("launcher")
> Launcher reads and enumerates pg_database
> Launcher starts worker in first database
> Worker processes first block of data in database
> And at this point, Session #2 has still not seen the "checksums inprogress"
> flag and continues to write without checksums?

Yes. I think there are some variations of that, but yes, that's pretty
much it.
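
To make the interleaving concrete, here is a toy model of it. This is only a
sketch, not PostgreSQL code: `Backend`, `seen_version`, `refresh`,
`toy_checksum` and `run_race` are hypothetical names, and the stale view is
modelled as a plain cached value, on the assumption that nothing forces
Session #2 to re-read the shared state before it writes.

```python
from dataclasses import dataclass

# Toy model only; none of these names are PostgreSQL APIs.
CHECKSUMS_OFF = 0
CHECKSUMS_INPROGRESS = 1


@dataclass
class Page:
    data: int = 0
    checksum: int = 0


def toy_checksum(data: int) -> int:
    # Stand-in for a real page checksum.
    return (data * 31 + 7) & 0xFFFF


class Backend:
    """A process with a possibly stale local view of the shared state."""

    def __init__(self, shared: dict):
        self.shared = shared
        self.seen_version = shared["version"]

    def refresh(self) -> None:
        # Models whatever would make the new state visible (assumption).
        self.seen_version = self.shared["version"]

    def write_page(self, page: Page, data: int) -> None:
        page.data = data
        # With checksums believed off, the old checksum is left untouched.
        if self.seen_version != CHECKSUMS_OFF:
            page.checksum = toy_checksum(data)


def run_race(refresh_before_write: bool) -> bool:
    """Return True if the page ends up with a valid checksum."""
    shared = {"version": CHECKSUMS_OFF}
    session2 = Backend(shared)                 # caches "off"
    page = Page(data=1)

    shared["version"] = CHECKSUMS_INPROGRESS   # Session #1 flips the state
    worker = Backend(shared)                   # worker starts later, sees it
    worker.write_page(page, page.data)         # worker checksums the block

    if refresh_before_write:
        session2.refresh()
    session2.write_page(page, 2)               # concurrent write by Session #2
    return page.checksum == toy_checksum(page.data)
```

In this model `run_race(False)` leaves the page with a checksum that no longer
matches its contents, while `run_race(True)` does not. The only difference
between the two runs is whether something guarantees the refresh happens
before the write.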

> That seems like quite a long time to me -- is that really a problem?

We don't generally build locking models that are only correct based on
likelihood. Especially not without a lengthy comment explaining that
analysis.

> I'm guessing you're seeing a shorter path between the two that I can't
> see right now (I'll blame the late evening...)?

I don't think it matters terribly much how long that path is.

Greetings,

Andres Freund
