Re: Online enabling of checksums

From: Sergei Kornilov <sk(at)zsrv(dot)org>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Joshua D(dot) Drake <jd(at)commandprompt(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Online enabling of checksums
Date: 2018-08-01 18:42:27
Message-ID: 716191533148947@sas1-02732547ccc0.qloud-c.yandex.net
Lists: pgsql-hackers

Hi

> This doesn't test the consequences of the restart being skipped, nor
> does it review on a code level the correctness.
That test is not the only thing I checked during review; I also read the code. The bgworker in checksumhelper.c is registered with:
> bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
It then processes the whole cluster (even if checksumhelper ran before but exited before it completed). Or does BgWorkerStart_RecoveryFinished not guarantee that the worker starts only after recovery has finished?
Before starting any real work (and after recovery ends), checksumhelper checks the current cluster state again:

> + * If a standby was restarted when in pending state, a background worker
> + * was registered to start. If it's later promoted after the master has
> + * completed enabling checksums, we need to terminate immediately and not
> + * do anything. If the cluster is still in pending state when promoted,
> + * the background worker should start to complete the job.

> What if your replicas are delayed (e.g. recovery_min_apply_delay)?
> What if you later need to do PITR?
If we start after pg_enable_data_checksums has been replayed but before the work has completed, the bgworker is scheduled to start when recovery finishes.
If we have replayed the completion of checksumhelper, we _can_ still start checksumhelper again, and this case is handled when checksumhelper starts.
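
To make the two cases above concrete, here is a minimal sketch of the decision I understand the worker to make when it starts after recovery. It is only a model: the enum names and worker_action_at_start() are hypothetical and are not taken from the patch, and treating the "off" state as "exit without work" is my assumption, since the quoted comment only covers the pending and completed cases.

```c
#include <assert.h>

/* Hypothetical three-state model of the cluster's checksum setting. */
typedef enum
{
    CHECKSUMS_OFF,      /* checksums were never requested (assumed case) */
    CHECKSUMS_PENDING,  /* enabling was requested but has not completed */
    CHECKSUMS_ON        /* enabling has completed */
} ChecksumState;

/* What a worker started after recovery decides to do. */
typedef enum
{
    WORKER_EXIT_NOOP,       /* terminate immediately, touch nothing */
    WORKER_PROCESS_CLUSTER  /* (re)process the whole cluster */
} WorkerAction;

/*
 * Re-check the cluster state at worker start, after recovery has ended.
 * Only a cluster still in the pending state needs work; a worker that was
 * registered earlier but finds the job already done simply exits.
 */
static WorkerAction
worker_action_at_start(ChecksumState state)
{
    switch (state)
    {
        case CHECKSUMS_PENDING:
            return WORKER_PROCESS_CLUSTER;
        case CHECKSUMS_ON:
        case CHECKSUMS_OFF:
            return WORKER_EXIT_NOOP;
    }

    /* Not reachable with a valid ChecksumState value. */
    assert(0 && "unknown checksum state");
    return WORKER_EXIT_NOOP;
}
```

With this model, a delayed replica or a PITR target that stops between the two replayed events lands in CHECKSUMS_PENDING and processes the cluster again from the start; one that stops after the completion record lands in CHECKSUMS_ON and the worker exits without doing anything.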

The behavior seems correct to me. Am I missing something badly wrong?

regards, Sergei
