Re: Online enabling of checksums

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Online enabling of checksums
Date: 2018-06-26 11:45:49
Message-ID: CABUevEw9wx2xnoYU4QSNbDPy72zZDLwfkHAsw50kf=HO6tKx4g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 9, 2018 at 7:22 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:

> On Sat, Apr 7, 2018 at 6:22 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
>> Hi,
>>
>> On 2018-04-07 08:57:03 +0200, Magnus Hagander wrote:
>> > Note however that I'm sans-laptop until Sunday, so I will revert it
>> > then or possibly Monday.
>>
>> I'll deactivate the isolationtester tests until then. They've been
>> intermittently broken for days now and prevent other tests from being
>> exercised.
>>
>
> Thanks.
>
> I've pushed the revert now, and left the pg_verify_checksums in place for
> the time being.
>
>
PFA an updated version of the patch for the next CF. We believe this one
takes care of all the things pointed out so far.

For this version, we "implemented" the MegaExpensiveRareMemoryBarrier() by
simply requiring a restart of PostgreSQL to initiate the conversion in the
background. That is definitely going to guarantee a memory barrier. It's
certainly not ideal, but restarting the cluster is still a *lot* better
than having to do the entire conversion offline. This can of course be
improved upon in the future, but for now we stuck to the safe way.

The concurrent create-database-from-one-that-had-no-checksums case is handled by
simply looping over the list of databases as long as new databases show up,
and waiting for all open transactions to finish at the right moment to
ensure there is no concurrently running one as we get the database list.
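
To illustrate the rescan loop described above, here is a minimal sketch in
Python rather than the C used in the patch. It is not taken from the patch:
the function name and the three callbacks (list_databases,
wait_for_open_transactions, process_database) are hypothetical stand-ins for
the catalog scan, the wait for open transactions, and the per-database
checksum pass.

```python
from typing import Callable, Iterable, List, Set


def process_all_databases(
    list_databases: Callable[[], Iterable[str]],
    wait_for_open_transactions: Callable[[], None],
    process_database: Callable[[str], None],
) -> List[str]:
    """Process every database, rescanning until a pass finds no new ones.

    All three callbacks are hypothetical; they only model the behaviour
    described in the message, not the actual patch code.
    """
    done: Set[str] = set()
    order: List[str] = []
    while True:
        # Let in-flight transactions (such as a concurrent CREATE DATABASE)
        # finish before reading the list, so none is running as we read it.
        wait_for_open_transactions()
        pending = [db for db in list_databases() if db not in done]
        if not pending:
            # A full pass found nothing new, so every database is covered.
            return order
        for db in pending:
            process_database(db)
            done.add(db)
            order.append(db)


if __name__ == "__main__":
    catalog = ["postgres", "template1"]

    def fake_process(db: str) -> None:
        # Simulate a database being created while the worker is busy.
        if db == "template1" and "newdb" not in catalog:
            catalog.append("newdb")

    print(process_all_databases(lambda: list(catalog), lambda: None, fake_process))
```

The loop only terminates after a complete pass that sees no unprocessed
database, which is why a database created during an earlier pass is still
picked up.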

Since the worker is now a regular background worker started from
postmaster, the cost-delay parameters had to be made GUCs instead of
function arguments.

(And the more-or-less broken isolation tests have simply been removed.)

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Attachment Content-Type Size
online_checksums12.patch text/x-patch 64.4 KB
