Re: Online enabling of checksums

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Michael Banck <michael(dot)banck(at)credativ(dot)de>
Subject: Re: Online enabling of checksums
Date: 2018-02-25 16:17:36
Message-ID: CABUevEx791=f06HOjQpeLWu56jwo56UK7BTp-_fCE2NZoEO=Kg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 24, 2018 at 10:48 PM, Magnus Hagander <magnus(at)hagander(dot)net>
wrote:

> On Sat, Feb 24, 2018 at 4:29 AM, Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>> Hi,
>>
>> I see the patch also does throttling by calling vacuum_delay_point().
>> Being able to throttle the checksum workers not to affect user activity
>> definitely seems like a useful feature, no complaints here.
>>
>> But perhaps binding it to vacuum_cost_limit/vacuum_cost_delay is not the
>> best idea? I mean, enabling checksums seems rather unrelated to vacuum,
>> so it seems a bit strange to configure it by vacuum_* options.
>>
>> Also, how am I supposed to set the cost limit? Perhaps I'm missing
>> something, but the vacuum_delay_point call happens in the bgworker, so
>> setting the cost limit before running pg_enable_data_checksums() will
>> not get there, right? I really don't want to be setting it in the config
>> file, because then it will suddenly affect all user VACUUM commands.
>>
>> And if this patch gets improved to use multiple parallel workers, we'll
>> need a global limit (something like we have for autovacuum workers).
>>
>> In any case, I suggest mentioning this in the docs.
>>
>>
> Ah yes. I actually have it on my TODO to work on that, but I forgot to put
> that in the email I sent out. Apologies for that, and thanks for pointing
> it out!
>
> Right now you have to set the limit in the configuration file. That's of
> course not the way we want to have it long term (but as long as it is that
> way it should at least be documented). My plan is to either pick it up from
> the current session that calls pg_enable_data_checksums(), or to simply
> pass it down as parameters to the function directly. I'm thinking the
> second one (passing a cost_delay and a cost_limit as optional parameters to
> the function) is the best one because, as you say, overloading it on the
> user-visible GUCs seems a bit ugly. Once that's in place, I think the
> easiest is to just pass the values down to the workers through the shared
> memory segment.
>
>
Please find attached an updated patch that adds this, and also fixes the
problem in pg_verify_checksums spotted by Michael Banck.
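
For anyone following the thread, the idea discussed above can be sketched
roughly as follows. This is an illustrative Python model, not code from the
patch: the names ThrottleSettings, CostThrottle and process_blocks are
hypothetical, the per-block cost of 10 is an arbitrary assumption, and the
accounting is a simplified take on cost-based delay (accumulate a cost, and
sleep once the accumulated cost reaches the limit). The settings object
stands in for whatever the calling function would place in the shared memory
segment for the workers to read.

```python
import time
from dataclasses import dataclass


@dataclass
class ThrottleSettings:
    """Hypothetical stand-in for values handed to the worker via shared memory."""
    cost_delay_ms: float  # how long to sleep once the limit is reached; 0 disables throttling
    cost_limit: int       # accumulated cost that triggers a sleep


class CostThrottle:
    """Simplified cost-based throttle: charge work, nap when the balance hits the limit."""

    def __init__(self, settings, sleep=time.sleep):
        self.settings = settings
        self.sleep = sleep      # injectable so the sketch can be exercised without waiting
        self.balance = 0
        self.naps = 0

    def charge(self, cost):
        # Record the cost of a unit of work (for example, rewriting one block).
        self.balance += cost

    def delay_point(self):
        # Throttling is off when the delay is zero or negative.
        if self.settings.cost_delay_ms <= 0:
            return
        # Sleep and reset the balance only once the limit has been reached.
        if self.balance >= self.settings.cost_limit:
            self.sleep(self.settings.cost_delay_ms / 1000.0)
            self.balance = 0
            self.naps += 1


def process_blocks(n_blocks, settings, cost_per_block=10, sleep=time.sleep):
    """Pretend to process n_blocks blocks; return how many times the worker napped."""
    throttle = CostThrottle(settings, sleep)
    for _ in range(n_blocks):
        throttle.charge(cost_per_block)  # the real work would happen here
        throttle.delay_point()
    return throttle.naps
```

Because the settings travel with the request rather than coming from the
vacuum_* GUCs, a sketch like this leaves ordinary VACUUM commands unaffected,
which is the concern raised above.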

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Attachment: online_checksums2.patch (text/x-patch, 70.8 KB)
