Re: Online checksums verification in the backend

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Online checksums verification in the backend
Date: 2020-03-18 04:20:47
Message-ID: 20200318042047.GH214947@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 16, 2020 at 09:21:22AM +0100, Julien Rouhaud wrote:
> On Mon, Mar 16, 2020 at 12:29:28PM +0900, Michael Paquier wrote:
>> With a large amount of
>> shared buffer eviction you actually increase the risk of torn page
>> reads. Instead of a logic relying on partition mapping locks, which
>> could be unwise on performance grounds, did you consider different
>> approaches? For example a kind of pre-emptive lock on the page in
>> storage to prevent any shared buffer operation to happen while the
>> block is read from storage, that would act like a barrier.
>
> Even with a workload having a large shared_buffers eviction pattern, I don't
> think that there's a high probability of hitting a torn page. Unless I'm
> mistaken it can only happen if all those steps happen concurrently to doing the
> block read just after releasing the LWLock:
>
> - postgres read the same block in shared_buffers (including all the locking)
> - dirties it
> - writes part of the page
>
> It's certainly possible, but it seems so unlikely that the optimistic lock-less
> approach seems like a very good tradeoff.

Having false reports in this area could be very confusing for the
user. That's for example possible now with checksum verification and
base backups.

>> I guess that this leads to the fact that this function may be better as
>> a contrib module, with the addition of some better-suited APIs in core
>> (see paragraph above).
>
> Below?

Above. This thought more precisely:
>> For example a kind of pre-emptive lock on the page in
>> storage to prevent any shared buffer operation to happen while the
>> block is read from storage, that would act like a barrier.

> For the record when I first tested that feature I did try to check dirty
> blocks, and it seemed that dirty blocks of shared relation were sometimes
> wrongly reported as corrupted. I didn't try to investigate more though.

Hmm. It would be good to look at that, correct verification of shared
relations matter.

>> + * - if a block is dirty in shared_buffers, it's ignored as it'll be flushed to
>> + * disk either before the end of the next checkpoint or during recovery in
>> + * case of unsafe shutdown
>> Not sure that the indentation is going to react well on that part of
>> the patch, perhaps it would be better to add some "/*-------" at the
>> beginning and end of the comment block to tell pgindent to ignore this
>> part?
>
> Ok. Although I think only the beginning comment is needed?

From src/tools/pgindent/README:
"pgindent will reflow any comment block that's not at the left margin.
If this messes up manual formatting that ought to be preserved,
protect the comment block with some dashes:"

/*----------
* Text here will not be touched by pgindent.
*----------
*/

>> Based on the feedback gathered on this thread, I guess that you should
>> have a SRF returning the list of broken blocks, as well as NOTICE
>> messages.
>
> The current patch has an SRF and a WARNING message, do you want an additional
> NOTICE message or downgrade the existing one?

Right, not sure which one is better, for zero_damaged_pages a WARNING
is used.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shinoda, Noriyoshi (PN Japan A&PS Delivery) 2020-03-18 04:28:34 RE: Multivariate MCV list vs. statistics target
Previous Message David Rowley 2020-03-18 04:12:56 Re: [PATCH] Erase the distinctClause if the result is unique by definition