Re: Online checksums verification in the backend

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Online checksums verification in the backend
Date: 2020-10-15 07:59:25
Message-ID: CAOBaU_Yr1DHTC3w1W_KCOr-vsAOgtXFGWM7qNhQAyQnjW_sTrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 15, 2020 at 1:37 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Thu, Oct 1, 2020 at 1:07 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >
> > On Fri, Sep 25, 2020 at 06:11:47PM +0800, Julien Rouhaud wrote:
> > > Thanks a lot for the tests! I'm not surprised that forcing the lock
> > > will slow down the pg_check_relation() execution, but I'm a bit
> > > surprised that holding the buffer mapping lock longer in a workload
> > > that has a lot of evictions actually makes things faster. Do you have
> > > any idea why that's the case?
> >
> > That's still a bit unclear to me, but I have not spent much time
> > thinking about this particular point either.
> >
> > > I'm assuming that you prefer to remove both the optimization and the
> > > throttling part? I'll do that with the next version unless there's
> > > objections.
> >
> > Yeah, any tests I have done tends to show that. It would be good to
> > also check some perf profiles here, at least for the process running
> > the relation check in a loop.
> >
> > > I agree that putting the code nearby ReadBuffer_common() would be a
> > > good idea. However, that means that I can't move all the code to
> > > contrib/ I'm wondering what you'd like to see going there. I can see
> > > some values in also having the SQL functions available in core rather
> > > than contrib, e.g. if you need to quickly check a relation on a
> > > standby, so without requiring to create the extension on the primary
> > > node first.
> >
> > Good point. This could make the user experience worse.
> >
> > > Then, I'm a bit worried about adding this code in ReadBuffer_common.
> > > What this code does is quite different, and I'm afraid that it'll make
> > > ReadBuffer_common more complex than needed, which is maybe not a good
> > > idea for something as critical as this function.
> > >
> > > What do you think?
> >
> > Yeah, I have been looking at ReadBuffer_common() and it is true that
> > it is complicated enough so we may not really need an extra mode or
> > more options, for a final logic that is actually different than what a
> > buffer read does: we just want to know if a page has a valid checksum
> > or not. An idea that I got here would be to add a new, separate
> > function to do the page check directly in bufmgr.c, but that's what
> > you mean. Now only the prefetch routine and ReadBuffer_common use
> > partition locks, but getting that done in the same file looks like a
> > good compromise to me. It would be also possible to keep the BLCKSZ
> > buffer used to check the page directly in this routine, so as any
> > caller willing to do a check don't need to worry about any
> > allocation.
>
> I made all the suggested modifications in attached v14:
>
> - moved the C code in bufmgr.c nearby ReadBuffer
> - removed the GUC and throttling options
> - removed the dubious optimization
>
> All documentation and comments are updated to reflect those changes.
> I also split the commit in two, one for the backend infrastructure and
> one for the SQL wrappers.

And I did miss a reference in the sgml documentation, fixed in v15.

Attachment Content-Type Size
v15-0002-Add-a-pg_check_relation-SQL-function.patch application/octet-stream 16.3 KB
v15-0001-Add-backend-infrastructure-to-check-the-validity.patch application/octet-stream 9.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-10-15 08:12:03 Re: setLastTid() and currtid()
Previous Message Fujii Masao 2020-10-15 07:53:03 Re: Feature improvement for FETCH tab completion