Re: Fix parallel vacuum buffer usage reporting

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix parallel vacuum buffer usage reporting
Date: 2024-05-13 13:10:39
Message-ID: CAN55FZ1tPTwExqkJPnWt_eSpPHjVm_mMAH3he2dq+HLk9dYTLA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, 10 May 2024 at 19:09, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, May 10, 2024 at 7:26 PM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > Thank you for working on this!
> >
> > On Wed, 1 May 2024 at 06:37, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > Thank you for further testing! I've pushed the patch.
> >
> > I realized a behaviour change while looking at 'Use pgBufferUsage for
> > block reporting in analyze' thread [1]. Since that change applies here
> > as well, I thought it is better to mention it here.
> >
> > Before this commit, VacuumPageMiss did not count the blocks if its
> > read was already completed by other backends [2]. Now,
> > 'pgBufferUsage.local_blks_read + pgBufferUsage.shared_blks_read'
> > counts every block attempted to be read; possibly double counting if
> > someone else has already completed the read.
>
> True. IIUC there is such a difference only in HEAD but not in v15 and
> v16. The following comment in WaitReadBuffers() says that it's a
> traditional behavior that we count blocks as read even if someone else
> has already completed its I/O:
>
> /*
> * We count all these blocks as read by this backend. This is traditional
> * behavior, but might turn out to be not true if we find that someone
> * else has beaten us and completed the read of some of these blocks. In
> * that case the system globally double-counts, but we traditionally don't
> * count this as a "hit", and we don't have a separate counter for "miss,
> * but another backend completed the read".
> */
>
> So I think using pgBufferUsage for (parallel) vacuum is a legitimate
> usage and makes it more consistent with other parallel operations.

That sounds logical. Thank you for the clarification.

--
Regards,
Nazir Bilal Yavuz
Microsoft

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message aa 2024-05-13 13:35:22 Re: Is there any chance to get some kind of a result set sifting mechanism in Postgres?
Previous Message Juan Hernández 2024-05-13 13:01:45 I have an exporting need...