From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> |
Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Use streaming read I/O in BRIN vacuuming |
Date: | 2025-10-08 23:02:44 |
Message-ID: | CAD21AoDhroZkeHNq95FcaanahmS-ZXS44kRRpFZwisZnDLOvdA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Aug 31, 2025 at 12:48 PM Arseniy Mukhin
<arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> wrote:
>
> Hi!
>
> On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> >
> > Hi!
> >
> > > On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> wrote:
> > >
> > > PFA the patch that migrates BRIN vacuum to the read stream API.
> >
> > The patch is nice and straightforward. Looks good to me.
> >
>
> Thank you for the review!
>
> > Some notes that do not seem to me problem of this patch:
> > 1. This comment is copied 7 times already across codebase.
> > "It is safe to use batchmode as block_range_read_stream_cb"
> > Maybe we can refactor comments and function names...
>
> Yes, I had similar thoughts, but having these comments at callsites
> has its own benefits, there is a thread about these comments [0]...
>
> > 2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.
>
> If I understand correctly, in other access methods you need to be sure
> that you read the relation up to the end, so you don't leave any index
> tuples that should be pruned. BRIN doesn't have a prune phase, there
> is only a cleanup phase. So it seems it's not a big deal if you miss
> several pages that were allocated during the vacuum.
>
Thank you for proposing the patch! I've reviewed the patch and have
some comments:
+ stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+ READ_STREAM_FULL |
+ READ_STREAM_SEQUENTIAL |
+ READ_STREAM_USE_BATCHING,
+ strategy,
+ idxrel,
+ MAIN_FORKNUM,
+ block_range_read_stream_cb,
+ &p,
+ 0);
Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.
---
Have you done any performance testing with this patch? Since BRIN
indexes typically don't grow very large, I'm curious how much benefit
the read_stream provides for BRIN index cleanup.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Jeremy Schneider | 2025-10-08 23:40:57 | Re: another autovacuum scheduling thread |
Previous Message | Melanie Plageman | 2025-10-08 22:54:25 | Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) |