Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2022-11-08 14:14:39
Message-ID: CAD21AoB8oE4WtgXhs+YFwOH3ODAOvEN48afFLYL=vqMk0d25TA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Nov 5, 2022 at 6:23 PM John Naylor <john(dot)naylor(at)enterprisedb(dot)com> wrote:
>
> On Fri, Nov 4, 2022 at 10:25 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > For parallel heap pruning, multiple workers will insert key-value
> > pairs into the radix tree concurrently. The simplest solution would be
> > a single lock to protect writes, but the performance would not be
> > good. Another solution would be to divide the table into multiple
> > ranges so that keys derived from TIDs do not conflict with each
> > other, and have parallel workers process one or more ranges. That way,
> > parallel vacuum workers can build *sub-trees* and the leader process
> > can merge them. In lazy vacuum's use case, since the write phase and
> > read phase are separated, the readers don't need to worry about
> > concurrent updates.
>
> It's a good idea to use ranges for a different reason -- readahead. See commit 56788d2156fc3, which aimed to improve readahead for sequential scans. It might work to use that as a model: Each worker prunes a range of 64 pages, keeping the dead tids in a local array. At the end of the range: lock the tid store, enter the tids into the store, unlock, free the local array, and get the next range from the leader. It's possible contention won't be too bad, and I suspect using small local arrays as-we-go would be faster and use less memory than merging multiple sub-trees at the end.

That seems a promising idea. I think it might work well even in the
current parallel vacuum (i.e., a single writer). I mean, I think we can
have a single lwlock for the shared case in the first version. If the
overhead of acquiring the lwlock per key-value insertion is not
negligible, we might want to try this idea.

Apart from that, I'm going to incorporate the comments on the 0004
patch and try pointer tagging.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
