Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2023-03-13 13:28:09
Message-ID: CAFBsxsHx9vwGuygw5eX6yryZ72FWeL-fL6GAnWmzYs8VP6HV_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 13, 2023 at 8:41 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:
>
> On Sun, Mar 12, 2023 at 12:54 AM John Naylor
> <john(dot)naylor(at)enterprisedb(dot)com> wrote:
> >
> > On Fri, Mar 10, 2023 at 9:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> > > * Additional size classes. It's important for an alternative of path
> > > compression as well as supporting our decoupling approach. Middle
> > > priority.
> >
> > I'm going to push back a bit and claim this doesn't bring much gain,
while it does have a complexity cost. The node1 from Andres's prototype is
32 bytes in size, same as our node3, so it's roughly equivalent as a way to
ameliorate the lack of path compression.
>
> But does it mean that our node1 would help reduce the memory further
> since since our base node type (i.e. RT_NODE) is smaller than the base
> node type of Andres's prototype? The result I shared before showed
> 1.2GB vs. 1.9GB.

The benefit is found in a synthetic benchmark with random integers. I
highly doubt that anyone would be willing to force us to keep
binary-searching the 1GB array for one more cycle on account of not adding
a size class here. I'll repeat myself and say that there are also
maintenance costs.

In contrast, I'm fairly certain that our attempts thus far at memory
accounting/limiting are not quite up to par, and lacking enough to
jeopardize the feature. We're already discussing that, so I'll say no more.

> > I say "roughly" because the loop in node3 is probably noticeably
slower. A new size class will by definition still use that loop.
>
> I've evaluated the performance of node1 but the result seems to show
> the opposite.

As an aside, I meant the loop in our node3 might make your node1 slower
than the prototype's node1, which was coded for 1 member only.

> > > * Node shrinking support. Low priority.
> >
> > This is an architectural wart that's been neglected since the tid store
doesn't perform deletion. We'll need it sometime. If we're not going to
make this work, why ship a deletion API at all?
> >
> > I took a look at this a couple weeks ago, and fixing it wouldn't be
that hard. I even had an idea of how to detect when to shrink size class
within a node kind, while keeping the header at 5 bytes. I'd be willing to
put effort into that, but to have a chance of succeeding, I'm unwilling to
make it more difficult by adding more size classes at this point.
>
> I think that the deletion (and locking support) doesn't have use cases
> in the core (i.e. tidstore) but is implemented so that external
> extensions can use it.

I think these cases are a bit different: Doing anything with a data
structure stored in shared memory without a synchronization scheme is
completely unthinkable and insane. I'm not yet sure if
deleting-without-shrinking is a showstopper, or if it's preferable in v16
than no deletion at all.

Anything we don't implement now is a limit on future use cases, and thus a
cause for objection. On the other hand, anything we implement also
represents more stuff that will have to be rewritten for high-concurrency.

> FYI, I've run TPC-C workload over the weekend, and didn't get any
> failures of the assertion proving tidstore and the current tid lookup
> return the same result.

Great!

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2023-03-13 13:36:40 Re: MERGE ... RETURNING
Previous Message Jonathan S. Katz 2023-03-13 13:26:45 Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences