Re: Deleting older versions in unique indexes to avoid page splits

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Victor Yegorov <vyegorov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2021-01-20 05:28:36
Message-ID: CAH2-WzmmPvghnBBpVoZF6sDEyTtQtNg6nc0pAbvptuvx23eE0w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 19, 2021 at 7:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> The worst cases could be (a) when there is just one such duplicate
> (indexval logically unchanged) on the page and that happens to be the
> last item and others are new insertions, (b) same as (a) and along
> with it let's say there is an open transaction due to which we can't
> remove even that duplicate version. Have we checked the overhead or
> results by simulating such workloads?

There is no such thing as a workload that has page splits caused by
non-HOT updaters, but almost no actual version churn from the same
non-HOT updaters. It's possible that a small number of individual page
splits will work out like that, of course, but they'll be extremely
rare, and impossible to see in any kind of consistent way.

That just leaves long running transactions. Of course it's true that
eventually a long-running transaction will make it impossible to
perform any cleanup, for the usual reasons. And at that point this
mechanism is bound to fail (which costs additional cycles -- the
wasted access to a single heap page, some CPU cycles). But it's still
a bargain to try. Even with a long running transaction there will be
a great many bottom-up deletion passes that still succeed earlier on
(because at least some of the dups are deletable, and we can still
delete those that became garbage right before the long running
snapshot was acquired).
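
To make the failure cost concrete, here is a minimal C sketch of the
shape of a bottom-up pass (the names, the block cap, and the candidate
ordering are mine, purely illustrative -- this is not the actual
nbtree/heapam code). The candidates are assumed sorted
most-promising-first; the pass visits a small, bounded number of heap
blocks and bails out after the first block when nothing at all turns
out to be deletable, so a long running snapshot can only waste a single
heap page access (plus a little CPU) per failed pass:

/*
 * Illustrative sketch only -- hypothetical names, not PostgreSQL internals.
 * The point: a failed bottom-up pass has a hard, small upper bound on the
 * work it wastes before the caller falls back to dedup or a page split.
 */
#include <stdbool.h>

#define MAX_HEAP_BLOCKS_PER_PASS 3	/* illustrative cap, not the real limit */

typedef struct HeapBlockCandidate
{
	unsigned int blkno;		/* heap block with promising duplicate TIDs */
	int			ntids;		/* index tuples on the leaf page pointing there */
} HeapBlockCandidate;

/*
 * Stand-in for the heapam call that checks the candidate block's tuple
 * versions and returns the leaf page space (in bytes) freed by deleting
 * index tuples whose heap versions were found to be dead.
 */
extern int	visit_heap_block_and_delete(const HeapBlockCandidate *cand);

/*
 * Returns true when enough leaf space was freed to fit the incoming tuple
 * (page split avoided); false when the caller must fall back.
 */
static bool
bottomup_pass(const HeapBlockCandidate *cands, int ncands, int space_needed)
{
	int			freed = 0;

	for (int i = 0; i < ncands && i < MAX_HEAP_BLOCKS_PER_PASS; i++)
	{
		freed += visit_heap_block_and_delete(&cands[i]);

		if (freed >= space_needed)
			return true;	/* we avoided the page split */

		/*
		 * If even the most promising block yielded nothing (say a
		 * long running snapshot is holding everything back), give up
		 * immediately rather than keep paying for heap accesses.
		 */
		if (i == 0 && freed == 0)
			break;
	}

	return false;			/* bounded failure: at most a few heap reads */
}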

Victor independently came up with a benchmark that ran over several
hours, with cleanup consistently held back by ~5 minutes by a long
running transaction:

https://www.postgresql.org/message-id/CAGnEbogATZS1mWMVX8FzZHMXzuDEcb10AnVwwhCtXtiBpg3XLQ@mail.gmail.com

This was actually one of the most favorable cases of all for the patch
-- the patch prevented logically unchanged indexes from growing (this
is a mix of inserts, updates, and deletes, not just updates, so it was
less than perfect -- we did see the indexes grow by half of one
percent), whereas without the patch each of the same 3 indexes grew by
30% - 60%.

So yes, I did think about long running transactions, and no, the
possibility of wasting one heap block access here and there when the
database is melting down anyway doesn't seem like a big deal to me.

> I feel that, unlike the LP_DEAD optimization, this new bottom-up
> scheme can cost us extra CPU and I/O, because there seems to be not
> much consideration given to the fact that we might not be able to
> delete any items (or only very few) due to long-standing open
> transactions, except that we limit ourselves when we are not able to
> remove even one tuple from any particular heap page.

There was plenty of consideration given to that. It was literally
central to the design, and something I pored over at length. Why
don't you go read some of that now? Or, why don't you demonstrate an
actual regression using a tool like pgbench?

I do not appreciate being accused of having acted carelessly. You
don't have a single shred of evidence.

The design is counter-intuitive. I think that you simply don't understand it.

> Now, say that due to open transactions we are able
> to remove very few tuples (for the sake of argument, say there is only
> 'one' such tuple) from the heap page; then we will keep on accessing
> heap pages without much benefit. I feel that extending the deletion
> mechanism based on the number of LP_DEAD items sounds more favorable
> than giving preference to duplicate items. Sure, it will give equally
> good or better results if there are no long-standing open
> transactions.

As Andres says, LP_DEAD bits need to be set by index scans. Otherwise
nothing happens. The simple deletion case can do nothing without that
happening. It's good that it's possible to reuse work from index scans
opportunistically, but it's not reliable.
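
In other words, the two mechanisms have completely different triggers.
Here is a rough sketch of the ordering I have in mind (hypothetical
names again, just to show the structure, not the real nbtree code):
simple deletion is gated on LP_DEAD bits that index scans may or may
not have set, while bottom-up deletion is driven by the incoming
non-HOT updater itself:

#include <stdbool.h>

/* Illustrative types/functions only -- not PostgreSQL internals */
typedef struct LeafPageState
{
	bool		has_lp_dead_items;	/* set opportunistically by index scans */
	bool		incoming_is_unchanged_update;	/* executor hint: logically
												 * unchanged index tuple from
												 * a non-HOT update */
} LeafPageState;

extern bool simple_index_delete(LeafPageState *page);	/* reuses LP_DEAD bits */
extern bool bottomup_index_delete(LeafPageState *page); /* visits heap blocks */
extern bool deduplicate_page(LeafPageState *page);
extern void split_page(LeafPageState *page);

static void
make_room_for_insert(LeafPageState *page)
{
	/* Only possible when index scans already happened to set LP_DEAD bits */
	if (page->has_lp_dead_items && simple_index_delete(page))
		return;

	/*
	 * Does not depend on index scans at all: it is the non-HOT updater that
	 * is about to cause a split that pays for (and benefits from) the
	 * cleanup attempt.
	 */
	if (page->incoming_is_unchanged_update && bottomup_index_delete(page))
		return;

	if (deduplicate_page(page))
		return;

	split_page(page);			/* last resort */
}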

> > I personally will
> > never vote for a theoretical risk with only a theoretical benefit. And
> > right now that's what the idea of doing bottom-up deletions in more
> > marginal cases (the page flag thing) looks like.
> >
>
> I don't think we can say that it is purely theoretical because I have
> shown some basic computation where it can lead to fewer splits.

I'm confused. You realize that this makes it *more* likely that
bottom-up deletion passes will take place, right? It sounds like
you're arguing both sides of the issue at the same time.

--
Peter Geoghegan
