Re: Thoughts on "killed tuples" index hint bits support on standby

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Thoughts on "killed tuples" index hint bits support on standby
Date: 2021-01-30 02:03:58
Message-ID: CAH2-WzkSUcuFukhJdSxHFgtL6zEQgNhgOzNBiTbP_4u=k6igAg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 28, 2021 at 10:16 AM Michail Nikolaev
<michail(dot)nikolaev(at)gmail(dot)com> wrote:
> > I wonder if it would help to not actually use the LP_DEAD bit for
> > this. Instead, you could use the currently-unused-in-indexes
> > LP_REDIRECT bit.
>
> Hm… Sounds very promising - an additional bit is a lot in this situation.

Yeah, it would help a lot. But those bits are precious. So it makes
sense to think about what to do with both of them in index AMs at the
same time. Otherwise we risk missing some important opportunity.

> > Whether or not "recently dead" means "dead to my
> > particular MVCC snapshot" can be determined using some kind of
> > in-memory state that won't survive a crash (or a per-index-page
> > epoch?).
>
> Do you have any additional information about this idea (maybe some
> thread)? What kind of “in-memory state that won't survive a crash” do
> you mean, and how would we deal with flushed bits after a crash?

Honestly, that part wasn't very well thought out. A lot of things might work.

Some kind of "recently dead" bit is easier on the primary. If we have
recently dead bits set on the primary (using a dedicated LP bit for
original execution recently-dead-ness), then we wouldn't even
necessarily have to change anything about index scans/visibility at
all. There would still be a significant benefit if we simply used the
recently dead bits when considering which heap blocks nbtree simple
deletion will visit inside _bt_deadblocks() -- in practice there would
probably be no real downside from assuming that the recently dead bits
are now fully dead (it would sometimes be wrong, but not enough to
matter, probably only when there is a snapshot held for way way too
long).
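
To make the block-selection idea above concrete, here is a minimal
sketch. It is not the nbtree code: SketchIndexItem, its recently_dead
field, and sketch_dead_blocks() are hypothetical names made up for
illustration, and the real _bt_deadblocks() works on actual page items.
The sketch only shows how a "recently dead" hint could widen the set of
heap blocks that simple deletion considers visiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an index tuple on a leaf page. */
typedef struct SketchIndexItem
{
    uint32_t    heap_block;     /* heap block that the tuple's TID points to */
    bool        lp_dead;        /* existing LP_DEAD hint */
    bool        recently_dead;  /* hypothetical "recently dead" hint */
} SketchIndexItem;

/* qsort comparator for heap block numbers, ascending. */
static int
sketch_cmp_block(const void *a, const void *b)
{
    uint32_t    x = *(const uint32_t *) a;
    uint32_t    y = *(const uint32_t *) b;

    return (x > y) - (x < y);
}

/*
 * Fill 'blocks' (capacity >= nitems) with the sorted, distinct heap blocks
 * referenced by hinted items, and return how many there are.  When
 * trust_recently_dead is true, recently-dead items are treated as if they
 * were fully dead, which is the optimistic assumption discussed above.
 */
static size_t
sketch_dead_blocks(const SketchIndexItem *items, size_t nitems,
                   bool trust_recently_dead, uint32_t *blocks)
{
    size_t      n = 0;
    size_t      ndistinct;

    assert(items != NULL || nitems == 0);

    for (size_t i = 0; i < nitems; i++)
    {
        if (items[i].lp_dead ||
            (trust_recently_dead && items[i].recently_dead))
            blocks[n++] = items[i].heap_block;
    }
    if (n == 0)
        return 0;

    /* Sort, then squeeze out duplicates in place. */
    qsort(blocks, n, sizeof(uint32_t), sketch_cmp_block);
    ndistinct = 1;
    for (size_t i = 1; i < n; i++)
    {
        if (blocks[i] != blocks[ndistinct - 1])
            blocks[ndistinct++] = blocks[i];
    }
    return ndistinct;
}
```

A wrong guess here only costs a wasted heap block visit, since the
heap is still what decides whether a tuple is really safe to delete.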

Deletion in indexes can work well while starting off with only an
*approximate* idea of which index tuples will be safe to delete --
this is a high level idea behind my recent commit d168b666823. It
seems very possible that that could be pushed even further here on the
primary.

On standbys (which set standby recently dead bits) it will be
different, because you need "index hint bits" set that are attuned to
the workload on the standby, and because you don't ever use the bit to
help with deleting anything on the standby (that all happens during
original execution).

BTW, what happens when the page splits on the primary? Does your
patch "move over" the LP_DEAD bits to each half of the split?

> Hm. What about this way:
>
> 10 - dead to all on standby (LP_REDIRECT)
> 11 - dead to all on primary (LP_DEAD)
> 01 - future “recently DEAD” on primary (LP_NORMAL)

Not sure.
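
For reference, the bit patterns in the quoted list line up with the
existing two-bit line pointer states. A small sketch of that mapping
follows. The LP_* values are the ones PostgreSQL's itemid.h uses, but
the two helper functions are hypothetical and exist only to show the
encoding; they say nothing about what the states would end up meaning
on a standby.

```c
#include <assert.h>
#include <string.h>

/* Two-bit line pointer states, with the same values as itemid.h. */
#define LP_UNUSED   0           /* binary 00 */
#define LP_NORMAL   1           /* binary 01 */
#define LP_REDIRECT 2           /* binary 10, currently unused in indexes */
#define LP_DEAD     3           /* binary 11 */

/* Hypothetical helper: name of a line pointer state. */
static const char *
sketch_lp_flags_name(unsigned flags)
{
    switch (flags)
    {
        case LP_UNUSED:
            return "LP_UNUSED";
        case LP_NORMAL:
            return "LP_NORMAL";
        case LP_REDIRECT:
            return "LP_REDIRECT";
        case LP_DEAD:
            return "LP_DEAD";
        default:
            return "invalid";
    }
}

/* Hypothetical helper: render the two flag bits as a string like "10". */
static void
sketch_lp_flags_bits(unsigned flags, char out[3])
{
    assert(flags <= LP_DEAD);
    out[0] = (flags & 2) ? '1' : '0';
    out[1] = (flags & 1) ? '1' : '0';
    out[2] = '\0';
}
```

So "10" in the list is LP_REDIRECT, "11" is LP_DEAD, and "01" is
LP_NORMAL, which leaves no fifth state to play with.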

> Also, looks like both GIST and HASH indexes also do not use LP_REDIRECT.

Right -- if we were to do this, the idea would be that it would apply
to all index AMs that currently have (or will ever have) something
like the LP_DEAD bit stuff. The GiST and hash support for index
deletion is directly based on the original nbtree version, and there
is no reason why we cannot eventually do all this stuff in at least
those three AMs.

There are already some line-pointer level differences in index AMs:
LP_DEAD items have storage in index AMs, but not in heapam. This
all-table-AMs/all-index-AMs divide in how line pointers work would be
preserved.
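
The storage difference can be sketched as follows. The bit-field layout
mirrors PostgreSQL's ItemIdData (lp_off, lp_flags, lp_len), and the two
setters are modeled on the ItemIdSetDead() and ItemIdMarkDead() macros,
but every Sketch/sketch_ name below is a stand-in written for this
example rather than the real definition.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_LP_NORMAL 1
#define SKETCH_LP_DEAD   3

/* Stand-in for ItemIdData: 15-bit offset, 2-bit state, 15-bit length. */
typedef struct SketchItemIdData
{
    unsigned    lp_off:15,      /* offset of the tuple within the page */
                lp_flags:2,     /* line pointer state */
                lp_len:15;      /* tuple length in bytes */
} SketchItemIdData;

/* heapam style: the item becomes dead and gives up its storage. */
static void
sketch_set_dead_no_storage(SketchItemIdData *lp)
{
    lp->lp_flags = SKETCH_LP_DEAD;
    lp->lp_off = 0;
    lp->lp_len = 0;
}

/* Index AM style: only the state changes, so it acts as a hint bit. */
static void
sketch_mark_dead_keep_storage(SketchItemIdData *lp)
{
    lp->lp_flags = SKETCH_LP_DEAD;
}

/* An item has storage when its length is nonzero. */
static bool
sketch_has_storage(const SketchItemIdData *lp)
{
    return lp->lp_len != 0;
}
```

Because the index tuple is still there after the mark, a scan can
still read its heap TID, which is what lets the bit be a mere hint.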

> Also, btw, do you know any reason to keep minRecoveryPoint at a low value?

Not offhand.

--
Peter Geoghegan
