Re: Opportunistically pruning page before update

From: James Coleman <jtc331(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Opportunistically pruning page before update
Date: 2023-10-04 21:01:14
Message-ID: CAAaqYe8OA=a6-5iXB81OUVJ0UQ0LwHSshRdoPSqFim2Tj8tJfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 26, 2023 at 8:30 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Tue, Sep 5, 2023 at 1:40 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> >
> > On Wed, Jun 21, 2023 at 8:51 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > While at PGCon I was chatting with Andres (and I think Peter G. and a
> > > few others who I can't remember at the moment, apologies) and Andres
> > > noted that while we opportunistically prune a page when inserting a
> > > tuple (before deciding we need a new page) we don't do the same for
> > > updates.
> > >
> > > Attached is a patch series to do the following:
> > >
> > > 0001: Make it possible to call heap_page_prune_opt already holding an
> > > exclusive lock on the buffer.
> > > 0002: Opportunistically prune pages on update when the current tuple's
> > > page has no free space. If this frees up enough space, then we
> > > continue to put the new tuple on that page; if not, then we take the
> > > existing code path and get a new page.
> >
> > I've reviewed these patches and have questions.
> >
> > Under what conditions would this be exercised for UPDATE? Could you
> > provide an example?
> >
> > With your patch applied, when I create a table, the first time I update
> > it heap_page_prune_opt() will return before actually doing any pruning
> > because the page prune_xid hadn't been set (it is set after pruning as
> > well as later in heap_update() after RelationGetBufferForTuple() is
> > called).
> >
> > I actually added an additional parameter to heap_page_prune() and
> > heap_page_prune_opt() to identify if heap_page_prune() was called from
> > RelationGetBufferForTuple() and logged a message when this was true.
> > Running the test suite, I didn't see any UPDATEs executing
> > heap_page_prune() from RelationGetBufferForTuple(). I did, however, see
> > other statement types doing so (see RelationGetBufferForTuple()'s other
> > callers). Was that intended?
> >
> > > I started to work on benchmarking this, but haven't had time to devote
> > > properly to that, so I'm wondering if there's anyone who might be
> > > interested in collaborating on that part.
> >
> > I'm interested in this feature and in helping with it/helping with
> > benchmarking it, but I don't yet understand the design in its current
> > form.
>
> Hi Melanie,
>
> Thanks for taking a look at this! Apologies for the long delay in
> replying: I started to take a look at your questions earlier, and it
> turned into more of a rabbit hole than I'd anticipated. I've since
> been distracted by other things. So -- I don't have any conclusions
> here yet, but I'm hoping at or after PGConf NYC that I'll be able to
> dedicate the time this deserves.

Hi,

I poked at this a decent amount last night and uncovered a couple of
things (whether or not Andres and I had discussed these details at
PGCon...I don't remember):

1. We don't ever opportunistically prune on INSERT, but we do
(somewhat, see below) on UPDATE, since we call it the first time we
read the page with the to-be-updated tuple on it.
2. The reason that original testing on v1 didn't see any real changes
is because PageClearHasFreeLinePointers() wasn't the right fastpath
gate on this; I should have been using !PageIsFull().

With the change to use !PageIsFull() I can trivially show that there
is improvement functionally. Consider the following commands:

drop table if exists foo;
create table foo(pk serial primary key, t text);
insert into foo(t) select repeat('a', 250) from generate_series(1, 27);
select pg_relation_size('foo');
delete from foo where pk <= 10;
insert into foo(t) select repeat('b', 250) from generate_series(1, 10);
select pg_relation_size('foo');

On master this will result in a final relation size of 16384 while
with the patch applied the final relation size is 8192.

I talked to Andres and Peter again today, and out of that conversation
I have some observations and ideas for future improvements.

1. The most trivial case where this is useful is INSERT: we have a
target page, and it may have dead tuples, so trying to prune may
result in us being able to use the target page rather than getting a
new page.
2. The next most trivial case is where UPDATE (potentially after
failing to find space for a HOT tuple on the source tuple's page);
much like the INSERT case our backend's target page may benefit from
pruning.
3. A more complex UPDATE case occurs when we check the tuple's page
for space in order to insert a HOT tuple and fail to find enough
space. While we've already opportunistically pruned the page on
initial read of the tuple, in complex queries this might be some time
in the past, so it may be worth attempting again. Beyond that context
is key: if we already know we could otherwise do a HOT update but for
the lack of free space on the page, then spending extra cycles
rescuing that failed attempt is easier to justify. In order to do that
we ought to invent an "aggressive" flag to heap_page_prune_opt telling
it that it doesn't need to be quite so careful about exiting fast.
Perhaps we can rescue the HOT update optimization by pruning
aggressively.
4. We can prune the target page when the current backend recently
aborted a transaction. Additionally we could prune the target page
immediately on rollback (potentially we could even get into the
complexity of doing retail index tuple deletion when a transaction
aborts).

It may or may not be the case that I end up pursuing all of these in
this particular patch series, but I wanted to at least get it written
down here for history's sake.

The attached v2 patch series handles case 1 and likely case 2 (though
I haven't tested case 2 yet). The "log when pruning" patch files may
or may not be useful to you: they add a bunch of logging to make it
easier to observe what's happening while playing around in psql.

Regards,
James

Attachment Content-Type Size
v2-0003-Opportunistically-prune-to-avoid-building-a-new-p.patch application/octet-stream 3.7 KB
v2-0004-log-when-pruning-2.patch application/octet-stream 1.6 KB
v2-0001-Allow-getting-lock-before-calling-heap_page_prune.patch application/octet-stream 4.2 KB
v2-0002-log-when-pruning.patch application/octet-stream 2.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2023-10-04 21:05:37 Re: [PATCH] Add CANONICAL option to xmlserialize
Previous Message Jeff Davis 2023-10-04 20:38:15 Re: Pre-proposal: unicode normalized text