Re: Opportunistically pruning page before update

From: James Coleman <jtc331(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Opportunistically pruning page before update
Date: 2024-01-30 02:39:18
Message-ID: CAAaqYe_f-a0agsgvq3YWybWh5Y3qqT7x+t1rVSBTgx+2sfwZjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 26, 2024 at 8:33 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Tue, Jan 23, 2024 at 2:46 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Jan 23, 2024 at 7:18 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Jan 22, 2024 at 8:21 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > >
> > > > See rebased patch attached.
> > >
> > > I just realized I left a change in during the rebase that wasn't necessary.
> > >
> > > v4 attached.
> >
> > I have noticed that you are performing the opportunistic pruning after
> > we decided that the updated tuple can not fit in the current page and
> > then we are performing the pruning on the new target page. Why don't
> > we first perform the pruning on the existing page of the tuple itself?
> > Or this is already being done before this patch? I could not find
> > such existing pruning so got this question because such pruning can
> > convert many non-hot updates to the HOT update right?
>
> First off I noticed that I accidentally sent a different version of
> the patch I'd originally worked on. Here's the one from the proper
> branch. It's still similar, but I want to make sure the right one is
> being reviewed.
>
> I'm working on a demo case for updates (to go along with the insert
> case I sent earlier) to test out your question, and I'll reply when I
> have that.

All right, getting all this loaded back into my head, as you noted
earlier the patch currently implements points 1 and 2 of my list of
possible improvements:

> 1. The most trivial case where this is useful is INSERT: we have a
> target page, and it may have dead tuples, so trying to prune may
> result in us being able to use the target page rather than getting a
> new page.
> 2. The next most trivial case is where UPDATE (potentially after
> failing to find space for a HOT tuple on the source tuple's page);
> much like the INSERT case our backend's target page may benefit from
> pruning.

What you're describing above would be implementing (at least part of) point 3:

> 3. A more complex UPDATE case occurs when we check the tuple's page
> for space in order to insert a HOT tuple and fail to find enough
> space. While we've already opportunistically pruned the page on
> initial read of the tuple, in complex queries this might be some time
> in the past, so it may be worth attempting again.
> ...

If we try to design a simple test case for updates (like my insert
test case above) we might end up with something like:

drop table if exists foo;
create table foo(pk serial primary key, t text);
insert into foo(t) select repeat('a', 250) from generate_series(1, 27);
select pg_relation_size('foo');
delete from foo where pk = 1;
update foo set t = repeat('b', 250) where pk = 2;
select pg_relation_size('foo');

But that actually works as expected on master, because we call
heap_page_prune_opt from heapam_index_fetch_tuple as part of the index
scan that drives the update query.

I was theorizing that if there are concurrent writes to the page we
might being able to trigger the need to re-prune a page in the for
loop in heap_update(), and I tried to both regular pgbench and a
custom pgbench script with inserts/deletes/updates (including some
artificial delays).

What I concluded what this isn't isn't likely to be fruitful: we need
the buffer to be local to our backend (no other pins) to be able to
clean it, but since we've already pruned it on read, we need to have
had another backend modify the page (and dropped its pin!) between our
read and our write.

If someone believes there's a scenario that would demonstrate
otherwise, I would of course be interested to hear any ideas, but at
this point I think it's probably worth focusing on the first two cases
this patch already addresses.

Regards,
James Coleman

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-01-30 03:14:40 Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo
Previous Message Richard Guo 2024-01-30 02:33:19 Re: Support run-time partition pruning for hash join