Re: Disabling Heap-Only Tuples

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disabling Heap-Only Tuples
Date: 2023-07-07 09:55:28
Message-ID: CAEze2WjVxKDdvefH3KYXSuEMHBf5ugP=GqN0DXF5_YGaOa+L-w@mail.gmail.com
Lists: pgsql-hackers

On Fri, 7 Jul 2023 at 06:53, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Jul 7, 2023 at 1:48 AM Matthias van de Meent
> <boekewurm+postgres(at)gmail(dot)com> wrote:
> >
> > On Wed, 5 Jul 2023 at 19:55, Thom Brown <thom(at)linux(dot)com> wrote:
> > >
> > > On Wed, 5 Jul 2023 at 18:05, Matthias van de Meent
> > > <boekewurm+postgres(at)gmail(dot)com> wrote:
> > > > So what were you thinking of? A session GUC? A table option?
> > >
> > > Both.
> >
> > Here's a small patch implementing a new table option max_local_update
> > (name very much bikesheddable). Value is -1 (default, disabled) or the
> > size of the table in MiB that you still want to allow to update on the
> > same page. I didn't yet go for a GUC as I think that has too little
> > control on the impact on the system.
>
> So IIUC, this parameter we can control that instead of putting the new
> version of the tuple on the same page, it should choose using
> RelationGetBufferForTuple(), and that can reduce the fragmentation
> because now if there is space then most of the updated tuple will be
> inserted in same pages. But this still can not truncate the pages
> from the heap right? because we can not guarantee that the new page
> selected by RelationGetBufferForTuple() is not from the end of the
> heap, and until we free the pages from the end of the heap, the vacuum
> can not truncate any page. Is my understanding correct?

Yes. If you don't have pages with (enough) free space for the updated
tuples in your table, or if the FSM doesn't accurately reflect the
actual state of free space in your table, this won't help (which is
also the reason why I run vacuum in the tests). It also won't help if
you don't update the tuples physically located at the end of your
table, but in the targeted workload this would introduce a bias where
new tuple versions are moved to the front of the table.
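
To make the threshold concrete, here is a minimal sketch of the kind of
check the option implies. It is not the code from the patch: the function
name is hypothetical, and it assumes the default 8 kB block size and that
"size of the table in MiB" translates into a block-number cutoff.

```python
BLCKSZ = 8192  # PostgreSQL's default block size in bytes (assumed here)


def allows_local_update(block_number: int, max_local_update_mib: int) -> bool:
    """Hypothetical check: may a tuple on this block get its new version
    on the same page, or should the update look for another page?"""
    if max_local_update_mib < 0:
        # -1 is the default and means the option is disabled.
        return True
    # Number of blocks that fit in the configured number of MiB.
    limit_blocks = max_local_update_mib * 1024 * 1024 // BLCKSZ
    # Blocks at or past the limit must place the new version elsewhere.
    return block_number < limit_blocks
```

With max_local_update = 1, this sketch treats blocks 0..127 as local and
sends updates of tuples on block 128 and later through the free-space
lookup, which is what biases new versions toward the front of the table.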

Something to note is that this may result in very bad bloat when
combined with a low fillfactor: all blocks past max_local_update will
be unable to use the space reserved by fillfactor, because FSM lookups
always take fillfactor into account, and all updates (which ignore
fillfactor when local) would go through the FSM instead. That reduces
the usable space on each block to exactly the fillfactor. So, this
might need some extra code to make sure we don't accidentally blow up
the table's size with UPDATEs when max_local_update is combined with
low fillfactors. I'm not sure where that would fit best.
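
A rough back-of-the-envelope illustration of that effect, as a sketch
only: it assumes 8 kB blocks, ignores page headers and line pointers, and
uses a hypothetical helper name. It only shows how the block count scales
when every block is capped at the fillfactor.

```python
import math

BLCKSZ = 8192  # PostgreSQL's default block size in bytes (assumed here)


def blocks_needed(live_bytes: int, fillfactor: int = 100) -> int:
    """Blocks required if each block may only be filled up to
    fillfactor percent (page overhead ignored for simplicity)."""
    usable_per_block = BLCKSZ * fillfactor // 100
    return math.ceil(live_bytes / usable_per_block)


# 100 blocks' worth of tuple data:
data = 100 * BLCKSZ
full = blocks_needed(data, fillfactor=100)
half = blocks_needed(data, fillfactor=50)
```

Under these assumptions the same data needs twice as many blocks at
fillfactor=50 once local updates can no longer use the reserved space.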

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)
