Re: Disabling Heap-Only Tuples

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disabling Heap-Only Tuples
Date: 2023-07-07 10:57:45
Message-ID: CAFiTN-t_DgOwywTEQr60fBihNDsqYyLYe5CEE67dZjsS978Asw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 7, 2023 at 3:48 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 7/7/23 11:55, Matthias van de Meent wrote:
> > On Fri, 7 Jul 2023 at 06:53, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >>
> >> On Fri, Jul 7, 2023 at 1:48 AM Matthias van de Meent
> >> <boekewurm+postgres(at)gmail(dot)com> wrote:
> >>>
> >>> On Wed, 5 Jul 2023 at 19:55, Thom Brown <thom(at)linux(dot)com> wrote:
> >>>>
> >>>> On Wed, 5 Jul 2023 at 18:05, Matthias van de Meent
> >>>> <boekewurm+postgres(at)gmail(dot)com> wrote:
> >>>>> So what were you thinking of? A session GUC? A table option?
> >>>>
> >>>> Both.
> >>>
> >>> Here's a small patch implementing a new table option max_local_update
> >>> (name very much bikesheddable). Value is -1 (default, disabled) or the
> >>> size of the table in MiB that you still want to allow to update on the
> >>> same page. I didn't yet go for a GUC as I think that has too little
> >>> control on the impact on the system.
> >>
> >> So IIUC, with this parameter we can control that, instead of putting
> >> the new version of the tuple on the same page, the update chooses a
> >> page using RelationGetBufferForTuple(), and that can reduce
> >> fragmentation because, if there is free space, most of the updated
> >> tuples will be inserted into the same pages. But this still cannot
> >> truncate pages from the heap, right? We cannot guarantee that the
> >> page selected by RelationGetBufferForTuple() is not at the end of
> >> the heap, and until the pages at the end of the heap are freed,
> >> vacuum cannot truncate any page. Is my understanding correct?
> >
> > Yes. If you don't have pages with (enough) free space for the updated
> > tuples in your table, or if the FSM doesn't accurately reflect the
> > actual state of free space in your table, this won't help (which is
> > also the reason why I run vacuum in the tests). It also won't help if
> > you don't update the tuples physically located at the end of your
> > table, but in the targeted workload this would introduce a bias where
> > new tuple versions are moved to the front of the table.
> >
> > Something to note is that this may result in very bad bloat when this
> > is combined with a low fillfactor: All blocks past max_local_update
> > will be unable to use space reserved by fillfactor because FSM lookups
> > always take fillfactor into account, and all updates (which ignore
> > fillfactor when local) would go through the FSM instead, thus reducing
> > the space available on each block to exactly the fillfactor. So, this
> > might need some extra code to make sure we don't accidentally blow up
> > the table's size with UPDATEs when max_local_update is combined with
> > low fillfactors. I'm not sure where that would fit best.
> >
>
> I know the thread started as "let's disable HOT" and this essentially
> just proposes to do that using a table option. But I wonder if that's
> far too simple to be reliable, because hoping RelationGetBufferForTuple
> happens to do the right thing does not seem great.
>
> I wonder if we should invent some definition of "strategy" that would
> tell RelationGetBufferForTuple what it should aim for ...
>
> I'm imagining either a table option with a couple possible values
> (default, non-hot, first-page, ...) or maybe something even more
> elaborate (perhaps even a callback?).
>
> Now, it's not my intention to hijack this thread, but this discussion
> reminds me of one of the ideas from my "BRIN improvements" talk, about
> maybe using BRIN indexes for routing. UPDATEs may be a major issue for
> BRIN, making them gradually worse over time. If we could "tell"
> RelationGetBufferForTuple() which buffers are more suitable (by looking
> at an index, histogram or some approximate mapping), that might help.

IMHO that seems like the right direction for making this feature
useful. Otherwise, just forcing the update to select a page through
RelationGetBufferForTuple(), without giving that function any input or
direction, can behave pretty randomly. In fact, there should be some
way to ask for the new tuple to be inserted into the lowest-numbered
blocks first (provided they have free space); with that, vacuum might
get an opportunity to truncate some heap pages.
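
To illustrate the "smaller block number first" idea, here is a toy
sketch in Python. It is not PostgreSQL code and does not model the FSM,
fillfactor, or RelationGetBufferForTuple() itself; PAGE_CAPACITY and all
function names are made up for the example, and old tuple versions are
assumed to be reclaimed immediately, which in a real heap would need
vacuum.

```python
from typing import List, Optional

PAGE_CAPACITY = 4  # toy value: tuples per page (hypothetical, not a PostgreSQL constant)


def lowest_block_with_space(pages: List[int]) -> Optional[int]:
    """Return the lowest-numbered page that can hold one more tuple."""
    for blkno, used in enumerate(pages):
        if used < PAGE_CAPACITY:
            return blkno
    return None


def update_tuple(pages: List[int], old_blkno: int) -> int:
    """Move one tuple version from old_blkno to the lowest page with room.

    Assumption: the old version is reclaimed at once. A real heap would
    only free that space after pruning or vacuum.
    """
    pages[old_blkno] -= 1
    target = lowest_block_with_space(pages)
    if target is None:  # no free space anywhere: extend the heap
        pages.append(0)
        target = len(pages) - 1
    pages[target] += 1
    return target


def truncatable_pages(pages: List[int]) -> int:
    """Count the empty pages at the end of the heap."""
    count = 0
    for used in reversed(pages):
        if used != 0:
            break
        count += 1
    return count


# Six pages: the first four are half empty, the last two hold one tuple each.
heap = [2, 2, 2, 2, 1, 1]
update_tuple(heap, 5)
update_tuple(heap, 4)
print(heap, truncatable_pages(heap))  # [4, 2, 2, 2, 0, 0] 2
```

In this model, updating the two tuples at the end of the heap moves
their new versions to block 0 and leaves the last two pages empty, which
is the situation where vacuum could truncate. A policy that picks any
page with free space gives no such guarantee.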

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
