Re: Disabling Heap-Only Tuples

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disabling Heap-Only Tuples
Date: 2023-09-20 14:02:23
Message-ID: ZQr7b6V90iZPVPqZ@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Laurenz Albe (laurenz(dot)albe(at)cybertec(dot)at) wrote:
> On Tue, 2023-09-19 at 12:52 -0400, Robert Haas wrote:
> > On Tue, Sep 19, 2023 at 12:30 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > > I was thinking something vaguely like "a table size that's roughly what
> > > an optimal autovacuuming schedule would leave the table at" assuming 0.2
> > > vacuum_scale_factor.  You would determine the absolute minimum size for
> > > the table given the current live tuples in the table, then add 20% to
> > > account for a steady state of dead tuples and vacuumed space.  So it's
> > > not 1.2x of the "current" table size at the time the local_update_limit
> > > feature is installed, but 1.2x of the optimal table size.
> >
> > Right, that would be great. And honestly if that's something we can
> > figure out, then why does the parameter even need to be an integer
> > instead of a Boolean? If the system knows the optimal table size, then
> > the user can just say "try to compact this table" and need not say to
> > what size. The 1.2 multiplier is probably situation dependent and
> > maybe the multiplier should indeed be a configuration parameter, but
> > we would be way better off if the absolute size didn't need to be.
>
> I don't have high hopes for a reliable way to automatically determine
> the target table size. There are these queries floating around to estimate
> table bloat, which are used by various monitoring systems. I find that they
> get it right a lot of the time, but sometimes they get it wrong. Perhaps
> we can do better than that, but I vastly prefer a setting that I can control
> (even at the danger that I can misconfigure it) over an automatism that I
> cannot control and that sometimes gets it wrong.

I'm not completely against a setting, but I would certainly prefer that
this be done in a more automated way, if possible.

To that end, my thought would be some kind of regular review of the FSM,
or maybe an actual review by walking through the table (as VACUUM already
does...) to get an idea of where there is free space and where there are
used-up areas, and then use that to inform various operations (either
VACUUM itself or perhaps UPDATEs from SQL). We could also try to 'start
simple' and look for cases where we can say "well, that's definitely not
good" and address those initially.

Consider (imagine as a histogram; X is used space, . is empty):

1: XXXXXXX
2: XXX
3: XXXXXXX
4: XXX
5: X
6: X
7: .
8: .
9: .
10: .
11: .
12: .
13: .
14: .
15: .
16: .
17: .
18: .
19: .
20: X

Well, obviously there's tons of free space in the middle, and if we could
just move those few tuples/pages/whatever near the end to earlier in the
table, then we'd be able to truncate off and shrink a lot of the table.
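
Just to make that concrete, here's a toy model (plain C, nothing from the
backend; the per-page fractions simply mirror the histogram above) that
finds the smallest prefix of the table whose capacity could hold all of
the live data. Everything past that point is what we'd hope to be able
to truncate away:

/*
 * Toy model, not PostgreSQL internals: given per-page used-space
 * fractions (1.0 = full page), find the smallest prefix of the table
 * that could hold every live tuple if the tail were moved forward.
 */
#include <stdio.h>

#define NPAGES 20

int
main(void)
{
    /* used-space fraction per page, mirroring the histogram above */
    double used[NPAGES] = {
        0.7, 0.3, 0.7, 0.3, 0.1, 0.1, 0.0, 0.0, 0.0, 0.0,
        0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1
    };
    double total_used = 0.0;
    double capacity = 0.0;
    int    target = NPAGES;

    for (int i = 0; i < NPAGES; i++)
        total_used += used[i];

    /* walk forward until the prefix could hold all of the live data */
    for (int i = 0; i < NPAGES; i++)
    {
        capacity += 1.0;
        if (capacity >= total_used)
        {
            target = i + 1;
            break;
        }
    }

    printf("live data: %.1f pages worth; fits in the first %d of %d pages\n",
           total_used, target, NPAGES);
    return 0;
}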

> I like Alvaro's idea to automatically reset "local_update_limit" when the
> table has shrunk enough. Why not perform that task during vacuum truncation?
> If vacuum truncation has taken place, check if the table size is no bigger
> than "local_update_limit" * (1 + "autovacuum_vacuum_scale_factor"), and if
> it is no bigger, reset "local_update_limit". That way, we would not have
> to worry about a lock, because vacuum truncation already has the table locked.

Agreed on this too. Essentially, once we've done some truncation, we
should 'reset'.
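
Just to spell out that check (a sketch only; the names here are made up
and the real test would live in vacuum's truncation path, with 0.2 being
the default autovacuum_vacuum_scale_factor):

#include <stdbool.h>
#include <stdio.h>

/* true if the table has shrunk enough that the limit could be reset */
static bool
limit_can_be_reset(long table_pages,
                   long local_update_limit,
                   double autovacuum_vacuum_scale_factor)
{
    double threshold = local_update_limit *
        (1.0 + autovacuum_vacuum_scale_factor);

    return table_pages <= threshold;
}

int
main(void)
{
    /* e.g. a limit of 1000 pages and the default 0.2 => threshold 1200 */
    printf("1100 pages: %s\n",
           limit_can_be_reset(1100, 1000, 0.2) ? "reset" : "keep");
    printf("1500 pages: %s\n",
           limit_can_be_reset(1500, 1000, 0.2) ? "reset" : "keep");
    return 0;
}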

I've no doubt that there's some better algorithm for this, but I keep
coming back to something as simple as: if the entire second half of the
table would fit into the entire first half, then the table is twice as
large as it needs to be, and perhaps that triggers a preference for
placing tuples in the first half of the table. As for what handles
this, maybe both UPDATE and VACUUM would be able to, but prefer UPDATE
to do so and only have VACUUM kick in once the tuples at the end of the
relation are older than some xid-based threshold (perhaps all of the
tuples on a given page would have to be old enough?).
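
As a standalone illustration of that trigger (again just a toy, not
backend code), the check really is no more than comparing the used space
in the back half against the free space in the front half:

#include <stdbool.h>
#include <stdio.h>

/* does the data in the second half fit into the free space in the first? */
static bool
second_half_fits_in_first_half(const double *used, int npages)
{
    int    half = npages / 2;
    double free_in_front = 0.0;
    double used_in_back = 0.0;

    for (int i = 0; i < half; i++)
        free_in_front += 1.0 - used[i];
    for (int i = half; i < npages; i++)
        used_in_back += used[i];

    return used_in_back <= free_in_front;
}

int
main(void)
{
    /* same per-page fractions as the histogram further up */
    double used[20] = {
        0.7, 0.3, 0.7, 0.3, 0.1, 0.1, 0.0, 0.0, 0.0, 0.0,
        0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1
    };

    printf("prefer the first half? %s\n",
           second_half_fits_in_first_half(used, 20) ? "yes" : "no");
    return 0;
}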

While it feels a bit 'late' in terms of when to start taking this
action, we could possibly start with 'all frozen' as an indicator of
'old enough'? Then, between the FSM and the VM, VACUUM could decide
that pages at the end of the table should be moved earlier and go
about making that happen. I'm a bit concerned about the risk of some
kind of deadlock or similar between VACUUM and user processes if we're
trying to do this with multiple tuples at a time, but hopefully we
could come up with a way to avoid that. This process would naturally
have to involve updating the indexes and the VM and FSM as the tuples
get moved.

In terms of what this would look like, my thinking is that VACUUM would
scan the table and the FSM, and perhaps the VM, and then say "ok, this
table is bigger than it needs to be, let's try to fix that" and set a
flag on the table. A user could also explicitly set the flag, to give
them control over this process happening sooner or not happening at all.
The flag would tell UPDATE to prefer earlier pages over the current page
or HOT updates, and VACUUM would also look at the flag to decide if it
should try to move tuples earlier itself. Then, once a VACUUM has been
able to come through and truncate the table, the flag would be reset
(maybe even if the user set it? Or perhaps we'd have a way for the user
to indicate whether they want VACUUM to reset the flag on truncation
or not).
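
To sketch that flag life cycle (standalone C, with invented names;
"compact_requested" just stands in for whatever per-table flag or
reloption we'd actually end up with):

#include <stdbool.h>
#include <stdio.h>

typedef struct TableState
{
    long pages;             /* current length of the table */
    long needed_pages;      /* what the live data would fit into */
    bool compact_requested; /* the flag: set by VACUUM or by the user */
    bool set_by_user;       /* remember who set it */
} TableState;

/* VACUUM notices the table is bigger than it needs to be and sets the flag */
static void
vacuum_evaluate(TableState *t)
{
    if (!t->compact_requested && t->pages > 2 * t->needed_pages)
    {
        t->compact_requested = true;
        t->set_by_user = false;
    }
}

/* UPDATE consults the flag to decide whether to prefer earlier pages */
static bool
update_prefers_earlier_pages(const TableState *t)
{
    return t->compact_requested;
}

/*
 * After a successful truncation VACUUM clears the flag; whether it also
 * clears a user-set flag is exactly the open policy question above.
 */
static void
vacuum_after_truncate(TableState *t, long new_pages)
{
    t->pages = new_pages;
    if (!t->set_by_user)
        t->compact_requested = false;
}

int
main(void)
{
    TableState t = {1000, 300, false, false};

    vacuum_evaluate(&t);
    printf("after vacuum check: prefer earlier pages? %s\n",
           update_prefers_earlier_pages(&t) ? "yes" : "no");

    vacuum_after_truncate(&t, 320);
    printf("after truncation:   prefer earlier pages? %s\n",
           update_prefers_earlier_pages(&t) ? "yes" : "no");
    return 0;
}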

Broadly speaking, I agree with the points made that we should be trying
to design a way for this to all happen both automatically and from a
background process, without requiring the user to issue UPDATE statements
to make it happen. That said, I do like the idea of making it work with
user-issued UPDATE statements if the right conditions are met, to avoid
the case of VACUUM getting in the way of user activity due to locking or
creating excess writes.

Thanks,

Stephen
