Re: Disabling Heap-Only Tuples

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disabling Heap-Only Tuples
Date: 2023-09-21 22:33:35
Message-ID: 20230921223335.tumif47d25z5gx6t@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-09-19 14:50:13 -0400, Robert Haas wrote:
> On Tue, Sep 19, 2023 at 12:56 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Yea, a setting like what's discussed here seems, uh, not particularly useful
> > for achieving the goal of compacting tables. I don't think guiding this
> > through SQL makes a lot of sense. For decent compaction you'd want to scan the
> > table backwards, and move rows from the end to earlier, but stop once
> > everything is filled up. You can somewhat do that from SQL, but it's going to
> > be awkward and slow. I doubt you even want to use the normal UPDATE WAL
> > logging.
> >
> > I think having explicit compaction support in VACUUM or somewhere similar
> > would make sense, but I don't think the proposed GUC is a useful stepping
> > stone.
>
> I think there's a difference between wanting to compact instantly and
> wanting to compact over time. I think that this kind of thing is
> reasonably well-suited to the latter, if we can engineer away the
> cases where it backfires.
>
> But I know people will try to use it for instant compaction too, and
> there it's worth remembering why we removed old-style VACUUM FULL. The
> main problem is that it was mind-bogglingly slow.

I think some of the slowness was implementation related, rather than
fundamental. But more importantly, storage was something entirely different
back then than it is now.

> The other really bad problem is that it caused massive index bloat. I think
> any system that's based on moving around my tuples right now to make my
> table smaller right now is likely to have similar issues.

I think the problem of exploding WAL usage exists both for compaction being
done in VACUUM (or a dedicated command) and being done by backends. I think to
make using a facility like this realistic, you really need some form of rate
limiting, regardless of when compaction is performed. Even leaving WAL volume
aside, naively doing on-update compaction will cause lots of additional
contention on early FSM pages.
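
To make the rate-limiting point concrete, here is a minimal toy sketch of the compaction described earlier in the thread (scan from the end, move rows into earlier pages, stop once everything is filled up), with a cap on the work done per call. This is not PostgreSQL code: `Page`, `compact_step`, `truncate_empty_tail` and the `move_budget` parameter are hypothetical names made up for illustration, and the model ignores WAL, indexes, tuple visibility and the FSM entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy heap page: a fixed number of slots, some of them filled."""
    capacity: int
    tuples: list = field(default_factory=list)

    def free_slots(self) -> int:
        return self.capacity - len(self.tuples)


def compact_step(pages: list, move_budget: int) -> int:
    """Move at most `move_budget` tuples from the tail towards the front.

    The destination cursor walks forward over pages with free slots, the
    source cursor walks backward over pages that still hold tuples, and
    the step ends when the budget is spent or the cursors meet (i.e. the
    earlier pages are filled up). Returns the number of tuples moved.
    """
    moved = 0
    dst = 0
    src = len(pages) - 1
    while moved < move_budget and dst < src:
        if pages[dst].free_slots() == 0:
            dst += 1          # front page is full, try the next one
        elif not pages[src].tuples:
            src -= 1          # tail page already empty, look further back
        else:
            pages[dst].tuples.append(pages[src].tuples.pop())
            moved += 1
    return moved


def truncate_empty_tail(pages: list) -> list:
    """Drop empty pages at the end; only this step shrinks the table."""
    while pages and not pages[-1].tuples:
        pages.pop()
    return pages


if __name__ == "__main__":
    table = [Page(4, [1, 2]), Page(4, [3]), Page(4, []), Page(4, [4, 5, 6])]
    # Each call is one rate-limited round; a caller would space rounds out.
    while compact_step(table, move_budget=2):
        pass
    truncate_empty_tail(table)
    print([p.tuples for p in table])
```

In this sketch the budget is a plain per-call tuple count. A real facility would more plausibly throttle on WAL volume or I/O, and would have to do so whether the moves happen in VACUUM, a dedicated command, or in backends.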

> In the case where you're trying to compact gradually, I think there
> are potentially serious issues with index bloat, but only potentially.
> It seems like there are reasonable cases where it's fine.

> Specifically, if you have relatively few indexes per table, relatively
> few long-running transactions, and all tuples get updated on a
> semi-regular basis, I'm thinking that you're more likely to win than
> lose.

Maybe - but are you going to have a significant bloat issue in that case?
Sure, if the updates update most of the table, you are going to - but then
on-update compaction won't really be needed either, since you're going to run
out of space on pages on a regular basis.

Greetings,

Andres Freund
