Re: Disabling Heap-Only Tuples

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disabling Heap-Only Tuples
Date: 2023-09-19 17:33:22
Message-ID: CAEze2WiDqwVsTa_Ln=wG535=GSNF+SKDeeTJAXT_vmTYsYptmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 19 Sept 2023 at 18:56, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2023-09-19 18:30:44 +0200, Alvaro Herrera wrote:
> > This makes me think that maybe the logic needs to be a little more
> > complex to avoid the problem you describe: if an UPDATE is prevented
> > from being HOT because of this setting, but then it goes and consults
> > FSM and it gives the update a higher block number than the tuple's
> > current block (or it fails to give a block number at all so it is forced
> > to extend the relation), then the update should give up on that strategy
> > and use a HOT update after all. (I have not read the actual patch;
> > maybe it already does this? It sounds kinda obvious.)
>
> Yea, a setting like what's discussed here seems, uh, not particularly useful
> for achieving the goal of compacting tables. I don't think guiding this
> through SQL makes a lot of sense. For decent compaction you'd want to scan the
> table backwards, and move rows from the end to earlier, but stop once
> everything is filled up. You can somewhat do that from SQL, but it's going to
> be awkward and slow. I doubt you even want to use the normal UPDATE WAL
> logging.

We can't move tuples around (or, not that I know of) without using a
transaction ID to control the visibility of the two locations of that
tuple. Doing table compaction would thus likely require using
transactions to move these tuples around. Using a single backend and
bulk operations, it'll still lock each tuple that is being moved, and
that can be noticed by user DML queries. I'd rather make the user's
queries move the data around than this long-duration, locking
background operation.

> I think having explicit compaction support in VACUUM or somewhere similar
> would make sense, but I don't think the proposed GUC is a useful stepping
> stone.

The point of this GUC is that the compaction can happen organically in
the user's UPDATE workflow, so that there is no long locking operation
going on (as you would see with VACUUM FULL / CLUSTER / pg_repack).

> > > But without any kind of auto-tuning, in my opinion, it's a fairly poor
> > > feature. Sure, some people will get use out of it, if they're
> > > sufficiently knowledgeable and sufficiently determined. But I think
> > > for most people in most situations, it will be a struggle.
>
> Indeed. I think it'd often just explode table and index sizes, because HOT
> pruning won't be able to make usable space in pages anymore (due to dead
> items).

You seem to misunderstand the latest patch. It explicitly only blocks
local updates if the update can then move the new tuple to an earlier
page. If that is not possible, then it'll insert locally (assuming
that is still possible) and HOT can then still apply.

And yes, moving tuples to earlier pages will indeed increase index
bloat, because it does create dead tuples where previously we could've
applied HOT. But we do have VACUUM and REINDEX CONCURRENTLY to clean
that up without serious long-duration stop-the-world actions, while
the other builtin cleanup methods don't.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-09-19 17:46:36 Re: Projection pushdown to index access method
Previous Message Robert Haas 2023-09-19 17:10:29 Re: Projection pushdown to index access method