Re: New strategies for freezing, advancing relfrozenxid early

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2023-01-26 02:31:16
Message-ID: CAH2-WzktQPSgAHLJFZ0B5Y0ckoWkjbgPNHutezwYOwm2AccX0A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 25, 2023 at 5:49 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Sure. But significantly regressing plausible if not common workloads is
> different than knowing that there'll be some edge case where we'll do
> something worse.

That's very vague. Significant to whom, for what purpose?

> prep:
> CREATE TABLE pgbench_time_data(client_id int8 NOT NULL, ts timestamptz NOT NULL, filla int8 NOT NULL, fillb int8 not null, fillc int8 not null);
> CREATE INDEX ON pgbench_time_data(ts);
> ALTER SYSTEM SET autovacuum_naptime = '10s';
> ALTER SYSTEM SET autovacuum_vacuum_cost_delay TO -1;
> ALTER SYSTEM SET synchronous_commit = off; -- otherwise more clients are needed
>
> pgbench script, with 15 clients:
> INSERT INTO pgbench_time_data(client_id, ts, filla, fillb, fillc) VALUES (:client_id, now(), 0, 0, 0);
>
> psql session deleting old data:
> EXPLAIN ANALYZE DELETE FROM pgbench_time_data WHERE ts < now() - '120s'::interval \watch 1
>
> Realistically the time should be longer, but I didn't want to wait that long
> for the deletions to actually start.

I'll review this tomorrow.

> I reproduced both with checkpoint_timeout=5min and 1min. 1min is easier for
> impatient me.

You said "Autovacuum on average generates between 1.5x-7x as much WAL
as before". Why stop there, though? There's a *big* multiplicative
effect in play here from FPIs, obviously, so the sky's the limit. Why
not set checkpoint_timeout to 30s?
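The multiplicative effect can be sketched with back-of-envelope arithmetic. All of the numbers below are illustrative assumptions (a per-page freeze record on the order of tens of bytes, an FPI close to a full 8KB block), not measurements:

```python
# Back-of-envelope sketch of the FPI multiplier being discussed.
# FREEZE_RECORD_BYTES and FPI_BYTES are assumed, illustrative values.

FREEZE_RECORD_BYTES = 64   # assumed size of a per-page freeze WAL record
FPI_BYTES = 8000           # assumed full-page image size (8KB block, hole removed)

def wal_per_frozen_page(dirtied_since_checkpoint: bool) -> int:
    """WAL bytes attributable to freezing one page.

    If the page hasn't been written since the last checkpoint, the
    freeze itself forces a full-page image on top of the record.
    """
    if dirtied_since_checkpoint:
        return FREEZE_RECORD_BYTES
    return FREEZE_RECORD_BYTES + FPI_BYTES

# Shorter checkpoint_timeout -> more checkpoints -> more pages whose
# first post-checkpoint modification is the freeze itself -> more FPIs.
amplification = wal_per_frozen_page(False) / wal_per_frozen_page(True)
print(f"worst-case per-page amplification: ~{amplification:.0f}x")
```

Under these assumed sizes the per-page worst case is over 100x, which is why shrinking checkpoint_timeout can inflate the reported WAL ratio more or less arbitrarily.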

> I switched between vacuum_freeze_strategy_threshold=0 and
> vacuum_freeze_strategy_threshold=too-high, because it's quicker/takes less
> warmup to set up something with smaller tables.

This makes no sense to me, at all.

> The concrete setting of vacuum_freeze_strategy_threshold doesn't matter.
> Table size simply isn't a usable proxy for whether eager freezing is a good
> idea or not.

It's not supposed to be - you have it backwards. It's intended to work
as a proxy for whether lazy freezing is a bad idea, particularly in
the worst case.

There is also an effect that likely would have been protective had you
run the same test case against a larger table (and had you not lowered
vacuum_freeze_strategy_threshold from its already low default): with a
much larger table, VACUUM takes far longer relative to everything
else, so there'd be a much better chance of concurrent reuse of space
by new inserts discouraging page-level freezing.
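To make the asymmetry concrete, here is a hypothetical sketch of the strategy choice as described above: table size gates which worst case we accept, rather than predicting that eager freezing will pay off. The function name and the threshold value are assumptions for illustration, not the actual vacuumlazy.c implementation:

```python
# Hypothetical sketch of size-threshold strategy selection. The 4GB
# threshold and all names here are assumed for illustration only.

VACUUM_FREEZE_STRATEGY_THRESHOLD = 4 * 1024**3  # assumed threshold, in bytes

def choose_freeze_strategy(table_bytes: int) -> str:
    """Pick a freezing strategy for one VACUUM of one table.

    Small table: lazy freezing is cheap to correct later, since an
    eventual aggressive VACUUM of a small table isn't very painful.
    Large table: the threshold guards against the worst case of lazy
    freezing -- a huge debt of unfrozen pages coming due all at once.
    """
    if table_bytes >= VACUUM_FREEZE_STRATEGY_THRESHOLD:
        return "eager"
    return "lazy"
```

The point of the sketch is that the threshold bounds downside risk from lazy freezing on large tables; it is not a claim that eager freezing is a net win for any particular large table's workload.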

> You can have a 1TB table full of transient data, or you can have a 1TB table
> where part of the data is transient and only settles after a time. In neither
> case eager freezing is ok.

It sounds like you're not willing to accept any kind of trade-off.
How, in general, can we detect what kind of 1TB table it will be, in
the absence of user input? And in the absence of user input, why would
we prefer to default to a behavior that is highly destabilizing when
we get it wrong?

--
Peter Geoghegan
