Re: New strategies for freezing, advancing relfrozenxid early

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Jeff Davis <pgsql(at)j-davis(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2023-01-27 14:48:43
Message-ID: CA+TgmoY_P8nJWJF0W7ZzQH8ueut2HPaKde4e8M2Ei_kCPAR9CA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 26, 2023 at 4:51 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> This is the kind of remark that makes me think that you don't get it.
>
> The most influential OLTP benchmark of all time is TPC-C, which has
> exactly this problem. In spades -- it's enormously disruptive. Which
> is one reason why I used it as a showcase for a lot of this work. Plus
> practical experience (like the Heroku database in the blog post I
> linked to) fully agrees with that benchmark, as far as this stuff goes
> -- that was also a busy OLTP database.
>
> Online transaction involves transactions. Right? There is presumably
> some kind of ledger, some kind of orders table. Naturally these have
> entries that age out fairly predictably. After a while, almost all the
> data is cold data. It is usually about that simple.
>
> One of the key strengths of systems like Postgres is the ability to
> inexpensively store a relatively large amount of data that has just
> about zero chance of being read, let alone modified. While at the same
> time having decent OLTP performance for the hot data. Not nearly as
> good as an in-memory system, mind you -- and yet in-memory systems
> remain largely a niche thing.

I think it's interesting that TPC-C suffers from the kind of problem
that your patch was intended to address. I hadn't considered that. But
I do not think it detracts from the basic point I was making, which is
that you need to think about the downsides of your patch, not just the
upsides.

If you want to argue that there is *no* OLTP workload that will be
harmed by freezing as aggressively as possible, then that would be a
good argument in favor of your patch, because it would be arguing that
the downside simply doesn't exist, at least for OLTP workloads. The
fact that you can think of *one particular* OLTP workload that can
benefit from the patch is just doubling down on the "my patch has an
upside" argument, which literally no one is disputing.

I don't think you can make such an argument stick, though. OLTP
workloads come in all shapes and sizes. It's pretty common to have
tables where the application inserts a bunch of data, updates it over
and over again, truncates the table, and starts over. In such a
case, aggressive freezing has to be a loss, because no freezing is
ever needed. It's also surprisingly common to have tables where a
bunch of data is inserted and then, after a bit of processing, a bunch
of rows are updated exactly once, after which the data is not modified
any further. In those kinds of cases, aggressive freezing is a great
idea if it happens after that round of updates but a poor idea if it
happens before that round of updates.
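
To make the two cases above concrete, here is a toy model. It is only a
sketch under stated assumptions, not PostgreSQL code: the function name
count_freezes, the event names, and the abstract time units are all
made up for illustration. min_age stands in loosely for
vacuum_freeze_min_age, which in reality is measured in transactions
rather than clock time. A freeze counts as "wasted" if the frozen row
version is later replaced by an update or discarded by a truncate.

```python
def count_freezes(events, min_age):
    """Return (total_freezes, wasted_freezes) for a single row.

    events: list of (time, kind) sorted by time, where kind is one of
            "insert", "update", "truncate", "vacuum" (hypothetical names).
    min_age: a vacuum freezes the current row version only if that
             version is at least this old; 0 models maximally eager freezing.
    """
    total = wasted = 0
    born = None      # creation time of the current row version, None if no row
    frozen = False   # whether the current row version has been frozen
    for time, kind in events:
        if kind in ("insert", "update"):
            if frozen:
                wasted += 1          # frozen version was replaced
            born, frozen = time, False
        elif kind == "truncate":
            if frozen:
                wasted += 1          # frozen version was thrown away
            born, frozen = None, False
        elif kind == "vacuum":
            if born is not None and not frozen and time - born >= min_age:
                total += 1
                frozen = True
    return total, wasted


# Insert, update over and over, truncate, start over.
truncate_cycle = [(0, "insert"), (1, "update"), (2, "vacuum"),
                  (3, "update"), (4, "truncate")]

# Insert, one round of updates, then never modified again.
update_once = [(0, "insert"), (1, "vacuum"), (2, "update"), (10, "vacuum")]

for name, events in [("truncate_cycle", truncate_cycle),
                     ("update_once", update_once)]:
    print(name, "eager:", count_freezes(events, min_age=0),
          "lazy:", count_freezes(events, min_age=5))
```

In this model, eager freezing does one wasted freeze in each workload,
while the lazier setting does none, and for the update-once row it
still freezes the final version. The numbers are chosen by hand to
illustrate the argument and say nothing about real costs.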

It's also pretty common to have cases where portions of the table
become very hot, get a lot of updates for a while, and then that part
of the table becomes cool and some other part of the table becomes
very hot for a while. I think it's possible that aggressive freezing
might do OK in such environments, actually. It will be a negative if
we aggressively freeze the part of the table that's currently hot, but
I think typically tables that have this access pattern are quite big,
so VACUUM isn't going to sweep through the table all that often. It
will probably freeze a lot more data-that-was-hot-a-bit-ago than it
will freeze data-that-is-hot-this-very-minute. Then again, maybe that
would happen without the patch, too. Maybe this kind of case is a wash
for your patch? I don't know.

Whatever you think of these examples, I don't see how it can be right
to suppose that *in general* freezing very aggressively has no
downsides. If that were true, then we probably wouldn't have
vacuum_freeze_min_age at all. We would always just freeze everything
ASAP. I mean, you could theorize that whoever invented that GUC is an
idiot and that they had absolutely no good reason for introducing it,
but that seems pretty ridiculous. Someone put guards against
overly-aggressive freezing into the system *for a reason* and if you
just go rip them all out, you're going to reintroduce the problems
against which they were intended to guard.

--
Robert Haas
EDB: http://www.enterprisedb.com
