Re: Eager page freeze criteria clarification

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Eager page freeze criteria clarification
Date: 2023-09-06 05:09:00
Message-ID: 20230906050900.ig5d55fpiwsjqtil@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-08-28 12:26:01 -0400, Robert Haas wrote:
> On Mon, Aug 28, 2023 at 10:00 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > For the second goal, I've relied on past data to predict future
> > behavior, so I tried several criteria to estimate the likelihood that a
> > page will not be imminently modified. What was most effective was
> > Andres' suggestion of comparing the page LSN to the insert LSN at the
> > end of the last vacuum of that table; this approximates whether the page
> > has been recently modified, which is a decent proxy for whether it'll be
> > modified in the future. To do this, we need to save that insert LSN
> > somewhere. In the attached WIP patch, I saved it in the table stats, for
> > now -- knowing that those are not crash-safe.
>
> I wonder what the real plan here is for where to store this. It's not
> obvious that we need this to be crash-safe; it's after all only for
> use by a heuristic, and there's no actual breakage if the heuristic
> goes wrong. At the same time, it doesn't exactly feel like a
> statistic.

I'm not certain either. This is generally something that's not satisfying
right now - although IMO not necessarily for the reason you mention. Given
that we already store, e.g., the time of the last autovacuum in the stats, I
don't see a problem also storing a corresponding LSN. My issue is more that
this kind of information not being crashsafe is really problematic - it's a
well known issue that autovacuum just doesn't do anything for a while after a
crash-restart (or pitr restore or ...), for example.

Given that all the other datapoints are stored in the stats, I think just
storing the LSNs alongside is reasonable.

> Then there's the question of whether it's the right metric. My first
> reaction is to think that it sounds pretty good. One thing I really
> like about it is that if the table is being vacuumed frequently, then
> we freeze less aggressively, and if the table is being vacuumed
> infrequently, then we freeze more aggressively. That seems like a very
> desirable property. It also seems broadly good that this metric
> doesn't really care about reads. If there are a lot of reads on the
> system, or no reads at all, it doesn't really change the chances that
> a certain page is going to be written again soon, and since reads
> don't change the insert LSN, here again it seems to do the right
> thing. I'm a little less clear about whether it's good that it doesn't
> really depend on wall-clock time.

Yea, it'd be useful to have a reasonably approximate wall clock time for the
last modification of a page. We just don't have infrastructure for determining
that. We'd need an LSN->time mapping (xid->time wouldn't be particularly
useful, I think).

A very rough approximate modification time can be computed by assuming an even
rate of WAL generation, and using the LSN at the time of the last vacuum and
the time of the last vacuum, to compute the approximate age.

For a while I thought that'd not give us anything that just using LSNs gives
us, but I think it might allow coming up with a better cutoff logic: Instead
of using a cutoff like "page LSN is older than 10% of the LSNs since the last
vacuum of the table", it would allow us to approximate "page has not been
modified in the last 15 seconds" or such. I think that might help avoid
unnecessary freezing on tables with very frequent vacuuming.

> Certainly, that's desirable from the point of view of not wanting to have to
> measure wall-clock time in places where we otherwise wouldn't have to, which
> tends to end up being expensive.

IMO the bigger issue is that we don't want to store a timestamp on each page.

> > Page Freezes/Page Frozen (less is better)

As, I think, Robert mentioned downthread, I'm not sure this is a useful thing
to judge the different heuristics by. If the number of pages frozen is small,
the ratio quickly can be very large, without the freezing having a negative
effect.

I suspect interesting top-level figures to compare would be:

1) WAL volume (to judge the amount of unnecessary FPIs)

2) data reads + writes (to see the effect of repeated vacuuming of the same
blocks)

3) number of vacuums and/or time spent vacuuming (freezing less aggressively
might increase the number of vacuums due to anti-wrap vacuums, at the same
time, freezing too aggressively could lead to vacuums taking too long)

4) throughput of the workload (to see potential regressions due to vacuuming
overhead)

5) for transactional workloads: p99 latency (to see if vacuuming increases
commit latency and such, just using average tends to hide too much)

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Suraj Kharage 2023-09-06 05:18:32 [Regression] Incorrect filename in test case comment
Previous Message Drouvot, Bertrand 2023-09-06 04:43:24 Re: Autogenerate some wait events code and documentation