Re: Eager page freeze criteria clarification

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Eager page freeze criteria clarification
Date: 2023-09-26 17:49:32
Message-ID: CA+TgmoY69OFYAgCdBnOuoUSqWC2KjAShJz09ejxCaosiJLiFdw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 26, 2023 at 11:11 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> That I'd like you to expand on "using the RedoRecPtr of the latest checkpoint
> rather than the LSN of the previous vacuum." - I can think of ways of doing so
> that could end up with quite different behaviour...

Yeah, me too. I'm not sure what is best.

> As long as the most extreme cases are prevented, unnecessarily freezing is imo
> far less harmful than freezing too little.
>
> I'm worried that using something as long as 100-200%
> time-between-recent-checkpoints won't handle insert-mostly workloads well,
> which IME are also workloads suffering quite badly under our current scheme -
> and which are quite common.

I wrote about this problem in my other reply and I'm curious as to
your thoughts about it. Basically, suppose we forget about all of
Melanie's tests except for three cases: (1) an insert-only table, (2)
an update-heavy workload with uniform distribution, and (3) an
update-heavy workload with skew. In case (1), freezing is good. In
case (2), freezing is bad. In case (3), freezing is good for cooler
pages and bad for hotter ones. I postulate that any
recency-of-modification threshold that handles (1) well will handle
(2) poorly, and that the only way to get both right is to take some
other factor into account. You seem to be arguing that we can just
freeze aggressively in case (2) and it won't cost much, but it doesn't
sound to me like Melanie believes that and I don't think I do either.
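
A toy model makes the postulate above concrete. The following Python
sketch is illustrative only: the function names, the assumption that
gaps between modifications of a page in case (2) are exponentially
distributed, and all of the numbers are hypothetical. None of it comes
from Melanie's tests or from PostgreSQL itself.

```python
import math


def wasted_freeze_fraction(threshold_s: float, mean_gap_s: float) -> float:
    """Case (2), uniform updates (toy assumption: exponential gaps).

    Returns the probability that a page has been idle for at least
    threshold_s when vacuum looks at it. Under this model every such page
    is modified again later, so each of those freezes is wasted work.
    """
    return math.exp(-threshold_s / mean_gap_s)


def unfrozen_backlog_pages(threshold_s: float, fill_rate_pages_per_s: float) -> float:
    """Case (1), insert-only table.

    Pages filled within the last threshold_s seconds are too recently
    modified to qualify, so they stay unfrozen until a later vacuum.
    """
    return threshold_s * fill_rate_pages_per_s


# Hypothetical workload parameters, chosen only for illustration.
MEAN_GAP_S = 300.0           # average time between updates to one page
FILL_RATE_PAGES_PER_S = 10   # pages filled per second by the insert-only load

for threshold_s in (60, 600, 6000):
    wasted = wasted_freeze_fraction(threshold_s, MEAN_GAP_S)
    backlog = unfrozen_backlog_pages(threshold_s, FILL_RATE_PAGES_PER_S)
    print(f"threshold={threshold_s:>5}s  "
          f"case (2) wasted fraction={wasted:.3f}  "
          f"case (1) unfrozen backlog={backlog:.0f} pages")
```

In this model a small threshold keeps the insert-only backlog small but
wastes most freezes on the uniformly updated table, and a large
threshold does the opposite. That is the sense in which a single
recency threshold seems unable to get both cases right.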

> > This doesn't seem completely stupid, but I fear it would behave
> > dramatically differently on a workload a little smaller than s_b vs.
> > one a little larger than s_b, and that doesn't seem good.
>
> Hm. I'm not sure that that's a real problem. In the case of a workload bigger
> than s_b, having to actually read the page again increases the cost of
> freezing later, even if the workload is just a bit bigger than s_b.

That is true, but I don't think it means that there is no problem. It
could lead to a situation where, for a while, a table never needs any
significant freezing, because we always freeze aggressively. When it
grows large enough, we suddenly stop freezing it aggressively, and now
it starts experiencing vacuums that do a whole bunch of work all at
once. A user who notices that is likely to be pretty confused about
what happened, and maybe not too happy when they find out.

--
Robert Haas
EDB: http://www.enterprisedb.com
