From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations |
Date: | 2022-03-01 21:46:40 |
Message-ID: | CA+TgmoZv6q3wxxUm=R+5gB4_FmP27pjnGsWRdEJekC0uQniXcw@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Feb 20, 2022 at 3:27 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > I think that the idea has potential, but I don't think that I
> > understand yet what the *exact* algorithm is.
>
> The algorithm seems to exploit a natural tendency that Andres once
> described in a blog post about his snapshot scalability work [1]. To a
> surprising extent, we can usefully bucket all tuples/pages into two
> simple categories:
>
> 1. Very, very old ("infinitely old" for all practical purposes).
>
> 2. Very, very new.
>
> There doesn't seem to be much need for a third "in-between" category
> in practice. This seems to be at least approximately true all of the
> time.
>
> Perhaps Andres wouldn't agree with this very general statement -- he
> actually said something more specific. I for one believe that the
> point he made generalizes surprisingly well, though. I have my own
> theories about why this appears to be true. (Executive summary: power
> laws are weird, and it seems as if the sparsity-of-effects principle
> makes it easy to bucket things at the highest level, in a way that
> generalizes well across disparate workloads.)

I think that this is not really a description of an algorithm -- and I
think that it is far from clear that the third "in-between" category
does not need to exist.

> Remember when I got excited about how my big TPC-C benchmark run
> showed a predictable, tick/tock style pattern across VACUUM operations
> against the order and order lines table [2]? It seemed very
> significant to me that the OldestXmin of VACUUM operation n
> consistently went on to become the new relfrozenxid for the same table
> in VACUUM operation n + 1. It wasn't exactly the same XID, but very
> close to it (within the range of noise). This pattern was clearly
> present, even though VACUUM operation n + 1 might happen as long as 4
> or 5 hours after VACUUM operation n (this was a big table).

I think findings like this are very unconvincing. TPC-C (or any
benchmark really) is so simple as to be a terrible proxy for what
vacuuming is going to look like on real-world systems. Like, it's nice
that it works, and it shows that something's working, but it doesn't
demonstrate that the patch is making the right trade-offs overall.

--
Robert Haas
EDB: http://www.enterprisedb.com