From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations |
Date: | 2022-02-04 20:30:37 |
Message-ID: | CAH2-WzmUXPHpS4VPqVz7VLUkxSEy=F0bJ=2B-yBp7r1J75oHrg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Feb 4, 2022 at 2:45 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> While I agree that there's some case to be made for leaving settled
> pages well enough alone, your criterion for settled seems pretty much
> accidental.
I fully admit that I came up with the FSM heuristic with TPC-C in
mind. But you have to start somewhere.
Fortunately, the main benefit of this patch series (avoiding the
freeze cliff during anti-wraparound VACUUMs, often avoiding
anti-wraparound VACUUMs altogether) doesn't depend on the experimental
FSM patch at all. I chose to post that now because it seemed to help
with my more general point about qualitatively different pages, and
freezing at the page level.
> Imagine a system where there are two applications running,
> A and B. Application A runs all the time and all the transactions
> which it performs are short. Therefore, when a certain page is not
> modified by transaction A for a short period of time, the page will
> become all-visible and will be considered settled. Application B runs
> once a month and performs various transactions all of which are long,
> perhaps on a completely separate set of tables. While application B is
> running, pages take longer to settle not only for application B but
> also for application A. It doesn't make sense to say that the
> application is in control of the behavior when, in reality, it may be
> some completely separate application that is controlling the behavior.
Application B will already block pruning by VACUUM operations against
application A's table, and so effectively blocks recording of the
resultant free space in the FSM in your scenario. And so application A
and application B should be considered the same application already.
That's just how VACUUM works.
VACUUM isn't a passive observer of the system -- it's another
participant. It both influences and is influenced by almost everything
else in the system.
> I can see that this could have significant advantages under some
> circumstances. But I think it could easily be far worse under other
> circumstances. I mean, you can have workloads where you do some amount
> of read-write work on a table and then go read only and sequential
> scan it an infinite number of times. An algorithm that causes the
> table to be smaller at the point where we switch to read-only
> operations, even by a modest amount, wins infinitely over anything
> else. But even if you have no change in the access pattern, is it a
> good idea to allow the table to be, say, 5% larger if it means that
> correlated data is colocated? In general, probably yes. If that means
> that the table fails to fit in shared_buffers instead of fitting, no.
> If that means that the table fails to fit in the OS cache instead of
> fitting, definitely no.
5% larger seems like a lot more than would be typical, based on what
I've seen. I don't think that the regression in this scenario can be
characterized as "infinitely worse", or anything like it. On a long
enough timeline, the potential upside of something like this is nearly
unlimited -- it could avoid a huge amount of write amplification. But
the potential downside seems to be small and fixed -- which is the
point (bounding the downside). The mere possibility of getting that
big benefit (avoiding the costs from heap fragmentation) is itself a
benefit, even when it turns out not to pay off in your particular
case. It can be seen as insurance.
> And to me, that kind of effect is why it's hard to gain much
> confidence in regards to stuff like this via laboratory testing. I
> mean, I'm glad you're doing such tests. But in a laboratory test, you
> tend not to have things like a sudden and complete change in the
> workload, or a random other application sometimes sharing the machine,
> or only being on the edge of running out of memory. I think in general
> people tend to avoid such things in benchmarking scenarios, but even
> if you include stuff like this, it's hard to know what to include that
> would be representative of real life, because just about anything
> *could* happen in real life.
Then what could you have confidence in?
--
Peter Geoghegan