From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should we update the random_page_cost default value?
Date: 2025-10-08 17:23:33
Message-ID: CA+TgmoagjyStEVKymvA_kxLfKXhHUj5beaBZ_JVU9mOxrX+m2g@mail.gmail.com
Lists: pgsql-hackers
On Wed, Oct 8, 2025 at 12:24 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> I don't think there's all that much disagreement, actually. This is a
> pretty good illustration that we're using random_page_cost to account
> for things other than "I/O cost" (like the expected cache hit ratios),
> because we simply don't have a better knob for that.
I agree with that conclusion.
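To put rough numbers on that (mine, and assumed rather than measured): if a
genuinely uncached random read costs something like 40x a sequential read,
but ~90% of "random" page requests are actually served from cache at
near-zero cost, the blended multiplier comes out right around the 4.0
default. A trivial Python sketch:

    # Both inputs are assumptions for illustration, not measurements.
    raw_random_vs_seq = 40.0     # assumed cost ratio for an uncached random read
    cache_hit_fraction = 0.90    # assumed share of "random" reads served from cache

    effective_rpc = (1 - cache_hit_fraction) * raw_random_vs_seq \
                    + cache_hit_fraction * 0.0
    print(effective_rpc)         # 4.0 -- i.e. the shipped default

So random_page_cost ends up standing in for the cache hit ratio whether we
intend it to or not.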
> Isn't this somewhat what effective_cache_size was meant to do? That
> obviously does not know about what fraction of individual tables is
> cached, but it does impose a size limit.
Not really, because effective_cache_size only models the fact that
when the same index scan is executed repeatedly within a single query,
it will probably hit some of the same pages more than once. It doesn't have
any idea that anything other than an index scan might hit the same
pages more than once, and it doesn't have any idea that a query might
find data in cache as a result of previous queries. Also, when it
thinks the same page is accessed more than once, the cost of
subsequent accesses is 0.
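For anyone following along, the mechanism (as I understand it) is the
Mackert-Lohman approximation that effective_cache_size feeds into via
index_pages_fetched() in costsize.c. A rough Python sketch from memory --
the variable names are mine and the real code differs in details:

    import math

    def index_pages_fetched(tuples_fetched, table_pages, index_pages,
                            total_table_pages, effective_cache_size):
        # Mackert-Lohman style estimate of how many distinct pages a scan
        # fetching tuples_fetched tuples will touch; all sizes in pages.
        T = max(float(table_pages), 1.0)
        total = max(total_table_pages + index_pages, 1.0)
        # b: this table's pro-rated share of effective_cache_size
        b = max(math.ceil(effective_cache_size * T / total), 1.0)
        if T <= b:
            pages = 2.0 * T * tuples_fetched / (2.0 * T + tuples_fetched)
            return min(math.ceil(pages), T)
        lim = 2.0 * T * b / (2.0 * T - b)
        if tuples_fetched <= lim:
            pages = 2.0 * T * tuples_fetched / (2.0 * T + tuples_fetched)
        else:
            pages = b + (tuples_fetched - lim) * (T - b) / T
        return math.ceil(pages)

The "subsequent accesses cost 0" behavior falls out of the saturation: once
the estimated page count levels off near T, additional tuple fetches add
nothing to the page count and therefore nothing to the I/O cost.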
I could be wrong, but I kind of doubt that there is any future in
trying to generalize effective_cache_size. It's an extremely
special-purpose mechanism, and what we need here is more of a general
approach that can cut across the whole planner -- or alternatively we
can decide that things are fine and that having random_page_cost and
seq_page_cost implicitly model caching behavior is good enough.
> I think in the past we mostly assumed we can't track cache size per
> table, because we have no visibility into page cache. But maybe direct
> I/O would change this?
I think it's probably going to work out really poorly to try to use
cache contents for planning. The plan may easily last much longer than
the cache contents.
--
Robert Haas
EDB: http://www.enterprisedb.com