From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should we update the random_page_cost default value?
Date: 2025-10-08 19:25:53
Message-ID: 8346d349-92b7-4725-8c07-8eb68b9d58c8@vondra.me
Lists: pgsql-hackers
On 10/8/25 19:23, Robert Haas wrote:
> On Wed, Oct 8, 2025 at 12:24 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>> I don't think there's all that much disagreement, actually. This is a
>> pretty good illustration that we're using random_page_cost to account
>> for things other than "I/O cost" (like the expected cache hit ratios),
>> because we simply don't have a better knob for that.
>
> I agree with that conclusion.
>
>> Isn't this somewhat what effective_cache_size was meant to do? That
>> obviously does not know about what fraction of individual tables is
>> cached, but it does impose size limit.
>
> Not really, because effective_cache_size only models the fact that
> when you iterate the same index scan within the execution of a single
> query, it will probably hit some pages more than once. It doesn't have
> any idea that anything other than an index scan might hit the same
> pages more than once, and it doesn't have any idea that a query might
> find data in cache as a result of previous queries. Also, when it
> thinks the same page is accessed more than once, the cost of
> subsequent accesses is 0.
>
> I could be wrong, but I kind of doubt that there is any future in
> trying to generalize effective_cache_size. It's an extremely
> special-purpose mechanism, and what we need here is more of a general
> approach that can cut across the whole planner -- or alternatively we
> can decide that things are fine and that having rpc/spc implicitly
> model caching behavior is good enough.
>
>> I think in the past we mostly assumed we can't track cache size per
>> table, because we have no visibility into page cache. But maybe direct
>> I/O would change this?
>
> I think it's probably going to work out really poorly to try to use
> cache contents for planning. The plan may easily last much longer than
> the cache contents.
>
Why wouldn't that trigger invalidations / replanning just like other
types of stats? I imagine we'd regularly collect stats about what's
cached, etc., and we'd invalidate stale plans just like after ANALYZE.
Just a random idea, though. We'd need some sort of summary anyway; it's
not plausible each backend would collect this info on its own.
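To make that concrete, a purely hypothetical sketch (all names
invented) of how a sampled per-relation cache fraction might enter
page costing, instead of being folded into random_page_cost:

#include <math.h>
#include <stdio.h>

/* Hypothetical: expected cost of one random page access, given the
 * fraction of the relation believed to be cached. "cached_frac" would
 * come from a periodically sampled shared summary (with direct I/O,
 * shared_buffers would be the whole picture), and plans costed against
 * a stale value would get invalidated much like after ANALYZE. */
static double
effective_page_cost(double random_page_cost, double cached_frac)
{
    const double memory_page_cost = 0.01; /* cached page: ~CPU cost only */

    /* clamp the sampled fraction to [0, 1] */
    cached_frac = fmax(0.0, fmin(1.0, cached_frac));

    /* cached hits are nearly free, misses pay the full I/O price */
    return cached_frac * memory_page_cost +
           (1.0 - cached_frac) * random_page_cost;
}

int
main(void)
{
    /* a 90%-cached table: ~0.41 vs. the nominal 4.0 */
    printf("%.3f\n", effective_page_cost(4.0, 0.9));
    return 0;
}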
regards
--
Tomas Vondra