Re: should we have a fast-path planning for OLTP starjoins?

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: should we have a fast-path planning for OLTP starjoins?
Date: 2025-11-28 19:21:03
Message-ID: ca3f520e-b5aa-4ea1-96bd-8107010a844a@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/28/25 19:57, Robert Haas wrote:
> On Wed, Nov 26, 2025 at 1:30 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>> In my experience, restriction clauses on dimension tables are very common.
>>
>> Sure, but does that imply the inverse case (dimensions without non-join
>> restrictions) are not? I'm not sure.
>
> Obviously that depends on a lot of things, and I don't completely
> understand what the patch does and doesn't do. But, I think it would
> be sad to implement an optimization that falls over catastrophically
> when such restriction clauses are present. For example, a long time
> ago, I used to build web applications. Twenty, even thirty table joins
> were common. There certainly wouldn't be a restriction clause on every
> dimension table, but it would be an unusual situation if there were NO
> restriction clauses on ANY dimension table.

I think it depends on what you mean by "falls over catastrophically".

The patch identifies dimensions without restrictions, moves them aside,
and does regular join search on the rest of the relations (some of which
may be dimensions with restrictions).

So the presence of dimensions with restrictions does not disable the
optimization entirely, it just means the dimensions with restrictions
won't benefit from it. So in the worst case, it'll perform just like
before (assuming the optimization is kept cheap enough).

I'd love if the optimization worked for all dimensions, even for those
with restrictions. But I don't know about such optimization.

> It's maybe also worth
> mentioning that in those applications, it wasn't always a pure star
> join: one central fact table would join to a bunch of codes tables,
> but also very often to some other fact tables that had their own codes
> tables.
>

I certainly don't claim this optimization works for all queries, but
it's also not restricted to "pure" starjoins. It simply finds which
relations can be considered dimensions (i.e. joined through a FK). It
does not match the whole plan shape, or anything like that.

Yes, that may reduce the possible benefit of this optimization, because
the patch works within the "groups" generated by join_collapse_limit (so
within 8 relations by default). If those groups have few dimensions, the
patch may not help all that much.

> Point being that optimizations like this can be shown to have
> a LOT of value in individual test cases even if the circumstances in
> which they can be applied are very restricted, but lifting some of
> those restrictions can enormously expand the number of real-world
> cases to which they apply. My intuition is that a smaller gain on a
> larger class of queries will win us more praise than the reverse.

I don't disagree, but isn't this mostly what we're discussing now? I'm
trying to figure out if enough queries would benefit from this
optimization, which only applies to dimensions without restrictions.

regards

--
Tomas Vondra

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2025-11-28 19:51:41 Re: IPC/MultixactCreation on the Standby server
Previous Message Hannu Krosing 2025-11-28 19:08:12 Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements