Re: should we have a fast-path planning for OLTP starjoins?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, James Hunter <james(dot)hunter(dot)pg(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: should we have a fast-path planning for OLTP starjoins?
Date: 2025-11-15 02:52:40
Message-ID: aRfq-OHE2uAKVdn5@momjian.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 15, 2025 at 01:41:04AM +0100, Tomas Vondra wrote:
> > And then there is the problem of detecting when this happens.
> >
> > I think my big question is, when we join A->B->C->D, we do a lot of work
> > in the optimizer to figure out what order to use, but when we do A->B,
> > A->C, A->D, why are we spending the same amount of optimizer effort?
> >
>
> I'm sorry, I don't quite understand what's the question here. What does
> A->B->C->D mean, exactly?

It means table A joins B, and B joins C, and C joins D. I can see that
as a much harder problem, and one we have code for in the optimizer,
than A joining to B, C, and D.

> > Could we just order B, C, D in order of which have restrictions, or
> > based on size? I assume we can't because we don't know if A->C or
> > another join would increase the number of rows, and this patch is saying
> > if there is a foreign key relationship, they can't increase the rows so
> > just short-circuit and order them simply. Is that accurate?
> >
> > Crazy question, if we have A->B, A->C, and A->D, why can't we just sort
> > B,C,D based in increasing order of adding rows to the result, and just
> > use that ordering, without requiring foreign keys?
> >
>
> Yeah, that's mostly the idea one of the comments in the patch suggests.
> To do the joins that reduce the cardinality first, then the dimensions,
> and finally the joins that increase the cardinality.

Yes, I saw that.

> However, the examples make it look like we're joining pairs of tables,
> but that's not necessarily true. The join may be between F and relation
> that is really a join itself. And now you need to know how this join
> changes the cardinality, which is more expensive than when only looking
> at joins of pairs of tables.

Yes, if the join is from F to D, and D joins to something else,
especially if it is not a foreign key where we know there is only one
match, I think we have to give up and go back to the normal optimizer
process.

> So I think we'd need to first identify these "independent join groups"
> first. But that seems non-trivial, because we don't know which table to
> start from (we don't know what's the "F" table).

Do we easily know how many relations each relation joins to? Does that
help us?

> I'm sure there's an algorithm to decompose the join tree like this, but
> I'm not sure how expensive / invasive it'd be. The premise of this patch
> is that it's a cheap optimization, that doesn't need to do this.

Yeah, I can see expense being an issue, which explains, as you said, why
many other databases have star join "flags", which ideally we can avoid
since very few people are going to use the flag.

> It simply "peels off" the dimensions from the join, based on things it
> can prove locally, without having to decompose the whole jointree.

I see, and I think understand better now. Thanks.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nico Williams 2025-11-15 06:15:19 Re: should we have a fast-path planning for OLTP starjoins?
Previous Message John Naylor 2025-11-15 02:47:11 Re: tuple radix sort