Quick Links

Re: Key joins

From:	Arne Roland <arne(dot)roland(at)malkut(dot)net>
To:	Tomas Vondra <tomas(at)vondra(dot)me>, Joel Jacobson <joel(at)compiler(dot)org>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Anders Granlund <anders(dot)granlund(dot)0(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Vik Fearing <vik(at)chouppes(dot)com>
Subject:	Re: Key joins
Date:	2026-06-14 18:17:10
Message-ID:	a0e6e185-f6d9-4ae6-bf23-21bec0524564@malkut.net
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 2026-06-12 7:26 PM, Tomas Vondra wrote:
>
> On 6/12/26 02:20, Arne Roland wrote:
>> Hi Tomas,
>>
>> thank you for checking it out!
>> [...]
>>
> I guess git-am gets confused in some way, because cfbot has the same
> issue with this:
>
> https://github.com/postgresql-cfbot/postgresql/actions/runs/27379594748/job/80912606962
I see. It's a rebase issue. The patch shouldn't apply cleanly on master,
but somehow it does. My next version will be rebased on the current
master again, but I want to work more of the feedback into the patch
before posting a new version here. At least more detailed documentation
and glossary, at best already some more finely grained split up.
> [...]
>>> 6) Does it always have to be a "foreign key traversal"? Consider an
>>> example like this:
>>>
>>>      create table dim (id serial primary key);
>>>      create table f (id serial primary key, did int);
>>>      select * from f left join dim for key (id) <- f(did);
>>>
>>> Currently this fails because of no matching foreign key constraint, but
>>> isn't that pretty much the same thing as if the table actually had the
>>> foreign key (at least considering the cardinality of the join - it won't
>>> change it by adding/removing rows).
>>>
>>> I realize that'd contradict the "FOR KEY" part of this patch, but it's
>>> also one of the things that might be beneficial for the starjoin
>>> planning (in that we could maybe extend it to more cases). Although,
>>> maybe we could check those constraints during planning, just like we
>>> check the fkey_list.
>> This is a very different feature avoiding far less bugs than the
>> original key join. To give just once, consider
>>
>> SELECT *
>> FROM customers cu
>> LEFT JOIN FOR KEY orders (id) <- cu (id)
>> WHERE customers.id = 122354;
>>
> I don't follow. What does this query illustrate?
Most relevantly here it illustrates, that fks instead have additional
relevant guarantees compared to unique/primary key constraints for
making writing and reading queries easier and safer.
>> Our proof framework with proof facts could potentially be leveraged to
>> proof such constructions too. We opted to keep this patch as minimal as
>> possible, since it's already huge. In the end our conquest is to get to
>> something, which is eventually commitable. This patch has a lot things
>> to get wrong. But I think as a second step adding support of other
>> schemas and handing this information of to the planer, should be a
>> fairly straight forward thing to do. Although it would add a few lines
>> on it's own, it's step, that I seems easier than landing a minimal
>> version of this patch.
>>
> OK. I realize this probably goes against my suggestion to make the patch
> smaller. I certainly don't insist on supporting it.
I do want that though. It has advantages in using key joins in old
codebases with existing SQL without full refactoring. And it might be
ultimately helpful to the optimizer too. I just wonder about the best
order to get a simple case of the architecture in.
> [...]
>
>>> 8) I don't understand this change in functioncmds:
>>>
>>>      /* Routine kind cannot change for an existing pg_proc OID. */
>>>      Assert(procForm->prokind != PROKIND_AGGREGATE);
>>>
>>> The comment says we can't change OID, but the assert checks it's not an
>>> aggregate functions. Isn't that misleading?
>> Maybe it would be cleaner to mention additionally, that "aggregates were
>> already rejected on the pre-lock tuple"? I actually think, that comment
>> is indeed stating a helpful invariant to explain why we are concurrency
>> save here.
> I don't follow. The comment says we can't change OID of an existing
> procedure. OK, sure. But the assert checks the procedure is not an
> aggregate function. How is that comment an accurate description of the
> purpose of the assert?
it explains the invariant, why we can rely on our earlier pre-lock tuple
check. To me that seemed to be a relevant enough assumption about the
code to make it explicit in a comment.
> [...]
>>> 13) I think the main question is whether parse-analyze is the right
>>> place to handle this. I don't know. I assume one of the reasons for
>>> doing that is to get an error when defining a view, or when altering an
>>> object - e.g. like when dropping a function used by a view. Which seems
>>> reasonable, people would not like views silently broken by DDL.
>>>
>>> But it seems to be this also leads to a lot of code duplication, because
>>> the parse analysis now has to "reimplement" a lot of the stuff already
>>> done in later stages, after parse analyze. For example, we have the
>>> root->fkey_list thing, but that's not available yet.
>>>
>>> There's probably more such information - like innerrel_is_unique,
>>> rel_is_distinct_for, relation_has_unique_index_for etc.
>>>
>>> Maybe it's not worth it? What if we did this in the planner instead? How
>>> much simpler would it get? My intuition is it'd get much smaller, but
>>> maybe I'm wrong.
>>>
>>> It'd probably mean it's not necessary to expand views during parse,
>>> which was one of the problems mentioned at the beginning of this thread.
>>>
>>> I suppose we'd need to keep additional information in the plan to know
>>> which joins to check, etc.
>> If we want to use this for correctness and for proofs, we can't do it
>> plan time. We have DDL-writes to store proofs. In it's nature this is a
>> compile time feature.
>>
> Is that actually true? If we check the correctness later in the planner,
> we'd still get a failure. Why couldn't that verify all the proofs, just
> like the parse-analyze? We may need to retain more information, ofc.
>
>> Even if we don't include all of that in the first patch, this places
>> very severe limitations like the DDL issue, which would be just tooo
>> limiting to get to as an end place. I do think it's more reasonable how
>> to cut this up into more digestible pieces.
>>
> I'm not sure what "DDL issue" is, sorry :-(
>
> I may be missing something, but the main difference in behavior seems to
> be for views - the current patch verifies the proofs when creating the
> view / altering objects, and rejects those. And AFAICS after moving it
> to the planner, that would not be the case anymore.
>
> But for regular ad hoc queries there's not much difference, no? You'd
> get a failure, no matter what.

With DDL-issue I was eluding to a DDL command that attaches a query to
some object. Views, sql functions and policies all do that. I don't see
us doing that work planning time. Theoretically we could try to use a
new planner hook for that, to call the planner with an extra struct
element telling it about increased locking arrangements without the
intend to ever execute a query and abort after the proof.

However I do think this introduces more future complexity by entangling
separate concerns in the same code path, and I am not sure we are doing
ourselves a favor with this, even if we would save a hand full of lines.
I am not convinced I want to tackle that complexity on top of this
already non-trivial patch.

I intend to try to separate the DDL handling case out into it's own
patch of the patch series. While in lines of code is not that massive,
reasoning about those two individually is indeed simpler. Even just
getting to locking for the DDL-case right, isn't trivial.

I hope to get to that by the end of the now starting week.

>> Even though the planner is messy, I feel more at home there than here in
>> the parser. I could see a world were we clean up some of those out of
>> the planner to make them available earlier some time in parsing. We
>> could push those variables forward to the planner to avoid redoing the
>> work. Each of these steps would need a careful performance check and I
>> am not sure how the real stepping stones would look like.
>>
> I'm skeptical about moving this stuff to an earlier phase (so that the
> parse can see it). You can try, but it probably requires moving a lot of
> other stuff too, some of which may be expensive.

I have a high level general response and one tailored to this patch.
I'll start with this particular patch: For this patch I am mostly
concerned with not doing complex work twice: Once in the parser AND once
in the optimizer. And the below general thoughts are not easily actionable.

More generally:

Your skepticism is well placed. Every change there would warrant
benchmarking. I am just not completely convinced the planner is super
optimal in the way it works. I suspect a lot of code there has not been
optimized, because it's hard to wrap your head around it, that it
doesn't feel like an easy gain anymore.

While 1:1 pulling stuff out of the planner almost surely gives mostly
performance regressions, a clean architecture allowing more simple,
isolated performance improvements sounds beneficial, too. And I don't
want to open this can of worms right now.

I just recalled the time when Andres made the storage pluggable by
introducing a layer of pointers everywhere. The new storage interface
ended up being faster than the old.
This is not storage and I am not Andres. I just wanted to share
precedent, that a clear architecture with optimization disadvantages is
not necessarily always slower.

> regards
>
Regards
Arne

In response to

Re: Key joins at 2026-06-12 17:26:04 from Tomas Vondra

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andrey Borodin	2026-06-14 18:48:48	Re: Add wait events for server logging destination writes
Previous Message	Fujii Masao	2026-06-14 15:52:44	Re: Fix ALTER DOMAIN VALIDATE CONSTRAINT locking