Re: Is there value in having optimizer stats for joins/foreignkeys?

From: Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrei Lepikhov <lepihov(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, hs(at)cybertec(dot)at, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Is there value in having optimizer stats for joins/foreignkeys?
Date: 2026-05-27 17:49:44
Message-ID: CAK98qZ0_4Kodyemk0Tdmew=YG8jeHQ=wzOydQws4s99A1X2h5g@mail.gmail.com
Lists: pgsql-hackers

Hi Tom and Tomas,

Thank you so much for the feedback!

On Mon, May 25, 2026 at 8:04 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Tomas Vondra <tomas(at)vondra(dot)me> writes:
> > On 5/21/26 22:25, Tom Lane wrote:
> >> I don't love stxkeyrefs[]. I wonder if it's time to throw away
> >> stxkeys[], represent all the target columns as regular expression
> >> trees in stxexprs, and then special-case columns that are simple
> >> Vars where appropriate at execution.
> >> (In the same vein, I dislike the grammar's separation of plain
> >> columns from expressions; I'd like to replace stats_params
> >> with expr_list and sort it all out later. But perhaps that's
> >> material for a separate patch.)
>
> > FWIW the extended stats copied this from pg_index, which also stores
> > keys and expressions separately. I suppose there was a reason for that,
> > most likely performance - is cheaper to compare attnums than
> > expressions, and plain keys are much more common.
>
> I think I might be to blame for the separate storage of indexprs.
> If so, the motivation was to avoid breakage of older code that only
> knew about indkey[]. (Of course, such code would necessarily fail
> on indexes with expressions, but we wanted to avoid breakage for the
> common case of no-expressions.) I don't think that consideration is
> nearly as pressing for extended stats. There's probably a lot less
> client-side code that knows about extended stats at all, and what
> there is seems more likely to rely on the server-side display
> functions than to dig into the catalog details for itself. Also,
> if there is anything that's looking at pg_statistic_ext details,
> it will need work anyway after this patch; there's no way around that.

I'm working on removing stxkeys[] as a prerequisite commit before the main
join stats patch, representing all target columns as Var nodes in stxexprs,
as you both suggested.

One question about the pg_stats_ext view: currently it has two complementary
columns:

- attnames (name[]) — Names of the columns included in the statistics object
- exprs (text[]) — Expressions included in the statistics object

With stxkeys gone from the catalog, should the view:

(a) Stay as-is: keep attnames and exprs as separate columns with the same
semantics. Implemented via a helper function that extracts plain column
names from the unified stxexprs.

or

(b) Mirror the catalog: remove attnames, make exprs show all entries (both
column names and expressions together in one text[] array).
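
To make the difference concrete, here is a rough sketch of what each option
would return for one statistics object. It is only a toy model in Python, not
catalog or view code: StatEntry, is_plain_var, view_option_a and view_option_b
are hypothetical names made up for illustration, and the real helper would
inspect the node trees stored in stxexprs rather than a boolean flag.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StatEntry:
    """Toy stand-in for one entry of a unified stxexprs list (hypothetical)."""
    text: str            # deparsed text, e.g. "a" or "(b + c)"
    is_plain_var: bool   # True when the entry is a simple column reference


def view_option_a(entries: List[StatEntry]) -> Tuple[List[str], List[str]]:
    """Option (a): split the unified list back into attnames and exprs."""
    attnames = [e.text for e in entries if e.is_plain_var]
    exprs = [e.text for e in entries if not e.is_plain_var]
    return attnames, exprs


def view_option_b(entries: List[StatEntry]) -> List[str]:
    """Option (b): expose every entry, columns and expressions, in one array."""
    return [e.text for e in entries]


# Assumed example: statistics defined on a, (b + c), d
entries = [
    StatEntry("a", True),
    StatEntry("(b + c)", False),
    StatEntry("d", True),
]

attnames, exprs = view_option_a(entries)
print(attnames, exprs)          # two separate arrays, as the view has today
print(view_option_b(entries))   # one combined array
```

In this toy model, (a) keeps the two existing view columns but needs the
extra filtering step, while (b) is a direct projection of the unified list.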

Any preference?

--
Alexandra Wang
EDB: https://www.enterprisedb.com
