| From: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
|---|---|
| To: | Alexander Nestorov <alexandernst(at)gmail(dot)com> |
| Cc: | pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] btree_gist: add cross-type integer operator support for GiST |
| Date: | 2026-06-05 18:49:24 |
| Message-ID: | 4B4B0998-7B43-4893-9603-0AF212036690@yandex-team.ru |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Alexander!
Thanks for working on this - this looks like useful feature and
btree_gist users will appreciate it.
> When a query provides a value whose type is compatible but different
> from the column type, the planner cannot use the index for that column.
> [...] The workaround is to write an explicit cast in every query:
> WHERE camera_id = 1189::int8
> This is fragile as ORMs, application parameter binding, and even hand-
> written SQL queries produce values whose types do not exactly match
> the column type.
This bites particularly hard with composite indexes that mix a PostGIS
column with a plain identifier - a layer number, tenant id, etc.:
CREATE INDEX ON t USING gist (layer_id, geom);
... WHERE geom && :bbox AND layer_id = 42;
Today the integer side forces a cast to be index-usable, which seems ugly
and easy to forget, and the ORM argument above only makes it worse. So
I think this is worth pursuing.
A few things that seem important to me (though discussion may well prove
some of them minor):
1. GiST consistent() is CPU-bound (it runs for every key on every
visited page), so the bar here should be strictly zero speed regression
for existing same-type users. Note sk_subtype is the operator's right
operand type, so for ordinary same-type scans it is the native type,
not InvalidOid - meaning the same-type path now also goes through the
cross-type dispatch and scans the subtype table. Please add some fast
pathm and back it with a microbenchmark on a same-type workload.
2. On the "general foundation" framing:
> Other btree_gist opclasses (float4/float8, date, timestamp, ...) and
> even range-type GiST opfamilies in core can adopt the same pattern
I'd be cautious here. The scalar int64/fabs() dispatch shape does not
obviously fit range types (not scalar at all), and timestamp/date bring
their own questions (infinity, etc.). It may be better to solve types
as they actually come up and generalize the scaffolding as load on it
grows, rather than commit to a universal framework upfront.
3. The cross-type knowledge lives in two hand-maintained places - the
pg_amop entries in SQL and the C dispatch tables:
> I propose [...] to dispatch cross-type queries directly inside the
> existing consistent and distance functions and use the existing
> subtype OID argument.
They are only reconciled at query time via an elog(), not at
amvalidate/CREATE time. (Version skew between the .so and the catalog
isn't the worry - the new extension version ships with the new major.)
The concern is plain authoring drift: adding an amop without the
matching C entry, or vice versa, passes validation and only fails on a
live query. It would be nicer if we could assert that every cross-type
amop in the family has a corresponding dispatch entry (in amvalidate?).
4. KNN: the distance callbacks compute fabs() in float8, so for int8
values beyond 2^53 the lower-bound distance loses precision. This
matches the existing same-type int8 KNN, so it's not a regression, but
since you're widening the mix it's worth calling out as a known limit.
The missing int_crosstype test files (referenced from REGRESS/meson but
not in the patch) break make check, but I take it that's just the
not-yet-included test round.
Did you register your patch on the commitfest? [0]
Thank you!
Best regards, Andrey Borodin.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bruce Momjian | 2026-06-05 18:58:41 | Re: First draft of PG 19 release notes |
| Previous Message | Baji Shaik | 2026-06-05 18:45:49 | Re: [PATCH] Add regression tests for btree skip scan support functions |