Re: [PATCH] btree_gist: add cross-type integer operator support for GiST

From: Alexander Nestorov <alexandernst(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] btree_gist: add cross-type integer operator support for GiST
Date: 2026-06-13 23:35:14
Message-ID: 80ef3b41-1a71-47a4-a320-29e118d7092c@Spark
Lists: pgsql-hackers

Hello Andrey!

Following up on the things I owed you: the benchmarks, the consistency check,
and a note for the 2^53 case.

I added a fast path. Each integer opclass's consistent() / distance() now
detects the "same type" case and calls the original gbt_num_consistent() /
gbt_num_distance() directly.
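
For illustration, the shape of such a fast path can be sketched as below. This
is not the patch code: the type tags, the counters, and every demo_* function
are hypothetical stand-ins, and the real functions compare subtype OIDs and
call gbt_num_consistent() / gbt_num_distance() rather than these toy helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type tags; the real code would compare subtype OIDs. */
typedef enum { DEMO_INT2, DEMO_INT4, DEMO_INT8 } DemoType;

/* Counters so a caller can observe which path was taken. */
static int demo_fast_calls = 0;
static int demo_cross_calls = 0;

/* Stand-in for the original same-type routine: plain int32 comparison. */
static bool
demo_same_type_le(int32_t key, int32_t query)
{
    demo_fast_calls++;
    return key <= query;
}

/* Stand-in for the cross-type routine: widen the key to int64 first. */
static bool
demo_cross_type_le(int32_t key, int64_t query)
{
    demo_cross_calls++;
    return (int64_t) key <= query;
}

/*
 * "key <= query" for an int4 key.  The query value is passed as int64 and
 * tagged with its declared type; a DEMO_INT4 query is assumed to fit int32.
 */
static bool
demo_int4_le(int32_t key, int64_t query, DemoType query_type)
{
    if (query_type == DEMO_INT4)
        return demo_same_type_le(key, (int32_t) query);   /* fast path */
    return demo_cross_type_le(key, query);                /* cross-type */
}
```

Calling demo_int4_le(5, 7, DEMO_INT4) goes through the same-type helper only,
while a DEMO_INT8 query such as 5000000000 is routed to the widening helper.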

To confirm there's no regression I ran a microbenchmark on an -O2 build, no
asserts, single client, over a 500k row int4 GiST index, with the following
options:

-c enable_seqscan=off \
-c enable_bitmapscan=off \
-c enable_sort=off \
-c max_parallel_workers_per_gather=0

This is the base for the bench:

CREATE EXTENSION IF NOT EXISTS btree_gist;
DROP TABLE IF EXISTS benchg;
CREATE TABLE benchg (a int4);
INSERT INTO benchg SELECT g FROM generate_series(0, 499999) g;
CREATE INDEX benchg_idx ON benchg USING gist (a);
VACUUM (ANALYZE, FREEZE) benchg;

And the two workloads:

consistent(), full-range index-only count(*):
SELECT count(*) FROM benchg WHERE a >= 0 AND a <= 499999;

distance(), full KNN ordering (ORDER BY a<->k over all rows):
SELECT count(*) FROM (SELECT a FROM benchg ORDER BY a <-> 250000 LIMIT 1000000) q;

Timings in ms (12 repetitions, 15 s each) before my patch
(3e3d7875e95621b02311ea3443e5139e3bce944a) and after it:

  before   consistent   min/med/mean = 51.754 52.718 54.137 ms
  after    consistent   min/med/mean = 52.042 52.480 52.572 ms
  ------------------------------------------------------------------------
  before   distance     min/med/mean = 76.863 77.177 77.395 ms
  after    distance     min/med/mean = 77.357 77.803 77.980 ms

All numbers seem to be within measurement noise. The one outlier is the mean
of the "before" consistent run (54.137 ms against a 52.718 ms median), which
is probably inflated by one slow rep.
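
Why a single slow rep would show up in the mean but not in min or median can
be seen with a small sketch. The DemoSummary struct and demo_summarize() are
hypothetical helpers, not the harness used for the numbers above.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    double min;
    double med;
    double mean;
} DemoSummary;

/* qsort comparator for doubles, ascending. */
static int
demo_cmp_double(const void *a, const void *b)
{
    double x = *(const double *) a;
    double y = *(const double *) b;

    return (x > y) - (x < y);
}

/* Sorts v in place and returns min / median / mean.  Requires n > 0. */
static DemoSummary
demo_summarize(double *v, size_t n)
{
    DemoSummary s;
    double      sum = 0.0;

    qsort(v, n, sizeof(double), demo_cmp_double);
    for (size_t i = 0; i < n; i++)
        sum += v[i];

    s.min = v[0];
    /* Middle element for odd n, average of the two middle ones for even n. */
    s.med = (n % 2) ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
    s.mean = sum / (double) n;
    return s;
}
```

With the invented timings {52, 52, 52, 64} (not taken from the benchmark), min
and median both stay at 52 while the mean rises to 55, which is the same
pattern as a mean sitting above an otherwise stable min/median pair.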

Regarding the other point, I explored the regression suite path I mentioned.

The consistent() / distance() functions dispatch cross-type queries through a
single static table of supported subtype OIDs (gbt_int_crosstype_table in
btree_utils_num.c). I expose that exact table to SQL through
gbt_int_crosstype_subtypes(), so there is no hand-maintained second copy of
the list.
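
The single-table idea can be sketched as follows. This is an assumption-laden
illustration, not the contents of btree_utils_num.c: DemoOid, the DEMO_*
constants, and both demo_* functions are hypothetical, and the real SQL-facing
function would return its result through the usual fmgr machinery instead of a
caller-supplied buffer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subtype identifiers; the real table holds type OIDs. */
typedef unsigned int DemoOid;

#define DEMO_INT2OID 21
#define DEMO_INT4OID 23
#define DEMO_INT8OID 20

/* The one and only list of supported subtypes. */
static const DemoOid demo_crosstype_table[] = {
    DEMO_INT2OID, DEMO_INT4OID, DEMO_INT8OID
};

#define DEMO_TABLE_LEN \
    (sizeof(demo_crosstype_table) / sizeof(demo_crosstype_table[0]))

/* Dispatch side: is this subtype handled at all? */
static bool
demo_subtype_supported(DemoOid subtype)
{
    for (size_t i = 0; i < DEMO_TABLE_LEN; i++)
        if (demo_crosstype_table[i] == subtype)
            return true;
    return false;
}

/*
 * Introspection side: copy the same table into out (at most cap entries)
 * and return how many entries were written.
 */
static size_t
demo_list_subtypes(DemoOid *out, size_t cap)
{
    size_t n = DEMO_TABLE_LEN < cap ? DEMO_TABLE_LEN : cap;

    for (size_t i = 0; i < n; i++)
        out[i] = demo_crosstype_table[i];
    return n;
}
```

Because both functions read the same array, anything the listing function
reports is by construction something the dispatch side accepts.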

The int_crosstype.sql regression test then builds the set of cross-type
(lefttype, righttype, strategy) entries that should exist in pg_amop from that
function, and EXCEPTs it against the cross-type rows actually present in
gist_int{2,4,8}_ops:

  - a pg_amop row whose subtype the C dispatch does not handle shows up as
    "unexpected in pg_amop", and
  - a dispatch entry without the matching pg_amop rows shows up as
    "missing from pg_amop".

Either kind of drift produces a diff under `make check`, so adding an ALTER
OPERATOR FAMILY entry without a matching dispatch entry (or vice versa) fails
the suite. As I mentioned in my previous email, I'm not aware of a way to do
this with amvalidate() without patching core.
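
The two-directional check amounts to a pair of set differences. A minimal
sketch in C, assuming both sides are duplicate-free arrays of
(lefttype, righttype, strategy) triples; DemoAmop and the demo_* functions are
hypothetical, and the actual test does this in SQL with EXCEPT.

```c
#include <stdbool.h>
#include <stddef.h>

/* One cross-type entry, mirroring (lefttype, righttype, strategy). */
typedef struct
{
    unsigned int lefttype;
    unsigned int righttype;
    int          strategy;
} DemoAmop;

static bool
demo_amop_eq(const DemoAmop *a, const DemoAmop *b)
{
    return a->lefttype == b->lefttype &&
           a->righttype == b->righttype &&
           a->strategy == b->strategy;
}

/* Number of entries of a that do not appear in b, i.e. |a EXCEPT b|. */
static size_t
demo_except_count(const DemoAmop *a, size_t na,
                  const DemoAmop *b, size_t nb)
{
    size_t count = 0;

    for (size_t i = 0; i < na; i++)
    {
        bool found = false;

        for (size_t j = 0; j < nb; j++)
        {
            if (demo_amop_eq(&a[i], &b[j]))
            {
                found = true;
                break;
            }
        }
        if (!found)
            count++;
    }
    return count;
}
```

demo_except_count(expected, ..., actual, ...) corresponds to the "missing from
pg_amop" direction and the swapped call to "unexpected in pg_amop"; the two
sides agree only when both counts are zero.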

I'm attaching the new set of patches, this time including the tests.

Best regards!

Attachment                                                      Content-Type              Size
0001-Implement-cross-type-operators-for-GiST-indexes.patch      application/octet-stream  26.4 KB
0002-Add-tests-for-cross-type-operators-for-GiST-indexes.patch  application/octet-stream  23.9 KB
