Re: Fw:Re: Fw: gbt_var_consistent in contrib/btree_gist/btree_utils_var.c has internal-node type confusion on the <> strategy, bypassing exclusion constraints

From: Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: 王跃林 <violin0613(at)tju(dot)edu(dot)cn>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, 3764353996 <3764353996(at)qq(dot)com>
Subject: Re: Fw:Re: Fw: gbt_var_consistent in contrib/btree_gist/btree_utils_var.c has internal-node type confusion on the <> strategy, bypassing exclusion constraints
Date: 2026-07-02 15:43:30
Message-ID: CAJTYsWX1C5JLj92QdqEcArWs4XtXPog+yvyb6=X=_cghjXnkYQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On Thu, 2 Jul 2026 at 20:08, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> =?UTF-8?B?546L6LeD5p6X?= <violin0613(at)tju(dot)edu(dot)cn> writes:
> > The analysis and fix look correct. The BtreeGistNotEqual branch is
> > the only strategy that bypasses the is_leaf check and passes
> > potentially truncated internal-node keys directly to f_eq, which is
> > unsafe for types like bit and varbit that require a minimum valid
> > header size. Returning true for internal nodes is the right
> > conservative choice and is consistent with how the other strategies
> > handle the internal-node case.
>
> None of this passes the smell test for me. If it's unsafe to
> call the type's f_eq function on a truncated key, how is it
> any safer to call the f_cmp function? IOW, why aren't *all* the
> cases in gbt_var_consistent() broken?
>

I think the difference is that for bit/varbit f_eq and f_cmp aren't the
same kind of function. f_eq is biteq (reads bit_len, does VARSIZE-8),
while f_cmp is byteacmp (plain byte-wise, only needs the varlena
header). On internal pages every other strategy goes through
f_cmp/byteacmp, and only the <> branch reaches f_eq. So I believe
that's why the byte-wise cases stay safe on a truncated key and <>
doesn't.

> It looks to me like gbt_var_node_truncate adjusts the length words
> of the truncated keys so that they are still valid, just shorter.
> Or at least it's trying to. That works fine for text and bytea,
> but it's not fine for bit/varbit because (a) it fails to update the
> "bit_len" field that follows the length word, and (b) the common
> prefix length selection logic doesn't know that it mustn't truncate
> away any part of the bit_len field. So my own thought about fixing
> this is that type bit needs a custom truncation method.
>

That was my first assumption too, but it looks like the node key has
no bit_len to preserve: gbt_bit_l2n/gbt_bit_xfrm seem to drop it and
keep only the raw bits before truncation runs. If so, a custom
truncation may not help, since the crash seems to come from <> calling
biteq rather than a damaged bit_len. Though I'm not fully sure I'm
reading the l2n step right.

> (I continue to regret that we ever accepted such underdocumented code.
> I think we ought to reverse-engineer a comment explaining what
> gbt_var_consistent is doing, eg, why are all of the tests seemingly
> reversed?)
>

Agreed, we can add a comment there.

Regards,
Ayush

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2026-07-02 16:54:17 Re: Fw:Re: Fw: gbt_var_consistent in contrib/btree_gist/btree_utils_var.c has internal-node type confusion on the <> strategy, bypassing exclusion constraints
Previous Message Tom Lane 2026-07-02 14:38:16 Re: Fw:Re: Fw: gbt_var_consistent in contrib/btree_gist/btree_utils_var.c has internal-node type confusion on the <> strategy, bypassing exclusion constraints