Re: Strange GiST logic leading to uninitialized memory access in pg_trgm gist code

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Strange GiST logic leading to uninitialized memory access in pg_trgm gist code
Date: 2020-11-12 03:48:17
Message-ID: CAPpHfdutQZ+Ligok9fVaYY3PRG2pviWz7KnGEsuAR_0+o4g7_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Wed, Nov 11, 2020 at 12:53 PM Andrew Gierth
<andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> Now the obvious simple fix is just to reorder those last two operations,
> and the original reporter verified that doing so fixed their problem
> (patch attached). But I'd really like to understand the logic here and
> whether there is any reason to have this special treatment at all - why
> would it not be better to just cache all N items upfront and consider
> them all as potential seeds?

I think this comes from the idea that when N items are passed to the
picksplit method, then the first N-1 are existing items on the page,
while the last Nth is the new item to be inserted. So, we are trying
to split first N-1 items and then insert the last Nth item there. But
this is wrong for two reasons.

1) As you've pointed out, GiST code doesn't necessarily pass items to
the picksplit method in that way.
2) Even if items are passed as assumed, there is no point in having
special handling of the item to be inserted. It's better to consider
the whole set of items to produce a better split.

> Another issue I don't understand yet is that even though this code is
> largely unchanged since 8.x, the original reporter could not reproduce
> the crash on any version before 13.0.

I think this is related to my commit 911e702077. It has changed the
memory allocation for the signatures to support the signatures of
variable length. So, it seems that despite the error existing since
8.x, it started causing segfaults only since 911e702077.

> Anyone have any ideas? (If not, I'll commit and backpatch something like
> the attached patch at some suitable time.)

I would rather propose to rip off special handling of the last item
completely (see the attached patch).

------
Regards,
Alexander Korotkov

Attachment Content-Type Size
0001-Fix-handling-of-the-last-item-in-gtrgm_picksplit.patch application/octet-stream 2.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message k.jamison@fujitsu.com 2020-11-12 04:00:14 RE: [Patch] Optimize dropping of relation buffers using dlist
Previous Message Fujii Masao 2020-11-12 03:24:48 Re: Delay of standby shutdown