Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: vyegorov(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, PG Bug reporting form <noreply(at)postgresql(dot)org>
Subject: Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade
Date: 2022-02-15 20:51:47
Message-ID: cb3a777a-99c7-f671-532d-b828a2223502@enterprisedb.com
Lists: pgsql-bugs

On 2/15/22 18:32, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 17406
> Logged by: Victor Yegorov
> Email address: vyegorov(at)gmail(dot)com
> PostgreSQL version: 14.2
> Operating system: Ubuntu 18.04.6 LTS (bionic)
> Description:
>
> KVM guest, on Intel(R) Xeon(R) CPU E5-2697 v2.
>
> PostgreSQL 12.9 had been upgraded to 14.2 using
>
> pg_upgradecluster -k -m link 12 main /mnt/postgres/14/
>
> After that, one of the queries is crashing with Segmentation Fault.
> However, after REINDEX problem will be fixed. I thought I should report
> still.
>

Hmm. So I guess there are three options:

1) The index was already broken on 12.9, but for some reason (choice of
a different plan, ...) it was not causing any issues.

2) The index got broken during/after the upgrade, for some reason.

3) The index is fine, but there's a newly introduced bug in ltree (or
gist in general).

Hard to say which it is.

How large is the table/index? Are you able to run the query with a
custom build (without values optimized out)? Any chance you still have a
backup from before the pg_upgrade, on which you might run the query?

> explain
> SELECT r.id, r.name, r.map_center_y, r.map_center_x,
> COALESCE ((SELECT string_agg(id::TEXT, ',') FROM v3_region AS r2
> WHERE r2.id != r.id AND r2.ltree_path <@ r.ltree_path), '') AS children
> FROM v3_region AS r;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------
> Seq Scan on v3_region r (cost=0.00..3207.31 rows=349 width=580)
> SubPlan 1
> -> Aggregate (cost=8.17..8.18 rows=1 width=32)
> -> Index Scan using region_ltree_path_idx_gist on v3_region r2
> (cost=0.14..8.16 rows=1 width=4)
> Index Cond: (ltree_path <@ r.ltree_path)
> Filter: (id <> r.id)
> (6 rows)
>
> Here is backtrace:
> #0 __memmove_sse2_unaligned_erms () at
> ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:435
> #1 0x00007f129244c266 in memcpy (__len=<optimized out>,
> __src=0x55bbbd430f4a, __dest=<optimized out>) at
> /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
> #2 copy_ltree (src=0x55bbbd430f4a) at
> ./build/../contrib/ltree/ltree_gist.c:446
> #3 0x00007f129244db51 in gist_ischild (siglen=28, query=0x55bbbd410cb8,
> key=0x55bbbd430f18) at ./build/../contrib/ltree/ltree_gist.c:454

Interesting. Too bad the memcpy() parameters are optimized out. But
clearly src is not NULL, so perhaps something is wrong with the
destination pointer?
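
A side note on the frames above: a non-NULL src does not by itself rule
out the source side. If a copy routine takes its length from a header
stored in the source itself (an assumption here; the arguments are
optimized out, so the backtrace neither confirms nor refutes it), a
corrupted header can make memcpy() run past the end of a perfectly valid
pointer. The standalone C sketch below illustrates that failure mode with
a hypothetical Blob type and hypothetical helpers (make_blob,
copy_blob_checked). It is not the ltree code and not a diagnosis of this
bug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed value; NOT PostgreSQL's varlena layout. */
typedef struct
{
    uint32_t total_len;   /* size of the whole blob in bytes, header included */
    char     data[];      /* payload follows the header */
} Blob;

/* Build a well-formed blob holding payload_len bytes of payload. */
static Blob *
make_blob(const char *payload, size_t payload_len)
{
    size_t total = sizeof(Blob) + payload_len;
    Blob  *b = malloc(total);

    if (b == NULL)
        return NULL;
    b->total_len = (uint32_t) total;
    memcpy(b->data, payload, payload_len);
    return b;
}

/*
 * Copy a blob, but only if its self-reported length is plausible for a
 * buffer known to be buf_size bytes long. An unchecked variant would call
 * memcpy(dst, src, src->total_len) directly, and a garbage total_len would
 * then read beyond the buffer even though src is a valid pointer.
 * Returns NULL when the header is implausible or allocation fails.
 */
static Blob *
copy_blob_checked(const Blob *src, size_t buf_size)
{
    Blob *dst;

    assert(src != NULL);

    if (buf_size < sizeof(Blob))
        return NULL;            /* buffer cannot even hold a header */
    if (src->total_len < sizeof(Blob) || src->total_len > buf_size)
        return NULL;            /* header disagrees with the real buffer */

    dst = malloc(src->total_len);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, src->total_len);
    return dst;
}
```

With a sane header, copy_blob_checked() returns an independent copy; once
total_len is overwritten with a value larger than the real buffer, it
returns NULL instead of copying. Printing the length field of the key in
the crashing frame would show whether anything like this is happening here.
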

> #4 ltree_consistent (fcinfo=0x7ffdac2e3e20) at
> ./build/../contrib/ltree/ltree_gist.c:674
> #5 0x000055bbbb944a5d in FunctionCall5Coll
> (flinfo=flinfo@entry=0x55bbbd416b20, collation=<optimized out>,
> arg1=arg1@entry=140727492165472, arg2=<optimized out>, arg3=<optimized out>,
> arg4=<optimized out>, arg5=140727492165471)
> at ./build/../src/backend/utils/fmgr/fmgr.c:1241
> #6 0x000055bbbb4db72d in gistindex_keytest (recheck_distances_p=<synthetic
> pointer>, recheck_p=<synthetic pointer>, offset=4, page=0x7f1293a45180
> "K\313\003", tuple=0x7f1293a47058, scan=0x55bbbd4196c8)
> at ./build/../src/backend/access/gist/gistget.c:222
> #7 gistScanPage (scan=scan@entry=0x55bbbd4196c8,
> pageItem=pageItem@entry=0x7ffdac2e3fe0, myDistances=myDistances@entry=0x0,
> tbm=tbm@entry=0x0, ntids=ntids@entry=0x0) at
> ./build/../src/backend/access/gist/gistget.c:438

Can you print pageItem->blkno from frame #7 (gistScanPage)? That should
tell us which index page is causing issues. You can then dump that page
using pageinspect [1]. For example, if blkno = 100, this might tell us more:

SELECT * FROM
gist_page_opaque_info(get_raw_page('region_ltree_path_idx_gist', 100));

SELECT * FROM
gist_page_items(get_raw_page('region_ltree_path_idx_gist', 100),
'region_ltree_path_idx_gist');

SELECT * FROM
gist_page_items_bytea(get_raw_page('region_ltree_path_idx_gist', 100));

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
