BUG #15378: SP-GIST memory context screwup?

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: andrew(at)tao11(dot)riddles(dot)org(dot)uk
Subject: BUG #15378: SP-GIST memory context screwup?
Date: 2018-09-11 02:09:26
Message-ID: 153663176628.23136.11901365223750051490@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15378
Logged by: Andrew Gierth
Email address: andrew(at)tao11(dot)riddles(dot)org(dot)uk
PostgreSQL version: 11beta3
Operating system: Debian
Description:

I found this while analyzing a report from IRC that initially looked like a
PostGIS bug, but which I now think is breakage in spgist:

spgrescan starts out by doing

    MemoryContextReset(so->traversalCxt);

then later it calls resetSpGistScanOpaque(so),
which calls freeScanStack(so),
which calls freeScanStackEntry(so, stackEntry),
which does:

    if (stackEntry->traversalValue)
        pfree(stackEntry->traversalValue);

But stackEntry->traversalValue, if not NULL, is supposed to have been
allocated in so->traversalCxt, and so it's already gone.
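
To make the ordering problem concrete, here is a minimal sketch that models it outside PostgreSQL. Everything in it is an assumption for illustration: ToyContext, toy_alloc, toy_reset, ToyStackEntry, toy_free_entry and toy_rescan_demo are hypothetical names, and the toy arena is built on malloc/free rather than on the real memory context code. It only shows the ownership rule: once the context has been reset, an entry's traversalValue must be forgotten, not freed again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_CHUNKS 16

/* Toy stand-in for a memory context: it remembers every chunk it hands out. */
typedef struct ToyContext
{
    void   *chunks[TOY_MAX_CHUNKS];
    size_t  nchunks;
} ToyContext;

/* Allocate a chunk owned by the context (loosely like MemoryContextAlloc). */
void *
toy_alloc(ToyContext *cxt, size_t size)
{
    void *p;

    assert(cxt->nchunks < TOY_MAX_CHUNKS);
    p = malloc(size);
    if (p != NULL)
        cxt->chunks[cxt->nchunks++] = p;
    return p;
}

/* Free everything the context owns (loosely like MemoryContextReset). */
void
toy_reset(ToyContext *cxt)
{
    size_t i;

    for (i = 0; i < cxt->nchunks; i++)
        free(cxt->chunks[i]);
    cxt->nchunks = 0;
}

/* Toy stand-in for a scan stack entry; only the relevant field is modelled. */
typedef struct ToyStackEntry
{
    void *traversalValue;   /* owned by the traversal context, not the entry */
} ToyStackEntry;

/*
 * Free an entry WITHOUT freeing traversalValue.  The context owns that
 * chunk, so freeing it here after a reset would be a double free.
 */
void
toy_free_entry(ToyStackEntry *entry)
{
    free(entry);
}

/* Replays the rescan ordering: reset the context first, then free the stack. */
size_t
toy_rescan_demo(void)
{
    ToyContext     traversalCxt = {{0}, 0};
    ToyStackEntry *entry = malloc(sizeof *entry);

    if (entry == NULL)
        return (size_t) -1;

    entry->traversalValue = toy_alloc(&traversalCxt, 32);

    toy_reset(&traversalCxt);   /* entry->traversalValue now dangles */
    toy_free_entry(entry);      /* safe: does not touch traversalValue */

    return traversalCxt.nchunks;    /* number of chunks still owned */
}
```

Calling toy_rescan_demo() leaves the toy context with no live chunks and frees each allocation exactly once. Adding free(entry->traversalValue) to toy_free_entry would reproduce, in miniature, the kind of double free described above.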

The specific case we were looking at involved a query using a LATERAL
subselect with a LIMIT 1, with conditions that would have used the spgist
index (on a PostGIS geometry column). The PostGIS code seems to be correctly
allocating the traversalValues in the correct context, but we were getting
this crash (on pg11 beta3):

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000561e38427e1c in freeScanStackEntry (stackEntry=0x561e396ef718, so=<optimized out>)
    at ./build/../src/backend/access/spgist/spgscan.c:47
#2  0x0000561e38428a6e in freeScanStack (so=0x561e396ed508)
    at ./build/../src/backend/access/spgist/spgscan.c:60
#3  resetSpGistScanOpaque (so=0x561e396ed508)
    at ./build/../src/backend/access/spgist/spgscan.c:75
#4  spgrescan (scan=<optimized out>, scankey=0x561e3969eab8, nscankeys=<optimized out>,
    orderbys=<optimized out>, norderbys=<optimized out>)
    at ./build/../src/backend/access/spgist/spgscan.c:229

which is consistent with a bad pfree(), I think.

The original reporter's data set is proprietary, but this is the query
that died:

SELECT b.idhu, layer, max(dba)
  FROM (SELECT b."IDHU" AS idhu, first(geom) AS geom
          FROM do_buildings b
          JOIN do_data d ON b."IDHU"::text = d."IDHU"::text
         GROUP BY 1 LIMIT 100) b
 CROSS JOIN unnest('{str_isof_den}'::text[]) AS lyr
 CROSS JOIN LATERAL (
         SELECT idhu, layer, dba
           FROM do_laerm l
          WHERE layer = lyr
            AND (b.geom && l.geom AND ST_Intersects(l.geom, b.geom))
          LIMIT 1
       ) AS target
 GROUP BY 1, 2;

Note that do_laerm(geom) has an SP-GiST index, and the LATERAL placement
means that the index scan is going to be rescanned at a fairly arbitrary
point in the scan (due to the LIMIT). Unfortunately I don't think this can be
demonstrated with the built-in spgist opclasses, which don't allocate
traversalValues.

I /think/ the offending pfree could just be removed...
