Re: BUG #17926: Segfault in SELECT

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: erik(at)nib4(dot)nl, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17926: Segfault in SELECT
Date: 2023-05-11 09:14:37
Message-ID: 20230511091437.oz4ne5gdpt6v2dxv@alvherre.pgsql
Lists: pgsql-bugs

On 2023-May-08, PG Bug reporting form wrote:

> Program received signal SIGSEGV, Segmentation fault.
> 0x00010000017fedec in ?? ()
> (gdb) bt
> #0 0x00010000017fedec in ?? ()
> #1 0x0000ffff017febd0 in ?? ()
> #2 0x0000aaab10af7fa8 in ?? ()
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> The query uses 3 partitions of a table where all fields have a BRIN
> index.
> This query will segfault:
> SELECT count(*)
> FROM dw
> WHERE (dw.ts >= '2022-09-01' AND dw.ts <= '2022-10-30')
> AND dw.source='type3' AND dw.customer='123.456';
>
> The query should return 0 because the customer does not exist.
> Removing the dw.customer or dw.source constraint will make the segfault not
> occur.
> Also using 'set enable_bitmapscan to off' will not trigger the segfault.

I'm not sure how to go about debugging this problem -- without a usable
stack trace, we don't know where to look. I tried to reproduce it without
success, but I didn't try to add any data. Which partitions do you
have, and how much data? Does EXPLAIN (without ANALYZE) work, and if so
what does it report?
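
For reference, a plain EXPLAIN only plans the statement and does not
execute it, so it may complete even when running the query crashes. A
minimal sketch, reusing the query from the report unchanged:

```sql
-- Plan only; the query itself is not executed.
EXPLAIN
SELECT count(*)
FROM dw
WHERE (dw.ts >= '2022-09-01' AND dw.ts <= '2022-10-30')
  AND dw.source = 'type3' AND dw.customer = '123.456';
```

The plan output would show which partitions survive pruning and which
indexes the bitmap scan intends to combine.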

Would it be possible for you to run the query under 'rr record'?
There are some instructions here:
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Recording_Postgres_using_rr_Record_and_Replay_Framework

One possibility is that the index structure is corrupted in some way.
pageinspect's functions might be useful, but I suppose you'd have to scan the
whole index in order to find what went wrong. We don't have any tooling
for that ... I guess amcheck support would be a nice addition.
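
A rough sketch of such a manual scan with pageinspect follows. The index
name dw_p1_customer_idx is hypothetical (the report does not name the
indexes), block number 2 is only a placeholder, and the functions
require superuser privileges. This is one possible way to walk the
pages, not a real corruption checker:

```sql
-- Assumption: the pageinspect extension can be installed in this database.
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Walk every block of the (hypothetically named) BRIN index and report
-- the page type of each one.
SELECT blkno,
       brin_page_type(get_raw_page('dw_p1_customer_idx', blkno::int)) AS page_type
FROM generate_series(
       0,
       pg_relation_size('dw_p1_customer_idx')
         / current_setting('block_size')::int - 1
     ) AS blkno;

-- Dump the summary tuples of one page. Block 2 is a placeholder; use a
-- block that the query above reported as a regular page.
SELECT *
FROM brin_page_items(get_raw_page('dw_p1_customer_idx', 2),
                     'dw_p1_customer_idx');
```

An error raised for a particular block by either query would at least
narrow down where to look.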

I guess the other option is that things are failing at optimizer or
executor setup time. Having one index per column is unusual.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Las navajas y los monos deben estar siempre distantes" (Germán Poo)
