Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833 numrange query

From: Adam Scott <adam(dot)c(dot)scott(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833 numrange query
Date: 2019-11-18 22:51:25
Message-ID: CA+s62-PWUgGJ7XaXnvFEcT9FeeVjHwojjtzP3i7LeuuoLDxZ+g@mail.gmail.com
Lists: pgsql-bugs

> Did you see that after updating to 10.11? If you used 10.10 or
> an older version, did the problem happen?
This was first discovered on CentOS 7 with PG 10.10; I then loaded the
40 GB table on 10.11 on Ubuntu.

> Seeing the plan of your query may help as well. Could you run EXPLAIN
> on it or does it crash before? Perhaps a parallel plan is involved
> here?
EXPLAIN works fine, with no crash.

I tried a binary search to find where in the data the error occurs, but
had no luck; it seems intermittent now. Finally, I was able to reproduce
the error reliably with a blank table:

Stop and restart postgres so it starts fresh, then run this query (note
that I removed one sarg from the originally supplied query):
select id from natica_hdu_test
WHERE
"dec_range" <@ '[88.9999998611111,90.0000001388889)';
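
For anyone following along, the steps above can be sketched roughly as
below. This is only a sketch under stated assumptions: the column types
(integer id, numrange dec_range) are assumed, since the table definition
is not shown in this message, and "blank" is taken to mean a table with
no rows.

```sql
-- Assumed definition: the real natica_hdu_test DDL is not given in this thread.
CREATE TABLE natica_hdu_test (
    id        integer,
    dec_range numrange
);

-- Plan only, without executing the query.
EXPLAIN
SELECT id FROM natica_hdu_test
WHERE "dec_range" <@ '[88.9999998611111,90.0000001388889)';

-- The query from this message, run after a fresh restart of the server.
SELECT id FROM natica_hdu_test
WHERE "dec_range" <@ '[88.9999998611111,90.0000001388889)';
```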

Thank you for your help in tracking this down!

On Sun, Nov 17, 2019 at 8:50 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> Hi Adam,
>
> On Mon, Nov 18, 2019 at 01:27:22AM +0000, PG Bug reporting form wrote:
> > Email address: adam(dot)c(dot)scott(at)gmail(dot)com
> > PostgreSQL version: 10.11
> > Operating system: Ubuntu & CentOS
> > Description:
>
> Did you see that after updating to 10.11? If you used 10.10 or
> an older version, did the problem happen?
>
> > Seg fault can be repeated by running this query:
> >
> > select count(*) from natica_hdu where boundary is not null
> > and
> > "dec_range" <@ '[89.9999998611111,90.0000001388889)' AND "ra_range" <@
> > '[45.0,45.1]';
>
> It would help if we had a sample of data here able to reproduce the
> problem. Something looks to be wrong with this range operator when
> working on numeric ranges, still I cannot reproduce the problem with
> for example stuff like that:
> CREATE TABLE aa (a numrange);
> CREATE INDEX aai ON aa(a);
> INSERT INTO aa
> SELECT ('['|| (90.0 - a::numeric/10000000000) || ',' ||
> (90.0 + a::numeric/10000000000) || ')')::numrange
> FROM generate_series(1,10000) as a;
> SELECT count(*) FROM aa
> WHERE a <@ '[89.9999998611111,90.0000001388889)';
>
> Seeing the plan of your query may help as well. Could you run EXPLAIN
> on it or does it crash before? Perhaps a parallel plan is involved
> here?
>
> > #0 pg_detoast_datum (datum=0xffffffff) at fmgr.c:1833
> > #1 0x0000557a18c19545 in numeric_sub (fcinfo=fcinfo@entry=0x7ffff5795e30)
> > at numeric.c:2288
>
> Hmm. Per the shape of this backtrace, you are indeed processing the
> range operator, and this portion of memory looks freed. My simplified
> example above basically goes through the same when planning the
> query.
> --
> Michael
>
