Re: BUG #15367: Crash in pg_fe_scram_free when using foreign tables

From: Jeremy Evans <code(at)jeremyevans(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15367: Crash in pg_fe_scram_free when using foreign tables
Date: 2018-09-07 16:51:03
Message-ID: 20180907165103.GH17425@jeremyevans.local
Lists: pgsql-bugs

On 09/06 03:28, Jeremy Evans wrote:
> On 09/06 02:58, Michael Paquier wrote:
> > On Thu, Sep 06, 2018 at 08:35:39PM +0000, PG Bug reporting form wrote:
> > > If necessary I can build a debug version of PostgreSQL and try using that in
> > > production so I can get a better backtrace if it crashes again. However,
> > > considering that the crash is rare in my environment, it's unlikely I will
> > > be able to produce a better backtrace for the error quickly.
> >
> > That would be nice. From what I can see this would be a race condition,
> > which is not obvious by looking at the code. Testing with a two-node
> > deployment where the first node has a foreign table which connects to a
> > second node, using SCRAM authentication, holding the physical table,
> > then doing many foreign scans across many clients don't show any
> > problem. Did libpq complain at some point in the session where the
> > crash happened about any error?
>
> The PostgreSQL logfile only shows:
>
> postgres(64978) in free(): bogus pointer (double free?) 0x4a115aec398
> 2018-09-06 12:01:52.202 PDT [45953] LOG: server process (PID 64978) was terminated by signal 6: Abort trap
> 2018-09-06 12:01:52.202 PDT [45953] DETAIL: Failed process was running: ...
> 2018-09-06 12:01:52.202 PDT [45953] LOG: terminating any other active server processes
>
> If there is another place I should look, please let me know. The log
> files of the client process don't show anything during the crash,
> probably because the client libpq connection was just dropped when the
> server process crashed. After the crash, other client libpq connections
> show the following, which is probably expected:
>
> WARNING: terminating connection because of crash of another server process
> DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
> HINT: In a moment you should be able to reconnect to the database and repeat your command.
>
> I'll try to install a version with debug symbols on September 14, and
> if it crashes again I'll respond with a more complete and accurate
> backtrace.

We experienced an almost identical crash this morning. The query was
different, but the backtrace was almost the same, and the query was
using a foreign table with SCRAM authentication, just like the one
yesterday.

One thing the two crashes had in common is that shortly before each
one, we were testing database changes on other databases in the same
cluster. Those databases were used by neither the postgres process
that crashed (the client of the foreign table scan) nor the postgres
process executing the foreign table scan.

The database changes mostly consisted of the following statement types:

DROP TABLE
DROP SCHEMA
DROP FUNCTION
CREATE FUNCTION
CREATE SCHEMA
CREATE TABLE
CREATE INDEX
CREATE TRIGGER
INSERT
GRANT

We only started testing these database changes in this cluster
yesterday. Based on the timing, I'm guessing this issue only occurs
when system table changes are being made.
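
For reference, the statement mix listed above can be sketched as a small
script that emits one round of such changes for feeding to psql. This is
only an illustration under assumptions: the schema, table, function and
trigger names are hypothetical placeholders, the ordering is chosen so a
round is intended to be rerunnable, and nothing here is known to trigger
the crash.

```python
# Sketch of one round of catalog-changing statements.
# All object names are hypothetical placeholders, not taken from the report.
def ddl_churn_statements(schema="churn_test"):
    """Return the ten statement types from the list, in a rerunnable order."""
    table = f"{schema}.items"
    func = f"{schema}.touch_item"
    return [
        f"DROP TABLE IF EXISTS {table}",
        f"DROP FUNCTION IF EXISTS {func}()",
        f"DROP SCHEMA IF EXISTS {schema}",
        f"CREATE SCHEMA {schema}",
        f"CREATE TABLE {table} (id integer, note text)",
        f"CREATE INDEX items_id_idx ON {table} (id)",
        f"CREATE FUNCTION {func}() RETURNS trigger LANGUAGE plpgsql "
        f"AS $$ BEGIN RETURN NEW; END $$",
        f"CREATE TRIGGER items_touch BEFORE INSERT ON {table} "
        f"FOR EACH ROW EXECUTE PROCEDURE {func}()",
        f"INSERT INTO {table} (id, note) VALUES (1, 'x')",
        f"GRANT SELECT ON {table} TO PUBLIC",
    ]


def as_script(statements):
    """Join statements into semicolon-terminated lines for psql."""
    return "".join(stmt + ";\n" for stmt in statements)


if __name__ == "__main__":
    print(as_script(ddl_churn_statements()), end="")
```

The output could be piped to psql against a scratch database in a test
cluster, in a loop, while other sessions run foreign table scans.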

I haven't yet had time to install debug symbols on the production server,
but since I think I have a better idea of how to reproduce this issue, I
will try reproducing it on a test cluster with debug symbols.

Thanks,
Jeremy
