Re: logical decoding bug: segfault in ReorderBufferToastReplace()

From: Jeremy Schneider <schnjere(at)amazon(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical decoding bug: segfault in ReorderBufferToastReplace()
Date: 2019-12-14 00:13:35
Message-ID: 81182626-e836-061f-8f19-204edac18922@amazon.com
Lists: pgsql-bugs pgsql-committers pgsql-hackers

On 12/11/19 08:35, Andres Freund wrote:
> I think we need to see pg_waldump output for the preceding records. That
> might allow us to see why there's a toast record that's being associated
> with this table, despite there not being a toast table.
Unfortunately the WAL files are no longer available.  :(

I did a little poking around in the core file and searching source code
but didn't find anything yet.  Is there any memory structure that would
have the preceding/following records cached in memory?  If so then I
might be able to extract this from the core dumps.

> Seems like we clearly should add an elog(ERROR) here, so we error out,
> rather than crash.
done - in the commit that I replied to when I started this thread :)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=69f883fef14a3fc5849126799278abcc43f40f56

> Has there been DDL to this table?
I'm not sure that we will be able to find out at this point. 

> Could you print out *change?

This was also in the original email - here it is:

(gdb) print *change
$1 = {lsn = 9430473343416, action = REORDER_BUFFER_CHANGE_INSERT, origin_id = 0,
  data = {
    tp = {relnode = {spcNode = 1663, dbNode = 16401, relNode = 16428},
      clear_toast_afterwards = true, oldtuple = 0x0, newtuple = 0x2b79313f9c68},
    truncate = {nrelids = 70441758623359, cascade = 44, restart_seqs = 64, relids = 0x0},
    msg = {prefix = 0x40110000067f <Address 0x40110000067f out of bounds>,
      message_size = 4294983724, message = 0x0},
    snapshot = 0x40110000067f, command_id = 1663,
    tuplecid = {node = {spcNode = 1663, dbNode = 16401, relNode = 16428},
      tid = {ip_blkid = {bi_hi = 1, bi_lo = 0}, ip_posid = 0},
      cmin = 0, cmax = 826252392, combocid = 11129}},
  node = {prev = 0x30ac918, next = 0x30ac9b8}}
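
A reading aid for the dump above: data is a union, so gdb prints every member even though action = REORDER_BUFFER_CHANGE_INSERT means only tp is in use. The odd-looking values under truncate, msg, snapshot and command_id appear to be the bytes of tp read through the other members, not separate corruption. The sketch below checks that arithmetic. The byte layout it uses (little-endian, three 4-byte OIDs, a 1-byte bool, 3 zero padding bytes) is an assumption inferred from the printed values, not taken from the PostgreSQL headers, and the function name is made up for illustration.

```python
import struct

# Values of data.tp copied from the gdb output above.
SPC_NODE, DB_NODE, REL_NODE = 1663, 16401, 16428
CLEAR_TOAST_AFTERWARDS = 1  # printed by gdb as "true"


def reinterpret_tp_bytes():
    """Re-read the first 16 bytes of data.tp as the other union members.

    Assumed layout (inferred from the dump, not from the headers):
    little-endian, three 4-byte OIDs, a 1-byte bool, 3 bytes of zero padding.
    """
    raw = struct.pack("<IIIB3x", SPC_NODE, DB_NODE, REL_NODE, CLEAR_TOAST_AFTERWARDS)
    first_word, second_word = struct.unpack("<QQ", raw)
    bi_hi, bi_lo = struct.unpack_from("<HH", raw, 12)
    return {
        "truncate.nrelids": first_word,           # spcNode and dbNode as one 8-byte integer
        "msg.prefix": hex(first_word),            # the same 8 bytes shown as a pointer
        "msg.message_size": second_word,          # relNode plus the bool in the upper half
        "truncate.cascade": raw[8],               # low byte of relNode
        "truncate.restart_seqs": raw[9],          # second byte of relNode
        "command_id": struct.unpack_from("<I", raw, 0)[0],  # spcNode again
        "tuplecid.tid.ip_blkid": (bi_hi, bi_lo),  # the bool and its padding
    }


if __name__ == "__main__":
    for name, value in reinterpret_tp_bytes().items():
        print(f"{name} = {value}")
```

Under that assumed layout every derived value matches the dump, which suggests the struct itself is intact and the fields to look at are relnode and newtuple.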

> Is this version of postgres effectively unmodified in any potentially
> relevant region (snapshot computations, generation of WAL records, ...)?
It's not changed from community code in any relevant regions.  (Also,
FYI, this is not Aurora.)

-Jeremy

--
Jeremy Schneider
Database Engineer
Amazon Web Services
