From: | Jeremy Schneider <schnjere(at)amazon(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: logical decoding bug: segfault in ReorderBufferToastReplace() |
Date: | 2019-12-20 23:21:30 |
Message-ID: | 187dfed1-7d97-4a8c-2932-b8a3d4dce697@amazon.com |
Lists: | pgsql-bugs pgsql-committers pgsql-hackers |
On 12/13/19 16:25, Andres Freund wrote:
> On 2019-12-13 16:13:35 -0800, Jeremy Schneider wrote:
>> On 12/11/19 08:35, Andres Freund wrote:
>>> I think we need to see pg_waldump output for the preceding records. That
>>> might allow us to see why there's a toast record that's being associated
>>> with this table, despite there not being a toast table.
>> Unfortunately the WAL logs are no longer available at this time. :(
>>
>> I did a little poking around in the core file and searching source code
>> but didn't find anything yet. Is there any memory structure that would
>> have the preceding/following records cached in memory? If so then I
>> might be able to extract this from the core dumps.
>
> Well, not the records directly, but the changes could be, depending on
> the size of the changes. That'd already help. It depends a bit on
> whether there are subtransactions or not (txn->nsubtxns will tell
> you). Within one transaction, the currently loaded (i.e. not changes
> that are spilled to disk, and haven't currently been restored - see
> txn->serialized) changes are in ReorderBufferTXN->changes.
I did include the txn in the original post to this thread; there are 357
changes in the transaction and they are all in memory (none spilled to
disk, a.k.a. serialized). No subtransactions. However, I do see that
"txn.has_catalog_changes = true", which makes me wonder if that's related
to the bug.

So... now I know... walking a dlist in gdb and dumping all the changes
is not exactly a walk in the park! It needs some Python magic like Tomas
Vondra's script that decodes Nodes. I was not yet successful today in
figuring out how to do this... so the changes are there in the core dump
but I can't get them yet. :)

I also dug around the ReorderBufferIterTXNState a little bit but there's
nothing that isn't already in the original post.

If anyone has a trick for walking a dlist in gdb that would be awesome...
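
For reference, here is a minimal sketch of one way to walk a dlist with gdb's Python API. It has not been run against the core dump discussed in this thread. It assumes the binary has debug symbols and that the layouts match PostgreSQL's ilist.h and reorderbuffer.h: a dlist_head wrapping a sentinel dlist_node called "head", a circular "next" chain, and a ReorderBufferChange that embeds its dlist_node as the field "node". The function names (dlist_node_addresses, walk_changes, dump_changes) are hypothetical, chosen for illustration only.

```python
"""Sketch: walk a PostgreSQL dlist from a core dump using gdb's Python API.

Inside gdb:  (gdb) source walk_dlist.py
             (gdb) python dump_changes("txn")
"""
try:
    import gdb  # provided by gdb itself; absent in a plain Python interpreter
except ImportError:
    gdb = None


def dlist_node_addresses(head_addr, read_next):
    """Yield the address of every dlist_node linked from a list head.

    head_addr -- address of the sentinel node (dlist_head.head)
    read_next -- callable returning the 'next' pointer stored at an address
    """
    cur = read_next(head_addr)
    # The list is circular: the last node points back at the sentinel.
    # A zeroed, never-initialized head has next == NULL and is also empty.
    while cur != 0 and cur != head_addr:
        yield cur
        cur = read_next(cur)


def walk_changes(txn_expr):
    """Yield a gdb.Value (ReorderBufferChange) for each entry in txn->changes."""
    txn = gdb.parse_and_eval(txn_expr)
    if txn.type.strip_typedefs().code == gdb.TYPE_CODE_PTR:
        txn = txn.dereference()

    node_ptr_type = gdb.lookup_type("dlist_node").pointer()
    change_type = gdb.lookup_type("ReorderBufferChange").strip_typedefs()
    # Byte offset of the embedded list node; dlist_container() subtracts this.
    node_offset = next(
        f.bitpos // 8 for f in change_type.fields() if f.name == "node"
    )

    def read_next(addr):
        node = gdb.Value(addr).cast(node_ptr_type).dereference()
        return int(node["next"])

    head_addr = int(txn["changes"]["head"].address)
    for node_addr in dlist_node_addresses(head_addr, read_next):
        change_ptr = gdb.Value(node_addr - node_offset).cast(change_type.pointer())
        yield change_ptr.dereference()


def dump_changes(txn_expr):
    """Print index, LSN and action for every in-memory change of a transaction."""
    for i, change in enumerate(walk_changes(txn_expr)):
        print(i, hex(int(change["lsn"])), change["action"])
```

The traversal is kept separate from the gdb calls so the loop-termination logic can be checked without a debugger. Only walk_changes and dump_changes need to run inside gdb, in a frame where the expression passed in (for example "txn") resolves to a ReorderBufferTXN or a pointer to one.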
I'm off for holidays and won't be working on this for a couple weeks;
not sure whether it'll be possible to get to the bottom of it. But I
hope there's enough info in this thread to at least get a head start if
someone hits it again in the future.
> Well, I've heard mutterings that plain RDS postgres had some efficiency
> improvements around snapshots (in the GetSnapshotData() sense) - and
> that's an area where slightly wrong changes could quite plausibly
> cause a bug like this.
Definitely no changes around snapshots. I've never even heard anyone
talk about making changes like that in RDS PostgreSQL - feels to me like
people at AWS want it to be as close as possible to postgresql.org code.

Aurora is different; it feels to me like the engineering org has more
license to make changes. For example they re-wrote the subtransaction
subsystem. No changes to GetSnapshotData though.
-Jeremy
--
Jeremy Schneider
Database Engineer
Amazon Web Services