From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: logical decoding / rewrite map vs. maxAllocatedDescs |
Date: | 2018-08-14 14:05:29 |
Message-ID: | 9552a9ef-250d-c7bf-abca-0c0533215fee@2ndquadrant.com |
Lists: | pgsql-hackers |
On 08/14/2018 01:49 PM, Tomas Vondra wrote:
> On 08/13/2018 04:49 PM, Andres Freund wrote:
>> Hi,
>>
>> On 2018-08-13 11:46:30 -0300, Alvaro Herrera wrote:
>>> On 2018-Aug-11, Tomas Vondra wrote:
>>>
>>>> Hmmm, it's difficult to compare "bt full" output, but my backtraces look
>>>> somewhat different (and all the backtraces I'm seeing are 100% exactly
>>>> the same). Attached for comparison.
>>>
>>> Hmm, looks similar enough to me -- at the bottom you have the executor
>>> doing its thing, then an AcceptInvalidationMessages in the middle
>>> section atop which sit a few more catalog accesses, and further up from
>>> that you have another AcceptInvalidationMessages with more catalog
>>> accesses. AFAICS that's pretty much the same thing Andres was
>>> describing.
>>
>> It's somewhat different because it doesn't seem to involve a reload of a
>> nailed table, which my traces did. I wasn't able to reproduce the crash
>> more than once, so I'm not at all sure how to properly verify the issue.
>> I'd appreciate if Thomas could try to do so again with the small patch I
>> provided.
>>
>
> I'll try in the evening. I've tried reproducing it on my laptop, but I
> can't make that happen for some reason - I know I've seen some crashes
> here, but all the reproducers were from the workstation I have at home.
>
> I wonder if there's some subtle difference between the two boxes, making
> it more likely on one of them ... the whole environment (distribution,
> packages, compiler, ...) should be exactly the same, though. The only
> thing I can think of is different CPU speed, possibly making some race
> conditions more/less likely. No idea.
>
I take that back - I can reproduce the crashes, both with and without
the patch, all the way back to 9.6. Attached is a bunch of backtraces
from various versions. There's a bit of variability depending on which
pgbench script gets started first (insert.sql or vacuum.sql) - in one
case (when vacuum is started before insert) the crash happens in
InitPostgres/RelationCacheInitializePhase3, otherwise it happens in
exec_simple_query.

Another observation is that the failing COPY is not necessary; I can
reproduce the crashes without it (so even with wal_level=replica).

regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
crash-10.log.gz | application/gzip | 9.2 KB |
crash-11.log.gz | application/gzip | 9.3 KB |
crash-11-2.log.gz | application/gzip | 13.1 KB |
crash-11-3.log.gz | application/gzip | 10.3 KB |
crash-96.log.gz | application/gzip | 10.1 KB |
crash-96-2.log.gz | application/gzip | 10.1 KB |
crash-96-logical.log.gz | application/gzip | 11.9 KB |