Re: [BUGS] bug or simply not enough stack space?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>
Cc: Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: [BUGS] bug or simply not enough stack space?
Date: 2019-10-04 14:12:38
Message-ID: CAHyXU0yofd1JJs-B3n7qW7AjfjDuHm4VPuHXXqUMQ02E0UOP4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sat, Jul 18, 2009 at 2:18 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl> writes:
> >> I think I might have been overenthusiastic in trying to free resources
> >> during a subtransaction abort. Try this patch and see if you notice
> >> any bad side-effects.
>
> > All examples I had that crashed and burned, now work correctly and/or bail out
> > correctly where needed.
>
> > No side-effects noticed.
>
> Well, it had some :-(. I've committed a better version of that patch,
> which will be in 8.4.1. However, if you're not seeing any serious
> memory leakage then you could keep using the patch you have for the
> time being.

Reviving this ancient thread. I saw "did not find subXID" errors, in
9.6.12. Here is what happened.

2019-10-03 19:54:01 CDT [: @] []: LOG: checkpoints are occurring too
frequently (14 seconds apart)
2019-10-03 19:54:01 CDT [: @] []: HINT: Consider increasing the
configuration parameter "max_wal_size".
2019-10-03 19:54:15 CDT [: @] []: LOG: checkpoints are occurring too
frequently (14 seconds apart)
2019-10-03 19:54:15 CDT [: @] []: HINT: Consider increasing the
configuration parameter "max_wal_size".
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
loadhistorydatafromysm_testing2() line 99 during exception cleanup
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: LOG: could not send data to client: Broken
pipe
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
loadhistorydatafromysm_testing2() line 99 during exception cleanup
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: STATEMENT: select
LoadHistoryDataFromYSM_testing2();
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
loadhistorydatafromysm_testing2() line 99 during exception cleanup
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: STATEMENT: select
LoadHistoryDataFromYSM_testing2();
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: AbortSubTransaction while in ABORT
state
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
2019-10-03 19:53:31 CDT [: @] []: LOG: checkpoints are occurring too
frequently (16 seconds apart)
2019-10-03 19:53:31 CDT [: @] []: HINT: Consider increasing the
configuration parameter "max_wal_size".
2019-10-03 19:53:31 CDT [: @] []: LOG: checkpoints are occurring too
frequently (16 seconds apart)
2019-10-03 19:53:31 CDT [: @] []: HINT: Consider increasing the
configuration parameter "max_wal_size".
2019-10-03 19:53:47 CDT [: @] []: LOG: checkpoints are occurring too
frequently (16 seconds apart)
2019-10-03 19:53:47 CDT [: @] []: HINT: Consider increasing the
configuration parameter "max_wal_size".
2019-10-03 19:54:01 CDT [: @] []: LOG: checkpoints are occurring too
frequently (14 seconds apart)
2019-10-03 19:54:01 CDT [: @] []: HINT: Consider increasing the
configuration parameter "max_wal_size".
2019-10-03 19:54:15 CDT [: @] []: LOG: checkpoints are occurring too
frequently (14 seconds apart)
2019-10-03 19:54:15 CDT [: @] []: HINT: Consider increasing the
configuration parameter "max_wal_size".
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
loadhistorydatafromysm_testing2() line 99 during exception cleanup
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: LOG: could not send data to client: Broken
pipe
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
loadhistorydatafromysm_testing2() line 99 during exception cleanup
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: STATEMENT: select
LoadHistoryDataFromYSM_testing2();
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
loadhistorydatafromysm_testing2() line 99 during exception cleanup
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: STATEMENT: select
LoadHistoryDataFromYSM_testing2();
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: AbortSubTransaction while in ABORT
state
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: AbortSubTransaction while in ABORT
state
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: AbortSubTransaction while in ABORT
state
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: AbortSubTransaction while in ABORT
state
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: you don't own a lock of type
ExclusiveLock
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: WARNING: ReleaseLockIfHeld: failed??
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
[10.22.236.83(54943)]: PANIC: ERRORDATA_STACK_SIZE exceeded
2019-10-03 19:58:37 CDT [: @] []: LOG: server process (PID 10715) was
terminated by signal 6: Aborted
2019-10-03 19:58:37 CDT [: @] []: LOG: terminating any other active
server processes

Couple interesting things here:
*) Logs are strongly out of order
*) "loadhistorydatafromysm_testing2()" is using pl/sh, which is a
known source of weird (but rare) instability issues (I'm assuming this
is underlying cause of issue)
*) everything restarted and is working fine
*) chance of providing reproducible case is zero

Bringing this up due to this thread:
https://github.com/laurenz/oracle_fdw/issues/328

I can't help but wonder if we have some kind of obscure issue that is
related to C extension problems; just throwing a data point on the
table.

merlin

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alvaro Herrera 2019-10-04 14:43:52 Re: BUG #16038: Alter table - SegFault
Previous Message PG Bug reporting form 2019-10-04 13:29:46 BUG #16038: Alter table - SegFault