Re: Odd behavior with PG_TRY

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Odd behavior with PG_TRY
Date: 2017-01-04 22:48:20
Message-ID: 60eff92b-47db-6734-9756-27d117211d23@BlueTreble.com
Lists: pgsql-hackers

On 1/3/17 9:20 PM, Amit Kapila wrote:
> On Wed, Jan 4, 2017 at 3:47 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>> On 1/2/17 9:47 PM, Tom Lane wrote:
>>> Correct coding would be
>>>
>>> volatile TupleDesc desc = slot->tts_tupleDescriptor;
>>> CallbackState * volatile myState = (CallbackState *) self;
>>> PLyTypeInfo * volatile args = myState->args;
>>>
>>> because what needs to be marked volatile is the pointer variable,
>>> not what it points at. I'm a bit surprised you're not getting
>>> "cast away volatile" warnings from the code as you have it.
>>
>>
>> Unfortunately, that didn't make a difference. Amit's suggestion of isolating
>> the single statement in a PG_TRY() didn't work either, but assigning
>> args->in.r.atts[i] to a pointer did.
>>
>
> Good to know that it worked, but what is the theory? From your
> experiment, it appears that in some cases accessing local pointer
> variables is okay and in other cases, it is not okay.

I can run some other experiments if you have any to suggest.

I do think it's interesting that the data appeared to be completely fine
until I actually ran the first assembly instruction of the for loop, so
presumably it was fine after the sigsetjmp() call (which I'm assuming is
what causes all the fuss to begin with). From my understanding of what
volatile does, I can see why something in the CATCH block might need it,
but not why something in the TRY block would.
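
For reference, here is a minimal standalone sketch of the rule as I understand it. It uses plain setjmp()/longjmp() rather than PG_TRY (that substitution is an assumption on my part), and fail() and both function names are hypothetical, not anything from plpython. A local that is modified after setjmp() and read after the longjmp() only has a defined value if it is volatile-qualified, and for a pointer the qualifier has to go on the pointer itself, as Tom described:

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical stand-in for code that raises an error via longjmp. */
static void
fail(void)
{
    longjmp(env, 1);
}

/* A plain local that is changed after setjmp() and read after longjmp(). */
int
counter_after_jump(void)
{
    volatile int counter = 0;   /* volatile: value must survive the longjmp */

    if (setjmp(env) == 0)
    {
        counter = 42;           /* modified inside the protected region */
        fail();                 /* jumps back to the setjmp() above */
    }

    /* Reached only via longjmp; counter is well defined because of volatile. */
    return counter;
}

/* Same idea for a pointer: the pointer variable itself is volatile. */
int
pointer_after_jump(void)
{
    static int  before = 1;
    static int  after = 2;
    int        *volatile p = &before;   /* not "volatile int *p" */

    if (setjmp(env) == 0)
    {
        p = &after;             /* pointer reassigned inside the region */
        fail();
    }

    return *p;
}
```

With the qualifiers in place, both functions return the value assigned after setjmp() (42 and 2). Without them, the C standard leaves those locals indeterminate after the jump.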

Two other things of note that might possibly make a difference here:

- This is happening inside a function used as a DestReceiver receive
  function
- The original call is a plpython function, calling a plpython function,
  calling a plpython function (specifically, nested_call_one() in the
  plpython regression test).

That does mean that the call stack looks something like this:

plpython
  SPI_execute_callback
    (my custom DestReceiver setup function (PLy_CSSetup) is called
    somewhere in here, which is what populates myState)
    plpython
      SPI_execute_callback
        (PLy_CSSetup gets called again)
        plpython (this just returns a value)

After that plpython call, the executor is going to call PLy_CSreceive,
which is the function with this problematic code. So by the time this
error happens, there are two nested levels of plpython+SPI going on. I
originally thought the re-entrant calls were causing the problem, but
after monitoring what PLy_CSSetup was doing and what PLy_CSreceive was
getting, that's not the case, or at least not the only reason for this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
