Re: Faster methods for getting SPI results

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Joe Conway <mail(at)joeconway(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Faster methods for getting SPI results
Date: 2017-08-02 02:30:30
Message-ID: 59813946.40508@anastigmatix.net
Lists: pgsql-hackers

On 12/20/16 23:14, Jim Nasby wrote:
> I'm guessing one issue might be that
> we don't want to call an external interpreter while potentially holding page
> pins, but even then couldn't we just copy a single tuple at a time and save
> a huge amount of palloc overhead?

On 04/06/17 03:38, Craig Ringer wrote:
> Also, what rules apply in terms of what you can/cannot do from within
> a callback? Presumably it's unsafe to perform additional SPI calls,
> perform transactions, call into the executor, change the current
> snapshot, etc, but I would consider that reasonably obvious. Are there
> any specific things to avoid?

Confessing, right up front, that I'm not very familiar with the executor
or DestReceiver code, but thinking of issues that might be expected with
PLs, I wonder if there could be a design where the per-tuple callback
could sometimes return a status HAVE_SLOW_STUFF_TO_DO.

If it does, the executor could release some pins or locks, stack some
state, whatever allows it to (as far as practicable) relax restrictions
on what the callback would be allowed to do, then reinvoke the callback,
not with another tuple, but with OK_GO_DO_YOUR_SLOW_STUFF.

On return from that call, the executor could reacquire its stacked
state/locks/pins and resume generating tuples.

That way, a callback could, say, return normally 9 out of 10 times, just
quickly buffering each tuple, and on every 10th call return
HAVE_SLOW_STUFF_TO_DO and get a chance to jump into the PL interpreter and
deal with those 10. That would (a) minimize the restrictions on what the PL
routine may do, and (b) allow any costs of state-stacking, lock
releasing/reacquiring, and control transfer to the interpreter to be
amortized over some number of tuples.
How many tuples that should be might be an empirical question for any given
PL, but with a protocol like this, the callback has an easy way to control
it.
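
To make the idea concrete, here is a rough standalone sketch of the
handshake in plain C. None of it is real executor, SPI, or DestReceiver
code: every type and function name is made up for illustration, tuples are
plain ints, and "releasing pins or locks" is only a counter. The final
flush call at end of scan is an extra assumption of the sketch, added so
that a partial batch is not lost.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol values, named after the ones in the text above.
 * They are NOT part of any existing PostgreSQL API. */
typedef enum { CB_CONTINUE, CB_HAVE_SLOW_STUFF_TO_DO } CallbackStatus;
typedef enum { MSG_TUPLE, MSG_OK_GO_DO_YOUR_SLOW_STUFF } CallbackMessage;

typedef CallbackStatus (*TupleCallback)(CallbackMessage msg, int tuple, void *arg);

#define BATCH_SIZE 10   /* the callback's own choice; tunable per PL */

typedef struct {
    int    buffered[BATCH_SIZE];
    size_t nbuffered;
    long   processed;   /* tuples handed to the stand-in "interpreter" */
    int    slow_calls;  /* number of times the slow path ran */
} BatchState;

/* Callback side: buffer cheaply, ask for a slow call when the batch is full. */
static CallbackStatus batch_callback(CallbackMessage msg, int tuple, void *arg)
{
    BatchState *st = arg;

    if (msg == MSG_OK_GO_DO_YOUR_SLOW_STUFF) {
        /* Restrictions are relaxed here; a real PL would enter its
         * interpreter and consume st->buffered. */
        st->processed += (long) st->nbuffered;
        st->nbuffered = 0;
        st->slow_calls++;
        return CB_CONTINUE;
    }

    assert(st->nbuffered < BATCH_SIZE);
    st->buffered[st->nbuffered++] = tuple;
    return st->nbuffered == BATCH_SIZE ? CB_HAVE_SLOW_STUFF_TO_DO : CB_CONTINUE;
}

typedef struct { int releases; int reacquires; } ExecutorStats;

/* Executor side: relax state, reinvoke the callback with no tuple, restore. */
static void grant_slow_call(TupleCallback cb, void *arg, ExecutorStats *stats)
{
    stats->releases++;      /* stand-in for releasing pins/locks, stacking state */
    cb(MSG_OK_GO_DO_YOUR_SLOW_STUFF, 0, arg);
    stats->reacquires++;    /* stand-in for reacquiring them */
}

static ExecutorStats run_executor(int ntuples, TupleCallback cb, void *arg)
{
    ExecutorStats stats = {0, 0};

    for (int i = 0; i < ntuples; i++) {
        if (cb(MSG_TUPLE, i, arg) == CB_HAVE_SLOW_STUFF_TO_DO)
            grant_slow_call(cb, arg, &stats);
    }
    /* Assumption of this sketch: one last slow call to flush a partial batch. */
    grant_slow_call(cb, arg, &stats);
    return stats;
}
```

With 25 tuples and a batch size of 10, the executor in this sketch pays the
release/reacquire cost three times (after tuples 10 and 20, and once at the
end) instead of 25 times, which is the amortization the protocol is after.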

Or would that be overcomplicated?

-Chap
