Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Alexey Kondratov <kondratov(dot)aleksey(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Anastasia Lubennikova <lubennikovaAV(at)gmail(dot)com>
Subject: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Date: 2018-03-03 05:29:15
Message-ID: CAMsr+YH5mYTM_C-WrT10=G1HEVE9Xsgig=WeFKqrNJ8+-ChoHg@mail.gmail.com
Lists: pgsql-hackers

On 3 March 2018 at 13:08, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:

> On 1/22/18 21:33, Craig Ringer wrote:
> > We don't have much in the way of rules about what input functions can or
> > cannot do, so you can't assume much about their behaviour and what must
> > / must not be cleaned up. Nor can you just reset the state in a heavy
> > handed manner like (say) plpgsql does.
>
> I think one thing to try would be to define a special kind of exception
> that can safely be caught and ignored. Then, input functions can
> communicate benign parse errors by doing their own cleanup first, then
> throwing this exception, and then the COPY subsystem can deal with it.
>

That makes sense. We'd only use the error code range in question when the
error is safe to catch without re-throwing, and we'd have to enforce rules
about using a specific memory context. Of course no LWLocks could be held,
but IIRC that's already true whenever you throw, unless you plan to
proc_exit() in your handler.
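
To make the shape of the idea concrete, here is a rough sketch in Python
rather than backend C. It is not PostgreSQL code: BenignInputError,
parse_int_field and copy_rows are hypothetical names made up for
illustration, and the memory context and lock rules mentioned above have no
equivalent here. It only shows the control flow: the input function cleans
up and raises the special error class, and the COPY loop catches that class
alone and lets everything else propagate.

```python
from dataclasses import dataclass, field


class BenignInputError(Exception):
    """Hypothetical error class: the raiser has already cleaned up after
    itself, so a caller may catch this and continue without re-raising."""


def parse_int_field(text):
    """Stand-in for a type input function (hypothetical)."""
    try:
        return int(text)
    except ValueError:
        # Nothing is held at this point, so there is nothing to clean up
        # before signalling a benign parse failure.
        raise BenignInputError(f"invalid integer: {text!r}") from None


@dataclass
class CopyResult:
    rows: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def copy_rows(lines):
    """Stand-in for the COPY loop: skip rows that fail benignly."""
    result = CopyResult()
    for lineno, line in enumerate(lines, start=1):
        try:
            row = [parse_int_field(col) for col in line.split(",")]
        except BenignInputError as err:
            # Only the benign class is swallowed; any other exception
            # propagates and aborts the load as it does today.
            result.rejected.append((lineno, str(err)))
            continue
        result.rows.append(row)
    return result
```

Calling copy_rows(["1,2", "3,x", "5,6"]) keeps the first and third rows and
records line 2 as rejected. An error outside the benign class still
propagates to the caller.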

People will immediately ask for it to handle RI errors too, so something
similar would be needed there. But frankly, Pg's RI handling for bulk
loading desperately needs a major change in how it works to make it
efficient anyway; the current model of individual row triggers is horrible
for bulk load performance.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
