Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

From: Alexey Kondratov <kondratov(dot)aleksey(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Anastasia Lubennikova <lubennikovaAV(at)gmail(dot)com>
Subject: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Date: 2017-12-01 14:33:04
Message-ID: CADfU8WyK_9x4Fio8Ww0fmTtP9Wtq7P=Ye7evUvz5MtTPnDLn5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 1, 2017 at 1:58 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> On a *very* quick look, please use an enum to return from NextCopyFrom
> rather than 'int'. The chunks that change bool to int are very
> odd-looking. This would move the comment that explains the value from
> copy.c to copy.h, obviously. Also, you seem to be using non-ASCII dashes
> in the descriptions of those values; please don't.

I will fix it, thank you.

>
> Or maybe I misunderstood the patch completely.
>

I hope so. Here is my thoughts how it all works, please correct me,
where I am wrong:

1) First, I have simply changed ereport level to WARNING for specific
validations (extra or missing columns, etc) if INGONE_ERRORS option is
used. All these checks are inside NextCopyFrom. Thus, this patch
performs here pretty much the same as before, excepting that it is
possible to skip bad lines, and this part should be safe as well

2) About PG_TRY/CATCH. I use it to catch the only one specific
function call inside NextCopyFrom--it is InputFunctionCall--which is
used just to parse datatype from the input string. I have no idea how
WAL write or trigger errors could get here

All of these is done before actually forming a tuple, putting it into
the heap, firing insert-related triggers, etc. I am not trying to
catch all errors during the row processing, only input data errors. So
why is it unsafe?

Best,

Alexey

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-12-01 14:34:45 Re: [HACKERS] [POC] Faster processing at Gather node
Previous Message Peter Eisentraut 2017-12-01 14:31:32 Re: [HACKERS] SQL procedures