Re: COPY FROM WHEN condition

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Surafel Temesgen <surafel3000(at)gmail(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: COPY FROM WHEN condition
Date: 2018-12-04 10:06:16
Message-ID: 4e9c8b43-9f84-37d4-960c-1c4f4ec83c65@2ndquadrant.com
Lists: pgsql-hackers


On 12/4/18 10:44 AM, Alvaro Herrera wrote:
> After reading this thread, I think I like WHERE better than FILTER.
> Tally:
>
> WHERE: Adam Berlin, Lim Myungkyu, Dean Rasheed, yours truly
> FILTER: Tomas Vondra, Surafel Temesgen
>
> Couldn't find others expressing an opinion in this regard.
>

While I still like FILTER more, I won't object to using WHERE if others
think it's a better choice.

> On 2018-Nov-30, Tomas Vondra wrote:
>
>> I think it should be enough just to switch to CIM_SINGLE and
>> increment the command counter after each inserted row.
>
> Do we apply command counter increment per row with some other COPY
> option?

I don't think we increment the command counter anywhere, most likely
because COPY is not allowed to run any queries directly so far.

> Per-row CCI makes me a bit uncomfortable because with it you'd get in
> trouble with a large copy. I think it's particularly nasty here,
> precisely because you may want to filter out some rows of a very
> large file, and the CCI may prevent that from working.

Sure.

> I'm not convinced by the example case of reading how many tuples
> you've imported so far in the WHERE/WHEN/FILTER clause each time
> (that'd become incrementally slower as it progresses).
>

Well, I'm not sure how else I'm supposed to convince you. It's an example
of a behavior that's IMHO surprising and inconsistent with things that
might be reasonably expected to behave similarly. It may not be a
perfect example, but that's the price for simplicity.

FWIW, another way to achieve mostly the same filtering feature is a
BEFORE INSERT trigger:

create or replace function copy_filter() returns trigger as $$
declare
    v_c int;
begin
    select count(*) into v_c from t;
    if v_c >= 100 then
        return null;
    end if;
    return NEW;
end; $$ language plpgsql;

create trigger filter before insert on t
    for each row execute procedure copy_filter();

This behaves consistently with INSERT, i.e. it enforces the total count
constraint the same way. COPY FILTER, on the other hand, behaves
differently.
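
For illustration, a minimal sketch of how the trigger above could be
exercised from psql, comparing a multi-row INSERT with a COPY of the same
rows. The definition of t is an assumption (the table is not shown in this
message), and so is the availability of a Unix "seq" utility for generating
the input. The table has to exist before the trigger is created.

```sql
-- Assumed table definition; the message does not show how t is declared.
create table t (a int);

-- copy_filter() and the "filter" trigger from above are assumed to have
-- been created on t at this point.

-- Multi-row INSERT: the BEFORE INSERT trigger fires once per row.
insert into t select i from generate_series(1, 1000) as s(i);
select count(*) from t;

truncate t;

-- The same 1000 rows loaded through COPY. \copy ... from program is a
-- psql meta-command; "seq" is assumed to be installed on the client.
\copy t from program 'seq 1 1000'
select count(*) from t;
```

If the trigger sees the rows inserted earlier in the same command, as
described above, both counts should come out the same.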

FWIW I do realize this is not a particularly great check - for example,
it will not see effects of concurrent transactions etc. All I'm saying
is I find it annoying/strange that it behaves differently.

Also, considering the trigger does the right thing, maybe I spoke too
soon about the command counter not being incremented?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
