Re: COPY as a set returning function

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(dot)ringer(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY as a set returning function
Date: 2017-01-26 01:51:38
Message-ID: CADkLM=ex0ebCojoC16K=A-J_ns=Vz9TcfJN1_uVfPWm=BV82tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
>> I don't understand why do we have all these checks. Can't we just pass
>> the values to COPY and let it apply the checks? That way, when COPY is
>> updated to support multibyte escape chars (for example) we don't need to
>> touch this code. Together with removing the unneeded braces that would
>> make these stanzas about six lines long instead of fifteen.
>>
>
> If I understand you correctly, COPY (via BeginCopyFrom) itself relies on
> having a relation in pg_class to reference for attributes.
> In this case, there is no such relation. So I'd have to fake a relcache
> entry, or refactor BeginCopyFrom() to extract a ReturnSetInfo from the
> Relation and pass that along to a new function BeginCopyFromReturnSet. I'm
> happy to go that route if you think it's a good idea.
>
>
>>
>>
>> > + tuple = heap_form_tuple(tupdesc,values,nulls);
>> > + //tuple = BuildTupleFromCStrings(attinmeta,
>> field_strings);
>> > + tuplestore_puttuple(tupstore, tuple);
>>
>> No need to form a tuple; use tuplestore_putvalues here.
>>
>
> Good to know!
>
>
>
>>
>> I wonder if these should be an auxiliary function in copy.c to do this.
>> Surely copy.c itself does pretty much the same thing ...
>>
>
> Yes. This got started as a patch to core because not all of the parts of
> COPY are externally callable, and aren't broken down in a way that allowed
> for use in a SRF.
>
> I'll get to work on these suggestions.
>

I've put in some more work on this patch, mostly just taking Alvaro's
suggestions, which resulted in big code savings.

I had to add a TupleDesc parameter to BeginCopy() and BeginCopyFrom(). This
seemed the easiest way to leverage the existing tested code (and indeed, it
worked nearly out-of-the-box). The only drawback is that a minor change
will have to be made to the BeginCopyFrom() call in file_fdw.c, and any
other extensions that leverage COPY. We could make compatibility functions
that take the original signature and pass it along to the corresponding
function with rsTupDesc set to NULL.

Some issues:
- I'm still not sure if the direction we want to go is a set returning
function, or a change in grammar that lets us use COPY as a CTE or similar.
- This function will have the same difficulties as adding the program
option did to file_fdw: there's very little we can reference that isn't
os/environment specific
- Inline (STDIN) prompts the user for input, but gives the error: server
sent data ("D" message) without prior row description ("T" message). I
looked for a place where the Relation was consulted for the row
description, but I'm not finding it.

I can continue to flesh this out with documentation and test cases if there
is consensus that this is the way to go.

# select * from copy_srf('echo "x\ty"',true) as t(x text, y text);
x | y
---+---
x | y
(1 row)

Time: 1.074 ms
# select * from copy_srf('echo "x\t4"',true) as t(x text, y integer);
x | y
---+---
x | 4
(1 row)

Time: 1.095 ms
# select * from copy_srf(null) as t(x text, y integer);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> a 4
>> b 5
>> \.
server sent data ("D" message) without prior row description ("T" message)

Attachment Content-Type Size
copy_srf_function_003.diff text/plain 12.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2017-01-26 01:58:41 Re: Checksums by default?
Previous Message Stephen Frost 2017-01-26 01:42:34 Re: pgbench more operators & functions