COPY as a set returning function

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: COPY as a set returning function
Date: 2016-09-30 21:03:00
Message-ID: CADkLM=eC8RG-LjKwk-PDY8V39bewWY0RA++eem5i6A8U3D44kA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached is a _very_ rough patch implementing a proof-of-concept function
copy_srf();

It allows you to do things like this:

# select a,c,e from copy_srf('echo 12,345,67,89,2016-01-01',true) as t(a
integer, b text, c text, d text, e date);
a | c | e
----+----+------------
12 | 67 | 2016-01-01
(1 row)

Uses for this include:
- avoiding the pattern of creating a temp table just to select all the rows
back out and then discard the table (avoidable disk activity, avoidable oid
churn)
- avoiding the pattern of creating a file_fdw table in pg_temp just to drop
it after one select (avoidable oid churn)
- filtering file/program input by the columns that are relevant to the
user's needs.

This experiment arose from my realization that file_fdw just plugs into the
externally visible portions of copy.c to do all of it's work. So why
couldn't we do the same for a set returning function? Of course it wasn't
as simple as that. The existing Begin/NextCopyFrom functions require the
->rel to be a valid Oid...which we won't have in this context, so I had to
bypass that code and use CopyFromRawFields() directly...which isn't
externally visible, hence this being a patch to core rather than an
extension.

Currently the function only accepts two parameters, "filename" and
"is_program". Header is always false and csv mode is always true. Obviously
if we go forward on this, we'll want to add that functionality back in, but
I'm holding off for now to keep the example simple and wait for consensus
on future direction.

As for that future direction, we could either have:
- a robust function named something like copy_srf(), with parameters for
all of the relevant options found in the COPY command
- a function that accepts an options string and parse that
- we could alter the grammar to make COPY RETURNING col1, col3, col5 FROM
'filename' a legit CTE.

Regardless of the path forward, I'm going to need help in getting there,
hence this email. Thank you for your consideration.

Attachment Content-Type Size
copy_as_a_srf.diff application/octet-stream 8.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-09-30 21:19:19 Re: COPY as a set returning function
Previous Message Thomas Munro 2016-09-30 20:32:01 Re: ICU integration