Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

From: Daniel Farina <drfarina(at)acm(dot)org>
To: Greg Smith <greg(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Hannu Krosing <hannu(at)krosing(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Daniel Farina <dfarina(at)truviso(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Date: 2009-12-05 08:32:04
Message-ID: 7b97c5a40912050032g5651c257q49b9cdca6da82e2d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 30, 2009 at 12:14 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Jeff Davis wrote:
>
> COPY target FROM FUNCTION foo() WITH RECORDS;
>
>
> In what format would the records be?

As a not-quite aside, I think I have a better syntax suggestion. I
have recently been digging around in the grammar with the changes made
in the following commit:

commit a6833fbb85cb5212a9d8085849e7281807f732a6
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Mon Sep 21 20:10:21 2009 +0000

Define a new, more extensible syntax for COPY options.

This is intentionally similar to the recently revised syntax for EXPLAIN
options, ie, (name value, ...). The old syntax is still supported for
backwards compatibility, but we intend that any options added in future
will be provided only in the new syntax.

Robert Haas, Emmanuel Cecchet

As it turns out, the following syntax may work pretty well:

COPY y TO FUNCTION (setup_function the_setup_function('some arg', 3, 7, 42))

Basically the func_expr reduction fits very neatly into the
copy_generic_opt_elem reduction:

copy_generic_opt_elem:
ColLabel copy_generic_opt_arg
{
$$ = (Node *) makeDefElem($1, $2);
}
| ColLabel func_expr
{
$$ = (Node *) $2;
}
;

Now we can use more or less any reasonable number of symbol names and
function calls we desire. This makes life considerably easier, I
think...

We can also try to refactor COPY's internals to take advantage of
these features (and potentially reduce the number of mechanisms. For
example, the legacy "COPY ... TO '/place' WITH CSV" perhaps can be
more verbosely/generically expressed as:

COPY ... TO FUNCTION (setup_function to_file('/place'),
record_converter csv_formatter,
stream_function fwrite
end_function fclose);

We can also add auxiliary symbols for error handling behavior. For
example, were the COPY to fail for some reason maybe it would make
sense "on_error" to call "unlink" to clean up the partially finished
file.

I also have what I think is a somewhat interesting hack. Consider
some of the functions up there without arguments (presumably they'd be
called with a somewhat fixed contract the mechanics of COPY itself):
how does one disambiguate them? Ideally, one could sometimes use
literal arguments (when the return value of that function is desired
to be threaded through the other specified functions) and other times
it'd be nice to disambiguate functions via type names. That would
look something like the following:

COPY ... TO FUNCTION (setup_function to_file('/place'),
record_converter csv_formatter(record),
stream_function fwrite(bytea),
end_function fclose(text));

I think this is possible to implement without much ambiguity, drawing
on the observation that the COPY statement does not have -- and
probably will never have -- references via Var(iable) node, unlike
normal SQL statements such as SELECT, INSERT, et al. That means we
might be able disambiguate using the following rules when scanning the
funcexpr's arguments during the semantic analysis phase to figure out
what to do:

* Only literal list items found: it's a function call with the types
of those literals. Ex: my_setup_function('foo'::text, 3)

* Only non-literal list items found: it's type specifiers. Ex:
csv_formatter(record).

* Both literal and non-literal values found: report an error.

This only works because no cases where a non-literal quantity could be
confused with a type name come to mind. If one could name a type "3"
and being forced to double-quote "3" to get your type disambiguated
was just too ugly, then we are at an impasse. But otherwise I think
this may work quite well.

Common constellations of functions could perhaps be bound together
into a DDL to reduce the amount of symbol soup going on here, but that
seems like a pretty clean transition strategy at some later time.
Most of the functionality could still be captured with this simple
approach for now...

Also note that factoring out high-performance implementations of
things like csv_formatter (and friends: pg_binary_formatter) will
probably take some time, but ultimately I think all the existing
functionality could be realized as a layer of syntactic sugar over
this mechanism.

fdr

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hitoshi Harada 2009-12-05 08:50:33 Re: add more frame types in window functions (ROWS)
Previous Message Tom Lane 2009-12-05 06:21:22 Re: First feature patch for plperl - draft [PATCH]