Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Daniel Farina <drfarina(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Hannu Krosing <hannu(at)krosing(dot)net>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Daniel Farina <dfarina(at)truviso(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Date: 2009-11-30 02:35:45
Message-ID: 1259548545.3355.43.camel@jdavis
Lists: pgsql-hackers

On Thu, 2009-11-26 at 18:30 -0800, Daniel Farina wrote:
> Okay, so this thread sort of wandered into how we could refactor other
> elements of COPY. Do we have a good sense on what we should do to the
> current patch (or at least the idea represented by it) to get it into
> a committable state within finite time?

We're in the middle of a commitfest, so a lot of hackers are
concentrating on other patches. In a week or two, when it winds down,
people will be more willing to make decisions on new proposals and
designs. I still think this thread has been productive.

> I think adding a bytea and/or text mode is one such improvement... I
> am still reluctant to give up on INTERNAL because the string buffer
> passed in the INTERNAL scenario is ideal for C programmers -- the
> interface is even simpler than dealing with varlena types. But I
> agree that auxiliary modes should exist to enable easier hacking.

I like the idea of an internal mode as well. We may need some kind of
performance numbers to justify avoiding the extra memcpy, though.

> The thorniest issue in my mind is how state can be initialized,
> retained, and/or modified between calls to the bytestream-acceptance
> function.
>
> Arguably it is already in a state where it is no worse than dblink,
> which itself has a global hash table to manage state.

The idea of using a separate type of object (e.g. "CREATE COPYSTREAM")
to bundle the init/read/write/end functions together might work. That
also allows room to specify what the functions should accept
(internal/bytea/text).

I think that's the most elegant solution (at least it sounds good to
me), but others might not like the idea of a new object type just for
this feature. Perhaps if it fits nicely within an overall SQL/MED-like
infrastructure, it will be easier to justify.
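
To make the bundling idea concrete, here is a rough sketch in plain C of
what such an object might tie together. Everything in it is hypothetical
(CopyStream, CopyStreamMode, run_copy and the toy byte-counting stream are
names I made up for illustration, not anything in the patch or the
backend), and the read side is left out to keep it short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: which representation the stream's functions accept. */
typedef enum {
    COPYSTREAM_INTERNAL,
    COPYSTREAM_BYTEA,
    COPYSTREAM_TEXT
} CopyStreamMode;

/* Hypothetical bundle of callbacks that one "copystream" object names. */
typedef struct CopyStream {
    CopyStreamMode mode;
    void  *(*init)(void);                                     /* set up per-COPY state */
    void   (*write)(void *state, const char *buf, size_t len); /* consume one chunk */
    size_t (*end)(void *state);                               /* clean up, return bytes seen */
} CopyStream;

/* Toy stream: counts the bytes it is handed. */
typedef struct { size_t nbytes; } CountState;

static void *count_init(void)
{
    CountState *s = malloc(sizeof *s);
    if (s != NULL)
        s->nbytes = 0;
    return s;
}

static void count_write(void *state, const char *buf, size_t len)
{
    (void) buf;                         /* contents are ignored by this toy */
    ((CountState *) state)->nbytes += len;
}

static size_t count_end(void *state)
{
    size_t n = ((CountState *) state)->nbytes;
    free(state);
    return n;
}

const CopyStream count_stream = {
    COPYSTREAM_INTERNAL, count_init, count_write, count_end
};

/* Drive one COPY through a stream: init once, write each chunk, end once. */
size_t run_copy(const CopyStream *cs, const char *const *chunks, size_t nchunks)
{
    assert(cs != NULL);
    void *state = cs->init();
    assert(state != NULL);
    for (size_t i = 0; i < nchunks; i++)
        cs->write(state, chunks[i], strlen(chunks[i]));
    return cs->end(state);
}
```

The point of the sketch is only that the state lives between init and end
and is owned by the stream, rather than sitting in a global hash table the
way dblink does it.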

> Also, if you look carefully at the dblink test suite I submitted,
> you'll see an interesting trick: one can COPY from multiple sources
> consecutively to a single COPY on a remote node when in text mode
> (binary mode has a header that cannot be so neatly catenated). This
> is something that's pretty hard to enable with any automatic
> startup-work-cleanup approach.

What if the network buffer is flushed in the middle of a line? Is that
possible, or is there a guard against that somewhere?
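
To illustrate the kind of guard I have in mind, here is a minimal sketch
in plain C. It is not taken from the patch, and LineGuard and its
functions are hypothetical names. It assumes text mode where every raw
newline byte ends a row (I believe that holds for what COPY TO emits,
since newlines in data are escaped, but I have not verified it for this
code path):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracker for whether the stream sits on a row boundary. */
typedef struct {
    size_t pending_len;     /* bytes seen since the last newline */
    size_t complete_lines;  /* rows fully received so far */
} LineGuard;

void lineguard_init(LineGuard *g)
{
    assert(g != NULL);
    g->pending_len = 0;
    g->complete_lines = 0;
}

/* Feed one chunk exactly as it arrived; chunks may split a row anywhere. */
void lineguard_feed(LineGuard *g, const char *buf, size_t len)
{
    assert(g != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            g->complete_lines++;
            g->pending_len = 0;     /* row finished, nothing pending */
        } else {
            g->pending_len++;       /* still inside a row */
        }
    }
}

/* Nonzero only when no partial row is pending, i.e. it would be safe
 * to start sending rows from a second source. */
int lineguard_at_boundary(const LineGuard *g)
{
    return g->pending_len == 0;
}
```

If the first source's last chunk ends mid-row, lineguard_at_boundary()
returns 0, and starting the next COPY at that point would splice two rows
together on the remote side.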

Regards,
Jeff Davis
