Re: Optimizing COPY

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Optimizing COPY
Date:	2008-10-30 13:29:33
Message-ID:	11008.1225373373@sss.pgh.pa.us
Lists:	pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> The basic idea is to replace the custom loop in CopyReadLineText with
> memchr(), because memchr() is very fast. To make that possible, perform
> the client-server encoding conversion on each raw block that we read in,
> before splitting it into lines. That way CopyReadLineText only needs to
> deal with server encodings, which is required for the memchr() to be safe.
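
As an illustration of the quoted idea (not code from the patch), here is a minimal sketch of finding line ends with memchr() once the buffer is already in a server encoding. The helper name next_line_len is hypothetical. The sketch assumes, as the quoted text implies, that in a server encoding a '\n' byte can never be part of a multibyte character, so a plain byte search is safe; it ignores quoting, escapes, and '\r' handling that the real COPY code must deal with.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code: return the length of the first
 * line in buf, including its terminating '\n', or 0 if the buffer does not
 * yet contain a complete line.
 *
 * Assumption: buf is already in a server encoding, where the byte 0x0A
 * only ever means newline and never appears inside a multibyte character.
 */
static size_t
next_line_len(const char *buf, size_t len)
{
    const char *nl;

    assert(buf != NULL);

    /* memchr() scans at most len bytes and returns NULL if none match. */
    nl = memchr(buf, '\n', len);

    return nl ? (size_t) (nl - buf) + 1 : 0;
}
```

A return value of 0 would tell the caller to load another block and retry, rather than treating the buffered bytes as a line.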

Okay, so of course the trick with that is the block boundary handling.
The protocol says the client can break the data apart however it likes.
I see you've tried to deal with that, but this part seems wrong:

> ! 		if (convertable_bytes == 0)
> ! 		{
> ! 			/*
> ! 			 * EOF, and there was some unconvertable chars at the end.
> ! 			 * Call pg_client_to_server on the remaining bytes, to
> ! 			 * let it throw an error.
> ! 			 */
> ! 			cvt = pg_client_to_server(raw, inbytes);
> ! 			Assert(false); /* pg_client_to_server should've errored */
> ! 		}

You're not (AFAICS) definitely at EOF here; you might just have gotten
a pathologically short message.
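
To make the concern concrete, here is a rough sketch of the distinction, not a proposed fix for the patch. It assumes UTF-8 as the client encoding purely for illustration; the names utf8_seq_len, convertible_prefix_len, decide_block_action, and the at_eof flag are all hypothetical. The point is that "zero convertible bytes" is an error only when the source has really reported EOF; otherwise the partial character should be kept and more data requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length of a UTF-8 sequence given its lead byte; 0 if not a lead byte. */
static size_t
utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/*
 * Return how many leading bytes of buf can be converted now. If the block
 * ends in the middle of a character, the partial tail is excluded so the
 * caller can carry it over to the next block. Invalid input is passed
 * through in full so that the converter can report it.
 */
static size_t
convertible_prefix_len(const unsigned char *buf, size_t len)
{
    size_t max_back = len < 4 ? len : 4;
    size_t back;

    for (back = 1; back <= max_back; back++)
    {
        unsigned char c = buf[len - back];
        size_t need;

        if ((c & 0xC0) == 0x80)
            continue;               /* continuation byte, look further back */

        need = utf8_seq_len(c);
        if (need > back)
            return len - back;      /* sequence is cut off by the block end */
        return len;                 /* complete, or invalid: let converter judge */
    }
    return len;                     /* empty, or only continuation bytes */
}

typedef enum
{
    CONVERT_PREFIX,     /* convert the whole characters we have */
    NEED_MORE_DATA,     /* short message: keep the tail and read again */
    REPORT_ERROR,       /* true EOF with a dangling partial character */
    INPUT_DONE          /* true EOF and nothing left over */
} BlockAction;

/* at_eof must come from the data source itself, not be inferred from sizes. */
static BlockAction
decide_block_action(size_t convertible, size_t inbytes, bool at_eof)
{
    assert(convertible <= inbytes);

    if (convertible > 0)
        return CONVERT_PREFIX;
    if (!at_eof)
        return NEED_MORE_DATA;
    return inbytes > 0 ? REPORT_ERROR : INPUT_DONE;
}
```

With this shape, a one-byte message holding only the first byte of a two-byte character yields NEED_MORE_DATA rather than an encoding error.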

regards, tom lane

In response to

Optimizing COPY at 2008-10-30 13:14:14 from Heikki Linnakangas