Re: Add ENCODING option to COPY

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add ENCODING option to COPY
Date: 2011-02-04 14:55:25
Message-ID: AANLkTimYHaNMKS2gZn98n=v6L45co4GQ66Czrg4uvOkD@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2011/2/4 Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>:
> On Tue, Feb 1, 2011 at 13:08, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
>>> The third patch is attached, modifying mb routines so that they can
>>> receive conversion procedures as FmgrInof * and save the function
>>> pointer in CopyState.
>>> I tested it with encoding option and could not see performance slowdown.
>> Hmm, sorry, the patch was wrong. Correct version is attached.
>
> Here is a brief review for the patch.

Thanks for the review.

> * Syntax: ENCODING encoding vs. ENCODING 'encoding'
> We don't have to quote encoding names in the patch, but we always need
> quotes for CREATE DATABASE WITH ENCODING. I think we should modify
> CREATE DATABASE to accept unquoted encoding names aside from the patch.

I followed the syntax in SET client_encoding TO xxx, where you don't
need quote xxx. I didn't care CREATE DATABASE.

> Changes in pg_wchar.h are the most debatable parts of the patch.
> The patch adds pg_cached_encoding_conversion(). We normally use
> pg_do_encoding_conversion(), but it is slower than the proposed
> function because it lookups conversion proc from catalog every call.
>
> * Can we use #ifndef FRONTEND in the header?
> Usage of fmgr.h members will broke client applications without the #ifdef,
> but I guess client apps don't always have definitions of FRONTEND.
> If we don't want to change pg_wchar.h, pg_conversion_fn.h might be
> a better header for the new API because FindDefaultConversion() is in it.

It doesn't look to me like clear solution. FindDefaultConversion() is
implemented in pg_conversion.c, whereas
pg_cached_encoding_conversion() in mbutils.c. Maybe adding another
header, namely pg_wchar_backend.h?

> * CopyState can have conv_proc entity as a member instead of the pointer.
> * need_transcoding checks could be replaced with conv_proc IS NULL check.

No, need_transcoding means you need verification even if the target
and server encoding is the same. See comments in CopyTo().

If pg_wchar_backend .h is acceptable, I'll post the revised patch.

Regards,

--
Hitoshi Harada

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-02-04 15:10:56 Re: SQL/MED - file_fdw
Previous Message Nick Lello 2011-02-04 14:55:01 Re: [HACKERS] Slow count(*) again...