Re: What is the maximum encoding-conversion growth rate, anyway?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: What is the maximum encoding-conversion growth rate, anyway?
Date: 2007-05-29 02:23:42
Message-ID: 24469.1180405422@sss.pgh.pa.us
Lists: pgsql-hackers
Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> I'm afraid we have to make it larger, rather than smaller, for 8.3. For
> example, 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3-byte
> UTF_8 sequences (0x00e3818b and 0x00e3829a). See
> util/mb/Unicode/shift_jis_2004_to_utf8_combined.map for more details.

> So the worst case is now 6, rather than 3.

Yipes.
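
To make the arithmetic concrete, here is a minimal sketch of that expansion. The byte values quoted above, 0xe3818b and 0xe3829a, are the UTF-8 encodings of U+304B and U+309A. The helper names `utf8_encode_bmp` and `expand_82f5` are hypothetical and are not part of the PostgreSQL source; the real conversion is table-driven from the .map file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one code point below U+10000 as UTF-8.
 * Returns the number of bytes written to out (1, 2, or 3).
 */
static size_t
utf8_encode_bmp(uint32_t cp, unsigned char *out)
{
    assert(cp < 0x10000);
    if (cp < 0x80)
    {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char) (0xE0 | (cp >> 12));
    out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (cp & 0x3F));
    return 3;
}

/*
 * Hypothetical helper: emit the combined mapping for the 2-byte
 * SHIFT_JIS_2004 sequence 0x82f5, i.e. U+304B followed by U+309A.
 * out must have room for at least 6 bytes.
 */
static size_t
expand_82f5(unsigned char *out)
{
    size_t n = utf8_encode_bmp(0x304B, out);

    n += utf8_encode_bmp(0x309A, out + n);
    return n;
}
```

Two input bytes yield six output bytes here, so a buffer sized for a single 3-byte result per input character would be too small.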

> Can we add a column to pg_conversion which represents the "growth
> rate"? This would make the rate for most encodings much smaller than
> 6.

We need to do something, but the pg_conversion catalog seems a bad place
to put the info --- don't we have places that need to be able to do
conversion without catalog access?

Perhaps better would be to redefine the API for the conversion functions
so that they palloc their own result space.  Then each conversion
function would have to know the maximum growth rate for its particular
conversion.  This change would also make it feasible for a conversion
function to prescan the data and determine an exact output size, if that
seemed worthwhile because the potential growth rate was too extreme.
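
A rough sketch of what such an API could look like, using LATIN1 to UTF8 as a simple stand-in conversion. Everything here is an assumption made for illustration: the function names, the `prescan` flag, and the growth constant are hypothetical, and `malloc` stands in for `palloc` so the example compiles outside the backend. It is not the existing conversion-function signature.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: each LATIN1 byte becomes at most 2 UTF-8 bytes. */
#define LATIN1_TO_UTF8_MAX_GROWTH 2

/* Prescan pass: compute the exact output length without converting. */
static size_t
latin1_to_utf8_exact_len(const unsigned char *src, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++)
        out += (src[i] < 0x80) ? 1 : 2;
    return out;
}

/*
 * Hypothetical API: the conversion function allocates its own result.
 * With prescan == 0 it sizes the buffer from its own worst-case growth
 * rate; with prescan != 0 it computes the exact size first.
 * Returns a NUL-terminated buffer the caller must free, or NULL on
 * allocation failure.  Overflow of len * growth is not checked here.
 */
static unsigned char *
latin1_to_utf8_alloc(const unsigned char *src, size_t len,
                     size_t *out_len, int prescan)
{
    size_t cap = prescan ? latin1_to_utf8_exact_len(src, len)
                         : len * LATIN1_TO_UTF8_MAX_GROWTH;
    unsigned char *dst = malloc(cap + 1);
    size_t n = 0;

    if (dst == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
    {
        unsigned char c = src[i];

        if (c < 0x80)
            dst[n++] = c;
        else
        {
            /* U+0080..U+00FF encode as two bytes */
            dst[n++] = (unsigned char) (0xC0 | (c >> 6));
            dst[n++] = (unsigned char) (0x80 | (c & 0x3F));
        }
    }
    assert(n <= cap);
    dst[n] = '\0';
    *out_len = n;
    return dst;
}
```

The caller no longer needs a global worst-case constant: each function knows its own bound, and a conversion with an extreme bound can pay for a prescan instead of over-allocating.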

			regards, tom lane
