Re: Bug #728: Interactions between bytea and character encoding

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Anders Hammarquist <iko(at)strakt(dot)com>, pgsql-bugs(at)postgresql(dot)org, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Subject: Re: Bug #728: Interactions between bytea and character encoding
Date: 2002-08-04 05:26:28
Message-ID: 3D4CBB04.4090906@joeconway.com
Lists: pgsql-bugs

Tom Lane wrote:
> Ah. So the issue is that ANALYZE tries to do textin(byteaout(...))
> in order to produce a textual representation of the most common value
> in the BYTEA column, and apparently textin feels that the string
> generated by byteaout is not legal text. While Joe says that the
> problem has gone away in CVS tip, I'm not sure I believe that.

I didn't either, except I tried it and it worked ;-) But you're
undoubtedly correct that there are other cases which would break the
current code.

> A possible answer is to change the pg_statistics columns from text to
> some other less picky datatype. (bytea maybe ;-)) Or should we
> conclude that text is broken and needs to be fixed? Choice #3 would
> be "bytea is broken and needs to be fixed", but I don't care for that
> answer --- if bytea can produce an output string that will break
> pg_statistics, then so can some other future datatype.

BYTEA sounds like the best answer to me. TEXT is supposed to honor
character-set-specific peculiarities, while bytea should be able to
represent any arbitrary sequence of bytes.

Joe
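
The distinction between the two types can be illustrated outside the
server. What follows is a minimal Python sketch, not PostgreSQL code: the
helpers is_valid_text and escape_bytes are hypothetical names made up for
this illustration, and UTF-8 stands in for whatever multibyte encoding the
database uses. It shows that an arbitrary byte string may fail validation
as text, while a form that octal-escapes every non-printable-ASCII byte
always passes.

```python
def is_valid_text(raw: bytes, encoding: str = "utf-8") -> bool:
    """Return True if raw decodes cleanly in the given character encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def escape_bytes(raw: bytes) -> str:
    """Hypothetical escaper (NOT PostgreSQL's byteaout): backslash is
    doubled, printable ASCII is kept, everything else becomes \\ooo octal."""
    out = []
    for b in raw:
        if b == 0x5C:                 # backslash
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:       # printable ASCII
            out.append(chr(b))
        else:                         # control or high-bit byte
            out.append("\\%03o" % b)
    return "".join(out)


# Arbitrary binary data: 0xFF can never occur in well-formed UTF-8.
arbitrary = bytes([0x00, 0xFF, 0xC3, 0x28])

raw_ok = is_valid_text(arbitrary)                  # raw bytes treated as text
escaped = escape_bytes(arbitrary)                  # pure-ASCII representation
escaped_ok = is_valid_text(escaped.encode("ascii"))

print(raw_ok, escaped, escaped_ok)
```

The raw bytes are rejected as text, whereas the fully escaped form is plain
ASCII and therefore valid in any ASCII-compatible encoding. A statistics
column typed as bytea would sidestep the validation step altogether.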
