Re: Bug #728: Interactions between bytea and character encoding when doing analyze

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Anders Hammarquist <iko(at)strakt(dot)com>, pgsql-bugs(at)postgresql(dot)org, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Subject: Re: Bug #728: Interactions between bytea and character encoding when doing analyze
Date: 2002-08-04 02:25:54
Message-ID: 6274.1028427954@sss.pgh.pa.us
Lists: pgsql-bugs

Joe Conway <mail(at)joeconway(dot)com> writes:
> (gdb) bt
> #0  pg_verifymbstr (mbstr=0x837a698 "42", len=2) at wchar.c:541
> #1  0x08149c26 in textin (fcinfo=0xbfffeca0) at varlena.c:191
> #2  0x08160579 in DirectFunctionCall1 (func=0x8149c00 <textin>,
>     arg1=137864856) at fmgr.c:657
> #3  0x080bbffa in update_attstats (relid=74723, natts=2,
>     vacattrstats=0x8379f58) at analyze.c:1740

Ah. So the issue is that ANALYZE tries to do textin(byteaout(...))
in order to produce a textual representation of the most common value
in the BYTEA column, and apparently textin feels that the string
generated by byteaout is not legal text. While Joe says that the
problem has gone away in CVS tip, I'm not sure I believe that.
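
To make the suspected failure mode concrete, here is a toy model in Python. It is not the backend code: escape_bytes and is_valid_text are hypothetical stand-ins for byteaout and pg_verifymbstr, and the escape_high switch is purely an assumption used to show how an output routine that lets high-bit bytes through unescaped could hand the text input routine a string that fails multibyte verification.

```python
def escape_bytes(data: bytes, escape_high: bool = True) -> bytes:
    """Toy bytea-style output (hypothetical, not PostgreSQL's byteaout).

    Printable ASCII is copied through, a backslash is doubled, and other
    bytes become backslash-octal escapes. With escape_high=False, bytes
    >= 0x80 are passed through raw, which is the assumed failure case.
    """
    out = bytearray()
    for b in data:
        if b == 0x5C:                      # backslash
            out += b"\\\\"
        elif 0x20 <= b <= 0x7E:            # printable ASCII
            out.append(b)
        elif b < 0x80 or escape_high:      # control byte, or escaped high byte
            out += ("\\%03o" % b).encode("ascii")
        else:                              # raw high-bit byte leaks through
            out.append(b)
    return bytes(out)


def is_valid_text(s: bytes, encoding: str = "utf-8") -> bool:
    """Toy stand-in for a multibyte verifier: does s decode cleanly?"""
    try:
        s.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


sample = bytes([0x34, 0x32, 0xE9, 0x00, 0x5C])   # "42", 0xE9, NUL, backslash

safe = escape_bytes(sample)                       # everything risky escaped
leaky = escape_bytes(sample, escape_high=False)   # 0xE9 left raw

print(safe, is_valid_text(safe))
print(leaky, is_valid_text(leaky))
```

In this model the fully escaped form is plain ASCII and always verifies, while the leaky form contains a lone 0xE9 that is not a complete UTF-8 sequence. Whether the real byteaout behaves like either variant depends on the server version and build, so treat this only as an illustration of the interaction, not a diagnosis.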

A possible answer is to change the pg_statistic columns from text to
some other less picky datatype. (bytea maybe ;-)) Or should we
conclude that text is broken and needs to be fixed? Choice #3 would
be "bytea is broken and needs to be fixed", but I don't care for that
answer --- if bytea can produce an output string that will break
pg_statistic, then so can some other future datatype.

Comments?

regards, tom lane
