Re: invalidly encoded strings

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: invalidly encoded strings
Date: 2007-09-10 03:22:26
Message-ID: 16212.1189394546@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> Would stringTypeDatum() in parse_type.c be a good place to put the
> pg_verifymbstr()?

Probably not, in its current form, since it hasn't got any idea where
the "char *string" came from; moreover it is not in any better position
than the typinput function to determine whether there was a bogus
embedded null.

OTOH, there may be no decent way to fix the embedded-null problem
other than by hacking the scanner to reject \0 immediately. If we
did that it would give us more flexibility about where to put the
encoding validity checks.
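
To make the "reject \0 immediately" idea concrete, here is a minimal
sketch in plain C. It is not PostgreSQL's scanner code:
decode_literal_rejecting_nul is a hypothetical helper that only mimics
what a lexer rule could do while it decodes backslash escapes, and it
handles nothing beyond octal escapes and a literal pass-through for
other escaped characters.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code).
 * Decodes octal escapes (\o, \oo, \ooo) in the body of a string literal and
 * fails as soon as an escape would produce a zero byte.
 * "out" must have room for strlen(in) + 1 bytes.
 * Returns true on success, with a NUL-terminated result in "out".
 */
static bool
decode_literal_rejecting_nul(const char *in, char *out)
{
    while (*in != '\0')
    {
        if (in[0] == '\\' && in[1] >= '0' && in[1] <= '7')
        {
            unsigned int value = 0;
            int          ndigits = 0;

            in++;               /* skip the backslash */
            while (ndigits < 3 && *in >= '0' && *in <= '7')
            {
                value = value * 8 + (unsigned int) (*in - '0');
                in++;
                ndigits++;
            }
            /* Only the low byte is stored, so test exactly that byte. */
            if ((value & 0xFF) == 0)
                return false;   /* embedded null: reject right here */
            *out++ = (char) (value & 0xFF);
        }
        else if (in[0] == '\\' && in[1] != '\0')
        {
            in++;               /* other escapes: keep the next char as-is */
            *out++ = *in++;
        }
        else
            *out++ = *in++;     /* ordinary character */
    }
    *out = '\0';
    return true;
}
```

Because the rejection happens at decode time, nothing downstream ever
sees a "char *string" that has been silently truncated at the null,
which is what would leave later checks free to live elsewhere.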

In any case, I feel dubious that checking in stringTypeDatum will cover
every code path. Somewhere around where A_Const gets transformed to
Const seems like it'd be a better plan. (But I think that in most
utility statement parsetrees, A_Const never does get transformed to
Const; and there seem to be a few places in gram.y where an SCONST
gives rise to something other than A_Const; so this is still not a
bulletproof choice, at least not without additional changes.)

In the short run it might be best to do it in scan.l after all. A few
minutes' thought about what it'd take to delay the decisions till later
yields a depressingly large number of changes; and we do not have time
to be developing mostly-cosmetic patches for 8.3. Given that
database_encoding is frozen for any one DB at the moment, and that that
is unlikely to change in the near future, insisting on a solution that
allows it to vary is probably unreasonable at this stage of the game.
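
For illustration only, this is roughly the kind of check that could run
on a fully decoded literal once the encoding is known and fixed. The
sketch assumes a UTF8 database and uses a hypothetical standalone
function, utf8_is_valid; it is a stand-in for the idea behind
pg_verifymbstr(), not that function's actual code or signature.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an encoding check (assumption: UTF-8 only).
 * Returns true if the len bytes at s are well-formed UTF-8 with no
 * embedded zero byte, no overlong forms, and no surrogate code points.
 */
static bool
utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        unsigned int  cp;       /* decoded code point */
        size_t        n;        /* number of continuation bytes expected */
        size_t        j;

        if (c == 0x00)
            return false;       /* embedded null */
        if (c < 0x80)
        {
            i++;                /* plain ASCII */
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; }
        else
            return false;       /* stray continuation byte or invalid lead */

        if (len - i <= n)
            return false;       /* sequence truncated at end of string */

        for (j = 1; j <= n; j++)
        {
            if ((s[i + j] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (s[i + j] & 0x3F);
        }

        /* Reject overlong encodings, surrogates, and out-of-range values. */
        if ((n == 1 && cp < 0x80) ||
            (n == 2 && cp < 0x800) ||
            (n == 3 && cp < 0x10000))
            return false;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        i += n + 1;
    }
    return true;
}
```

A check like this only makes sense at scan time because the encoding
cannot change under it; if database_encoding could vary, the same bytes
would need to be validated later, against whatever encoding applied then.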

regards, tom lane
