Re: BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Cc: digoal(at)126(dot)com
Subject: Re: BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding
Date: 2014-02-20 01:22:13
Message-ID: 4385.1392859333@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

I wrote:
>> The minimum-refactoring solution to this would be to tweak
>> pg_do_encoding_conversion() so that if the src_encoding is SQL_ASCII but
>> the dest_encoding isn't, it does pg_verify_mbstr() rather than nothing.

>> I'm not sure if this would break anything we need to have work,
>> though. Thoughts? Do we want to back-patch such a change?

> I looked through all the callers of pg_do_encoding_conversion(), and
> AFAICS this change is a good idea. There are a whole bunch of places
> that use pg_do_encoding_conversion() to convert from the database encoding
> to encoding X (most usually UTF8), and right now if you do that in a
> SQL_ASCII database you have no assurance whatever that what is produced
> is actually valid in encoding X. I think we need to close that loophole.

The more I looked into mbutils.c, the less happy I got. The attached
proposed patch takes care of the missing-verification hole in
pg_do_encoding_conversion() and pg_server_to_any(), and also gets rid
of what I believe to be obsolete provisions in pg_do_encoding_conversion
to "work" if called outside a transaction --- if you consider it working
to completely fail to honor its API contract. That should no longer be
necessary now that we perform client<->server encoding conversions via
perform_default_encoding_conversion rather than here.

For testing I inserted an "Assert(IsTransactionState());" at the top of
pg_do_encoding_conversion(), and couldn't trigger it, so I'm fairly sure
this change is OK. Still, back-patching it might not be a good thing.
On the other hand, if there is still any code path that can get here
outside a transaction, we probably ought to find out rather than allow
completely bogus data to be returned.

I also made some more-cosmetic improvements, notably removing a bunch
of Asserts that are certainly dead code because the relevant variables
are never NULL.

I've not done anything yet about simplifying unnecessary calls of
pg_do_encoding_conversion into pg_server_to_any/pg_any_to_server.

How much of this is back-patch material, do you think?

regards, tom lane

Attachment Content-Type Size
mbutils-fixes.patch text/x-diff 16.6 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2014-02-20 04:25:55 Re: BUG #9278: Error: SQLSTATE[42702]: Ambiguous column: 7 ERROR: column reference "tid" is ambiguous LINE 8: ...
Previous Message Tom Lane 2014-02-19 23:44:00 Re: Connection errors in PostgreSQL 9.3.2

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2014-02-20 01:27:34 Re: should we add a XLogRecPtr/LSN SQL type?
Previous Message Michael Paquier 2014-02-20 01:06:01 Re: should we add a XLogRecPtr/LSN SQL type?