Re: Augment every test postgresql.conf

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Augment every test postgresql.conf
Date: 2019-05-12 01:56:15
Message-ID: 20190512015615.GD1124997@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 07, 2019 at 07:56:02AM -0400, Andrew Dunstan wrote:
> On Sun, Apr 7, 2019 at 2:41 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> >
> > On Sun, Dec 30, 2018 at 10:32:31AM -0500, Andrew Dunstan wrote:
> > > On 12/30/18 12:53 AM, Noah Misch wrote:
> > > > 2. stats_temp_directory is incompatible with TAP suites that start more than
> > > > one node simultaneously.
> >
> > > The obvious quick fix would be to have PostgresNode.pm set this to the
> > > default after inserting the TEMP_CONFIG file.
> >
> > I'd like to get $SUBJECT in place for variables other than
> > stats_temp_directory, using your quick fix idea. Attached. When its time
> > comes, your stats_temp_directory work can delete that section.
>
> Looks good.

Pushed. This broke 010_dump_connstr.pl on bowerbird, introducing 'invalid
byte sequence for encoding "UTF8"' errors. That's because log_connections
renders this 010_dump_connstr.pl solution insufficient:

# In a SQL_ASCII database, pgwin32_message_to_UTF16() needs to
# interpret everything as UTF8. We're going to use byte sequences
# that aren't valid UTF-8 strings, so that would fail. Use LATIN1,
# which accepts any byte and has a conversion from each byte to UTF-8.
$ENV{LC_ALL} = 'C';
$ENV{PGCLIENTENCODING} = 'LATIN1';

The log_connections message prints before CheckMyDatabase() calls
pg_perm_setlocale() to activate that LATIN1 database encoding. Since
bowerbird does a non-NLS build, GetMessageEncoding()==PG_SQL_ASCII at that
time. Some options:

1. Make this one test explicitly set log_connections = off. This workaround
restores what we had a day ago.

2. Move the log_connections message after CheckMyDatabase() calls
pg_perm_setlocale(), so it gets regular post-startup encoding treatment.
That fixes this particular test. It's still wrong when a database's name
is not valid in that database's encoding.

3. If GetMessageEncoding()==PG_SQL_ASCII, make pgwin32_message_to_UTF16()
assume the text is already UTF8, like it does when not in a transaction.
If UTF8->UTF16 conversion fails, the caller will send untranslated bytes to
write() or ReportEventA().

4. If GetMessageEncoding()==PG_SQL_ASCII, make pgwin32_message_to_UTF16()
return NULL. The caller will always send untranslated bytes to write() or
ReportEventA(). This seems consistent with the SQL_ASCII concept and with
pg_do_encoding_conversion()'s interpretation of SQL_ASCII.

5. When including a datname or rolname value in a message, hex-escape
non-ASCII bytes. They are byte sequences, not text of known encoding.
This preserves the most information, but it's overkill and ugly in the
probably-common case of one encoding across all databases of a cluster.

I'm inclined to do (1) in back branches and (4) in HEAD only. (If starting
fresh today, I would store the encoding of each rolname and dbname or just use
UTF8 for those particular fields.) Other preferences?

Thanks,
nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-05-12 02:43:59 Re: Augment every test postgresql.conf
Previous Message Rikard Falkeborn 2019-05-12 01:18:08 Wrong dead return value in jsonb_utils.c