pg_stat_statements vs escape_string_warning

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: pg_stat_statements vs escape_string_warning
Date: 2015-01-21 18:14:59
Message-ID: 25700.1421864099@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I happened to notice that if you run the regression tests with
pg_stat_statements installed, you will often (not always) get
a failure that looks like this:

*** src/test/regress/expected/plpgsql.out Tue Jan 20 12:01:52 2015
--- src/test/regress/results/plpgsql.out Wed Jan 21 12:43:19 2015
***************
*** 4672,4677 ****
--- 4672,4683 ----
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
QUERY: SELECT 'foo\\bar\041baz'
CONTEXT: PL/pgSQL function strtest() line 4 at RETURN
+ WARNING: nonstandard use of \\ in a string literal
+ LINE 1: SELECT 'foo\\bar\041baz'
+ ^
+ HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
+ QUERY: SELECT 'foo\\bar\041baz'
+ CONTEXT: PL/pgSQL function strtest() line 4 at RETURN
strtest
-------------
foo\bar!baz

That is, we're getting an extra copy of the "escape string warning"
message. Investigation disclosed that the reason is that
pg_stat_statements' pgss_post_parse_analyze hook re-runs the lexer
over the query string (see fill_in_constant_lengths()), so that you
get an extra instance of any side-effects in the lexer.

This is kind of annoying, especially since it's nondeterministic ---
if there's already a pg_stat_statements entry matching "SELECT ?" then
you don't see the extra warning.

What I'm inclined to do about this is add an escape_string_warning bool
field to struct core_yy_extra_type, which would be copied from the GUC
variable by scanner_init(); then check_string_escape_warning() would
consult that field instead of consulting the GUC directly. It would
then be possible for fill_in_constant_lengths to reset that field after
calling scanner_init so that no warnings appear during its reparse.

As a matter of cleanliness I'm inclined to do the same with
backslash_quote and standard_conforming_strings, so that callers of the
core lexer are not tied to using the prevailing GUC settings but can
control all these things.

This isn't a back-patchable bug fix, but given the lack of prior
complaints, maybe it doesn't matter. Alternatively, we could back-patch
only the addition of escape_string_warning to the struct: that would fit
into padding space in the struct so that there would be no ABI risk.

Comments, objections?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2015-01-21 18:28:12 Re: pg_stat_statements vs escape_string_warning
Previous Message Bruce Momjian 2015-01-21 17:50:00 Re: moving from contrib to bin