Re: Why format() adds double quote?

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why format() adds double quote?
Date: 2016-01-20 07:50:32
Message-ID: CAFj8pRDOHSNR5L6bvChrO2Do2-avOORtUwJq2XmrsMCHteO6gg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2016-01-20 7:20 GMT+01:00 Tatsuo Ishii <ishii(at)postgresql(dot)org>:

> > 2016-01-20 3:47 GMT+01:00 Tatsuo Ishii <ishii(at)postgresql(dot)org>:
> >
> >> test=# select format('%I', t) from t1;
> >> format
> >> ----------
> >> aaa
> >> "AAA"
> >> "あいう"
> >> (3 rows)
> >>
> >> Why is the text value of the third line needed to be double quoted?
> >> (note that it is a multi byte character). Same thing can be said to
> >> quote_ident().
> >>
> >> We treat identifiers made of the multi byte characters without double
> >> quotation (non delimited identifier) in other places.
> >>
> >> test=# create table t2(あいう text);
> >> CREATE TABLE
> >> test=# insert into t2 values('aaa');
> >> INSERT 0 1
> >> test=# select あいう from t2;
> >> あいう
> >> --------
> >> aaa
> >> (1 row)
> >
> > format uses same routine as quote_ident. So quote_ident should be fixed
> > first.
>
> Yes, I had that in my mind too.
>
> Attached is the proposed patch to fix the bug.
> Regression tests passed.
>
> Here is an example after the patch. Note that the third row is not
> quoted any more.
>
> test=# select format('%I', あいう) from t2;
> format
> --------
> aaa
> "AAA"
> あああ
> (3 rows)
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> diff --git a/src/backend/utils/adt/ruleutils.c
> b/src/backend/utils/adt/ruleutils.c
> index 3783e97..b93fc27 100644
> --- a/src/backend/utils/adt/ruleutils.c
> +++ b/src/backend/utils/adt/ruleutils.c
> @@ -9405,7 +9405,7 @@ quote_identifier(const char *ident)
> * would like to use <ctype.h> macros here, but they might yield
> unwanted
> * locale-specific results...
> */
> - safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_');
> + safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' ||
> IS_HIGHBIT_SET(ident[0]));
>
> for (ptr = ident; *ptr; ptr++)
> {
> @@ -9413,7 +9413,8 @@ quote_identifier(const char *ident)
>
> if ((ch >= 'a' && ch <= 'z') ||
> (ch >= '0' && ch <= '9') ||
> - (ch == '_'))
> + (ch == '_') ||
> + (IS_HIGHBIT_SET(ch)))
> {
> /* okay */
> }
>
>
This patch ls simply - I remember I was surprised, so we allow any
multibyte char few months ago.

+1

Pavel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shulgin, Oleksandr 2016-01-20 07:50:43 Re: Stream consistent snapshot via a logical decoding plugin as a series of INSERTs
Previous Message Andres Freund 2016-01-20 07:34:59 Re: Logical decoding on standby