Re: slow queries over information schema.tables

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: slow queries over information schema.tables
Date: 2018-12-20 05:24:37
Message-ID: 23817.1545283477@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> ... However, I wonder what people would think of a
> more aggressive approach, viz:
> -CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
> +CREATE DOMAIN sql_identifier AS name;
>
> I've not checked to verify that sql_identifier is used for all and only
> those view columns that expose "name" catalog columns. If the SQL
> committee was sloppy about that, this idea might not work.

I poked into this by instrumenting the parser to see what type conversions
it inserts into the information_schema views. It appears that the vast
majority of conversions to sql_identifier are indeed on "name" columns,
but we have some occurrences of cases like this:

CAST(a.attnum AS sql_identifier) AS dtd_identifier,

and some like this:

CAST(p.proname || '_' || CAST(p.oid AS text) AS sql_identifier) AS specific_name

It doesn't look to me like converting to name rather than varchar would
have any real performance consequence in either case: certainly we're not
going to be able to propagate WHERE conditions on these view columns back
to any existing catalog index, regardless of the cast. However, the
second case offers something else to worry about: what if the
concatenation yields a string longer than NAMEDATALEN? As the code
stands, the view will simply return a string that's too long to be a name,
which arguably is a violation of SQL spec. If we change sql_identifier
to be "name", the cast will silently truncate, which also arguably is a
violation of SQL spec, because I think specific_name is supposed to be
unique. (The point of concatenating the function OID is to make it so.)

Perhaps we could fix this by truncating the p.proname part to ensure
that the concatenation result fits in NAMEDATALEN. I'm not sure about
a good way to get a correct value of NAMEDATALEN into the
information_schema script, though. Worse, I don't think we expose any
convenient way to truncate a string based on byte length rather than
character length (substr() does the latter). So I think that a reasonable
way to tackle this might be to provide a C function along the lines of

nameconcatoid(name, oid) returns name

which contracts to produce "$1 || '_' || $2" while truncating $1 only as
much as needed to make the result fit in NAMEDATALEN.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-12-20 05:33:52 Re: [PATCH] Improve tab completion for CREATE TABLE
Previous Message Pavel Stehule 2018-12-20 05:21:38 Re: slow queries over information schema.tables