Quick Links

Re: slow queries over information schema.tables

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject:	Re: slow queries over information schema.tables
Date:	2018-12-20 05:24:37
Message-ID:	23817.1545283477@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

I wrote:
> ... However, I wonder what people would think of a
> more aggressive approach, viz:
> -CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
> +CREATE DOMAIN sql_identifier AS name;
>
> I've not checked to verify that sql_identifier is used for all and only
> those view columns that expose "name" catalog columns. If the SQL
> committee was sloppy about that, this idea might not work.

I poked into this by instrumenting the parser to see what type conversions
it inserts into the information_schema views. It appears that the vast
majority of conversions to sql_identifier are indeed on "name" columns,
but we have some occurrences of cases like this:

CAST(a.attnum AS sql_identifier) AS dtd_identifier,

and some like this:

CAST(p.proname || '_' || CAST(p.oid AS text) AS sql_identifier) AS specific_name

It doesn't look to me like converting to name rather than varchar would
have any real performance consequence in either case: certainly we're not
going to be able to propagate WHERE conditions on these view columns back
to any existing catalog index, regardless of the cast. However, the
second case offers something else to worry about: what if the
concatenation yields a string longer than NAMEDATALEN? As the code
stands, the view will simply return a string that's too long to be a name,
which arguably is a violation of SQL spec. If we change sql_identifier
to be "name", the cast will silently truncate, which also arguably is a
violation of SQL spec, because I think specific_name is supposed to be
unique. (The point of concatenating the function OID is to make it so.)

Perhaps we could fix this by truncating the p.proname part to ensure
that the concatenation result fits in NAMEDATALEN. I'm not sure about
a good way to get a correct value of NAMEDATALEN into the
information_schema script, though. Worse, I don't think we expose any
convenient way to truncate a string based on byte length rather than
character length (substr() does the latter). So I think that a reasonable
way to tackle this might be to provide a C function along the lines of

nameconcatoid(name, oid) returns name

which contracts to produce "$1 || '_' || $2" while truncating $1 only as
much as needed to make the result fit in NAMEDATALEN.

regards, tom lane

In response to

Re: slow queries over information schema.tables at 2018-12-19 23:14:37 from Tom Lane

Responses

Re: slow queries over information schema.tables at 2018-12-21 18:46:55 from Greg Stark

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2018-12-20 05:33:52	Re: [PATCH] Improve tab completion for CREATE TABLE
Previous Message	Pavel Stehule	2018-12-20 05:21:38	Re: slow queries over information schema.tables