Quick Links

Re: slow queries over information schema.tables

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject:	Re: slow queries over information schema.tables
Date:	2018-12-19 23:14:37
Message-ID:	9288.1545261277@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

I wrote:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> Slow query
>> select * from information_schema.tables where table_name = 'pg_class';

> Yeah. This has been complained of many times before.

> The core of the problem, I think, is that we're unable to convert the
> condition on table_name into an indexscan on pg_class.relname, because
> the view has cast pg_class.relname to the sql_identifier domain.

> There are two different issues in that. One is that the domain might
> have constraints (though in reality it does not), so the planner can't
> throw away the CoerceToDomain node, and thus can't match the expression
> to the index. Even if we did throw away the CoerceToDomain, it still
> would not work because the domain is declared to be over varchar, and
> so there's a cast-to-varchar underneath the CoerceToDomain.

After my last few commits, the only issue that's left here is the
cast-to-varchar implied by casting to sql_identifier. Upthread
I showed a possible planner hack to get rid of that, and we could
still solve it that way so far as allowing indexscans on catalogs
is concerned. However, I wonder what people would think of a
more aggressive approach, viz:

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 0fbcfa8..3891e3b 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -216,7 +216,7 @@ CREATE DOMAIN character_data AS character varying COLLATE "C";
* SQL_IDENTIFIER domain
*/

-CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
+CREATE DOMAIN sql_identifier AS name;

I've not checked to verify that sql_identifier is used for all and only
those view columns that expose "name" catalog columns. If the SQL
committee was sloppy about that, this idea might not work. But assuming
that the length restriction is valid for the columns that have this
type, would this be an OK idea? It does seem to fix the poor-plan-quality
problem at a stroke, with no weird planner hacks.

What I find in the SQL spec is

5.5 SQL_IDENTIFIER domain

Function

Define a domain that contains all valid <identifier body>s and
<delimited identifier body>s.

Definition

CREATE DOMAIN SQL_IDENTIFIER AS
CHARACTER VARYING (L)
CHARACTER SET SQL_IDENTIFIER;

GRANT USAGE ON DOMAIN SQL_IDENTIFIER
TO PUBLIC WITH GRANT OPTION;

Description

1) This domain specifies all variable-length character values that
conform to the rules for formation and representation of an SQL
<identifier body> or an SQL <delimited identifier body>.

NOTE 4 - There is no way in SQL to specify a <domain
constraint> that would be true for the body of any valid SQL
<regular identifier> or <delimited identifier> and false for all
other character string values.

2) L is the implementation-defined maximum length of <identifier
body> and <delimited identifier body>.

So we'd be violating the part of the spec that says that the domain's
base type is varchar, but considering all the other requirements here
that we're blithely ignoring, maybe that's not such a sin. With the
recent collation changes, type name is hard to functionally distinguish
from a domain over varchar anyway. Furthermore, since name's length limit
corresponds to the "implementation-defined maximum length" part of the
spec, you could argue that in some ways this definition is closer to the
spec than what we've got now.

Thoughts?

regards, tom lane

In response to

Re: slow queries over information schema.tables at 2018-12-05 17:24:54 from Tom Lane

Responses

Re: slow queries over information schema.tables at 2018-12-20 03:31:59 from Pavel Stehule
Re: slow queries over information schema.tables at 2018-12-20 05:24:37 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2018-12-19 23:22:28	Re: GIN predicate locking slows down valgrind isolationtests tremendously
Previous Message	Robert Haas	2018-12-19 23:04:26	Re: Why are we PageInit'ing buffers in RelationAddExtraBlocks()?