selecting data from information_schema.columns performance.

From: Steve Martin <smartin(at)nec(dot)co(dot)nz>
To: pgsql-performance(at)postgresql(dot)org
Subject: selecting data from information_schema.columns performance.
Date: 2006-10-02 02:26:52
Message-ID: 452078EC.90300@nec.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I am trying to determine if there is a way to improve the performance
when selecting data from the information_schema.columns view.

We use data from this view to inform our application information on the
columns on a table and is used when data is selected from a table.

Below is the output from EXPLAIN ANALYSE:

smf=> explain analyse select column_name, column_default, is_nullable,
data_type, character_maximum_length, numeric_precision,
numeric_precision_radix,
smf-> numeric_scale, udt_name from information_schema.columns where
table_name = 't_fph_tdrdw' order by ordinal_position;

QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=5228.55..5228.64 rows=38 width=449) (actual
time=567.434..567.467 rows=47 loops=1)
Sort Key: (a.attnum)::information_schema.cardinal_number
-> Hash Join (cost=5071.47..5227.55 rows=38 width=449) (actual
time=547.207..567.113 rows=47 loops=1)
Hash Cond: ("outer".oid = "inner".atttypid)
-> Hash Left Join (cost=79.27..173.95 rows=1169 width=310)
(actual time=8.036..17.515 rows=1170 loops=1)
Hash Cond: ("outer".typbasetype = "inner".oid)
Join Filter: ("outer".typtype = 'd'::"char")
-> Hash Join (cost=1.06..75.29 rows=1169 width=176)
(actual time=0.046..6.960 rows=1170 loops=1)
Hash Cond: ("outer".typnamespace = "inner".oid)
-> Seq Scan on pg_type t (cost=0.00..56.69
rows=1169 width=116) (actual time=0.006..3.868 rows=1170 loops=1)
-> Hash (cost=1.05..1.05 rows=5 width=68) (actual
time=0.025..0.025 rows=5 loops=1)
-> Seq Scan on pg_namespace nt
(cost=0.00..1.05 rows=5 width=68) (actual time=0.003..0.013 rows=5 loops=1)
-> Hash (cost=75.29..75.29 rows=1169 width=138) (actual
time=7.983..7.983 rows=1170 loops=1)
-> Hash Join (cost=1.06..75.29 rows=1169
width=138) (actual time=0.036..5.620 rows=1170 loops=1)
Hash Cond: ("outer".typnamespace = "inner".oid)
-> Seq Scan on pg_type bt (cost=0.00..56.69
rows=1169 width=78) (actual time=0.003..2.493 rows=1170 loops=1)
-> Hash (cost=1.05..1.05 rows=5 width=68)
(actual time=0.022..0.022 rows=5 loops=1)
-> Seq Scan on pg_namespace nbt
(cost=0.00..1.05 rows=5 width=68) (actual time=0.003..0.012 rows=5 loops=1)
-> Hash (cost=4992.11..4992.11 rows=38 width=143) (actual
time=536.532..536.532 rows=47 loops=1)
-> Merge Join (cost=4722.45..4992.11 rows=38 width=143)
(actual time=535.940..536.287 rows=47 loops=1)
Merge Cond: ("outer".attrelid = "inner".oid)
-> Merge Left Join (cost=4527.17..4730.67
rows=26238 width=143) (actual time=481.392..520.627 rows=10508 loops=1)
Merge Cond: (("outer".attrelid =
"inner".adrelid) AND ("outer".attnum = "inner".adnum))
-> Sort (cost=4471.90..4537.50 rows=26238
width=107) (actual time=481.345..497.647 rows=10508 loops=1)
Sort Key: a.attrelid, a.attnum
-> Seq Scan on pg_attribute a
(cost=0.00..1474.20 rows=26238 width=107) (actual time=0.007..92.444
rows=26792 loops=1)
Filter: ((attnum > 0) AND (NOT
attisdropped))
-> Sort (cost=55.27..57.22 rows=780
width=38) (actual time=0.035..0.035 rows=0 loops=1)
Sort Key: ad.adrelid, ad.adnum
-> Seq Scan on pg_attrdef ad
(cost=0.00..17.80 rows=780 width=38) (actual time=0.003..0.003 rows=0
loops=1)
-> Sort (cost=195.27..195.28 rows=3 width=8)
(actual time=3.900..3.938 rows=1 loops=1)
Sort Key: c.oid
-> Hash Join (cost=194.12..195.25 rows=3
width=8) (actual time=3.889..3.892 rows=1 loops=1)
Hash Cond: ("outer".oid =
"inner".relnamespace)
-> Seq Scan on pg_namespace nc
(cost=0.00..1.05 rows=5 width=4) (actual time=0.007..0.016 rows=5 loops=1)
-> Hash (cost=194.11..194.11 rows=3
width=12) (actual time=3.826..3.826 rows=1 loops=1)
-> Seq Scan on pg_class c
(cost=0.00..194.11 rows=3 width=12) (actual time=2.504..3.818 rows=1
loops=1)
Filter: (((relkind =
'r'::"char") OR (relkind = 'v'::"char")) AND (pg_has_role(relowner,
'MEMBER'::text) OR has_table_privilege(oid, 'SELECT'::text) OR
has_table_privilege(oid, 'INSERT'::text) OR has_table_privilege(oid,
'UPDATE'::text) OR has_table_privilege(oid, 'REFERENCES'::text)) AND
(((relname)::information_schema.sql_identifier)::text =
't_fph_tdrdw'::text))
Total runtime: 568.211 ms
(39 rows)

smf=>

If I create a table from this view "create table
my_information_schema_columns as select * from
information_schema.columns;", naturally the performance is greatly improved.

smf=> explain analyse select column_name, column_default, is_nullable,
data_type, character_maximum_length, numeric_precision,
numeric_precision_radix,
smf-> numeric_scale, udt_name from my_information_schema_columns where
table_name = 't_fph_tdrdw' order by ordinal_position;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=605.75..605.81 rows=24 width=180) (actual
time=39.878..39.914 rows=47 loops=1)
Sort Key: ordinal_position
-> Seq Scan on my_information_schema_columns (cost=0.00..605.20
rows=24 width=180) (actual time=16.280..39.651 rows=47 loops=1)
Filter: ((table_name)::text = 't_fph_tdrdw'::text)
Total runtime: 40.049 ms
(5 rows)

smf=>

And if I add a index "create index my_information_schema_columns_index
on my_information_schema_columns (table_name);" , it is improved even more.

smf=> explain analyse select column_name, column_default, is_nullable,
data_type, character_maximum_length, numeric_precision,
numeric_precision_radix,
smf-> numeric_scale, udt_name from my_information_schema_columns where
table_name = 't_fph_tdrdw' order by
ordinal_position;

QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=294.18..294.48 rows=119 width=180) (actual
time=0.520..0.558 rows=47 loops=1)
Sort Key: ordinal_position
-> Bitmap Heap Scan on my_information_schema_columns
(cost=2.42..290.08 rows=119 width=180) (actual time=0.169..0.296 rows=47
loops=1)
Recheck Cond: ((table_name)::text = 't_fph_tdrdw'::text)
-> Bitmap Index Scan on my_information_schema_columns_index
(cost=0.00..2.42 rows=119 width=0) (actual time=0.149..0.149 rows=47
loops=1)
Index Cond: ((table_name)::text = 't_fph_tdrdw'::text)
Total runtime: 0.691 ms
(7 rows)

smf=>

If a table is created from the information_schema.columns view, then we
have the problem of keeping the table up to date.

Any hints, rtfm's (locations please), where to look, etc, will be
appreciated.

Regards
Steve Martin

--
\\|// From near to far,
@ @ from here to there,
---oOOo-(_)-oOOo--- funny things are everywhere. (Dr. Seuss)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-10-02 03:01:19 Re: selecting data from information_schema.columns performance.
Previous Message Tom Lane 2006-09-30 16:45:50 Re: Table not getting vaccumed.