Re: pg_dump and thousands of schemas

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: pg_dump and thousands of schemas
Date: 2012-05-30 00:58:16
Message-ID: 20120530.095816.631069637690597891.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

> Yeah, Jeff's experiments indicated that the remaining bottleneck is lock
> management in the server. What I fixed so far on the pg_dump side
> should be enough to let partial dumps run at reasonable speed even if
> the whole database contains many tables. But if psql is taking
> AccessShareLock on lots of tables, there's still a problem.

Yes, I saw this kind of lines:

29260 2012-05-30 09:39:19 JST LOG: statement: LOCK TABLE public.t10 IN ACCESS SHARE MODE

It seems this is not very efficient query since LOCK TABLE can take
multiple tables as an argument and we could pass as many tables as
possible to one LOCK TABLE query. This way we could reduce the
communication between pg_dump and backend.

Also I noticed lots of queries like these:

29260 2012-05-30 09:39:19 JST LOG: statement: SELECT attname, attacl FROM pg_catalog.pg_attribute WHERE attrelid = '516391' AND NOT attisdropped AND attacl IS NOT NULL ORDER BY attnum

I guess this is for each table and if there are tones of tables these
queries are major bottle neck as well as LOCK. I think we could
optimize somewhat this in that we issue queries to extract info of
multiple tables rather than extracting only one table inof as current
implementation does.

Or even better we could create a temp table which contains target
table oids to join the query above.

In my opinion, particular use case such as multi tenancy would create
tons of objects in a database cluster and the performance of pg_dump
might be highlighted more in the future.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2012-05-30 01:29:19 Re: Uh, I change my mind about commit_delay + commit_siblings (sort of)
Previous Message Sergey Koposov 2012-05-30 00:40:01 slow dropping of tables, DropRelFileNodeBuffers, tas

Browse pgsql-performance by date

  From Date Subject
Next Message Tatsuo Ishii 2012-05-30 09:06:20 Re: pg_dump and thousands of schemas
Previous Message Alejandro Carrillo 2012-05-29 22:34:26 Re: Recover rows deleted