Re: Space for pg_dump

From: Jan Otto <asche(at)me(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rainer Bauer <usenet(at)munnin(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Space for pg_dump
Date: 2009-04-20 17:27:35
Message-ID: 335F27C9-6F13-4FE0-9B10-C47E2C3DD2F9@me.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

> Rainer Bauer <usenet(at)munnin(dot)com> writes:
>> Greg Smith wrote:
>>> Since running an entire pgdump can take forever on a big database,
>>> what I
>>> usually do here is start by running the disk usage query at
>>> http://wiki.postgresql.org/wiki/Disk_Usage
>
>> Interesting. However, the query gives an error if the table name
>> contains
>> upper case characters, like in my case "tblConnections":
>
>> ERROR: relation "public.tblconnections" does not exist.
>
>> Replacing all occurences of <relname> by <'"' || relname || '"'>
>> fixes the
>> error.
>
> That still fails if the table name contains double quotes. A proper
> solution is to use the table OID --- I've corrected the example.

If you have big toast tables you get wrong results with the query
suggested
at http://wiki.postgresql.org/wiki/Disk_Usage because it takes the
toasted
values not into account.

Simple example (take a look at the first row -> public.media):

SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND nspname !~ '^pg_toast'
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;
relation | size
---------------------------+------------
public.media | 727 MB
public.identifier_idx | 342 MB
public.media_pk | 190 MB
public.mediateypes_pk | 16 kB
public.mediaformats_uk | 16 kB
public.contentsizes_pk | 16 kB
public.contenttype_pk | 16 kB
public.mediaformats_pk | 16 kB
public.contenttypes | 8192 bytes
public.media_media_id_seq | 8192 bytes
public.contentsizes | 8192 bytes
public.mediaformats | 8192 bytes
public.mediatypes | 8192 bytes
public.vmedia2 | 0 bytes
public.vmedia | 0 bytes
(15 rows)

Now a fixed query which gets the sizes of the related pg_toast_oid and
pg_toast_oid_index too:

SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_relation_size(C.oid)
+ COALESCE((SELECT pg_relation_size(C2.oid) FROM
pg_class C2 WHERE C2.relname = 'pg_toast_' || C.oid ),0::bigint)
+ COALESCE((SELECT pg_relation_size(C3.oid) FROM
pg_class C3 WHERE C3.relname = 'pg_toast_' || C.oid || '_index'),
0::bigint)
) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND nspname !~ '^pg_toast'
ORDER BY pg_relation_size(C.oid)
+ COALESCE((SELECT pg_relation_size(C2.oid) FROM
pg_class C2 WHERE C2.relname = 'pg_toast_' || C.oid ),0::bigint)
+ COALESCE((SELECT pg_relation_size(C3.oid) FROM
pg_class C3 WHERE C3.relname = 'pg_toast_' || C.oid || '_index'),
0::bigint)
DESC
LIMIT 20;
relation | size
---------------------------+------------
public.media | 164 GB
public.identifier_idx | 342 MB
public.media_pk | 190 MB
public.contenttype_pk | 16 kB
public.contenttypes | 16 kB
public.contentsizes | 16 kB
public.contentsizes_pk | 16 kB
public.mediateypes_pk | 16 kB
public.mediaformats | 16 kB
public.mediatypes | 16 kB
public.mediaformats_pk | 16 kB
public.mediaformats_uk | 16 kB
public.media_media_id_seq | 8192 bytes
public.vmedia | 0 bytes
public.vmedia2 | 0 bytes
(15 rows)

There is a difference of about 163 GB (which is from the toast of
public.media)
relation | size
---------------------------------+------------
pg_toast.pg_toast_6366088 | 162 GB
pg_toast.pg_toast_6366088_index | 1832 MB
public.media | 727 MB

If you have only small or no toast tables the query from the wiki will
be working for you.

regards, jan

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2009-04-20 17:35:37 Re: Space for pg_dump
Previous Message Ben Chobot 2009-04-20 16:36:17 Re: postgreSQL & amazon ec2 cloud