Re: database is bigger after dump/restore - why? (60 GB to 109 GB)

From: Aleksey Tsalolikhin <atsaloli(dot)tech(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: database is bigger after dump/restore - why? (60 GB to 109 GB)
Date: 2011-02-27 08:49:48
Message-ID: AANLkTinXb4NC5wgqp-LW_CRhMOfNPQknPHNovNqdm1e+@mail.gmail.com
Lists: pgsql-general

OK, just to recap:

Database A has a table that is 50 GB in size, according to:

SELECT relname AS "Table",
       pg_size_pretty(pg_total_relation_size(relid)) AS "Size"
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
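To see where the extra space lives, I can also split the table proper from its indexes and TOAST data: pg_relation_size() counts just the heap, while pg_total_relation_size() adds indexes and TOAST. For example:

SELECT pg_size_pretty(pg_relation_size('tablename')) AS heap,
       pg_size_pretty(pg_total_relation_size('tablename')) AS total;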

I pg_dump -Fc this table, which gives me a 9.8 GB file.

I then pg_restore this table into database B, which results in a
100 GB table, according to the same SQL query.
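For reference, the commands were along these lines (the database and file names here are placeholders):

pg_dump -Fc -t tablename dbname_a > tablename.dump
pg_restore -d dbname_b tablename.dump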

Database versions are identical:
A: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
B: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit

The character encodings of the source and target databases are identical:
UTF8. (As reported by "psql -l".)

fillfactor is not set for this table; it is at the default on both A and
B. (As reported by "select reloptions from pg_class where
relname='tablename';".)

If I pg_dump -Fc the 100 GB table on database B, I get a 9.9 GB file.

If I do a "select count(*) from tablename", I get identical results
on A and B (1,628,348 rows).

Adrian asked about the schema for this table. It is the same on A and B:

Table "tablename"
Column | Type | Modifiers
-----------------+--------------------------+-----------
column1 | character varying | not null
column2 | character varying |
column3 | character varying |
column4 | character varying |
column5 | timestamp with time zone |
column6 | character varying |
column7 | character varying |
column8 | character varying |
column9 | character varying |
column10 | character varying |
Indexes:
"tablename_pkey" PRIMARY KEY, btree (column1)
"tablename_column6_index" btree (column6)
"tablename_column9_index" btree (device_dnq)
"tablename_column8_index" btree (kdm_gmt_end)
"tablename_column7_index" btree (kdm_gmt_start)

When I pg_dump the 50 GB table in plain text format (no -Fc), I get a 40 GB file.

When I pg_dump the 100 GB table the same way, I also get a 40 GB file.

So it looks like something low-level, something about how the data is physically stored: the logical contents match (same row count, same dump sizes), but the on-disk footprint doesn't.
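In the meantime, one thing I can check without any contrib modules: pg_class stores the page and row counts the planner uses, so comparing them on A and B should show whether B simply has more (or emptier) pages. (relpages and reltuples are estimates, refreshed by VACUUM or ANALYZE.)

SELECT relpages, reltuples,
       reltuples / relpages AS rows_per_page
FROM pg_class
WHERE relname = 'tablename';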

I've installed the "pageinspect" contrib module as Tomas suggested, but
I don't know what to do with it or what to look at. I looked at its
manual page but it's totally over my head right now.

What sections of the manual should I read to be able to use this
module? (There are 2167 pages in the whole Postgres 8.4 manual and I
don't have time to study the whole thing. :( I just need enough to
understand how to use pageinspect.)
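(As far as I can tell from the contrib docs, the entry point is get_raw_page(), whose output gets fed to page_header() or heap_page_items(), along these lines:

SELECT * FROM page_header(get_raw_page('tablename', 0));

SELECT lp, lp_off, lp_len, t_hoff
FROM heap_page_items(get_raw_page('tablename', 0));

The first shows the header of page 0 of the table, and the second lists the line pointers and tuple lengths on that page, which I suppose could be compared between A and B. But I don't know what to look for in the output.)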

(I'm not a Postgres DBA but I am having to become one to support our database.)

Thanks again for the help. I'll be at SCALE 9x tomorrow helping Joe
Conway and Bruce M and Richard B and company get the word out about
Postgres.

Best,
Aleksey
