Re: Maintenance question / DB size anomaly...

From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Maintenance question / DB size anomaly...
Date: 2007-06-19 15:11:19
Message-ID: 607iq0dlns.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

kurt(at)hotdogrecords(dot)com (Kurt Overberg) writes:
> In my investigation of this anomaly, I noticed that the data/ dir on
> db1 (the master) is around 60 Gigs. The data directory on the slaves
> is around 25Gb. After about 3 months of head scratching, someone on
> the irc channel suggested that it may be due to index bloat.
> Although, doing some research, it would seem that those problems were
> resolved in 7.4(ish), and it wouldn't account for one database being
> 2.5x bigger. Another unknown is Slony overhead (both in size and
> vacuum times).

There are three tables in Slony-I that would be of interest; on the
master, do a VACUUM VERBOSE on:

- [clustername].sl_log_1
- [clustername].sl_log_2
- [clustername].sl_seqlog

If one or another is really bloated, that could be the cause of *some*
problems. Though that shouldn't account for 35GB of space :-).

> The ONLY thing I can think of is that I DROPped a large number of
> tables from db1 a few months ago (they weren't getting replicated).
> This is on the order of 1700+ fairly largeish (50,000+ row) tables.
> I do not remember doing a vacuum full after dropping them, so perhaps
> that's my problem. I'm planning on doing some maintenance this
> weekend, during which I will take the whole system down, then on db1,
> run a VACUUM FULL ANALYZE on the whole database, then a REINDEX on my
> very large tables. I may drop and recreate the indexes on my big
> tables, as I hear that may be faster than a REINDEX. I will probably
> run a VACUUM FULL ANALYZE on the slaves as well.

When tables are dropped, so are the data files. So even if they were
bloated, they should have simply disappeared. So I don't think that's
the problem.

> Thoughts? Suggestions? Anyone think this will actually help my
> problem of size and vacuum times? Do I need to take Slony down while
> I do this? Will the VACUUM FULL table locking interfere with Slony?

I'd be inclined to head to the filesystem level, and try to see what
tables are bloated *there*.

You should be able to search for bloated tables via the command:

$ find $PGDATA/base -name "[0-9]+\.[0-9]+"

That would be likely to give you a listing of filenames that look
something like:

12341.1
12341.2
12341.3
12341.4
12341.5
12341.6
231441.1
231441.2
231441.3

which indicates all table (or index) data files that had to be
extended past 1GB.

In the above, the relation with OID 12341 would be >6GB in size,
because it has been extended to have 6 additional files (in addition
to the "bare" filename, 12341).

You can then go into a psql session, and run the query:
select * from pg_class where oid = 12341;
and thereby figure out what table is involved.

I'll bet that if you do this on the "origin" node, you'll find that
there is some small number of tables that have *way* more 1GB
partitions than there are on the subscriber nodes.

Those are the tables that will need attention.

You could probably accomplish the reorganization more quickly via the
"CLUSTER" statement; that will reorganize the table according based on
the ordering of one specified index, and then regenerate all the other
indices. It's not MVCC-safe, so if you have reports running
concurrently, this could confuse them, but if you take the apps down,
as you surely should, it won't be a problem.

You don't forcibly have to take Slony-I down during this, but the
locks taken out on tables by CLUSTER/VACUUM FULL will block slons from
doing any work until those transactions complete.

I wouldn't think you need to do VACUUM FULL or CLUSTER against the
subscribers if they haven't actually bloated (and based on what you
have said, there is no indication that they have).
--
output = reverse("ofni.secnanifxunil" "@" "enworbbc")
http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
The quickest way to a man's heart is through his chest, with an axe.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Chris Browne 2007-06-19 15:22:17 Re: [PERFORM] Postgres VS Oracle
Previous Message Tom Lane 2007-06-19 14:56:15 Re: Performance query about large tables, lots of concurrent access