Re: too many clog files

From: "Matt Smiley" <mss(at)rentrak(dot)com>
To: <alvherre(at)commandprompt(dot)com>, <duanlg(at)nec-as(dot)nec(dot)com(dot)cn>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: too many clog files
Date: 2008-09-10 05:24:43
Message-ID: 48C6F7A5.D078.0028.0@rentrak.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Alvaro Herrera wrote:
> Move the old clog files back where they were, and run VACUUM FREEZE in
> all your databases. That should clean up all the old pg_clog files, if
> you're really that desperate.

Has anyone actually seen a CLOG file get removed under 8.2 or 8.3? How about 8.1?

I'm probably missing something, but looking at src/backend/commands/vacuum.c (under 8.2.9 and 8.3.3), it seems like vac_truncate_clog() scans through *all* tuples of pg_database looking for the oldest datfrozenxid. Won't that always be template0, which as far as I know can never be vacuumed (or otherwise connected to)?

postgres=# select datname, datfrozenxid, age(datfrozenxid), datallowconn from pg_database order by age(datfrozenxid), datname ;
datname | datfrozenxid | age | datallowconn
------------------+--------------+----------+--------------
template1 | 36347792 | 3859 | t
postgres | 36347733 | 3918 | t
mss_test | 36347436 | 4215 | t
template0 | 526 | 36351125 | f
(4 rows)

I looked at several of my 8.2 databases' pg_clog directories, and they all have all the sequentially numbered segments (0000 through current segment). Would it be reasonable for vac_truncate_clog() to skip databases where datallowconn is false (i.e. template0)? Looking back to the 8.1.13 code, it does exactly that:
if (!dbform->datallowconn)
continue;

Also, Duan, if you have lots of files under pg_clog, you may be burning through transactions faster than necessary. Do your applications leave autocommit turned on? And since no one else mentioned it, as a work-around for a small filesystem you can potentially shutdown your database, move the pg_clog directory to a separate filesystem, and create a symlink to it under your PGDATA directory. That's not a solution, just a mitigation.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Wong 2008-09-10 06:53:17 Effects of setting linux block device readahead size
Previous Message Jignesh K. Shah 2008-09-09 22:28:54 Re: PostgreSQL TPC-H test result?