Re: Vacuum full - disk space eaten by WAL logfiles

From: "Lee Wu" <Lwu(at)mxlogic(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum full - disk space eaten by WAL logfiles
Date: 2005-01-10 21:52:24
Message-ID: ECAB83AA52BCC043A0E24BBC000010241114E4@mxhq-exch.corp.mxlogic.com
Lists: pgsql-admin

Hi Tom,

1. shared_buffers | 32768
2. I/O bandwidth is not an issue, to the best of our knowledge
3. It is "vacuum full" as shown:
Jan 8 20:25:38 mybox postgres[8603]: [15] FATAL: The database system is in recovery mode
Jan 8 20:25:38 mybox postgres[7284]: [14] LOG: statement: vacuum full analyze the_35G_table
Jan 8 20:25:39 mybox postgres[8604]: [15] FATAL: The database system is in recovery mode

Also, this error happened on the last two Saturdays and matched the
timing in our vacuum log:
20050108-194000 End vacuum full analyze on table1.
20050108-194000 Begin vacuum full analyze on the_35G_table.
WARNING: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend
died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am
going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
20050108-202539 End vacuum full analyze on the_35G_table.
20050108-202539 Begin vacuum full analyze on table3.
psql: FATAL: The database system is in recovery mode

We only do vacuum full on Saturdays. This error has not been seen
at any other time.

4. PG upgrade issue - out of my (a DBA's) control
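
To make the timing match in item 3 explicit, here is a minimal sketch that lines up the vacuum log stamps with the syslog entry. The helper name `parse_stamp` is made up for illustration, and the year 2005 for the syslog line is an assumption, since syslog does not record it.

```python
from datetime import datetime

def parse_stamp(stamp):
    """Parse a vacuum-log stamp such as '20050108-194000' (hypothetical helper)."""
    return datetime.strptime(stamp, "%Y%m%d-%H%M%S")

# Stamps copied from the vacuum log in item 3.
begin = parse_stamp("20050108-194000")  # Begin vacuum full analyze on the_35G_table
end = parse_stamp("20050108-202539")    # "End" line written after the connection was lost

# First "recovery mode" syslog line: Jan 8 20:25:38 (year assumed to be 2005).
crash = datetime(2005, 1, 8, 20, 25, 38)

elapsed_seconds = (end - begin).total_seconds()  # how long the vacuum ran
gap_seconds = (end - crash).total_seconds()      # distance between the two logs

print(f"vacuum ran for {elapsed_seconds:.0f} s before the connection dropped")
print(f"vacuum log and syslog differ by {gap_seconds:.0f} s")
```

The two logs agree to within about a second, which is why we believe the failure happens during the vacuum full on the_35G_table.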

Thanks,

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Monday, January 10, 2005 2:27 PM
To: Lee Wu
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: [ADMIN] Vacuum full - disk space eaten by WAL logfiles

"Lee Wu" <Lwu(at)mxlogic(dot)com> writes:
> When we do weekly "vacuum full", PG uses all space and causes PG down.

This implies that checkpoints aren't completing for some reason.
If they were, they'd be recycling WAL space.
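
As a rough sketch of what "recycling WAL space" implies for disk usage: the documentation for this release series says the number of WAL segment files will normally not exceed 2 * checkpoint_segments + 1, each 16 MB by default. The function names below are made up for illustration, and the checkpoint_segments value used is the shipped default of 3, not a setting reported in this thread.

```python
WAL_SEGMENT_MB = 16  # default WAL segment file size

def expected_max_wal_mb(checkpoint_segments):
    """Upper bound on pg_xlog size (MB) when checkpoints complete normally."""
    return (2 * checkpoint_segments + 1) * WAL_SEGMENT_MB

def wal_looks_unrecycled(segment_files, checkpoint_segments):
    """True if the segment count exceeds the normal bound, which would
    suggest checkpoints are not finishing and freeing old segments."""
    return segment_files > 2 * checkpoint_segments + 1

# Assumption: checkpoint_segments = 3 (the default); the real value is not given here.
print(expected_max_wal_mb(3))
print(wal_looks_unrecycled(200, 3))
```

Counting the files in pg_xlog during the vacuum and comparing against this bound is one way to tell whether segments are piling up.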

I'm not aware of any problems in 7.3 that would block a checkpoint
indefinitely, but we have seen cases where it just took too darn long
to do the checkpoint --- implying either a ridiculously large
shared_buffers setting, or a drastic shortage of I/O bandwidth.
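
For reference, the shared_buffers value of 32768 reported at the top of this message can be converted to memory as sketched below. This assumes the default 8 kB block size (BLCKSZ), which is not confirmed anywhere in the thread; the function name is hypothetical.

```python
BLOCK_SIZE_BYTES = 8192  # assumption: default BLCKSZ, not confirmed in this thread

def shared_buffers_mb(n_buffers, block_size=BLOCK_SIZE_BYTES):
    """Convert a shared_buffers count (in buffers) to megabytes."""
    return n_buffers * block_size / (1024 * 1024)

print(shared_buffers_mb(32768))  # value from item 1 of the reply
```

Under that assumption the setting corresponds to 256 MB of shared buffers that a checkpoint may have to flush.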

You might want to try strace'ing the checkpoint process to see if it
seems to be making progress or not.

Also, are you certain that this is happening during a VACUUM? The
log messages you show refer to COPY commands.

> PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96

Are you aware of the number and significance of post-7.3.2 bug fixes
in the 7.3 branch? You really ought to be on 7.3.8, if you can't afford
to migrate to 7.4 right now.

regards, tom lane
