WAL recycling, ext3, Linux 2.4.18

From: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Cc: Glenn Stone <gstone(at)pogolinux(dot)com>
Subject: WAL recycling, ext3, Linux 2.4.18
Date: 2002-07-08 06:36:27
Message-ID: 5.1.0.14.2.20020708022105.01f36598@pop.pexicom.com
Lists: pgsql-general

Hello all,

I'm still trying to track down my very odd periodic pauses/hangs in
PostgreSQL 7.2.1.

I've localized it to what seems to be the "recycled transaction log file"
lines in the log file. Whenever this happens, a whole bunch of queries
which were "on hold" (just sitting there, as can be seen in
pg_stat_activity, when they usually execute in fractions of a second) come
back to life and finish very quickly.

Unfortunately, PostgreSQL doesn't seem to log when it starts doing this
recycling, only when it's done.

However, it seems to be taking about 1.5 minutes (yes, around 90 seconds)
to do this recycling on about sixteen of these WAL files at a time.
(Deduction from the logs from the application that uses the database.) I
currently have about 102 of these WAL files (I don't mind; I have 50 gigs
set aside for pg_xlog). My postgresql.conf settings are:

WAL_FILES = 48
WAL_BUFFERS = 16
CHECKPOINT_SEGMENTS = 30

With this, during my heavy load period, I get those 16 WAL recycling
messages every 6.5 minutes. During heavy vacuuming, the recycling happens
every 3 minutes, and that was my goal (no more than every three minutes,
per Bruce Momjian's PDF on tuning).
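
For a rough sense of scale, the figures above can be turned into numbers. This is only a sketch: it assumes the default 16 MB WAL segment size (not stated in this message), takes the 90-second pause and 6.5-minute interval at face value, and uses the hypothetical helper name `recycle_stats`.

```python
SEGMENT_MB = 16  # assumption: default WAL segment size, not stated in the post


def recycle_stats(segments_per_cycle, pause_s, interval_s, total_segments):
    """Summarize the reported recycling cadence as plain numbers."""
    recycled_mb = segments_per_cycle * SEGMENT_MB
    return {
        # volume of WAL covered by one batch of recycling messages
        "recycled_mb": recycled_mb,
        # average rate at which segments are recycled over one interval
        "recycled_mb_per_s": recycled_mb / interval_s,
        # share of wall-clock time during which queries appear stalled
        "stalled_fraction": pause_s / interval_s,
        # disk space currently held by the WAL files
        "pg_xlog_mb": total_segments * SEGMENT_MB,
    }


# Figures from the post: 16 files per batch, ~90 s pause,
# one batch every 6.5 minutes, about 102 WAL files on disk.
stats = recycle_stats(16, 90, 6.5 * 60, 102)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions each batch covers 256 MB of WAL, the WAL files occupy about 1.6 GB of the 50 set aside, and queries are stalled for roughly 23% of the time.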

My server specs:
Dual P4 Xeon 2.4
8 GB RAM
RAID-1 drive for pg_xlog - running ext3
RAID-5 drive dedicated to PostgreSQL for everything else - running ext3
Debian 3.0 (woody) kernel 2.4.18

Some questions:

1) Are there any known bad interactions between ext3fs and PostgreSQL? My
hardware vendor (Pogo Linux, recommended) seemed to suggest that ext3fs has
problems with multi-threading.
2) Any ideas on how to get it to log more info on WAL usage?
3) Which process in PostgreSQL should I attach to using gdb to check out
this WAL stuff?
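
Short of extra logging inside PostgreSQL, the timing of the existing messages can at least be measured from outside. The sketch below records when each "recycled transaction log file" line is read from a log stream and groups the arrivals into bursts. It assumes the message text is exactly as quoted above and that the server log can be piped into the script; the function names `arrival_times` and `group_bursts` and the 30-second gap are hypothetical choices, and the approach still only sees when recycling is reported, not when it starts.

```python
import time

MARKER = "recycled transaction log file"  # message text as quoted in the post


def arrival_times(lines, clock=time.time):
    """Yield the wall-clock time at which each matching log line is read."""
    for line in lines:
        if MARKER in line:
            yield clock()


def group_bursts(times, max_gap_s=30.0):
    """Group ascending timestamps into bursts of (first, last, count).

    Two messages belong to the same burst when they arrive no more than
    max_gap_s seconds apart (an arbitrary threshold for illustration).
    """
    bursts = []
    for t in times:
        if bursts and t - bursts[-1][1] <= max_gap_s:
            first, _, count = bursts[-1]
            bursts[-1] = (first, t, count + 1)
        else:
            bursts.append((t, t, 1))
    return bursts
```

Feeding `sys.stdin` to `arrival_times` while the log is piped in, and passing the collected times to `group_bursts` afterwards, gives the spacing between batches, which can then be lined up against the pauses seen by the application.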

Putting my application on hold for 1.5 minutes out of every 6.5 is of
course very bad... I'm stumped. Any ideas are welcome; I am willing to
provide any additional information and run any other tests.

Thanks,

Doug
