BUG #4879: bgwriter fails to fsync the file in recovery mode

From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #4879: bgwriter fails to fsync the file in recovery mode
Date: 2009-06-25 12:55:07
Message-ID: 200906251255.n5PCt77V016240@wwwmaster.postgresql.org
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 4879
Logged by: Fujii Masao
Email address: masao(dot)fujii(at)gmail(dot)com
PostgreSQL version: 8.4dev
Operating system: RHEL5.1 x86_64
Description: bgwriter fails to fsync the file in recovery mode
Details:

A restartpoint performed by bgwriter in recovery mode caused the following error.

ERROR: could not fsync segment 0 of relation base/11564/16422_fsm: No
such file or directory

The following procedure can reproduce this error.

(1) create a warm-standby environment
(2) execute "pgbench -i -s10"
(3) execute the following SQL statements

TRUNCATE pgbench_accounts ;
TRUNCATE pgbench_branches ;
TRUNCATE pgbench_history ;
TRUNCATE pgbench_tellers ;
CHECKPOINT ;
SELECT pg_switch_xlog();

(4) wait about a minute; the next restartpoint should then cause the
error on the standby server.

Whether the error occurs depends on the timing of the operations, so
you might need to repeat steps (2) and (3) in order to reproduce it.

I suspect that the cause of this error is a race condition between
file deletion by the startup process and fsync by bgwriter: replaying a
TRUNCATE xlog record immediately deletes the corresponding file, even
though that file might still be scheduled to be fsynced by bgwriter.
Should we leave the actual file deletion to bgwriter instead of the
startup process, as in normal mode?
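
To make the suspected race concrete, here is a toy model in Python. It is
only a sketch and not PostgreSQL code: the ToyBgwriter class, its method
names, and the temporary file standing in for base/11564/16422_fsm are all
hypothetical. It shows a queued fsync failing with ENOENT when the file is
unlinked immediately, and succeeding when the unlink is handed to the same
process that performs the fsync.

```python
import errno
import os
import tempfile


class ToyBgwriter:
    """Hypothetical stand-in for bgwriter's pending-request bookkeeping."""

    def __init__(self):
        self.pending_fsync = set()
        self.pending_unlink = set()

    def request_fsync(self, path):
        self.pending_fsync.add(path)

    def request_unlink(self, path):
        self.pending_unlink.add(path)

    def restartpoint(self):
        """Fsync queued files, then process deferred unlinks.

        Returns a list of (path, errno) for fsync attempts that failed.
        """
        failures = []
        for path in sorted(self.pending_fsync):
            try:
                fd = os.open(path, os.O_RDWR)
            except OSError as exc:
                failures.append((path, exc.errno))
                continue
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
        self.pending_fsync.clear()
        for path in sorted(self.pending_unlink):
            os.unlink(path)
        self.pending_unlink.clear()
        return failures


def run_scenario(defer_unlink):
    """Simulate 'startup process' replay of a TRUNCATE, then a restartpoint."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "16422_fsm")  # illustrative file name only
        with open(path, "wb") as f:
            f.write(b"\0" * 8192)

        bgwriter = ToyBgwriter()
        bgwriter.request_fsync(path)  # an earlier write queued an fsync

        if defer_unlink:
            bgwriter.request_unlink(path)  # leave deletion to bgwriter
        else:
            os.unlink(path)  # delete immediately during replay

        failures = bgwriter.restartpoint()
        return failures, os.path.exists(path)


if __name__ == "__main__":
    for defer in (False, True):
        failures, still_exists = run_scenario(defer)
        print("defer_unlink =", defer, "failures =", len(failures),
              "file still exists =", still_exists)
```

With defer_unlink=False the single queued fsync fails with errno.ENOENT,
which corresponds to the "No such file or directory" error above. With
defer_unlink=True the fsync completes before the file is removed. The model
ignores concurrency and only fixes the ordering of the two operations.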
