Re: checkpointer: PANIC: could not fsync file: No such file or directory

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: checkpointer: PANIC: could not fsync file: No such file or directory
Date: 2019-11-28 14:13:23
Message-ID: CA+hUKGLPAab-5nVDP97nh1jaOpXL5Zm1XpgG8W=NgaOW9_0aYA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 27, 2019 at 7:53 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> 2019-11-26 23:41:50.009-05 | could not fsync file "pg_tblspc/16401/PG_12_201909212/16460/973123799.10": No such file or directory

I managed to reproduce this (see below). I think I know what the
problem is: mdsyncfiletag() uses _mdfd_getseg() to open the segment to
be fsync'd, but that function opens all segments up to the one you
requested, so if a lower-numbered segment has already been unlinked,
it can fail. Usually that's unlikely because it's hard to get the
request queue to fill up and therefore hard to split up the cancel
requests for all the segments for a relation, but your workload and
the repro below do it. In fact, the path shown in the error message
is not even the problem file: it is the segment the checkpointer
really wanted to fsync, but to get there it first tried to open the
lower-numbered ones, and one of those was already gone. I can see a
couple of solutions to the problem (unlink in reverse order, send all
the forget messages first before unlinking anything, or go back to
using a single atomic "forget everything for this rel" message instead
of per-segment messages), but I'll have to think more about that
tomorrow.
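
To make the failure mode concrete, here is a toy model in Python. It is
not PostgreSQL code: segment_path(), open_up_to(), fsync_segment() and
demo() are hypothetical names, and open_up_to() only imitates the
"open every segment up to the requested one" behaviour described above.
It assumes a pending fsync request for a higher segment is processed
after a lower-numbered segment has been unlinked, and compares that with
unlinking in reverse order.

```python
import errno
import os
import tempfile


def segment_path(base, segno):
    # Segment 0 has no suffix; later segments are named "<base>.<segno>".
    return base if segno == 0 else f"{base}.{segno}"


def open_up_to(base, target):
    # Hypothetical stand-in for the behaviour described above: walk the
    # segments from 0 up to `target`, keeping only the last one open.
    fd = -1
    for segno in range(target + 1):
        if fd != -1:
            os.close(fd)
        fd = os.open(segment_path(base, segno), os.O_RDWR)  # may raise ENOENT
    return fd


def fsync_segment(base, target):
    # Returns None on success, or the errno name (e.g. "ENOENT") on failure.
    try:
        fd = open_up_to(base, target)
    except OSError as e:
        return errno.errorcode[e.errno]
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    return None


def make_segments(base, nsegs):
    for segno in range(nsegs):
        with open(segment_path(base, segno), "wb") as f:
            f.write(b"x")


def demo():
    results = {}
    # Forward order: a lower segment is unlinked while a request for
    # segment 3 is still pending, so the walk fails before reaching it.
    with tempfile.TemporaryDirectory() as d:
        base = os.path.join(d, "relfile")
        make_segments(base, 4)
        os.unlink(segment_path(base, 1))
        results["forward"] = fsync_segment(base, 3)
    # Reverse order: the highest segment goes first, so a still-pending
    # request for segment 2 can open everything below it.
    with tempfile.TemporaryDirectory() as d:
        base = os.path.join(d, "relfile")
        make_segments(base, 4)
        os.unlink(segment_path(base, 3))
        results["reverse"] = fsync_segment(base, 2)
    return results


if __name__ == "__main__":
    print(demo())
```

In this model segment 3 still exists in the forward case, yet the fsync
attempt fails on the way to it, which matches the symptom of an error
being reported against a path that is not itself the missing file.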

=== repro ===

Recompile with RELSEG_SIZE 2 in pg_config.h. Run with
checkpoint_timeout=30s and shared_buffers=128kB. Then:

create table t (i int primary key);
cluster t using t_pkey;
insert into t select generate_series(1, 10000);

Session 1:
cluster t;
\watch 1

Session 2:
update t set i = i;
\watch 1.1
