Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pdrolet(at)infodata(dot)ca, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot
Date: 2015-04-27 15:12:29
Message-ID: 20150427151229.GG18789@awork2.anarazel.de
Lists: pgsql-bugs

On 2015-04-27 11:44:47 -0300, Alvaro Herrera wrote:
> I think this is failing in the fsync_fname() call in slot.c line 1045
> (REL9_4_STABLE).

Patrice has since replied with log_error_verbosity=verbose logs, but
that reply is probably still stuck in moderation:

> 2015-04-25 14:25:59 EDT LOG: 00000: database system was shut down at 2015-04-25 14:25:39 EDT
> 2015-04-25 14:25:59 EDT LOCATION: StartupXLOG, src\backend\access\transam\xlog.c:6011
> 2015-04-25 14:25:59 EDT PANIC: XX000: could not fsync file "pg_replslot/node_win2008sec/state": Bad file descriptor
> 2015-04-25 14:25:59 EDT LOCATION: RestoreSlotFromDisk, src\backend\replication\slot.c:1115
> 2015-04-25 14:25:59 EDT LOG: 00000: startup process (PID 2696) was terminated by exception 0xC0000409
> 2015-04-25 14:25:59 EDT HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
> 2015-04-25 14:25:59 EDT LOCATION: LogChildExit, src\backend\postmaster\postmaster.c:3336
> 2015-04-25 14:25:59 EDT LOG: 00000: aborting startup due to startup process failure
> 2015-04-25 14:25:59 EDT LOCATION: reaper, src\backend\postmaster\postmaster.c:2604

So it looks to me like it's a straight pg_fsync() failing. Given that
the open apparently succeeded I'm unsure how that could be. The error
message appears to be an EBADF.

Hm. I wonder if it's maybe that the file is opened with O_RDONLY? The
OSs I have access to don't care - for good reason imo, fsync isn't a
write - but it's not inconceivable that Windows might. I very dimly
remember that that was a problem before at some point. Yep:
http://archives.postgresql.org/message-id/10494.1266903446%40sss.pgh.pa.us

So that's easy enough to fix.
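
To make the hypothesis concrete, here is a minimal POSIX sketch of the
pattern in question: open a file, fsync() it, close it. The helper name
fsync_path() is hypothetical and this is not the code in slot.c; it only
illustrates that the open flags are the one thing a caller would change.
On the platforms mentioned above an O_RDONLY descriptor is accepted by
fsync(); the assumption being tested is that Windows wants a writable
one, in which case passing O_RDWR would be the fix.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only (not PostgreSQL code).
 * Opens 'path' with 'open_flags', fsyncs it, and closes it.
 * Returns 0 on success, or the errno of the first failing call.
 */
static int
fsync_path(const char *path, int open_flags)
{
    int rc = 0;
    int fd = open(path, open_flags);

    if (fd < 0)
        return errno;           /* open itself failed */

    if (fsync(fd) != 0)
        rc = errno;             /* e.g. EBADF if the fd is not acceptable */

    if (close(fd) != 0 && rc == 0)
        rc = errno;

    return rc;
}
```

Calling fsync_path(path, O_RDWR) rather than fsync_path(path, O_RDONLY)
is the whole difference; whether that actually avoids the failure on
Windows is the assumption under discussion, not something this sketch
demonstrates.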

Greetings,

Andres Freund
