Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, furuyao(at)pm(dot)nttdata(dot)co(dot)jp, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, teranishih(at)nttdata(dot)co(dot)jp
Subject: Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.
Date: 2014-11-17 01:02:43
Message-ID: CAHGQGwH_aj1apxYi_zMpNw-aQrjsRdXFboZr-jB1VqEj_OFfbg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Sat, Nov 15, 2014 at 3:42 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:
>>> On Fri, Nov 14, 2014 at 7:22 PM, <furuyao(at)pm(dot)nttdata(dot)co(dot)jp> wrote:
>>> > "pg_ctl stop" doesn't work properly if the --slot option is specified and WAL is flushed only when the WAL file is switched.
>>> > These processes keep running even after the postmaster has failed: pg_receivexlog, walsender and logger.
>>>
>>> I could reproduce this problem. At normal shutdown, walsender keeps waiting
>>> for the last WAL record to be replicated and flushed in pg_receivexlog. But
>>> pg_receivexlog issues a sync command only when the WAL file is switched. Thus,
>>> since pg_receivexlog may never flush the last WAL record, walsender may have
>>> to wait indefinitely.
>>
>> Right.
> It is surprising that nobody complained about that before;
> pg_receivexlog was released two years ago.

It's not so surprising, because the problem can happen only when a
replication slot is specified, i.e., when the version is 9.4 or later.

>>> pg_recvlogical handles this problem by calling fsync() when it receives a
>>> request for an immediate reply from the server. That is, at shutdown, walsender
>>> sends the request, pg_recvlogical receives it, flushes the last WAL record,
>>> and sends the flush location back to the server. Since walsender can see that
>>> the last WAL record is successfully flushed in pg_recvlogical, it can
>>> exit cleanly.
>>>
>>> One idea for solving the problem is to introduce the same logic that
>>> pg_recvlogical has into pg_receivexlog. Thoughts?
>>
>> Sounds sane to me. Are you looking into doing that?
> Yep, sounds like a good thing to do if the master requested an answer
> from the client in the keepalive message. Something like the attached
> patch would do the trick.

Isn't it better to do this only when a replication slot is used?
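
To make the discussion concrete, here is a minimal sketch of the behaviour
being proposed. It is a toy model in Python, not the C code of pg_receivexlog
or the attached patch. The Receiver class, walsender_can_exit() and the
flush_on_reply_request flag are hypothetical names invented for illustration.
Only the keepalive layout ('k', WAL end, server clock, reply-requested byte)
follows the streaming replication protocol. Whether the flag should be tied to
--slot is exactly the open question above.

```python
import struct

# Primary keepalive message in the streaming replication protocol:
# 'k', WAL end (int64), server clock (int64), reply-requested flag (1 byte).
KEEPALIVE = struct.Struct("!cqqB")


class Receiver:
    """Toy stand-in for a WAL receiver such as pg_receivexlog (hypothetical)."""

    def __init__(self, flush_on_reply_request):
        # Assumption: the caller decides when to enable this, e.g. only
        # when a replication slot is in use.
        self.flush_on_reply_request = flush_on_reply_request
        self.written_lsn = 0  # received and written, maybe not yet synced
        self.flushed_lsn = 0  # known to be durable on disk

    def receive_wal(self, end_lsn):
        # Data is written to the current WAL file but not synced.
        self.written_lsn = end_lsn

    def switch_segment(self):
        # In the behaviour described in this thread, this is the only
        # point at which the file is synced.
        self.flushed_lsn = self.written_lsn

    def handle_keepalive(self, message):
        """Return the flush position to report, or None if no reply is due."""
        kind, _wal_end, _clock, reply_requested = KEEPALIVE.unpack(message)
        assert kind == b"k"
        if not reply_requested:
            return None
        if self.flush_on_reply_request:
            # Proposed behaviour: flush before answering
            # (the real tool would call fsync() here).
            self.flushed_lsn = self.written_lsn
        return self.flushed_lsn


def walsender_can_exit(receiver, last_lsn):
    """At shutdown the sender requests a reply and checks the flush position."""
    request = KEEPALIVE.pack(b"k", last_lsn, 0, 1)
    reported = receiver.handle_keepalive(request)
    return reported is not None and reported >= last_lsn


old = Receiver(flush_on_reply_request=False)
old.receive_wal(1000)
new = Receiver(flush_on_reply_request=True)
new.receive_wal(1000)
print(walsender_can_exit(old, 1000), walsender_can_exit(new, 1000))
```

In this model the receiver without the flag never reports the last record as
flushed until a segment switch happens, so the sender keeps waiting. With the
flag, the reply to the shutdown keepalive already carries the final flush
position.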

Regards,

--
Fujii Masao
