Re: replication_timeout not effective

From: Dang Minh Huong <kakalot49(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Kyotaro HORIGUCHI <kyota(dot)horiguchi(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>, "<pgsql-bugs(at)postgresql(dot)org>" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: replication_timeout not effective
Date: 2013-04-10 14:37:44
Message-ID: 51657938.9020404@gmail.com
Lists: pgsql-bugs pgsql-hackers

Thanks all,

(2013/04/10 22:55), Andres Freund wrote:
> On 2013-04-10 22:38:07 +0900, Kyotaro HORIGUCHI wrote:
>> Hello,
>>
>> On Wed, Apr 10, 2013 at 6:57 PM, Dang Minh Huong <kakalot49(at)gmail(dot)com> wrote:
>>> In 9.3, it seems replication_timeout is replaced by wal_sender_timeout.
>>> So if it is solved in 9.3, I think there is a way to terminate it.
>>> I hope it is fixed in 9.1 soon.
>> Hmm. He said that,
>>
>>> But in my environment the sender process hangs (for several tens of minutes) if I turn off (by power off) the standby PC while *pg_basebackup* is executing.
>> Does basebackup run only on 'replication connection' ?
>> As far as I saw base backup uses 'base backup' connection in addition
>> to 'streaming' connection. The former seems not under the control of
>> wal_sender_timeout or replication_timeout and easily blocked at
>> send(2) after sudden cut out of the network connection underneath.
>> Although the latter indeed is terminated by them.
> Yes, it's run via a walsender connection. The only "problem" is that it
> doesn't check for those timeouts. I am not sure it would be a good thing
> to do so to be honest. At least not using the same timeout as actual WAL
> sending, that just has different characteristics.
> On the other hand, hanging around that long isn't nice either...
I tried setting max_wal_senders to 1; while the walsender is hanging,
I cannot run pg_basebackup again (or start the standby DB).
I increased it to 2, and the second attempt succeeded. But I'm afraid
that when a third attempt occurs, the hanging walsender from the first
may still not have terminated...

I suspect not, but is there a way to terminate the hanging walsender
without restarting the PostgreSQL server or killing the walsender process?
(Killing the walsender process can cause the DB server to crash,
so I don't want to do it.)

# I've also tried pg_cancel_backend(), but it did not work either.
>> Blocking in send(2) could occur for an async-rep connection but is not
>> likely for sync-rep since it does not fill the buffers of libpq and
>> socket easily.
> You just need larger transactions for it. A COPY or so ought to do it.
>
> Greetings,
>
> Andres Freund
>
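To illustrate the send(2) point above, here is a minimal Python sketch (not PostgreSQL code) of how a sender stalls once the peer stops reading. It uses a local socket pair in non-blocking mode, which is my assumption for the demo; the real case is a TCP connection to a powered-off standby. Instead of hanging, it reports how many bytes the kernel accepted before a blocking send(2) would have gone to sleep. The function name bytes_until_send_blocks is made up for this example.

```python
import socket


def bytes_until_send_blocks(chunk_size: int = 4096,
                            limit: int = 64 * 1024 * 1024) -> int:
    """Write to a socket whose peer never reads.

    Returns the number of bytes the kernel accepted before a further
    send would have blocked.
    """
    sender, receiver = socket.socketpair()
    try:
        # Non-blocking mode: the demo reports the stall instead of hanging.
        sender.setblocking(False)
        chunk = b"x" * chunk_size
        total = 0
        while total < limit:
            try:
                total += sender.send(chunk)
            except BlockingIOError:
                # Socket buffers are full. A blocking send(2) would sleep
                # here until the peer reads or the connection fails.
                return total
        raise RuntimeError("send never blocked below the safety limit")
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(bytes_until_send_blocks())
```

The number printed depends on the OS socket buffer sizes. A small amount of data fits in the buffers and never blocks, while a bulk transfer such as a base backup or a large COPY fills them quickly.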
Regards,
Huong DM
