Re: Review of "pg_basebackup and pg_receivexlog to use non-blocking socket communication", was: Re: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com>, "'Boszormenyi Zoltan'" <zb(at)cybertec(dot)at>
Cc: "'Hari Babu'" <haribabu(dot)kommi(at)huawei(dot)com>, "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review of "pg_basebackup and pg_receivexlog to use non-blocking socket communication", was: Re: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
Date: 2013-01-18 06:50:58
Message-ID: 03b701cdf548$2cfd7170$86f85450$@kapila@huawei.com
Lists: pgsql-bugs pgsql-hackers

On Wednesday, January 16, 2013 4:02 PM Heikki Linnakangas wrote:
> On 07.01.2013 16:23, Boszormenyi Zoltan wrote:
> > Since my other patch against pg_basebackup is now committed,
> > this patch doesn't apply cleanly, patch rejects 2 hunks.
> > The fixed up patch is attached.
>
> Now that I look at this from a high-level perspective, why are we only
> worried about timeouts in the Copy-mode and when connecting? The
> initial checkpoint could take a long time too, and if the server turns
> into a black hole while the checkpoint is running, pg_basebackup will
> still hang. Then again, a short timeout on that phase would be a bad
> idea, because the checkpoint can indeed take a long time.

True, but IMO, if somebody wants to take a base backup, they should do it
when the server is not heavily loaded.

> In streaming replication, the keep-alive messages carry additional
> information, the timestamps and WAL locations, so a keepalive makes
> sense at that level. But otherwise, aren't we just trying to
> reimplement TCP keepalives? TCP keepalives are not perfect, but if we
> want to have an application level timeout, it should be implemented in
> the FE/BE protocol.
>
> I don't think we need to do anything specific to pg_basebackup. The
> user can simply specify TCP keepalive settings in the connection
> string, like with any libpq program.

I think the user currently has no way to specify TCP keepalive settings from
pg_basebackup. Please let me know if such a way already exists.
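
For reference, this is roughly what "keepalive settings in the connection
string" means for a libpq program that accepts a conninfo string. libpq's
documented keepalive parameters are keepalives, keepalives_idle,
keepalives_interval and keepalives_count. The sketch below only builds the
string; the helper names, the host name, and the numeric values are
hypothetical, and the detection bound is a rough estimate that depends on
the operating system.

```python
def quote_conninfo_value(value):
    """Quote a value for a libpq conninfo string when it needs quoting."""
    text = str(value)
    if text == "" or any(c.isspace() or c in "'\\" for c in text):
        # libpq expects backslash-escaped quotes and backslashes inside '...'
        escaped = text.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return text


def build_conninfo(host, dbname, idle=30, interval=10, count=3):
    """Return a conninfo string with TCP keepalives enabled (hypothetical helper)."""
    params = {
        "host": host,
        "dbname": dbname,
        "keepalives": 1,                  # enable TCP keepalives
        "keepalives_idle": idle,          # seconds of idleness before the first probe
        "keepalives_interval": interval,  # seconds between unanswered probes
        "keepalives_count": count,        # unanswered probes before the link is dead
    }
    return " ".join(f"{k}={quote_conninfo_value(v)}" for k, v in params.items())


def approx_detection_seconds(idle, interval, count):
    """Rough time until a dead peer is noticed; an estimate, not a guarantee."""
    return idle + interval * count


if __name__ == "__main__":
    print(build_conninfo("db.example.com", "postgres"))
    print(approx_detection_seconds(30, 10, 3))
```

Even in this small form, the user has to reason about three interacting
numbers to get one effective timeout.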

I think specifying TCP settings is very cumbersome for most users; that is
the reason most standard interfaces (ODBC/JDBC) have such an
application-level timeout mechanism.
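
To illustrate what an application-level timeout on a non-blocking socket
looks like in general, here is a minimal sketch. It is not the code from the
patch under review; the function name recv_with_timeout and its parameters
are made up for this example.

```python
import select
import socket


def recv_with_timeout(sock, nbytes, timeout_s):
    """Wait up to timeout_s seconds for data, then read without blocking.

    Raises TimeoutError when the peer stays silent for the whole interval,
    which is how the caller notices a possible network breakdown.
    """
    sock.setblocking(False)
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        raise TimeoutError(f"no data received within {timeout_s} seconds")
    return sock.recv(nbytes)


if __name__ == "__main__":
    # A local socket pair stands in for the client/server connection.
    client, server = socket.socketpair()
    server.sendall(b"WAL data")
    print(recv_with_timeout(client, 64, 1.0))
    try:
        recv_with_timeout(client, 64, 0.1)  # peer is silent now
    except TimeoutError as exc:
        print("timed out:", exc)
    client.close()
    server.close()
```

The user deals with a single timeout value here, rather than separate idle,
interval and count settings.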

If it is implemented in the FE/BE protocol (do you mean making such
non-blocking behavior part of libpq, or something else?), it would be generic
and usable by other clients as well, but it might need a few interface changes.

IMHO, if such low-impact changes make pg_basebackup sensitive to network
failures, the current approach can also be considered.

With Regards,
Amit Kapila.
