Re: [ADMIN] pg_basebackup blocking all queries with horrible performance

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Lonni J Friedman <netllama(at)gmail(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, Jerry Sievers <gsievers19(at)comcast(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [ADMIN] pg_basebackup blocking all queries with horrible performance
Date: 2012-06-29 10:22:18
Message-ID: CABUevEwEfd1+md8kqgetP6wxe6yP3yFUOa4tixSmFvZRiQRvfg@mail.gmail.com
Lists: pgsql-admin pgsql-hackers

On Wed, Jun 27, 2012 at 7:24 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Jun 21, 2012 at 3:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Jun 20, 2012 at 7:18 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>>>> You agreed to add something like NOSYNC option into START_REPLICATION command?
>>>>>
>>>>> I'm on the fence. I was hoping somebody else would chime in with an
>>>>> opinion as well.
>>>>
>>>> +1
>>>
>>> Nobody else with any opinion on this? :(
>>
>> I don't think we really need a NOSYNC flag at this point.  Just not
>> setting the flush location in clients that make a point of flushing in
>> a timely fashion seems fine.
>
> Okay, I'm in the minority, so I'm writing the patch that way. WIP
> patch attached.
>
> In the patch, pg_basebackup background process and pg_receivexlog always
> return invalid location as flush one, and will never become sync standby even
> if their name is in synchronous_standby_names. The timing of their sending

That doesn't match the patch, AFAICS. The patch always sets the
correct write location, which means it can become a remote_write
synchronous standby, no? It will only send it back when the timeout
expires, but it will be sent back.

I wonder if that might actually be a more reasonable mode of operation
in general:

* always send back the write position, at the write interval
* always send back the flush position, when we're flushing (meaning
when we switch xlog)

and then have an option that makes it possible to:
* always send back the write position as soon as it changes (making
for a reasonable remote_write sync standby)
* actually flush the log after each write instead of end of file
(making for a reasonable full sync standby)

meaning you'd have something like "pg_receivexlog --sync=write" and
"pg_receivexlog --sync=flush" controlling it instead.
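
To make the proposal above concrete, here is a rough model of the reply
policy, written in Python rather than as the real C code. It is only a
sketch under stated assumptions: the names SyncMode, Receiver and Reply
are hypothetical, positions are plain integers, INVALID_LSN stands in
for the invalid location, and "flushing" is modelled by simply copying
the write position.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

INVALID_LSN = 0  # stand-in for an invalid/unknown location (assumption)


class SyncMode(Enum):
    NONE = "none"    # default: reply on the timeout or when a file is finished
    WRITE = "write"  # hypothetical --sync=write
    FLUSH = "flush"  # hypothetical --sync=flush


@dataclass
class Reply:
    write: int
    flush: int


class Receiver:
    """Toy model of when a receiver would send a status reply."""

    def __init__(self, mode: SyncMode, interval: float):
        self.mode = mode
        self.interval = interval          # status interval, in seconds
        self.write_pos = INVALID_LSN
        self.flush_pos = INVALID_LSN
        self.last_reply_time = 0.0

    def on_write(self, new_write_pos: int, now: float,
                 segment_finished: bool = False) -> Optional[Reply]:
        changed = new_write_pos != self.write_pos
        self.write_pos = new_write_pos

        # Flush at end of file always; after every write only in FLUSH mode.
        flushed = self.mode is SyncMode.FLUSH or segment_finished
        if flushed:
            self.flush_pos = new_write_pos

        # In WRITE or FLUSH mode, report as soon as the write position moves.
        eager = changed and self.mode is not SyncMode.NONE
        due = now - self.last_reply_time >= self.interval

        if flushed or eager or due:
            self.last_reply_time = now
            return Reply(self.write_pos, self.flush_pos)
        return None
```

In the default mode the write position only goes back on the interval,
and the flush position only when a file is finished; the two sync modes
just make one of those two events happen on every write.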

And deal with the "user put * in synchronous_standby_names and
accidentally got pg_receivexlog as the sync standby" by more clearly
warning people not to use * for that parameter... Since it's simply
dangerous :)
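
As an illustration of why * is risky here, the following is a small
Python sketch of the name matching, not the backend code. It assumes
that synchronous_standby_names is a comma-separated list, that * matches
any application_name, and that names are compared case-insensitively;
the function name is_sync_candidate is made up for this example.

```python
def is_sync_candidate(synchronous_standby_names: str,
                      application_name: str) -> bool:
    """Return True if a client with this application_name could be
    considered as a synchronous standby (simplified model)."""
    names = [n.strip() for n in synchronous_standby_names.split(",")
             if n.strip()]
    for name in names:
        # "*" matches every connected client, including backup tools.
        if name == "*" or name.lower() == application_name.lower():
            return True
    return False
```

With '*' a client connecting as pg_receivexlog qualifies like any other
standby, whereas an explicit list such as 'standby1, standby2' keeps it
out unless someone names it on purpose.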

> the reply depends on the standby_message_timeout specified in -s option. So
> the write position may lag behind the true position.
>
> pg_receivexlog accepts new option -S (better option character?). If this option
> is specified, pg_receivexlog returns true flush position, and can become sync
> standby. It sends back the reply to the master each time the write position
> changes or the timeout passes. If synchronous_commit is set to remote_write,
> synchronous replication to pg_receivexlog would work well.

Yeah, I hadn't considered the remote_write mode, but I guess that's
why you have to track the current write position across loads, which
confused me at first.

Looking at some other use cases for this, I wonder if we should also
force a status message whenever we switch xlog files, even if we
aren't running in sync mode, even if the timeout hasn't expired. I
think that would be a reasonable thing to do, since you often want to
track things based on files.
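
Detecting the file switch is cheap. A minimal Python sketch, assuming
the default 16MB segment size and treating a position as one flat 64-bit
integer (a simplification of how positions are represented in the code
today); parse_lsn, segment_number and switched_segment are illustrative
names only:

```python
WAL_SEG_SIZE = 16 * 1024 * 1024  # default segment size; assumed not changed


def parse_lsn(text: str) -> int:
    """Convert the textual 'hi/lo' hex form into a single integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def segment_number(lsn: int) -> int:
    """Index of the segment file that contains this position."""
    return lsn // WAL_SEG_SIZE


def switched_segment(prev_lsn: int, cur_lsn: int) -> bool:
    """True if moving from prev_lsn to cur_lsn finished at least one file,
    which is when a status message would be forced."""
    return segment_number(cur_lsn) > segment_number(prev_lsn)
```

For example, going from 0/FFFFFF to 0/1000000 completes the first file
and would trigger a message, while going from 0/1000000 to 0/1000100
would not.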

> The patch needs more documentation. But I think that it's worth reviewing the
> code in advance, so I attached the WIP patch. Comments? Objections?

Looking at the code, what exactly prompts the changes to the backend
side? That seems unrelated? Are we actually considering picking a
standby with InvalidXLogRecPtr as a sync standby today?

Isn't it enough to just send the proper write and flush locations from
the frontend?

> The patch is based on current HEAD, i.e., 9.3dev. If the patch is applied,
> we need to write the backport version of the patch for 9.2.

Oh, it conflicts with Heikki's xlog patches, right? Ugh. But yeah.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
