Re: [ADMIN] pg_basebackup blocking all queries with horrible performance

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Lonni J Friedman <netllama(at)gmail(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, Jerry Sievers <gsievers19(at)comcast(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [ADMIN] pg_basebackup blocking all queries with horrible performance
Date: 2012-07-02 18:17:16
Message-ID: CAHGQGwHhUZf=vMhhAp+HBsYEEvxjTk_krL+s7H0n_eiXn4zqDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-hackers

On Mon, Jul 2, 2012 at 4:01 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Sun, Jul 1, 2012 at 7:14 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>
>> On Fri, Jun 29, 2012 at 7:22 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> On Wed, Jun 27, 2012 at 7:24 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>> On Thu, Jun 21, 2012 at 3:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>>> On Wed, Jun 20, 2012 at 7:18 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>>>>>>> You agreed to add something like NOSYNC option into START_REPLICATION command?
>>>>>>>>
>>>>>>>> I'm on the fence. I was hoping somebody else would chime in with an
>>>>>>>> opinion as well.
>>>>>>>
>>>>>>> +1
>>>>>>
>>>>>> Nobody else with any opinion on this? :(
>>>>>
>>>>> I don't think we really need a NOSYNC flag at this point. Just not
>>>>> setting the flush location in clients that make a point of flushing in
>>>>> a timely fashion seems fine.
>>>>
>>>> Okay, I'm in the minority, so I'm writing the patch that way. WIP
>>>> patch attached.
>>>>
>>>> In the patch, pg_basebackup background process and pg_receivexlog always
>>>> return invalid location as flush one, and will never become sync standby even
>>>> if their name is in synchronous_standby_names. The timing of their sending
>>>
>>> That doesn't match with the patch, afaics. The patch always sets the
>>> correct write location, which means it can become a remote_write
>>> synchronous standby, no? It will only send it back when timeout
>>> expires, but it will be sent back.
>>
>> No. Though correct write location is sent back, they don't become sync standby
>> because flush location is always invalid. While flush location is
>> invalid, the master
>> will never regard the remote server as sync one even if synchronous_commit is
>> set to remote_write.
>
> Oh. I wasn't aware of that part.
>
>
>>> I wonder if that might actually be a more reasonable mode of operation
>>> in general:
>>>
>>> * always send back the write position, at the write interval
>>> * always send back the flush position, when we're flushing (meaning
>>> when we switch xlog)
>>>
>>> have an option that makes it possible to:
>>> * always send back the write position as soon as it changes (making
>>> for a reasonable remote_write sync standby)
>>> * actually flush the log after each write instead of end of file
>>> (making for a reasonable full sync standby)
>>>
>>> meaning you'd have something like "pg_receivexlog --sync=write" and
>>> "pg_receivexlog --sync=flush" controlling it instead.
>>
>> Yeah, in this way, pg_receivexlog can become sync even if
>> synchronous_commit is on, which seems more useful. But
>> I'm thinking that the synchronous pg_receivexlog stuff should
>> be postponed to 9.3 because its patch seems to become too
>> big to apply at this beta stage. So, in 9.2, to fix the problem,
>> what about just applying the simple patch which prevents
>> pg_basebackup background process and pg_receivexlog from
>> becoming sync standby whatever synchronous_standby_names
>> and synchronous_commit are set to?
>
> Agreed.
>
> With the addition that we should set the write location, because
> that's very useful and per what you said above should be perfectly
> safe.
>
>
>>> And deal with the "user put * in synchronous_standby_names and
>>> accidentally got pg_receivexlog as the sync standby" by more clearly
>>> warning people not to use * for that parameter... Since it's simply
>>> dangerous :)
>>
>> Yep.
>
> What would be good wording? Something along the line of "Using the *
> entry is not recommended since it can lead to unexpected results when
> new standbys are added" or something like that?
>
>
>>>> the reply depends on the standby_message_timeout specified in -s option. So
>>>> the write position may lag behind the true position.
>>>>
>>>> pg_receivexlog accepts new option -S (better option character?). If this option
>>>> is specified, pg_receivexlog returns true flush position, and can become sync
>>>> standby. It sends back the reply to the master each time the write position
>>>> changes or the timeout passes. If synchronous_commit is set to remote_write,
>>>> synchronous replication to pg_receivexlog would work well.
>>>
>>> Yeah, I hadn't considered the remote_write mode, but I guess that's
>>> why you have to track the current write position across loads, which
>>> first confused me.
>>
>> The patch has to track the current write location to decide whether to send
>> back the reply to the master, IOW to know whether the write location
>> has changed, IOW to know whether we've already sent the reply about
>> the latest write location.
>
> Yeha, makes perfect sense.
>
>
>>> Looking at some other usecases for this, I wonder if we should also
>>> force a status message whenever we switch xlog files, even if we
>>> aren't running in sync mode, even if the timeout hasn't expired. I
>>> think that would be a reasonable thing to do, since you often want to
>>> track things based on files.
>>
>> You mean that the pg_receivexlog should send back the correct flush
>> location whenever it switches xlog files?
>
> No, I mean just send back a status message. Meaning that without
> specifiying the sync modes per above, it would send back the *write*
> location. This would be useful for tracking xlog filenames between
> master and pg_receivexlog, without extra delay.
>
>>>> The patch needs more documentation. But I think that it's worth reviewing the
>>>> code in advance, so I attached the WIP patch. Comments? Objections?
>>>
>>> Looking at the code, what exactly prompts the changes to the backend
>>> side? That seems unrelated? Are we actually considering picking a
>>> standby with InvalidXlogRecPtr as a sync standby today?
>>>
>>> Isn't it enough to just send the proper write and flush locations from
>>> the frontend?
>>
>> No, unless I'm missing something.
>>
>> The problem that we should address first is that the master can pick up
>> pg_basebackup background process and pg_receivexlog as a sync
>> standby even if they always return an invalid write and flush positions.
>> Since they don't send any correct write and flush positions, if they are
>> accidentally regarded as sync standby, all transactions can get blocked
>> infinitely. So the patch needed to change the walsender code so that it
>> doesn't pick up the remote server as sync one while its flush position
>> is invalid.
>
> Yeah, that is clearly wrong. I think I missed this behaviour, and got
> confused by the fact that the patch was trying to fix two different
> things - only one of which I was aware of.
>
> So yes, per above, let's isolate out this part as one patch and get
> that into 9.2, along with the "set the proper write location", but
> leave everything else for 9.3.

Agreed. The attached patch always sets the correct write location and
prevents the remote server sending back invalid flush location from
becoming sync standby.

Regards,

--
Fujii Masao

Attachment Content-Type Size
prevent_pgreceivexlog_becoming_syncstandby_v1.patch application/octet-stream 4.7 KB

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Samuel Stearns 2012-07-03 01:43:46 Duplicate Index Creation
Previous Message Magnus Hagander 2012-07-01 19:01:12 Re: [ADMIN] pg_basebackup blocking all queries with horrible performance

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-07-02 18:42:12 Re: Event Triggers reduced, v1
Previous Message Robert Haas 2012-07-02 17:57:03 Re: Event Triggers reduced, v1