Re: pg_basebackup blocking all queries with horrible performance

From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Lonni J Friedman <netllama(at)gmail(dot)com>
Cc: Jerry Sievers <gsievers19(at)comcast(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-admin(at)postgresql(dot)org
Subject: Re: pg_basebackup blocking all queries with horrible performance
Date: 2012-06-08 06:04:41
Message-ID: 4FD195F9.8080108@ringerc.id.au
Lists: pgsql-admin pgsql-hackers

On 06/08/2012 09:01 AM, Lonni J Friedman wrote:
> On Thu, Jun 7, 2012 at 5:07 PM, Jerry Sievers<gsievers19(at)comcast(dot)net> wrote:
>> You might try stopping pg_basebackup in place with SIGSTOP and check
>> if the problem goes away. Send SIGCONT and you should start having
>> sluggishness again.
>>
>> If verified, then any sort of throttling mechanism should work.
>
> I'm certain that the problem is triggered only when pg_basebackup is
> running. It's very predictable, and goes away as soon as pg_basebackup
> finishes running. What do you mean by a throttling mechanism?

Sure, it only happens when pg_basebackup is running. But if you *pause*
pg_basebackup, so it's still running but not currently doing work, does
the problem go away? Does it come back when you unpause pg_basebackup?
That's what Jerry was telling you to try.

If the problem goes away when you pause pg_basebackup and comes back
when you unpause it, it's probably a system load problem.

If it doesn't go away, it's more likely to be a locking issue or
something _other_ than simple load.

SIGSTOP ("kill -STOP") pauses a process, and SIGCONT ("kill -CONT")
resumes it, so on Linux you can use these to find out. When you
SIGSTOP pg_basebackup, the postgres backend associated with it should
block shortly afterwards, because its buffers fill up and it can't send
more data, so the load should come off the server.
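
The pause/unpause test could be scripted roughly as below. This is only a
sketch: pause_and_resume is a made-up helper name, the PID is whatever
"pgrep pg_basebackup" reports on your system, and the caller needs
permission to signal that process.

```python
import os
import signal
import time


def pause_and_resume(pid: int, pause_seconds: float) -> None:
    """Suspend process `pid` with SIGSTOP, wait, then resume it with SIGCONT.

    Hypothetical helper for illustration; `pid` is assumed to be the
    pg_basebackup process.
    """
    os.kill(pid, signal.SIGSTOP)  # equivalent to: kill -STOP <pid>
    try:
        # While the process is stopped, watch whether queries speed up again.
        time.sleep(pause_seconds)
    finally:
        # Always resume, even if the wait is interrupted.
        os.kill(pid, signal.SIGCONT)  # equivalent to: kill -CONT <pid>
```

For example, pause_and_resume(12345, 60) would hold the process for a
minute; if query performance recovers during that minute and degrades
again afterwards, load is the likely cause.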

A "throttling mechanism" refers to anything that limits the rate or
speed of a thing. In this case, what you want to do if your problem is
system overload is to limit the speed at which pg_basebackup does its
work so other things can still get work done. In other words you want to
throttle it. Typical throttling mechanisms include the "ionice" and
"renice" commands to change I/O and CPU priority, respectively.
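
As a rough sketch of what that looks like in practice, the following builds
and runs the two commands for a given PID. The helper names and the default
nice level of 10 are assumptions for illustration, and how much "ionice"
helps depends on the I/O scheduler in use.

```python
import subprocess


def throttle_commands(pid: int, nice_level: int = 10) -> list[list[str]]:
    """Return the ionice and renice command lines for `pid`.

    Hypothetical helper; the nice level default is an arbitrary choice.
    """
    return [
        # I/O scheduling class 3 is "idle": only use otherwise idle disk time.
        ["ionice", "-c", "3", "-p", str(pid)],
        # Lower the CPU priority (a higher nice value means lower priority).
        ["renice", "-n", str(nice_level), "-p", str(pid)],
    ]


def throttle(pid: int, nice_level: int = 10) -> None:
    """Run both commands against `pid`, raising if either one fails."""
    for argv in throttle_commands(pid, nice_level):
        subprocess.run(argv, check=True)
```

Calling throttle(pid) needs to run as the process owner or as root.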

Note that you may need to change the priority of the *backend* that
pg_basebackup is using, not necessarily the pg_basebackup command
itself. I haven't done enough with Pg's replication to know how that
works, so someone else will have to fill that bit in.

--
Craig Ringer
