Re: backups blocking everything

From: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
To: "Nicholson, Brad (Toronto, ON, CA)" <bnicholson(at)hp(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: backups blocking everything
Date: 2011-10-27 21:15:28
Message-ID: CAEV0TzB4LxwJ893-z-w2WFt9gULJZSLH5zWRVEmgM6zK4VbLmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Oct 27, 2011 at 1:45 PM, Nicholson, Brad (Toronto, ON, CA) <
bnicholson(at)hp(dot)com> wrote:

> >From: pgsql-performance-owner(at)postgresql(dot)org [mailto:
> pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Samuel Gendler
> >Sent: Thursday, October 27, 2011 12:47 PM
> >To: pgsql-performance(at)postgresql(dot)org
> >Subject: [PERFORM] backups blocking everything
> >
> >I've got a large mixed-used database, with the data warehouse side of
> things consisting of several tables at hundreds of millions of rows, plus a
> number of tables with tens of >millions. There is partitioning, but as the
> volume of data has risen, individual partitions have gotten quite large.
> Hardware is 2x4 core 2.0Ghz Xeon processors, 176GB of RAM, 4 drives in
> >raid 10 for WAL logs and 16 or 20 spindles for data, also in RAID 10.
> Total database size is currently 399GB - via pg_database_size(). It's also
> worth noting that we switched from 8.4 to >9.0.4 only about a month ago, and
> we were not seeing this problem on 8.4.x. The database is growing, but not
> at some kind of exponential rate. full backup, compressed, on the old
> hardware >was 6.3GB and took about 1:45:00 to be written. Recent backups
> are 8.3GB and taking 3 or 4 hours. We were not seeing al queries stall out
> during the backups on 8.4, so far as I am aware.
> >
> >The time it takes for pg_dump to run has grown from 1 hour to 3 and even 4
> hours over the last 6 months, with more than half of that increase occurring
> since we upgrade to 9.0.x. In the >last several weeks (possibly since the
> upgrade to 9.0.4), we are seeing all connections getting used up (our main
> apps use connection pools, but monitoring and some utilities are making
> >direct connections for each query, and some of them don't check for the
> prior query to complete before sending another, which slowly eats up
> available connections). Even the connection >pool apps cease functioning
> during the backup, however, as all of the connections wind up in parse
> waiting state. I also see lots of sockets in close wait state for what
> seems to be an >indefinite period while the backup is running and all
> connections are used up. I assume all of this is the result of pg_dump
> starting a transaction or otherwise blocking other access. I >can get
> everything using a pool, that's not a huge problem to solve, but that won't
> fix the fundamental problem of no queries being able to finish while the
> backup is happening.
>
> What is the I/O utilization like during the dump? I've seen this situation
> in the past and it was caused be excessively bloated tables causing I/O
> starvation while they are getting dumped.
>

There are definitely no bloated tables. The large tables are all
insert-only, and old data is aggregated up and then removed by dropping
whole partitions. There should be no bloat whatsoever. The OLTP side of
things is pretty minimal, and I can pg_dump those schemas in seconds, so
they aren't the problem, either. I don't know what the I/O utilization is
during the dump, offhand. I'll be doing a more thorough investigation
tonight, though I suppose I could go look at the monitoring graphs if I
weren't in the middle of 6 other things at the moment. the joys of startup
life.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2011-10-27 22:19:46 Re: Performance problem with a table with 38928077 record
Previous Message Nicholson, Brad (Toronto, ON, CA) 2011-10-27 20:45:11 Re: backups blocking everything