Re: Waiting on ExclusiveLock on extension 9.3, 9.4 and 9.5

From: Tom Dearman <tom(dot)dearman(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Waiting on ExclusiveLock on extension 9.3, 9.4 and 9.5
Date: 2015-11-02 15:32:08
Message-ID: 2F14C445-07AD-46FC-9FB7-037F3242B3BA@gmail.com
Lists: pgsql-general

Thanks for the prompt replies so far. I have done some more investigation so that I can clearly answer some of the questions.

The original shared_buffers was 8G, and on Friday I did another run using this old value instead of my more recent 1G limit. There was no noticeable improvement. I also installed the pg_buffercache extension and, following articles such as:

http://www.keithf4.com/a-large-database-does-not-mean-large-shared_buffers/

using:

-- estimate of the part of shared_buffers that is actually in active use
-- (8192 = default block size in bytes)
SELECT pg_size_pretty(count(*) * 8192) as ideal_shared_buffers
FROM pg_class c
INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
WHERE usagecount >= 3;
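
For completeness, installing the extension itself is just a one-liner (a minimal sketch, assuming superuser rights on the database being measured):

CREATE EXTENSION IF NOT EXISTS pg_buffercache;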

My system under load is using just over 500M of shared_buffers at usagecount >= 3. Our system is very write heavy, with all of the big tables written to but not read from (at least during the load test run). Although our db will grow (under load) to 100G in a few hours and keep growing, the amount of data in shared_buffers (according to the observations above) seems low. We have the WAL on a different disk from the main tables. These are representative TPS figures for the disks (from SAR) while under our load test:

06:34:01 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
06:35:01 PM dev8-0 176.15 25.89 8773.98 49.96 0.83 4.73 3.79 66.85
06:35:01 PM dev8-1 174.74 0.13 8746.49 50.06 0.81 4.64 3.82 66.81
06:35:01 PM dev8-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
06:35:01 PM dev8-5 1.41 25.76 27.49 37.74 0.02 14.96 2.31 0.33
06:35:01 PM dev8-16 15.02 0.00 3994.82 265.95 1.27 84.88 2.76 4.14
06:35:01 PM dev8-17 15.02 0.00 3994.82 265.95 1.27 84.88 2.76 4.14

dev8-0 to dev8-5 represent the disk where the WAL is; dev8-16 to dev8-17 are where the main postgres data directory is located.
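
For what it's worth, the write-only pattern on the big tables can also be seen from the stats collector; a rough sketch (assumes track_counts is on, which is the default):

SELECT relname,
       n_tup_ins,                                              -- rows inserted
       n_tup_upd,                                              -- rows updated
       coalesce(seq_scan, 0) + coalesce(idx_scan, 0) AS reads  -- scans of any kind
FROM pg_stat_user_tables
ORDER BY n_tup_ins DESC
LIMIT 10;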

We have run bonnie against these WAL disks, and the SAR log for that run shows the following (examples for a heavy-write phase and a heavy read/write phase):

heavy write
01:28:01 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
01:29:01 PM dev8-0 371.85 12.11 377771.45 1015.97 122.93 330.46 2.69 100.00
01:29:01 PM dev8-1 371.85 12.11 377771.45 1015.97 122.93 330.46 2.69 100.00
01:29:01 PM dev8-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:29:01 PM dev8-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:29:01 PM dev8-16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:29:01 PM dev8-17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

heavy read/write
12:56:02 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
12:57:01 PM dev8-0 456.32 355622.67 109942.01 1020.26 58.34 126.70 2.17 99.00
12:57:01 PM dev8-1 456.32 355622.67 109942.01 1020.26 58.34 126.70 2.17 99.00
12:57:01 PM dev8-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:57:01 PM dev8-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:57:01 PM dev8-16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:57:01 PM dev8-17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

To me this shows that we are not IO limited; at least, it is not obvious to me how we could be. I have also previously run our load test (purely as a sanity check) with the WAL written to a RAM disk, just to be sure that IO was not the bottleneck, and we still had the same problem with waiting on exclusive locks.
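
When one of the spikes is in progress, something along these lines should show the backends queued behind the extension lock (a sketch, untested here; pg_locks reports relation extension waits with locktype = 'extend'):

SELECT l.pid,
       l.relation::regclass AS relation,   -- the table being extended
       a.query                             -- what the blocked backend is running
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.locktype = 'extend'
  AND NOT l.granted;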

As far as the checkpoints go, they do happen every 5 minutes and take about 4.5 minutes, which corresponds to the checkpoint_completion_target of 0.9 we have set.
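
As a cross-check on what the log says, the cumulative checkpoint counters tell the same story; a minimal sketch against pg_stat_bgwriter (columns as of 9.2+):

SELECT checkpoints_timed,        -- checkpoints triggered by checkpoint_timeout
       checkpoints_req,          -- checkpoints forced early (e.g. by WAL volume)
       checkpoint_write_time,    -- ms spent writing buffers during checkpoints
       checkpoint_sync_time,     -- ms spent in fsync at the end of checkpoints
       buffers_checkpoint,       -- buffers written by checkpoints
       buffers_backend           -- buffers written directly by backends
FROM pg_stat_bgwriter;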

> On 28 Oct 2015, at 19:20, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> On Wed, Oct 28, 2015 at 8:43 AM, Tom Dearman <tom(dot)dearman(at)gmail(dot)com> wrote:
>> We have a performance problem when our postgres is under high load. The CPU
>> usage is very low, we have 48 cores for our postgres and the idle time
>> averages at 90%. The problem is we get spikes in our transaction times
>> which don’t appear with any obvious regularity and when we get the larger
>> spikes, if I look in the postgres log we see that there is locking on
>> 'process 41915 acquired ExclusiveLock on extension of relation 27177 of
>> database 26192’. The actual relation changes: one time it might be one table
>> and another time another, though they are always big tables. I have looked
>> at various previous threads and the only suggestions are either that the
>> disk io is maxed out, which from our observations we don’t believe is the
>> case for us,
>
> What are those observations? Keep in mind that if 20 processes are
> all trying to extend the relation at the same time, one will block on
> IO (according to top/sar/vmstat etc.) and the other 19 will block on
> that first one on a PostgreSQL heavy-weight lock. So all 20 of them
> are effectively blocked on IO, but system monitoring tools won't know
> that.
>
> Also, the IO spikes will be transient, so any monitoring that
> over-averages will not pick up on them.
>
>
>> or that ‘shared_buffers’ is too large - so we have reduced this
>> right down to 1G. In the previous threads there was an indication that the
>> underlying problem was a lock which I believe has been either removed or
>> much improved in 9.5 (see Lock scalability improvements), however we have
>> not seen any improvement in the relation extension locking problem that we
>> see. The version of 9.5 that we have tested is beta1. Any help in showing
>> us how to improve this would be greatly appreciated.
>
> I don't believe any of the improvements made were to this area.
>
> Your latency spikes seem to be happening at a 20 minute interval.
> That would make me think they are lined up with end-of-checkpoint
> fsync activity, except those should be happening every 5 minutes as
> your conf has not changed checkpoint_timeout away from the default.
> Since you have log_checkpoints on, what do you see in the log files
> about how often they occur, and what the checkpoint write time, sync
> time, etc. are?
>
> Cheers,
>
> Jeff
