Re: Inserting 8MB bytea: just 25% of disk perf used?

From: Scott Carey <scott(at)richrelevance(dot)com>
To: "fkater(at)googlemail(dot)com" <fkater(at)googlemail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Inserting 8MB bytea: just 25% of disk perf used?
Date: 2010-01-19 20:31:04
Message-ID: F77F68B4-8EE9-4F71-BF6F-53366494A684@richrelevance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On Jan 19, 2010, at 2:50 AM, fkater(at)googlemail(dot)com wrote:

> Scott Carey:
>
>> You are CPU bound.
>>
>> 30% of 4 cores is greater than 25%. 25% is one core fully
>> used.
>
> I have measured the cores separately. Some of them reached
> 30%. I am not CPU bound here.
>

Measuring the cores isn't enough. The OS switches threads between cores faster than it aggregates the usage time. On Windows, measure the individual process CPU to see if any of them are near 100%. One process can be using 100% (one full cpu, cpu bound) but switching between cores making it look like 25% on each.

If the individual postgres backend is significantly less than 100%, then you are probably not CPU bound. If this is the case and additionally the system has significant disk wait time, then you are definitely not CPU bound.
Record and post the perfmon log if you wish. In the "Process" section, select CPU time %, system time %, and user time % for all processes. In the graph, you should
see one (or two) processes eating up that CPU during the test run.

>
>> If I am wrong, you are I/O bound
>
> Yes. This is the first half of what we found out now.
>

Does the OS report that you are actually waiting on disk? See the PerfMon "Physical Disk" section. Disk time % should be high if you are waiting on disk.

>> -- this will show up in
>> windows Performance Monitor as "Disk Time (%)" -- which
>> you can get on a per device or total basis, along with i/o
>> per second (read and/or write) and bytes/sec metrics.
>
> Yes, I am using this tool.
>

What does it report for disk time %? How many I/O per second?

> However, the deeper question is (sounds ridiculous): Why am
> I I/O bound *this much* here. To recall: The write
> performance in pg is about 20-25% of the worst case serial
> write performance of the disk (and only about 8-10% of the
> best disk perf) even though pg_xlog (WAL) is moved to
> another disk, only 10 simple INSERT commands, a simple table
> of 5 columns (4 unused, one bytea) and one index for OID, no
> compression since STORAGE EXTERNAL, ntfs tweaks (noatime
> etc), ...
>

Its not going to be completely serial, we want to know if it is disk bound, and if so by what type of access. The disk time %, i/o per second, and MB/sec are needed to figure this out. MB/sec alone is not enough. PerfMon has tons of useful data you can extract on this -- i/o per second for writes and reads, size of writes, size of reads, time spent waiting on each disk, etc.

All the writes are not serial. Enough disk seeks interleaved will kill the sequential writes.
You have random writes due to the index. Try this without the index and see what happens. The index is also CPU consuming.
You can probably move the index to the other disk too (with tablespaces), and the random disk activity may then follow it.
To minimize index writes, increase shared_buffers.

>
>> To prove that you are CPU bound, split your test in half,
>> and run the two halves at the same time. If you are CPU
>> bound, then your bytes/sec performance will go up
>> significantly, along with CPU usage.
>
> Done already (see earlier posting). I am not CPU bound.
> Speed was the same.
>
If that result correlates with the system reporting high Disk Time (%), and the MB/sec written is low, then random writes or reads are causing the slowdown. The chief suspects for that are the WAL log, and the index. The WAL is generally sequential itself, and not as much of a concern as the index.

Another reply references DiskMon from sysinternals. This is also highly useful. Once you have identified the bottleneck from PerfMon, you can use this to see the actual system API calls for the disk reads and writes, tracking down to "which process on what file". Much more than you can get from Linux easily.

Bulk inserts into an indexed table is always significantly slower than inserting unindexed and then indexing later. Partitioned tables combined with staging tables can help here if you need to increase insert throughput more. Also, if random writes are your problem, a battery backed caching raid controller will significantly improve performance, as will anything that can improve random write performance (high quality SSD, faster RPM disks, more disks).

>
> Thank You for the detailed reply.
>
> Felix
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Carlo Stonebanks 2010-01-19 21:09:56 Re: New server to improve performance on our large and busy DB - advice?
Previous Message PG User 2010 2010-01-19 20:19:10 performance question on VACUUM FULL (Postgres 8.4.2)