From: | Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> |
---|---|
To: | |
Cc: | Simon Riggs <simon(at)2ndQuadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alan Li <ali(at)truviso(dot)com> |
Subject: | Re: 8.4 open item: copy performance regression? |
Date: | 2009-06-21 10:48:08 |
Message-ID: | 4A3E0FE8.6000508@kaltenbrunner.cc |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Stefan Kaltenbrunner wrote:
> Simon Riggs wrote:
>> On Sun, 2009-06-21 at 10:28 +0200, Stefan Kaltenbrunner wrote:
>>> I did some limited testing on that but I was unable to measure any
>>> significant effect - especially since the difference between
>>> wal-logged and not is rather small for a non-parallel COPY (ie in the
>>> above example you get around 6m20s runtime for wal-logged and ~5m40s
>>> in the other case).
>>
>> This is a common confusion for small tests.
>>
>> Non-WAL logged case causes all buffers to be written to disk at end of
>> COPY. This is roughly the same size as the volume of WAL written. In
>> logged case we do not write data blocks, they get written at next
>> checkpoint. So the reduction in I/O is not apparent, since during the
>> period of the test the I/O is about the same in both cases and less I/O
>> in the non-WAL logged case. On longer tests the difference shows more
>> clearly because the data blocks start to migrate out of shared buffers
>> while the COPY is still running, effecting the test results.
>
> I was actually testing with and without explicit CHECKPOINTing
> before/after the load(and also with longer runs) too - the difference is
> negligible especially with only one process involved.
> I think the difference is simply not that large because we are still
> mostly CPU bound within COPY on reasonably fast IO-subsystems.
hmm to further demonstrate that I just did some testing(same config as
before and the 16MB for the buffer) by loading those 60M rows into a
20GB ramdisk instead of the SAN(with a CHECKPOINT before and after).
this results in the following "improvements":
16384:
wal bypass: 5min40s -> 5min10s (~9%)
wal logged: 6min20s -> 6min8s (~3%)
vmstat 5 output shows that the system is in fact CPU bound (ie using ~6%
which is more or less a full core on a 16 core box) and not doing
anything IO-wise.
r b swpd free buff cache si so bi bo in cs us sy
id wa st
1 0 0 19010444 118648 15415684 0 0 0 6 1068 2151
6 0 94 0 0
1 0 0 18870448 118652 15555204 0 0 0 6 1069 2142
6 0 94 0 0
1 0 0 18730568 118684 15694872 0 0 0 185 1080 2151
6 0 94 0 0
1 0 0 18591236 118692 15834516 0 0 0 19 1072 2144
6 0 94 0 0
1 0 0 18451472 118696 15973532 0 0 0 46 1073 2152
6 0 94 0 0
1 0 0 18311720 118704 16113204 0 0 0 7 1059 2136
6 0 94 0 0
1 0 0 18171968 118704 16252944 0 0 0 0 1077 2171
6 0 94 0 0
1 0 0 18032088 118712 16392300 0 0 0 54 1062 2138
6 0 94 0 0
1 0 0 17891716 118720 16532060 0 0 0 8 1078 2176
6 0 94 0 0
So I do think that IO is in fact not too significant for this kind of
testing and we still have ways to go in terms of CPU efficiency in COPY.
Stefan
From | Date | Subject | |
---|---|---|---|
Next Message | rct682 | 2009-06-21 10:54:57 | enquery for timezone GMT |
Previous Message | Stefan Kaltenbrunner | 2009-06-21 09:17:01 | Re: 8.4 open item: copy performance regression? |