Re: UPDATEDs slowing SELECTs in a fully cached database

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: lars <lhofhansl(at)yahoo(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Ivan Voras <ivoras(at)freebsd(dot)org>, pgsql-performance(at)postgresql(dot)org
Subject: Re: UPDATEDs slowing SELECTs in a fully cached database
Date: 2011-07-16 22:33:03
Message-ID: CAMkU=1xFsFC+SR0LbqoJhkK=_KHynWo58YGMZSBdBtuQDzXxCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Jul 15, 2011 at 5:21 PM, lars <lhofhansl(at)yahoo(dot)com> wrote:
> On 07/14/2011 04:03 PM, Jeff Janes wrote:
>>
>> On Wed, Jul 13, 2011 at 3:41 PM, lars<lhofhansl(at)yahoo(dot)com>  wrote:
>>>
>>> On 07/13/2011 11:42 AM, Kevin Grittner wrote:
>>>>
>>>> So transactions without an XID *are* sensitive to
>>>> synchronous_commit.  That's likely a useful clue.
>>>>
>>>> How much did it help the run time of the SELECT which followed the
>>>> UPDATE?
>>>
>>> It has surprisingly little impact on the SELECT side:
>>
>> If your fsync is truly fsyncing, it seems like it should have
>> considerable effect.
>>
>> Could you strace with both -ttt and -T, with and without synchronous
>> commit?
>>
>> Cheers,
>>
>> Jeff
>
> Ok, here we go:
>
> "Q\0\0\0_select count(*) from test w"..., 8192, 0, NULL, NULL) = 96
> <5.357152>
> 1310774187.750791 gettimeofday({1310774187, 750809}, NULL) = 0 <0.000022>
> 1310774187.751023 lseek(12, 0, SEEK_END) = 329908224 <0.000023>
> 1310774187.751109 lseek(15, 0, SEEK_END) = 396607488 <0.000022>
> 1310774187.751186 lseek(18, 0, SEEK_END) = 534175744 <0.000022>
> 1310774187.751360 lseek(12, 0, SEEK_END) = 329908224 <0.000023>
> 1310774187.753389 brk(0x248e000)        = 0x248e000 <0.000026>
> 1310774187.753953 brk(0x24ce000)        = 0x24ce000 <0.000023>
> 1310774187.755158 brk(0x254e000)        = 0x254e000 <0.000024>
> 1310774187.766605 brk(0x2450000)        = 0x2450000 <0.000170>
> 1310774187.766852 lseek(23, 4513792, SEEK_SET) = 4513792 <0.000023>
> 1310774187.766927 write(23,
> "f\320\1\0\1\0\0\0\320\0\0\0\0\340D-\22\0\0\0\30(at)!000000000"..., 32768) =
> 32768 <0.000075>
> 1310774187.767071 fdatasync(23)         = 0 <0.002618>
> 1310774187.769760 gettimeofday({1310774187, 769778}, NULL) = 0 <0.000022>
> 1310774187.769848 sendto(5,
> "\2\0\0\0\350\0\0\0\1(at)\0\0\2\0\0\0\1\0\0\0\0\0\0\0\r\201\0\0>\0\2\0"...,
> 232, 0, NULL, 0) = 232 <0.000064>
> 1310774187.769993 sendto(6,
> "T\0\0\0\36\0\1count\0\0\0\0\0\0\0\0\0\0\24\0\10\377\377\377\377\0\0D"...,
> 66, 0, NULL, 0) = 66 <0.000199>

The total time for this is about 19 ms, but your previous example was
around 35 ms. Is this reproducible? A change of set up between then
and now?

2.6 ms for an fsync seems awfully quick. I wonder if EBS uses
nonvolatile/battery-backed write cache, or if it just lies about fsync
actually hitting disk.

But anyway it looks like you aren't blocking much in system calls, and
I don't think there are non-system-call ways to block, so the time is
probably being spent in something CPU intensive.

On my (physical) computer, synchronous_commit=off does eliminate the
timing differences between the select immediately after the update and
subsequent selects. So while I could reproduce timing differences
that were superficially similar to yours, they seem to have some
fundamentally different cause.

Maybe the best way to figure out what is going on is to loop the
update and the select in different processes, and use perf or oprof to
profile just the select process (with and without the update running).
It would also be good to know the timings without profiling turned on
as well, to know how much the profiling is disturbing the timing.

...

> Here's an example with more dirty rows (I basically let the updater run for
> a while dirtying very many rows).

I'm surprised that make that much of a difference. The select should
only clean up blocks it actually visits, and updating more rows
shouldn't change that very much.

...
> 1310775407.027503 semop(229383, {{2, -1, 0}}, 1) = 0 <2.215137>
...
> No WAL, but checkout that one expensive semop! 2s!!

Is that reproducible, or is it just a one time anomaly?

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message lars 2011-07-17 02:12:47 Re: UPDATEDs slowing SELECTs in a fully cached database
Previous Message lars 2011-07-16 00:21:28 Re: UPDATEDs slowing SELECTs in a fully cached database