From: | Ivan Voras <ivoras(at)freebsd(dot)org> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance |
Date: | 2010-10-07 12:47:06 |
Message-ID: | i8kfft$e5j$1@dough.gmane.org |
Lists: | pgsql-hackers pgsql-performance |
On 10/07/10 02:39, Robert Haas wrote:
> On Wed, Oct 6, 2010 at 6:31 PM, Ivan Voras<ivoras(at)freebsd(dot)org> wrote:
>> On 10/04/10 20:49, Josh Berkus wrote:
>>
>>>> The other major bottleneck they ran into was a kernel one: reading from
>>>> the heap file requires a couple lseek operations, and Linux acquires a
>>>> mutex on the inode to do that. The proper place to fix this is
>>>> certainly in the kernel but it may be possible to work around in
>>>> Postgres.
>>>
>>> Or we could complain to Kernel.org. They've been fairly responsive in
>>> the past. Too bad this didn't get posted earlier; I just got back from
>>> LinuxCon.
>>>
>>> So you know someone who can speak technically to this issue? I can put
>>> them in touch with the Linux geeks in charge of that part of the kernel
>>> code.
>>
>> Hmmm... lseek? As in "lseek() then read() or write()" idiom? It AFAIK
>> cannot be fixed since you're modifying the global "stream position"
>> variable and something has got to lock that.
>
> Well, there are lock free algorithms using CAS, no?
Nothing is really "lock free" - in this case the algorithms simply push
the locking down to atomic operations on the CPU (and the memory bus).
Semantically, *something* has to lock the memory region for however
brief a period of time and then propagate that update to other CPUs'
caches (i.e. invalidate them).
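To make the point concrete, here is a minimal sketch of what a CAS-based update of a shared position could look like with C11 atomics. The names shared_pos and claim_range() are hypothetical and only for illustration; this is not how the Linux kernel or Postgres handles the file offset. Note that the retry loop is still serialized by the hardware: only one CPU's compare-and-swap on that cache line succeeds at a time.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared position, standing in for a per-file offset. */
static _Atomic int64_t shared_pos = 0;

/*
 * Advance the shared position by len and return the offset that was
 * current before the update. No mutex is taken, but each successful
 * compare-and-swap still needs exclusive ownership of the cache line.
 */
static int64_t claim_range(int64_t len)
{
    int64_t old = atomic_load(&shared_pos);

    /* On failure, old is reloaded with the current value and we retry. */
    while (!atomic_compare_exchange_weak(&shared_pos, &old, old + len))
        ;

    return old;
}
```

Under contention the loop simply spins until its CAS wins, which is the "locking pushed down to the CPU" I mean above.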
>> OTOH, pread() / pwrite() don't have to do that.
>
> Hey, I didn't know about those. That sounds like it might be worth
> investigating, though I confess I lack a 48-core machine on which to
> measure the alleged benefit.
As Jon said, it will in any case reduce the number of these syscalls by
half, and they can be wrapped by a C macro for the platforms which don't
implement them.
(and just in case it's needed: pread() is a special case of preadv()).
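A rough sketch of the wrapper idea, assuming a POSIX system. The names read_at, read_at_seek and the NO_PREAD build flag are hypothetical, made up for this example, and are not taken from the Postgres source. The fallback issues two syscalls and moves the descriptor's shared position; the pread() path issues one and leaves the position alone.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask for the pread() declaration */
#endif
#include <sys/types.h>
#include <unistd.h>

/*
 * Fallback: the classic lseek() + read() idiom. Two syscalls, and the
 * lseek() modifies the file position shared by everyone using this
 * open file description.
 */
static ssize_t read_at_seek(int fd, void *buf, size_t len, off_t off)
{
    if (lseek(fd, off, SEEK_SET) == (off_t) -1)
        return -1;
    return read(fd, buf, len);
}

/*
 * NO_PREAD is a hypothetical flag a configure step might define on
 * platforms without pread(). Everywhere else a single pread() call
 * replaces the pair above and does not touch the file position.
 */
#ifdef NO_PREAD
#define read_at(fd, buf, len, off) read_at_seek((fd), (buf), (len), (off))
#else
#define read_at(fd, buf, len, off) pread((fd), (buf), (len), (off))
#endif
```

Callers would then write read_at(fd, buf, len, offset) instead of the seek/read pair, and the same trick applies to pwrite() on the write side.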