Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

From: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Date: 2012-05-30 23:16:27
Message-ID: alpine.LRH.2.02.1205302252020.6351@calx046.ast.cam.ac.uk
Lists: pgsql-hackers

On Wed, 30 May 2012, Florian Pflug wrote:
>
> I wonder if the huge variance could be caused by non-uniform
> synchronization costs across different cores. That's not all that
> unlikely, because at least some cache levels (L2 and/or L3, I think) are
> usually shared between all cores on a single die. Thus, a cache line
> bouncing between cores on the same die might very well be faster than it
> bouncing between cores on different dies.
>
> On Linux, you can use the taskset command to explicitly assign processes
> to cores. The easiest way to check if that makes a difference is to
> assign one core for each connection to the postmaster before launching
> your test. Assuming that cpu assignments are inherited by child
> processes, that should then spread your backends out over exactly the
> cores you specify.

Wow, thanks! This seems to be working to some extent. I've found that
assigning each thread x (0<x<7) to the cpu 1+3*x
(a reminder that I have HT disabled, and in total I have 4 cpus with 6
proper cores each) gives quite good results. After a few runs, I seem
to be getting more or less stable results for the multiple threads,
with the run time of the multithreaded runs ranging from 6 to 11 seconds
across the various threads. (Another reminder: 5-6 seconds is roughly the
time my queries take when run in a single thread.)
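
For illustration, the thread-to-cpu mapping described above could be sketched
roughly as follows. This is a minimal sketch, not the script actually used for
the runs: the function names are hypothetical, the thread indices are taken
literally from the range given above, and os.sched_setaffinity is available
on Linux only (it does roughly what taskset does for a given pid).

```python
import os

# 4 cpus with 6 cores each, HT disabled (figures taken from the message).
TOTAL_CORES = 4 * 6


def cpu_for_thread(x, total_cores=TOTAL_CORES):
    """Return the core index 1 + 3*x for thread x, checking that it exists."""
    cpu = 1 + 3 * x
    if not 0 <= cpu < total_cores:
        raise ValueError(
            f"thread {x} maps to core {cpu}, outside 0..{total_cores - 1}"
        )
    return cpu


def pin_process(pid, x):
    """Pin process `pid` to the core chosen for thread x (Linux only).

    Hypothetical helper: how the pids of the backends are obtained is
    left out here and would depend on the test setup.
    """
    os.sched_setaffinity(pid, {cpu_for_thread(x)})


if __name__ == "__main__":
    # Print the mapping for the thread indices 1..6 (assumed range).
    for x in range(1, 7):
        print(f"thread {x} -> cpu {cpu_for_thread(x)}")
```

With a stride of 3, consecutive threads end up at most two per 6-core cpu, so
the threads are spread over all four packages rather than packed onto one.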

So to some extent one can say that the problem is partially solved (i.e.
it is probably understood).
But the question now is whether there is a *PG* problem here or
not, or whether it is Intel's or Linux's problem.
After all, the slowdown was still caused by locking. If there were no
locking, there wouldn't be any problems (as demonstrated a while ago by
just cat'ting the files in multiple threads).

Cheers,
S

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
