9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

From: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Date: 2012-05-24 13:24:38
Message-ID: alpine.LRH.2.02.1205241246440.14366@calx046.ast.cam.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I've been running some tests on pg 9.2beta1 and in particular a set
of queries like

create table _tmp0 as select * from (
select *, (select healpixid from idt_match as m where
m.transitid=o.transitid)
as x from idt_photoobservation as o offset 0
) as y where x%16=ZZZ order by x;

(where ZZZ is some number integer number 0<=ZZZ<16)

With the following plan:
----------------------------------------------------------------------------------------------------------------------
Sort (cost=3228814504.96..3228815137.21 rows=252902 width=498)
Sort Key: y.x
-> Subquery Scan on y (cost=0.00..3228791809.25 rows=252902 width=498)
Filter: ((y.x % 16::bigint) = 0)
-> Limit (cost=0.00..3228033102.41 rows=50580456 width=490)
-> Seq Scan on idt_photoobservation o (cost=0.00..3228033102.41 rows=50580456 width=490)
SubPlan 1
-> Index Scan using idt_match_transitid_idx on idt_match m (cost=0.00..63.74 rows=1 width=8)
Index Cond: (transitid = o.transitid)

The schema of the tables are:

e2es2=> \d idt_match
Table "public.idt_match"
Column | Type | Modifiers
-----------+----------+-----------
sourceid | bigint |
transitid | bigint |
healpixid | bigint |
flagsxm | smallint |
Indexes:
"idt_match_idx" btree (healpixid)
"idt_match_transitid_idx" btree (transitid)

Table "public.idt_photoobservation"
Column | Type | Modifiers
-----------+----------+-----------
transitid | bigint |
fluxbp0 | real |
....
more columns
....
Indexes:
"idt_photoobservation_idx" btree (transitid)

And I noticed than when I run the query like the one shown above in parallel
(in multiple connections for ZZZ=0...8) the performance of each query
drops down significantly (factor of 2)(despite the fact that during the
execution of the query postgres is mostly CPU bound).

So I tried to oprofile it and strace it, And that's what I saw:

The strace -c of each process shows something like that
#########
Process 18660 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
76.25 0.001342 0 268987 semop
23.75 0.000418 0 61694 read
0.00 0.000000 0 138 lseek
0.00 0.000000 0 355 select
0.00 0.000000 0 3 kill
------ ----------- ----------- --------- --------- ----------------
100.00 0.001760 331177 total
#######

And the oprofile shows this on top:
-------------------------------------------------------------------------------
2863981 25.7117 ReleasePredicateLocks
2863981 100.000 ReleasePredicateLocks [self]
-------------------------------------------------------------------------------
1246629 11.1917 LocalBufferAlloc
1246629 100.000 LocalBufferAlloc [self]
-------------------------------------------------------------------------------
1135393 10.1931 CheckForSerializableConflictIn
1135393 100.000 CheckForSerializableConflictIn [self]
------------------------------------------------------------

So there is a lot of locking for some reason, And I was surprised to see
anything related to SSI in the profile at all, because I'm not running
serializable transactions (I was just running my queries from multiple
psql sessions).

Is there a reasonable explanation for what I'm seeing ?

Thanks in advance,
Sergey

More info: there is no other concurrent activity on the machine.
shared_buffers is 10G, The machine has ~ 60G of RAM, 24 cores and proper
RAID

Another note is that the oprofile and strace were obtained during the
stage when the idt_photo.. table was scanned and data was being stored in
the pgsql_tmp (not during the last bit when the data in pgsql_tmp were
actually being sorted).

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2012-05-24 13:55:34 Re: Interrupting long external library calls
Previous Message Sandro Santilli 2012-05-24 13:04:21 Re: Interrupting long external library calls