parallel sequential scan returns extraneous rows

From: Michael Day <blake(at)rcmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: parallel sequential scan returns extraneous rows
Date: 2016-11-29 14:39:28
Message-ID: D0E8A029-D1AC-42E8-979A-5DE4A77E4413@rcmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I have found a nasty bug when using parallel sequential scans with an exists clause on postgresql 9.6.1. I have found that the rows returned using parallel sequential scan plans are incorrect (though I haven’t dug sufficiently to know in what ways). See below for an example of the issue.

denver=# set max_parallel_workers_per_gather = 0;
SET
denver=# select count(*)
denver-# from users u
denver-# join address a on (a.users_id = u.id)
denver-# where exists (select 1 from address where users_id = u.id)
count
---------
9486910
(1 row)

QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Aggregate (cost=2117655.96..2117655.97 rows=1 width=8)
-> Merge Join (cost=1257.34..2094242.48 rows=9365393 width=0)
Merge Cond: (u.id = a.users_id)
-> Merge Semi Join (cost=495.43..1607025.52 rows=2824687 width=8)
Merge Cond: (u.id = address.users_id)
-> Index Only Scan using users_pk on users u (cost=0.43..1018930.31 rows=11648927 width=4)
-> Index Only Scan using address_idx01 on address (cost=0.43..456495.80 rows=9365393 width=4)
-> Index Only Scan using address_idx01 on address a (cost=0.43..456495.80 rows=9365393 width=4)
(8 rows)

denver=# set max_parallel_workers_per_gather = 1;
SET
denver=# select count(*)
denver-# from users u
denver-# join address a on (a.users_id = u.id)
denver-# where exists (select 1 from address where users_id = u.id);
count
----------
29288954
(1 row)

QUERY PLAN
---------------------------------------------------------------------------------------------------------
Aggregate (cost=1889898.47..1889898.48 rows=1 width=8)
-> Hash Join (cost=1401575.70..1866484.99 rows=9365393 width=0)
Hash Cond: (a.users_id = u.id)
-> Seq Scan on address a (cost=0.00..299463.93 rows=9365393 width=4)
-> Hash (cost=1355233.12..1355233.12 rows=2824687 width=8)
-> Gather (cost=571820.86..1355233.12 rows=2824687 width=8)
Workers Planned: 1
-> Hash Join (cost=570820.86..1071764.42 rows=2824687 width=8)
Hash Cond: (address.users_id = u.id)
-> Parallel Seq Scan on address (cost=0.00..260900.55 rows=5509055 width=4)
-> Hash (cost=379705.27..379705.27 rows=11648927 width=4)
-> Seq Scan on users u (cost=0.00..379705.27 rows=11648927 width=4)
(12 rows)

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2016-11-29 14:53:49 Re: BUG #14439: Trigger function fails with internal error after upgrade to 9.6
Previous Message max.valjanski 2016-11-29 14:16:39 BUG #14439: Trigger function fails with internal error after upgrade to 9.6