Re: parallel sequential scan returns extraneous rows

From: Michael Day <blake(at)rcmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: parallel sequential scan returns extraneous rows
Date: 2016-11-29 21:39:59
Message-ID: 6388A361-BE20-44B5-9F07-58ABDE24DFAD@rcmail.com
Lists: pgsql-bugs

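I was able to reproduce the problem with the following self-contained data set. The first query runs with parallelism disabled, the second with one parallel worker allowed.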

create table users (id integer);
create table address (id integer, users_id integer);

insert into users select s from generate_series(1,1000000) s;
insert into address select s, s/2 from generate_series(1,2000000) s;

analyze users;
analyze address;

set max_parallel_workers_per_gather = 0;

select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);

set max_parallel_workers_per_gather = 1;

select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);
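To make the discrepancy easier to pin down, the two runs can be checked against a count derived from the data alone, and their plan shapes compared with EXPLAIN. This is only a sketch that assumes the tables were loaded exactly as above; the cross-check query is an addition for illustration and was not part of the original report.

```sql
-- Cross-check derived from the data: users_id = s/2 (integer division)
-- is 0 once, each of 1..999999 twice, and 1000000 once. Every id in
-- 1..1000000 exists in users, so a correct join matches
-- 2 * 999999 + 1 = 1999999 address rows.
select count(*) from address where users_id between 1 and 1000000;

-- Plan shape with parallelism disabled.
set max_parallel_workers_per_gather = 0;
explain
select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);

-- Plan shape with one parallel worker allowed.
set max_parallel_workers_per_gather = 1;
explain
select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);
```

If both counts from the queries above agree with the cross-check, the planner handled the EXISTS clause correctly; a larger count from the parallel run would point at the extra "address" join multiplying rows.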

On 11/29/16, 11:19 AM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

Michael Day <blake(at)rcmail(dot)com> writes:
> I have found a nasty bug when using parallel sequential scans with an EXISTS clause on PostgreSQL 9.6.1. The rows returned by parallel sequential scan plans are incorrect (though I haven’t dug deep enough to know in what ways). See below for an example of the issue.

Hm, looks like a planner error: it seems to be forgetting that the join
to "address" should be a semijoin. "address" should either be on the
inside of a "Semi" join (as in your first, correct-looking plan) or be
passed through a unique-ification stage such as a HashAgg. Clearly,
neither thing is happening in the second plan.

I couldn't reproduce this in a bit of trying, however. Can you come
up with a self-contained test case?

regards, tom lane
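The unique-ification alternative described in the quoted reply can be written out by hand, which gives a second formulation to compare against the EXISTS query. This is a hedged sketch, not a statement of what the planner does internally: the alias "d" is introduced here for illustration, and whether PostgreSQL picks a HashAggregate for the DISTINCT step depends on its cost estimates.

```sql
-- Manual equivalent of the semijoin: reduce "address" to one row per
-- users_id first, so joining against it cannot multiply rows.
select count(*)
from users u
join address a on (a.users_id = u.id)
join (select distinct users_id from address) d on (d.users_id = u.id);
```

On a correctly planned query this should return the same count as the EXISTS form, regardless of the max_parallel_workers_per_gather setting.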
