Re: Fix for parallel BTree initialization bug

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: "Jameson, Hunter 'James'" <hunjmes(at)amazon(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <rhaas(at)postgresql(dot)org>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>
Subject: Re: Fix for parallel BTree initialization bug
Date: 2020-09-11 11:11:22
Message-ID: CAA4eK1+ArU=t6b3AjwE6Xs3MV4=UuLwiOV_7Revi9AF8q1WpLA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 11, 2020 at 8:07 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> Against all odds, I was able to reproduce this.
>

Thanks, this helps me understand the problem. What's going on here is
that once one of the workers has moved to the next set of scan keys
without incrementing the shared parallel key count, the other workers
can try to join the ongoing scan with a different set of keys. That
can lead to unpredictable behavior, which is what both you and James
have seen. In your case, the blocks were scanned twice for the same
set of scan keys, so you got more rows than the scan should return.
In James's case, one of the workers changed its scan block to
InvalidBlockNumber (basically the start of the scan) during the scan,
which led to the problem.

So the fix provided by James is correct. I have slightly adjusted the
commit message in the attached patch. It needs to be back-patched to
10, where this feature was introduced.

I have tested this on HEAD. It would be great if you could verify it
on the back branches as well. I'll also do that before committing.

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
v2-0001-Update-parallel-BTree-scan-state-when-the-scan-ke.patch application/octet-stream 1.4 KB
