Re: Fix for parallel BTree initialization bug

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: "Jameson, Hunter 'James'" <hunjmes(at)amazon(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fix for parallel BTree initialization bug
Date: 2020-09-10 04:22:09
Message-ID: 20200910042208.GC18552@telsasoft.com
Lists: pgsql-hackers

On Tue, Sep 08, 2020 at 06:25:03PM +0000, Jameson, Hunter 'James' wrote:
> Hi, I ran across a small (but annoying) bug in initializing parallel BTree scans, which causes the parallel-scan state machine to get confused. The fix is one line; the description is a bit longer—

Which PostgreSQL version was this?

> Before, function _bt_first() would exit immediately if the specified scan keys could never be satisfied--without notifying other parallel workers, if any, that the scan key was done. This moved that particular worker to a scan key beyond what was in the shared parallel-query state, so that it would later try to read in "InvalidBlockNumber", without recognizing it as a special sentinel value.
>
> The basic bug is that the BTree parallel query state machine assumes that a worker process is working on a key <= the global key--a worker process can be behind (i.e., hasn't finished its work on a previous key), but never ahead. By allowing the first worker to move on to the next scan key, in this one case, without notifying other workers, the global key ends up < the first worker's local key.
>
> Symptoms of the bug are: on R/O, we get an error saying we can't extend the index relation, while on an R/W we just extend the index relation by 1 block.

What's the exact error message? Are you able to provide a backtrace?

> To reproduce, you need a query that:
>
> 1. Executes parallel BTree index scan;
> 2. Has an IN-list of size > 1;

Do you mean you have an index on col1 and a query condition like: col1 IN (a, b, c, ...)?

> 3. Has an additional index filter that makes it impossible to satisfy the
> first IN-list condition.

... AND col1::text||'foo' = '';
I think you mean that the "impossible" condition causes a btree worker to
exit early.
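
To check that I'm reading the description correctly, here is a toy model of
the sequence as I understand it. It is only a sketch based on the text quoted
above, not the nbtree code: every name in it (SharedScan, Worker, mark_done,
advance_key, first, run) is made up for illustration, and the real shared
state is certainly more involved.

```python
from dataclasses import dataclass

INVALID_BLOCK = -1  # hypothetical stand-in for the InvalidBlockNumber sentinel


@dataclass
class SharedScan:
    """Toy stand-in for the shared parallel-scan state (names are assumptions)."""
    global_key: int = 0          # index of the scan key all workers agree on
    done: bool = False           # has the current global key been finished?
    next_block: int = INVALID_BLOCK


@dataclass
class Worker:
    local_key: int = 0           # index of the scan key this worker is on


def mark_done(shared: SharedScan, worker: Worker) -> None:
    # A worker that is behind the global key must not mark it done.
    if worker.local_key >= shared.global_key:
        shared.done = True


def advance_key(shared: SharedScan, worker: Worker) -> None:
    # The worker always moves on locally; the global key only moves
    # if somebody reported the current key as done.
    worker.local_key += 1
    if shared.done:
        shared.done = False
        shared.next_block = INVALID_BLOCK
        shared.global_key += 1


def first(shared: SharedScan, worker: Worker,
          key_satisfiable: bool, notify_on_early_exit: bool) -> None:
    # Toy version of starting the scan for the current key.
    if not key_satisfiable:
        if notify_on_early_exit:      # the behaviour the one-line fix adds
            mark_done(shared, worker)
        return                        # early exit: nothing to scan
    mark_done(shared, worker)         # normal path: scan, then report done


def run(notify_on_early_exit: bool) -> tuple[int, int]:
    """Return (worker.local_key, shared.global_key) after one impossible key."""
    shared, worker = SharedScan(), Worker()
    first(shared, worker, key_satisfiable=False,
          notify_on_early_exit=notify_on_early_exit)
    advance_key(shared, worker)
    return worker.local_key, shared.global_key
```

In this model, run(False) leaves the worker's local key ahead of the global
key, which is the broken invariant you describe, while run(True) keeps the
two in step. If the real sequence differs from this, that would be useful to
know.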

> (We encountered such a query, and therefore the bug, on a production instance.)

Could you send the "shape" of the query or its plan, obfuscated and redacted as
needed?

--
Justin
