Re: [HACKERS] [postgresql 10 beta3] unrecognized node type: 90

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, "Adam, Etienne (Nokia-TECH/Issy Les Moulineaux)" <etienne(dot)adam(at)nokia(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>, "Duquesne, Pierre (Nokia-TECH/Issy Les Moulineaux)" <pierre(dot)duquesne(at)nokia(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [postgresql 10 beta3] unrecognized node type: 90
Date: 2017-08-30 18:16:14
Message-ID: 18224.1504116974@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Tue, Aug 29, 2017 at 10:05 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> If no objections, I'll do the additional legwork and push.

> No objections.

Done. Out of curiosity, I pushed just the rescan-param patch to the
buildfarm to start with, to see if anything would fall over, and indeed
some things did:

* prairiedog has shown several instances of a parallel bitmap heap scan
test failing with too many rows being retrieved. I think what's
happening there is that the leader's ExecReScanBitmapHeapScan call is
slow enough to happen that the worker(s) have already retrieved some rows
using the old shared state. We'd determined that the equivalent case
for a plain seqscan would result in no failure because the workers would
think they had nothing to do, but this evidently isn't true for a parallel
bitmap scan.

* prairiedog and loach have both shown failures with the test case from
a2b70c89c, in which the *first* scan produces too many rows and then the
later ones are fine. This befuddled me initially, but then I remembered
that nodeNestloop.c will unconditionally do an ExecReScan call on its
inner plan before the first ExecProcNode call. With the modified code
from 7df2c1f8d, this results in the leader's Gather node's top child
having a pending rescan on it due to a chgParam bit. That's serviced
when we do the first ExecProcNode call on the child, after having started
the workers. So that's another way in which a ReScan call can happen
in the leader when workers are already running, and if the workers have
already scanned some pages then those pages will get scanned again.

So I think this is all fixed up by 41b0dd987, but evidently those patches
are not nearly as independent as I first thought.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message fp9 2017-08-30 18:55:31 BUG #14793: PG Admin Silent install
Previous Message Tom Lane 2017-08-30 17:56:51 Re: BUG #14791: Error 42P07 but the relation DOESN'T Exists! Error 42P01

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2017-08-30 18:54:26 Re: Polyphase merge is obsolete
Previous Message Ashutosh Bapat 2017-08-30 16:47:38 Re: expanding inheritance in partition bound order