Re: Top-N sorts verses parallelism

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Top-N sorts verses parallelism
Date: 2017-12-15 19:10:11
Message-ID: CAMkU=1xS2ucXoFcHsw8=joUi5wz+07AarR8ScdtQNw+yeBHi4g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 14, 2017 at 5:12 PM, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:

> >
> > This looks like a costing bug. The estimated cost of sorting 416,667
> > estimated tuples in one parallel worker is almost identical to the
> > estimated cost of sorting 1,000,000 tuples when parallelism is turned
> > off. Adding some logging to the cost_sort function, it looks like the
> > limit does not get sent down for the parallel estimate:
> >
> > NOTICE: JJ cost_sort tuples 1000000.000000, limit 61.000000, sort_mem 65536
> > NOTICE: JJ cost_sort tuples 416667.000000, limit -1.000000, sort_mem 65536
> >
> > So the LIMIT gets passed down for the execution step, but not for the
> > planning step.
>
> Oh, well spotted. I was looking in the wrong place. Maybe the fix is
> as simple as the attached? It certainly helps in the cases I tested,
> including with wider tables.
>

I had hit on the same change, and was also surprised at where it was
located. With the change, it uses the parallel plan all the way down
to LIMIT 1.
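
To spell out the arithmetic behind the quoted NOTICEs: if I'm reading
costsize.c right, a bounded (top-K) heap sort is costed at roughly
N * log2(2 * K) comparisons, while an unbounded in-memory sort is
costed at about N * log2(N). Here is a back-of-the-envelope sketch,
emphatically not the real cost_sort() code, with the figures taken
from the NOTICEs above and everything else only illustrative:

/*
 * Rough sketch of the comparison-count arithmetic; not the real
 * cost_sort() code, just an illustration of why the missing limit
 * makes the per-worker sort look so expensive.
 */
#include <math.h>
#include <stdio.h>

static double
sort_comparisons(double tuples, double limit)
{
    /* No usable bound: assume a full sort of every input tuple. */
    if (limit <= 0 || limit >= tuples)
        return tuples * log2(tuples);

    /* Usable bound: assume a bounded heap of roughly "limit" entries. */
    return tuples * log2(2.0 * limit);
}

int
main(void)
{
    /* Figures lifted from the NOTICEs above. */
    printf("serial, limit 61: %.0f\n", sort_comparisons(1000000.0, 61.0));
    printf("worker, limit 61: %.0f\n", sort_comparisons(416667.0, 61.0));
    printf("worker, limit -1: %.0f\n", sort_comparisons(416667.0, -1.0));
    return 0;
}

With limit = -1 the per-worker estimate lands in the same ballpark as
a bounded serial sort of all 1,000,000 rows, which lines up with the
observation above that the two estimates came out almost identical;
passing the real bound down makes the partial path much cheaper.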

With the patch, it still passes make check, so if it introduces errors
they are subtle ones. If we can't actually do this and the limit needs
to stay -1, then I think we need a comment explaining why.

Cheers,

Jeff
