From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: enable_incremental_sort changes query behavior |
Date: | 2020-11-20 18:51:09 |
Message-ID: | CAAaqYe9DksvkytX2eV9N5H2dg1OYnGcD5dhRbPBeOTGr3oCmww@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Nov 20, 2020 at 12:06 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Oct 7, 2020 at 6:22 PM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > I'm still not entirely sure I understand what's happening, or what the
> > exact rule is. Consider this query:
> >
> > explain (verbose) select distinct i, t, md5(t) from ref_0;
> >
> > which on PG12 (i.e. before incremental sort) is planned like this:
> >
> > QUERY PLAN
> > ----------------------------------------------------------------------------------
> > Unique (cost=78120.92..83120.92 rows=500000 width=65)
> > Output: i, t, (md5(t))
> > -> Sort (cost=78120.92..79370.92 rows=500000 width=65)
> > Output: i, t, (md5(t))
> > Sort Key: ref_0.i, ref_0.t, (md5(ref_0.t))
> > -> Seq Scan on public.ref_0 (cost=0.00..10282.00 rows=500000 width=65)
> > Output: i, t, md5(t)
> > (7 rows)
> >
> > i.e. the (stable) function is pushed all the way to the scan node. And
> > even if we replace it with a volatile expression it gets pushed down:
> >
> > explain (verbose) select distinct i, t, md5(random()::text || t) from ref_0;
> >
> > QUERY PLAN
> > ----------------------------------------------------------------------------------
> > Unique (cost=83120.92..88120.92 rows=500000 width=65)
> > Output: i, t, (md5(((random())::text || t)))
> > -> Sort (cost=83120.92..84370.92 rows=500000 width=65)
> > Output: i, t, (md5(((random())::text || t)))
> > Sort Key: ref_0.i, ref_0.t, (md5(((random())::text || ref_0.t)))
> > -> Seq Scan on public.ref_0 (cost=0.00..15282.00 rows=500000 width=65)
> > Output: i, t, md5(((random())::text || t))
> > (7 rows)
> >
> >
> > But perhaps I just don't understand the assumption correctly?
>
> This isn't a counterexample, because there's no join tree here -- or,
> well, there is, but it's trivial, because there's only one relation
> involved. You can't have a non-Var expression computed before you
> finish all the joins, because there are no joins.
>
> What I said was: "target lists for any nodes below the top of the join
> tree were previously always just Var nodes." The topmost join allowed
> non-Var nodes before, but not lower levels.
As I understand what you're saying, the attached (from the repro case
in [1]'s discussion about parallel safety here) is a counterexample.
Specifically we have a plan like:
Merge Right Join
-> Unique
-> Gather Merge
-> Sort
-> Nested Loop
The pathtarget of the nested loop contains non-var expressions (in
this case a CASE expression).
Am I misunderstanding what you're saying?
I've attached verbose output (and the query).
James
Attachment | Content-Type | Size |
---|---|---|
explain_verbose.txt | text/plain | 5.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2020-11-20 19:37:22 | Re: xid wraparound danger due to INDEX_CLEANUP false |
Previous Message | Pavel Stehule | 2020-11-20 18:28:58 | Re: [Proposal] Global temporary tables |