From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: WIP: Upper planner pathification |
Date: | 2016-03-01 14:30:26 |
Message-ID: | 8783.1456842626@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> I tried to look into patch and I had a question (one for now): why LimitPath
> doesn't contain actual limit/offset value? I saw a lot of subqueries with LIMIT
> 1 which could be transformed into EXISTS subquery.
Oh, yeah, I intended to change that but didn't get to it yet. Consider
it done.
> Me too. I applied the patch and can confirm that 'make test' doesn't fail on
> FreeBSD 10.2. Now I will try to run kind of TPC-H with and without patch.
I do not think the patch will make a lot of performance difference as-is;
its value is more in what it will let us do later. There are a couple of
regression test cases that change plans for the better, but it's sort of
accidental. Those cases look like
select d.* from d left join (select * from b group by b.id, b.c_id) s
on d.a = s.id;
and what happens in HEAD is that the subquery chooses a hashagg plan
and then the upper query decides a mergejoin would be a good idea ...
so it has to sort the output of the hashagg. With the patch, what
comes back from the subquery is a Path for the hashagg and a Path
for doing the GROUP BY with Sort/Uniq. The second path is more expensive,
but it survives the add_path tournament because it can produce sorted
output. Then the outer level discovers that it can use that to do its
mergejoin without a separate sort step, and that way is cheaper overall.
So instead of
! -> Sort
! Sort Key: s.id
! -> Subquery Scan on s
! -> HashAggregate
! Group Key: b.id
! -> Seq Scan on b
we get
! -> Group
! Group Key: b.id
! -> Index Scan using b_pkey on b
which is noticeably cheaper, and not just because we got rid of the
Subquery Scan node. So that's nice --- but it's more or less accidental,
because the outer level isn't telling the inner level that this sort order
might be interesting.
Once this infrastructure is in place, I want to look at passing down more
information to recursive subquery_planner calls so that we're not leaving
this kind of optimization to chance. But the patch is big enough already,
so that (and a lot of other things) are getting left for later.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Julien Rouhaud | 2016-03-01 14:37:28 | Re: Publish autovacuum informations |
Previous Message | Craig Ringer | 2016-03-01 14:28:40 | Re: TAP / recovery-test fs-level backups, psql enhancements etc |