Re: Fixing Grittner's planner issues

From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Subject: Re: Fixing Grittner's planner issues
Date: 2009-02-19 21:46:47
Message-ID: 4136ffa0902191346g62081081v8607f0b92c206f0a@mail.gmail.com
Lists: pgsql-hackers
On Thu, Feb 19, 2009 at 7:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Thu, Feb 19, 2009 at 1:20 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> [ back to planner stuff after a hiatus ]
>
>> Well, as I wrote upthread:
>
> What you're actually suggesting is modifying the executor to incorporate
> the unique-fication logic into hashjoin and/or mergejoin.  Maybe, but
> that code is way too complex already for my taste (especially mergejoin)
> and what we'd save is, hmm, four lines in the planner.

I'm not entirely following the implications for semijoins, but I've
noticed more than a few cases where an option for the Hash node to
gather only unique values looks like it would be a win.

Consider cases like this, where we hash the inner values twice, once in
the HashAggregate and again in the Hash node above it:

postgres=# explain select * from generate_series(1,1000) as a(i) where
i in (select * from generate_series(1,100) as b(i));
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Hash Join  (cost=19.50..45.75 rows=1000 width=4)
   Hash Cond: (a.i = b.i)
   ->  Function Scan on generate_series a  (cost=0.00..12.50 rows=1000 width=4)
   ->  Hash  (cost=17.00..17.00 rows=200 width=4)
         ->  HashAggregate  (cost=15.00..17.00 rows=200 width=4)
               ->  Function Scan on generate_series b  (cost=0.00..12.50 rows=1000 width=4)
(6 rows)


It's tempting to have Hash cheat and just peek at the node beneath it
to see if it's a HashAggregate, in which case it could call a special
method to request the whole hash table. But it would have to know that
the HashAggregate is doing just a plain uniquify and not implementing a
GROUP BY.
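
To make the idea concrete, here is a rough sketch in Python rather than
executor C. It is not PostgreSQL code; build_unique_hash and semi_join
are names made up for illustration. It assumes the inner side only needs
a plain uniquify on the join key, so the de-duplication can happen while
the join's hash table is being built instead of in a separate pass:

```python
def build_unique_hash(inner_rows, key):
    """Build the join hash table in one pass, keeping one entry per key.

    Hypothetical stand-in for a Hash node that only gathers unique values.
    """
    table = {}
    for row in inner_rows:
        # setdefault keeps the first row seen for a key and ignores
        # later duplicates, so no separate uniquify step is needed.
        table.setdefault(key(row), row)
    return table


def semi_join(outer_rows, inner_rows, outer_key, inner_key):
    """Return the outer rows whose key appears on the inner side."""
    table = build_unique_hash(inner_rows, inner_key)
    return [row for row in outer_rows if outer_key(row) in table]


# Same shape as the query above: 1..1000 IN (1..100), with the inner
# side repeated to show that duplicates do not multiply outer rows.
outer = list(range(1, 1001))
inner = list(range(1, 101)) * 3
result = semi_join(outer, inner, lambda i: i, lambda i: i)
```

Each inner value is hashed once here, whereas the plan above hashes it
in the HashAggregate and then again in the Hash node.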

-- 
greg
