Re: why doesn't optimizer can pull up where a > ( ... )

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, xuncheng(at)google(dot)com
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: why doesn't optimizer can pull up where a > ( ... )
Date: 2019-11-21 00:30:51
Message-ID: CAKU4AWr9+_-FBY-xgmFhDOitjbb6yruO1TAMmBGgx2cNOqJZtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
> Hm. That actually raises the stakes a great deal, because if that's
> what you're expecting, it would require planning out both the transformed
> and untransformed versions of the query before you could make a cost
> comparison.

I don't know an official name, let's call it as "bloom filter push down
(BFPD)" for reference. this algorithm may be helpful on this case with
some extra effort.

First, Take . "select ... from t1, t2 where t1.a = t2.a and t1.b = 100"
for example, and assume t1 is scanned before t2 scanning, like hash
join/sort merge and take t1's result as inner table.

1. it first scan t1 with filter t1.b = 100;
2. during the above scan, it build a bloom filter *based on the join key
(t1.a) for the "selected" rows.*
3. during scan t2.a, it filters t2.a with the bloom filter.
4. probe the the hash table with the filtered rows from the above step.

Back to this problem, if we have a chance to get the p_brand we are
interested, we can use the same logic to only group by the p_brand.

Another option may be we just keep the N versions, and search them
differently and compare their cost at last.

> The Greenplum page mentions they also added "join-aggregates
reordering", in addition to subquery unnesting.
Thanks, I will search more about this.

>Having said that, the best form of criticism is a patch. If somebody
>actually wrote the code to do something like this, we could look at how
>much time it wasted in which unsuccessful cases and then have an
>informed discussion about whether it was worth adopting.
>

I would try to see how far I can get.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2019-11-21 01:05:14 Re: function calls optimization
Previous Message Tom Lane 2019-11-20 22:31:51 Re: BufFileRead() error signalling