Re: slow IN() clause for many cases

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: andrew(at)supernews(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: slow IN() clause for many cases
Date: 2005-11-29 22:07:06
Message-ID: 1133302026.2906.454.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2005-10-17 at 12:49 +0100, Simon Riggs wrote:
> On Fri, 2005-10-14 at 19:09 -0400, Tom Lane wrote:
> > I wrote:
> > > I'm thinking that IN should be
> > > converted to a ScalarArrayOpExpr, ie
> >
> > > x = ANY (ARRAY[val1,val2,val3,val4,...])
> >
> > Actually, there is one little thing in the way of doing this: it'll
> > fail if any of the IN-list elements are NULL, because we have not got
> > support for arrays with null elements. So we'd have to fix that first.
>
> You'd also need to consider how this effects partial indexes and
> constraint exclusion. Not much of an issue, but an extra case to handle
> in the predicate proving code.
>
> = = =
>
> Just had a case where using an IN list was quicker than using a join
> because it allowed an index lookup to occur. There is also some clear
> mileage in transforming this type of query to a more plannable form:
>
> select * from bigtable where word IN (
> select word from customer_word where customer = 6)
>
> i.e. where the values for the IN clause are evaluated at run time,
> rather than at plan time.

Do you think we'll be able to generate a single ScalarArrayOpExpr from a
small subselect and pass it through as an indexable expression?

I'm guessing its not lost on you that this would give a Star join like
capability, when joining multiple dimension tables to a large Fact
table.

e.g.

Select * From Sales where month IN (
select month from time_dimension where FinYear = 2005 and Quarter = 3)
...

Having taught predtest.c about ScalarArrayOpExpr means that would allow
this to work with constraint exclusion.

So that solves the how-to-join-AND-partition problem I've been
struggling with: don't join, transform. Very cool.

Best Regards, Simon Riggs

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2005-11-29 22:19:27 Re: ice-broker scan thread
Previous Message Tom Lane 2005-11-29 21:59:08 Re: Using multi-row technique with COPY