Re: Use unique index for longer pathkeys.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use unique index for longer pathkeys.
Date: 2014-07-27 05:50:42
Message-ID: CAA4eK1+gbteOuY7Y0DudHAfAK8D8fag7d8fSKvKNs7q++JCwLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 26, 2014 at 11:53 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> On Fri, Jul 25, 2014 at 12:48 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > > I think there is one more disadvantage in the way current patch is
> > > done which is that you need to collect index path keys for all
relations
> > > irrespective of whether they will be of any use to eliminate useless
> > > pathkeys from query_pathkeys. One trivial case that comes to mind is
> > > when there are multiple relations involved in query and ORDER BY is
> > > base on columns of only part of the tables involved in query.
> >
> > Like this?
> >
> > select x.a, x.b, y.b from x, y where x.a = y.a order by x.a, x.b;
> >
> > Equivalent class consists of (x.a=y.a) and (x.b), so index
> > pathkeys for i_y is (y.a.=x.a). As a result, no common primary
> > pathkeys found.
>
> I think it will find common pathkey incase you have an unique index
> on x.a (please see the example below), but currently I am not clear
> why there is a need for a common index path key in such a case to
> eliminate useless keys in ORDER BY, why can't we do it based
> on individual table's path key.
>
> Example:
>
> create table t (a int not null, b int not null, c int, d text);
> create unique index i_t_pkey on t(a, b);
> insert into t (select a % 10, a / 10, a, 't' from generate_series(0,
100000) a);
> analyze;
>
> create table t1 (a int not null, b int not null, c int, d text);
> create unique index i_t1_pkey_1 on t1(a);
> create unique index i_t1_pkey_2 on t1(a, b);
> insert into t1 (select a * 2, a / 10, a, 't' from generate_series(0,
100000) a);
> explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order
by t1.a,t1.b,t1.c,t1.d;
>
> QUERY PLAN
> ------------------------------------------
> Merge Join
> Merge Cond: (t.a = t1.a)
> -> Index Scan using i_t_pkey on t
> -> Index Scan using i_t1_pkey_1 on t1
> (4 rows)
>
> Here we can notice that there is no separate sort key in plan.
>
> Now drop the i_t1_pkey_1 and check the query plan again.
>
> drop index i_t1_pkey_1;
> explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order
by t1.a,t1.b,t1.c,t1.d;
> QUERY PLAN
> ------------------------------------------------
> Sort
> Sort Key: t.a, t1.b, t1.c, t1.d
> -> Merge Join
> Merge Cond: (t.a = t1.a)
> -> Index Scan using i_t_pkey on t
> -> Index Scan using i_t1_pkey_2 on t1
> (6 rows)
>
> Can't above plan eliminate Sort Key even after dropping index
> (i_t1_pkey_1)?

Here I have one additional thought which I would like to share with
you to see if this patch can be done in a simpler way. In function
standard_qp_callback(), can we directly trim the sortclause list based
on index information in PlannerInfo. We have access to target list in
this function to know exactly the relation/column information of
sortclause.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message MauMau 2014-07-27 05:57:44 Re: [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
Previous Message Pavel Stehule 2014-07-27 04:59:15 Re: PL/PgSQL: RAISE and the number of parameters