Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1

From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Luke Lonergan <llonergan(at)greenplum(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Anton <anton200(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1
Date: 2007-11-08 02:46:34
Message-ID: 4732788A.6050405@paradise.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Gregory Stark wrote:
> "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz> writes:
>
>
>> I spent today looking at getting this patch into a self contained state.
>> Working against HEAD I'm getting bogged down in the PathKeyItem to
>> PathKey/EquivalenceClass/EquivalenceMember(s) change. So I figured I'd divide
>> and conquer to some extent, and initially provide a patch:
>>
>> - against 8.2.(5)
>> - self contained (i.e no mystery functions)
>>
>
> That would be helpful for me. It would include the bits I'm looking for.
>
>
>> The next step would be to update to to HEAD. That would hopefully provide some
>> useful material for others working on this.
>>
>
> If that's not too much work then that would be great but if it's a lot of work
> then it may not be worth it if I'm planning to only take certain bits. On the
> other hand if it's good then we might just want to take it wholesale and then
> add to it.
>
>

Here is a (somewhat hurried) self-contained version of the patch under
discussion. It applies to 8.2.5 and the resultant code compiles and
runs. I've left in some unneeded parallel stuff (PathLocus struct),
which I can weed out in a subsequent version if desired. I also removed
the 'cdb ' from most of the function names and (I hope) any Greenplum
copyrights.

I discovered that the patch solves a slightly different problem... it
pulls up index scans as a viable path choice, (but not for the DESC
case) but does not push down the LIMIT to the child tables ... so the
actual performance improvement is zero - however hopefully the patch
provides useful raw material to help.

e.g - using the examine schema from the OP email - but removing the DESC
from the query:

part=# set enable_seqscan=off;
SET
part=# explain SELECT * FROM n_traf ORDER BY date_time LIMIT 1;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=198367.14..198367.15 rows=1 width=20)
-> Sort (cost=198367.14..200870.92 rows=1001510 width=20)
Sort Key: public.n_traf.date_time
-> Result (cost=0.00..57464.92 rows=1001510 width=20)
-> Append (cost=0.00..57464.92 rows=1001510 width=20)
-> Index Scan using n_traf_date_time_login_id on
n_traf (cost=0.00..66.90 rows=1510 width=20)
-> Index Scan using
n_traf_y2007m01_date_time_login_id on n_traf_y2007m01 n_traf
(cost=0.00..4748.38 rows=83043 width=20)
-> Index Scan using
n_traf_y2007m02_date_time_login_id on n_traf_y2007m02 n_traf
(cost=0.00..4772.60 rows=83274 width=20)
-> Index Scan using
n_traf_y2007m03_date_time_login_id on n_traf_y2007m03 n_traf
(cost=0.00..4782.12 rows=83330 width=20)
-> Index Scan using
n_traf_y2007m04_date_time_login_id on n_traf_y2007m04 n_traf
(cost=0.00..4818.29 rows=83609 width=20)
-> Index Scan using
n_traf_y2007m05_date_time_login_id on n_traf_y2007m05 n_traf
(cost=0.00..4721.85 rows=82830 width=20)
-> Index Scan using
n_traf_y2007m06_date_time_login_id on n_traf_y2007m06 n_traf
(cost=0.00..4766.56 rows=83357 width=20)
-> Index Scan using
n_traf_y2007m07_date_time_login_id on n_traf_y2007m07 n_traf
(cost=0.00..4800.44 rows=83548 width=20)
-> Index Scan using
n_traf_y2007m08_date_time_login_id on n_traf_y2007m08 n_traf
(cost=0.00..4787.55 rows=83248 width=20)
-> Index Scan using
n_traf_y2007m09_date_time_login_id on n_traf_y2007m09 n_traf
(cost=0.00..4830.67 rows=83389 width=20)
-> Index Scan using
n_traf_y2007m10_date_time_login_id on n_traf_y2007m10 n_traf
(cost=0.00..4795.78 rows=82993 width=20)
-> Index Scan using
n_traf_y2007m11_date_time_login_id on n_traf_y2007m11 n_traf
(cost=0.00..4754.26 rows=83351 width=20)
-> Index Scan using
n_traf_y2007m12_date_time_login_id on n_traf_y2007m12 n_traf
(cost=0.00..4819.51 rows=84028 width=20)
(18 rows)

Attachment Content-Type Size
index-order-by-wip.patch.gz application/x-gzip 11.0 KB

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Gregory Stark 2007-11-08 06:21:10 Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1
Previous Message Guillaume Smet 2007-11-07 23:36:16 Re: Estimation problem with a LIKE clause containing a /