Skip site navigation (1) Skip section navigation (2)

Path question

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Hans-Juergen Schoenig <hs(at)cybertec(dot)at>
Subject: Path question
Date: 2010-09-01 13:57:12
Message-ID: 4C7E5BB8.40502@cybertec.at (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi,

we are experimenting with modifying table partitioning
so the ORDER BY clause can be pushed down to
child nodes on the grounds that:
1. smaller datasets are faster to sort, e.g. two datasets that almost
    spill out to disk are faster to sort in memory and later merge them
    than the union set that spills out to disk, or two larger sets
    that spill out to disk are faster to sort individually than the
    union dataset (because of the longer seeks, etc)
2. individual child nodes can have indexes that produce
    the sorted output already

Currently I am abusing the AppendPath node but later I will add a new
executor node that will merge the sorted input coming from the children.

I added the pathkey to the AppendPath to reflect that it produces
sorted output and I am adding the Sort plan to the children.

My problem is:

zozo=# explain select * from t1 where d = '2008-01-01' order by d;
                                            QUERY
PLAN                                            
---------------------------------------------------------------------------------------------------
 Result  (cost=8.28..33.13 rows=4 width=40)
   ->  Append  (cost=8.28..33.13 rows=4 width=40)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_d_key on t1  (cost=0.00..8.27
rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2008_d_key on t1_2008 t1 
(cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2009_d_key on t1_2009 t1 
(cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2010_d_key on t1_2010 t1 
(cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)
(18 rows)

In one leaf, e.g.:

         ->  Sort  (cost=8.28..8.28 rows=1 width=40)
               Sort Key: public.t1.d
               ->  Index Scan using t1_2010_d_key on t1_2010 t1 
(cost=0.00..8.27 rows=1 width=40)
                     Index Cond: (d = '2008-01-01'::date)

The plan is scanning the t_2010 child table, but the Sort is trying to
sort by the fully qualified parent table. I think this is a problem
but I don't know how to solve it. I have tried transforming the
parent query with

     adjust_appendrel_attrs((Node *) parse, appinfo)

where parse is

     Query      *parse = root->parse;

in set_append_rel_pathlist() and the transformed query trees are
used for the children with

    make_sort_from_sortclauses(root, query->sortClause, subplan)

in create_append_plan(). adjust_appendrel_attrs() seems to be prepared
to be called with a Query * , so I don't know why the above leaf plan
doesn't show "Sort Key: public.t1_2010.d" and so on.

Can someone help me?

Best regards,
Zoltán Böszötményi


Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2010-09-01 14:10:21
Subject: Re: Path question
Previous:From: Simon RiggsDate: 2010-09-01 11:43:25
Subject: Re: Synchronous replication - patch status inquiry

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group