Re: Join optimization for inheritance tables

From: Nedyalko Borisov <nedyalko(at)asterdata(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Herodotos Herodotou" <Herodotos(dot)Herodotou(at)asterdata(dot)com>, Eric Friedman <Eric(dot)Friedman(at)asterdata(dot)com>, John Cieslewicz <John(dot)Cieslewicz(at)asterdata(dot)com>, Dheeraj Pandey <Dheeraj(dot)Pandey(at)asterdata(dot)com>
Subject: Re: Join optimization for inheritance tables
Date: 2009-07-06 21:23:36
Message-ID: 4A526B58.8070607@asterdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> Nedyalko Borisov <nedyalko(at)asterdata(dot)com> writes:
>
>> In summary, we are making two suggestions:
>> 1. Extend the optimizer to consider joins between child tables when hierarchies are joined together.
>>
>
> We already handle this for the case where the join is nestloop with
> inner index scan, and I'm not convinced that there's any real gain to be
> had for other join types.
>
>
From OLTP perspective this proposal won't introduce any benefits due to
the fact that queries operate on small parts of the data, so we can add
a flag that will disable/enable the inherited join.
However, the OLAP queries process significant amount of data and to
leverage this fact the DB admins partition the data. We think that the
optimizer should take advantage of this partitioning and consider plans
where the joins are performed on small parts of the data.

For example, typical observed scenario is: optimizer chooses Merge Join
for two partitioned tables like the plan below:
Merge Cond: (table1.id = table2.id)
-> Sort
Sort Key: id
-> Append
-> Seq Scan on table1_part1
-> Seq Scan on table1_part2
-> ....
-> Seq Scan on table1_partN
-> Sort
Sort Key: id
-> Append
-> Seq Scan on table2_part1
-> Seq Scan on table2_part2
-> ....
-> Seq Scan on table2_partM

This plan ignores the fact there are indexes on the table2 partitions
and that the pairwise partitions joins (index nested loop or hash join)
will be faster than scanning all the partitions and sorting them.

To see the effect of the pairwise joins we performed some experiments
with the initial implementation. The experiments consisted of joining
two partitioned tables where each of the tables have around 200 children
and the 2 int columns id and count. We generated data of different sizes
and measured the execution times and here are the results:
0.5 million records -> regular plan 0.69s -> modified plan 0.51
1 million records -> regular plan 2.9s -> modified plan 1
2.5 million records -> regular plan 4.17s -> modified plan 2.28
5 million records -> regular plan 11.25s -> modified plan 4.46

Increasing the data size or adding more columns will increase the
difference between the current plan that the database picks and the
proposed modification of the plans. Thus, we thing that it might be
useful if the optimizer considers plans with inherited joins.

>> 2. Add the "Empty Check Constraint", which would enforce that a particular table is to remain empty.
>>
>
> The trouble with that is that a constraint that doesn't propagate to its
> child tables is a weird beast that I'd just as soon not invent.
>
> We are currently thinking about inventing an explicit notion of
> partitioned tables. If we had that, it would be reasonable to have
> a special kind of "parent" table for a partitioned set and refuse to
> allow any data in that relation. But I'm not excited about contorting
> the general constraint mechanism in the way that would be necessary to
> express this as a constraint.
>
>
OK, implementing a special "abstract"/"parent" table would make more
sense. In this line of thoughts could you elaborate on the explicit
notion of partitioned tables or give us some references.

Thanks,
Nedyalko Borisov and Herodotos Herodotou

> regards, tom lane
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2009-07-06 22:14:13 Re: Join optimization for inheritance tables
Previous Message Kevin Grittner 2009-07-06 21:16:11 Re: TODO items: Alter view add column