Re: TB-sized databases

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: TB-sized databases
Date: 2008-03-18 05:48:35
Message-ID: 47DF57B3.9090206@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> I have _not_ added a TODO for this item. Let me know if one is needed.
>
> Please do, I think it's an open issue.
>
> * Consider Cartesian joins when both relations are needed to form an
> indexscan qualification for a third relation
>

Would another possible condition for considering
Cartesian joins be be:

* Consider Cartesian joins when a unique constraint can prove
that at most one row will be pulled from one of the tables
that would be part of this join?

In the couple cases where this happened to me it was
in queries on a textbook star schema like this:

select * from fact
join dim1 using (dim1_id)
join dim2 using (dim2_id)
where dim1.value = 'something'
and dim2.valuex = 'somethingelse'
and dim2.valuey = 'more';

and looking up all the IDs before hitting the huge
fact table. Often in these cases the where clause
on the dimension tables are on values with a unique
constraint.

If I understand right - if the constraint can prove
it'll return at most 1 row - that means the cartesian
join is provably safe from blowing up.

Not sure if that's redundant with the condition you
mentioned, or if it's yet a separate condition where
we might also want to consider cartesian joins.

Ron M

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message James Mansion 2008-03-18 07:08:58 Re: Benchmark: Dell/Perc 6, 8 disk RAID 10
Previous Message Peter Koczan 2008-03-18 03:26:04 Re: What is the best way to storage music files in Postgresql