Re: star schema and the optimizer

From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: star schema and the optimizer
Date: 2015-02-27 15:48:09
Message-ID: 54F091B9.9060602@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27/02/2015 15:27, Marc Cousin wrote:
> On 27/02/2015 15:08, Tom Lane wrote:
>> Marc Cousin <cousinmarc(at)gmail(dot)com> writes:
>>> So I gave a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the optimizer won't do cross joins, except if it has no choice.
>>
>> That's right, and as you say, the planning-speed consequences of doing
>> otherwise would be disastrous. However, all you need to help it find the
>> right plan is some dummy join condition between the dimension tables,
>> which will allow the join path you want to be considered. Perhaps you
>> could do something like
>>
>> SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17 and (dim1.a+dim2.a) is not null;
>
> No I can't. I cannot rewrite the query at all, in my context.
>
>
> What do you mean by disastrous ?
>
> I've given it a few tries here, and with 8 joins (same model, 7
> dimensions), planning time is around 100ms. At least in my context, it's
> well worth the planning time, to save minutes of execution.
>
> I perfectly understand that it's not something that should be "by
> default", that would be crazy. But in a datawarehouse, it seems to me
> that accepting one, or even a few seconds of planning time to save
> minutes of execution is perfectly legetimate.
>

I have given it a bit more thought. Could it be possible, to mitigate
this, to permit only a few (few being to define) cross joins ? Still
optional, of course, it still has an important cost. Only allowing cross
joins for the first 3 levels, and keeping this to left-right sided
joins, I can plan up to 11 joins on my small test machine in 500ms
(instead of 150ms with the unpatched one), and get a "good plan", good
meaning 100 times faster.

Regards

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-02-27 15:49:31 Re: star schema and the optimizer
Previous Message Jan Urbański 2015-02-27 15:29:21 Re: GSoC idea - Simulated annealing to search for query plans