Re: star schema and the optimizer

From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: star schema and the optimizer
Date: 2015-02-27 14:27:10
Message-ID: 54F07EBE.5030508@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27/02/2015 15:08, Tom Lane wrote:
> Marc Cousin <cousinmarc(at)gmail(dot)com> writes:
>> So I gave a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the optimizer won't do cross joins, except if it has no choice.
>
> That's right, and as you say, the planning-speed consequences of doing
> otherwise would be disastrous. However, all you need to help it find the
> right plan is some dummy join condition between the dimension tables,
> which will allow the join path you want to be considered. Perhaps you
> could do something like
>
> SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17 and (dim1.a+dim2.a) is not null;

No I can't. I cannot rewrite the query at all, in my context.

What do you mean by disastrous ?

I've given it a few tries here, and with 8 joins (same model, 7
dimensions), planning time is around 100ms. At least in my context, it's
well worth the planning time, to save minutes of execution.

I perfectly understand that it's not something that should be "by
default", that would be crazy. But in a datawarehouse, it seems to me
that accepting one, or even a few seconds of planning time to save
minutes of execution is perfectly legetimate.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gilles Darold 2015-02-27 15:29:11 Re: Bug in pg_dump
Previous Message Anastasia Lubennikova 2015-02-27 14:19:47 Re: Index-only scans for GiST.