Quick Links

Re: Should Oracle outperform PostgreSQL on a complex

From:	Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Pål Stenslet <paal(dot)stenslet(at)exie(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Should Oracle outperform PostgreSQL on a complex
Date:	2005-12-18 22:13:55
Message-ID:	43A5DF23.5010504@paradise.net.nz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Simon Riggs wrote:
> On Sat, 2005-12-17 at 13:13 -0500, Tom Lane wrote:
>
>>Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>>
>>>On Fri, 2005-12-16 at 23:28 -0500, Bruce Momjian wrote:
>>>
>>>>How are star joins different from what we do now?
>>
>>>Methods:
>>>1. join all N small tables together in a cartesian product, then join to
>>>main Large table once (rather than N times)
>>
>>Of course, the reason the current planner does not think of this is that
>>it does not consider clauseless joins unless there is no alternative.
>
>
> Understood
>
>
>> The plan you currently get for
>>this sort of scenario is typically a nest of hash joins:
>>
>> QUERY PLAN
>>------------------------------------------------------------------------
>> Hash Join (cost=2.25..4652.25 rows=102400 width=16)
>> Hash Cond: ("outer".f1 = "inner".f1)
>> -> Hash Join (cost=1.12..3115.12 rows=102400 width=12)
>> Hash Cond: ("outer".f2 = "inner".f1)
>> -> Seq Scan on fact (cost=0.00..1578.00 rows=102400 width=8)
>> -> Hash (cost=1.10..1.10 rows=10 width=4)
>> -> Seq Scan on d2 (cost=0.00..1.10 rows=10 width=4)
>> -> Hash (cost=1.10..1.10 rows=10 width=4)
>> -> Seq Scan on d1 (cost=0.00..1.10 rows=10 width=4)
>>(9 rows)
>>
>>This involves only one scan of the fact table. As each row is pulled up
>>through the nest of hash joins, we hash one dimension key and join to
>>one small table at each level.
>
>
> Understood
>
>
>>This is at worst the same amount of work
>>as hashing all the keys at once and probing a single cartesian-product
>>hashtable, probably less work (fewer wasted key-comparisons). And
>>definitely less memory. You do have to keep your eye on the ball that
>>you don't waste a lot of overhead propagating the row up through
>>multiple join levels, but we got rid of most of the problem there in
>>8.1 via the introduction of "virtual tuple slots". If this isn't fast
>>enough yet, it'd make more sense to invest effort in further cutting the
>>executor's join overhead (which of course benefits *all* plan types)
>>than in trying to make the planner choose a star join.
>
>
> That join type is used when an index-organised table is available, so
> that a SeqScan of the larger table can be avoided.
>
> I'd say the plan would make sense if the columns of the cartesian
> product match a multi-column index on the larger table that would not
> ever be used unless sufficient columns are restricted in each lookup.
> That way you are able to avoid the SeqScan that occurs for the multiple
> nested Hash Join case. (Clearly, normal selectivity rules apply on the
> use of the index in this way).
>
> So I think that plan type still can be effective in some circumstances.
> Mind you: building an N-way index on a large table isn't such a good
> idea, unless you can partition the tables and still use a join. Which is
> why I've not centred on this case as being important before now.
>
> My understanding: Teradata and DB2 use this.
>

FWIW - I think DB2 uses the successive fact RID buildup (i.e method 2),
unfortunately I haven't got a working copy of DB2 in front of me to test.

Cheers

Mark

In response to

Re: Should Oracle outperform PostgreSQL on a complex at 2005-12-18 13:48:52 from Simon Riggs

Responses

Re: Should Oracle outperform PostgreSQL on a complex at 2005-12-18 22:28:37 from Simon Riggs

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Simon Riggs	2005-12-18 22:21:04	Re: Should Oracle outperform PostgreSQL on a complex
Previous Message	Mark Kirkwood	2005-12-18 22:10:01	Re: Should Oracle outperform PostgreSQL on a complex