Re: Data warehousing requirements

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: josh(at)agliodbs(dot)com
Cc: pgsql-performance(at)postgresql(dot)org, Gabriele Bartolini <angusgb(at)tin(dot)it>, "Aaron Werman" <awerman2(at)hotmail(dot)com>
Subject: Re: Data warehousing requirements
Date: 2004-10-08 02:43:33
Message-ID: 27539.1097203413@sss.pgh.pa.us
Lists: pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> For one thing, this is false optimization; a NULL isn't saving you any table
> size on an INT or BIGINT column. NULLs are only smaller on variable-width
> columns.

Uh ... not true. A null column is not stored at all, whether it is
fixed-width or variable-width. Now if you had a row that otherwise had
no nulls, the first null in the row will cause a null-columns bitmap to
be added, which might more than eat up the savings from not storing a
single int or bigint. But after the first null, each additional null in
a row is a win, free and clear, whether it's fixed-width or not.

(There are also some alignment considerations that might cause the
savings to vanish.)
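
To put rough numbers on the bitmap and alignment effects, here is a
back-of-the-envelope sketch in Python. It is not how the server computes
anything. The constants are assumptions: a 23-byte fixed tuple header
and 8-byte maximum alignment, as on recent 64-bit builds (header sizes
have differed between releases), one bitmap bit per column, and each
fixed-width column aligned to its own width. The names row_bytes and
align_up are made up for illustration.

```python
HEADER_BYTES = 23  # assumed fixed tuple header size; varies by release
MAXALIGN = 8       # assumed maximum alignment on a typical 64-bit build


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def row_bytes(column_widths, null_flags):
    """Estimate the aligned size of one row of fixed-width columns.

    column_widths: width in bytes per column (4 for int, 8 for bigint);
                   each column is assumed to align to its own width.
    null_flags:    True where that column is NULL in this row.
    """
    # The null bitmap is present only if the row has at least one null.
    bitmap = (len(column_widths) + 7) // 8 if any(null_flags) else 0
    # Column data starts at the next max-aligned offset after header + bitmap.
    offset = align_up(HEADER_BYTES + bitmap, MAXALIGN)
    for width, is_null in zip(column_widths, null_flags):
        if is_null:
            continue  # a null column occupies no data space at all
        offset = align_up(offset, width) + width
    # Rows are laid out on max-aligned boundaries, so round the total up.
    return align_up(offset, MAXALIGN)


nine_bigints = [8] * 9
no_nulls = row_bytes(nine_bigints, [False] * 9)
one_null = row_bytes(nine_bigints, [True] + [False] * 8)
two_nulls = row_bytes(nine_bigints, [True, True] + [False] * 7)
print(no_nulls, one_null, two_nulls)
```

Under these assumptions the first null in a nine-bigint row saves
nothing, because the bitmap pushes the header from 24 to 32 bytes, while
the second null saves a full 8 bytes. With two int columns, nulling one
of them saves nothing either, since the row is padded back up to the
same aligned size.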

> More importantly, you should never, ever allow null FKs on a star-topology
> database. LEFT OUTER JOINs are vastly less efficient than INNER JOINs in a
> query, and the difference between having 20 outer joins for your data view,
> vs 20 regular joins, can easily be a difference of 100x in execution time.

It's not so much that they are necessarily inefficient as that they
constrain the planner's freedom of action. You need to think a lot more
carefully about the order of joining than when you use inner joins.
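
A toy illustration of why the join order is constrained, sketched in
Python with in-memory lists rather than anything the planner does. The
tables a, b, c and the helpers inner_join and left_join are hypothetical.
Inner joins give the same rows however they are nested, but moving an
inner join across a left join can change the result, so the planner
cannot reorder freely.

```python
def inner_join(left, right, pred):
    """Nested-loop inner join over lists of tuples."""
    return [l + r for l in left for r in right if pred(l, r)]


def left_join(left, right, pred, right_width):
    """Nested-loop left outer join; unmatched left rows are null-extended."""
    out = []
    for l in left:
        matches = [l + r for r in right if pred(l, r)]
        out.extend(matches or [l + (None,) * right_width])
    return out


# Hypothetical single-column tables: a fact table and two dimensions.
a = [(1,), (2,)]
b = [(1,), (2,)]
c = [(1,)]


def same_id(l, r):
    """Match the first column of the left row to the first of the right."""
    return l[0] == r[0]


# All-inner joins: both nestings give the same rows.
inner_first = inner_join(inner_join(a, b, same_id), c,
                         lambda l, r: l[1] == r[0])
inner_second = inner_join(a, inner_join(b, c, same_id), same_id)

# (a LEFT JOIN b) JOIN c ON b.id = c.id
outer_first = inner_join(left_join(a, b, same_id, 1), c,
                         lambda l, r: l[1] == r[0])
# a LEFT JOIN (b JOIN c ON b.id = c.id) ON a.id = b.id
outer_second = left_join(a, inner_join(b, c, same_id), same_id, 2)

print(inner_first == inner_second)  # nesting does not matter
print(outer_first, outer_second)    # nesting changes the result
```

In the first outer-join nesting the row for a.id = 2 is dropped by the
later inner join; in the second it survives, null-extended. Only one of
the two matches what the query author wrote, which is why the join order
has to be chosen with more care once outer joins are involved.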

regards, tom lane
