Re: column ordering, was Re: [PATCHES] Enums patch v2

From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 21:23:49
Message-ID: 4589A9E5.40509@pws.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
> Russell Smith wrote:
>> Tom Lane wrote:
>>> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>>>
>>>> Force references to go through macros which implement the lookup
>>>> for the
>>>> appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
>>>> PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.
>>>>
>>>
>>> It doesn't really address the question of how you know which one to
>>> use at any particular line of code; or even more to the point, what
>>> mechanism will warn you if you use the wrong one.
>>>
>>> My gut feeling about this is that we could probably enforce such a
>>> distinction if we were using C++, but while coding in C I have no
>>> confidence in it. (And no, that's not a vote to move to C++ ...)
>>>
>> What about a comprimise...
>>
>> The 8.1 documentation for ALTER TABLE states the following.
>>
>> Adding a column with a non-null default or changing the type of an
>> existing column will require the entire table to be rewritten. This
>> may take a significant amount of time for a large table; and it will
>> temporarily require double the disk space.
>>
>>
>> Now, we are rewriting the table from scratch anyway, the on disk
>> format is changing. What is stopping us from switching the column
>> order at the same time. The only thing I can think is that the
>> catalogs will need more work to update them. It's a middle sized
>> price to pay for being able to reorder the columns in the table. One
>> of the problems I have is wanting to add a column in the middle of
>> the table, but FK constraints stop me dropping the table to do the
>> reorder. If ALTER TABLE would let me stick it in the middle and
>> rewrite the table on disk, I wouldn't care. It's likely that I would
>> be rewriting the table anyway. And by specifying AT POSITION, or
>> BEFORE/AFTER you know for big tables it's going to take a while.
>>
>
> This isn't really a compromise. Remember that this discussion started
> with consideration of optimal record layout (minimising space use by
> reducing or eliminating alignment padding). The above proposal really
> does nothing for that.
>
> cheers
>
> andrew
>
>
This is partly true. If you have the ability to rewrite the table and
put columns in a specific order you can "manually" minimize the
alignment padding. However that will probably produce a table that is
not in the logical order you would like. I still see plenty of use case
for both my initial case as the alignment padding case, even without
logical layout being different to disk layout.

Also there has been a large about of discussion on performance relating
to having firm numbers for proposals for different compiler options. Do
anybody have tested numbers, and known information about where/how you
can eliminate padding by column ordering? Tom suggests in this thread
that lots of types have padding issues, so how much is it really going
to buy us space wise if we re-order the table in optimal format. What
is the optimal ordering to reduce disk usage?

Russell.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2006-12-20 21:52:35 Re: Release 8.2.0 done, 8.3 development starts
Previous Message Zdenek Kotala 2006-12-20 21:05:50 Re: Question about debugging bootstrapping and catalog

Browse pgsql-patches by date

  From Date Subject
Next Message Roman Kononov 2006-12-20 23:05:14 BUG #2846: inconsistent and confusing handling of underflows, NaNs and INFs
Previous Message Inaam Rana 2006-12-20 20:29:30 Re: Load distributed checkpoint