Re: logical column order and physical column order

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical column order and physical column order
Date: 2013-11-05 07:07:04
Message-ID: CAApHDvo1KyFpneFXXQEF94GMVAd=Z07A0Srh9T8nczMoOqx7UA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 4, 2013 at 3:14 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>wrote:

> David Rowley escribió:
> > I've just been looking at how alignment of columns in tuples can make the
> > tuple larger than needed.
>
> This has been discussed at length previously, and there was a design
> proposed to solve this problem. See these past discussions:
>
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php
> http://archives.postgresql.org/pgsql-hackers/2007-02/msg01235.php
> http://archives.postgresql.org/pgsql-hackers/2008-11/msg01680.php
>
> I started work on this, and managed to get parts of it to work. While
> doing so I realized that it was quite a lot more hideous than I had
> originally expected. I published a tree at github:
> https://github.com/alvherre/postgres/tree/column

Thanks for the archive links... I read these last night and pulled out some
key pieces of information from some of the posts.

I should say that I've not dived into the code too much to see how hard it
would be, but my, perhaps naive original idea would have just be to add 1
column to pg_attribute to store the logical order and have attnum store the
physical order... This would have meant that at least only the following
places would have to take into account the change.

1. pg_dump needs to display columns in logical order both for create tables
and copy/insert statements.
2. INSERT INTO table values( ... ) (i.e without column names) needs to look
at logical order.
3. create table like <table>
4. backup and restore using copy
5. select * expand to column names

And of lesser importance as I'd assume it would just be a change in an
ORDER BY clause in their queries of pg_attribute

1. Display in clients... psql Pg Admin

I thought the above would have been doable and I did wonder what all the
fuss was about relating to bugs in the code where it could use the logical
number instead of attnum.

On reading of the posts last night I can see that the idea was to add not 1
but 2 new fields to pg_attribute. One was for the physical order and one
for the logical order and at first I didn't really understand as I thought
attnum would always be the physical order. I didn't really know before this
that attnum was static... I did some tests were I dropped one of the middle
columns out of a table and then rewrote the table with cluster and I see
that the pg_attribute record is kept even though the remains of the column
values have been wiped out by the rewrite... Is this done because things
like indexes, foreign keys and sequences reference the {attrelid,attnum} ?
if so then I see why the 2 extra columns are needed and I guess that's
where the extra complications come from.

So now I'm wondering, with my freshly clustered table which I dropped one
of the middle columns from before the cluster... my pg_attributes look
something like:

relname | attname | attnum
---------+------------------------------+--------
dropcol | tableoid | -7
dropcol | cmax | -6
dropcol | xmax | -5
dropcol | cmin | -4
dropcol | xmin | -3
dropcol | ctid | -1
dropcol | one | 1
dropcol | ........pg.dropped.2........ | 2
dropcol | three | 3

and I would imagine since the table has just been clustered that the
columns are stored like {..., ctid, one,three}
In this case how does Postgresql know that attnum 3 is the 2nd user column
in that table? Unless I have misunderstood something then there must be
some logic in there to skip dropped columns and if so then could it not
just grab the "attphynum" at that location? then just modify the 1-5 places
listed above to sort on attlognum?

Regards

David Rowley

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2013-11-05 07:45:57 Re: Fast insertion indexes: why no developments
Previous Message Vik Fearing 2013-11-05 06:20:05 Re: WITHIN GROUP patch