Re: logical column ordering

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: logical column ordering
Date: 2015-02-23 23:09:06
Message-ID: 54EBB312.7090000@2ndquadrant.com
Lists: pgsql-hackers

Hi,

attached is the result of my first attempt to make the logical column
ordering patch work. This touches a lot of code in the executor that is
mostly new to me, so if you see something that looks like an obvious
bug, it probably is (so let me know).

improvements
------------
The main improvements of this version are that:

* initdb actually works (while before it was crashing)

* regression tests work, with two exceptions

(a) 'subselect' fails because EXPLAIN prints columns in physical order
(but we expect logical)

(b) col_order crashes because of a tuple descriptor mismatch in a
function call (this actually causes a segfault)

The main change in this patch is that tlist_matches_tupdesc() now checks
target list vs. physical attribute order, which may result in doing a
projection (in cases when that would not be done previously).
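
To illustrate why the check can now force a projection, here is a toy
Python model. It is not the C code from the patch: the Column type, its
physnum/lognum fields and all function names are hypothetical, and it
only compares column names (no type or dropped-column checks). The idea
is that a target list expanded in logical order no longer matches the
stored tuple whenever the logical and physical orders differ.

```python
from collections import namedtuple

# Hypothetical model of a column: its name, its position in the stored
# tuple (physnum) and its position in "SELECT *" output (lognum).
Column = namedtuple("Column", ["name", "physnum", "lognum"])


def physical_order(columns):
    """Column names in the order they are stored in the tuple."""
    return [c.name for c in sorted(columns, key=lambda c: c.physnum)]


def logical_order(columns):
    """Column names in the order the user expects to see them."""
    return [c.name for c in sorted(columns, key=lambda c: c.lognum)]


def needs_projection(target_list, columns):
    """True when the target list is not exactly the stored column order."""
    return list(target_list) != physical_order(columns)


def project(stored_tuple, target_list, columns):
    """Rearrange a stored tuple into target-list order."""
    by_name = dict(zip(physical_order(columns), stored_tuple))
    return tuple(by_name[name] for name in target_list)


# Column "a" is stored first but displayed second, "b" the other way round.
cols = [Column("a", physnum=1, lognum=2), Column("b", physnum=2, lognum=1)]
tlist = logical_order(cols)          # what "SELECT *" would ask for
if needs_projection(tlist, cols):
    row = project((10, 20), tlist, cols)
```

In this model a table whose logical and physical orders are identical
never needs the extra projection, which corresponds to the behaviour
before the patch.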

I do not claim this is the best approach - maybe it would be better to
keep the physical tuple and reorder it lazily. That's why I kept a few
pieces of code (fix_physno_mutator) and a few unused fields in Var.

Over time I've heard various use cases for this patch, but in most
cases it was quite speculative. If you have an idea where this might be
useful, can you explain it here, or maybe point me to a place where it's
described?

There are also a few FIXMEs, mostly from Alvaro's version of the patch.
Some of them are probably obsolete, but I wasn't 100% sure about that, so
I've left them in place until I understand the code sufficiently.

randomized testing
------------------
I've also attached a python script for simple randomized testing. Just
execute it like this:

$ python randomize-attlognum.py -t test_1 test_2 \
--init-script attlognum-init.sql \
--test-script attlognum-test.sql

and it will repeat these steps over and over:

1. dropdb test
2. createdb test
3. run the init script
4. randomly set attlognums for the tables (test_1 and test_2)
5. run the test script

It does not actually check the result, but my experience is that when
there's a bug in handling the descriptor, it results in a segfault pretty
fast (just put some varlena columns into the table).
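
The "randomly set attlognums" step can be sketched roughly as below.
This is not the attached script, just a minimal illustration: the
function names are made up, and it assumes that attlognum is a column of
pg_attribute that can be updated directly by a superuser, which may not
match what the patch or the script actually do.

```python
import random


def random_attlognum_assignment(columns, rng=None):
    """Map each column name to a random logical position (1-based)."""
    rng = rng or random.Random()
    positions = list(range(1, len(columns) + 1))
    rng.shuffle(positions)  # random permutation of 1..n
    return dict(zip(columns, positions))


def build_update_statements(table, assignment):
    """Build one UPDATE per column, ordered by the new logical position.

    ASSUMPTION: attlognum is a directly updatable pg_attribute column.
    Table and column names are interpolated without quoting, so this is
    only suitable for trusted test identifiers.
    """
    statements = []
    for column, lognum in sorted(assignment.items(), key=lambda kv: kv[1]):
        statements.append(
            "UPDATE pg_attribute SET attlognum = {} "
            "WHERE attrelid = '{}'::regclass AND attname = '{}';".format(
                lognum, table, column
            )
        )
    return statements


# Example with a hypothetical three-column table; a fixed seed makes the
# run repeatable, which helps when a particular ordering triggers a crash.
assignment = random_attlognum_assignment(["id", "val", "note"],
                                         random.Random(42))
statements = build_update_statements("test_1", assignment)
```

Printing the seed (or the generated statements) on each iteration would
make it easier to reproduce an ordering that led to a segfault.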

plans / future
--------------
After discussing this with Alvaro, we've both agreed that this is far
too high-risk a change to commit in the very last CF (even if it were in
better shape). So while it's added to the 2015-02 CF, we're aiming for 9.6
if things go well.

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment                     Content-Type      Size
logical-column-ordering.patch  text/x-diff       120.1 KB
randomize-attlognum.py         text/x-python     3.7 KB
attlognum-init.sql             application/sql   712 bytes
attlognum-test.sql             application/sql   326 bytes
