Re: logical column ordering

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: logical column ordering
Date: 2015-03-12 13:57:04
Message-ID: 20150312135704.GU3291@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra wrote:
> On 12.3.2015 14:17, Alvaro Herrera wrote:
> > Tomas Vondra wrote:
> >> On 12.3.2015 03:16, Tom Lane wrote:
> >
> >>> I agree though that it's worth considering defining
> >>> pg_attribute.attnum as the logical column position so as to minimize
> >>> the effects on client-side code. I doubt there is much stuff
> >>> client-side that cares about column creation order, but there is
> >>> plenty that cares about logical column order. OTOH this would
> >>> introduce confusion into the backend code, since Alvaro's definition
> >>> of attnum is what most of the backend should care about.
> >>
> >> IMHO reusing attnum for logical column order would actually make it more
> >> complex, especially if we allow users to modify the logical order using
> >> ALTER TABLE. Because if you change it, you have to walk through all the
> >> places where it might be referenced and update those too (say, columns
> >> referenced in indexes and such). Keeping attnum immutable makes this
> >> much easier and simpler.
> >
> > I think you're misunderstanding. The suggestion, as I understand it,
> > is to rename the attnum column to something else (maybe, say,
> > attidnum), and rename attlognum to attnum. That preserves the
> > existing property that "ORDER BY attnum" gives you the correct view
> > of the table from the point of view of the user. That's very useful
> > because it means clients looking at pg_attribute need less changes,
> > or maybe none at all.
>
> Hmm ... I understood it as a suggestion to drop attlognum and just
> define (attnum, attphysnum).

Pretty sure it wasn't that.

> > I think this wouldn't be too difficult to implement, because there
> > aren't that many places that refer to the column-identity attribute
> > by name; most of them just grab the TupleDesc->attrs array in
> > whatever order is appropriate and scan that in a loop. Only a few of
> > these use att->attnum inside the loop --- that's what would need to
> > be changed, and it should be pretty mechanical.
>
> I think it's way more complicated. We may fix all the pieces of the
> code, but that's not all - attnum is referenced in various system views,
> catalogs and such. For example pg_stats view does this:
>
> FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid)
> JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum)
> LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
> WHERE NOT attisdropped
> AND has_column_privilege(c.oid, a.attnum, 'select');
>
> information_schema also uses attnum on many places too.

Those can be fixed with relative ease to refer to attidnum instead.

> I see the catalogs as a kind of public API, and redefining the meaning
> of an existing column this way seems tricky, especially when we
> reference it from other catalogs - I'm pretty sure there's plenty of SQL
> queries in various tools that rely on this.

That's true, but then we've never promised that system catalogs remain
unchanged forever. That would essentially stop development.

However, there's a difference between making a query silently given
different results, and breaking it completely forcing the user to
re-study how to write it. I think the latter is better. In that light
we should just drop attnum as a column name, and use something else:
maybe (attidnum, attlognum, attphysnum). So all queries in the wild
would be forced to be updated, but we would not silently change
semantics instead.

> Which actually breaks the catalog definition as specified here:
>
> http://www.postgresql.org/docs/devel/static/catalog-pg-index.html
>
> which explicitly says that indkey references pg_attribute.attnum.

That's a simple doc fix.

> But maybe we don't really care about breaking this API and it is a good
> approach - I need to think about it and try it.

Yeah, thanks.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-03-12 13:57:29 Re: shebang for tcl postgresql modules
Previous Message Michael Paquier 2015-03-12 13:54:19 Re: Install shared libs in lib/ and bin/ with MSVC (Was: install libpq.dll in bin directory on Windows / Cygwin)