Re: type info refactoring

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: type info refactoring
Date: 2010-10-31 18:30:33
Message-ID: AANLkTi=mK3RzEvUutKjMniH4_GNw3GdKzrTu3V+eQVM9@mail.gmail.com
Lists: pgsql-hackers

On Sun, Oct 31, 2010 at 1:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> ... I assumed that TypeInfo would be
>> embedded in other structs directly, rather than a pointer and palloc.
>
> Yeah, that would avoid the extra-pallocs complaint, although it might be
> notationally a bit of a PITA in places like equalfuncs.c.  I think that
> would end up needing a separate COMPARE_TYPEINFO_FIELD macro instead of
> being able to treat it like a Node* field.
>
> But I'm still wondering whether it's smart to try to promote all of this
> fundamentally-auxiliary information to first-class status.  It's really
> unclear to me that that will end up being a net win either conceptually
> or notationally.

I think this is a chicken-and-egg problem. Most of the things we use
typmod for are unimportant, because typmod doesn't get propagated
everywhere and therefore if you try to use it for anything that
actually matters, it'll break. And on the flip side, there's no need
for typmod to get propagated everywhere, because it's not used for
anything all that important. Blah!

It's true that if the ostensible maximum length of a string or the
precision of a numeric gets lost somewhere on its path through the
system, probably nothing terribly awful will happen. The worst case
is that those values won't be enforced someplace where the user might
expect it, and that's probably avoidable in most practical cases by
adding an appropriate cast. I'm not sure whether it'll also be true
for collation, because that affects comparison semantics, and getting
the wrong comparison semantics is worse than failing to enforce a
maximum length.
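
To make that concrete, here is a minimal sketch (not PostgreSQL code; both
comparator names are made up for illustration, and ASCII is assumed) in
which the same pair of strings orders differently depending on which rule
is applied. If the collation is dropped somewhere and the code falls back
to a default, the answer changes without any error being raised:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for a byte-order ("C"-like) collation. */
int
cmp_bytewise(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Hypothetical stand-in for a case-insensitive collation. */
int
cmp_case_insensitive(const char *a, const char *b)
{
    for (;; a++, b++)
    {
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);

        if (ca != cb)
            return (ca < cb) ? -1 : 1;
        if (ca == 0)
            return 0;       /* both strings ended together */
    }
}
```

Under the first rule "Zebra" sorts before "apple"; under the second it
sorts after. A sort or index built with one rule and probed with the other
would give wrong answers, which is a different class of problem from a
length limit going unenforced.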

And we keep having these pesky requests to embed more complex
information in the typmod, some of which are things that can't just be
lightly thrown away because we feel like it. One of the more common
ones is "an OID", so you can have things like a range over a
designated base type, or a map from one base type to another base
type, or whatever. Right now the on-disk representation of an array
includes a 4-byte OID to store the type of the elements in that array.
That's almost pure evil. Data in the database should not need to be
self-identifying: imagine what our performance would look like if
every integer datum in the database had to contain a tag identifying
it as an integer. Granting that we have no immediate ability to
change it, we should be thinking about what sort of infrastructure
would be needed to eliminate this type of kludgery, or at least make
it unnecessary for new types.
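
As a rough illustration of the cost of self-identifying data, here is a
sketch of the hypothetical "tagged integer" case. The struct and function
names are invented for this example and do not correspond to anything in
the tree:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical self-identifying integer: every value repeats its type OID. */
typedef struct TaggedInt4
{
    uint32_t type_oid;      /* tag stored again in every single datum */
    int32_t  value;         /* the actual data */
} TaggedInt4;

/* Bytes needed to store nvalues integers with the per-datum tag. */
size_t
tagged_bytes(size_t nvalues)
{
    return nvalues * sizeof(TaggedInt4);
}

/* Bytes needed when the type is known from context instead. */
size_t
untagged_bytes(size_t nvalues)
{
    return nvalues * sizeof(int32_t);
}
```

On common platforms the tagged form takes twice the space. An array pays
its element-type tag once per array value rather than once per element, so
the overhead there is smaller, but it is the same kind of cost.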

Long story short, I'm inclined to view any data structure that is
carrying only the type OID with great suspicion. If the additional
information isn't needed today, it may well be tomorrow.
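
For the sake of discussion, here is a sketch of what the embedded variant
mentioned upthread might look like. Everything in it is an assumption: the
field names, the ExampleExpr node, and the body of the macro are invented
for illustration and are not taken from any patch. The macro only follows
the general shape of the existing COMPARE_* macros in equalfuncs.c:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;       /* stand-in for PostgreSQL's Oid */

/* Hypothetical bundle of type information; field names are assumptions. */
typedef struct TypeInfo
{
    Oid     typid;          /* type OID */
    int32_t typmod;         /* type modifier, -1 if none */
    Oid     collid;         /* collation, 0 if none */
} TypeInfo;

/* Hypothetical node embedding TypeInfo by value, so no extra palloc. */
typedef struct ExampleExpr
{
    int      paramid;
    TypeInfo typeinfo;
} ExampleExpr;

bool
typeinfo_equal(const TypeInfo *a, const TypeInfo *b)
{
    return a->typid == b->typid &&
           a->typmod == b->typmod &&
           a->collid == b->collid;
}

/* Compare an embedded TypeInfo field; expects variables named a and b. */
#define COMPARE_TYPEINFO_FIELD(fldname) \
    do { \
        if (!typeinfo_equal(&a->fldname, &b->fldname)) \
            return false; \
    } while (0)

bool
example_expr_equal(const ExampleExpr *a, const ExampleExpr *b)
{
    if (a->paramid != b->paramid)
        return false;
    COMPARE_TYPEINFO_FIELD(typeinfo);
    return true;
}
```

The point of bundling is that code copying or comparing the node cannot
carry the OID along while forgetting the typmod or collation; they travel
together or not at all.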

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
