Re: Custom Data Type Question

From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Mitchell <gmitchell(at)atdesk(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Custom Data Type Question
Date: 2006-11-21 02:51:40
Message-ID: 456269BC.2030009@tomd.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Simon Riggs wrote:
> I'd also love any suggestions as to how we might be able to use a
> similar local-data-cacheing mechanism to work when we specify SQL
> standard FOREIGN KEYs. It would be really cool to say USING LOCAL CACHE
> or some way of avoiding the overhead of all those stored after triggers
> and SPI SELECT statements when we've got checks against tables with only
> a few rows where the values hardly ever change.

Uh, sounds like an enum is a perfect fit. :) This is certainly one of
the use-cases that I've encountered when I wished that I had had an enum
type to use.

> The enum concept departs
> radically from the declarative Referential Integrity concepts that many
> of us are already used to.

I have to challenge this. It's *just another type*. Is using a boolean
type a radical departure from RI just because you're not referencing
some external table with the definitions for true and false in it? After
all, from a functional point of view, booleans are just another
enumerated type.

A major annoyance with SQL has been that it hasn't had a good solution
for this pattern. I've seen any number of broken solutions, from lots of
little mostly-static tables littered all over your data model, to single
big code tables that every other table references, and for which you
need triggers to enforce data integrity because standard RI doesn't
work, to chars and varchars with incorrect ordering or meaningless names
or which suck storage-wise. Don't even get me started on MySQL enums.

The reason that I wanted to do the enum patch was because *all* of those
solutions suck. Requiring a table to represent a small fixed set of
allowable values that a column should take is broken. But because it's
the least ugly solution that we've had using vanilla SQL, it's what
we've used, and dare I suggest that because we've all done it for so
long, we start to think that *not* doing it that way is broken.

Enums, as implemented in the patch, are reasonably efficient, typesafe
and properly ordered. Plus they make your data model look cleaner, your
queries don't need to have lookups anymore and you use less disk space.
Oh, and they also bring you coffee and put out the trash :)

> I'd like to be able to speed things up
> without radical re-design of the database... so a few nicely sprinked
> ALTER TABLE statements would be a much better way of implementing this
> IMHO.

OK, back to what you'd like to do. :)

If your external tables are so small and static, just how long does the
FK check take? Are they really that slow?

I would have thought that it might be difficult to get rid of the FK
check altogether, but perhaps, in the context of a single query (I'm
thinking a bulk insert) you could have some sort of LRU cache. If you
want the cache to stick around, you've got to deal with what happens
when it goes out of date... notifying all the backends etc, and what
happens when one if the other backends was halfway through a
transaction. Maybe you could set this "cached mode" on, but would have
to switch it off before updating the tables in question or something
like that. I dunno. That stuff sounds hard; I found it easier to just
implement my own type ;)

Cheers

Tom

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Neil Conway 2006-11-21 03:21:44 Re: quick review
Previous Message Bruce Momjian 2006-11-21 02:50:20 Re: quick review