Re: Avoiding rewrite in ALTER TABLE ALTER TYPE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding rewrite in ALTER TABLE ALTER TYPE
Date: 2010-12-29 15:56:39
Message-ID: 10483596-A0C7-4B32-9752-02B03937A1D7@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Dec 29, 2010, at 7:56 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> ALTER TABLE ALTER TYPE always rewrites the table heap and its indexes. In some
> cases, we can determine that doing so is unhelpful, and that the conversion
> shall always succeed:
>
> CREATE DOMAIN loosedom AS text;
> CREATE TABLE t (c varchar(2));
> ALTER TABLE t ALTER c TYPE varchar(4);
> ALTER TABLE t ALTER c TYPE text;
> ALTER TABLE t ALTER c TYPE loosedom;
>
> In other cases, we can determine that the rewrite is unhelpful, but a cast could
> still throw an error:
>
> CREATE DOMAIN tightdom AS text CHECK (value LIKE '<%/>');
> CREATE TABLE t (c text);
> ALTER TABLE t ALTER c TYPE xml USING c::xml;
> ALTER TABLE t ALTER c TYPE varchar(64);
> ALTER TABLE t ALTER c TYPE tightdom;
>
> I wish to replace table rewrites with table verification scans where possible,
> then skip those verification scans where possible.

Seems like a good idea.

> Having thought on it
> more, though, it actually seems best to attempt the verification scan *every*
> time. In most ineligible conversions, an inequality will appear very early, so
> the scan is effectively O(1) in the negative case. A notable exception is
> something like char(6)->varchar(6) in a table with a billion tuples having
> length(col) = 6 and one with length(col) = 5. The verification scan might read
> most of the table before finding the one tuple that forces a rewrite. That
> isn't a particularly regular scenario in my experience, so the "just do the
> right thing" aspect of preceding every potential rewrite with a verification
> scan seems to win out.

I think this scenario will be more common than you might think. Tables don't contain random data; they contain data that the DBA thinks is valid. The situation where the data is mostly as you expect but with a few kooky rows is, in my experience, extremely common. And it makes the worst case a LOT worse. I really doubt this is worth the complexity anyway - converting between two types that are mostly-but-not-quite binary compatible seems like an edge case in every sense.

> Certain very popular type changes (see introduction) can be _exempt_ from the
> verification scan: we can determine that they will always succeed. To capture
> that, I propose extending CREATE CAST with the notion of an exemptor function:
>
> CREATE CAST (source_type AS target_type)
> { WITH FUNCTION function_name (argument_type [, ...])
> [ WITH EXEMPTOR function_name ] |
> WITHOUT FUNCTION |
> WITH INOUT }
> [ AS ASSIGNMENT | AS IMPLICIT ]
>
> The exemptor shall have this signature:
>
> exemptor_func(
> integer, -- source_typmod
> integer -- dest_typmod
> ) RETURNS boolean
>
> The exemptor shall return true iff datumIsEqual(x, x::target_type(dest_typmod))
> for every x in source_type most recently coerced to source_type(source_typmod).
> When the cast is WITHOUT FUNCTION (a binary coercion) and target_type lacks a
> length coercion cast, the cast has an implicit exemption, and an exemptor is
> superfluous: code can assume an exemptor that always returns true. Use of WITH
> EXEMPTOR mainly makes sense alongside multi-arg WITH FUNCTION (should probably
> raise an error upon other uses).

I am not sure whether it's worth trying to be general here. Maybe we should just hard-code the known cases involving core datatypes.

>

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-12-29 16:16:23 Re: Avoiding rewrite in ALTER TABLE ALTER TYPE
Previous Message Joel Jacobson 2010-12-29 15:21:16 Re: pg_dump --split patch