Re: Restructure ALTER TABLE notes to clarify table rewrites and verification scans

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
Subject: Re: Restructure ALTER TABLE notes to clarify table rewrites and verification scans
Date: 2022-03-31 14:58:15
Message-ID: CAEze2WhAK0eGy8iQYZCY9cTkGa+hdHu=SLB4hdcPdnw-VaeNrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 29 Mar 2022 at 16:20, James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> Over in the "Document atthasmissing default optimization avoids
> verification table scan" thread David Johnston (who I've cc'd)
> suggested that my goals might be better implemented with a simple
> restructuring of the Notes section of the ALTER TABLE docs. I think
> this is also along the lines of Tom Lane's suggestion of a "unified
> discussion", but I've chosen for now (and simplicity's sake) not to
> break this into an entirely new page. If reviewers feel that is
> warranted at this stage, I can do that, but it seems to me that for
> now this improves the structure and sets us up for such a future page
> but falls short of sufficient content to move into its own page.
>
> One question on the changes: the current docs say "when attaching a
> new partition it may be scanned to verify that existing rows meet the
> partition constraint". The word "may" there seems to suggest there may
> also be occasions where scans are not needed, but no examples of such
> cases are present. I'm not immediately aware of such a case. Are these
> constraints always validated? If not, in which cases can such a scan
> be skipped?
>
> I've also incorporated the slight correction in "Correct docs re:
> rewriting indexes when table rewrite is skipped" [2] here, and will
> rebase this patch if that gets committed.

See comments in that thread.

> + Changing the type of an existing column will require the entire table and its
> + indexes to be rewritten. As an exception, if the <literal>USING</literal> clause
> + does not change the column contents and the old type is either binary coercible
> + to the new type or an unconstrained domain over the new type, a table rewrite is
> + not needed.

This implies "If the old type is [...] an unconstrained domain over
the new type, a table rewrite is not needed.", which is the wrong way
around.

I'd go with something along the lines of:

+ Changing the type of an existing column will require the entire table to be
+ rewritten, unless the <literal>USING</literal> clause is only a
binary coercible
+ cast, or if the new type is an unconstrained
<literal>DOMAIN<literal> over the
+ old type.

That would drop the reference to index rebuilding; but that should be
covered in other parts of the docs.

> + The following alterations of the table require the entire table, and in some
> + cases its indexes as well, to be rewritten.

It is impossible to rewrite the table without at the same time also
rewriting the indexes; as the location of tuples changes and thus
previously generated indexes will become invalid. At the same time;
changes to columns might not require a table rewrite, while still
requiring the indexes to be rewritten. I suggest changing the order of
"table" and "index", or dropping the clause.

> + [...] For a large table such a rewrite
> + may take a significant amount of time and will temporarily require as much as
> + double the disk space.

I'd replace the will with could. Technically, this "double the disk
space" could be even higher than that; due to index rebuilds taking up
to 3x normal space (one original index which is only dropped at the
end, one sorted tuple store for the rebuild, and one new index).

> - Similarly, when attaching a new partition it may be scanned to verify that
> - existing rows meet the partition constraint.
> + Attaching a new partition requires scanning the table to verify that existing
> + rows meet the partition constraint.

This is also (and better!) documented under section
sql-altertable-attach-partition: we will skip full table scan if the
table partition's existing constraints already imply the new partition
constraints. The previous wording is better in that regard ("may
need", instead of "requires"), though it could be improved by refering
to the sql-altertable-attach-partition section.

Kind regards,

Matthias van de Meent

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Anton A. Melnikov 2022-03-31 15:08:01 Re: Possible fails in pg_stat_statements test
Previous Message Andrew Dunstan 2022-03-31 14:51:53 Re: pgsql: Add 'basebackup_to_shell' contrib module.