Re: Document atthasmissing default optimization avoids verification table scan

From: James Coleman <jtc331(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Document atthasmissing default optimization avoids verification table scan
Date: 2022-01-20 14:05:40
Message-ID: CAAaqYe8qdz1vpb+fVJGHAZfiUA5TZcP1d-JHM0v9M4W2k7amOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 19, 2022 at 9:34 PM David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Wed, Jan 19, 2022 at 6:14 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>>
>> I'm open to the idea of wordsmithing here, of course, but I strongly
>> disagree that this is irrelevant data.
>
>
> Ok, but wording aside, only changing a tip in the DDL - Add Table section doesn't seem like a complete fix. The notes in alter table, where I'd look for such an official directive first, need to be touched as well.
>
> For the alter table docs maybe change/add to the existing sentence below (I'm in favor of not pointing out that scanning the table doesn't mean we are rewriting it, but maybe I'm making another unwarranted assumption regarding obviousness...).
>
> "Adding a CHECK or NOT NULL constraint requires scanning the table [but not rewriting it] to verify that existing rows meet the constraint. It is skipped when done as part of ADD COLUMN unless a table rewrite is required anyway."

I'm looking over the docs again to see how it might be better
structured; point is well taken that we should have it clearly in the
primary place.

> On that note, does the check constraint interplay with the default rewrite avoidance in the same way?

I hadn't checked until you asked, but interestingly, no it doesn't (I
assume you mean scan not rewrite in this context):

test=# select seq_scan from pg_stat_all_tables where relname = 't2';
seq_scan
----------
2
test=# alter table t2 add column i3 int not null default 5;
ALTER TABLE
test=# select seq_scan from pg_stat_all_tables where relname = 't2';
seq_scan
----------
2
test=# alter table t2 add column i4 int default 5 check (i4 < 50);
ALTER TABLE
test=# select seq_scan from pg_stat_all_tables where relname = 't2';
seq_scan
----------
3

That seems like an opportunity for improvement here, though it's
obviously a separate patch. I might poke around at that though
later...

> For the Tip I'd almost rather redo it to say:
>
> "Before PostgreSQL 11, adding a new column to a table required rewriting that table, making it a very slow operation. More recent versions can sometimes optimize away this rewrite and related validation scans. See the notes in ALTER TABLE for details."
>
> Though I suppose I'd accept something like (leave existing text, alternative patch text):
>
> "[...]large tables.\nIf the added column also has a not null constraint the usual verification scan is also skipped."
>
> "constant" is used in the Tip, "non-volatile" is used in alter table - hence a desire to have just one source of truth, with alter table being the correct place. We should sync them up otherwise.

As noted I'll look over how restructuring might improve and reply with
an updated proposed patch.

Thanks,
James Coleman

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2022-01-20 14:20:52 Re: [PATCH] Add reloption for views to enable RLS
Previous Message Robert Haas 2022-01-20 14:02:46 Re: removing datlastsysoid