From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Joshua Ma <josh(at)benchling(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Why does CREATE INDEX CONCURRENTLY need two scans? |
Date: | 2015-04-01 02:08:37 |
Message-ID: | CAB7nPqSWkNm0UveY6xnr=cn4X9LS469NdHiG40P2XeH7VfHxOA@mail.gmail.com |
Lists: | pgsql-general |
On Wed, Apr 1, 2015 at 9:43 AM, Joshua Ma <josh(at)benchling(dot)com> wrote:
> Hi all,
>
> I was curious about why CONCURRENTLY needs two scans to complete - from
> the documentation on HOT (access/heap/README.HOT), it looks like the
> process is:
>
> 1) insert pg_index entry, wait for relevant in-progress txns to finish
> (before marking index open for inserts, so HOT updates won't write
> incorrect index entries)
> 2) build index in 1st snapshot, mark index open for inserts
> 3) in 2nd snapshot, validate index and insert missing tuples since first
> snapshot, mark index valid for searches
>
> Why are two scans necessary? What would break if it did something like the
> following?
>
> 1) insert pg_index entry, wait for relevant txns to finish, mark index
> open for inserts
>
> 2) build index in a single snapshot, mark index valid for searches
>
> Wouldn't new inserts update the index correctly? Between the snapshot and
> index-updating txns afterwards, wouldn't all updates be covered?
>
When an index is built with index_build, only the tuples visible at the
start of the first scan are included in the index. A second scan is needed
to add index entries for the tuples that were inserted into the table
during the build phase.
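
To make that concrete, here is a rough toy model in Python, not PostgreSQL code. ToyTable, build_index and validate_index are made-up names for illustration, and a "snapshot" is reduced to a simple insert counter. It ignores the indisready flag and everything about HOT; it only shows why a row that arrives during the build is missed by the first scan and picked up by the second.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical heap: each row is stamped with a counter at insert time."""
    rows: list = field(default_factory=list)  # list of (row_id, inserted_at)
    clock: int = 0

    def insert(self, row_id):
        self.clock += 1
        self.rows.append((row_id, self.clock))

    def snapshot(self):
        # In this toy, a snapshot is just "everything inserted up to now".
        return self.clock

    def visible(self, snap):
        return {row_id for row_id, at in self.rows if at <= snap}


def build_index(table, snap):
    # First scan: index only the rows visible to the build snapshot.
    return set(table.visible(snap))


def validate_index(table, index, snap):
    # Second scan: add entries for rows the first scan could not see.
    missing = table.visible(snap) - index
    index |= missing
    return missing


table = ToyTable()
for r in ("a", "b"):
    table.insert(r)

first_snap = table.snapshot()
table.insert("c")  # arrives while the build is still running
index = build_index(table, first_snap)

second_snap = table.snapshot()
added = validate_index(table, index, second_snap)
```

After the first pass the toy index holds only "a" and "b"; the second pass is what adds "c".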
--
Michael