Re: Page Scan Mode in Hash Index

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page Scan Mode in Hash Index
Date: 2017-09-20 13:14:44
Message-ID: CA+TgmoZj+rzoZZHxcBcK8E82YHQ6kgB28AsR0j0XRWvPy1KhOw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 20, 2017 at 7:45 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Right, I was thinking from the perspective of the index entry. Before
> marking index entry as dead, we do check for heaptid. So, as heaptid
> can't be reused via Page-at-a-time index vacuum, scan won't mark index
> entry as dead.

It can mark index entries dead, but if it does, they correspond to
heap TIDs that are still dead, as opposed to heap TIDs that have been
resurrected by being reused for an unrelated tuple.

In other words, the danger scenario is this:

1. A page-at-a-time scan records all the TIDs on a page.
2. VACUUM processes the page, removing some of those TIDs.
3. VACUUM finishes, changing the heap TIDs from dead to unused.
4. Somebody inserts a new tuple at one of those now-unused TIDs, and
its index tuple gets put on the page scanned in step 1.
5. The page-at-a-time scan resumes and kills the tuple added in step 4
by mistake, when it really only intended to kill a tuple removed in
step 2.
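
To make the five steps concrete, here is a toy model of the race. It is
only a sketch, not PostgreSQL code: IndexPage, kill_by_tid and
run_danger_scenario are hypothetical names, the heap is not modeled, and
the scan is assumed to match entries purely by heap TID.

```python
# Toy model of the race described above. Not PostgreSQL code; all names
# here are hypothetical.

class IndexPage:
    def __init__(self, tids):
        # Each index entry points at a heap TID (block, offset).
        self.entries = [{"tid": tid, "dead": False} for tid in tids]

    def vacuum_remove(self, tids):
        self.entries = [e for e in self.entries if e["tid"] not in tids]

    def insert(self, tid):
        self.entries.append({"tid": tid, "dead": False})

    def kill_by_tid(self, tids):
        # Naive kill: match purely on heap TID, with no other check.
        killed = []
        for e in self.entries:
            if e["tid"] in tids:
                e["dead"] = True
                killed.append(e["tid"])
        return killed


def run_danger_scenario():
    page = IndexPage([(1, 1), (1, 2)])

    # Step 1: the scan records every TID on the page. Assume it later
    # finds that heap tuple (1, 2) is dead and queues it for killing.
    recorded = [e["tid"] for e in page.entries]
    to_kill = {tid for tid in recorded if tid == (1, 2)}

    # Steps 2 and 3: VACUUM removes the index entry; the heap TID
    # becomes unused (the heap itself is not modeled here).
    page.vacuum_remove({(1, 2)})

    # Step 4: an unrelated tuple reuses TID (1, 2), and its index
    # entry lands on the same page.
    page.insert((1, 2))

    # Step 5: the scan resumes and kills by TID alone, hitting the
    # new, live entry instead of the one VACUUM already removed.
    killed = page.kill_by_tid(to_kill)
    return killed, page
```

In this model the entry that ends up marked dead is the one inserted in
step 4, which is exactly the mistake described in step 5.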

What prevents this is:

A. To begin scanning a bucket, VACUUM needs a cleanup lock on the
primary bucket page. Therefore, there are no scans in progress at the
time that VACUUM begins scanning the bucket.

B. If a scan begins scanning the bucket, it can't pass VACUUM, because
VACUUM doesn't release the page lock on one page before taking the one
for the next page.
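
The argument in B can be checked on a small model. This is a sketch
under simplifying assumptions, not the real locking code: pages in a
bucket are numbered 0..n-1, there is one VACUUM and one scan, the scan
only moves onto a page VACUUM does not currently hold, and
scan_can_pass is a hypothetical name. The function explores every
interleaving and reports whether the scan can ever get ahead of VACUUM.

```python
from collections import deque


def scan_can_pass(num_pages, lock_coupling):
    """Return True if some interleaving lets the scan get ahead of VACUUM.

    State is (vacuum_page, phase, scan_page). phase 0 means VACUUM holds
    only its current page; phase 1 means it is moving to the next page.
    """
    start = (0, 0, -1)  # VACUUM on page 0, scan not yet in the bucket
    seen = {start}
    queue = deque([start])
    while queue:
        v, phase, s = queue.popleft()
        if phase == 0:
            locked = {v}
        elif lock_coupling:
            locked = {v, v + 1}   # next page taken before current released
        else:
            locked = set()        # current page released before next taken
        front = v + phase         # furthest page VACUUM has reached or targets
        if s > front:
            return True
        successors = []
        # One VACUUM step.
        if phase == 0 and v + 1 < num_pages:
            successors.append((v, 1, s))
        elif phase == 1:
            successors.append((v + 1, 0, s))
        # One scan step: advance only onto a page VACUUM does not hold.
        target = s + 1
        if target < num_pages and target not in locked:
            successors.append((v, phase, target))
        for state in successors:
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False
```

With lock_coupling=True no interleaving lets the scan overtake; with
lock_coupling=False there is a window in which VACUUM holds nothing and
the scan can slip past.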

C. After 0003, it becomes possible for a scan to pass VACUUM if the
table is permanent, but it won't be a problem because of the LSN
check.
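
As a rough illustration of C, here is how an LSN check could stop the
bad kill. This is a sketch under assumptions, not the patch itself: it
assumes the check means "remember the page LSN when the page is read,
and skip killing items if the LSN has changed since", and that every
change to a page of a permanent table advances that LSN. IndexPage and
kill_items are hypothetical names.

```python
# Toy model of an LSN check. Not PostgreSQL code; names are hypothetical.

class IndexPage:
    def __init__(self, tids):
        self.lsn = 1
        self.entries = [{"tid": tid, "dead": False} for tid in tids]

    def vacuum_remove(self, tids):
        self.entries = [e for e in self.entries if e["tid"] not in tids]
        self.lsn += 1  # assumed: a logged page change advances the LSN

    def insert(self, tid):
        self.entries.append({"tid": tid, "dead": False})
        self.lsn += 1  # assumed: a logged page change advances the LSN


def kill_items(page, lsn_at_read, tids):
    """Mark entries dead only if the page is unchanged since it was read."""
    if page.lsn != lsn_at_read:
        return 0  # page was modified in between; skip the kill entirely
    count = 0
    for e in page.entries:
        if e["tid"] in tids and not e["dead"]:
            e["dead"] = True
            count += 1
    return count
```

For example, if the scan remembers page.lsn, and VACUUM then removes
(1, 2) and a new tuple reuses that TID, kill_items returns 0 and the new
entry stays live. On an untouched page the same call marks the entry
dead as intended.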

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
