Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering
Date: 2022-09-08 11:07:54
Message-ID: CAEze2WhL8uLMqynnnCu1LAPwxD5RKEo0nHV+eXGg_N6ELU88HQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

In Neon, we've had to modify the GIN fast insertion path as attached,
due to an unexpected XLOG insertion and buffer locking ordering issue.

The xlog readme [0] mentions that the common order of operations is 1)
pin and lock any buffers you need for the log record, then 2) start a
critical section, then 3) call XLogBeginInsert.
In Neon, we rely on this documented order of operations so that we can
WAL-log hint pages (free space map, visibility map) when they are
written to disk (e.g. on cache eviction or by the checkpointer). In
general, this works fine, except that ginHeapTupleFastInsert calls
XLogBeginInsert() before the last of the buffers for the eventual
record has been read, which creates a path where eviction is possible
in a `begininsert_called = true` context. That breaks our current code,
because it cannot evict (WAL-log) the dirtied hint pages while another
record is already being constructed.

Please find attached a patch that rectifies this issue by moving the
XLogBeginInsert() call down to where 1) we have all relevant buffers
pinned and locked, and 2) we're in a critical section, making that part
of the code consistent with the general scheme for XLog insertion.
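
To illustrate the ordering problem outside of PostgreSQL, here is a
minimal toy model in C. It is only a sketch: every toy_* function is a
hypothetical stand-in and not PostgreSQL or Neon code, and it assumes
that evicting a dirty hint page requires starting a WAL record of its
own, which is not possible while begininsert_called is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the backend-local flag set by XLogBeginInsert(). */
static bool begininsert_called = false;

/* Evictions that could not WAL-log their dirty hint page. */
static int failed_evictions = 0;

/* Hypothetical: start constructing a WAL record. */
static void toy_xlog_begin_insert(void)
{
    assert(!begininsert_called);    /* no nested record construction */
    begininsert_called = true;
}

/* Hypothetical: finish the record and reset the construction state. */
static void toy_xlog_insert(void)
{
    begininsert_called = false;
}

/*
 * Hypothetical: reading a buffer may have to evict a dirty hint page.
 * In this model the eviction must WAL-log that page first, which fails
 * if another record is already under construction.
 */
static void toy_read_buffer(bool needs_eviction)
{
    if (needs_eviction && begininsert_called)
        failed_evictions++;
}

/* Ordering described above: begin the record, then read the last buffer. */
int problematic_order(void)
{
    failed_evictions = 0;
    toy_read_buffer(true);      /* earlier buffer: eviction is still fine */
    toy_xlog_begin_insert();
    toy_read_buffer(true);      /* last buffer: eviction cannot WAL-log */
    toy_xlog_insert();
    return failed_evictions;
}

/* Documented ordering: acquire all buffers, then begin the record. */
int documented_order(void)
{
    failed_evictions = 0;
    toy_read_buffer(true);
    toy_read_buffer(true);
    /* a critical section would start here */
    toy_xlog_begin_insert();
    toy_xlog_insert();
    return failed_evictions;
}
```

In this model problematic_order() reports one failed eviction, while
documented_order() reports none, because no buffer is read after the
record has been started.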

Kind regards,

Matthias van de Meent

[0] access/transam/README, section "Write-Ahead Log Coding", lines 436-470

Attachment: v1-0001-Fix-GIN-fast-path-XLogBeginInsert-and-Read-LockBu.patch (application/octet-stream, 1.6 KB)
