Hot Standby, deferred conflict resolution for cleanup records (v2)

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Hot Standby, deferred conflict resolution for cleanup records (v2)
Date: 2009-12-12 15:06:45
Message-ID: 1260630406.1984.97.camel@ebony
Lists: pgsql-hackers


I think I've found a better way of doing deferred conflict resolution
for WAL cleanup records. (This does not check for conflicts at the block
level.)

When a cleanup arrives, check *lock* conflicts to see who is accessing
the relation about to be cleaned.

If there are any lock conflicts, then wait, if requested.

If we waited, re-check *lock* conflicts to see who is accessing the
relation about to be cleaned. While holding the lock, set latestRemovedXid
for the relation (protected by the partition lock).

Anyone acquiring a lock on a table should check the latestRemovedXid for
the table and abort if their xmin is too old. This prevents new lockers
from accessing a cleaned relation immediately after we decide to abort
anyone looking at that table. (Anyone queuing for the existing locks
would be caught by this).
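As a rough illustration of the check a new locker would make, here is a minimal Python model. It is not PostgreSQL code. The names LockTable, RelationCleanupState and QueryCancelled are hypothetical. The sketch assumes that "too old" means the snapshot xmin is at or below latestRemovedXid, and it ignores XID wraparound.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class QueryCancelled(Exception):
    """Raised when a standby query must abort because of a cleanup conflict."""


@dataclass
class RelationCleanupState:
    # Hypothetical per-relation state; None means no cleanup has been recorded.
    latest_removed_xid: Optional[int] = None


@dataclass
class LockTable:
    # Maps relid -> cleanup state. A real implementation would keep this in
    # shared memory under the lock partition lock; here it is a plain dict.
    relations: Dict[int, RelationCleanupState] = field(default_factory=dict)

    def acquire(self, relid: int, snapshot_xmin: int) -> None:
        """Check latestRemovedXid before granting a lock on relid.

        Assumption: the snapshot is "too old" when its xmin is at or below
        the relation's latestRemovedXid (XID wraparound is ignored).
        """
        state = self.relations.setdefault(relid, RelationCleanupState())
        limit = state.latest_removed_xid
        if limit is not None and snapshot_xmin <= limit:
            raise QueryCancelled(
                f"xmin {snapshot_xmin} too old for relation {relid} "
                f"(latestRemovedXid {limit})"
            )
        # The lock itself would be granted here; omitted in this sketch.
```

With latestRemovedXid set to 1000 on a relation, a locker with xmin 1001 proceeds while one with xmin 1000 is cancelled. A relation with no recorded cleanup never cancels anyone.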

We then cancel the list of current lock conflicts using the
latestRemovedXid (if there is one) as a cross-check to see if we can
avoid cancelling the query.
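The cross-check on the existing lock holders could be modelled as below. This is a sketch under stated assumptions, not the patch itself: queries_to_cancel and xmin_conflicts are hypothetical names, lock holders are represented as a simple name-to-xmin mapping, a missing latestRemovedXid is read as "no cross-check possible, so every lock holder conflicts", and XID wraparound is ignored.

```python
from typing import Dict, List, Optional


def xmin_conflicts(snapshot_xmin: int, latest_removed_xid: Optional[int]) -> bool:
    """Return True if a lock holder's snapshot conflicts with the cleanup.

    Assumption: with no latestRemovedXid there is nothing to cross-check
    against, so the lock conflict alone decides and the holder conflicts.
    """
    if latest_removed_xid is None:
        return True
    return snapshot_xmin <= latest_removed_xid


def queries_to_cancel(
    lock_holders: Dict[str, int], latest_removed_xid: Optional[int]
) -> List[str]:
    """Filter the current lock conflicts down to those that must be cancelled.

    lock_holders maps a (hypothetical) backend name to its snapshot xmin.
    """
    return sorted(
        name
        for name, xmin in lock_holders.items()
        if xmin_conflicts(xmin, latest_removed_xid)
    )
```

For example, with holders at xmin 900 and 1200 and a latestRemovedXid of 1000, only the holder at 900 is cancelled. The holder at 1200 keeps running even though it holds a conflicting lock.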

So if latestRemovedXid advances on a table you have locked, you will
have your xmin re-checked. If you access a table that has been or is
about to be cleaned, then your xmin will be checked as well.

Taken together, this will mean that far fewer queries get cancelled,
since we check on both relid and latestRemovedXid. Reasonably simple
queries that take locks on a small number of relations at the start of
their execution will continue processing for long periods if they do not
access fast-changing relations.

In particular, IMHO, this will cure about 90% of the btree delete issue,
since only users accessing a particularly busy index will need to cancel
themselves. Since many longer-running queries don't use indexes at all,
that trait alone will ensure that queries survive longer.

We need to keep track of latestRemovedXids for various relations in
shared memory. ISTM we can track the top 8 (?) most common relids per lock
partition using a trivial LRU and then have a catch-all value for others.
That will allow us to track more than 100 relations without sweating too
much. All the fuss is handled during hot standby, so if you choose not
to use it, you see no impact.
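One possible shape for the per-partition tracking is sketched below in Python. It is only a model: PartitionXidTracker and its methods are hypothetical names, the capacity of 8 is the tentative figure from the text, and XID wraparound is ignored. Evicted entries are folded into the catch-all value, and a newly tracked relation is seeded from the catch-all so that the answer for any relation never drops below what was previously recorded for it.

```python
from collections import OrderedDict
from typing import Optional


class PartitionXidTracker:
    """Tracks latestRemovedXid for a few relids in one lock partition."""

    def __init__(self, capacity: int = 8) -> None:
        self.capacity = capacity
        # relid -> latestRemovedXid, ordered least recently used first.
        self.tracked: "OrderedDict[int, int]" = OrderedDict()
        # Conservative value covering every relid that is not tracked.
        self.catch_all: Optional[int] = None

    def record_cleanup(self, relid: int, xid: int) -> None:
        """Record a cleanup on relid that removed tuples up to xid."""
        if relid in self.tracked:
            self.tracked[relid] = max(self.tracked[relid], xid)
            self.tracked.move_to_end(relid)  # mark as most recently used
            return
        if len(self.tracked) >= self.capacity:
            # Evict the least recently used entry into the catch-all.
            _, evicted_xid = self.tracked.popitem(last=False)
            if self.catch_all is None or evicted_xid > self.catch_all:
                self.catch_all = evicted_xid
        # Seed from the catch-all: relid may have been evicted earlier.
        if self.catch_all is not None:
            xid = max(xid, self.catch_all)
        self.tracked[relid] = xid

    def latest_removed_xid(self, relid: int) -> Optional[int]:
        """Return the value a locker of relid should check its xmin against."""
        if relid in self.tracked:
            return self.tracked[relid]
        return self.catch_all
```

The trade-off is that untracked relations share one value, so a query on a rarely cleaned table can be cancelled by cleanup activity on an unrelated evicted relation. If there were 16 lock partitions (an assumption here), 8 entries each would give 128 tracked relations, consistent with the "more than 100" estimate.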

--
Simon Riggs www.2ndQuadrant.com
