Re: Assertion failure on hot standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Subject: Re: Assertion failure on hot standby
Date: 2010-11-26 21:55:45
Message-ID: AANLkTind_2ZDhuxdUC+yO7tiKHJ0UkRY6RY4BhL6+z6G@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 26, 2010 at 2:06 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> As a concrete example, VACUUM acquires an AccessExclusiveLock when it
>>> wants
>>> to truncate the relation. A sequential scan running against the table in
>>> the
>>> standby will get upset, if the startup process replays a truncation
>>> record
>>> on the table without warning.
>>
>> This case is similar.  xl_smgr_truncate has only a relfilenode number,
>> not a relation OID, so there's no way for the startup process to
>> generate a conflicting lock request itself.  But if the standby
>> backends locked the relfilenode, or if the xl_smgr_truncate WAL record
>> included the relation OID, it would be simple.
>
> If you go down that path, you're going to spend a lot of time thinking
> through every single case that uses an AccessExclusiveLock, ensuring that
> the standby has enough information, and tinkering with the replay code to
> acquire locks at the right moment. And gain what, exactly?

Well, fewer useless locks on the standby, for one thing, in all
likelihood, and less WAL traffic. We probably don't have much of a
choice but to put in Simon's suggested fix (with a wal_level check
added) for 9.0. But I suspect we're going to end up revisiting this
down the road. Beyond the usability issues, the current approach
sounds brittle as heck to me. All somebody has to do is introduce a
mechanism that drops or rewrites a relation file without an access
exclusive lock, and this whole approach snaps right off (e.g. CLUSTER
CONCURRENTLY, taking only an EXCLUSIVE lock, with some kind of garbage
collection for the old files left behind; or an on-line
table-compaction operation that tries to systematically move tuples to
lower CTIDs by - for example - writing the new tuple, waiting until
all scans in progress at the time of the write have finished, and then
pruning the old tuples).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-11-26 22:56:04 Re: ALTER OBJECT any_name SET SCHEMA name
Previous Message David Fetter 2010-11-26 21:52:09 Re: Tab completion for view triggers in psql