Re: Hot standby and b-tree killed items

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby and b-tree killed items
Date: 2008-12-29 10:45:45
Message-ID: 4958AA59.6050506@enterprisedb.com
Lists: pgsql-hackers

marcin mank wrote:
>> Perhaps we should listen to the people that have said they don't want
>> queries cancelled, even if the alternative is inconsistent answers.

I don't like that much. PostgreSQL has traditionally tried very hard to
avoid that. It's hard to tell what kind of inconsistencies you'd get, as
it'd depend on what plan is created, when a vacuum happens to run on the
master, etc.

> I think an alternative to that would be "if the wal backlog is too
> big, let current queries finish and let incoming queries wait till the
> backlog gets smaller".

Yeah, that makes sense too.

Many approaches have been proposed, and they all have different
tradeoffs and therefore fit different use cases. I'm not sure which ones
are/will be included in the patch. We don't need all of them in 8.4; the
one or two simplest ones will do just fine, and we can extend later.

Let me summarize. Whenever a WAL record conflicts with a
query-in-progress, we can:

1. kill the query, or
2. wait for the query to finish, or
3. let the query proceed, producing invalid results.

There are some combinations of those as well. Your proposal is a
variation of 2, to avoid the problem of WAL application falling behind
indefinitely. There's also the max_standby_delay option in the patch, to
wait a while, and then kill the query.
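
To make the relationship between the options concrete, here is a rough
sketch in Python. It is not code from the patch: the Query class, the
"remaining" field and resolve_conflict() are hypothetical, and it
assumes a simple per-conflict timer, which may not be how the patch
measures max_standby_delay. A delay of zero degenerates to option 1, an
infinite delay to option 2.

```python
from dataclasses import dataclass


@dataclass
class Query:
    pid: int
    remaining: float  # seconds the query still needs (hypothetical, for illustration)


def resolve_conflict(query, max_standby_delay):
    """Decide what WAL replay does about one conflicting query.

    Returns (action, replay_delay): action is "waited" if the query
    finished within the grace period, "killed" if replay cancelled it.
    replay_delay is how long replay was held up, in seconds.
    """
    if query.remaining <= max_standby_delay:
        # The query finishes in time; replay resumes once it is done.
        return ("waited", query.remaining)
    # Grace period exhausted: cancel the query and resume replay.
    return ("killed", max_standby_delay)


print(resolve_conflict(Query(1, 2.0), 5.0))
print(resolve_conflict(Query(2, 10.0), 5.0))
```

Either way, the time replay is held up by one conflict is bounded by
max_standby_delay.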

There are some additional optimizations that can be made to make those
options less painful. Instead of killing all queries that might be
affected by a vacuum record, only kill them when they actually hit a
block that was vacuumed (Simon's idea of latestRemovedLSN field in page
header).
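
One possible reading of that idea, sketched in Python. Everything here
is an assumption made for illustration (the class names, the
conflict_lsn bookkeeping, the exact comparison), not the design in the
patch, and XID wraparound is ignored. Replaying a cleanup record only
marks the affected queries; a query is cancelled when it actually reads
a page that a conflicting cleanup has touched.

```python
class QueryCancelled(Exception):
    pass


class Page:
    def __init__(self):
        # LSN of the last cleanup record that removed tuples from this page.
        self.latest_removed_lsn = 0


class StandbyQuery:
    def __init__(self, snapshot_xmin):
        self.snapshot_xmin = snapshot_xmin
        self.conflict_lsn = None  # LSN of the first conflicting cleanup record


def replay_cleanup(lsn, latest_removed_xid, page, queries):
    """Replay a cleanup record without killing anything up front."""
    page.latest_removed_lsn = lsn
    for q in queries:
        # The record removed tuples this query's snapshot might still see.
        if latest_removed_xid >= q.snapshot_xmin and q.conflict_lsn is None:
            q.conflict_lsn = lsn


def read_page(query, page):
    """Cancel only when a marked query touches a page cleaned at or after its conflict."""
    if (query.conflict_lsn is not None
            and page.latest_removed_lsn >= query.conflict_lsn):
        raise QueryCancelled("page was cleaned after the snapshot conflict")
    return page
```

A query that never visits a vacuumed block runs to completion, even
though a conflicting record was replayed while it was running.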

Another line of attack is to avoid getting into the situation in the
first place, by affecting behavior on the master. If the standby has an
online connection to the master (per the synch rep patch), it can tell
master what the slave's OldestXmin is, and master can take that into
account and not remove tuples still needed by the slave. That's not good
from a high availability point of view: you don't want a hung query in the
slave to cause a long-running-transaction situation in the master, but
for other use cases it would be fine. Or we can just add a constant # of
transactions to OldestXmin in master, to get some breathing room in the
server.
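
A minimal sketch of both master-side variants in Python. The function
and parameter names are made up for illustration, XIDs are treated as
plain integers (no wraparound), and the constant-slack variant is
modelled as holding the removal horizon back by a fixed number of
transactions, which is an assumption about what is meant above.

```python
def vacuum_horizon(master_oldest_xmin, standby_oldest_xmin=None, slack_xids=0):
    """Compute the XID below which the master may remove dead tuples.

    standby_oldest_xmin: the OldestXmin reported by the standby, if it
    has an online connection to the master (None otherwise).
    slack_xids: fixed number of transactions to hold the horizon back by.
    """
    horizon = master_oldest_xmin
    if standby_oldest_xmin is not None:
        # Don't remove anything a standby query might still need.
        horizon = min(horizon, standby_oldest_xmin)
    return horizon - slack_xids


def can_remove(tuple_xmax, horizon):
    """A dead tuple is removable once its deleter is older than the horizon."""
    return tuple_xmax < horizon


print(vacuum_horizon(1000, standby_oldest_xmin=900))
print(vacuum_horizon(1000, slack_xids=50))
```

The first variant shows the high availability downside directly: a
standby that keeps reporting the same old OldestXmin pins the master's
horizon just like a long-running transaction on the master would.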

The bottom line is that we have enough options to make everyone happy.
Some understanding of the issue is required to tune it properly,
however, so documentation is important.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
