Re: Hot Standby query cancellation and Streaming Replication integration

From: Richard Huxton <dev(at)archonet(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby query cancellation and Streaming Replication integration
Date: 2010-02-26 16:43:24
Message-ID: 4B87FA2C.9030508@archonet.com
Lists: pgsql-hackers

Replying to my own post - first sign of madness...

Let's see if I've got the concepts clear here, and hopefully my thinking
it through will help others reading the archives.

There are two queues:
1. Cleanup on the master
2. Replay on the slave

Running write queries on the master adds to both queues.
Running (read-only) queries on the slave prevents removal from both
queues.

There are two interesting measurements of "age"/"size":
1. Oldest item in / length of queue (knowable)
2. How long will it take to clear the queue (estimable at best)

You'd like to know #2 to keep up with your workload. Unfortunately, you
can't know it for certain unless you have control over new incoming
queries (on both master and slave).
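The two queues and the two measurements can be sketched roughly like
this (a toy model of my own, not PostgreSQL internals; all names are
mine):

```python
# Toy model of the two queues: cleanup on the master, replay on the slave.
# Measurement #1 (oldest item) is knowable exactly; measurement #2
# (time to clear) is only an estimate, because it depends on future
# arrivals and on the drain rate staying constant.
from collections import deque
import time

class ReplicationQueues:
    def __init__(self):
        self.master_cleanup = deque()  # dead rows awaiting vacuum
        self.slave_replay = deque()    # WAL records awaiting replay

    def write_on_master(self, record):
        # A write on the master adds work to both queues.
        now = time.monotonic()
        self.master_cleanup.append((now, record))
        self.slave_replay.append((now, record))

    def oldest_age(self, queue):
        # Measurement #1: read directly off the queue.
        if not queue:
            return 0.0
        return time.monotonic() - queue[0][0]

    def estimated_clear_time(self, queue, drain_rate):
        # Measurement #2: an estimate at best -- assumes no new work
        # arrives and drain_rate (items/sec) holds steady.
        return len(queue) / drain_rate if drain_rate > 0 else float("inf")
```

Running read-only queries on the slave corresponds to the drain rate on
both queues dropping to zero, which is why measurement #2 can't be
pinned down without controlling incoming queries.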

You might want four separate GUCs for the two measurements on the two
queues. We currently have two that (sort of) match #1 "Oldest item"
(vacuum_defer_cleanup_age, max_standby_delay).
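For reference, those two existing knobs look like this in
postgresql.conf (values are illustrative only, as of the 9.0
development tree):

```
# Master: defer cleanup of dead rows by roughly this many transactions.
vacuum_defer_cleanup_age = 10000

# Slave: how long replay may wait behind a standby query before
# cancelling it.
max_standby_delay = 30s
```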

Delaying replay on a slave has no effect on the master. If a slave falls
too far behind it's responsible for catch-up (via normal WAL archives).

There is no point in delaying cleanup on the master unless it's going to
help one or more slaves. In fact, you don't want to start delaying
cleanup until you have to, otherwise you're wasting your delay time.
This seems to be the case with vacuum_defer_cleanup_age. If I have a
heavily-updated table and I defer vacuuming, then by the time any given
query starts on the slave the deferral window is going to be half used
up already.

There's also no point in deferring cleanup on the master if the standby
is already waiting on a conflict that will cause its queries to be
cancelled anyway. Not only will it not help, it might make things worse:
transactions will be cancelled, the conflict will be replayed, and
(presumably) queries will be re-submitted only to be cancelled again.

This is what Greg Smith's discussion of the keep-alives was about:
giving the master enough information to be smarter about cleanup (and
making the conflicts more fine-grained).

The situation with deferring on one or both ends of the process just
gets more complicated with multiple slaves. There are all sorts of
unpleasant feedback loops I can envisage there.

For the case of single slave being used to run long reporting queries
the ideal scenario would be the following. Master starts deferring
vacuum activity just before the query starts. When that times out, the
slave will receive the cleanup info, refuse to replay it and start its
delay. This gives you a total available query time of:
natural time between vacuums + vacuum delay + WAL transfer time +
standby delay
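Plugging in some made-up numbers to show how the terms add up (all
values are purely illustrative):

```python
# Back-of-envelope arithmetic for the reporting-query scenario above.
# Every number here is invented for illustration.
natural_gap_between_vacuums = 300  # seconds since the last vacuum cycle
vacuum_defer_window = 120          # master-side cleanup deferral, seconds
wal_transfer_time = 5              # seconds for the WAL to reach the slave
standby_delay = 180                # slave-side replay delay, seconds

total_query_window = (natural_gap_between_vacuums
                      + vacuum_defer_window
                      + wal_transfer_time
                      + standby_delay)
print(total_query_window)  # 605 seconds before the query risks cancellation
```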

I can think of five useful things we should be doing (and might be
already - don't know).

1. On the master, deduce whether the slave is already waiting on a
query. If so, don't bother delaying cleanup. Clearly you don't want to
be signalling hundreds of times a second though. Does the slave pause
fetching via streaming replication if replay is blocked on a query?
Could we signal "half-way to max-age" or some such?

2. Perhaps simpler than trying to make the master smarter, just allow
SET this_transaction_is_probably_a_long_one=true on the slave. That (a)
clears the queue on the slave and (b) sends the signal to the master
which then starts deferring vacuum.

3. Do a burst of cleanup activity on the master after blocking. This
should concentrate conflicts together when they reach the slave. Perhaps
vacuum_defer_cleanup_age should be vacuum_deferred_queue_size and
measure the amount of work to do, rather than the max age of the oldest
cleanup (if I've understood correctly).
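The distinction in #3 between the existing age-based knob and a
size-based one might be sketched like this (both predicates are
simplifications of my own; the real semantics would differ):

```python
# Hypothetical contrast: when should the master keep deferring cleanup?

def should_defer_by_age(oldest_pending_age, vacuum_defer_cleanup_age):
    # Existing style: keep deferring while the oldest pending cleanup
    # is younger than the configured age (in transactions).
    return oldest_pending_age < vacuum_defer_cleanup_age

def should_defer_by_queue_size(pending_cleanup_work, vacuum_deferred_queue_size):
    # Proposed style: keep deferring while the backlog of deferred
    # work stays under a configured amount, regardless of its age.
    return pending_cleanup_work < vacuum_deferred_queue_size
```

The size-based form bounds the burst of conflicts that eventually
reaches the slave, which is the point of doing cleanup in bursts.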

4. Do a burst of replay on the slave after blocking. Perhaps every time
it cancels a transaction it should replay at least half the queued WAL
before letting new transactions start. Or perhaps it replays any vacuum
activity it comes across and then stops. That should sync with #2
assuming the slave doesn't lag the master too much.

5. I've been mixing "defer" and "delay", as do the docs. We should
probably settle on one or the other. I think defer conveys the meaning
more precisely, but what about non-native English speakers?

--
Richard Huxton
Archonet Ltd
