Re: cheaper snapshots redux

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: cheaper snapshots redux
Date: 2011-08-25 14:19:15
Message-ID: 4E5659E3.2040603@bluegap.ch
Lists: pgsql-hackers

Robert,

On 08/25/2011 03:24 PM, Robert Haas wrote:
> My hope (and it might turn out that I'm an optimist) is that even with
> a reasonably small buffer it will be very rare for a backend to
> experience a wraparound condition.

It certainly seems less likely than with the ring-buffer for imessages, yes.

Note, however, that for imessages I also had a policy in place that a
backend *must* consume its messages before sending any, and I took
great care to have all receivers consume their messages as early as
possible. Nonetheless, I kept having to increase the buffer size (to
multiple megabytes) to make this work. Maybe I'm overcautious because
of that experience.
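
To illustrate the kind of structure I mean, here's a minimal sketch of
such a bounded message ring. All names and the layout are made up for
this mail (it's not the actual imessages code), and a real version
obviously needs locking:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 65536

typedef struct
{
    char        data[RING_SIZE];
    uint64_t    write_pos;      /* total bytes ever written */
    uint64_t    read_pos;       /* total bytes ever consumed */
} imsg_ring;

/*
 * Refuse to enqueue once the unconsumed backlog would overflow the
 * buffer.  This is why the consume-before-send policy matters: two
 * backends with full inboxes that both insist on sending first would
 * block each other forever.
 */
static bool
imsg_send(imsg_ring *ring, const char *msg, size_t len)
{
    if (ring->write_pos - ring->read_pos + len > RING_SIZE)
        return false;           /* full: caller must consume first */

    for (size_t i = 0; i < len; i++)
        ring->data[(ring->write_pos + i) % RING_SIZE] = msg[i];
    ring->write_pos += len;
    return true;
}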

> - a handful of XIDs at most - because, on the average, transactions
> are going to commit in *approximately* increasing XID order

I think this assumption quickly turns false if you have just one
long-running transaction, or in general if transaction duration varies
a lot. A single transaction that stays open while thousands of others
commit keeps the snapshot's xmin pinned far behind its xmax.

> So the backend taking a snapshot only needs
> to be able to copy < ~64 bytes of information from the ring buffer
> before other backends write ~27k of data into that buffer, likely
> requiring hundreds of other commits.

You said earlier that "only the latest snapshot" is required. It takes
only a single commit for such a snapshot to no longer be the latest.

Instead, if you keep older snapshots around for some time - as your
description here implies - readers are free to copy from those older
snapshots while other backends make progress concurrently (writers, or
readers of other snapshots).

However, that requires either keeping track of the readers of a given
snapshot (reference counting) or - as I understand your description -
simply invalidating all concurrent readers upon wraparound.
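
To make that concrete, here's a minimal sketch of the
invalidate-on-wraparound variant, seqlock-style. This is just my
reading of your proposal, with made-up names; a real version needs
memory barriers around reading the write position:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SNAP_RING_SIZE 65536

typedef struct
{
    volatile uint64_t write_pos;    /* total bytes ever appended */
    char              data[SNAP_RING_SIZE];
} snap_ring;

/*
 * Copy len bytes starting at absolute offset start.  Rather than
 * reference-counting readers, check afterwards whether the writers
 * lapped us; if they did, part of the copy may have been overwritten
 * and the reader must retry from a more recent snapshot.
 */
static bool
snap_copy(snap_ring *ring, uint64_t start, char *dest, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dest[i] = ring->data[(start + i) % SNAP_RING_SIZE];

    /* pg_read_barrier() would go here in real code */

    /* Valid iff no writer has wrapped around to our oldest byte. */
    return ring->write_pos <= start + SNAP_RING_SIZE;
}

On false, the caller re-fetches the position of the newest snapshot
and copies again - which is exactly the "invalidated upon wraparound"
behaviour I mean above.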

> That seems vanishingly unlikely;

Agreed.

> Now, as the size of the snapshot gets bigger, things will eventually
> become less good.

Also keep configurations with an increased max_connections in mind.
With those, not only do the snapshots get bigger, but more processes
have to share CPU time, on average making the memcpy slower for any
single process.

> Of course even if wraparound turns out not to be a problem there are
> other things that could scuttle this whole approach, but I think the
> idea has enough potential to be worth testing. If the whole thing
> crashes and burns I hope I'll at least learn enough along the way to
> design something better...

That's always a good motivation. In that sense: happy hacking!

Regards

Markus Wanner
