Re: Allow "snapshot too old" error, to prevent bloat

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Allow "snapshot too old" error, to prevent bloat
Date: 2015-02-19 20:31:17
Message-ID: 1360007681.2022226.1424377877833.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ants Aasma <ants(at)cybertec(dot)at> wrote:

> If I understood the issue correctly, you have long running snapshots
> accessing one part of the data, while you have high churn on a
> disjoint part of data. We would need to enable vacuum on the high
> churn data while still being able to run those long queries. The
> "snapshot too old" solution limits the maximum age of global xmin at
> the cost of having to abort transactions that are older than this and
> happen to touch a page that was modified after they were started.

Pretty much. It's not quite as "black and white" regarding high
churn and stable tables, though -- at least for the customer whose
feedback drove my current work on containing bloat. They are
willing to tolerate an occasional "snapshot too old" on a cursor,
and they can automatically pick up again by restarting the cursor
with a new "lower limit" and a new snapshot.

> What about having the long running snapshots declare their working
> set, and then only take them into account for global xmin for
> relations that are in the working set? Like a SET TRANSACTION WORKING
> SET command. This way the error is deterministic, vacuum on the high
> churn tables doesn't have to wait for the old transaction delay to
> expire and we avoid a hard to tune GUC (what do I need to set
> old_snapshot_threshold to, to reduce bloat while not having "normal"
> transactions abort).

Let me make sure I understand your suggestion. You are suggesting
that within a transaction you can declare a list of tables which
should get the "snapshot too old" error (or something like it) if a
page is read which was modified after the snapshot was taken?
Snapshots within that transaction would not constrain the effective
global xmin for purposes of vacuuming those particular tables?

If I'm understanding your proposal, that would help some of the
cases the "snapshot too old" case addresses, but it would not
handle the accidental "idle in transaction" case (e.g., start a
repeatable read transaction, run one query, then sit idle
indefinitely). I don't think it would work quite as well for some
of the other cases I've seen, but perhaps it could be made to fit
if we could figure out the accidental "idle in transaction" case.

I also think it might be hard to develop a correct global xmin for
vacuuming a particular table with that solution. How do you see
that working?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2015-02-19 20:32:29 Re: POLA violation with \c service=
Previous Message Peter Eisentraut 2015-02-19 20:30:53 Re: How about to have relnamespace and relrole?