Re: Allow "snapshot too old" error, to prevent bloat

From: Ants Aasma <ants(at)cybertec(dot)at>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Allow "snapshot too old" error, to prevent bloat
Date: 2015-02-19 22:19:21
Message-ID: CA+CSw_uurFEDmwOsXyrv5dsmLO+oNfE2L7dbMPciUsAaxNymKg@mail.gmail.com
Lists: pgsql-hackers

On Feb 19, 2015 10:31 PM, "Kevin Grittner" <kgrittn(at)ymail(dot)com> wrote:
> > What about having the long running snapshots declare their working
> > set, and then only take them into account for global xmin for
> > relations that are in the working set? Like a SET TRANSACTION WORKING
> > SET command. This way the error is deterministic, vacuum on the high
> > churn tables doesn't have to wait for the old transaction delay to
> > expire and we avoid a hard to tune GUC (what do I need to set
> > old_snapshot_threshold to, to reduce bloat while not having "normal"
> > transactions abort).
>
> Let me make sure I understand your suggestion. You are suggesting
> that within a transaction you can declare a list of tables which
> should get the "snapshot too old" error (or something like it) if a
> page is read which was modified after the snapshot was taken?
> Snapshots within that transaction would not constrain the effective
> global xmin for purposes of vacuuming those particular tables?

Sorry, I should have been clearer. I'm proposing that a transaction can
declare up front which tables it will access. After that, the transaction
constrains xmin only for those tables, and accessing any other table would
raise an error immediately.
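
To make that concrete, usage could look something like this (the syntax
is of course just a strawman, nothing like this is implemented):

    BEGIN ISOLATION LEVEL REPEATABLE READ;
    -- Hypothetical command from the proposal above, not real syntax.
    SET TRANSACTION WORKING SET (orders, order_lines);

    SELECT count(*) FROM orders;      -- ok, relation is in the working set
    SELECT count(*) FROM customers;   -- errors out immediately
    COMMIT;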

> If I'm understanding your proposal, that would help some of the
> cases the "snapshot too old" case addresses, but it would not
> handle the accidental "idle in transaction" case (e.g., start a
> repeatable read transaction, run one query, then sit idle
> indefinitely). I don't think it would work quite as well for some
> of the other cases I've seen, but perhaps it could be made to fit
> if we could figure out the accidental "idle in transaction" case.

Accidental idle-in-transaction seems better handled by simply terminating
transactions older than some timeout. This is already doable externally,
but an integrated solution would be nice if we could figure out how the
permissions for setting such a timeout should work.
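
For example, something like this run periodically (from cron, say) as a
superuser already covers the accidental case today:

    -- Terminate sessions whose transaction has been open for over an hour.
    SELECT pg_terminate_backend(pid)
      FROM pg_stat_activity
     WHERE state = 'idle in transaction'
       AND xact_start < now() - interval '1 hour';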

> I also think it might be hard to develop a correct global xmin for
> vacuuming a particular table with that solution. How do you see
> that working?

I'm imagining the long-running transaction would register its xmin in a
shared map keyed by relation, with one entry per referenced relation, and
remove its xmin from the procarray. Vacuum would look the relation up in
this map, determine the minimum registered xmin, and use that if it's
earlier than the global xmin. I'm being deliberately vague here about the
data structure in shared memory as I don't have a great idea what to use
there. It's somewhat similar to the lock table in that in theory the size
is unbounded, but in practice it's expected to be relatively tiny.
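
Purely to illustrate the arithmetic (the real structure would live in
shared memory, and the xid comparison would of course have to be
wraparound-aware), suppose a hypothetical view
pg_working_set_xmins(relid, xmin) exposed the registered entries. The
horizon vacuum would use for a given table is then:

    -- pg_working_set_xmins is a made-up view, shown only to illustrate
    -- how the per-relation minimum combines with the global xmin.
    -- LEAST ignores NULLs, so an empty map leaves the global xmin alone.
    SELECT least(:global_xmin, min(xmin)) AS vacuum_horizon
      FROM pg_working_set_xmins
     WHERE relid = 'some_table'::regclass;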

Regards,
Ants Aasma
