Re: Do we need to handle orphaned prepared transactions in the server?

From: Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Kellerer <shammat(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Do we need to handle orphaned prepared transactions in the server?
Date: 2020-01-31 14:02:27
Message-ID: CANugjhuz2ZQgV6tVmYu3pTXLRDuu3YrpWiTi5_funEzjp4GEZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 30, 2020 at 8:28 AM Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> On Thu, 30 Jan 2020 at 02:04, Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com> wrote:
> >
> > So having seen the feedback on this thread, and I tend to agree with
> most of what has been said here, I also agree that the server core isn't
> really the ideal place to handle the orphan prepared transactions.
> >
> > Ideally, these must be handled by a transaction manager, however, I do
> believe that we cannot let database suffer for failing of an external
> software, and we did a similar change through introduction of idle in
> transaction timeout behavior.
>
> The difference, IMO, is that idle-in-transaction aborts don't affect
> anything we've promised to be durable.
>
> Once you PREPARE TRANSACTION the DB has made a promise that that txn
> is durable. We don't have any consistent feedback channel to back to
> applications and say "Hey, if you're not going to finish this up we
> need to get rid of it soon, ok?". If a distributed transaction manager
> gets consensus for commit and goes to COMMIT PREPARED a previously
> prepared txn only to find that it has vanished, that's a major
> problem, and one that may bring the entire DTM to a halt until the
> admin can intervene.
>
> This isn't like idle-in-transaction aborts. It's closer to something
> like uncommitting a previously committed transaction.
>
> I do think it'd make sense to ensure that the documentation clearly
> highlights the impact of abandoned prepared xacts on server resource
> retention and performance, preferably with pointers to appropriate
> views. I haven't reviewed the docs to see how clear that is already.
>

Having seen the documentation, IMHO the document does contain enough
information for users to understand what issues can be caused by these
orphaned prepared transactions.

>
> I can also see an argument for a periodic log message (maybe from
> vacuum?) warning when old prepared xacts hold xmin down. Including one
> sent to the client application when an explicit VACUUM is executed.
> (In fact, it'd make sense to generalise that for all xmin-retention).
>

I think that opens up the debate on what we really mean by "old" and
whether that requires a syntax change when creating a prepared
transactions as Thomas Kellerer suggested earlier?

I agree that vacuum should periodically throw warnings for any prepared
xacts that are holding xmin down.

Generalising it for all xmin-retention is a fair idea IMHO, though that
does increase the overall scope here. A vacuum process should (ideally)
periodically throw out warnings for anything that is preventing it
(including
orphaned prepared transactions) from doing its routine work so that
somebody can take necessary actions.

> But I'm really not a fan of aborting such txns. If you operate with
> some kind of broken global transaction manager that can forget or
> abandon prepared xacts, then fix it, or adopt site-local periodic
> cleanup tasks that understand your site's needs.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> 2ndQuadrant - PostgreSQL Solutions for the Enterprise
>

--
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid(dot)akhtar(at)highgo(dot)ca
SKYPE: engineeredvirus

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-01-31 14:29:00 Re: standby apply lag on inactive servers
Previous Message Alvaro Herrera 2020-01-31 13:58:31 Re: Missing break in RelationFindReplTupleSeq