Re: [HACKERS] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Steve Kehlet <steve(dot)kehlet(at)gmail(dot)com>, Forums postgresql <pgsql-general(at)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date: 2015-06-15 13:47:18
Message-ID: 20150615134718.GN133018@postgresql.org
Lists: pgsql-general pgsql-hackers

Andres Freund wrote:

> A first version to address this problem can be found appended to this
> email.
>
> Basically it does:
> * Whenever more than MULTIXACT_MEMBER_SAFE_THRESHOLD are used, signal
> autovacuum once per members segment
> * For both members and offsets, once hitting the hard limits, signal
> autovacuum every time. Otherwise we lose the information when
> restarting the database, or when autovac is killed. I ran into this a
> bunch of times while testing.

Sounds reasonable.

I see another hole in this area. See do_start_worker() -- there we only
consider the offsets limit to determine whether a database is in
almost-wrapped-around state (causing emergency attention). If the
database in members trouble has no pgstat entry, it might get completely
ignored. I think the way to close this hole is to call
find_multixact_start() in the autovac launcher for the database with the
oldest datminmxid, to determine whether we need to activate emergency
mode for it. (Maybe instead of having this logic in autovacuum, it
should be a new function that receives the database's datminmxid and
returns a boolean indicating whether the database is in trouble or not.)
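
A rough, standalone sketch of the check such a function might perform,
not the actual patch. It assumes the caller has already resolved the
database's datminmxid to its first member offset (the job
find_multixact_start() would do) and knows the next member offset to be
allocated. The function name, the parameter names, and the threshold
value are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the backend's 32-bit member offset type. */
typedef uint32_t MultiXactOffset;

/* The modulo-2^32 subtraction below relies on a 32-bit offset. */
static_assert(sizeof(MultiXactOffset) == 4, "member offsets must be 32 bits");

/*
 * Hypothetical emergency threshold: half of the member offset space.
 * The real limits in the patch may well differ.
 */
#define MEMBER_EMERGENCY_THRESHOLD ((MultiXactOffset) 0x80000000u)

/*
 * Hypothetical helper: report whether the members space is far enough
 * along that the database needs emergency autovacuum attention.
 *
 * oldest_member_offset: first member offset of the database's oldest
 *     multixact (assumed to come from a find_multixact_start()-style
 *     lookup on datminmxid).
 * next_member_offset: next member offset that will be allocated.
 */
bool
multixact_members_in_trouble(MultiXactOffset oldest_member_offset,
                             MultiXactOffset next_member_offset)
{
    /*
     * Unsigned subtraction is modulo 2^32, so this stays correct when
     * next_member_offset has wrapped past zero.
     */
    MultiXactOffset used = next_member_offset - oldest_member_offset;

    return used > MEMBER_EMERGENCY_THRESHOLD;
}
```

The launcher could then call something like this for the database with
the oldest datminmxid even when that database has no pgstat entry, in
addition to the existing offsets check.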

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
