Re: ERROR: found multixact from before relminmxid

From: Jeremy Finzel <finzelj(at)gmail(dot)com>
To: Alexandre Arruda <adaldeia(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: ERROR: found multixact from before relminmxid
Date: 2018-06-08 17:38:03
Message-ID: CAMa1XUhoxkrq_XWqAEzK4Kzmg+amARsaaVJn-bZx4t9vRYPBzQ@mail.gmail.com
Lists: pgsql-general

On Tue, Jun 5, 2018 at 8:42 PM, Alexandre Arruda <adaldeia(at)gmail(dot)com> wrote:

> On Mon, May 28, 2018 at 16:44, Andres Freund <andres(at)anarazel(dot)de>
> wrote:
> >
> > Hi,
> >
> > I think I found the bug, and am about to post a fix for it below
> > https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de.
> >
> > Greetings,
> >
> > Andres Freund
>
> Hi Andres,
>
> At the end of April we did a complete dump/reload of the database into
> version 10.3. Today, the problem returned:
>
> production=# vacuum verbose co27t;
> INFO: vacuuming "public.co27t"
> ERROR: found multixact 81704071 from before relminmxid 107665371
> production=# vacuum full verbose co27t;
> INFO: vacuuming "public.co27t"
> ERROR: found multixact 105476076 from before relminmxid 107665371
> production=# cluster co27t;
> ERROR: found multixact 105476076 from before relminmxid 107665371
>
> But this time, regular vacuum and vacuum full/cluster report different
> multixact numbers.
> Is your patch applicable to this issue, and is it in 10.4?
>
> Best regards,
>
> Alexandre
>
>
We encountered this issue ourselves for the first time on a busy OLTP
system running 9.6.8. We found that upgrading a snapshot of this system to
9.6.9 did not fix the problem, but I assume that is because the fix in 9.6.9
only prevents new occurrences of the problem going forward. Is that accurate?

Before we take an outage for this patch, we want as much information as
possible on whether this is indeed likely to be our issue.
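
For anyone trying to judge whether they are hitting the same error, here is
a minimal sketch of the kind of comparison behind "found multixact N from
before relminmxid M". It assumes the check is the usual wraparound-aware
comparison of 32-bit multixact IDs (signed difference modulo 2^32). The
function names are hypothetical, and the numbers are the ones from
Alexandre's report above, not from our system.

```python
# Sketch only: a wraparound-aware "is older than" test for 32-bit
# multixact IDs. Names here are illustrative, not PostgreSQL's own.

def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def multixact_precedes(multi1: int, multi2: int) -> bool:
    """Return True if multi1 is logically older than multi2."""
    return to_int32(multi1 - multi2) < 0


# Values taken from the vacuum / vacuum full / cluster errors quoted above.
relminmxid = 107665371
for multi in (81704071, 105476076):
    print(multi, "precedes relminmxid:", multixact_precedes(multi, relminmxid))
```

Both multixact IDs from the quoted errors compare as older than the table's
relminmxid, which is the condition the error message reports.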

As others on this thread have reported, amcheck didn't show anything on the
snapshot:
db=# select bt_index_parent_check(indexrelid,true) FROM
pg_stat_user_indexes WHERE relname = 'mytable';
bt_index_parent_check
-----------------------

(5 rows)

db=# select bt_index_check(indexrelid,true) FROM pg_stat_user_indexes WHERE
relname = 'mytable';
bt_index_check
----------------

(5 rows)

Not surprisingly, I can get the problem to go away in production if I use
pg_repack to rebuild the table. But of course we are interested in solving
this problem permanently.

Thanks,
Jeremy
