Re: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.

From: Thierry Husson <thusson(at)informiciel(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.
Date: 2019-06-07 19:59:11
Message-ID: 20190607145911.Horde.MKizKAnShw4nKEYGDUC2M-T@webmail.iciel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Hi Andres,

Thank you for your anwser. Precisions bellow:

Andres Freund <andres(at)anarazel(dot)de> a écrit :

> Hi,
>
> On 2019-06-07 18:22:20 +0000, PG Bug reporting form wrote:
>> I was doing tables COPY between my old server with PG10.8 and the new one
>> with 12Beta1. After each table is done, I make a vacuum of it.
>> However PG12 has stopped working for wraparound protection. I was doing it
>> on around 10 cpu, 1 table by cpu.
>
> That was a new postgres 12 cluster, not a pg_upgraded one? And you just
> did a bunch of COPYs? How many?
>
> I'm not clear as to how the cluster got to wraparound if that's the
> scenario. We use one xid per transaction, and copy doesn't use multiple
> transactions internally. Any chance you have triggers on these tables
> that use savepoints internally?

Yes it was a new cluster. Around 30 copy were done.
Yes there is a trigger to manage partitions. Around 1200 tables were
created. 10 billions records transfered, I need to tranfert 180BR over
1700 tables.
I just realize I made vacuum on partitions for the first 8BR rows and
forgot for the last 2BR That would explain the wraparound protection.

>
>
>> postgres --single emet_zhen
>> VACUUM FREEZE VERBOSE;
>
> Don't FREEZE in wraparound cases, that just makes it take longer.
>
>
>> It worked a few hours and when I was thinking it was done as nothing was
>> loggin anymore, I made a ctrl-\ and restarted the DB.
>> I was still getting wraparound protection messages so I shutdown the DB
>> again & redo the vacuum command but it didn't work anymore:
>
>> zhen:semt700 $ postgres --single emet_zhen
>> 2019-06-07 17:23:36 UTC 7251 WARNING: database with OID 16394 must be
>> vacuumed within 999995 transactions
>> 2019-06-07 17:23:36 UTC 7251 HINT: To avoid a database shutdown, execute a
>> database-wide VACUUM in that database.
>> You might also need to commit or roll back old prepared
>> transactions, or drop stale replication slots.
>> PostgreSQL stand-alone backend 12beta1
>> backend> VACUUM VERBOSE;
>> 2019-06-07 17:23:59 UTC 7251 WARNING: database "emet_zhen" must be
>> vacuumed within 999995 transactions
>> 2019-06-07 17:23:59 UTC 7251 HINT: To avoid a database shutdown, execute a
>> database-wide VACUUM in that database.
>> You might also need to commit or roll back old prepared
>> transactions, or drop stale replication slots.
>> 2019-06-07 17:23:59 UTC 7251 LOG: duration: 2417.639 ms statement: VACUUM
>> VERBOSE;
>
> What do you mean by "didn't work anymore"? As far as I can tell the
> VACUUM here succeeded?
>
>
>> HINT: To avoid a database shutdown, execute a database-wide VACUUM in that
>> database.
>> You might also need to commit or roll back old prepared transactions, or
>> drop stale replication slots.
>
> Did you check whether any of these are the case?
>
> SELECT * FROM pg_replication_slots;
> SELECT * FROM pg_prepared_xacts;
These are empty.

emet_zhen=# SELECT max(age(pg_database.datfrozenxid)) / 2147483648.0 *
100.0 AS "Percentage of transaction ID's used" FROM pg_database;
Percentage of transaction ID's used
-------------------------------------
99.953434057533740997000

>
> Could you also show
>
> SELECT oid, datname, datfrozenxid, age(datfrozenxid), datminmxid,
> mxid_age(datminmxid) FROM pg_database ORDER BY age(datfrozenxid) DESC;
oid | datname | datfrozenxid | age | datminmxid | mxid_age
-------+-----------+--------------+------------+------------+----------
16394 | emet_zhen | 36464 | 2146483652 | 1 | 0
12672 | template0 | 504982897 | 1641537219 | 1 | 0
12673 | postgres | 2096520116 | 50000000 | 1 | 0
1 | template1 | 2096520116 | 50000000 | 1 | 0

> SELECT * FROM pg_control_checkpoint();
checkpoint_lsn | redo_lsn | redo_wal_file |
timeline_id | prev_timeline_id | full_page_writes | next_xid |
next_oid | next_multixact_id | next_multi_offset | oldest_xid |
oldest_xid_dbid | oldest_active_xid | oldest_multi_xid |
oldest_multi_dbid | oldest_commit_ts_xid | newest_commit_ts_xid |
checkpoint_time
32D/54074EC0 | 32D/54074E88 | 000000010000032D00000054 |
1 | 1 | t | 0:2146520116 | 475782 |
1 | 0 | 36464 | 16394 |
2146520116 | 1 | 16394 |
0 | 0 | 2019-06-07 18:11:39+00
(1 row)

Could it be that PG12 considers "vacuum" as a transaction and trigger
wraparound protection against it?

>
> Greetings,
>
> Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2019-06-07 20:10:02 Re: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.
Previous Message Andres Freund 2019-06-07 19:02:31 Re: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-06-07 20:10:02 Re: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.
Previous Message Andres Freund 2019-06-07 19:02:31 Re: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.