Temporary tables versus wraparound... again

From: Greg Stark <stark(at)mit(dot)edu>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Temporary tables versus wraparound... again
Date: 2020-11-08 23:19:57
Message-ID: CAM-w4HNNBDeXiXrj0B+_-WvP5NZ6of0RLueqFUZfyqbLcbEfMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We had an outage caused by transaction wraparound. And yes, one of the
first things I did on this site was check that we didn't have any
databases that were in danger of wraparound.

However since then we added a monitoring job that used a temporary
table with ON COMMIT DELETE ROWS. Since it was a simple monitoring job
it stayed connected to the database and used this small temporary
table for a very long period of time.

The temporary table never got vacuumed by autovacuum and never by the
monitoring job (since it was being truncated on every transaction why
would it need to be vacuumed...).

We've been around this bush before. Tom added orphaned table
protection to autovacuum precisely because temporary tables can cause
the datfrozenxid to get held back indefinitely. Then Michael Paquier
and Tsunakawa Takayuki both found it worth making this more
aggressive.

But none of that helped as the temporary schema was still in use so
they were not considered "orphaned" temp tables at all.

I think we need to add some warnings to autovacuum when it detects
*non* orphaned temporary tables that are older than the freeze
threshold.

However in the case of ON COMMIT DELETE ROWS we can do better. Why not
just reset the relfrozenxid and other stats as if the table was
freshly created when it's truncated?

I put together this quick patch to check the idea and it seems to
integrate fine in the code. I'm not sure about a few points but I
don't think they're showstoppers.

1) Should we update relpages and reltuples. I think so but an argument
could be made that people might be depending on running analyze once
when the data is loaded and then not having to run analyze on every
data load.

2) adding the dependency on heapam.h to heap.c makes sense because of
heap_inplace_update bt it may be a bit annoying because I suspect
that's a useful sanity check that the tableam stuff hasn't been
bypassed

3) I added a check to the regression tests but I'm not sure it's a
good idea to actually commit this. It could fail if there's a parallel
transaction going on and even moving the test to the serial schedule
might not guarantee that never happens due to autovacuum running
analyze?

I didn't actually add the warning to autovacuum yet.

--
greg

Attachment Content-Type Size
0001-update-relfrozenxmin-when-truncating-temp-tables.patch text/x-patch 6.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-11-08 23:22:50 Re: upcoming API changes for LLVM 12
Previous Message Andres Freund 2020-11-08 23:18:54 Re: upcoming API changes for LLVM 12