| From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
|---|---|
| To: | Mikhail Kharitonov <mikhail(dot)kharitonov(dot)dev(at)gmail(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: [PATCH] VACUUM: avoid pre-creation transactions holding back cleanup of newly created relations |
| Date: | 2025-12-09 19:10:24 |
| Message-ID: | CAAKRu_boKVYgiS4sFOQD0pMN_B6OruUjsDgLe2jetH1esuTn6A@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Mikhail,
Thanks for sending a patch!
I'll start by admitting that I am not sure exactly how to do this
feature correctly. But there are some things about your patch that I
think will definitely not work. I also have some suggestions for how
you can split up the project.
There are two distinct parts of this feature. The first part is
recording the transaction creating the table in pg_class. The second
part is using that xid in vacuum to allow actually pruning/freezing
tuples in the table if it was created after the long-running
transactions holding the horizon back.
I would split the patch into those two parts.
The first patch (having the birth xid of a table in pg_class) could
potentially be used in autovacuum to avoid selecting a table which we
won't be able to clean up anyway (in relation_needs_vacanalyze()).
This is the opposite of what your second patch would do, but it would
save us some useless vacuuming by itself. We don't currently calculate
the horizon in relation_needs_vacanalyze() and I'm not sure if it is
cheap enough to do or if we even can do it there, but doing this could
open more opportunities to avoid useless vacuuming.
Now some comments on the implementation:
You get the current transaction id in InsertPgClassTuple(). This
doesn't look like the right place. It's just a helper function to
assemble the record.
I'm also not convinced you handle the pg_upgrade case correctly. (see
how heap_create_with_catalog() has a special IsBinaryUpgrade path).
What will GetCurrentTransactionId() return in this case?
I presume you did it in InsertPgClassTuple() because you wanted index
relations to also have this xid, but I think you will want to do this
separately in index_create(). Anyway, for the case you are solving,
you don't need index creation xids, but it is probably best to have
those for completeness.
You modify GetOldestNonRemovableTransactionId(). I don't think you
should be this ambitious in this patch. You can change
vacuum_get_cutoffs() to take the newer of the table creation xid and
the value returned from GetOldestNonRemovableTransactionId(). Then you
don't have to figure out if all the cases where
GetOldestNonRemovableTransactionId() is used other than vacuum truly
want the creation xid or not. This would help descope the feature.
I also don't understand why you need to add two fields to
RelationData. RelationData has access to the rd_rel which would have
your creation xid. If it isn't valid, then don't use it in
vacuum_get_cutoffs() (and you should clearly explain in a comment what
the cases are where it may not be valid).
Those are just my initial thoughts. I tried searching the archive for
past efforts to add a table creation xid to pg_class and couldn't find
anything. Hopefully someone who has been around longer will notice
this thread and reply, because I'm very interested to know if it was
tried before, and, if so, why it didn't work out.
- Melanie
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tomas Vondra | 2025-12-09 19:20:32 | Re: Add a greedy join search algorithm to handle large join problems |
| Previous Message | Miłosz Bieniek | 2025-12-09 18:54:57 | Re: amcheck: support for GiST |