| From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Kirill Reshke <reshkekirill(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: relfilenode statistics |
| Date: | 2025-11-12 17:03:55 |
| Message-ID: | aRS9+yaGhjDslRWj@ip-10-97-1-34.eu-west-3.compute.internal |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On Mon, Nov 10, 2025 at 05:53:45PM +0900, Michael Paquier wrote:
> On Sun, Nov 09, 2025 at 08:33:54AM +0900, Michael Paquier wrote:
> > Looking at this part of the patch set for now, not looked at the rest
> > yet. This new stats_1.out is 2k lines long, introduced for the tests
> > related to rewrites as an effect of 2PC. It seems to me that a split
> > into a new stats_rewrite would be justified for this case, to reduce
> > the output duplication.
>
> did a split into a new file, which should also be fine because we
> don't rely on a system-wide stats reset, then applied the result.
Thanks!
> The patch is spending a great deal of effort on three fronts:
> - making sure that the statistics are copied over after a relation
> rewrite.
Right, in 0003.
> - making sure that we assign a "correct" object ID, assigning
> the fields of RelFileLocator based on a relation ID. Mapped and
> shared relations make the exercise a bit more difficult. It would be
> nice to avoid this kind of duplication with other code paths that
> assign a RelFileLocator.
Are you referring to the new pgstat_reloid_to_relfilelocator() function?
If so, I'll try to avoid code duplication with other code paths as suggested.
> - Partitioned tables, where we don't have a relfilenode but we need to
> track statistics. The patch relies on the relation oid to assign a
> key, as far as I've read.
Right. It's not doing that much in this area. It's needed so that things like
"last_analyze" on a partitioned table is populated (see "Ensure only the
partitioned table is analyzed" in vacuum.sql).
> Among the three points, the first one is the most invasive in the
> patch, it seems, and do we actually want to keep the stats across
> rewrites at all?
Not doing so would mean that all stats related to a relation will be lost after
a rewrite. I think that would be a major regression as compared to the current
behavior.
> The main reason of doing the relfilenode move
> would be to rebuild these stats on a WAL-record basis because the
> relfile locator is the only thing we know in the startup process, and
> once rewritten the state of the data is different.
> relation_needs_vacanalyze() then cares about three fields:
> - Number of dead tuples, which would be 0 after a rewrite.
> - ins_since_vacuum, which would be 0 after a rewrite.
> - mod_since_analyze, for analyze, again 0.
>
> I have not checked the recent autovacuum scheduling thread to see if
> this set changes there.
>
> Are these numbers worth the effort of copying over at the end?
I think so because that would impact all the other relation's stats (not only
the ones linked to relation_needs_vacanalyze()).
> Was
> this particular point discussed? I've seen this mentioned once here,
> but I am wondering what are the arguments in favor of copying the
> stats data versus not copying it across rewrites:
> https://www.postgresql.org/message-id/20240607031736.7izmr2yirznvidka%40awork3.anarazel.de
In favor of copying, I would say:
- no regression as compared to the current behavior. That means, for example,
not breaking DBA's activities/decisions based on the pg_stat_all_tables fields
after a rewrite.
- a rewrite is not changing the number of dead tuples, ins_since_vacuum and
mod_since_analyze. So, if don't copy those, then we'd change the
relation_needs_vacanalyze() decision(s) as compared to the current one(s) for no
reasons (as a rewrite has no impact on those).
In favor of not copying, I would say make the code simpler.
I'm in favor of copying but open to different point of views.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Álvaro Herrera | 2025-11-12 17:09:40 | Re: obsolete autovacuum comment |
| Previous Message | Arseniy Mukhin | 2025-11-12 16:57:51 | Re: Optimize LISTEN/NOTIFY |