Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Statistics Import and Export
Date: 2024-02-29 22:47:59
Message-ID: CADkLM=c8waqcMg+tidQmRudS8cziwPHoAQoLq4zKeEbskzVX2Q@mail.gmail.com
Thread:
Lists: pgsql-hackers

> Having looked through this thread and discussed a bit with Corey
> off-line, the approach that Tom laid out up-thread seems like it would
> make the most sense overall- that is, eliminate the JSON bits and the
> SPI and instead export the stats data by running queries from the new
> version of pg_dump/server (in the FDW case) against the old server
> with the intelligence of how to transform the data into the format
> needed for the current pg_dump/server to accept, through function calls
> where the function calls generally map up to the rows/information being
> updated- a call to update the information in pg_class for each relation
> and then a call for each attribute to update the information in
> pg_statistic.
>

Thanks for the excellent summary of our conversation, though I would add that
we discussed a problem with per-attribute functions: each call would have to
acquire locks on both the relation (so it doesn't go away) and pg_statistic,
and that lock thrashing would add up. Whether that overhead is judged
significant or not is up for discussion. If it is significant, it makes sense
to package all of the attributes into one call, passing in an array of some
new pg_statistic-esque special type... the very issue that sent me down the
JSON path.
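To make the trade-off concrete, the two call shapes under discussion might look something like this (the function names, signatures, and stats columns shown are hypothetical, purely for illustration):

```sql
-- Hypothetical per-attribute approach: one call per column, so each
-- call re-acquires locks on the relation and on pg_statistic.
SELECT pg_set_attribute_stats('public.mytable'::regclass, 'id',
                              0.0,    -- null fraction
                              4,      -- average width
                              -1.0);  -- n_distinct

-- Hypothetical whole-table approach: a single call carrying all
-- per-attribute stats at once, so the locks are taken only once.
SELECT pg_set_relation_stats('public.mytable'::regclass,
                             ARRAY[row('id',   0.0, 4, -1.0),
                                   row('name', 0.1, 32, 500.0)
                                  ]::pg_stats_record[]);
```

Here `pg_stats_record` stands in for the "pg_statistic-esque special type" mentioned above; it does not exist today.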

I certainly see the flexibility in having per-attribute functions, but I am
concerned about non-binary-upgrade situations where the attnums won't line
up. If we pass attributes by name instead, the function has to dig around
looking for the right matching attnum, and that's overhead too. In the
whole-table approach, we just iterate over the attributes that exist and
find the matching parameter row.
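For reference, the per-name lookup that each per-attribute call would have to repeat is roughly this catalog probe (real pg_attribute columns; the table and column names are just examples):

```sql
-- Resolve a column name to its attnum in the target database.
-- Dropped columns must be skipped, which is exactly why attnums
-- can differ from the source in a non-binary-upgrade restore.
SELECT attnum
FROM   pg_attribute
WHERE  attrelid = 'public.mytable'::regclass
AND    attname  = 'id'
AND    NOT attisdropped;
```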
