Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Statistics Import and Export
Date: 2023-12-28 17:28:06
Message-ID: CADkLM=em7HteSgn9tCzioko0+hx00a+kr2WXKWkb7zye7wA8Yg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 27, 2023 at 10:10 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Corey Huinker <corey(dot)huinker(at)gmail(dot)com> writes:
> > Export functions was my original plan, for simplicity, maintenance, etc,
> > but it seemed like I'd be adding quite a few functions, so the one view
> > made more sense for an initial version. Also, I knew that pg_dump or some
> > other stats exporter would have to inline the guts of those functions
> > into queries for older versions, and adapting a view definition seemed more
> > straightforward for the reader than function definitions.
>
> Hmm, I'm not sure we are talking about the same thing at all.
>

Right, I was conflating two things.

>
> What I am proposing is *import* functions. I didn't say anything about
> how pg_dump obtains the data it prints; however, I would advocate that
> we keep that part as simple as possible. You cannot expect export
> functionality to know the requirements of future server versions,
> so I don't think it's useful to put much intelligence there.
>

True, but presumably you'd be using the pg_dump/pg_upgrade of that future
version to do the exporting, so the export format would always be tailored
to the importer's needs.

>
> So I think pg_dump should produce a pretty literal representation of
> what it finds in the source server's catalog, and then rely on the
> import functions in the destination server to make sense of that
> and do whatever slicing-n-dicing is required.
>

Obviously it can't be purely literal, as we have to replace the OID values
with whatever text representation we feel helps us carry them forward. In
addition, we're setting the number of tuples and number of pages directly
in pg_class, and doing so non-transactionally, just as ANALYZE does. We
could separate that out into its own import function, but then we'd be
locking every relation twice: once for the tuples/pages update and once
again for the pg_statistic import.
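To make the double-locking concern concrete, here's a rough sketch; the
function names and signatures are invented for illustration (nothing like
them exists in core today):

```sql
-- Hypothetical: relation-level stats (pg_class) imported separately from
-- attribute-level stats (pg_statistic). Each call must lock the relation,
-- so splitting them means locking every table twice during restore.
SELECT pg_import_rel_stats('public.mytable'::regclass,
                           relpages => 10, reltuples => 1000);
SELECT pg_import_attr_stats('public.mytable'::regclass, 'mycol',
                            stanullfrac => 0.1, stawidth => 4);
```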

My current line of thinking was that the stats import call, if enabled,
would immediately follow the CREATE statement of the object itself. But
that requires passing everything the import needs to know into the import
function, so we'd need a way to serialize _that_. If you're thinking that
we have one big bulk stats import, that might work, but it also means that
we're less tolerant of failures in the import step.
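For the per-object approach, the dump output would look roughly like the
following; again, the import function name and its parameters are purely
hypothetical:

```sql
-- Hypothetical dump fragment: the stats call immediately follows the
-- object's CREATE statement, so all statistics for the object must be
-- serialized into that single call.
CREATE TABLE public.mytable (id integer, val text);

SELECT pg_import_rel_stats('public.mytable'::regclass,
                           relpages => 4, reltuples => 100);
```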
