Quick Links

Re: pg_upgrade and statistics

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: pg_upgrade and statistics
Date:	2012-03-13 07:35:14
Message-ID:	CAAZKuFa0KUwWVjjFpWLyme07uUJ=h4zswM3wK2fzFK0A=qZs9A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Mar 12, 2012 at 9:12 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> Copying the statistics from the old server is on the pg_upgrade TODO
>> list. I have avoided it because it will add an additional requirement
>> that will make pg_upgrade more fragile in case of major version changes.
>
>> Does anyone have a sense of how often we change the statistics data
>> between major versions?
>
> I don't think pg_statistic is inherently any more stable than any other
> system catalog.

Agreed, but it would appear that in practice that a fair amount of it
carries forward. If someone ripped up the statistics system and did
them all over in such a way that the old fields had no meaning on
future costing metrics, that'd probably be reasonable cause for a
caveat involving full-blown reanalyze...still, that doesn't seem to
happen every year.

> We've whacked it around significantly just last week,
> which might color my perception a bit, but there are other changes on
> the to-do list. (For one example, see nearby complaints about
> estimating TOAST-related costs, which we could not fix without adding
> more stats data.)

Is accruing additional statistics likely going to be a big problem? I
noticed the addition of the new anyarray (presumably for
array-selectivity) features; would planning with an "empty" assumption
be disastrous vs. the old behavior, which had no concept of those at
all?

I don't think it's necessary to make statistics porting a feature of
pg_upgrade in all circumstances, but it would be "nice" when possible.
That having been said, perhaps there are other ways for pg_upgrade to
be better invested in....or, best of all and somewhat unrelatedly,
full blown logical replication.

Although this conversation has taken focus on "how do we move stats
forward", I am about as interested in "how do I run statements (like
ANALYZE) more 'nicely'". The same general problem pervades many
background task issues, including autovacuum and large physical
reorganizations of data.

--
fdr

In response to

Re: pg_upgrade and statistics at 2012-03-13 04:12:27 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2012-03-13 08:42:03	Re: initdb and fsync
Previous Message	Daniel Farina	2012-03-13 07:33:09	Re: pg_upgrade and statistics