Re: Speeding up pg_upgrade

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Alexander Kukushkin <cyberdemn(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speeding up pg_upgrade
Date: 2017-12-08 17:23:43
Message-ID: F0E0BDA9-2C86-42C9-9416-7ED88665D908@gmail.com
Lists: pgsql-hackers


> On Dec 8, 2017, at 9:21 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Mark,
>
> * Mark Dilger (hornschnorter(at)gmail(dot)com) wrote:
>>> On Dec 7, 2017, at 10:24 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>> I think the big problem with two-stage pg_upgrade is that the user steps
>>> are more complex, so what percentage of users are going to use the
>>> two-stage method? The bad news is that only a small percentage of users
>>> who will benefit from it will use it, and some who will not benefit from
>>> it will use it. Also, this is going to require significant server changes,
>>> which have to be maintained.
>>
>> In my fork of the project, back when I was tracking 9.5, I added an option
>> to vacuum/analyze to make it behave a bit more like autovac, so that I could
>> run
>>
>> ANALYZE CONDITIONALLY;
>>
>> and it would only analyze those tables in the system which autovac would
>> analyze. In the grammar, CONDITIONALLY gets translated into a
>> VacuumOption flag. In vacuum (in src/backend/commands/vacuum.c), inside
>> the "Loop to process each selected relation", if this flag is set, it checks the
>> PgStat_StatTabEntry for the table to determine whether to vacuum or analyze
>> the table.
>>
>> I think this extension would be helpful in the context of the current conversation.
>> In those cases where pg_upgrade was able to migrate the statistics to the
>> new database, as long as it set the PgStat_StatTabEntry for each table where
>> statistics were migrated, then the user would just have to execute a
>> "VACUUM CONDITIONALLY" after upgrade, and the database would either
>> do a lot of analyze work, a little analyze work, or no analyze work depending
>> on which tables needed analyzing.
>>
>> The main advantage here is that the user would always run this command
>> after pg_upgrade, without having to think about whether pg_upgrade had
>> migrated statistics or not.
>>
>> If the community thinks this is useful, I could put together a patch.
>
> This certainly sounds nice though I have to admit to being a bit
> skeptical on the keyword selection, but perhaps blue really is the right
> color for that bike shed.
>
> One thing I'd wonder about is if that makes 'CONDITIONALLY' into a
> reserved keyword, which wouldn't be ideal. Perhaps a bit of a stretch
> but 'ANALYZE ALL NEEDED' might avoid that?

Yeah, I expected some complaint about CONDITIONALLY, and I don't have
any personal feelings about the choice of terms. I'm happy to go with
your choice, or whatever the community decides.
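
For concreteness, the per-table test described in the quoted text above
could be sketched roughly as below. This is an illustration only, not the
code from the fork: it assumes the check mirrors autovacuum's usual
threshold arithmetic (a base threshold plus a scale factor times the
table's row estimate), and the struct and function names are hypothetical
stand-ins rather than the real PgStat_StatTabEntry or any backend function.
The constants are the stock defaults for the autovacuum_* settings.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the few per-table counters such a check would
 * consult. This is NOT the real PgStat_StatTabEntry layout.
 */
typedef struct TableStatsSketch
{
    double reltuples;             /* estimated row count for the table */
    double dead_tuples;           /* dead rows accumulated since last vacuum */
    double changes_since_analyze; /* rows changed since last analyze */
} TableStatsSketch;

/* Stock defaults of the corresponding autovacuum_* settings. */
const double VACUUM_BASE_THRESHOLD = 50.0;
const double VACUUM_SCALE_FACTOR = 0.2;
const double ANALYZE_BASE_THRESHOLD = 50.0;
const double ANALYZE_SCALE_FACTOR = 0.1;

/* True when dead rows exceed base threshold + scale factor * reltuples. */
bool
table_needs_vacuum(const TableStatsSketch *stats)
{
    double threshold = VACUUM_BASE_THRESHOLD +
        VACUUM_SCALE_FACTOR * stats->reltuples;

    return stats->dead_tuples > threshold;
}

/* True when changed rows exceed base threshold + scale factor * reltuples. */
bool
table_needs_analyze(const TableStatsSketch *stats)
{
    double threshold = ANALYZE_BASE_THRESHOLD +
        ANALYZE_SCALE_FACTOR * stats->reltuples;

    return stats->changes_since_analyze > threshold;
}
```

Under this sketch, a table whose statistics were migrated and whose
counters show no changes since the last analyze would be skipped, while a
table with no usable entry would have to be treated as needing analysis.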

mark
