Re: Change in "policy" on dump ordering?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Jordan Gigov <coladict(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Change in "policy" on dump ordering?
Date: 2017-07-24 16:09:43
Message-ID: 28572.1500912583@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Peter Eisentraut (peter(dot)eisentraut(at)2ndquadrant(dot)com) wrote:
>> On 3/6/17 03:33, Michael Banck wrote:
>>> Would this be a candidate for backpatching, or is the behaviour change
>>> in pg_dump trumping the issues it solves?

>> Unless someone literally has a materialized view on pg_policy, it
>> wouldn't make a difference, so I'm not very keen on bothering to
>> backpatch this.

> Agreed.

So actually, the problem with Jim's patch is that it doesn't fix the
problem. pg_dump's attempts to REFRESH matviews will still fail in
common cases, because they still come out before GRANTs, because pg_dump
treats ACLs as a completely independent thing to be done last. This
was noted as far back as 2015 (in a thread previously linked from this
thread), and it's also the cause of Jordan Gigov's current complaint at
https://www.postgresql.org/message-id/CA%2BnBocAmQ%2BOPNSKUzaaLa-6eGYVw5KqexWJaRoGvrvLyDir9gg%40mail.gmail.com

Digging around in the archives, I find that Kevin had already proposed
a fix in
https://www.postgresql.org/message-id/flat/20160202161407.2778.24659%40wrigleys.postgresql.org
which I didn't particularly care for, and apparently nobody else did
either. But we really oughta do *something*.

The main problem with Kevin's fix, after thinking about it more, is that
it shoves matview refresh commands into the same final processing phase
where ACLs are done, which means that in a parallel restore they will not
be done in parallel. That seems like a pretty serious objection, although
maybe not so serious that we'd be willing to accept a major rewrite in the
back branches to avoid it.

I'm wondering at this point about having restore create a fake DO_ACLS
object (fake in the sense that it isn't in the dump file) that would
participate normally in the dependency sort, and which we'd give a
priority before matview refreshes but after everything else. "Restore"
of that object would perform the same operation we do now of running
through the whole TOC and emitting grants/revokes. So it couldn't be
parallelized in itself (at least not without an additional batch of work)
but it could be treated as an indivisible parallelized task, and then the
matview refreshes could be parallelizable tasks after that.

There's also Peter's proposal of splitting up GRANTs from REVOKEs and
putting only the latter at the end. I'm not quite convinced that that's
a good idea but it certainly merits consideration.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Claudio Freire 2017-07-24 16:11:14 Re: Increase Vacuum ring buffer.
Previous Message Jeff Janes 2017-07-24 15:51:46 Re: why not parallel seq scan for slow functions