Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)
Date: 2018-11-14 06:28:05
Message-ID: CAH2-WzmQ7XeKspNyA+K4iRMaVcqquq8G1qtvfK_grdKV-1Oa9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 13, 2018 at 1:29 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> A solution does occur to me that I'm kind of embarrassed to suggest,
> but that would nonetheless probably do the job:

> Why not vary the objsubid value among entries that don't use it
> anyway, so that they have a negative integer objsubid values rather
> than 0? That would have the effect of forcing the scan order of
> pg_depend_depender_index scans to be whatever we want it to be.

I tried to implement this today, and it almost worked. However, what I
came up with was even more brittle than I thought it would be, because
knowledge of objsubid is all over the place. It's even in pg_dump.

Then I had a better idea: add a monotonically decreasing column to
pg_depend, and make it the last attribute in both
pg_depend_depender_index and pg_depend_reference_index (it has to be
both indexes). This makes almost all the regression test output go
back to the master branch output, since index-wise duplicates are now
accessed in a well defined order: reverse chronological order (the
reverse of the order of the creation of a pg_depend entry). There is a
tiny amount of stuff that still has a different order in the
regression test output, but the changes are trivial, and obviously
still correct.

Is this approach really so bad? I might want to find a way to make it
a monotonically increasing DESC column (there is a question around bki
grammar support for that), but that's just a detail. Some scheme
involving a new column seems like it will work reasonably well. It's
much more maintainable than anything else I'm likely to come up with.

The indexes are actually smaller with the change following a run of
the regression tests, provided the entire patch series is applied. So
even if the only thing you care about is system catalog size, you
still win (the table is slightly bigger, though). This approach will
fix most or all of the test flappiness that we have been papering over
with "\set VERBOSITY terse" before now. I can't promise that just
making this one pg_depend change will reliably make that class of
problem go away forever, since you could still see something like that
elsewhere, but it looks like the vast majority of problem cases
involve pg_depend. So we might just end up killing the "\set VERBOSITY
terse" hack completely this way.

We're already relying on the scan order being in reverse chronological
order, so we might as well formalize the dependency. I don't think
that it's possible to sort the pg_depend entries as a way of fixing
the breakage while avoiding storing this extra information -- what is
there to sort on that's there already? You'd have to infer a whole
bunch of things about the object types associated with pg_depend
entries to do that, and teach dependency.c about its callers. That
seems pretty brittle to me.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2018-11-14 06:38:01 Re: Undo logs
Previous Message John Naylor 2018-11-14 05:56:11 Re: Sync ECPG scanner with core