Re: Ideas about a better API for postgres_fdw remote estimates

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Ideas about a better API for postgres_fdw remote estimates
Date: 2020-08-31 17:26:59
Message-ID: 20200831172659.GS29590@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> On Mon, Aug 31, 2020 at 12:56:21PM -0400, Stephen Frost wrote:
> > The point I was making was that it has value and people did realize it
> > but there's only so many resources to go around when it comes to hacking
> > on PG and therefore it simply hasn't been done yet.
> >
> > There's a big difference between "yes, we all agree that would be good
> > to have, but no one has had time to work on it" and "we don't think this
> > is worth having because of the maintenance work it'd require." The
> > latter shuts down anyone thinking of working on it, which is why I said
> > anything.
>
> I actually don't know which statement above is correct, because of the
> "forever" maintenance.

I can understand not being sure which is correct, and we can all have
different points of view on it too, but that's a much softer stance than
what I, at least, understood from your up-thread comment, which was:

> I don't think there was enough value to do statistics migration just
> for pg_upgrade [...]

That statement came across to me as asserting the latter position above.
Perhaps that wasn't what you intended, in which case it's good to have
the discussion and clarify it for others who might be following this
thread and wondering whether they should consider working on this area
of the code.

> Yes, very true, but technically any change in any aspect of the
> statistics system would require modification of the statistics dump,
> which usually isn't required for most feature changes.

Feature work either requires changes to pg_dump, or not. I agree that
features which don't require pg_dump changes are definitionally less
work than features which do (presuming the rest of the feature is the
same in both cases), but that isn't a justification for not having
pg_dump support in cases where it's expected. We just don't currently
expect it for statistics, which is a rather odd exception when you
consider that nearly everything else that ends up in the catalog
tables is included.

For my part, at least, I'd like to see us change that expectation, for a
number of reasons:

- pg_upgrade could leverage it and reduce downtime and/or confusion for
  users who are upgrading and dealing with poor statistics or no
  statistics for however long after the upgrade

- Restored tables wouldn't require an ANALYZE to get reasonable query
  plans against them

- Debugging query plans would involve a lot less guesswork, and less
  asking the user to export the statistics by hand from the catalog and
  then hand-hacking them in to try to reproduce what's happening,
  particularly when re-running ANALYZE ends up giving different
  results, which isn't uncommon for edge cases

- postgres_fdw would be able to leverage this, as discussed earlier in
  this thread

- Logical replication could potentially leverage the existing stats and
  not require ANALYZE to be done after an import, leading to more
  predictable query plans on the replica

I suspect there are other benefits beyond the ones above, but these
all seem pretty valuable to me.

Thanks,

Stephen
