Re: Make deparsing of column defaults faster

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: Make deparsing of column defaults faster
Date: 2018-07-07 18:07:55
Message-ID: CAMkU=1zniSch8BsZZL1ngqC9QGfhR=nkCpp6RAdf_sQFBDV42Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 5, 2018 at 10:45 AM, Peter Eisentraut <
peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:

> On 29.06.18 05:15, Jeff Janes wrote:
> > Since pg_dump calls pg_get_expr once over and over again on the same
> > table consecutively, perhaps we could cache the column alias assignments
> > in a single-entry cache, so if it is called on the same table as last
> time it just re-uses the aliases from last time. I am not planning on
> working on that, as I don't know where such a cache could be stored such
> that it is freed and invalidated at the appropriate times.
>
> I looked into that. deparse_context_for() is actually not that
> expensive on its own, well below one second, but it gets somewhat
> expensive when you call it 1600 times for one table. So to address that
> case, we can cache the deparse context between calls in the fn_extra
> field of pg_get_expr. The attached patch does that. This makes the
> pg_dump -s times pretty much constant even with 1600 columns with
> defaults. psql \d should benefit similarly. I haven't seen any other
> cases where we'd expect hundreds of related objects to deparse. (Do
> people have hundreds of policies per table?)
>
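
For readers following along, the single-entry caching pattern hung off
fn_extra looks roughly like this (a sketch only, not the attached patch;
the struct and helper names below are made up for illustration):

#include "postgres.h"

#include "fmgr.h"
#include "utils/ruleutils.h"

/* hypothetical single-entry cache stored in fn_extra */
typedef struct ExprDeparseCache
{
    Oid     relid;      /* relation the cached context was built for */
    List   *context;    /* result of deparse_context_for() */
} ExprDeparseCache;

static List *
get_cached_deparse_context(FunctionCallInfo fcinfo, Oid relid,
                           const char *relname)
{
    ExprDeparseCache *cache = (ExprDeparseCache *) fcinfo->flinfo->fn_extra;

    if (cache == NULL)
    {
        /* allocate in fn_mcxt so the cache outlives the per-call context */
        cache = MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
                                       sizeof(ExprDeparseCache));
        fcinfo->flinfo->fn_extra = cache;
    }

    if (cache->relid != relid)
    {
        /*
         * Different relation than last call: rebuild the deparse context.
         * (A fuller version would also free the previous context here.)
         */
        MemoryContext oldcxt = MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);

        cache->context = deparse_context_for(relname, relid);
        cache->relid = relid;
        MemoryContextSwitchTo(oldcxt);
    }

    return cache->context;
}

The point is that fn_extra lives as long as the FmgrInfo for the call site,
so within one pg_dump query the context survives from one pg_get_expr call
on a table to the next.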

One case that your patch doesn't improve (and neither does my posted one) is
check constraints. To fix that, pg_get_constraintdef_worker would need to
grow a similar cache. I don't know how often people put check constraints on
most of the columns of a table, but some people do like their NOT NULL
constraints to be named rather than implicit.

But from the bigger picture of making pg_upgrade faster, a major issue is
that while pg_dump -s gets faster for the column-default case, restoring that
dump is still slow (again, my posted patch doesn't fix that either). There,
the slow part is deparse_context_for being called from StoreAttrDefault.
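
For context, the slow path on restore looks roughly like this, if I'm reading
catalog/heap.c correctly (a paraphrase under that assumption, not the actual
code; the wrapper function name is made up):

#include "postgres.h"

#include "nodes/nodes.h"
#include "utils/rel.h"
#include "utils/ruleutils.h"

/*
 * Illustrative wrapper around the deparse step StoreAttrDefault performs
 * for each stored default in releases where pg_attrdef.adsrc still exists:
 * deparse_context_for() is rebuilt for the relation on every call, so
 * restoring 1600 column defaults on one table pays that cost 1600 times.
 */
static char *
deparse_default_for_adsrc(Relation rel, Node *expr)
{
    return deparse_expression(expr,
                              deparse_context_for(RelationGetRelationName(rel),
                                                  RelationGetRelid(rel)),
                              false, false);
}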

> (I suppose you could create scenarios with very many such tables to make
> the overhead visible again.)
>

With partitioning, I think anything that happens once is likely to happen
many times. But I don't see any extra improvement from applying my patch on
top of yours (the bottleneck just shifts to pg_dump itself), and yours is
definitely cleaner and slightly more general. I think yours would be
worthwhile, even if it isn't the last word on the subject.

Cheers,

Jeff
