Re: Drastic performance loss in assert-enabled build in HEAD

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Drastic performance loss in assert-enabled build in HEAD
Date: 2013-04-02 18:40:19
Message-ID: 1364928019.4557.YahooMailNeo@web162901.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Kevin Grittner <kgrittn(at)ymail(dot)com> writes:
>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> So maybe I'm nuts to care about the performance of an
>>> assert-enabled backend, but I don't really find a 4X runtime
>>> degradation acceptable, even for development work.  Does anyone
>>> want to fess up to having caused this, or do I need to start
>>> tracking down what changed?
>
>> I checked master HEAD for a dump of regression and got about 4
>> seconds.  I checked right after my initial push of matview code
>> and got 2.5 seconds.  I checked just before that and got 1
>> second.  There was some additional pg_dump work for matviews
>> after the initial push which may or may not account for the rest
>> of the time.
>
> I poked at this a bit, and eventually found that the performance
> differential goes away if I dike out the
> pg_relation_is_scannable() call in getTables()'s table-collection
> query.  What appears to be happening is that those calls cause a
> great many more relcache entries to be made than used to happen,
> plus many entries are made earlier in the run than they used to
> be.  Many of those entries have subsidiary memory contexts, which
> results in an O(N^2) growth in the time spent in AllocSetCheck,
> since postgres.c does MemoryContextCheck(TopMemoryContext) once
> per received command, and pg_dump will presumably issue O(N)
> commands in an N-table database.
>
> So one question that brings up is whether we need to dial back
> the amount of work done in memory context checking, but I'm loath
> to do so in development builds.  That code has fingered an awful
> lot of bugs.  OTOH, if the number of tables in the regression DB
> continues to grow, we may have little choice.
>
> Anyway, the immediate takeaway is that this represents a horribly
> expensive way for pg_dump to find out which matviews aren't
> scannable.  The cheap way for it to find out would be if we had a
> boolean flag for that in pg_class.  Would you review the bidding
> as to why things were done the way they are?  Because in general,
> having to ask the kernel something is going to suck in any case,
> so basing it on the size of the disk file doesn't seem to me to
> be basically a good thing.

The early versions of the patch had a boolean in pg_class to track
this, but I got complaints from Robert and Noah (and possibly
others?) that this got too ugly in combination with the support for
unlogged matviews, and they suggested the current approach.  For an
unlogged matview we need to replace the heap (main fork) with the
init fork before the database is up and running, so there would
need to be some way for that to result in forcing the flag in
pg_class.  I was attempting to do that when a matview which was
flagged as scannable in pg_class was opened, but I gotta agree that
it wasn't pretty.  Noah suggested a function based on testing the
heap to see if it looked like the init fork, and the current state
of affairs I my attempt to make it work that way.

We could definitely minimize the overhead by only testing this for
matviews.  It seemed potentially useful for the function to have
some purpose for other types of relations, so I was trying to avoid
that.  Maybe that was a bad idea.

> We could alleviate the pain by changing pg_dump's query to
> something like
>
>     (case when c.relkind = 'm'
>      then pg_relation_is_scannable(c.oid)
>      else false end)

Yeah, that's the sort of thing I was thinking of.  If we're going
to go that way, we may want to touch one or two other spots.

> but TBH this feels like bandaging a bad design.

So far nobody has been able to suggest a better way to support both
unlogged matviews and some way to prevent a matview from being used
if it does not contain materialized data for its query from *some*
point in time.  Suggestions welcome.

> Another reason why I don't like this code is that
> pg_relation_is_scannable is broken by design:
>
>     relid = PG_GETARG_OID(0);
>     relation = RelationIdGetRelation(relid);
>     result = relation->rd_isscannable;
>     RelationClose(relation);
>
> You can't do that: if the relcache entry doesn't already exist,
> this will try to construct one while not holding any lock on the
> relation, which is subject to all sorts of race conditions.

Hmm.  I think I had that covered earlier but messed up in
rearranging to respond to review comments.  Will review both new
calling locations.

> In general, direct calls on RelationIdGetRelation are probably
> broken.

If valid contexts for use of the function are that limited, it
might merit a comment where the function is defined.  I'm not sure
what a good alternative to this would be.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Atri Sharma 2013-04-02 18:45:56 Re: Page replacement algorithm in buffer cache
Previous Message Kohei KaiGai 2013-04-02 18:22:56 Re: [sepgsql 2/3] Add db_schema:search permission checks