Re: Thousands of schemas and ANALYZE goes out of memory

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Hugo <hugo(dot)tech(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Thousands of schemas and ANALYZE goes out of memory
Date: 2015-06-05 23:31:09
Message-ID: CAMkU=1zdGFQEg7czrJ5bD0YVFdZj07EQ9KYoRs0Uu5_F7wGRzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, May 13, 2015 at 3:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Is there a reason the following patch wasn't applied?
>
> I don't think anybody ever did the legwork to verify it was a good idea.
> In particular, it'd be good to check if sending a tabstat message for each
> table adds noticeable overhead.
>
> regards, tom lane
>

Sorry, I thought you had already committed this change.

It definitely fixes the worst CPU bottleneck. To ANALYZE (manually, no
table name specified) a database with 210,000 tables goes from 1280 seconds
to 120 seconds. It does not meaningfully change the memory usage, as
tabstat is not the bottleneck on memory usage, CacheMemoryContext is. It
takes about 1.5GB either with this patch or without it.

I'm having trouble coming up with an adverse scenario to test for
performance regressions. Perhaps a platform where IPC is pathologically
slow? I don't have one of those at hand, nor even know what such a one
might be. And presumably such a platform would be unsuited to the task of
running a very large database anyway.

By the way, this issue can now be partially worked around using "vacuumdb
-Z -j8". The quadratic CPU issue is fixed because with the parallel option
it issues a series of "ANALYZE table" commands rather than one unqualified
command for the whole database. And the memory usage originally reported
is ameliorated because each backend gets own address space. The total
amount of memory used remains the same, but the 32 bit limit doesn't apply
to the aggregate, only to each process.

Cheers,

Jeff

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Geoff Montee 2015-06-06 23:08:05 Re: SQL Server access from PostgreSQL
Previous Message Matheus de Oliveira 2015-06-05 22:34:30 Re: [PERFORM] Query running slow for only one specific id. (Postgres 9.3) version