From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: autovacuum stress-testing our system |
Date: | 2012-09-26 22:48:34 |
Message-ID: | 50638642.2080606@fuzzy.cz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 26.9.2012 18:29, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> Excerpts from Euler Taveira's message of miĂŠ sep 26 11:53:27 -0300 2012:
>>> On 26-09-2012 09:43, Tomas Vondra wrote:
>>>> 5) splitting the single stat file into multiple pieces - e.g. per database,
>>>> written separately, so that the autovacuum workers don't need to read all
>>>> the data even for databases that don't need to be vacuumed. This might be
>>>> combined with (4).
>
>>> IMHO that's the definitive solution. It would be one file per database plus a
>>> global one. That way, the check would only read the global.stat and process
>>> those database that were modified. Also, an in-memory map could store that
>>> information to speed up the checks.
>
>> +1
>
> That would help for the case of hundreds of databases, but how much
> does it help for lots of tables in a single database?
Well, it wouldn't, but it wouldn't make it worse either. Or at least
that's how I understand it.
> I'm a bit suspicious of the idea that we should encourage people to use
> hundreds of databases per installation anyway: the duplicated system
> catalogs are going to be mighty expensive, both in disk space and in
> their cache footprint in shared buffers. There was some speculation
> at the last PGCon about how we might avoid the duplication, but I think
> we're years away from any such thing actually happening.
You don't need to encourage us to do that ;-) We know it's not perfect
and considering a good alternative - e.g. several databases (~10) with
schemas inside, replacing the current database-only approach. This way
we'd get multiple stat files (thus gaining the benefits) with less
overhead (shared catalogs).
And yes, using tens of thousands of tables (serving as "caches") for a
reporting solution is "interesting" (as in the old Chinese curse) too.
> What seems to me like it could help more is fixing things so that the
> autovac launcher needn't even launch a child process for databases that
> haven't had any updates lately. I'm not sure how to do that, but it
> probably involves getting the stats collector to produce some kind of
> summary file.
Yes, I've proposed something like this in my original mail - setting a
"dirty" flag on objects (a database in this case) whenever a table in it
gets eligible for vacuum/analyze.
Tomas
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2012-09-26 23:53:50 | Re: autovacuum stress-testing our system |
Previous Message | Gavin Flower | 2012-09-26 22:06:03 | Re: system_information.triggers & truncate triggers |