Re: autovacuum stress-testing our system

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: autovacuum stress-testing our system
Date: 2012-11-21 18:02:56
Message-ID: CA+TgmoZr3xUQq_OM2V1nmEN6NScVBFkpYJrpFFmbyOPJKvZdzQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 18, 2012 at 5:49 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> The two main changes are these:
>
> (1) The stats file is split into a common "db" file, containing all the
> DB Entries, and per-database files with tables/functions. The common
> file is still called "pgstat.stat", the per-db files have the
> database OID appended, so for example "pgstat.stat.12345" etc.
>
> This was a trivial hack of the pgstat_read_statsfile/pgstat_write_statsfile
> functions, introducing two new functions:
>
> pgstat_read_db_statsfile
> pgstat_write_db_statsfile
>
> that do the trick of reading/writing stat file for one database.
>
> (2) The pgstat_read_statsfile has an additional parameter "onlydbs" that
> says that you don't need table/func stats - just the list of db
> entries. This is used for autovacuum launcher, which does not need
> to read the table/stats (if I'm reading the code in autovacuum.c
> correctly - it seems to be working as expected).

I'm not an expert on the stats system, but this seems like a promising
approach to me.

> (a) It does not solve the "many-schema" scenario at all - that'll need
> a completely new approach I guess :-(

We don't need to solve every problem in the first patch. I've got no
problem kicking this one down the road.

> (b) It does not solve the writing part at all - the current code uses a
> single timestamp (last_statwrite) to decide if a new file needs to
> be written.
>
> That clearly is not enough for multiple files - there should be one
> timestamp for each database/file. I'm thinking about how to solve
> this and how to integrate it with pgstat_send_inquiry etc.

Presumably you need a last_statwrite for each file, in a hash table or
something, and requests need to specify which file is needed.

> And yet another one I'm thinking about is using a fixed-length
> array of timestamps (e.g. 256), indexed by mod(dboid,256). That
> would mean stats for all databases with the same mod(oid,256) would
> be written at the same time. Seems like an over-engineering though.

That seems like an unnecessary kludge.
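For completeness, the fixed-array alternative being set aside here would amount to something like the following sketch (hypothetical names; the collision behavior is exactly what makes it a kludge):

```c
#include <time.h>

typedef unsigned int Oid;

/* One shared timestamp slot per mod(dboid, 256) class. */
#define STAT_SLOTS 256
static time_t last_statwrite[STAT_SLOTS];

static void
note_statwrite(Oid dboid, time_t ts)
{
    last_statwrite[dboid % STAT_SLOTS] = ts;
}

static time_t
get_last_statwrite(Oid dboid)
{
    /*
     * Any two databases whose OIDs agree mod 256 share this slot, so
     * writing one forces (or masks) writes for the others -- the
     * imprecision the "kludge" objection is about.
     */
    return last_statwrite[dboid % STAT_SLOTS];
}
```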

> (c) I'm a bit worried about the number of files - right now there's one
> for each database and I'm thinking about splitting them by type
> (one for tables, one for functions) which might make it even faster
> for some apps with a lot of stored procedures etc.
>
> But is the large number of files actually a problem? After all,
> we're using one file per relation fork in the "base" directory, so
> this seems like a minor issue.

I don't see why one file per database would be a problem. After all,
we already have one directory per database inside base/. If the user
has so many databases that dirent lookups in a directory of that size
are a problem, they're already hosed, and this will probably still
work out to a net win.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
