Re: Reducing stats collection overhead

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Reducing stats collection overhead
Date: 2007-04-29 16:00:40
Message-ID: 20070429160040.GE18593@alvh.no-ip.org
Lists: pgsql-hackers

Tom Lane wrote:

> The first design that comes to mind is that at transaction end
> (pgstat_report_tabstat() time) we send a stats message only if at least
> X milliseconds have elapsed since we last sent one, where X is
> PGSTAT_STAT_INTERVAL or closely related to it. We also make sure to
> flush stats out before process exit. This approach ensures that in a
> lots-of-short-transactions scenario, we only need to send one stats
> message every X msec, not one per query.

If you're going to make it depend on the timestamp set by transaction
start, I'm all for it.

> The cost is possible delay of stats reports. I claim that any
> transaction that makes a really sizable change in the stats will run
> longer than X msec and therefore will send its stats immediately.

I agree with this, particularly if it means we don't have to add another
gettimeofday() call.
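
A minimal sketch of the throttling idea discussed above, assuming the
timestamp already captured at transaction start is reused so that no extra
gettimeofday() call is needed. All names here (StatsThrottle,
report_tabstat, STAT_INTERVAL_MS) and the 500 ms value are hypothetical
stand-ins for illustration, not the actual pgstat code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PGSTAT_STAT_INTERVAL; the value is a placeholder. */
#define STAT_INTERVAL_MS 500

typedef int64_t TimestampMs;

/* Hypothetical per-backend state; not a real PostgreSQL structure. */
typedef struct StatsThrottle
{
    TimestampMs last_report_ms; /* timestamp used for the last message sent */
    bool        has_reported;   /* false until the first message is sent */
    int         pending_xacts;  /* transactions accumulated but not yet sent */
    int         messages_sent;  /* number of stats messages sent so far */
} StatsThrottle;

/* Pretend to send the accumulated stats and reset the pending counter. */
static void
send_stats(StatsThrottle *st, TimestampMs now_ms)
{
    st->messages_sent++;
    st->pending_xacts = 0;
    st->last_report_ms = now_ms;
    st->has_reported = true;
}

/*
 * Called at transaction end. xact_start_ms is the timestamp that was
 * already taken at transaction start, so no new clock read happens here.
 * Pass force = true at process exit so pending stats are flushed.
 * Returns true if a message was sent.
 */
bool
report_tabstat(StatsThrottle *st, TimestampMs xact_start_ms, bool force)
{
    st->pending_xacts++;

    if (!force && st->has_reported &&
        xact_start_ms - st->last_report_ms < STAT_INTERVAL_MS)
        return false;           /* too soon: keep accumulating */

    send_stats(st, xact_start_ms);
    return true;
}
```

In this sketch the elapsed time is measured from the previous report to the
current transaction's start, so a long transaction that began shortly after
a report would not trigger a send by itself; its counts would go out with a
later transaction or at the forced flush.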

FWIW, am I reading the code wrong, or do we send the xact commit and
rollback counts multiple times in pgstat_report_one_tabstat, with only
the first message carrying non-zero counts? Maybe we could put these
counters in a separate message to reduce the size of the tabstat
messages themselves. (It may be that the total impact in bytes is
minimal, and that the added overhead of an additional message is greater?)
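
To make the byte trade-off in the parenthetical concrete, here is a rough
sketch that compares carrying the two counters in every tabstat message
against sending them once in a message of their own. The struct layouts and
field names are simplified assumptions made for illustration; they are not
copied from pgstat.h:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical common message header. */
typedef struct MsgHdr
{
    int32_t type;
    int32_t size;
} MsgHdr;

/* Assumed fixed part of a tabstat message that carries the xact counters. */
typedef struct TabstatHdrWithXact
{
    MsgHdr   hdr;
    uint32_t databaseid;
    int32_t  nentries;
    int32_t  xact_commit;
    int32_t  xact_rollback;
} TabstatHdrWithXact;

/* The same fixed part with the counters moved out. */
typedef struct TabstatHdrNoXact
{
    MsgHdr   hdr;
    uint32_t databaseid;
    int32_t  nentries;
} TabstatHdrNoXact;

/* Hypothetical separate message holding only the xact counters. */
typedef struct XactCountMsg
{
    MsgHdr   hdr;
    uint32_t databaseid;
    int32_t  xact_commit;
    int32_t  xact_rollback;
} XactCountMsg;

/* Fixed-part bytes when every tabstat message repeats the counters. */
size_t
header_bytes_inline(size_t nmessages)
{
    return nmessages * sizeof(TabstatHdrWithXact);
}

/* Fixed-part bytes when the counters travel once, in their own message. */
size_t
header_bytes_separate(size_t nmessages)
{
    return nmessages * sizeof(TabstatHdrNoXact) + sizeof(XactCountMsg);
}
```

Under these assumed layouts the counters cost 8 bytes per tabstat message
and the separate message costs 20 bytes, so splitting them out would only
save bytes when a single report spans three or more tabstat messages, and
it would add one more message to send in every case.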

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
