Re: Reducing stats collection overhead

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reducing stats collection overhead
Date: 2007-04-29 07:17:35
Message-ID: 200704290717.l3T7HZB08040@momjian.us
Lists: pgsql-hackers


Yes, it seems we will have to do something for 8.3. I assume the method
below would reduce frequent updates of the stats_command_string too.

---------------------------------------------------------------------------

Tom Lane wrote:
> Arjen van der Meijden told me that according to the tweakers.net
> benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
> here that for small SELECT queries issued as separate transactions,
> there's a significant difference. I think much of the difference stems
> from the fact that we now have stats_row_level ON by default, and so
> every transaction sends a stats message that wasn't there by default
> in 8.2. When you're doing a few thousand transactions per second
> (not hard for small read-only queries) that adds up.
>
> It seems to me that this could be fixed fairly easily by allowing the
> stats to accumulate across multiple small transactions before sending
> a message. There's surely not much point in kicking stats out quickly
> when the stats collector only reports them to the world every half
> second anyway.
>
> The first design that comes to mind is that at transaction end
> (pgstat_report_tabstat() time) we send a stats message only if at least
> X milliseconds have elapsed since we last sent one, where X is
> PGSTAT_STAT_INTERVAL or closely related to it. We also make sure to
> flush stats out before process exit. This approach ensures that in a
> lots-of-short-transactions scenario, we only need to send one stats
> message every X msec, not one per query. The cost is possible delay of
> stats reports. I claim that any transaction that makes a really sizable
> change in the stats will run longer than X msec and therefore will send
> its stats immediately. Cases where a client does a small transaction
> after sleeping for a while (more than X msec) will also send immediately.
> You might get a delay in reporting the last few transactions of a burst
> of short transactions, but how much does it matter? So I think that
> complicating the design with, say, a timeout counter to force out the
> stats after a sleep interval is not necessary. Doing so would add a
> couple of kernel calls to every client interaction so I'd really rather
> avoid that.
>
> Any thoughts, better ideas?
>
> regards, tom lane
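
For illustration only, here is a minimal sketch of the throttling rule
quoted above, written in Python rather than the backend's C. It is not
PostgreSQL code. The names StatsThrottle, report_tabstat, flush, send and
clock are hypothetical, and the 500 ms value is an assumption taken from
the "every half second" remark about PGSTAT_STAT_INTERVAL. Counters
accumulate across transactions, a message goes out at transaction end only
if at least X msec have passed since the last one, and an explicit flush
stands in for the flush before process exit.

```python
import time

# Assumed value: the quoted text says the collector reports "every half second".
PGSTAT_STAT_INTERVAL_MS = 500


class StatsThrottle:
    """Accumulate per-transaction counters and send them at most once per
    interval. Hypothetical helper for illustration, not PostgreSQL code."""

    def __init__(self, send, interval_ms=PGSTAT_STAT_INTERVAL_MS,
                 clock=time.monotonic):
        self.send = send                      # callable that ships one message
        self.interval_s = interval_ms / 1000.0
        self.clock = clock                    # injectable so it can be tested
        self.pending = {}                     # counters not yet sent
        self.last_send = None                 # time of the last message, if any

    def report_tabstat(self, counts):
        """Call at transaction end with that transaction's counters."""
        for key, n in counts.items():
            self.pending[key] = self.pending.get(key, 0) + n
        now = self.clock()
        # Send only if nothing was sent yet or X msec have elapsed.
        if self.last_send is None or now - self.last_send >= self.interval_s:
            self.flush(now)

    def flush(self, now=None):
        """Send whatever has accumulated; also call this before exit."""
        if not self.pending:
            return
        self.send(dict(self.pending))
        self.pending.clear()
        self.last_send = self.clock() if now is None else now
```

With this shape, a burst of short transactions produces one message per
interval, and the last few transactions of a burst wait until the next
report or the exit-time flush, which is the delay the quoted proposal
accepts. No timer is involved, so no extra kernel calls are added beyond
reading the clock at transaction end.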

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
