Re: [HACKERS] Slow count(*) again...

From: david(at)lang(dot)hm
To: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [HACKERS] Slow count(*) again...
Date: 2011-02-05 05:46:30
Message-ID: alpine.DEB.2.00.1102042142571.8162@asgard.lang.hm
Lists: pgsql-hackers pgsql-performance

On Fri, 4 Feb 2011, Vitalii Tymchyshyn wrote:

> 04.02.11 16:33, Kenneth Marshall wrote:
>>
>> In addition, the streaming ANALYZE can provide better statistics at
>> any time during the load and it will be complete immediately. As far
>> as passing the entire table through the ANALYZE process, a simple
>> counter can be used to only send the required samples based on the
>> statistics target. Where this would seem to help the most is in
>> temporary tables which currently do not work with autovacuum but it
>> would streamline their use for more complicated queries that need
>> an analyze to perform well.
>>
> Actually, for me the main "con" of streaming analyze is that it adds
> significant CPU burden to an already not-too-fast load process. Especially if
> it's done automatically for every load operation performed (and I can't see
> how it can be enabled on some threshold).

two thoughts

1. if it's a large enough load, isn't it I/O bound?

2. this could be done in a separate process/thread from the load itself,
that way the overhead on the load is just copying the data in memory to
the other process.

with a multi-threaded load, this would eat up some cpu that could be used
for the load, but cores/chip are still climbing rapidly so I expect that
it's still pretty easy to end up with enough CPU to handle the extra load.
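
A minimal sketch of that idea in Python, not PostgreSQL code: the loader hands
each row to a worker thread through a bounded queue, and the worker keeps a
fixed-size uniform sample using reservoir sampling (which stands in here for
the "simple counter" mentioned above). All names (ReservoirSampler,
load_with_streaming_sample, store_row) are hypothetical and exist only for
illustration. In CPython a thread shares the interpreter lock with the loader,
so real CPU offload would need a separate process; the structure stays the same.

```python
import queue
import random
import threading

SENTINEL = object()  # marks the end of the load


class ReservoirSampler:
    """Keep a uniform random sample of at most `capacity` rows from a stream."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.sample = []
        self.seen = 0
        self._rng = random.Random(seed)

    def add(self, row):
        self.seen += 1
        if len(self.sample) < self.capacity:
            # Reservoir not full yet: keep every row.
            self.sample.append(row)
        else:
            # Keep the new row with probability capacity / seen.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = row


def stats_worker(rows, sampler):
    """Consume rows from the queue until the sentinel arrives."""
    while True:
        row = rows.get()
        if row is SENTINEL:
            break
        sampler.add(row)


def load_with_streaming_sample(source, store_row, sample_size=100, queue_size=1024):
    """Store every row; the only extra work on the load path is a queue put."""
    rows = queue.Queue(maxsize=queue_size)  # bounded, so memory stays capped
    sampler = ReservoirSampler(sample_size)
    worker = threading.Thread(target=stats_worker, args=(rows, sampler))
    worker.start()
    for row in source:
        store_row(row)   # the actual load
        rows.put(row)    # hand a reference to the statistics worker
    rows.put(SENTINEL)
    worker.join()
    return sampler
```

Because the queue is bounded, a slow worker would eventually block the loader;
whether to block or drop rows in that case is a design choice this sketch does
not settle.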

David Lang

> And you can't start after some threshold of data has passed by, since you may
> lose significant information (like minimal values).
>
> Best regards, Vitalii Tymchyshyn
>
