Re: [HACKERS] Slow count(*) again...

From: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "david(at)lang(dot)hm" <david(at)lang(dot)hm>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [HACKERS] Slow count(*) again...
Date: 2011-02-03 15:43:47
Message-ID: 4D4ACD33.903@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

03.02.11 17:31, Robert Haas написав(ла):
>
>> May be introducing something like 'AutoAnalyze' threshold will help? I mean
>> that any insert/update/delete statement that changes more then x% of table
>> (and no less then y records) must do analyze right after it was finished.
>> Defaults like x=50 y=10000 should be quite good as for me.
> That would actually be a pessimization for many real world cases. Consider:
>
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> COPY
> SELECT
If all the copies are ~ same in size and large this will make it:

COPY
ANALYZE
COPY
ANALYZE
COPY
COPY
ANALYZE
COPY
COPY
COPY
COPY
ANALYZE
COPY
COPY
COPY
COPY
COPY
SELECT

instead of

COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
ANALYZE (manual, if one is clever enough)
SELECT

So, yes this will add 3 more analyze, but
1) Analyze is pretty cheap comparing to large data loading. I'd say this
would add few percent of burden. And NOT doing analyze manually before
select can raise select costs orders of magnitude.
2) How often in real world a single table is loaded in many COPY
statements? (I don't say it's not often, I really don't know). At least
for restore it is not the case, is not it?
3) default thresholds are things to discuss. You can make x=90 or x=200
(latter will make it run only for massive load/insert operations). You
can even make it disabled by default for people to test. Or enable by
default for temp tables only (and have two sets of thresholds)
4) As most other settings, this threshold can be changed on up to
per-query basis.

P.S. I would also like to have index analyze as part of any create index
process.

Best regards, Vitalii Tymchyshyn

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-02-03 15:45:51 Re: Problem with postgresql database connection in combination with HUAWEI data modem
Previous Message Robert Haas 2011-02-03 15:38:31 Re: out of memory during COPY .. FROM

Browse pgsql-performance by date

  From Date Subject
Next Message Andy Colson 2011-02-03 15:44:03 Re: getting the most of out multi-core systems for repeated complex SELECT statements
Previous Message Robert Haas 2011-02-03 15:35:43 Re: [HACKERS] Slow count(*) again...