Re: Slow count(*) again...

From: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow count(*) again...
Date: 2010-10-13 10:54:19
Message-ID: 4CB58FDB.3080505@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

12.10.10 21:58, Tom Lane написав(ла):
>
> I'm less than convinced that that approach will result in a significant
> win. It's certainly not going to do anything to convert COUNT(*) into
> an O(1) operation, which frankly is what the complainants are expecting.
> There's basically no hope of solving the "PR problem" without somehow
> turning COUNT(*) into a materialized-view reference. We've discussed
> that in the past, and know how to do it in principle, but the complexity
> and distributed overhead are daunting.
>
>
I've though about "aggregate" indexes, something like
create index index_name on table_name(count(*) group by column1, column2);
OR
create index index_name on table_name(count(*));
for table-wide count

To make it usable one would need:
1) Allow third aggregate function SMERGE that can merge one aggregate
state to another
2) The index should be regular index (e.g. btree) on column1, column2
that for each pair has page list to which it's data may belong (in
past/current running transactions), and aggregate state for each page
that were frozen previously
When index is used, it can use precalculated values for "pages with all
tuples vacuumed" (I suspect this is information from visibility map) and
should do regular calculation for all non-frozen pages with visibility
checks and everything what's needed.
When vacuum processes the page, it should (in sync or async way)
calculate aggregate values for the page.

IMHO Such an indexes would make materialized views/triggers/high level
caches unneeded in most cases.

Best regards, Vitalii Tymchyshyn

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2010-10-13 10:56:06 Re: wip: functions median and percentile
Previous Message Pavel Stehule 2010-10-13 10:51:51 Re: wip: functions median and percentile

Browse pgsql-performance by date

  From Date Subject
Next Message Craig Ringer 2010-10-13 11:42:00 Re: Slow count(*) again...
Previous Message Vitalii Tymchyshyn 2010-10-13 10:41:52 Re: Slow count(*) again...