| From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*) |
| Date: | 2025-10-25 01:39:39 |
| Message-ID: | CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com |
| Lists: | pgsql-hackers |
Since e2debb643, we've had the ability to determine if a column is
NULLable as early as during constant folding. This seems like a good
time to consider converting COUNT(not_null_col) into COUNT(*), which
is faster and may result in far fewer columns being deformed from the
tuple.
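The equivalence this relies on is that count(col) skips NULLs while count(*) counts every row, so the two can only differ when the column contains NULLs. Below is a minimal sketch of that, assuming Python's sqlite3 module as a stand-in so it runs without a server; the two-column table and the sample rows are made up for the illustration, and it says nothing about how either form is executed in PostgreSQL.

```python
import sqlite3

# In-memory SQLite database, used only as a self-contained stand-in.
# count(col) ignoring NULLs is standard SQL behaviour, not PostgreSQL-specific.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (a int, h int not null)")

# Column "a" is nullable and contains NULLs; column "h" is declared NOT NULL.
conn.executemany(
    "insert into t (a, h) values (?, ?)",
    [(None, 1), (5, 1), (None, 1)],
)

count_a, count_h, count_star = conn.execute(
    "select count(a), count(h), count(*) from t"
).fetchone()

# count(a) skips the two NULLs; count(h) must always match count(*).
print(count_a, count_h, count_star)
```

Only the NOT NULL column is guaranteed to agree with count(*), which is why the planner needs to know about nullability before it can make the swap.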
To make this work, I invented "SupportRequestSimplifyAggref". It is
similar to the existing SupportRequestSimplify, but that one only
handles FuncExprs. Aggregates use Aggrefs, so we need something else.
It's easy to see that count(*) is faster. Here's a quick test in an
unpatched master:
create table t (a int, b int, c int, d int, e int, f int, g int,
                h int not null);
insert into t (h) select 1 from generate_series(1,1000000);
vacuum freeze t;
master:
select count(h) from t;
Time: 16.442 ms
Time: 16.255 ms
Time: 16.322 ms
master:
select count(*) from t;
Time: 12.203 ms
Time: 11.402 ms
Time: 12.054 ms (about 37% faster than count(h) on average)
With the patch applied, both queries perform the same.
It may be possible to apply transformations to other aggregate
functions too, but I don't want to discuss that here. I mostly want to
determine if the infrastructure is ok and do the count(*) one because
it seems like the most likely one to be useful.
One thing I wasn't too sure about was whether we should make it
possible for the support function to return something that's not an
Aggref. In theory, something like COUNT(NULL) could just return
'0'::bigint. While that seems like an optimisation that wouldn't be
applied very often, I have opted to leave it so that such an
optimisation *could* be done by the support function. I also happen to
test that doing so doesn't entirely break the query, as it ordinarily
would if we didn't have Query.hasAggs. (It's not too dissimilar to
removing unused columns from a subquery.)
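As a rough illustration of the rewrite rule discussed above, here is a toy model in Python. Every name in it (Column, Const, CountAgg, simplify_count) is hypothetical and does not correspond to PostgreSQL's C structures or to the patch itself. Skipping DISTINCT is a guard added for the sketch, since count(DISTINCT col) is not equivalent to count(*), and the COUNT(NULL) branch only shows what a non-Aggref result *could* look like.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Column:
    """Hypothetical column reference with its NOT NULL status."""
    name: str
    not_null: bool


@dataclass(frozen=True)
class Const:
    """Hypothetical constant; value=None stands for SQL NULL."""
    value: Optional[int]


@dataclass(frozen=True)
class CountAgg:
    """Hypothetical count aggregate; arg=None stands for count(*)."""
    arg: Union[Column, Const, None]
    distinct: bool = False


def simplify_count(agg: CountAgg) -> Union[CountAgg, Const]:
    """Return a cheaper equivalent of agg, or agg itself if none applies."""
    arg = agg.arg
    # Already count(*), or DISTINCT is involved: leave it alone.
    if arg is None or agg.distinct:
        return agg
    if isinstance(arg, Const):
        if arg.value is None:
            # count(NULL) is always 0: the result is no longer an aggregate.
            return Const(0)
        # count(1) and friends count every row, like count(*).
        return CountAgg(arg=None)
    if arg.not_null:
        # A NOT NULL column can never be skipped, so count(*) is equivalent.
        return CountAgg(arg=None)
    # A nullable column must still be inspected row by row.
    return agg


print(simplify_count(CountAgg(Column("h", not_null=True))))
print(simplify_count(CountAgg(Const(None))))
```

The second call is the case in question: the returned node is a plain constant, so whatever consumes the result has to cope with the aggregate having disappeared from the expression.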
Should we do this?
David
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patch | application/octet-stream | 22.8 KB |