Select max(foo) and select count(*) optimization

From: John Siracusa <siracusa(at)mindspring(dot)com>
To: Postgres Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Select max(foo) and select count(*) optimization
Date: 2004-01-05 19:03:12
Message-ID: BC1F1F20.71370%siracusa@mindspring.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Speaking of special cases (well, I was on the admin list) there are two
kinds that would really benefit from some attention.

1. The query "select max(foo) from bar" where the column foo has an index.
Aren't indexes ordered? If not, an "ordered index" would be useful in this
situation so that this query, rather than doing a sequential scan of the
whole table, would just "ask the index" for the max value and return nearly
instantly.

2. The query "select count(*) from bar" Surely the total number of rows in
a table is kept somewhere convenient. If not, it would be nice if it could
be :) Again, rather than doing a sequential scan of the entire table, this
type of query could return instantly.

I believe MySQL does both of these optimizations (which are probably a lot
easier in that product, given its data storage system). These were the
first areas where I noticed a big performance difference between MySQL and
Postgres.

Especially with very large tables, hearing the disks grind as Postgres scans
every single row in order to determine the number of rows in a table or the
max value of a column (even a primary key created from a sequence) is pretty
painful. If the implementation is not too horrendous, this is an area where
an orders-of-magnitude performance increase can be had.

-John

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2004-01-05 19:05:48 Re: optimizing Postgres queries
Previous Message David Teran 2004-01-05 19:02:01 Re: optimizing Postgres queries