Skip site navigation (1) Skip section navigation (2)

Re: Slow count(*) again...

From: Reid Thompson <reid(dot)thompson(at)ateb(dot)com>
To: reid(dot)thompson(at)ateb(dot)com
Cc: Neil Whelchel <neil(dot)whelchel(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow count(*) again...
Date: 2010-10-10 18:33:04
Message-ID: 4CB206E0.6080707@ateb.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
  On 10/10/2010 11:02 AM, Reid Thompson wrote:
>>>> On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel<neil(dot)whelchel(at)gmail(dot)com>
>>>>
>> On the other hand, I copied a table out of one of my production servers that
>> has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
>> text). The first numeric column has numbers evenly spread between 0 and 100
>> and it is indexed. I put the table in a pair of database servers both running
>> on the same physical hardware. One server is Postgres, the other is a popular
>> server (I am not mentioning names here). on Postgres: SELECT count(*) FROM
>> table where column>50; takes about 8 seconds to run. The other database server
>> took less than one second (about 25 ms) as it is using the index (I assume) to
>> come up with the results. It is true that this is not a fair test because both
>> servers were tested with their default settings, and the defaults for Postgres
>> are much more conservative, however, I don't think that any amount of settings
>> tweaking will bring them even in the same ball park. There has been discussion
>> about the other server returning an incorrect count because all of the indexed
>> rows may not be live at the time. This is not a problem for the intended use,
>> that is why I suggested another function like estimate(*). It's name suggests
>> that the result will be close, not 100% correct, which is plenty good enough
>> for generating a list of results pages in most cases. I am faced with a very
>> serious problem here. If the query to make a list of pages takes say 6 seconds
>> and it takes another 6 seconds to generate a page of results, the customer is
>> waiting 12 seconds. This is not going to work. If count made a quick estimate,
>> say less than a second, and it took 6 seconds to come up with the actual
>> results, I could live with that. Or if coming up with the window of results
>> via (OFFSET and LIMIT) and returned the total number of rows that would have
>> matched the query, then I would still have everything I need to render the
>> page in a reasonable time. I really think that this needs to be addressed
>> somewhere. It's not like I am the only one that does this. You see it nearly
>> everywhere a long list of results is (expected to be) returned in a web site.
>> Among the people I work with, this seems to be the most mentioned reason that
>> they claim that they don't use Postgres for their projects. t anyway.
>
> How big is your DB?
> How fast is your disk access?
> Any chance disks/RAM can be addressed?
>
> My disk access is pitiful...
> first run, 2.3 million rows.. 0m35.38s, subsequent runs.. real    0m2.55s
>
> rthompso(at)hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
>   count
> ---------
>  2340704
> (1 row)
>
>
> real    0m35.38s
> user    0m0.25s
> sys     0m0.03s
>
> subsequent runs.... (count changes due to inserts.)
>
> rthompso(at)hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
>   count
> ---------
>  2363707
> (1 row)
>
>
> real    0m2.70s
> user    0m0.27s
> sys     0m0.02s
> rthompso(at)hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
>   count
> ---------
>  2363707
> (1 row)
>
>
> real    0m2.55s
> user    0m0.26s
> sys     0m0.02s
> rthompso(at)hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting
>   count
> ---------
>  2363707
> (1 row)
>
>
> real    0m2.50s
> user    0m0.26s
> sys     0m0.02s
>
> reporting=# SELECT pg_size_pretty(pg_total_relation_size('my_production_table'));
>  pg_size_pretty
> ----------------
>  1890 MB
> (1 row)
>
>
forgot to note, my table schema is significantly larger.

rthompso(at)hw-prod-repdb1> time psql -c "\d my_production_table_201010" reporting
                                               Table "public.my_production_table_201010"
            Column            |            Type             |                           Modifiers
-----------------------------+-----------------------------+----------------------------------------------------------------
                              | integer                     | not null default 
nextval('my_production_table_parent_id_seq'::regclass)
                              | character varying(20)       |
                              | character(1)                |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(20)       |
                              | character varying(5)        |
                              | character varying(5)        |
                              | date                        |
                              | character(1)                |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(2)        |
                              | character varying(10)       |
                              | character varying(10)       |
                              | character varying(32)       |
                              | character varying(7)        |
                              | character varying(10)       |
                              | character varying(2)        |
                              | character varying(9)        |
                              | character varying(9)        |
                              | character varying(9)        |
                              | character varying(10)       |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(20)       |
                              | character varying(5)        |
                              | character varying(5)        |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(2)        |
                              | character varying(10)       |
                              | character varying(10)       |
                              | character varying(10)       |
                              | character varying(10)       |
                              | integer                     |
                              | character varying(2)        |
                              | character varying(32)       |
                              | character varying(32)       |
                              | integer                     |
                              | integer                     |
                              | text                        |
                              | character varying(3)        |
                              | date                        |
                              | date                        |
                              | date                        |
                              | integer                     |
                              | integer                     |
                              | integer                     |
                              | integer                     |
                              | character varying(6)        |
                              | character varying(10)       |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(32)       |
                              | character varying(10)       |
                              | character varying(6)        |
                              | character varying(8)        |
                              | boolean                     |
                              | character(1)                |
                              | date                        |
                              | integer                     |
                              | date                        |
                              | character varying(11)       |
                              | character varying(4)        |
                              | character(1)                |
                              | date                        |
                              | character varying(5)        |
                              | character varying(20)       |
                              | date                        |
                              | character(1)                |
                              | character(1)                |
                              | character varying(2)        |
                              | text                        |
                              | integer                     |
                              | integer                     |
                              | timestamp without time zone | default now()
                              | timestamp without time zone |
                              | character varying(64)       |
                              | character varying(64)       |
                              | character varying(64)       |
Indexes:
     "my_production_table_201010_pkey" PRIMARY KEY, btree (id)
     "my_production_table_201010_date_idx" btree (xxxxdate), tablespace "indexspace"
     "my_production_table_201010_epatient_idx" btree (storeid, xxxxxxxxxxxxx), tablespace "indexspace"
     "my_production_table_201010_medicationname_idx" btree (xxxxxxxxxxxxxx), tablespace "indexspace"
     "my_production_table_201010_ndc_idx" btree (xxx), tablespace "indexspace"
Check constraints:
     "my_production_table_201010_filldate_check" CHECK (xxxxdate >= '2010-10-01'::date AND xxxxdate < 
'2010-11-01'::date)
Foreign-key constraints:
     "my_production_table_201010_pkgfileid_fkey" FOREIGN KEY (pkgfileid) REFERENCES my_production_tablefiles(id)
Inherits: my_production_table_parent



In response to

pgsql-performance by date

Next:From: Neil WhelchelDate: 2010-10-10 21:59:48
Subject: Re: Slow count(*) again...
Previous:From: Mladen GogalaDate: 2010-10-10 17:14:22
Subject: Re: Slow count(*) again...

pgsql-hackers by date

Next:From: Martijn van OosterhoutDate: 2010-10-10 20:23:31
Subject: Re: Which file does the SELECT?
Previous:From: Tom LaneDate: 2010-10-10 18:06:46
Subject: Re: WIP: Triggers on VIEWs

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group