
Why should such a simple query over indexed columns be so slow?

From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Why should such a simple query over indexed columns be so slow?
Date: 2012-01-30 19:13:08
Lists: pgsql-performance

So, here's the query:

SELECT private, COUNT(block_id) FROM blocks WHERE created > 'yesterday' AND
shared IS FALSE GROUP BY private

What confuses me is that although this is a largish table (millions of rows)
with constant writes, the query only touches indexed columns of types
timestamp and boolean, so I would expect it to be very fast. The clause
created > 'yesterday' is there mostly to speed it up, but apparently it
doesn't help much.
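
To make the time window concrete: PostgreSQL reads the special input string 'yesterday' as midnight at the start of the previous day, which matches the '2012-01-29 00:00:00+00' cutoff in the plan further down. The sketch below reproduces that arithmetic in Python. The helper name pg_yesterday is hypothetical, and it assumes the query ran at about the time this message was sent and that the session time zone is UTC; neither is stated in the post.

```python
from datetime import datetime, timedelta, timezone


def pg_yesterday(now: datetime) -> datetime:
    """Midnight at the start of the previous day, in now's time zone.

    Hypothetical helper that mirrors how PostgreSQL reads 'yesterday'.
    """
    midnight_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today - timedelta(days=1)


# Assumption: the query ran around the message timestamp, taken as UTC.
sent = datetime(2012, 1, 30, 19, 13, 8, tzinfo=timezone.utc)
cutoff = pg_yesterday(sent)
window_hours = (sent - cutoff).total_seconds() / 3600

print(cutoff.isoformat())      # 2012-01-29T00:00:00+00:00
print(round(window_hours, 1))  # 43.2
```

Under those assumptions the filter covers roughly the last 43 hours of inserts rather than the last 24.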

Here's the *Full Table and Index Schema*:

CREATE TABLE blocks (
  block_id character(24) NOT NULL,
  user_id character(24) NOT NULL,
  created timestamp with time zone,
  locale character varying,
  shared boolean,
  private boolean,
  moment_type character varying NOT NULL,
  user_agent character varying,
  inserted timestamp without time zone NOT NULL DEFAULT now(),
  networks character varying[],
  lnglat point,
  CONSTRAINT blocks_pkey PRIMARY KEY (block_id)
);


CREATE INDEX blocks_created_idx
  ON blocks
  USING btree
  (created  DESC NULLS LAST);

CREATE INDEX blocks_lnglat_idx
  ON blocks
  USING gist
  (lnglat );

CREATE INDEX blocks_networks_idx
  ON blocks
  USING btree
  (networks );

CREATE INDEX blocks_private_idx
  ON blocks
  USING btree
  (private );

CREATE INDEX blocks_shared_idx
  ON blocks
  USING btree
  (shared );

Here are the results from *EXPLAIN ANALYZE*:

HashAggregate  (cost=156619.01..156619.02 rows=2 width=26) (actual time=43131.154..43131.156 rows=2 loops=1)
  ->  Seq Scan on blocks  (cost=0.00..156146.14 rows=472871 width=26) (actual time=274.881..42124.505 rows=562888 loops=1)
        Filter: ((shared IS FALSE) AND (created > '2012-01-29 00:00:00+00'::timestamp with time zone))
Total runtime: 43131.221 ms
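
As a rough reading of the plan above, the short Python sketch below works through its figures. It uses only numbers copied from the EXPLAIN ANALYZE output; the variable names are illustrative. It assumes the second "actual time" value on a node is the elapsed time, in milliseconds, at which that node returned its last row, which is how EXPLAIN ANALYZE reports it.

```python
# Figures copied from the EXPLAIN ANALYZE output in this message.
estimated_rows = 472871        # planner's row estimate for the Seq Scan
actual_rows = 562888           # rows the Seq Scan actually returned
scan_end_ms = 42124.505        # elapsed time when the Seq Scan finished
total_runtime_ms = 43131.221   # total runtime reported by EXPLAIN ANALYZE

estimate_ratio = actual_rows / estimated_rows
scan_share = scan_end_ms / total_runtime_ms
rows_per_ms = actual_rows / scan_end_ms

print(f"actual/estimated rows: {estimate_ratio:.2f}")
print(f"share of runtime spent in the scan: {scan_share:.1%}")
print(f"matching rows returned per ms: {rows_per_ms:.1f}")
```

On these figures the row estimate is within about 20% of the actual count, and the sequential scan accounts for nearly all of the 43 seconds; the aggregation on top of it adds about one second.
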
I'm using *Postgres version:* 9.0.5 (courtesy of Heroku)

As for *History:* I've only recently started using this query, so there
really isn't any.

As for *Hardware*: I'm using Heroku's "Ronin" setup which involves 1.7 GB
Cache. Beyond that I don't really know.

As for *Maintenance Setup*: I let Heroku handle that, so, again, I don't
really know. FWIW, vacuuming should not be an issue (as I understand it)
since I don't do any updates or deletions. It's pretty much all inserts
and selects.

As for *WAL Configuration*: I'm afraid I don't even know what that is. The
query is normally run from a Python web server. The EXPLAIN above was run
using pgAdmin3, though I doubt that's relevant.

As for *GUC Settings*: Again, I don't know what this is. Whatever Heroku
defaults to is what I'm using.

Thank you in advance!
-Alessandro Gagliardi

