Skip site navigation (1) Skip section navigation (2)

Re: Slow queries on big table

From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: "Tyrrill, Ed" <tyrrill_ed(at)emc(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow queries on big table
Date: 2007-05-18 19:36:22
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
Tyrrill, Ed wrote:
> I have a two column table with over 160 million rows in it.  As the size
> of the table grows queries on this table get exponentially slower.  I am
> using version 8.1.5 32-bit on Red Hat Enterprise Linux 3.  The hardware
> is an Intel 3 Ghz Xeon with 4GB RAM, and 6 disks in a RAID 5
> configuration.  For current testing I am running a single database
> connection with no other applications running on the machine, and the
> swap is not being used at all.
> Here is the table definition:
> mdsdb=# \d backup_location
>  Table "public.backup_location"
>   Column   |  Type   | Modifiers
> -----------+---------+-----------
>  record_id | bigint  | not null
>  backup_id | integer | not null
> Indexes:
>     "backup_location_pkey" PRIMARY KEY, btree (record_id, backup_id)
>     "backup_location_rid" btree (record_id)
> Foreign-key constraints:
>     "backup_location_bfk" FOREIGN KEY (backup_id) REFERENCES
> backups(backup_id) ON DELETE CASCADE
> Here is the table size:
> mdsdb=# select count(*) from backup_location;
>    count
> -----------
>  162101296
> (1 row)
> And here is a simple query on this table that takes nearly 20 minutes to
> return less then 3000 rows.  I ran an analyze immediately before I ran
> this query:
> mdsdb=# explain analyze select record_id from backup_location where
> backup_id = 1070;
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -------------
>  Index Scan using backup_location_pkey on backup_location
> (cost=0.00..1475268.53 rows=412394 width=8) (actual
> time=3318.057..1196723.915 rows=2752 loops=1)
>    Index Cond: (backup_id = 1070)
>  Total runtime: 1196725.617 ms
> (3 rows)
I've got a few points.  Firstly, is your data amenable to partitioning?  
If so that might be a big winner.
Secondly, it might be more efficient for the planner to choose the 
backup_location_rid index than the combination primary key index.  You 
can test this theory with this cool pg trick:

alter table backup_location drop constraint backup_location_pkey;
explain analyze select ....

to see if it's faster.

> Obviously at this point the application is not usable.  If possible we
> would like to grow this table to the 3-5 billion row range, but I don't
> know if that is realistic.
> Any guidance would be greatly appreciated.

Without knowing more about your usage patterns, it's hard to say.  But 
partitioning seems like your best choice at the moment.

In response to


pgsql-performance by date

Next:From: Mark HarrisDate: 2007-05-18 19:37:00
Subject: Re: reading large BYTEA type is slower than expected
Previous:From: Ron MayerDate: 2007-05-18 19:21:39
Subject: Re: Background vacuum

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group