Re: 7.3.1 New install, large queries are slow

From: "Roman Fail" <rfail(at)posportal(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, <sszabo(at)megazone23(dot)bigpanda(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 7.3.1 New install, large queries are slow
Date: 2003-01-17 05:54:39
Message-ID: 9B1C77393DED0D4B9DAA1AA1742942DA0E4C09@pos_pdc.posportal.com
Lists: pgsql-performance

Tom and Tomasz:
I have to change the datatype of batchdetail.batchid from int4 to int8. After more than four hours, the UPDATE copying data from the old column to the new one still has not completed. After that I still have to build a new index and run VACUUM FULL. Once all that is done I'll re-run the various queries, including the specific small one that Josh requested.
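For anyone following along: 7.3 has no ALTER TABLE ... ALTER COLUMN ... TYPE, so the migration has to be done by hand. A rough sketch of the approach (the index name here is illustrative, not my actual schema):

```sql
-- Migrate batchdetail.batchid from int4 to int8 on 7.3,
-- where column types cannot be changed in place.
BEGIN;
ALTER TABLE batchdetail ADD COLUMN batchid_new int8;
UPDATE batchdetail SET batchid_new = batchid;  -- the multi-hour step on a big table
ALTER TABLE batchdetail DROP COLUMN batchid;
ALTER TABLE batchdetail RENAME COLUMN batchid_new TO batchid;
COMMIT;
CREATE INDEX batchdetail_batchid_idx ON batchdetail (batchid);
VACUUM FULL batchdetail;
```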

Chad Thompson suggested that I add single quotes around the literals in the WHERE clause, which sounded like a great idea based on his experience. Unfortunately, it did not make the query any faster. But read on!
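For reference, a minimal sketch of that trick. The idea is that a quoted literal starts out with unknown type and is coerced to the column's type, while a bare integer literal is typed int4 up front; it presumably didn't help my big query because the mismatches were in the JOIN columns themselves, not in the literals:

```sql
-- Bare literal: parsed as int4, which 7.3 won't match against an int8 index.
SELECT batchdetailid FROM batchdetail WHERE batchdetailid = 27321;

-- Quoted literal: the type is resolved from the column, so the index qualifies.
SELECT batchdetailid FROM batchdetail WHERE batchdetailid = '27321';
```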

For kicks, I tried this simple query, which should happen in an instant. It is the first row in the table.
EXPLAIN ANALYZE select batchdetailid from batchdetail where batchdetailid = 27321;
Seq Scan on batchdetail (cost=0.00..1960485.43 rows=1 width=8) (actual time=17.58..264303.76 rows=1 loops=1)
Filter: (batchdetailid = 27321)
Total runtime: 264303.87 msec
Does it make sense to do a sequential scan when the primary key index is available? Even granting the seq scan, 264 seconds is a pretty horrible time for this hardware.

HOWEVER.....look at this:
EXPLAIN ANALYZE select batchdetailid from batchdetail where batchdetailid = 27321::bigint;
Index Scan using batchdetail_pkey on batchdetail (cost=0.00..4.13 rows=1 width=8) (actual time=0.03..0.03 rows=1 loops=1)
Index Cond: (batchdetailid = 27321::bigint)
Total runtime: 0.07 msec

It sort of feels like a magic moment. I went back through a lot of the JOIN columns and found that I was mixing int4 with int8 in many of them. All of these tables (except batchdetail) were migrated using pgAdminII's migration wizard, so I never looked hard at whether the data types matched up, since the wizard has a nice data map (I used the defaults except for the money type).

Now I think I'm just going to drop the entire database and reload the data from scratch, making sure the data types are mapped exactly right. Correct me if I'm wrong, but int4 only ranges from -2147483648 to +2147483647. All the primary keys for my tables would fit in this range, with the exception of batchdetail, which could conceivably grow beyond 2 billion rows someday (although I'd be archiving a lot of it by the time it got that big). Maybe I shouldn't worry about it for now and should make everything int4 for simplicity.
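A quick sanity check before committing to int4 everywhere might look like the following (the headroom arithmetic assumes the documented int4 upper bound; column name taken from the queries above):

```sql
-- int4 holds -2147483648 .. +2147483647.
-- See how much room the largest key actually has left:
SELECT max(batchdetailid)              AS current_max,
       2147483647 - max(batchdetailid) AS headroom
FROM batchdetail;
```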

I doubt I will accomplish all this on Friday, but I'll give a full report once I get it all reloaded.

> Stephan Szabo wrote:
> Also you have two indexes on batchdetailid right now (primary key
> also creates one) which added to the confusion.

The 7.3.1 docs for CREATE TABLE don't mention anything about automatic index creation for a PRIMARY KEY. I didn't see any PK indexes via pgAdminII, so I read this line from the docs and decided to create them separately.
"Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT NULL"
However, this query proves you are right:
trans=# select relname, relpages, indisunique, indisprimary from pg_class, pg_index
trans-# where indexrelid in (37126739, 8604257) and pg_class.oid = pg_index.indexrelid;
relname | relpages | indisunique | indisprimary
----------------------------------+----------+-------------+--------------
batchdetail_pkey | 121850 | t | t
batchdetail_ix_batchdetailid_idx | 63934 | f | f

All the other pg_class and pg_index columns are identical for these two indexes. So I've gone through and deleted all of the duplicate indexes I had created (followed by a VACUUM FULL). Perhaps an extra sentence in the docs might prevent someone else from making the same mistake I did?
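A query along these lines (assuming 7.3's pg_class/pg_index layout, as in the query above) lists every index on a table, so redundant ones stand out without needing to know the OIDs in advance:

```sql
-- All indexes on batchdetail, with size in pages and uniqueness flags;
-- two entries covering the same column suggest a redundant index.
SELECT c.relname AS index_name, c.relpages, i.indisunique, i.indisprimary
FROM pg_class c, pg_index i
WHERE c.oid = i.indexrelid
  AND i.indrelid = (SELECT oid FROM pg_class WHERE relname = 'batchdetail');
```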

*** Current postgresql.conf settings:
tcpip_socket=true
shared_buffers = 131072
max_fsm_relations = 10000
max_fsm_pages = 2000000
sort_mem = 32768
default_statistics_target = 30
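For context, my reading of what those numbers amount to (assuming 8 KB pages and sort_mem in kilobytes, as the 7.3 docs describe):

```
shared_buffers = 131072        # 131072 x 8 KB pages = 1 GB of shared buffer cache
max_fsm_relations = 10000
max_fsm_pages = 2000000
sort_mem = 32768               # 32 MB -- per sort, per backend, so this can multiply
default_statistics_target = 30
```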

Thanks again for all your help!

Roman Fail
Sr. Web Application Developer
POS Portal, Inc.
