Optimization recommendations request

From: "Joe Conway" <joe(at)conway-family(dot)com>
To: "PostgreSQL-SQL" <pgsql-sql(at)postgresql(dot)org>
Subject: Optimization recommendations request
Date: 2000-12-23 17:07:20
Message-ID: 018601c06d02$d007ace0$0705a8c0@jecw2k1
Lists: pgsql-sql

Hello,

I'm working on an application where I need to design for one table that will grow
to an extremely large size. I'm already planning to partition the data into
multiple tables, and possibly even multiple servers, but even so each table
may need to grow to the 10-15 million tuple range. This table will be used
for keyed lookups, and it is very important that the query return in well
under a second. I've done a small test using a dual Pentium Pro 200 server with 512
MB RAM, running Red Hat Linux 6.1 and a late-November pre-beta snapshot of
PostgreSQL 7.1 (PGOPTS="-D/opt/pgsql/data -i -B 25000"), with a test table
of about 5 million tuples.
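
For concreteness, the kind of manual split I have in mind looks roughly like
the sketch below (the foo_0/foo_1 names and the split-by-leading-hex-digit
scheme are just placeholders, not a settled design):

-- one child table per leading hex digit of guid (foo_0 .. foo_f)
CREATE TABLE foo_0 (guid varchar(40) not null, ks varchar(40) not null);
CREATE TABLE foo_1 (guid varchar(40) not null, ks varchar(40) not null);
-- ... and so on through foo_f ...
CREATE INDEX foo_0_idx1 ON foo_0 (guid);

-- the application would route each lookup by the key's first character, e.g.
SELECT ks FROM foo_f WHERE guid = 'f9ee1930f4010ba34cc5ca6efd27d988eb4f921d';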

Details:

CREATE TABLE foo (
    guid varchar(40) not null,  -- hex-encoded 20-byte SHA-1 is 40 characters
    ks   varchar(40) not null
);

-- insert 5,000,000 rows; both guid and ks are hex-encoded 20-byte SHA-1 hashes

-- tried a btree index first:
-- create index foo_idx1 on foo(guid);
-- then tried a hash index:
create index foo_idx1 on foo using HASH (guid);

SELECT ks FROM foo WHERE guid = 'f9ee1930f4010ba34cc5ca6efd27d988eb4f921d';
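
One thing I plan to verify is whether the planner is actually using the index
at all; my understanding is that something along these lines should show it
(VACUUM ANALYZE to refresh the statistics, then EXPLAIN to see the chosen plan):

-- refresh planner statistics; my understanding is the index may be ignored without this
VACUUM ANALYZE foo;

-- a sequential scan in the plan would go a long way toward explaining the timing
EXPLAIN SELECT ks FROM foo
 WHERE guid = 'f9ee1930f4010ba34cc5ca6efd27d988eb4f921d';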

The query currently takes in excess of 40 seconds. I would appreciate any
suggestions for bringing that down substantially.

Thanks in advance,

Joe Conway
