Re: TimeOf(Subselects|Joins)FromLargeTables?

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Hegedus, Tamas (dot)" <Hegedus(dot)Tamas(at)mayo(dot)edu>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: TimeOf(Subselects|Joins)FromLargeTables?
Date: 2004-06-04 01:58:54
Message-ID: D90A5A6C612A39408103E6ECDD77B8299CAB6A@voyager.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

How does this query perform:

SELECT p.name, p.seq
FROM prots p, kwx k
WHERE
p.fid=k.fid
AND
k.kw_acc=812
;

> -----Original Message-----
> From: Hegedus, Tamas . [mailto:Hegedus(dot)Tamas(at)mayo(dot)edu]
> Sent: Thursday, June 03, 2004 6:48 PM
> To: 'pgsql-general(at)postgresql(dot)org'
> Subject: [GENERAL] TimeOf(Subselects|Joins)FromLargeTables?
>
>
> Dear All,
>
> I am a biologist and I do not know what to expect from an RDB
> (PgSQL). I have large tables: 1215607 rows in prots, 2184596
> rows in kwx (see table details below). I would like to do
> something like that:
>
> SELECT name, seq FROM prots WHERE fid in (SELECT fid FROM kwx
> WHERE kw_acc=812);
>
> After executing this (either as a subquery or joins) the
> best/fastest result what I had (SET enable_seqscan=off):
> 83643.482 ms (see EXPLAIN ANALYZE below).
>
> The two (similar) parts of this query are executed much
> faster: SELECT fid FROM kwx WHERE kw_acc=812 -- takes 302ms,
> n(rows)=78050 SELECT name, seq FROM prots WHERE fid < 80000
> -- takes 1969.231 ms
>
> Is this realistic? OK?
> If not: how can I increase the speed by fine tuning of the
> RDB (indexes, run-time parameters) or my SQL query? (It came
> now into my mind: if I decrease the number of columns in the
> prots table (to have only 3 fields (fid, name, seq) instead
> of 20 columns), than the prots table will have smaller file
> size on disk, than this table may need less disk page
> fetches, queries may be faster. Is this true?)
>
> Thanks for your help!
> Tamas
>
> ===============================================
> Table "public.prots"
> Column | Type | Modifiers
> -----------+----------------------+----------
> fid | integer | not null
> name | character varying(10) | not null
> [...other 17 columns...]
> seq | text |
> Indexes:
> "prots_pkey" primary key, btree (fid)
> "ix_prots_acc" unique, btree (acc)
> "ix_prots_name" unique, btree (name)
> "ix_prots_class" btree ("class")
> ===============================================
> Table "public.kwx"
> Column | Type | Modifiers
> --------+--------+----------
> fid | integer |
> kw_acc | integer |
> Indexes:
> "ix_kwx_acc" btree (kw_acc)
> "ix_kwx_fid" btree (fid)
> Foreign-key constraints:
> "fk_kws_acc" FOREIGN KEY (kw_acc) REFERENCES kw_ref(kw_acc)
> "fk_kws_fid" FOREIGN KEY (fid) REFERENCES prots(fid)
> ===============================================
>
> EXPLAIN ANALYZE SELECT name, seq from prots inner join kwx on
> (prots.fid=kwx.fid) where kwx.kw_acc = 812;
>
> QUERY PLAN
>
> --------------------------------------------------------------
> --------------------------------------------------------------
> ------------------
> Merge Join (cost=0.00..160429.66 rows=84473 width=349)
> (actual time=29.039..83505.629 rows=78050 loops=1)
> Merge Cond: ("outer".fid = "inner".fid)
> -> Index Scan using ix_kwx_fid on kwx
> (cost=0.00..44987.55 rows=84473 width=4) (actual
> time=18.893..5730.468 rows=78050 loops=1)
> Filter: (kw_acc = 812)
> -> Index Scan using prots_pkey on prots
> (cost=0.00..112005.24 rows=981127 width=353) (actual
> time=0.083..76059.235 rows=1210377 loops=1) Total runtime:
> 83643.482 ms (6 rows)
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo(at)postgresql(dot)org
>

Browse pgsql-general by date

  From Date Subject
Next Message Marc G. Fournier 2004-06-04 02:54:30 Re: Problem with mailing list or time? Was: Re: Best
Previous Message Hegedus, Tamas . 2004-06-04 01:48:02 TimeOf(Subselects|Joins)FromLargeTables?