Query optimization

From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Query optimization
Date: 2005-09-30 22:24:52
Message-ID: BF633374.F846%sdavis2@mail.nih.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have a couple of very large tables that I am querying on that gives the
following explain analyze output. If someone can help out with my mess,
that would be great.

Thanks,
Sean

explain analyze
select e.*,c.*
from u_all_est_mrna c
join g_rna_acc d on c.accession=d.accession,
(select a.gene_id,
b.db_id,
max(tend-tstart)
from g_rna_acc a
join u_all_est_mrna b on a.accession=b.accession
where gene_id<200
group by a.gene_id,b.db_id) e
where abs(tstart-tend)=e.max
and d.gene_id=e.gene_id
and c.db_id=e.db_id;

QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-----------------------------
Hash Join (cost=1054997.38..2472083.62 rows=2 width=219) (actual
time=122796.024..357269.576 rows=327 loops=1)
Hash Cond: ((("outer".accession)::text = ("inner".accession)::text) AND
(abs(("outer".tstart - "outer".tend)) = "inner".max) AND
(("outer".db_id)::text = ("inner".db_id)::text))
-> Seq Scan on u_all_est_mrna c (cost=0.00..932582.74 rows=24225174
width=179) (actual time=17.384..302484.904 rows=24225174 loops=1)
-> Hash (cost=1054973.98..1054973.98 rows=3119 width=52) (actual
time=11562.968..11562.968 rows=2276 loops=1)
-> Nested Loop (cost=1046393.15..1054973.98 rows=3119 width=52)
(actual time=11546.931..11558.704 rows=2276 loops=1)
-> HashAggregate (cost=1046393.15..1046395.98 rows=567
width=19) (actual time=11546.892..11547.188 rows=276 loops=1)
-> Nested Loop (cost=4.78..1046285.32 rows=14377
width=19) (actual time=0.148..11537.307 rows=1854 loops=1)
-> Index Scan using g_rna_acc_gene_id on
g_rna_acc a (cost=0.00..1049.44 rows=519 width=16) (actual
time=0.026..50.006 rows=1139 loops=1)
Index Cond: (gene_id < 200)
-> Bitmap Heap Scan on u_all_est_mrna b
(cost=4.78..2007.57 rows=510 width=26) (actual time=7.100..10.068 rows=2
loops=1139)
Recheck Cond: (("outer".accession)::text =
(b.accession)::text)
-> Bitmap Index Scan on uaem_accession
(cost=0.00..4.78 rows=510 width=0) (actual time=4.270..4.270 rows=2
loops=1139)
Index Cond:
(("outer".accession)::text = (b.accession)::text)
-> Index Scan using g_rna_acc_gene_id on g_rna_acc d
(cost=0.00..15.04 rows=6 width=16) (actual time=0.010..0.037 rows=8
loops=276)
Index Cond: (d.gene_id = "outer".gene_id)
Total runtime: 357270.873 ms
(16 rows)

\d+ u_all_est_mrna
Table
"public.u_all_est_mrna"
Column | Type |
Modifiers | Description
-----------------+-------------------+--------------------------------------
---------------------------------------+------------------------------------
all_est_mrna_id | integer | not null default
nextval('public.u_all_est_mrna_all_est_mrna_id_seq'::text) |
db_id | character varying |
|
seqtype | character varying |
|
matches | integer |
|
mismatches | integer |
|
repmatches | integer |
|
ncount | integer |
|
qnuminsert | integer |
|
qbaseinsert | integer |
|
tnuminsert | integer |
|
tbaseinsert | integer |
|
strand | character(1) |
|
accession | character varying |
| Genbank Accession without version.
qsize | integer |
|
qstart | integer |
|
qend | integer |
|
chrom | character varying |
| Chromosome, notation like "chr1"
tsize | integer |
|
tstart | integer |
| Blat hit start
tend | integer |
| Blat hit end
blockcount | integer |
|
blocksizes | character varying |
|
qstarts | character varying |
|
tstarts | character varying |
|
Indexes:
"u_all_est_mrna_pkey" PRIMARY KEY, btree (all_est_mrna_id)
"uaem_accession" btree (accession)
"uaem_chrom" btree (chrom)
"uaem_db_id" btree (db_id)
"uaem_seqtype" btree (seqtype)
"uaem_tend_chrom" btree (tend, chrom)
"uaem_tstart_chrom" btree (tstart, chrom)
Has OIDs: yes

=== psql 78 ===
\d+ g_rna_acc
Table "public.g_rna_acc"
Column | Type | Modifiers | Description
--------------+-------------------+-----------+-------------
gene_id | integer | |
accession | character varying | not null |
version | integer | |
accession_gi | integer | |
Indexes:
"g_rna_acc_pkey" PRIMARY KEY, btree (accession)
"g_rna_acc_accession" btree (accession)
"g_rna_acc_gene_id" btree (gene_id)
Foreign-key constraints:
"g_rna_acc_gene_id_fkey" FOREIGN KEY (gene_id) REFERENCES
g_main(gene_id) ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE
Has OIDs: no

Responses

Browse pgsql-general by date

  From Date Subject
Next Message hubert depesz lubaczewski 2005-10-01 10:01:04 Re: database bloat, but vacuums are done, and fsm seems to be setup ok
Previous Message Dawid Kuroczko 2005-09-30 21:46:35 Re: Many-To-Many Bridge Table Index