optimizing query

From: Chantal Ackermann <chantal(dot)ackermann(at)biomax(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: optimizing query
Date: 2003-01-22 10:30:32
Message-ID: 3E2E72C8.2080703@biomax.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

hello all,

I am getting the following output from EXPLAIN, concerning a query with
joins. The merge uses index scans but takes too long, in my opinion. The
query is in fact only a part (subquery) of another one, but it is the
bottle neck.

As I am quite ignorant in optimizing queries, and I have no idea where
to find documentation on the net on how to learn optimizing my queries,
I am posting this here in hope someone will give me either tips how to
optimize, or where to find some tutorial that could help me get along on
my own.

dropping the "DISTINCT" has some effect, but I can't really do without.

Thank you
Chantal

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

relate=# explain SELECT DISTINCT gene.gene_name,
gene_occurrences_puid.puid FROM disease, gene, disease_occurrences_puid,
gene_occurrences_puid WHERE
disease_occurrences_puid.puid=gene_occurrences_puid.puid AND
disease.disease_id=disease_occurrences_puid.disease_id AND
gene.gene_id=gene_occurrences_puid.gene_id;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=426503.59..436839.47 rows=137812 width=41)
-> Sort (cost=426503.59..429948.88 rows=1378118 width=41)
Sort Key: gene.gene_name, gene_occurrences_puid.puid
-> Hash Join (cost=67813.96..162375.07 rows=1378118 width=41)
Hash Cond: ("outer".disease_id = "inner".disease_id)
-> Merge Join (cost=63671.50..98237.36 rows=1378118
width=37)
Merge Cond: ("outer".puid = "inner".puid)
-> Index Scan using disease_occpd_puid_i on
disease_occurrences_puid (cost=0.00..14538.05 rows=471915 width=8)
-> Sort (cost=63671.50..64519.87 rows=339347
width=29)
Sort Key: gene_occurrences_puid.puid
-> Merge Join (cost=0.00..22828.18
rows=339347 width=29)
Merge Cond: ("outer".gene_id =
"inner".gene_id)
-> Index Scan using gene_pkey on gene
(cost=0.00..7668.59 rows=218085 width=21)
-> Index Scan using gene_id_puid_uni
on gene_occurrences_puid (cost=0.00..9525.57 rows=339347 width=8)
-> Hash (cost=3167.97..3167.97 rows=164597 width=4)
-> Seq Scan on disease (cost=0.00..3167.97
rows=164597 width=4)
(16 rows)

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ian Barwick 2003-01-22 10:36:45 Re: DBD::Pg & DBD::PgPP Cpan question
Previous Message Justin Clift 2003-01-22 08:21:54 Re: FW: ERROR: Relation "pg_user" does not exist

Browse pgsql-performance by date

  From Date Subject
Next Message Andrew Sullivan 2003-01-22 12:05:24 Re: Postgres 7.3.1 poor insert/update/search performance
Previous Message Seth Robertson 2003-01-22 07:19:45 Re: Postgres 7.3.1 poor insert/update/search performance