Re: Query performance

From: "Christian Rengstl" <Christian(dot)Rengstl(at)klinik(dot)uni-regensburg(dot)de>
To: "Hakan Kocaman" <Hakan(dot)Kocaman(at)digame(dot)de>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query performance
Date: 2006-08-03 08:47:58
Message-ID: 44D1D45E.0AD0.0080.0@klinik.uni-regensburg.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

the complete query is the one i posted, but here comes the schema for mytable:
entry_no int8 NOT NULL DEFAULT nextval('entry_no_seq''::regclass),
pid varchar(15) NOT NULL,
crit varchar(13) NOT NULL,
val1 varchar(1),
val2 varchar(1),
aendat text,
aennam varchar(8),
CONSTRAINT "PK_ENTRY" PRIMARY KEY (entry_no)

myCritTable:
crit varchar(13) NOT NULL,
chr int2,
aendat timestamp,
CONSTRAINT pk_crit_master PRIMARY KEY (crit)

My server is 8.1.4. As a matter of fact, i have no idea where the text type comes from, because as you can see from above there are only varchar with maximum 15 characters.

"Hakan Kocaman" <Hakan(dot)Kocaman(at)digame(dot)de> wrote on 08/03/06 10:34 am:
> Hi,
>
> can you post the complete query,schema- and
> table-definition,server-version etc. ?
> This will help to identity the main problem.
>
> So at the moment i'm just guessing:
>
> Hash Cond: (("outer".snp_id)::text = ("inner".snp_id)::text)
> -> Bitmap Heap Scan on test2 (cost=232.92..132766.66 rows=37120
> width=23)
> (actual time=291.600..356707.737 rows=37539 loops=1)
> This part is very expensive, but i got no clue why.
> Maybe the text-type is not so ideal.
>
> Best regards
>
> Hakan Kocaman
> Software-Development
>
> digame.de GmbH
> Richard-Byrd-Str. 4-8
> 50829 Köln
>
> Tel.: +49 (0) 221 59 68 88 31
> Fax: +49 (0) 221 59 68 88 98
> Email: hakan(dot)kocaman(at)digame(dot)de
>
>
>
>> -----Original Message-----
>> From: pgsql-general-owner(at)postgresql(dot)org
>> [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of
>> Christian Rengstl
>> Sent: Thursday, August 03, 2006 10:13 AM
>> To: pgsql-general(at)postgresql(dot)org
>> Subject: [GENERAL] Query performance
>>
>>
>> Hi everyone,
>>
>> i have a table with around 57 million tuples, with the
>> following columns: pid(varchar), crit(varchar),
>> val1(varchar), val2(varchar). Example:
>> pid crit val1 val2
>> p1 c1 x y
>> p1 c2 x z
>> p1 c3 y x
>> ...
>> What i am doing is to query all val1 and val2 for one pid and
>> all crit values:
>>
>> select val1, val2, crit from mytable where pid='somepid' and
>> crit in(select crit from myCritTable);
>> where myCritTable is a table that contains all crit values
>> (around 42.000) ordered by their insertion date.
>>
>>
>> QUERY PLAN
>>
>> --------------------------------------------------------------
>> ------------------
>> ----------------------------------------------------------
>> Hash IN Join (cost=1033.67..134959.41 rows=37120 width=23)
>> (actual time=357.11
>> 6..356984.535 rows=37539 loops=1)
>> Hash Cond: (("outer".snp_id)::text = ("inner".snp_id)::text)
>> -> Bitmap Heap Scan on test2 (cost=232.92..132766.66
>> rows=37120 width=23) (
>> actual time=291.600..356707.737 rows=37539 loops=1)
>> Recheck Cond: ((pid)::text = '1'::text)
>> -> Bitmap Index Scan on idx_test2_pid
>> (cost=0.00..232.92 rows=37120 w
>> idth=0) (actual time=234.516..234.516 rows=37539 loops=1)
>> Index Cond: ((pid)::text = '1'::text)
>> -> Hash (cost=700.20..700.20 rows=40220 width=13)
>> (actual time=65.055..65.0
>> 55 rows=40220 loops=1)
>> -> Seq Scan on snps_test (cost=0.00..700.20
>> rows=40220 width=13) (act
>> ual time=0.020..30.131 rows=40220 loops=1)
>> Total runtime: 357017.259 ms
>>
>> Unfortunately the query takes pretty long for the big table,
>> so maybe one of you has a suggestion on how to make it faster.
>>
>> --
>> Christian Rengstl M.A.
>> Klinik und Poliklinik für Innere Medizin II
>> Kardiologie - Forschung
>> Universitätsklinikum Regensburg
>> B3 1.388
>> Franz-Josef-Strauss-Allee 11
>> 93053 Regensburg
>> Tel.: +49-941-944-7230
>>
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 4: Have you searched our list archives?
>>
>> http://archives.postgresql.org
>>

--
Christian Rengstl M.A.
Klinik und Poliklinik für Innere Medizin II
Kardiologie - Forschung
Universitätsklinikum Regensburg
B3 1.388
Franz-Josef-Strauss-Allee 11
93053 Regensburg
Tel.: +49-941-944-7230

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2006-08-03 09:00:04 Re: Query performance
Previous Message Richard Huxton 2006-08-03 08:45:41 Re: Tape backup, 2 versions, same database name, which