Skip site navigation (1) Skip section navigation (2)

Re: complex query using postgresql

From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Pau Marc Munoz Torres <paumarc(at)gmail(dot)com>
Cc: "PgSQL General ML" <pgsql-general(at)postgresql(dot)org>
Subject: Re: complex query using postgresql
Date: 2008-04-30 11:29:05
Message-ID: C20ACEDB-B48A-4412-B0B2-3AD1489604CB@solfertje.student.utwente.nl (view raw or flat)
Thread:
Lists: pgsql-general
On Apr 30, 2008, at 11:50 AM, Pau Marc Munoz Torres wrote:

> Hi everybody
>
>   I have de following table where i can perform two different queries:
>
> select * from precalc where idr(p1, p4, p6, p7, p9, 'HLA-DR7')>2;  
> where idr is a function used to create indicies

Are your indices generated by the function in your where clause? It  
is apparently computing something as well (probably its main  
purpose?). I'm not sure I understand what you mean here, as I  
understand it it seems a very odd thing to do...

> and
>
> select * from precalc where p1='S';
>
> Now i would like to perform a query as :
>
> select * from precalc where idr(p1, p4, p6, p7, p9, 'HLA-DR7')>2  
> and p1='S'...;
>
> but i don't know how
>
> any body can help me?

What is the problem with just performing that query? What is it you  
need help with?

Looking at the enormous number of indices you have on that table, I  
can't help to wonder whether your design is right. Updating that  
table is going to be a pain with that many indices and that many  
calculations going on, and choosing the right index to use for a  
query is might be difficult for the query planner, as it's cost-based.

I get the impression that the precalc table stores the input  
parameters for your function and you store the pre-computed values in  
different indices, with the only other variable being that code in  
the last parameter to your function?

Wouldn't your problem be solved mostly by adding a column for those  
codes and create a single index over idr(code, p1, p4, p6, p7, p9) ?  
That would replace most of the indices you have now by one index. If  
you make that code a foreign key to a table containing only the  
possible codes you'd have their values constrained too. I think  
effectively your p1-p9 and that code columns are the primary key of  
your table, not sure what your plans with the id column are (which  
you didn't make a PK btw as I think you intended to).

The query in your original question would probably be turned into a  
bitmap index scan between this new index and your "ip1" index by the  
query planner. That's likely to perform fine, as the number of  
possible values for p1-9 and code are rather finite; at most (127-32) 
^5 * 29 = 224 billion integers, if I counted correctly and assuming  
you're using ASCII characters. Probably significantly less if you  
don't need results for all possible values for p1-p9.

If this is too much, you could partition that table on code,  
effectively turning it into 29 tables constrained on a specific code  
value, each with their own index over (p1, p4, p6, p7, p9, code).

With an approach like this I wonder whether it'll be worthwhile, it  
may actually be faster to compute your function result on the fly  
instead of pre-calculating it when it's input parameters are  
inserted. I do hope the function is defined immutable (it has to  
behave like that, considering your index usage)?

> thanks
>
>
>  Column |     Type     |                      Modifiers
> --------+-------------- 
> +------------------------------------------------------
>  id     | integer      | not null default nextval 
> ('precalc_id_seq'::regclass)
>  p1     | character(1) |
>  p4     | character(1) |
>  p6     | character(1) |
>  p7     | character(1) |
>  p9     | character(1) |
> Indexes:
>     "h2iab" btree (idr(p1, p4, p6, p7, p9, 'H-2*IAb'::character  
> varying))
>     "h2iad" btree (idr(p1, p4, p6, p7, p9, 'H-2*IAd'::character  
> varying))
>     "h2iak" btree (idr(p1, p4, p6, p7, p9, 'H-2*IAk'::character  
> varying))
>     "h2ied" btree (idr(p1, p4, p6, p7, p9, 'H-2*IEd'::character  
> varying))
>     "hladqa10501" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DQA1*0501'::character varying))
>     "hladqb10201" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DQB1*0201'::character varying))
>     "hladr" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR'::character  
> varying))
>     "hladr1" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR1'::character  
> varying))
>     "hladr13" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR13'::character  
> varying))
>     "hladr3" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR3'::character  
> varying))
>     "hladr7" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR7'::character  
> varying))
>     "hladrb10101" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0101'::character varying))
>     "hladrb10102" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0102'::character varying))
>     "hladrb10301" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0301'::character varying))
>     "hladrb10302" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0302'::character varying))
>     "hladrb10401" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0401'::character varying))
>     "hladrb10402" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0402'::character varying))
>     "hladrb10701" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0701'::character varying))
>     "hladrb10802" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0802'::character varying))
>     "hladrb10901" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*0901'::character varying))
>     "hladrb11101" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*1101'::character varying))
>     "hladrb11102" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*1102'::character varying))
>     "hladrb11103" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*1103'::character varying))
>     "hladrb11104" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*1104'::character varying))
>     "hladrb11301" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*1301'::character varying))
>     "hladrb11302" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*1302'::character varying))
>     "hladrb11501" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB1*1501'::character varying))
>     "hladrb40101" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB4*0101'::character varying))
>     "hladrb50101" btree (idr(p1, p4, p6, p7, p9, 'HLA- 
> DRB5*0101'::character varying))
>     "iid" btree (id)
>     "ip1" btree (p1)
>     "ip4" btree (p4)
>     "ip6" btree (p6)
>     "ip7" btree (p7)
>     "ip9" btree (p9)
>
>
> -- 
> Pau Marc Muñoz Torres
>
> Laboratori de Biologia Computacional
> Institut de Biotecnologia i Biomedicina Vicent Villar
> Universitat Autonoma de Barcelona
> E-08193 Bellaterra (Barcelona)
>
> telèfon: 93 5812807
> Email : paumarc(dot)munoz(at)bioinf(dot)uab(dot)cat 

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4818580a927661384610962!



In response to

pgsql-general by date

Next:From: Gurjeet SinghDate: 2008-04-30 11:30:16
Subject: Re: inheritance. more.
Previous:From: Pau Marc Munoz TorresDate: 2008-04-30 09:50:33
Subject: complex query using postgresql

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group