Re: complex query using postgresql

From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Pau Marc Munoz Torres <paumarc(at)gmail(dot)com>
Cc: "PgSQL General ML" <pgsql-general(at)postgresql(dot)org>
Subject: Re: complex query using postgresql
Date: 2008-04-30 11:29:05
Message-ID: C20ACEDB-B48A-4412-B0B2-3AD1489604CB@solfertje.student.utwente.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Apr 30, 2008, at 11:50 AM, Pau Marc Munoz Torres wrote:

> Hi everybody
>
> I have de following table where i can perform two different queries:
>
> select * from precalc where idr(p1, p4, p6, p7, p9, 'HLA-DR7')>2;
> where idr is a function used to create indicies

Are your indices generated by the function in your where clause? It
is apparently computing something as well (probably its main
purpose?). I'm not sure I understand what you mean here, as I
understand it it seems a very odd thing to do...

> and
>
> select * from precalc where p1='S';
>
> Now i would like to perform a query as :
>
> select * from precalc where idr(p1, p4, p6, p7, p9, 'HLA-DR7')>2
> and p1='S'...;
>
> but i don't know how
>
> any body can help me?

What is the problem with just performing that query? What is it you
need help with?

Looking at the enormous number of indices you have on that table, I
can't help to wonder whether your design is right. Updating that
table is going to be a pain with that many indices and that many
calculations going on, and choosing the right index to use for a
query is might be difficult for the query planner, as it's cost-based.

I get the impression that the precalc table stores the input
parameters for your function and you store the pre-computed values in
different indices, with the only other variable being that code in
the last parameter to your function?

Wouldn't your problem be solved mostly by adding a column for those
codes and create a single index over idr(code, p1, p4, p6, p7, p9) ?
That would replace most of the indices you have now by one index. If
you make that code a foreign key to a table containing only the
possible codes you'd have their values constrained too. I think
effectively your p1-p9 and that code columns are the primary key of
your table, not sure what your plans with the id column are (which
you didn't make a PK btw as I think you intended to).

The query in your original question would probably be turned into a
bitmap index scan between this new index and your "ip1" index by the
query planner. That's likely to perform fine, as the number of
possible values for p1-9 and code are rather finite; at most (127-32)
^5 * 29 = 224 billion integers, if I counted correctly and assuming
you're using ASCII characters. Probably significantly less if you
don't need results for all possible values for p1-p9.

If this is too much, you could partition that table on code,
effectively turning it into 29 tables constrained on a specific code
value, each with their own index over (p1, p4, p6, p7, p9, code).

With an approach like this I wonder whether it'll be worthwhile, it
may actually be faster to compute your function result on the fly
instead of pre-calculating it when it's input parameters are
inserted. I do hope the function is defined immutable (it has to
behave like that, considering your index usage)?

> thanks
>
>
> Column | Type | Modifiers
> --------+--------------
> +------------------------------------------------------
> id | integer | not null default nextval
> ('precalc_id_seq'::regclass)
> p1 | character(1) |
> p4 | character(1) |
> p6 | character(1) |
> p7 | character(1) |
> p9 | character(1) |
> Indexes:
> "h2iab" btree (idr(p1, p4, p6, p7, p9, 'H-2*IAb'::character
> varying))
> "h2iad" btree (idr(p1, p4, p6, p7, p9, 'H-2*IAd'::character
> varying))
> "h2iak" btree (idr(p1, p4, p6, p7, p9, 'H-2*IAk'::character
> varying))
> "h2ied" btree (idr(p1, p4, p6, p7, p9, 'H-2*IEd'::character
> varying))
> "hladqa10501" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DQA1*0501'::character varying))
> "hladqb10201" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DQB1*0201'::character varying))
> "hladr" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR'::character
> varying))
> "hladr1" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR1'::character
> varying))
> "hladr13" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR13'::character
> varying))
> "hladr3" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR3'::character
> varying))
> "hladr7" btree (idr(p1, p4, p6, p7, p9, 'HLA-DR7'::character
> varying))
> "hladrb10101" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0101'::character varying))
> "hladrb10102" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0102'::character varying))
> "hladrb10301" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0301'::character varying))
> "hladrb10302" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0302'::character varying))
> "hladrb10401" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0401'::character varying))
> "hladrb10402" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0402'::character varying))
> "hladrb10701" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0701'::character varying))
> "hladrb10802" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0802'::character varying))
> "hladrb10901" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*0901'::character varying))
> "hladrb11101" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*1101'::character varying))
> "hladrb11102" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*1102'::character varying))
> "hladrb11103" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*1103'::character varying))
> "hladrb11104" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*1104'::character varying))
> "hladrb11301" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*1301'::character varying))
> "hladrb11302" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*1302'::character varying))
> "hladrb11501" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB1*1501'::character varying))
> "hladrb40101" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB4*0101'::character varying))
> "hladrb50101" btree (idr(p1, p4, p6, p7, p9, 'HLA-
> DRB5*0101'::character varying))
> "iid" btree (id)
> "ip1" btree (p1)
> "ip4" btree (p4)
> "ip6" btree (p6)
> "ip7" btree (p7)
> "ip9" btree (p9)
>
>
> --
> Pau Marc Muñoz Torres
>
> Laboratori de Biologia Computacional
> Institut de Biotecnologia i Biomedicina Vicent Villar
> Universitat Autonoma de Barcelona
> E-08193 Bellaterra (Barcelona)
>
> telèfon: 93 5812807
> Email : paumarc(dot)munoz(at)bioinf(dot)uab(dot)cat

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

!DSPAM:737,4818580a927661384610962!

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Gurjeet Singh 2008-04-30 11:30:16 Re: inheritance. more.
Previous Message Pau Marc Munoz Torres 2008-04-30 09:50:33 complex query using postgresql