Re: Query take 101 minutes, help, please

From: Meetesh Karia <meetesh(dot)karia(at)gmail(dot)com>
To: Christian Compagnon <ccompagnon(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query take 101 minutes, help, please
Date: 2005-09-07 17:09:39
Message-ID: fc5b04ca0509071009388d72f1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

PG is creating the union of January, February and March tables first and
that doesn't have an index on it. If you're going to do many queries using
the union of those three tables, you might want to place their contents into
one table and create an index on it.

Otherwise, try something like this:

SELECT "Rut Cliente"
FROM "Internet_Abril"
WHERE "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
"Internet_Enero")
AND "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
"Internet_Febrero")
AND "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
"Internet_Marzo");

You could also compare the performance of that to this and choose the one
that works the best:

SELECT "Rut Cliente"
FROM "Internet_Abril" a
LEFT JOIN "Internet_Enero" e ON a."Rut Cliente" = e."Rut Cliente"
LEFT JOIN "Internet_Febrero" f ON a."Rut Cliente" = f."Rut Cliente"
LEFT JOIN "Internet_Marzo" m ON a."Rut Cliente" = m."Rut Cliente"
WHERE e."Rut Cliente" IS NULL AND f."Rut Cliente" IS NULL and m."Rut
Cliente" IS NULL;

Meetesh

On 9/7/05, Christian Compagnon <ccompagnon(at)gmail(dot)com> wrote:
>
> Hello,
>
> I'm a newbie in postgresql, I've installed it on a Windows XP machine
> ( I can't use linux, it's a company machine ), I'm courious why this
> query takes so long
>
> SELECT "Rut Cliente"
> FROM "Internet_Abril"
> WHERE "Rut Cliente" NOT IN ((SELECT "Rut Cliente" FROM
> "Internet_Enero") UNION (SELECT "Rut Cliente" FROM
> "Internet_Febrero") UNION (SELECT "Rut Cliente" FROM
> "Internet_Marzo"));
>
> it takes about 100 minutes to complete the query.
> All tables has index created ( Rut Cliente is a VarChar ), and tables
> has 50.000 records each.
>
> The explain for the query tells the following
>
> "QUERY PLAN
> Seq Scan on "Internet_Abril" (cost=19406.67..62126112.70 rows=24731
> width=13)
> Filter: (NOT (subplan))
> SubPlan
> -> Materialize (cost=19406.67..21576.07 rows=136740 width=13)
> -> Unique (cost=17784.23..18467.93 rows=136740 width=13)
> -> Sort (cost=17784.23..18126.08 rows=136740 width=13)
> Sort
> Key: "Rut Cliente"
> -> Append (cost=0.00..3741.80 rows=136740 width=13)
> -> Subquery Scan "*SELECT* 1" (cost=0.00..1233.38
> rows=45069 width=13)
> -> Seq Scan on "Internet_Enero" (cost=0.00..782.69
> rows=45069 width=13)
> -> Subquery Scan "*SELECT* 2" (cost=0.00..1104.06
> rows=40353 width=13)
> -> Seq Scan on "Internet_Febrero" (cost=0.00..700.53
> rows=40353 width=13)
> -> Subquery Scan "*SELECT* 3" (cost=0.00..1404.36
> rows=51318 width=13)
> -> Seq Scan on "Internet_Marzo" (cost=0.00..891.18
> rows=51318 width=13)
>
> Any help will be apreciated, It's for my thesis
>
>
> saludos
> Christian
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-09-07 17:57:36 Re: Query take 101 minutes, help, please
Previous Message Steinar H. Gunderson 2005-09-07 16:57:54 Re: Query take 101 minutes, help, please