Re: Two "equivalent" WITH RECURSIVE queries, one of them slow.

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Octavio Alvarez <alvarezp(at)alvarezp(dot)ods(dot)org>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Two "equivalent" WITH RECURSIVE queries, one of them slow.
Date: 2010-07-07 13:14:25
Message-ID: AANLkTinZiilJ-_HJNffNVMIc8K6QsoQcmdlZyVjV__bt@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Jul 5, 2010 at 2:07 AM, Octavio Alvarez
<alvarezp(at)alvarezp(dot)ods(dot)org> wrote:
> Hello.
>
> I have a tree-like table with a three-field PK (name, date, id) and one
> parent field.
> It has 5k to 6k records as of now, but it will hold about 1 million records.
>
> I am trying the following WITH RECURSIVE query:
>
> WITH RECURSIVE t AS (
>                 SELECT par.id AS tid, par.name, par.date, par.id, par.text,
> par.h_title, par.h_name, par.parent
>                   FROM _books.par
>        UNION
>                 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,
> p.h_title, p.h_name, p.parent
>                   FROM t, _books.par p
>                  WHERE p.name = t.name AND p.date = t.date AND t.id =
> p.parent
>        )
>  SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
>   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';
>
> ... which takes 2547.503 ms
>
> However, if I try the same query but adding the same WHERE clause to the
> non-recursive term, I get much better results.
>
>
> WITH RECURSIVE t AS (
>                 SELECT par.id AS tid, par.name, par.date, par.id, par.text,
> par.h_title, par.h_name, par.parent
>                   FROM _books.par WHERE name = 'cfx' AND date = '2009-08-19'
> AND par.id = '28340'
>        UNION
>                 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,
> p.h_title, p.h_name, p.parent
>                   FROM t, _books.par p
>                  WHERE p.name = t.name AND p.date = t.date AND t.id =
> p.parent
>        )
>  SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
>   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';
>
> ... which takes 0.221 ms

If you want the fast plan, you might want to consider reworking your
query into a set returning function. It's pretty easy to do:

create or replace function f(arg int) returns setof something as
$$
with recursive foo as
(
select * from bar where id = $1
union all
[...]
)
select * from foo
$$ language sql;

Obviously, a pure view approach would be nicer but it just isn't going
to hapen at present. CTE are currently problematic generally when you
need quals in the 'with' term, especially in the case of recursive
CTE.

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Michal Fapso 2010-07-07 13:31:29 Re: big data - slow select (speech search)
Previous Message Matthew Wakeling 2010-07-07 12:34:18 Re: Two different execution plan for the same request