Re: query from partitions

From: Ключников А(dot)С(dot) <alexs(at)analytic(dot)mv(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: query from partitions
Date: 2005-12-13 16:57:59
Message-ID: 20051213165759.GA10514@mail.analytic.mv.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

* Richard Huxton <dev(at)archonet(dot)com> [2005-12-13 15:59:11 +0000]:

> Ключников А.С. wrote:
> >And
> >select * from base
> > where id in (1,2) and datatime between '2005-05-15' and '2005-05-17';
> >10 seconds
> >
> >select * from base
> > where id in (select id from device where id = 1 or id = 2) and
> > datatime between '2005-05-15' and '2005-05-17';
> >10 minits
> >
> >Why?
>
> Run EXPLAIN ANALYSE on both queries to see how the plan has changed.
explain select distinct on(id) * from base where id in (1,2) and
data_type=2 and datatime < '2005-11-02' order by id, datatime desc;

Unique (cost=10461.14..10527.30 rows=2342 width=38)
-> Sort (cost=10461.14..10494.22 rows=13232 width=38)
Sort Key: public.base.id, public.base.datatime
-> Result (cost=0.00..9555.29 rows=13232 width=38)
-> Append (cost=0.00..9555.29 rows=13232 width=38)
-> Seq Scan on base (cost=0.00..32.60 rows=1
width=38)
Filter: (((id = 1) OR (id = 2)) AND (data_type =
2) AND (datatime < '2005-11-02 00:00:00'::timestamp without time zone))
-> Seq Scan on base_batch base (cost=0.00..32.60
rows=1 width=38)
.......................

-> Seq Scan on base_1_2004 base (cost=0.00..32.60 rows=1 width=38)
Filter: (((id = 1) OR (id = 2)) AND (data_type =
2) AND (datatime < '2005-11-02 00:00:00'::timestamp without time zone))
(записей: 34)

and
explain select distinct on(id) * from base where id in (select id from
device where id = 1 or id = 2) and data_type=2 and datatime < '2005-11-02'
order by id, datatime desc;

Unique (cost=369861.89..369872.52 rows=2126 width=38)
-> Sort (cost=369861.89..369867.21 rows=2126 width=38)
Sort Key: public.base.id, public.base.datatime
-> Hash IN Join (cost=5.88..369744.39 rows=2126 width=38)
Hash Cond: ("outer".id = "inner".id)
-> Append (cost=0.00..368654.47 rows=212554 width=38)
-> Seq Scan on base (cost=0.00..26.95 rows=2
width=38)
Filter: ((data_type = 2) AND (datatime <
'2005-11-02 00:00:00'::timestamp without time zone))
-> Seq Scan on base_batch base (cost=0.00..26.95
rows=2 width=38)
Filter: ((data_type = 2) AND (datatime <
'2005-11-02 00:00:00'::timestamp without time zone))
-> Seq Scan on base_lines_05_12 base
(cost=0.00..26.95 rows=2 width=38)
............................
-> Hash (cost=5.88..5.88 rows=2 width=4)
-> Seq Scan on device (cost=0.00..5.88 rows=2
width=4)
Filter: ((id = 1) OR (id = 2))
(записей: 851)

>
> My guess for why the plans are different is that in the first case your
> query ends up as ...where (id=1 or id=2)...
>
> In the second case, the planner doesn't know what it's going to get back
> from the subquery until it's executed it, so can't tell it just needs to
> scan base_1,base_2. Result: you'll scan all child tables of base.
>
> I think the planner will occasionally evaluate constants before
> planning, but I don't think it will ever execute a subquery and then
> re-plan the outer query based on those results. Of course, someone might
> pop up and tell me I'm wrong now...
>
> --
> Richard Huxton
> Archonet Ltd
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match

--
С уважением,
Ключников А.С.
Ведущий инженер ПРП "Аналитприбор"
432030 г.Ульяновск, а/я 3117
тел./факс +7 (8422) 43-44-78
mailto: alexs(at)analytic(dot)mv(dot)ru

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Ameet Kini 2005-12-13 17:25:40 Lots of postmaster processes (fwd)
Previous Message John Sidney-Woollett 2005-12-13 16:37:42 Re: Memory Leakage Problem