Re: Optimizer use of index slows down query by factor

From: Michael Ruf <mrf(at)inxmail(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimizer use of index slows down query by factor
Date: 2010-01-07 07:49:00
Message-ID: 4B4591EC.6030709@inxmail.de
Lists: pgsql-performance

Hi,

Tom Lane wrote:
>
> I think you need to see about getting this rowcount estimate to be more
> accurate:
>
>> ->  Index Scan using idx_link_1 on link  (cost=0.00..680.51 rows=13477 width=26) (actual time=5.707..12.043 rows=126 loops=1)
>>       Index Cond: (task_id = 1556)
>>       Filter: (((deletable IS NULL) OR (NOT deletable)) AND ((link_type = 8) OR (link_type = 9)))
>
> If it realized there'd be only 126 rows out of that scan, it'd probably
> have gone for a nestloop join against the big table, which I think would
> be noticeably faster than either of the plans you show here.
>
> You already did crank up default_statistics_target, so I'm not sure if
> raising it further would help any.

After I increased the statistics target for the task_id column on the
link table ("alter table link alter task_id set statistics 200;"), the
query runs fine ( < 1 second ):
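For anyone following along, the full sequence is roughly the following (table and column names are the ones from this thread; 200 is simply the target that turned out to be sufficient here):

```sql
-- Raise the per-column statistics target above the global default, so
-- ANALYZE samples more rows and builds larger MCV/histogram lists.
ALTER TABLE link ALTER COLUMN task_id SET STATISTICS 200;

-- The new target only takes effect after the table is re-analyzed.
ANALYZE link;

-- Optionally inspect what the planner now sees for the column.
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'link' AND attname = 'task_id';
```

Per-column targets like this are usually preferable to raising default_statistics_target further, since they keep ANALYZE cheap for every other column.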

Limit  (cost=448478.40..448492.17 rows=1 width=30) (actual time=850.698..860.838 rows=12 loops=1)
  ->  GroupAggregate  (cost=448478.40..448492.17 rows=1 width=30) (actual time=850.695..860.824 rows=12 loops=1)
        ->  Sort  (cost=448478.40..448481.15 rows=1100 width=30) (actual time=850.569..853.985 rows=6445 loops=1)
              Sort Key: link.link_type, link.link_alias
              Sort Method:  quicksort  Memory: 696kB
              ->  Nested Loop Left Join  (cost=0.00..448422.84 rows=1100 width=30) (actual time=819.519..838.422 rows=6445 loops=1)
                    ->  Seq Scan on link  (cost=0.00..142722.52 rows=203 width=26) (actual time=819.486..820.016 rows=126 loops=1)
                          Filter: (((deletable IS NULL) OR (NOT deletable)) AND (task_id = 1556) AND ((link_type = 8) OR (link_type = 9)))
                    ->  Index Scan using idx_click_1 on click  (cost=0.00..1370.01 rows=10872 width=12) (actual time=0.003..0.088 rows=51 loops=126)
                          Index Cond: (link.link_id = click.link_id)
Total runtime: 860.929 ms

> What I'd suggest is trying to avoid
> using non-independent AND/OR conditions. For instance recasting the
> first OR as just "deletable is not true" would probably result in a
> better estimate. The size of the error seems to be more than that would
> account for though, so I suspect that the deletable and link_type
> conditions are interdependent. Is it practical to recast your data
> representation to avoid that?
>

I've tried that, but it made no difference either way.
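For reference, the recast I tested looked roughly like this. The query shape is reconstructed from the plan's Filter line, so the select list and exact constants are illustrative, not the production query:

```sql
-- Original, non-independent conditions (as seen in the plan's Filter):
--   ((deletable IS NULL) OR (NOT deletable))
--   AND ((link_type = 8) OR (link_type = 9))

-- Recast: "deletable IS NOT TRUE" covers both NULL and false in a
-- single condition, and IN (...) replaces the OR on link_type.
SELECT link_id, link_type, link_alias
  FROM link
 WHERE task_id = 1556
   AND deletable IS NOT TRUE
   AND link_type IN (8, 9);
```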

Thanks for your help.

Michael
