Re: Poor performance when joining against inherited tables

From: Lucas Madar <madar(at)samsix(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Poor performance when joining against inherited tables
Date: 2011-05-11 20:47:53
Message-ID: sig.81124526f3.4DCAF5F9.10904@samsix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 05/11/2011 09:38 AM, Robert Haas wrote:
>> However, if I disable seqscan (set enable_seqscan=false), I get the
>> following plan:
>>
>> QUERY PLAN
>> ------------
>> Hash Join (cost=10001298843.53..290002337961.71 rows=8643757 width=506)
>> Hash Cond: (f.id = objects.id)
>> -> Append (cost=10000000000.00..290000536334.43 rows=8643757 width=20)
>> -> Seq Scan on item f (cost=10000000000.00..10000000026.30
>> rows=1630 width=20)
>> -> Index Scan using xxx_pkey on itemXX f (cost=0.00..10.60 rows=90
>> width=20)
>> -> Index Scan using yyy_pkey on itemYY f (cost=0.00..25.24
>> rows=266 width=20)
>> -> Index Scan using zzz_pkey on itemZZ f (cost=0.00..9.28 rows=2
>> width=20)
>> ...
>> -> Hash (cost=999347.17..999347.17 rows=3941949 width=490)
>> -> Index Scan using objects_pkey on objects (cost=0.00..999347.17
>> rows=3941949 width=490)
>>
>> This seems like a much more sensible query plan.
> I don't think so. Scanning the index to extract all the rows in a
> table is typically going to be a lot slower than a sequential scan.
>
> A more interesting question is why you're not getting a plan like this:
>
> Nested Loop
> -> Seq Scan on objects
> -> Append
> -> Index Scan using xxx_pkey on itemXX
> -> Index Scan using yyy_pkey on itemYY
> -> Index Scan using zzz_pkey on itemZZ

Compared to the previous query plan (omitted in this e-mail, in which
the planner was scanning all the item tables sequentially), the second
query is much more desirable. It takes about 12 seconds to complete,
versus the other query which I canceled after six hours. However, what
you propose seems to make even more sense.

>> But it seems to think doing
>> a sequential scan on the *empty* item table is excessively expensive in this
>> case.
>>
>> Aside from enable_seqscan=false, is there any way I can make the query
>> planner not balk over doing a seqscan on an empty table?
> Why would you care? A sequential scan of an empty table is very fast.
>
My issue is that it looks like it's avoiding the sequential scan:

Seq Scan on item f (cost=10000000000.00..10000000026.30 rows=1630 width=20)

It says the sequential scan has a cost that's way too high, and I'm
presuming that's why it's choosing the extremely slow plan over the much
faster plan. I don't know very much about plans, but I'm assuming the
planner chooses the plan with the lowest cost.

I'd much prefer it *does* the sequential scan of the empty table and
goes with the other parts of the plan.

Thanks,
Lucas Madar

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2011-05-11 21:44:37 Re: Postgres refusing to use >1 core
Previous Message gnuoytr 2011-05-11 19:53:09 Re: Postgres refusing to use >1 core