Re: Query plan for "heavy" SELECT with "lite" sub-SELECTs

From: Richard Huxton <dev(at)archonet(dot)com>
To: nikolay(at)samokhvalov(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query plan for "heavy" SELECT with "lite" sub-SELECTs
Date: 2006-11-02 12:40:09
Message-ID: 4549E729.1080302@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Nikolay Samokhvalov wrote:
> 2. explain analyze select
> *,
> (select typname from pg_type where pg_type.oid=pg_proc.prorettype limit 1)
> from pg_proc offset 1500 limit 1;
> "Limit (cost=8983.31..8989.30 rows=1 width=365) (actual
> time=17.648..17.649 rows=1 loops=1)"
> " -> Seq Scan on pg_proc (cost=0.00..13486.95 rows=2252 width=365)
> (actual time=0.100..16.851 rows=1501 loops=1)"
> " SubPlan"
> " -> Limit (cost=0.00..5.91 rows=1 width=64) (actual
> time=0.006..0.007 rows=1 loops=1501)"
> " -> Index Scan using pg_type_oid_index on pg_type
> (cost=0.00..5.91 rows=1 width=64) (actual time=0.004..0.004 rows=1
> loops=1501)"
> " Index Cond: (oid = $0)"
> "Total runtime: 17.784 ms"
>
> We see that in the 2nd example Postgres starts with "Index Scan using
> pg_type_oid_index" (1501 iterations!).

No, what you see here is that the inner loop is the index-scan over
pg_type_oid. It's running a sequential scan on pg_proc and then runs
1501 index scans against pg_type.

> My understanding of SQL says me
> that the simplest (and, in this case - and probably in *most* cases -
> fastest) way to perform such queries is to start from main SELECT and
> then, when we already have rows from "main" table, perform "lite"
> sub-selects. So, I expected smth near 2.156 ms + 0.079 ms, but obtain
> 17.784 ms... For large table this is killing behaviour.

You've forgotten about the cost of matching up the two sets of rows.
Now, if the first part of the query outputs only one row then you might
be right, but I'm not sure that the SQL standard allows the subquery to
be delayed to that stage without explicitly organising the query that
way. From memory, the OFFSET/LIMIT takes place at the very end of the
query processing.

> What should I do to make Postgres work properly in such cases (I have
> a lot of similar queries; surely, they are executed w/o seqscans, but
> overall picture is the same - I see that starting from sub-selects
> dramatically decrease performance)?

Do you have a real example? That might be more practical.

--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Robins 2006-11-02 12:45:53 Locking vs. Exceptions
Previous Message Nikolay Samokhvalov 2006-11-02 11:07:19 Query plan for "heavy" SELECT with "lite" sub-SELECTs