Re: Postgres not using GiST index in a lateral join

From: Igor Stassiy <istassiy(at)gmail(dot)com>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgres not using GiST index in a lateral join
Date: 2015-03-04 21:18:57
Message-ID: CAKVOjeyJ0eH-KcanfZEjwNeLZpK51+F8EJfOi1k3hkYUR_Y+mw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I would like to stop executing the query for a row of table "a" when a
single row of "b" is found. This query would not stop
processing but will filter all the rows that are found at the end of
execution.

Is there a way to express this without a subquery?

On Wed, Mar 4, 2015 at 11:53 PM Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
wrote:

> Stop writing so many subqueries, think in joins; the poor planner!
>
> SELECT DISTINCT ON (a.id) a.id AS a_id, b.id AS b_id
> FROM a
> JOIN b
> ON ST_Contains(b.shape, a.shape)
> WHERE b.kind != 1
>
> Also, the DISTINCT ON syntax (PgSQL custom) lets you winnow a result
> set down to just one of the inputs.
>
> P.
>
>
> On Wed, Mar 4, 2015 at 6:36 AM, Igor Stassiy <istassiy(at)gmail(dot)com> wrote:
> > Hello,
> >
> > I have a query plan optimization question. It is formatted nicely on
> >
> > http://stackoverflow.com/questions/28856452/postgres-
> not-using-gist-index-in-lateral-join
> >
> > But here is a copy for the archive:
> >
> > Here is the setup:
> >
> > CREATE EXTENSION postgis;
> > DROP TABLE IF EXISTS A;
> > DROP TABLE IF EXISTS B;
> > CREATE TABLE A(shape Geometry, id INT);
> > CREATE TABLE B(shape Geometry, id INT, kind INT);
> > CREATE INDEX ON A USING GIST (shape);
> > CREATE INDEX ON B USING GIST (shape);
> >
> > I am running the following commands:
> >
> > ANALYZE A;
> > ANALYZE B;
> >
> > -- for each row in A, select exactly one row in B (if there is one)
> > -- such that B contains geometry of A
> > EXPLAIN SELECT * FROM (SELECT A.id as aid, (SELECT B.id FROM B WHERE
> > ST_Contains(B.shape, A.shape) AND B.kind != 1 LIMIT 1) as bid FROM A) AS
> > TMP;
> >
> > which gives me { "Plan": { "Node Type": "Seq Scan", "Relation Name": "A",
> > "Startup Cost": 0.00, "Total Cost": 2606592.33, "Plan Rows": 549745,
> "Plan
> > Width": 1646, "Plans": [ { "Node Type": "Limit", "Parent Relationship":
> > "SubPlan", "Subplan Name": "SubPlan 1", "Startup Cost": 0.00, "Total
> Cost":
> > 4.68, "Plan Rows": 1, "Plan Width": 8, "Plans": [ { "Node Type": "Seq
> Scan",
> > "Parent Relationship": "Outer", "Relation Name": "B", "Startup Cost":
> 0.00,
> > "Total Cost": 4.68, "Plan Rows": 1, "Plan Width": 8, "Filter": "((shape
> &&
> > A.shape) AND _st_contains(shape, A.shape))" } ] } ] } }
> >
> >
> > Note that there is a sequential scan inside the lateral join, however
> there
> > is clearly an index available. However after setting
> >
> > set enable_seqscan=false;
> >
> > the index is being used. This actually affects runtime significantly
> (around
> > 3 times faster) and seems that postgres should figure things like that
> > automatically. { "Plan": { "Node Type": "Seq Scan", "Relation Name": "A",
> > "Startup Cost": 10000000000.00, "Total Cost": 10004716493.85, "Plan
> Rows":
> > 549450, "Plan Width": 1677, "Plans": [ { "Node Type": "Limit", "Parent
> > Relationship": "SubPlan", "Subplan Name": "SubPlan 1", "Startup Cost":
> 0.00,
> > "Total Cost": 8.52, "Plan Rows": 1, "Plan Width": 8, "Plans": [ { "Node
> > Type": "Index Scan", "Parent Relationship": "Outer", "Scan Direction":
> > "NoMovement", "Index Name": "B_shape_idx", "Relation Name": "B", "Startup
> > Cost": 0.00, "Total Cost": 8.52, "Plan Rows": 1, "Plan Width": 8, "Index
> > Cond": "(shape && A.shape)", "Filter": "_st_contains(shape, A.shape)" }
> ] }
> > ] } }
> >
> > Is there any way to tell postgres to use index in a less hacky way?
> Possibly
> > by rewriting the query? From what I understand the use of set enable_...
> is
> > not recommended in production.
> >
> > When you actually run the commands above it will give
> >
> > { "Plan": { "Node Type": "Seq Scan", "Relation Name": "a", "Alias": "a",
> > "Startup Cost": 0.00, "Total Cost": 10372.75, "Plan Rows": 1230, "Plan
> > Width": 36, "Plans": [ { "Node Type": "Limit", "Parent Relationship":
> > "SubPlan", "Subplan Name": "SubPlan 1", "Startup Cost": 0.14, "Total
> Cost":
> > 8.41, "Plan Rows": 1, "Plan Width": 4, "Plans": [ { "Node Type": "Index
> > Scan", "Parent Relationship": "Outer", "Scan Direction": "NoMovement",
> > "Index Name": "b_shape_idx", "Relation Name": "b", "Alias": "b", "Startup
> > Cost": 0.14, "Total Cost": 8.41, "Plan Rows": 1, "Plan Width": 4, "Index
> > Cond": "(shape && a.shape)", "Filter": "((kind <> 1) AND
> _st_contains(shape,
> > a.shape))" } ] } ] } }
> >
> > Unfortunately I cannot provide data to reproduce the query plan results.
> >
> > Thanks,
> > Igor
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message AI Rumman 2015-03-05 05:10:23 Postgresql CIFS
Previous Message Paul Ramsey 2015-03-04 19:53:41 Re: Postgres not using GiST index in a lateral join