Re: Patch to support SEMI and ANTI join removal

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <jim(at)nasby(dot)net>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to support SEMI and ANTI join removal
Date: 2014-09-11 15:35:02
Message-ID: CA+TgmoYHjYDP5Xo5pEXiEbSoiP8+t4YjB6Z9ZP21Y56c01NnoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 11, 2014 at 7:14 AM, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> Here's a quick demo, of the patch at work:
>
> test=# create table c (id int primary key);
> CREATE TABLE
> test=# create table b (id int primary key, c_id int not null references
> c(id));
> CREATE TABLE
> test=# create table a (id int primary key, b_id int not null references
> b(id));
> CREATE TABLE
> test=#
> test=# explain select a.* from a inner join b on a.b_id = b.id inner join c
> on b.c_id = c.id;
> QUERY PLAN
> -----------------------------------------------------
> Seq Scan on a (cost=0.00..31.40 rows=2140 width=8)
> Planning time: 1.061 ms
> (2 rows)

That is just awesome. You are my new hero.

> 1. I don't think that I'm currently handling removing eclass members
> properly. So far the code just removes the Vars that belong to the relation
> being removed. I likely should also be doing bms_del_member(ec->ec_relids,
> relid); on the eclass, but perhaps I should just be marking the whole class
> as "ec_broken = true" and adding another eclass all everything that the
> broken one has minus the parts from the removed relation?

I haven't read the patch, but I think the question is why this needs
to be different than what we do for left join removal.

> Assume there's a foreign key a (x) reference b(x)
>
> SELECT a.* FROM a INNER JOIN b ON a.x = b.x WHERE b.x = 1
>
> relation b should be removable because an eclass will contain {a.x, b.x} and
> therefore s baserestrictinfo for a.x = 1 should also exist on relation a.
> Therefore removing relation b should produce equivalent results, i.e
> everything that gets filtered out on relation b will also be filtered out on
> relation a anyway.
>
> I think the patch without this is still worth it, but if someone feels
> strongly about it I'll take a bash at supporting it.

That'd be nice to fix, but IMHO not essential.

> 3. Currently the inner join support does not allow removals using foreign
> keys which contain duplicate columns on the referencing side. e.g (a,a)
> references (x,y), this is basically because of the point I made in item 2.
> In this case a baserestrictinfo would exist on the referenced relation to
> say WHERE x = y.

I think it's fine to not bother with this case. Who cares?

> 4. The patch currently only allows removals for eclass join types. If the
> rel has any joininfo items, then the join removal is disallowed. From what I
> can see equality type inner join conditions get described in eclasses, and
> only non-equality join conditions make it into the joininfo list, and since
> foreign keys only support equality operators, then I thought this was a
> valid restriction, however, if someone can show me a flaw in my assumption
> then I may need to improve this.

Seems OK.

> 5. I've added a flag to pg_class called relhasfkey. Currently this gets set
> to true when a foreign key is added, though I've added nothing to set it
> back to false again. I notice that relhasindex gets set back to false during
> vacuum, if vacuum happens to find there to not be any indexes on the rel. I
> didn't put my logic here as I wasn't too sure if scanning pg_constraint
> during a vacuum seemed very correct, so I just left out the "setting it to
> false" logic based on the the fact that I noticed that relhaspkey gets away
> with quite lazy setting back to false logic (only when there's no indexes of
> any kind left on the relation at all).

The alternative to resetting the flag somehow is not having it in the
first place. Would that be terribly expensive?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-09-11 15:39:12 Re: Memory Alignment in Postgres
Previous Message Stephen Frost 2014-09-11 15:30:49 Re: Optimization for updating foreign tables in Postgres FDW