Re: Removing unneeded self joins

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2018-11-08 05:59:19
Message-ID: CAKJS1f8-TuxfYMewFoBicGDtac1dK8RhbYzXgyzgW6Ej5U70sg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19 October 2018 at 01:47, Alexander Kuzmenkov
<a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote:
> Here is a version that compiles.

I had a quick read through this and I think its missing about a 1-page
comment section detailing when we can and when we cannot remove these
self joins, and what measures we must take when we do remove them.

Apart from that, I noted the following during my read:

1. I don't think this is the right way to do this. There are other
places where we alter the varnoold. For example:
search_indexed_tlist_for_var(). So you should likely be doing that too
rather than working around it.

@@ -166,10 +166,13 @@ _equalVar(const Var *a, const Var *b)
COMPARE_SCALAR_FIELD(vartypmod);
COMPARE_SCALAR_FIELD(varcollid);
COMPARE_SCALAR_FIELD(varlevelsup);
- COMPARE_SCALAR_FIELD(varnoold);
- COMPARE_SCALAR_FIELD(varoattno);
COMPARE_LOCATION_FIELD(location);

+ /*
+ * varnoold/varoattno are used only for debugging and may differ even
+ * when the variables are logically the same.
+ */
+

2. Surely the following loop is incorrect:

for (i = toKeep->min_attr; i <= toKeep->max_attr; i++)
{
int attno = i - toKeep->min_attr;
toKeep->attr_needed[attno] = bms_add_members(toKeep->attr_needed[attno],
toRemove->attr_needed[attno]);
}

What if toRemove has a lower min_attr or higher max_attr?

3. "wind" -> "find"

+ * When we wind such a join, we mark one of the participating relation as

4. I think the following shouldn't be happening:

+------------------------------------------------
Result
One-Time Filter: false
-(2 rows)
+ -> Index Scan using parent_pkey on parent x
+ Index Cond: (k = 1)
+(4 rows)

5. I'd have thought the opposite. Surely there are more chances of
this being useful with more joins?

+ /* Limit the number of joins we process to control the quadratic behavior. */
+ if (n > join_collapse_limit)
+ break;

6. In remove_self_joins_one_level() I think you should collect the
removed relations in a Relids rather than a list.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2018-11-08 06:00:29 Re: Adding a TAP test checking data consistency on standby with minRecoveryPoint
Previous Message Amit Kapila 2018-11-08 05:52:53 Re: New function pg_stat_statements_reset_query() to reset statistics of a specific query