Re: Removing unneeded self joins

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Removing unneeded self joins
Date: 2023-10-10 19:29:58
Message-ID: 96f66ae3-df10-4060-9844-4c9633062cd3@yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

I have reviewed your patch and I noticed a few things.

First of all, I think I found a bug in your latest patch version, and
this query shows it:

EXPLAIN (COSTS OFF)
SELECT c.oid, e.oid FROM pg_class c FULL JOIN (
  SELECT e1.oid FROM pg_extension e1, pg_extension e2
  WHERE e1.oid=e2.oid) AS e
  ON c.oid=e.oid;

In the current version we get such a query plan:

               QUERY PLAN
-----------------------------------------
 Hash Full Join
   Hash Cond: (c.oid = e2.oid)
   ->  Seq Scan on pg_class c
   ->  Hash
         ->  Seq Scan on pg_extension e2
(5 rows)

But I think it should be:

QUERY PLAN
-----------------------------------------
Hash Full Join
Hash Cond: (c.oid = e2.oid)
-> Seq Scan on pg_class c
-> Hash
-> Seq Scan on pg_extension e2
*Filter: (oid IS NOT NULL)*
(6 rows)

I have looked at the latest version of the code, I assume that the error
lies in the replace_varno_walker function, especially in the place where
we check the node by type Var, and does not form any NullTest node.

if (OidIsValid(reloid) && get_attnotnull(reloid, attnum)) -- this
condition works
        {
          rinfo->clause = (Expr *) makeBoolConst(true, false);
        }
        else
        {
          NullTest   *ntest = makeNode(NullTest);

          ntest->arg = leftOp;
          ntest->nulltesttype = IS_NOT_NULL;
          ntest->argisrow = false;
          ntest->location = -1;
          rinfo->clause = (Expr *) ntest;
        }

Secondly, I added some code in some places to catch erroneous cases and
added a condition when we should not try to apply the self-join-removal
transformation due to the absence of an empty self-join list after
searching for it and in general if there are no joins in the query.
Besides, I added a query for testing and wrote about it above. I have
attached my diff file.

In addition, I found a comment for myself that was not clear to me. I
would be glad if you could explain it to me.

You mentioned superior outer join in the comment, unfortunately, I
didn't find anything about it in the PostgreSQL code, and this
explanation remained unclear to me. Could you explain in more detail
what you meant?

/*
 * At this stage joininfo lists of inner and outer can contain
 * only clauses, required for *a superior outer join* that can't
 * influence on this optimization. So, we can avoid to call the
 * build_joinrel_restrictlist() routine.
*/
 restrictlist = generate_join_implied_equalities(root, joinrelids,
                              inner->relids,
                              outer, NULL);

--

Regards,
Alena Rybakina

Attachment Content-Type Size
self-join-removal.diff text/x-patch 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2023-10-10 19:30:44 Re: Lowering the default wal_blocksize to 4K
Previous Message Bruce Momjian 2023-10-10 19:14:34 Re: document the need to analyze partitioned tables