Re: Removing unneeded self joins

From: Hywel Carver <hywel(at)skillerwhale(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Subject: Re: Removing unneeded self joins
Date: 2021-03-12 09:05:06
Message-ID: 161553990603.28624.7702383984090736980.pgcf@coridan.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

I've built and tested this, and it seems to function correctly to me. One question I have is whether the added "IS NOT NULL" filters can be omitted when they're unnecessary. Some of the resulting plans included an "IS NOT NULL" filter on a non-nullable column. To be clear, this is still an improvement (to me) without that.

Here's the simple test script I ran, on master ("before") and with the latest patch applied ("after").

CREATE TABLE users (id BIGINT PRIMARY KEY, nullable_int BIGINT UNIQUE, some_non_nullable_int BIGINT NOT NULL);
CREATE VIEW only_some_users AS (SELECT * FROM users WHERE id < 10);
CREATE VIEW some_other_users AS (SELECT * FROM users WHERE id > 3);

EXPLAIN SELECT *
FROM users u1
INNER JOIN users u2
ON u1.id = u2.id;
-- before does HASH JOIN
-- after does seq scan with "id IS NOT NULL" condition

EXPLAIN SELECT *
FROM only_some_users
INNER JOIN some_other_users
ON only_some_users.id = some_other_users.id;
-- before does HASH JOIN
-- after does no JOIN, instead does scan, with an extra "id IS NOT NULL condition" (in addition to id < 10, id > 3)

EXPLAIN SELECT *
FROM users u1
INNER JOIN users u2
ON u1.nullable_int = u2.nullable_int;
-- before does HASH JOIN
-- after does scan with (nullable_int IS NOT NULL) filter

EXPLAIN SELECT *
FROM users u1
INNER JOIN users u2
ON u1.id = u2.nullable_int;
-- before does HASH JOIN
-- after correctly unchanged

EXPLAIN SELECT *
FROM users u1
INNER JOIN users u2
ON u1.id = u2.some_non_nullable_int
INNER JOIN users u3
ON u2.some_non_nullable_int = u3.id;
-- before does 2x HASH JOIN
-- now does 1x HASH JOIN, with a sequential scan over users filtered by id IS NOT NULL

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2021-03-12 09:16:18 Re: documentation fix for SET ROLE
Previous Message Peter Smith 2021-03-12 08:59:20 Re: [HACKERS] logical decoding of two-phase transactions