Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jian Guo <gjian(at)vmware(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Hans Buschmann <buschmann(at)nidsa(dot)net>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zhenghua Lyu <zlyu(at)vmware(dot)com>
Subject: Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
Date: 2023-11-08 22:44:55
Message-ID: 891440.1699483495@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jian Guo <gjian(at)vmware(dot)com> writes:
> I found a new approach to fix this issue, which seems better, so I would like to post another version of the patch here. The origin patch made the assumption of the values of Vars from CTE must be unique, which could be very wrong. This patch examines variables for Vars inside CTE, which avoided the bad assumption, so the results could be much more accurate.

You have the right general idea, but there is nothing about this patch
that's right in detail. The outer Var doesn't refer to any particular
RTE within the subquery; it refers to a targetlist entry. You have to
drill down to that, see if it's a Var, and if so you can recurse into
the subroot with that Var. As this stands, it might accidentally get
the right answer for "SELECT * FROM foo" subqueries, but it will get
the wrong answer or even crash for anything that's not that.

The existing RTE_SUBQUERY stanza has most of what we need for this,
so I experimented with extending that to also handle RTE_CTE. It
seems to work, though I soon found out that it needed tweaking for
the case where the CTE is INSERT/UPDATE/DELETE RETURNING.

Interestingly, this does not change any existing regression test
results. I'd supposed there might be at least one place with a
visible plan change, but nope. Running a coverage test does show
that the new code paths are exercised, but I wonder if we ought
to try to devise a regression test that proves it more directly.

regards, tom lane

PS: please, please, please do not quote the entire damn thread
when replying. Trim it to just a minimum amount of relevant
text. You think people want to read all that again?

Attachment Content-Type Size
v2-0001-Examine-simple-variable-for-Var-in-CTE.patch text/x-diff 5.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Christensen 2023-11-08 22:47:06 Re: Moving forward with TDE [PATCH v3]
Previous Message Bruce Momjian 2023-11-08 21:34:57 Re: Wrong sentence in the README?