Re: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs
Date: 2020-12-18 02:45:26
Message-ID: CALj2ACUBWv+jROCfC=H4+TmoCQHQOMadVFgxivkRhFXZfySdaA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 18, 2020 at 7:18 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Thu, Dec 17, 2020 at 03:06:59PM +0530, Bharath Rupireddy wrote:
> > The behavior of the ctas/cmv, in case the relation already exists is as
> > shown in [1]. The things that have been changed with the patch are: 1) In
> > any case we do not rewrite or plan the select part if the relation already
> > exists 2) For explain ctas/cmv (without analyze), now the relation
> > existence is checked early and the error is thrown as highlighted in [1].
> >
> > With patch, there is no behavioral change(from that of master branch) in
> > explain analyze ctas/cmv with if-not-exists i.e. error is thrown not the
> > notice.
> >
> > Thoughts?
>
> HEAD is already a mixed bad of behaviors, and the set of results you
> are presenting here is giving a similar impression. It brings in some
> sanity by just ignoring the effects of the IF NOT EXISTS clause all
> the time still that's not consistent with the queries not using
> EXPLAIN.

I tried to make it consistent by issuing NOTICE (not an error) even
for EXPLAIN/EXPLAIN ANALYZE IF NOT EXISTS case. If issue notice and
exit from the ExplainOneUtility, we could output an empty plan to the
user because, by now ExplainResultDesc would have been called at the
start of the explain via PortalStart(). I didn't find a clean way of
coding if we are not okay to show notice and empty plan to the user.

Any suggestions on achieving above?

> Hmm. Looking for similar behaviors, I can see one case in
> select_into.sql where we just never execute the plan when using WITH
> NO DATA but still show the plan, meaning that the query gets planned
> but it just gets marked as "(never executed)" if attempting to use
> ANALYZE.

Yes, with no data we would see "(never executed)" for explain analyze
if the relation does not already exist. If the relation does exist,
then the error/notice.

>There may be use cases for that as the user directly asked directly for an EXPLAIN.

IMHO, in any case checking for the existence of the relations
specified in a query is must before we output something to the user.
For instance, the query "explain select * from non_existent_tbl;"
where non_existent_tbl doesn't exist, throws an error. Similarly,
"explain create table already_existing_tbl as select * from
another_tbl;" where the table ctas/select into trying to create
already exists, should also throw error. But that's not happening
currently on master. Which seems to be a problem to me. So, with the
patch proposed here, we error out in this case.

If the user really wants to see the explain plan, then he/she should
use the correct query.

> Note: the patch needs tests for all the patterns you would like to
> stress. This way it is easier to follow the patterns that are
> changing with your patch and compare them with the HEAD behavior (like
> looking at the diffs with the tests of the patch, but without the
> diffs in src/backend/).

Sure, I will add test cases and post v3 patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-12-18 03:14:14 Re: Proposed patch for key managment
Previous Message Fujii Masao 2020-12-18 02:11:56 Re: STANDBY_LOCK_TIMEOUT may not interrupt ProcWaitForSignal()?