Re: [PATCH] Erase the distinctClause if the result is unique by definition

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition
Date: 2020-02-11 02:57:26
Message-ID: CAKU4AWq1q0+qSpcEMon_3yC1XX2o=H_OmOxWBeQhjR6GaWCeBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:

>
>
>>
>> [PATCH] Erase the distinctClause if the result is unique by
>> definition
>>
>
> I forgot to mention this in the last round of comments. Your patch was
> actually removing distictClause from the Query structure. Please avoid
> doing that. If you remove it, you are also removing the evidence that this
> Query had a DISTINCT clause in it.
>

Yes, I removed it because it is the easiest way to do it. what is the
purpose of keeping the evidence?

>
>
>>
>>
>> However the patch as presented has some problems
>> 1. What happens if the primary key constraint or NOT NULL constraint gets
>> dropped between a prepare and execute? The plan will no more be valid and
>> thus execution may produce non-distinct results.
>>
>> Will this still be an issue if user use doesn't use a "read uncommitted"
>> isolation level? I suppose it should be ok for this case. But even
>> though
>> I should add an isolation level check for this. Just added that in the
>> patch
>> to continue discussing of this issue.
>>
>
> In PostgreSQL there's no "read uncommitted".
>

Thanks for the hint, I just noticed read uncommitted is treated as read
committed
in Postgresql.

> But that doesn't matter since a query can be prepared outside a
> transaction and executed within one or more subsequent transactions.
>

Suppose after a DDL, the prepared statement need to be re-parsed/planned
if it is not executed or it will prevent the DDL to happen.

The following is my test.

postgres=# create table t (a int primary key, b int not null, c int);
CREATE TABLE
postgres=# insert into t values(1, 1, 1), (2, 2, 2);
INSERT 0 2
postgres=# create unique index t_idx1 on t(b);
CREATE INDEX

postgres=# prepare st as select distinct b from t where c = $1;
PREPARE
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = 1)
(2 rows)
...
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = $1)
(2 rows)

-- session 2
postgres=# alter table t alter column b drop not null;
ALTER TABLE

-- session 1:
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------------------
Unique (cost=1.03..1.04 rows=1 width=4)
-> Sort (cost=1.03..1.04 rows=1 width=4)
Sort Key: b
-> Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = $1)
(5 rows)

-- session 2
postgres=# insert into t values (3, null, 3), (4, null, 3);
INSERT 0 2

-- session 1
postgres=# execute st(3);
b
---

(1 row)

and if we prepare sql outside a transaction, and execute it in the
transaction, the other session can't drop the constraint until the
transaction is ended.

> --
> Best Wishes,
> Ashutosh Bapat
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2020-02-11 03:12:29 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Amit Kapila 2020-02-11 02:31:34 Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager