Re: [PATCH] Erase the distinctClause if the result is unique by definition

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition
Date: 2020-02-11 12:43:11
Message-ID: 20200211124311.GA14732@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 11, 2020 at 08:14:14PM +0800, Andy Fan wrote:
> On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> > >
> > > and if we prepare sql outside a transaction, and execute it in the
> > > transaction, the other session can't drop the constraint until the
> > > transaction is ended.
> >
> > And what if you create a view on top of a query containing a distinct
> > clause
> > rather than using prepared statements? FWIW your patch doesn't handle such
> > case at all, without even needing to drop constraints:
> >
> > CREATE TABLE t (a int primary key, b int not null, c int);
> > INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
> > CREATE UNIQUE INDEX t_idx1 on t(b);
> > CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
> > EXPLAIN SELECT * FROM v1;
> > server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> >
> >
> This error can be fixed with
>
> - num_of_rtables = bms_num_members(non_semi_anti_relids);
> + num_of_rtables = list_length(query->rtable);
>
> This test case also be added into the patch.
>
>
> > I also think this is not the right way to handle this optimization.
> >
>
> do you have any other concerns?

Yes, it seems to be broken as soon as you alter the view's underlying table:

=# CREATE TABLE t (a int primary key, b int not null, c int);
CREATE TABLE

=# INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
INSERT 0 2

=# CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE INDEX

=# CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
CREATE VIEW

=# EXPLAIN SELECT * FROM v1;
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# EXPLAIN SELECT DISTINCT b FROM t;
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# ALTER TABLE t ALTER COLUMN b DROP NOT NULL;
ALTER TABLE

=# EXPLAIN SELECT * FROM v1;
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# EXPLAIN SELECT DISTINCT b FROM t;
QUERY PLAN
-------------------------------------------------------------
Unique (cost=1.03..1.04 rows=2 width=4)
-> Sort (cost=1.03..1.03 rows=2 width=4)
Sort Key: b
-> Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(4 rows)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Victor Yegorov 2020-02-11 12:51:56 Re: open-source equivalent of golden-gate
Previous Message Andy Fan 2020-02-11 12:14:14 Re: [PATCH] Erase the distinctClause if the result is unique by definition