Re: Bug with indexes on whole-row expressions

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Bug with indexes on whole-row expressions
Date: 2020-07-13 04:42:22
Message-ID: CAExHW5sisnxbU_uXu+-7G3gKQ5oOJpDVn50=8+mcrdoa3+1peg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 29, 2020 at 3:43 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
>
> CREATE TABLE boom (a integer, b integer);
>
> -- index on whole-row expression
> CREATE UNIQUE INDEX ON boom ((boom));
>
> INSERT INTO boom VALUES
> (1, 2),
> (1, 3);
>
> ALTER TABLE boom DROP b;
>
> TABLE boom;
>
> a
> ---
> 1
> 1
> (2 rows)
>
> REINDEX TABLE boom;
> ERROR: could not create unique index "boom_boom_idx"
> DETAIL: Key ((boom.*))=((1)) is duplicated.
>
> The problem here is that there *is* a "pg_depend" entry for the
> index, but it only depends on the whole table, not on specific columns.
>
> I have been thinking what would be the right approach to fix this:
>
> 1. Don't fix it, because it is an artificial corner case.
> (But I can imagine someone trying to exclude duplicate rows with
> a unique index.)
>
> 2. Add code that checks if there is an index with a whole-row reference
> in the definition before dropping a column.
> That feels like a wart for a special case.

Do we need to do something about adding a new column or modifying an
existing one. Esp. if the later changes the uniqueness of row
expression.
>
> 3. Forbid indexes on whole-row expressions.
> After all, you can do the same with an index on all the columns.
> That would open the question what to do about upgrading old databases
> that might have such indexes today.

This would be the best case. However, a whole row expression is not
necessarily the same as "all columns" at that moment. A whole row
expression means whatever column there are at any given point in time.
So creating an index on whole row expression is not the same as an
index on all the columns. However, I can't imagine a case where we
want an index on "whole row expression". An index containing all the
columns would always suffice and will be deterministic as well.

So this option looks best.

>
> 4. Add dependencies on all columns whenever a whole-row expression
> is used in an index.
> That would need special processing for pg_upgrade.

Again, that dependency needs to be maintained as and when the columns
are added and dropped.

--
Best Wishes,
Ashutosh Bapat

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2020-07-13 04:43:19 Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
Previous Message Andrey V. Lepikhov 2020-07-13 04:41:12 Re: POC and rebased patch for CSN based snapshots