Re: BUG #17220: ALTER INDEX ALTER COLUMN SET (..) with an optionless opclass makes index and table unusable

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, "postgresql(at)zr40(dot)nl" <postgresql(at)zr40(dot)nl>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: BUG #17220: ALTER INDEX ALTER COLUMN SET (..) with an optionless opclass makes index and table unusable
Date: 2021-10-14 02:07:21
Message-ID: YWeQ2UXvXUH/Gt4T@paquier.xyz
Lists: pgsql-bugs pgsql-hackers

On Wed, Oct 13, 2021 at 05:20:56PM +0000, Bossart, Nathan wrote:
> AFAICT the fact that these commands can succeed at all seems to be
> unintentional, and I wonder if modifying these options requires extra
> steps such as rebuilding the index.

I have been looking at all this business more closely, and this code
block in analyze.c stands out:
/*
 * Now we can compute the statistics for the expression columns.
 */
if (numindexrows > 0)
{
    MemoryContextSwitchTo(col_context);
    for (i = 0; i < attr_cnt; i++)
    {
        VacAttrStats *stats = thisdata->vacattrstats[i];
        AttributeOpts *aopt =
            get_attribute_options(stats->attr->attrelid,
                                  stats->attr->attnum);

        stats->exprvals = exprvals + i;
        stats->exprnulls = exprnulls + i;
        stats->rowstride = attr_cnt;
        stats->compute_stats(stats,
                             ind_fetch_func,
                             numindexrows,
                             totalindexrows);

        /*
         * If the n_distinct option is specified, it overrides the
         * above computation. For indices, we always use just
         * n_distinct, not n_distinct_inherited.
         */
        if (aopt != NULL && aopt->n_distinct != 0.0)
            stats->stadistinct = aopt->n_distinct;

        MemoryContextResetAndDeleteChildren(col_context);
    }
}

When computing statistics on an index expression, this code means that
we would grab the value of n_distinct from the *index* if set and
force the stats to use it, and not use what the parent table has. For
example, say:
create table aa (a int);
insert into aa values (generate_series(1,1000));
create index aai on aa((a+a)) where a > 500;
alter index aai alter column expr set (n_distinct = 2);
analyze aa; -- n_distinct forced to 2.0 for the index stats
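
As a rough way to check the effect of the example above: the statistics
of an expression index are exposed through pg_stats under the index's
name, so a query like the following should show the forced value on a
branch where the ALTER INDEX is accepted. This is only a sketch; the
attribute name "expr" is assumed from the ALTER INDEX command above.

```sql
-- Inspect the statistics stored for the expression column of index aai.
-- Assumes the table, index and ANALYZE from the example above.
select tablename, attname, n_distinct
  from pg_stats
 where tablename = 'aai' and attname = 'expr';
```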

This code comes from 76a47c0 back in 2010. In PG <= 12, the example
above works, but it no longer does as of 13~. Enforcing n_distinct for
index attributes was discussed back when this code was introduced:
https://www.postgresql.org/message-id/603c8f071001101127w3253899vb3f3e15073638774@mail.gmail.com

This means that we've lost the ability to enforce n_distinct for
expression indexes for two years. But do we really care about this
case? My answer to that would be "no", as long as we don't have a
documented grammar for it and we don't dump these options either. But
I think that we'd better do something with the code in analyze.c rather
than just leaving it dead, and my take is that we should remove the
call to get_attribute_options() for this code path.

Any opinions? @Robert: you were involved in 76a47c0, so I am adding
you in CC.
--
Michael
