Re: Redundant filter in index scan with a bool column

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Redundant filter in index scan with a bool column
Date: 2019-01-16 15:05:10
Message-ID: 27810.1547651110@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> writes:
> The filter is not needed, why is it there? Turns out we can't recognize
> that the restriction clause 'b' and the index clause 'b = true' are
> equivalent.

Yeah, it's intentional that we don't get rid of the extra clause;
it doesn't really seem worth the expense and complexity to do so.
Indexes on bool columns are a pretty niche case in the first place.
Usually, if you are interested in just the rows where b = true,
you're better off using "where b" as an index predicate. In your
example, we can do this instead:

regression=# create index on t(i) where b;
CREATE INDEX
regression=# explain select * from t where i = 300 and b;
QUERY PLAN
------------------------------------------------------------------
Index Scan using t_i_idx on t (cost=0.12..24.19 rows=6 width=5)
Index Cond: (i = 300)
(2 rows)

resulting in a much smaller index, if the b=true condition is selective
enough to be worth indexing. Even in the case you showed, how much is
the redundant filter clause really costing?

> My first reaction was to patch operator_predicate_proof to
> handle this case, but there is a more straightforward way: mark the
> expanded index clause as potentially redundant when it is generated in
> expand_indexqual_conditions. There is already RestrictInfo.parent_ec
> that is used to mark redundant EC-derived join clauses. The patch
> renames it to rinfo_parent and uses it to mark the expanded index
> clauses.

That's an unbelievable hack that almost certainly breaks existing uses.

The approach of teaching predtest.c that "b = true" implies "b" would
work, but it seems a bit brute-force because ordinarily such cases
would never be seen there, thanks to simplify_boolean_equality having
canonicalized the former into the latter. The problem we have is that
indxpath.c re-generates "b = true" in indexscan conditions. Thinking
about it now, I wonder if we could postpone that conversion till later,
say do it in create_indexscan_plan after having checked for redundant
clauses. Not sure how messy that'd be.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Shay Rojansky 2019-01-16 15:11:36 Re: [PATCH] Allow UNLISTEN during recovery
Previous Message James Coleman 2019-01-16 14:49:23 Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's