Re: Reducing planning time on tables with many indexes

From: Luc Vlaming Hummel <luc(dot)vlaming(at)servicenow(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Geier <geidav(dot)pg(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reducing planning time on tables with many indexes
Date: 2022-08-08 12:29:22
Message-ID: 211D23CD-F7BF-482A-95A5-B0FE1C572B99@servicenow.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27.07.22, 18:39, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

[External Email]

David Geier <geidav(dot)pg(at)gmail(dot)com> writes:
> We tracked down the root cause of this slowdown to lock contention in
> 'get_relation_info()'. The index lock of every single index of every single
> table used in that query is acquired. We attempted a fix by pre-filtering
> out all indexes that anyways cannot be used with a certain query, without
> taking the index locks (credits to Luc Vlaming for idea and
> implementation). The patch does so by caching the columns present in every
> index, inside 'struct Relation', similarly to 'rd_indexlist'.

I wonder how much thought you gave to the costs imposed by that extra
cache space. We have a lot of users who moan about relcache bloat
already. But more to the point, I do not buy the assumption that
an index's set of columns is a good filter for which indexes are of
interest. A trivial counterexample from the regression database is

regression=# explain select count(*) from tenk1;
QUERY PLAN

--------------------------------------------------------------------------------
------------
Aggregate (cost=219.28..219.29 rows=1 width=8)
-> Index Only Scan using tenk1_hundred on tenk1 (cost=0.29..194.28 rows=100
00 width=0)
(2 rows)

It looks to me like the patch also makes unwarranted assumptions about
being able to discard all but the smallest index having a given set
of columns. This would, for example, possibly lead to dropping the
index that has the most useful sort order, or that has the operator
class needed to support a specific WHERE clause.

Thanks for checking out the patch!

Just to make sure we're on the same page: we're only making this assumption if you select no fields at all.
If you select any fields at all it will check for column overlap, and if there's any overlap with any referenced field,
then the index will not be filtered out.

For producing a row count with no referenced fields it is true that it should select the truly cheapest
index to produce the row count and there should be some Index-am callback introduced for that.
For now it was just a quick-and-dirty solution.
Wouldn't a callback that would estimate the amount of data read be good enough though?

For sort orders the field to sort by should be listed and hence the index should not be filtered out,
or what am I missing? Likely I've missed some fields that are referenced somehow (potentially indirectly),
but that shouldn't disqualify the approach completely.

In short, I'm not sure I buy this concept at all. I think it might
be more useful to attack the locking overhead more directly. I kind
of wonder why we need per-index locks at all during planning ---
I think that we already interpret AccessShareLock on the parent table
as being sufficient to block schema changes on existing indexes.

Could you elaborate as to why this approach is not good enough? To me it seems that avoiding work
ahead of time is generally useful. Or are you worried that we remove too much?

Unfortunately, as things stand today, the planner needs more than the
right to look at the indexes' schemas, because it makes physical accesses
to btree indexes to find out their tree height (and I think there are some
comparable behaviors in other AMs). I've never particularly cared for
that implementation, and would be glad to rip out that behavior if we can
find another way. Maybe we could persuade VACUUM or ANALYZE to store that
info in the index's pg_index row, or some such, and then the planner
could use it with no lock?

regards, tom lane

The thing you're touching on is specific for a btree. Not sure this generalizes to all index types that
are out there though? I could see there being some property that allows you to be "no-lock",
and then a callback that allows you to cache some generic data that can be transformed
when the indexopt info structs are filled. Is that roughly what you have in mind?

Best,
Luc

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-08-08 12:40:23 Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?
Previous Message Marcos Pegoraro 2022-08-08 11:34:08 Re: bug on log generation ?