Re: Feature request for adoptive indexes

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Hayk Manukyan <manukyantt(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature request for adoptive indexes
Date: 2021-10-26 19:39:20
Message-ID: CALT9ZEHoH1PLQqh=YLtA67jtAsKpTWEY5i+7P2RzkyaZA60OCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've already answered OP but it seems in the wrong thread, so I copy it
here:

I think now in many cases you can effectively use covering index to have
fast index-only scans without index duplication. It will help if you don't
have great selectivity on the last column (most probably you don't). E.g.:

CREATE INDEX ON table_name (`job`,`nlp`,`year`) INCLUDE (`scan_id`,
`issue_flag`, `sequence`)

But I consider the feature can be useful when there is a very little
selectivity in the first index columns. I.e. if (job`,`nlp`,`year') has
many repeats and the most selection is done in the last column. I am not
sure how often this can arise but in general, I see it as a useful b-tree
generalization.

I'm not sure how it should be done. In my view, we need to add an ordered
posting tree as a leaf element if b-tree and now we have index storage only
for tuples. The change of on-disk format was previously not easy in nbtree
and if we consider the change, we need an extra bit to mark posting trees
among index tuples. Maybe it could be done in a way similar to deduplicated
tuples if some bits in the tuple header are still could be freed.

Thoughts?

If I get what you propose, you want to have a "top" tree for (job, nlp,
> year), which "splits" the data set into subsets of ~5000-7000 rows. And
> then for each subset you want a separate "small" trees on each of the
> other columns, so in this case three trees.
>
> Well, the problem with this is pretty obvious - each of the small trees
> requires separate copies of the leaf pages. And remember, in a btree the
> internal pages are usually less than 1% of the index, so this pretty
> much triples the size of the index. And if you insert a row into the
> index, it has to insert the item pointer into each of the small trees,
> likely requiring a separate I/O for each.
>
> So I'd bet this is not any different from just having three separate
> indexes - it doesn't save space, doesn't save I/O, nothing.
>

Tomas, I really think we should not try realizing this feature using
existing index pages that contain only tuples. You are right, it will cause
large overhead. If instead we decide and succeed in creating "posting
trees" as a new on-disk page entry type we can have an index with space
comparable to the abovementioned covering index but with sorting of values
in these trees (i.e. all values are sorted, and "key" ones).

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2021-10-26 19:43:30 Re: XTS cipher mode for cluster file encryption
Previous Message Mahendra Singh Thalor 2021-10-26 19:31:36 Re: Replication & recovery_min_apply_delay