Re: Feature request for adoptive indexes

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Hayk Manukyan <manukyantt(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature request for adoptive indexes
Date: 2021-10-29 23:44:08
Message-ID: 39a8239e-b25d-d9ce-4935-4fb7fe10be7e@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/29/21 15:32, Hayk Manukyan wrote:
> Hi all
> First of all thank you all for fast and rich responses, that is really nice.
> I don't have that deep knowledge of how postgres  works under the hood
> so I will try to explain more user side.
> I want to refer for some points mentioned above.
>  - First INCLUDE statement mostly eliminates the necessity to refer to
> a clustered index or table to get columns that do not exist in the
> index. So filtering upon columns in INCLUDE statement will not be
> performant. It can give some very little performance if we include
> additional columns but it is not in level to compare with indexed one. I
> believe this not for this case
> - Tomas Vondra's Assumption that adaptive should be something between
> this two
> 1) (job, nlp, year, sequence)
> 2) (job, nlp, year, scan_id, issue_flag, sequence)
> is completely valid. I have made fairly small demo with this index
> comparison and as I can see the difference is noticeable. Here is git
> repo and results
> <https://github.com/HaykManukyanAvetiky/index_comparition/blob/main/results.md> ,
> I had no much time to do significant one sorry for that ))

I find those results entirely unconvincing, or maybe even suspicious.

I used the script to create the objects, and the index sizes are:

Name | Size
------------------------------------------+---------
job_nlp_year_scan_id_issue_flag_sequence | 1985 MB
job_nlp_year_sequence | 1985 MB

So there's no actual difference, most likely due to alignment making up
for the two smalling columns.

And if I randomize the queries instead of running them with the same
parameters over and over (see the attached scripts), then an average of
10 runs, each 60s long, the results are (after a proper warmup)

pgbench -n -f q4.sql -T 60

4 columns: 106 ms
6 columns: 109 ms

So there's like 3% difference between the two cases, and even that might
be just noise. This is consistent with the two indexes being about the
same size.

This is on machine with i5-2500k CPU and 8GB of RAM, which is just
enough to keep everything in RAM. It seems somewhat strange that your
machine does this in 10ms, i.e. 10x faster. Seems strange.

I'm not sure what is the point of the second query, considering it's not
even using an index but parallel seqscan.

Anyway, this still fails to demonstrate any material difference between
the two indexes, and consequently any potential benefit of the proposed
new index type.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
q6.sql application/sql 321 bytes
q4.sql application/sql 321 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2021-10-29 23:46:52 Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)
Previous Message Jeff Davis 2021-10-29 22:56:50 Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)