Re: Feature request for adoptive indexes

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Hayk Manukyan <manukyantt(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature request for adoptive indexes
Date: 2021-11-02 14:04:14
Message-ID: CALT9ZEHLqF+xgNV=KrvS3ex8-bhDq83-dUy69G6rp+AjVyE6Rw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

вт, 2 нояб. 2021 г. в 16:04, Hayk Manukyan <manukyantt(at)gmail(dot)com>:

> Tomas Vondra
> > Are you suggesting those are not the actual best/worst cases and we
> > should use some other indexes? If yes, which ones?
>
> I would say yes.
> In my case I am not querying only sequence column.
> I have the following cases which I want to optimize.
> 1. Select * from Some_table where job = <somthing> and nlp = <something>
> and year = <something> and *scan_id = <something> *
> 2. Select * from Some_table where job = <somthing> and nlp = <something>
> and year = <something> and *Issue_flag = <something> *
> 3. Select * from Some_table where job = <somthing> and nlp = <something>
> and year = <something> and *sequence = <something> *
> Those are queries that my app send to db that is why I said that from *read
> perspective* our *best case* is 3 separate indexes for
> *(job, nlp, year, sequence)* AND *(job, nlp, year, Scan_ID)* and *(job,
> nlp, year, issue_flag)* and any other solution like
> (job, nlp, year, sequence, Scan_ID, issue_flag) OR (job, nlp, year )
> INCLUDE(sequence, Scan_ID, issue_flag) OR just (job, nlp, year) can be
> considered as* worst case *
> I will remind that in real world scenario I have ~50m rows and about *~5k
> rows for each (job, nlp, year )*
>

So you get 50M rows /5K rows = 10K times selectivity, when you select on
job = <somthing> and nlp = <something> and year = <something> which is
enormous. Then you should select some of the 5K rows left, which is
expected to be pretty fast on bitmap index scan or INCLUDE column
filtering. It confirms Tomas's experiment

pgbench -n -f q4.sql -T 60

106 ms vs 109 ms

fits your case pretty well. You get absolutely negligible difference
between best and worst case and certainly you don't need anything more than
just plain index for 3 columns, you even don't need INCLUDE index.

From what I read I suppose that this feature indeed doesn't based on the
real need. If you suppose it is useful please feel free to make and post
here some measurements that proves your point.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-11-02 14:11:39 Re: pgbench bug candidate: negative "initial connection time"
Previous Message Tomas Vondra 2021-11-02 14:03:06 Re: Feature request for adoptive indexes