Re: Feature request for adoptive indexes

From: Hayk Manukyan <manukyantt(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature request for adoptive indexes
Date: 2021-11-01 12:24:38
Message-ID: CAF+kZOHed=__24BvtOLw4KGR+rcD=rEGUbMhXwvyw3VFfLHvRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I agree with the above mentioned.
The only concern I have is that we compare little wrong things.
For read we should compare
(job, nlp, year, sequence) AND (job, nlp, year, Scan_ID) and (job, nlp,
year, issue_flag ) VS (job, nlp, year, sequence, Scan_ID, issue_flag)
OR (job, nlp, year INCLUDE(sequence, Scan_ID, issue_flag) )
Because our proposed index for reading should be closer to a combination of
those 3 and we have current solutions like index on all or with Include
statement.
We should try to find a gap between these three cases.
For DML queries
(job, nlp, year, sequence, Scan_ID, issue_flag) OR (job, nlp, year
INCLUDE(sequence, Scan_ID, issue_flag) ) VS (job, nlp, year, sequence) AND
(job, nlp, year, Scan_ID) and (job, nlp, year, issue_flag )
Because again the proposed index should be just one and cover all 3
separate ones.

If you agree with these cases I will try to find a bigger time frame to
compare these two cases deeper.
The issue is not high prio but I strongly believe it can help and can be
nice feature for even more complicated cases.

Best regards.

вс, 31 окт. 2021 г. в 21:33, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>:

>
>
> On 10/31/21 16:48, Pavel Borisov wrote:
> > 4 columns: 106 ms
> > 6 columns: 109 ms
> >
> > So there's like 3% difference between the two cases, and even that
> > might
> > be just noise. This is consistent with the two indexes being about
> the
> > same size.
> >
> > I also don't think we can get great speedup in the mentioned case, so it
> > is not urgently needed of course. My point is that it is just nice to
> > have a multicolumn index constructed on stacked trees constructed on
> > separate columns, not on the index tuples as a whole thing.
>
> Well, I'd say "nice to have" features are pointless unless they actually
> give tangible benefits (like speedup) to users. I'd bet no one is going
> to implement and maintain something unless it has such benefit, because
> they have to weight it against other beneficial features.
>
> Maybe there are use cases where this would be beneficial, but so far we
> haven't seen one. Usually it's the OP who presents such a case, and a
> plausible way to improve it - but it seems this thread presents a
> solution and now we're looking for an issue it might solve.
>
> > At least there is a benefit of sparing shared memory if we don't need
> > to cache index tuples of several similar indexes, instead caching one
> > "compound index". So if someone wants to propose this thing I'd
> > support it provided problems with concurrency, which were mentioned
> > by Peter are solved.
> >
>
> The problem with this it assumes the new index would use (significantly)
> less space than three separate indexes. I find that rather unlikely, but
> maybe there is a smart way to achieve that (certainly not in detail).
>
> I don't want to sound overly pessimistic and if you have an idea how to
> do this, I'd like to hear it. But it seems pretty tricky, particularly
> if we assume the suffix columns are more variable (which limits the
> "compression" ratio etc.).
>
> > These problems could be appear easy though, as we have index tuples
> > constructed in a similar way as heap tuples. Maybe it could be easier if
> > we had another heap am, which stored separate attributes (if so it could
> > be useful as a better JSON storage method than we have today).
> >
>
> IMO this just moved the goalposts somewhere outside the solar system.
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-11-01 12:47:17 Re: parallel vacuum comments
Previous Message Tomas Vondra 2021-11-01 12:22:04 Re: Added schema level support for publication.