Re: bitmaps and correlation

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: bitmaps and correlation
Date: 2021-01-28 12:51:10
Message-ID: CAD21AoCLF5kJO3yni_rYaEWG62HM_SB7R4NQKMXwpJ7_2dVz6A@mail.gmail.com
Lists: pgsql-hackers

On Sat, Nov 28, 2020 at 5:49 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> > Other than that, and a quick pgindent run, this seems ready to me. I'll
> > mark it as Ready for Committer.
>
> I dunno, this seems largely misguided to me.
>
> It's already the case that index correlation is just not the right
> stat for this purpose, since it doesn't give you much of a toehold
> on whether a particular scan is going to be accessing tightly-clumped
> data. For specific kinds of index conditions, such as a range query
> on a btree index, maybe you could draw that conclusion ... but this
> patch isn't paying any attention to the index condition in use.
>
> And then the rules for bitmap AND and OR correlations, if not just
> plucked out of the air, still seem *far* too optimistic. As an
> example, even if my individual indexes are perfectly correlated and
> so a probe would touch only one page, OR'ing ten such probes together
> is likely going to touch ten different pages. But unless I'm
> misreading the patch, it's going to report back an OR correlation
> that corresponds to touching one page.
>
> Even if we assume that the correlation is nonetheless predictive of
> how big a part of the table we'll be examining, I don't see a lot
> of basis for deciding that the equations the patch adds to
> cost_bitmap_heap_scan are the right ones.
>
> I'd have expected this thread to focus a whole lot more on actual
> examples than it has done, so that we could have some confidence
> that these equations have something to do with reality.
>

Status update for a commitfest entry.

The discussion has been inactive since the end of the last CF. It
seems to me that we need some discussion on the point Tom mentioned.
The entry looks like it belongs in either "Needs Review" or "Ready for
Committer" status, but Justin himself set it to "Waiting on Author" on
2020-12-03. Are you working on this, Justin?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
