Re: bitmaps and correlation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: bitmaps and correlation
Date: 2020-11-27 20:48:58
Message-ID: 724657.1606510138@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> Other than that, and a quick pgdindent run, this seems ready to me. I'll
> mark it as Ready for Committer.

I dunno, this seems largely misguided to me.

It's already the case that index correlation is just not the right
stat for this purpose, since it doesn't give you much of a toehold
on whether a particular scan is going to be accessing tightly-clumped
data. For specific kinds of index conditions, such as a range query
on a btree index, maybe you could draw that conclusion ... but this
patch isn't paying any attention to the index condition in use.

And then the rules for bitmap AND and OR correlations, if not just
plucked out of the air, still seem *far* too optimistic. As an
example, even if my individual indexes are perfectly correlated and
so a probe would touch only one page, OR'ing ten such probes together
is likely going to touch ten different pages. But unless I'm
misreading the patch, it's going to report back an OR correlation
that corresponds to touching one page.

Even if we assume that the correlation is nonetheless predictive of
how big a part of the table we'll be examining, I don't see a lot
of basis for deciding that the equations the patch adds to
cost_bitmap_heap_scan are the right ones.

I'd have expected this thread to focus a whole lot more on actual
examples than it has done, so that we could have some confidence
that these equations have something to do with reality.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2020-11-27 22:01:36 Re: Log message for GSS connection is missing once connection authorization is successful.
Previous Message Tom Lane 2020-11-27 19:24:09 Re: Setof RangeType returns