Re: Improve output of BitmapAnd EXPLAIN ANALYZE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Emre Hasegeli <emre(at)hasegeli(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve output of BitmapAnd EXPLAIN ANALYZE
Date: 2016-11-01 14:10:53
Message-ID: CA+TgmoaV+OG0kZfDFDFqrvhweLdrbJ8JOBAQGoUkzPXYmY__4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 1, 2016 at 9:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I don't like Tom's proposal of trying to fake up a value here when
>> EXPLAIN ANALYZE is in use. Reporting "exact" and "lossy" values for
>> BitmapAnd would be a fine enhancement, but artificially trying to
>> flatten that back into a row count is going to be confusing, not
>> helpful. (Just last week I saw a case where the fact that many pages
>> were being lossified caused a performance problem ... so treating
>> lossy pages as if they don't exist would have led to a lot of
>> head-scratching, because under Tom's proposal the row count would have
>> been way off.)
>
> It would very often be the case that the value I suggested would be exact,
> so this complaint seems off-base to me.

From my point of view, something that very often gives the right
answers isn't acceptable. We certainly wouldn't accept a query
optimization that very often gives the right answers. It's gotta
always give the right answer.

> If we were willing to add an additional output line, we could also report
> the number of lossy pages in the result bitmap, and people would then
> know not to trust the reported rowcount as gospel. But it's still useful
> to have it. I'm envisioning output like
>
> -> BitmapOr (cost=... rows=2000 width=0) (actual time=... rows=1942 loops=1)
>
> in the no-lossy-pages case, otherwise
>
> -> BitmapOr (cost=... rows=4000 width=0) (actual time=... rows=3945 loops=1)
> Lossy Bitmap: exact entries=2469, lossy pages=123
>
> There's nothing misleading about that, IMO. (Exercise for the reader:
> what rows/page estimate did I assume?)

(4000-2469)/123 = 12.44715 ?

I think it's inherently misleading to report values that were
concocted specifically for EXPLAIN ANALYZE. Things that we report
there should have some underlying reality or relevance. People -
including me - tend to assume they do, and you don't want to spend
time chasing down something that's PURELY an EXPLAIN ANALYZE artifact
with no actual relevance to the runtime behavior.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-11-01 14:14:03 Re: Patch: Implement failover on libpq connect level.
Previous Message Andres Freund 2016-11-01 14:05:45 Re: emergency outage requiring database restart