From: | Stephen Frost <sfrost(at)snowman(dot)net> |
---|---|
To: | Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Improve output of BitmapAnd EXPLAIN ANALYZE |
Date: | 2016-10-21 17:30:10 |
Message-ID: | 20161021173010.GC13284@tamriel.snowman.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
* Jim Nasby (Jim(dot)Nasby(at)BlueTreble(dot)com) wrote:
> On 10/21/16 8:21 AM, Stephen Frost wrote:
> >Counting each page as the relation's average number of tuples per page
> >seems entirely reasonable to me, for what that is trying to report.
>
> My concern is that still leaves a lot of room for confusion when
> interpreting EXPLAIN ANALYZE. Every other node will tell you exactly
> what happened and it's pretty easy to reason about whether rows
> should have gone up or down based on the type of node. You can't do
> that for Bitmap(And|Or) unless you know the details of how
> TIDBitmaps work. Reporting N/A makes it crystal clear that these
> nodes operate very differently than all the others.
I don't see why you think the numbers reported by BitmapAnd based on
this approach wouldn't go up and down in a similar manner to what you
would expect to get, based on that node type. Reporting N/A is entirely
punting on it when we have perfectly useful information that can be
reported.
> (On a related note, it would also be nice if we reported fractional
> rows when the row count low and loops is high.)
I can certainly understand that, though I think I'd rather have an
actual 'total' value or similar instead, but that's really a different
discussion.
> >That said, I'm a big fan of how we have more detail for things like a
> >HashJoin (buckets, batches, memory usage) and it might be nice to have
> >more information like that for a BitmapAnd (and friends). In
> >particular, I'm thinking of memory usage, exact vs. lossy pages, etc.
> >Knowing that the bitmap has gotten to the point of being lossy might
> >indicate that a user could up work_mem, for example, and possibly avoid
> >recheck costs.
>
> I think that's the best way to handle this: report N/A in the header
> and then provide details on exact vs lossy. That provides a clear
> indication to users that these kinds of nodes are special, as well
> as a reminder as to why they're special. Certainly the node could
> report an exact rowcount in the header if there were no lossy pages
> too.
I don't see why we would want to stick 'N/A' in for the header, even if
we are reporting the details, when we can provide a pretty reasonable
number. In particular, I certainly don't think we would want to report
N/A sometimes (lossy case) and then an actual number other times (all
exact case). That strikes me as much more likely to be confusing.
Thanks!
Stephen
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2016-10-21 17:38:58 | Re: PSA: Systemd will kill PostgreSQL |
Previous Message | Robert Haas | 2016-10-21 17:29:42 | Re: Default setting for autovacuum_freeze_max_age |