Re: Showing applied extended statistics in explain

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Showing applied extended statistics in explain
Date: 2021-07-29 14:09:54
Message-ID: 20210729140954.e5ptavuk6zpwysc6@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Tue, Jul 27, 2021 at 10:20:34PM +0200, Tomas Vondra wrote:
>
> >> 1) The information is stashed in multiple lists added to a Plan. Maybe
> >> there's a better place, and maybe we need to invent a better way to
> >> track the info (a new node stashed in a single List).
> >>
> >> ...
> >>
> >> 3) It does not work for functional dependencies, because we effectively
> >> "merge" all functional dependencies and apply the entries. Not sure how
> >> to display this, but I think it should show the individual dependencies
> >> actually applied.
> >>
> >> ...
> >>
> >> 5) It includes just statistics name + clauses, but maybe we should
> >> include additional info (e.g estimate for that combination of clauses).
> >
> > Yes, a new node would be nice to have. The other questions above are
> > somewhat related to what it should contain, and I guess it depends on
> > the use case this patch targets. E.g. for the case "help to figure out
> > if an extended statistics was applied" even "merged" functional
> > dependencies would be fine I believe.
>
> What do you mean by "merged" functional dependencies? I guess we could
> say "these clauses were estimated using dependencies" without listing
> which exact dependencies were applied.

Yes, that's exactly what I was thinking. From the original email I've
got an impression that in case of functional dependencies you plan to
display the info only with the individual dependencies (when
implemented) or nothing at all. By "merged" I wanted to refer to the
statement about "merge" all functional dependencies and apply the
entries.

> > More advanced plan troubleshooting may benefit from estimates.
>
> I'm sorry, I don't know what you mean by this. Can you elaborate?

Yeah, sorry for not being clear. The idea was that the question about including
"additional info" strongly depends on which use cases the patch tries to
address, and I follow up on that further. There is no more hidden detailed
meaning here :)

> > I've got few more questions after reading the patch:
> >
> > * Is there any particular reason behind choosing only some scan nodes to
> > display extended stats? E.g. BitmapHeapScan is missing.
> >
>
> All nodes that may apply extended stats to estimate quals should include
> this info. I guess BitmapHeapScan may do that for the filter, right?
>

Yes, something like this (stats output added by me):

Bitmap Heap Scan on public.tenk1
Output: unique1
Recheck Cond: (tenk1.unique1 < 1000)
Filter: (tenk1.stringu1 = 'xxx'::name)
Statistics: public.s Clauses: ((unique1 < 1000) AND (stringu1 = 'xxx'::name))
-> Bitmap Index Scan on tenk1_unique1
Index Cond: (tenk1.unique1 < 1000)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-07-29 14:29:13 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Tom Lane 2021-07-29 13:57:12 Re: Out-of-memory error reports in libpq